[HN Gopher] Logging in Python like a pro ___________________________________________________________________ Logging in Python like a pro Author : EntICOnc Score : 212 points Date : 2022-07-01 12:38 UTC (10 hours ago) (HTM) web link (guicommits.com) (TXT) w3m dump (guicommits.com) | jasonpeacock wrote: | This is very long (but thorough!) article about implementing | structured logging. | | Shorter version: Use structured logging! | | If using Python, use `structlog`: | https://www.structlog.org/en/stable/ | timost wrote: | I agree, structlog is such a quality of life improvement. | | It can be tricky to forward all logs through it [0] but it's | definitely worth it. | | [0] https://www.structlog.org/en/stable/standard-library.html | jteppinette wrote: | https://github.com/jteppinette/python-logfmter | | I made this library to make global structured logging in | Python super easy. | delijati wrote: | You always need some boilerplate to get structlog working. I | always use this (author) to have the needed defaults | https://github.com/peakiq/logma. | poopman68686 wrote: | Good Post! | mr-karan wrote: | Honestly, I like to keep the logger as decoupled and minimalist | as possible. There's https://vector.dev which can basically ship | your logs from source A to destination B (supports tonnes of | usecases). Separation of concerns makes things much more easier. | gregmac wrote: | I think this is a good overview, I especially like "descriptive, | contextual, reactive." I like to ask "who is going to see this | message and why? What action should they take?" | | I've also come to find a few "wide" log entries significantly | better than having many smaller log messages (especially as the | system grows). | | Ideally, one operation == one log message, and contains all | context: userid, sessionid, url path, domain, error | codes/messages, timing, etc. log error/warning if it fails, info | if success. It's much easier to follow than one operation spread | across a half dozen log messages. In dev, with one thing | happening at a time, it's no big deal, but in production there's | way more stuff happening and it is interspersed with hundreds of | lines of logs of other operations as well as other threads | running the same operation concurrently. | bakpakin wrote: | anyone else just define a function log(msg) the prints to stderr | or maybe stdout and use that instead for simple programs/daemons? | I never really got the big value proposition of seven-leveled | logging or the complexity of it. If I need to customize it I just | change my log function. | miohtama wrote: | A very good post on creating good log contexts. | | For powerful log channeling and notifications to Telegram, Slack, | etc. there is this Python package: | | https://pypi.org/project/notifiers/ | | I also created a logging handler for Discord, where one can get | notifications on errors to a private channel: | | https://github.com/tradingstrategy-ai/python-logging-discord... | Thaxll wrote: | Don't use warn, either info or error nothing in between. | m463 wrote: | I disliked the logging levels | | I really wanted to have an app with a reasonable default level, | then allow flags like --debug, --verbose plus --quiet and it | never seemed to map like I wanted. | | maybe it's obvious and I just don't see it. | LtWorf wrote: | Just try reading logs from journalctl, where you can filter | the levels you don't need and see errors and warnings in a | different colour. Then you will perhaps re-evaluate the | importance of levels. | 1337shadow wrote: | By default, use INFO to tell what has been done, use DEBUG to | describe what's going to be done and how it's deciding it. | gregmac wrote: | Warning is actually very useful. | | One of the biggest logging mistakes I see is people misusing | Error. What most developers consider an error -- external | service call failed, user gave bad input, or a timeout happened | - is often not really a problem. The service call is re-tried. | Users are going to user (and provide garbage data and then fix | it). None of these require operator intervention. | | To me, Errors imply action needed, or highlight an unresolved | situation. If the service call keeps failing after all retries, | and now a feature is impacted (eg a user won't get a report, or | there's a gap in data somewhere) then that can be an Error. | | All the temporary faults are useful as Warn to highlight them | and distinguish from other normal operations, but no action is | needed. This is most useful when a user reports something like | "it seems to be taking much longer than usual to get email | notifications" and when an operator looks at the logs, they're | full of warnings. It's also useful for monitoring: high number | of warnings in the log can cause a non-critical notification to | operators (during business hours; not a wake-me-up-at-2am | page). | jxramos wrote: | This is the first time I've ever learned about that `extra` | argument to logging calls | | Interesting capabilities. There's a rundown with log output | examples seen in | https://docs.python.org/3/library/logging.html#logging.Logge... | mekster wrote: | Does anyone ever log in multi dimensional way at all? | | Everyone seems to just go, logger.info('Doing | that') | | But logs really should be covering the context instead of being | single dimensional by being placed at the execution point. | //Sorry, I don't know python well but trying to run a lambda as a | parameter. myLogger.info('Doing that', lambda : ( | doThat() whatNot())) | | And then, you know exactly what the log is covering as well as | take a benchmark of the said parts and report how long it took to | run. | | You don't need 'Done that' logging as if you throw inside, the | log doesn't happen. | | It seems no logging libraries do that but it's an essential part | of how I do logging. | aunty_helen wrote: | "%(asctime)s - %(levelname)-5s - %(name)-20.20s - %(message)s - | (%(filename)s:%(lineno)s)" | | Here's a logging format that I use and probably you should too. | Among other things, it gets around most of the context issues by | outputting the file and line number of the code that called the | log. | | Before I would try and make every log message unique or mostly | unique so there weren't double ups. | | I don't agree with the authors logging clutter though. If you're | manually eyeballing the logs looking for a message you're already | doing it wrong. Use search, have ids in the messages etc. | | Remember folks, debug logging on because if it's not logged, when | it breaks you don't have the logs... | LtWorf wrote: | If I can't query for warning level on journalctl it's useless. | aunty_helen wrote: | That would be the levelname part. But if your log aggregator | or subsystem is hindering your ability to use your logs, I | would look to solve that problem before worrying about | anything within your application. | | Even storing logs in a format that you need to decode or | download would be a red flag for me. Seconds of debugging all | of a sudden turns into minutes or even hours. | connorbrinton wrote: | Another great project for Python logging is loguru: | https://loguru.readthedocs.io/en/stable/overview.html | | One of my favorite features of loguru is the ability to | contextualize logging messages using a context manager. | with logger.contextualize(user_name=user_name): | handle_request(...) | | Within the context manager, all loguru logging messages include | the contextual information. This is especially nice when used in | combination with loguru's native JSON formatting support and a | JSON-compatible log archiving system. | | One downside of loguru is that it doesn't include messages from | Python's standard logging system, but it can be integrated as a | logging handler for the standard logging system. At the company | where I work we've created a helper similar to Datadog's ddtrace- | run executable to automatically set up these integrations and log | message formatting with loguru | codethief wrote: | Does loguru come with a lot of global state like the standard | logging library? In my experience this makes logging a real | PITA when forking multiple processes... | isoprophlex wrote: | Cannot recommend loguru highly enough. After a coworker showed | it to me, I fell in love immediately. | | I'll never go back to that rancid pile of shit that is the | builtin logger lib. | memco wrote: | I mentioned this almost every time logging and python come up but | one of my favorite feature is logging.exception: | https://docs.python.org/3/library/logging.html#logging.Logge.... | It allows you to attach the whole exception with stack trace | without having to explicitly used "as e" and then use traceback | or embed the error in the message. Something like | logger.exception("Failed to connect.") will print the message and | the exception in one go. | atorodius wrote: | I'm using this too, but I always found it very unreadable for | people unfamiliar with it. It's so surprising that it catches | the exception for you! I end up always adding a commend | | ``` logger.exception("Failed to connect.") # Will print | exception. ``` | Narushia wrote: | I second this, very useful. I kind of expected the article to | mention it, but it didn't! | [deleted] | kordlessagain wrote: | As a logger in a previous lifetime, I approve of this content. | orf wrote: | The Python logging module is one of the worst modules in the | stdlib. You can see it's Java inspiration all over it, as being | one of the oldest modules it was created before anyone really | knew what "pythonic" code was. | | Things I really dislike about it is the lack of "context". | Usually log messages are about something - often a request. You | can't attach a request ID or something else a nested set of | function calls that log. | | The whole library is just really rubbish and has not aged well. | There's no standard way of outputting JSON logs, but there's a | built in way to spawn a socket server that allows the logging | configuration to be updated remotely via a bespoke protocol? And | everything is built mostly around logging to files. | | It also hits that perfect sweet spot of being both overly | engineered and totally inflexible. I wanted to funnel our log | messages to s3 via the "s3fs" module, which supports a "file | like" object that gets written to s3. I had to hack around the | various handlers to support this because it assumes it's a "real" | file. | | Hate it. | [deleted] | systemvoltage wrote: | Agreed. One of the most confusing libraries with a ton of magic | going on behind the scenes. I also detest PyTest, although not | a std lib, pytest is one of the worst in terms of explicitness. | It's very useful but that's orthogonal. I've seen newcomers to | python struggle to grasp how pytest just manipulates a billion | things behind the scenes. | | Overall, I prefer less magic and more explicitness. | simonw wrote: | Disagree on pytest: it's the first testing framework I've | used in ANY language which has genuinely made me enjoy | writing tests. | | It has a bit of an initial learning curve, but once you | understand how fixtures work it's amazing how productive you | can be with it. | [deleted] | miohtama wrote: | The implicit magic makes writing pytest tests painless. | | Also with improvement of type hinting, tooling like PyCharm | starts to understand pytest better and make it easier to | read and manage and a lot of the magic can be dispellt by | written out explicitly. Though this need some discipline | from the dev team. | codethief wrote: | Strong agree on all points! The number of times I had to | debug into the pytest source code because developers (the | ones using pytest) thought it would be cool to use some | decorator black magic to manipulate global state behind the | scenes which pytest not just makes very easy but actually | encourages and which was obviously going to break sooner or | later... | | And regarding the sibling comment: Yes, I prefer unittest as | well. | oefrha wrote: | PSL unittest is very explicit and has no magic. Ironically it | also has Java written all over it. You must be the first | person I met who prefer unittest to pytest. | dragonwriter wrote: | > PSL unittest is very explicit and has no magic. | Ironically it also has Java written all over it. | | By "Java written all over it" do you mean that like JUnit | for Java, it's an implementation of the xUnit pattern | derived from SUnit for Smalltalk? | [deleted] | oefrha wrote: | It didn't have to copy everything down to the camelCase | if it was just an implementation of the xUnit pattern | with no Java influence. | dragonwriter wrote: | camelCase is idiomatic in Smalltalk, which as a language | and a community was a pretty big influence on other | dynamic OO languages like Python and Ruby, and from | whence SUnit comes, and thus the whole xUnit world, | derives. | | So, it's maybe not the best evidence that the Python | version is particularly a clone of the Java port. | txru wrote: | I sometimes prefer unittest too! | | Pytest is very opinionated about how and when it's going to | be run. If you want to put a test in say, Jupyter or a | Markdown block, unittest is much more amenable to just | doing what you want without having to read through | documentation to break a whole bunch of magic behavior. | bornfreddy wrote: | What do you mean? Pytest rules! Yes there is some magic, but | if you let it take control, it is very convenient to write | and run tests with it. Just... you need to trust it and it | will do its job. | systemvoltage wrote: | I use pytest all the time. Doesn't mean it is well thought | out. It could have been so much better with more visibility | and control, and less magic. | akhmatova wrote: | _The Python logging module is one of the worst modules in the | stdlib_. | | One has to remember that logging was never intended to be some | infinitely generalizable metaprotocol for massively streaming | parallel notifications to the cloud or whatever. It was meant | as a replacement for "if (DEBUG > 4): print(...)". For which I | think it does quite well, thank you. | | _The whole library is just really rubbish and has not aged | well. Hate it._ | | So what have you written that has made people's (not just your | bosses' or your clients') lives easier, for nearly 20 years now | (if only incrementally)? Do share. | | I see and grant your points, but I find this strong emotional | rebuke of something that was never intended to be anything more | than a simple convenience library to be well, strange. | orf wrote: | It clearly is supposed to be more than a convenience library. | It's _the_ blessed, stdlib way to do _logging_ with Python. | The module isn't called "print_with_levels". | | So now everything needs to work with it, it's internals and | it's way of doing things. And to top it off it's near | impossible to refactor or replace. | akhmatova wrote: | Is there an alternative that you would prefer to bless? | thechao wrote: | I can't agree with you more. I switched to Python's logger off | of my janky thing about 6 months ago, and I'm hating every | second of it. I still have no clue how to turn it on, turn it | off, get it to fork output (stdout, stderr, file), nor how to | override its behavior. | | It's literally easier to just have an IOBuffer laying around. | oefrha wrote: | There are valid complaints in the grandparent, but if you | can't figure out how to fork output to stdout, stderr and | file in six months, that's really on you. You should know how | to register multiple handlers if you spent five minutes | reading the documentation. | thechao wrote: | I can easily fork output _manually_ ; I can't do it | reliably using Python's logger. | maleldil wrote: | Echoing grandparent's comment, it's not that hard: | | https://docs.python.org/3/library/logging.html#logging.Lo | gge... | | For dictConfig, the documentation has examples | | https://docs.python.org/3/library/logging.config.html#obj | ect... | | Python's logging has its faults, but you're complaining | about something I could find in 5 minutes of Google. | thechao wrote: | None of which work reliably across multiple modules all | fighting for control of the logger. Reliable is the key | term, here. | slaymaker1907 wrote: | Eh, it's really not that bad. I implemented this for a | tool at work last year and it seemed pretty | straightforward. | paulfurtado wrote: | It's definitely sad that there isn't built-in context support, | but if you're looking for a solution to this, it is actually | very easy to attach context to logs using threadlocal and a | logging filter. | | When running a webserver though, an even simpler trick is to | add middleware that sets the current request's info as the | current thread's name, and then including threadName in your | log format. | andrewingram wrote: | ContextVars are seemingly more robust and work with async. We | use them quite extensively for adding context to logs. | MapleWalnut wrote: | TIL that ContextVars can work in place of thread locals | too. Thanks for pointing that out! | | I had mistakenly assumed ContextVars only worked for async | code. | nerdwaller wrote: | I was going to say the same, I combine ContextVars with | LoggerAdapters[1]. | | [1] https://docs.python.org/3/library/logging.html#logging. | Logge... | 1337shadow wrote: | Take a look at pylogctx maybe it'll do what you need. | memco wrote: | The contextual support is lacking but it can be done. See | this comment from a previous discussion where I briefly | described how I've used the record factory to help: | https://news.ycombinator.com/item?id=31484527 | helsinki wrote: | It's rock solid, at least. I've also had the displeasure of | working in the logging package. Totally outdated. I created a | JSON formatter (also support for context). Huge pain, but | always works. | 1337shadow wrote: | Maybe worth trying pylogctx and python-logging-loki or see how | it's done there. | doliveira wrote: | > there's a built in way to spawn a socket server that allows | the logging configuration to be updated remotely via a bespoke | protocol | | That sounds like there's a Log4shell-like vulnerability waiting | to be found. But I couldn't even find it, I found the | SocketHandler but that's just a log destination. | | EDIT: shockingly, I did find it: | https://docs.python.org/3/howto/logging-cookbook.html#config... | emmelaich wrote: | That might explain why it lacks a NOTICE level. That bugged the | hell out of me with Java. | | Presumably these levels derive from syslog or older. I've no | idea why they omitted NOTICE. | miohtama wrote: | You can quite easily add your own logging levels - it is not | hardcoded. I think it's even covered in the documentation, as | I have done it before. | miohtama wrote: | Not to defend the legacy of good old logging, but if you have | ever tried to log in Node.js applications Python logging starts | to feel godsent. | rgovostes wrote: | I've been writing Python daily for 15 years but when I have to | do any configuration beyond `basicConfig()`, I still groan. In | many instances I have wanted to do something like temporarily | turn on logging of HTTP requests, which turns into a much more | involved effort than it ought to be. | | I just recently discovered that there's no documented way to | dump the current configuration | (https://stackoverflow.com/q/72624883) so you can see what | loggers exist and where their output is going. | drzoltar wrote: | Are there any good 3rd party alternatives that work better? | Jamie9912 wrote: | loguru perhaps | sage92 wrote: | This is the only library I that both can use out of the box | and extend to my needs. | seabrookmx wrote: | structlog | delijati wrote: | You always need some boilerplate to get structlog working. | I always use this (author) to have the needed defaults | https://github.com/peakiq/logma. | raverbashing wrote: | Yeah, it exists | | But IMO it inherited most of loggers problems while not | offering a better interface | | It is basically a kwargs to json or 'key=value' converter | | (Also json logs are overrated. There I said it. Especially | one key per line, please don't) | formerly_proven wrote: | It's unfortunate how blind copying of log4j into many languages | seems to have inhibited improvements in logging in the last | ~20ish years. | MichaelRazum wrote: | Yep. Have to agree. It works, but is really messy. | LtWorf wrote: | I normally just import syslog or journald. The latter allows | for arbitrary fields to be defined and then searched, but it's | not in the stdlib. | 1337shadow wrote: | Great idea but how do you work this out in a container where | the python process is PID 1? | LtWorf wrote: | bind mount /dev/log in the container of course. | | Also logging can be set up remotely, but it's easier to do | like that from the container and then configure the hos to | send everything to a single log machine. | sorokod wrote: | For whatever it's worth, java logging has mapped diagnostic | context (MDC). | efitz wrote: | Very good advice. | | The advice I give to people is "think about logging like a | feature with customers, just like any other feature." If you | think about logging this way, you ideally put yourself in the | position of your logs' "customer"- a bleary-eyed teammate who | just got paged out of bed at 2am, or a security guy who isn't | intimately familiar with your code. Providing extra context for | those users is really kind, as they don't have a "secret decoder | ring". | | Since the article discusses it, I also highly recommend AWS | CloudWatch logs. It's super easy to log to CloudWatch from Lambda | with the Python logger class. It's a little harder from EC2, but | not really that difficult. The advantage is that you don't need | to SSH somewhere and navigate to a file and then locate the | interesting stuff in that file - You just keep the CloudWatch | console open (or keep a link to it handy) and the latest stuff is | on top with easy filtering. | pid-1 wrote: | Cloudwatch is cheap (even free) for small environments and has | very small operational overhead. | | The cost, however, can easily get out of hand and AWS does not | have per resource billing enabled by default. So it might not | be obvious which Log Group is responsible for for which charge. | | I also think its console is way inferior than something like | Grafana for searching and visualisation. | jstarfish wrote: | My standard, to the extent possible: make your logs clear | enough that your _attorney_ would be able to understand them. | | You may well find yourself wishing you did this if you also | output DEBUG logs in production. | dragonwriter wrote: | > It's super easy to log to CloudWatch from Lambda with the | Python logger class | | Or even just print(): You've got a unique log group for a | lambda, CloudWatch logs autonatically timestamps entries, and | stdout & stderr from a lambda both go to the current CloudWatch | log stream in the lambda's CloudWatch logs group. | efitz wrote: | My experience with Python in Lambda was that my functions ran | significantly faster when I used logger rather than print for | logging. I agree that Lambda captures stdout to CloudWatch, | but my experience was that logger was better (and more | intentional). | | ed: clarified that this relates to AWS Lambda only | [deleted] | AmpsterMan wrote: | I don't know if it's the way cloud watch works in my current | company or what, but I preferred SSH into the server and grep | for what I want lol. | | I miss that I can't do that any longer | dygd wrote: | Using the official CLI (aws logs get-log-events) or | https://github.com/jorgebastida/awslogs is pretty close to | SSH-ing and grepping. | AmpsterMan wrote: | Oh thanks I'll have to give that a shot! It may also just | be that I need to not get so caught up in learning a new | way to do something. | efitz wrote: | I hear you, and when I did some research on this in the past, | going to get logs was the most common reason for engineers to | SSH to production hosts. | | As a security guy, I hate SSH to production (the whole | "cattle, not pets" thing). In my last company we had an | internal tool to federate you to the AWS console. We had | runbooks in a wiki, and had links literally to the logs for a | particular component/service/region - the link would federate | you to the right account and take you directly to the target | log in cloudwatch logs in the appropriate region. Safer and | easier than ssh-to-prod. | alanwreath wrote: | given you've configured logging somewhere early on in the | bootstrap phase of your app - there's really few reason to go up | and above having more than these two lines in all your logging | modules import logging logger = | logging.getLogger(__name__) | | But boy oh have I seen some funny business where people attach | the logger to their objects like so class Wat: | def __init__(self): self.logger = | logging.getLogger(...) | | All for what? For WAT?! | 1337shadow wrote: | To inherit the logger attribute in subclasses?? Save an import? | Nimitz14 wrote: | That is a super common mistake from beginner python | programmers. | hotfixguru wrote: | Best way to find logs connected to one of your API requests is by | using correlation IDs, which are injected to each log message. Me | and my friend wrote libraries to do this for both Django[1] and | ASGI apps(starlette, FastAPI)[2] | | [1] https://github.com/snok/django-guid | | [2] https://github.com/snok/asgi-correlation-id ___________________________________________________________________ (page generated 2022-07-01 23:00 UTC)