[HN Gopher] Logging in Python like a pro
       ___________________________________________________________________
        
       Logging in Python like a pro
        
       Author : EntICOnc
       Score  : 212 points
       Date   : 2022-07-01 12:38 UTC (10 hours ago)
        
 (HTM) web link (guicommits.com)
 (TXT) w3m dump (guicommits.com)
        
       | jasonpeacock wrote:
       | This is very long (but thorough!) article about implementing
       | structured logging.
       | 
       | Shorter version: Use structured logging!
       | 
       | If using Python, use `structlog`:
       | https://www.structlog.org/en/stable/
        
         | timost wrote:
         | I agree, structlog is such a quality of life improvement.
         | 
         | It can be tricky to forward all logs through it [0] but it's
         | definitely worth it.
         | 
         | [0] https://www.structlog.org/en/stable/standard-library.html
        
           | jteppinette wrote:
           | https://github.com/jteppinette/python-logfmter
           | 
           | I made this library to make global structured logging in
           | Python super easy.
        
         | delijati wrote:
         | You always need some boilerplate to get structlog working. I
         | always use this (author) to have the needed defaults
         | https://github.com/peakiq/logma.
        
       | poopman68686 wrote:
       | Good Post!
        
       | mr-karan wrote:
       | Honestly, I like to keep the logger as decoupled and minimalist
       | as possible. There's https://vector.dev which can basically ship
       | your logs from source A to destination B (supports tonnes of
       | usecases). Separation of concerns makes things much more easier.
        
       | gregmac wrote:
       | I think this is a good overview, I especially like "descriptive,
       | contextual, reactive." I like to ask "who is going to see this
       | message and why? What action should they take?"
       | 
       | I've also come to find a few "wide" log entries significantly
       | better than having many smaller log messages (especially as the
       | system grows).
       | 
       | Ideally, one operation == one log message, and contains all
       | context: userid, sessionid, url path, domain, error
       | codes/messages, timing, etc. log error/warning if it fails, info
       | if success. It's much easier to follow than one operation spread
       | across a half dozen log messages. In dev, with one thing
       | happening at a time, it's no big deal, but in production there's
       | way more stuff happening and it is interspersed with hundreds of
       | lines of logs of other operations as well as other threads
       | running the same operation concurrently.
        
       | bakpakin wrote:
       | anyone else just define a function log(msg) the prints to stderr
       | or maybe stdout and use that instead for simple programs/daemons?
       | I never really got the big value proposition of seven-leveled
       | logging or the complexity of it. If I need to customize it I just
       | change my log function.
        
       | miohtama wrote:
       | A very good post on creating good log contexts.
       | 
       | For powerful log channeling and notifications to Telegram, Slack,
       | etc. there is this Python package:
       | 
       | https://pypi.org/project/notifiers/
       | 
       | I also created a logging handler for Discord, where one can get
       | notifications on errors to a private channel:
       | 
       | https://github.com/tradingstrategy-ai/python-logging-discord...
        
       | Thaxll wrote:
       | Don't use warn, either info or error nothing in between.
        
         | m463 wrote:
         | I disliked the logging levels
         | 
         | I really wanted to have an app with a reasonable default level,
         | then allow flags like --debug, --verbose plus --quiet and it
         | never seemed to map like I wanted.
         | 
         | maybe it's obvious and I just don't see it.
        
           | LtWorf wrote:
           | Just try reading logs from journalctl, where you can filter
           | the levels you don't need and see errors and warnings in a
           | different colour. Then you will perhaps re-evaluate the
           | importance of levels.
        
           | 1337shadow wrote:
           | By default, use INFO to tell what has been done, use DEBUG to
           | describe what's going to be done and how it's deciding it.
        
         | gregmac wrote:
         | Warning is actually very useful.
         | 
         | One of the biggest logging mistakes I see is people misusing
         | Error. What most developers consider an error -- external
         | service call failed, user gave bad input, or a timeout happened
         | - is often not really a problem. The service call is re-tried.
         | Users are going to user (and provide garbage data and then fix
         | it). None of these require operator intervention.
         | 
         | To me, Errors imply action needed, or highlight an unresolved
         | situation. If the service call keeps failing after all retries,
         | and now a feature is impacted (eg a user won't get a report, or
         | there's a gap in data somewhere) then that can be an Error.
         | 
         | All the temporary faults are useful as Warn to highlight them
         | and distinguish from other normal operations, but no action is
         | needed. This is most useful when a user reports something like
         | "it seems to be taking much longer than usual to get email
         | notifications" and when an operator looks at the logs, they're
         | full of warnings. It's also useful for monitoring: high number
         | of warnings in the log can cause a non-critical notification to
         | operators (during business hours; not a wake-me-up-at-2am
         | page).
        
       | jxramos wrote:
       | This is the first time I've ever learned about that `extra`
       | argument to logging calls
       | 
       | Interesting capabilities. There's a rundown with log output
       | examples seen in
       | https://docs.python.org/3/library/logging.html#logging.Logge...
        
       | mekster wrote:
       | Does anyone ever log in multi dimensional way at all?
       | 
       | Everyone seems to just go,                 logger.info('Doing
       | that')
       | 
       | But logs really should be covering the context instead of being
       | single dimensional by being placed at the execution point.
       | //Sorry, I don't know python well but trying to run a lambda as a
       | parameter.       myLogger.info('Doing that', lambda : (
       | doThat()         whatNot()))
       | 
       | And then, you know exactly what the log is covering as well as
       | take a benchmark of the said parts and report how long it took to
       | run.
       | 
       | You don't need 'Done that' logging as if you throw inside, the
       | log doesn't happen.
       | 
       | It seems no logging libraries do that but it's an essential part
       | of how I do logging.
        
       | aunty_helen wrote:
       | "%(asctime)s - %(levelname)-5s - %(name)-20.20s - %(message)s -
       | (%(filename)s:%(lineno)s)"
       | 
       | Here's a logging format that I use and probably you should too.
       | Among other things, it gets around most of the context issues by
       | outputting the file and line number of the code that called the
       | log.
       | 
       | Before I would try and make every log message unique or mostly
       | unique so there weren't double ups.
       | 
       | I don't agree with the authors logging clutter though. If you're
       | manually eyeballing the logs looking for a message you're already
       | doing it wrong. Use search, have ids in the messages etc.
       | 
       | Remember folks, debug logging on because if it's not logged, when
       | it breaks you don't have the logs...
        
         | LtWorf wrote:
         | If I can't query for warning level on journalctl it's useless.
        
           | aunty_helen wrote:
           | That would be the levelname part. But if your log aggregator
           | or subsystem is hindering your ability to use your logs, I
           | would look to solve that problem before worrying about
           | anything within your application.
           | 
           | Even storing logs in a format that you need to decode or
           | download would be a red flag for me. Seconds of debugging all
           | of a sudden turns into minutes or even hours.
        
       | connorbrinton wrote:
       | Another great project for Python logging is loguru:
       | https://loguru.readthedocs.io/en/stable/overview.html
       | 
       | One of my favorite features of loguru is the ability to
       | contextualize logging messages using a context manager.
       | with logger.contextualize(user_name=user_name):
       | handle_request(...)
       | 
       | Within the context manager, all loguru logging messages include
       | the contextual information. This is especially nice when used in
       | combination with loguru's native JSON formatting support and a
       | JSON-compatible log archiving system.
       | 
       | One downside of loguru is that it doesn't include messages from
       | Python's standard logging system, but it can be integrated as a
       | logging handler for the standard logging system. At the company
       | where I work we've created a helper similar to Datadog's ddtrace-
       | run executable to automatically set up these integrations and log
       | message formatting with loguru
        
         | codethief wrote:
         | Does loguru come with a lot of global state like the standard
         | logging library? In my experience this makes logging a real
         | PITA when forking multiple processes...
        
         | isoprophlex wrote:
         | Cannot recommend loguru highly enough. After a coworker showed
         | it to me, I fell in love immediately.
         | 
         | I'll never go back to that rancid pile of shit that is the
         | builtin logger lib.
        
       | memco wrote:
       | I mentioned this almost every time logging and python come up but
       | one of my favorite feature is logging.exception:
       | https://docs.python.org/3/library/logging.html#logging.Logge....
       | It allows you to attach the whole exception with stack trace
       | without having to explicitly used "as e" and then use traceback
       | or embed the error in the message. Something like
       | logger.exception("Failed to connect.") will print the message and
       | the exception in one go.
        
         | atorodius wrote:
         | I'm using this too, but I always found it very unreadable for
         | people unfamiliar with it. It's so surprising that it catches
         | the exception for you! I end up always adding a commend
         | 
         | ``` logger.exception("Failed to connect.") # Will print
         | exception. ```
        
         | Narushia wrote:
         | I second this, very useful. I kind of expected the article to
         | mention it, but it didn't!
        
       | [deleted]
        
       | kordlessagain wrote:
       | As a logger in a previous lifetime, I approve of this content.
        
       | orf wrote:
       | The Python logging module is one of the worst modules in the
       | stdlib. You can see it's Java inspiration all over it, as being
       | one of the oldest modules it was created before anyone really
       | knew what "pythonic" code was.
       | 
       | Things I really dislike about it is the lack of "context".
       | Usually log messages are about something - often a request. You
       | can't attach a request ID or something else a nested set of
       | function calls that log.
       | 
       | The whole library is just really rubbish and has not aged well.
       | There's no standard way of outputting JSON logs, but there's a
       | built in way to spawn a socket server that allows the logging
       | configuration to be updated remotely via a bespoke protocol? And
       | everything is built mostly around logging to files.
       | 
       | It also hits that perfect sweet spot of being both overly
       | engineered and totally inflexible. I wanted to funnel our log
       | messages to s3 via the "s3fs" module, which supports a "file
       | like" object that gets written to s3. I had to hack around the
       | various handlers to support this because it assumes it's a "real"
       | file.
       | 
       | Hate it.
        
         | [deleted]
        
         | systemvoltage wrote:
         | Agreed. One of the most confusing libraries with a ton of magic
         | going on behind the scenes. I also detest PyTest, although not
         | a std lib, pytest is one of the worst in terms of explicitness.
         | It's very useful but that's orthogonal. I've seen newcomers to
         | python struggle to grasp how pytest just manipulates a billion
         | things behind the scenes.
         | 
         | Overall, I prefer less magic and more explicitness.
        
           | simonw wrote:
           | Disagree on pytest: it's the first testing framework I've
           | used in ANY language which has genuinely made me enjoy
           | writing tests.
           | 
           | It has a bit of an initial learning curve, but once you
           | understand how fixtures work it's amazing how productive you
           | can be with it.
        
             | [deleted]
        
             | miohtama wrote:
             | The implicit magic makes writing pytest tests painless.
             | 
             | Also with improvement of type hinting, tooling like PyCharm
             | starts to understand pytest better and make it easier to
             | read and manage and a lot of the magic can be dispellt by
             | written out explicitly. Though this need some discipline
             | from the dev team.
        
           | codethief wrote:
           | Strong agree on all points! The number of times I had to
           | debug into the pytest source code because developers (the
           | ones using pytest) thought it would be cool to use some
           | decorator black magic to manipulate global state behind the
           | scenes which pytest not just makes very easy but actually
           | encourages and which was obviously going to break sooner or
           | later...
           | 
           | And regarding the sibling comment: Yes, I prefer unittest as
           | well.
        
           | oefrha wrote:
           | PSL unittest is very explicit and has no magic. Ironically it
           | also has Java written all over it. You must be the first
           | person I met who prefer unittest to pytest.
        
             | dragonwriter wrote:
             | > PSL unittest is very explicit and has no magic.
             | Ironically it also has Java written all over it.
             | 
             | By "Java written all over it" do you mean that like JUnit
             | for Java, it's an implementation of the xUnit pattern
             | derived from SUnit for Smalltalk?
        
               | [deleted]
        
               | oefrha wrote:
               | It didn't have to copy everything down to the camelCase
               | if it was just an implementation of the xUnit pattern
               | with no Java influence.
        
               | dragonwriter wrote:
               | camelCase is idiomatic in Smalltalk, which as a language
               | and a community was a pretty big influence on other
               | dynamic OO languages like Python and Ruby, and from
               | whence SUnit comes, and thus the whole xUnit world,
               | derives.
               | 
               | So, it's maybe not the best evidence that the Python
               | version is particularly a clone of the Java port.
        
             | txru wrote:
             | I sometimes prefer unittest too!
             | 
             | Pytest is very opinionated about how and when it's going to
             | be run. If you want to put a test in say, Jupyter or a
             | Markdown block, unittest is much more amenable to just
             | doing what you want without having to read through
             | documentation to break a whole bunch of magic behavior.
        
           | bornfreddy wrote:
           | What do you mean? Pytest rules! Yes there is some magic, but
           | if you let it take control, it is very convenient to write
           | and run tests with it. Just... you need to trust it and it
           | will do its job.
        
             | systemvoltage wrote:
             | I use pytest all the time. Doesn't mean it is well thought
             | out. It could have been so much better with more visibility
             | and control, and less magic.
        
         | akhmatova wrote:
         | _The Python logging module is one of the worst modules in the
         | stdlib_.
         | 
         | One has to remember that logging was never intended to be some
         | infinitely generalizable metaprotocol for massively streaming
         | parallel notifications to the cloud or whatever. It was meant
         | as a replacement for "if (DEBUG > 4): print(...)". For which I
         | think it does quite well, thank you.
         | 
         |  _The whole library is just really rubbish and has not aged
         | well. Hate it._
         | 
         | So what have you written that has made people's (not just your
         | bosses' or your clients') lives easier, for nearly 20 years now
         | (if only incrementally)? Do share.
         | 
         | I see and grant your points, but I find this strong emotional
         | rebuke of something that was never intended to be anything more
         | than a simple convenience library to be well, strange.
        
           | orf wrote:
           | It clearly is supposed to be more than a convenience library.
           | It's _the_ blessed, stdlib way to do _logging_ with Python.
           | The module isn't called "print_with_levels".
           | 
           | So now everything needs to work with it, it's internals and
           | it's way of doing things. And to top it off it's near
           | impossible to refactor or replace.
        
             | akhmatova wrote:
             | Is there an alternative that you would prefer to bless?
        
         | thechao wrote:
         | I can't agree with you more. I switched to Python's logger off
         | of my janky thing about 6 months ago, and I'm hating every
         | second of it. I still have no clue how to turn it on, turn it
         | off, get it to fork output (stdout, stderr, file), nor how to
         | override its behavior.
         | 
         | It's literally easier to just have an IOBuffer laying around.
        
           | oefrha wrote:
           | There are valid complaints in the grandparent, but if you
           | can't figure out how to fork output to stdout, stderr and
           | file in six months, that's really on you. You should know how
           | to register multiple handlers if you spent five minutes
           | reading the documentation.
        
             | thechao wrote:
             | I can easily fork output _manually_ ; I can't do it
             | reliably using Python's logger.
        
               | maleldil wrote:
               | Echoing grandparent's comment, it's not that hard:
               | 
               | https://docs.python.org/3/library/logging.html#logging.Lo
               | gge...
               | 
               | For dictConfig, the documentation has examples
               | 
               | https://docs.python.org/3/library/logging.config.html#obj
               | ect...
               | 
               | Python's logging has its faults, but you're complaining
               | about something I could find in 5 minutes of Google.
        
               | thechao wrote:
               | None of which work reliably across multiple modules all
               | fighting for control of the logger. Reliable is the key
               | term, here.
        
               | slaymaker1907 wrote:
               | Eh, it's really not that bad. I implemented this for a
               | tool at work last year and it seemed pretty
               | straightforward.
        
         | paulfurtado wrote:
         | It's definitely sad that there isn't built-in context support,
         | but if you're looking for a solution to this, it is actually
         | very easy to attach context to logs using threadlocal and a
         | logging filter.
         | 
         | When running a webserver though, an even simpler trick is to
         | add middleware that sets the current request's info as the
         | current thread's name, and then including threadName in your
         | log format.
        
           | andrewingram wrote:
           | ContextVars are seemingly more robust and work with async. We
           | use them quite extensively for adding context to logs.
        
             | MapleWalnut wrote:
             | TIL that ContextVars can work in place of thread locals
             | too. Thanks for pointing that out!
             | 
             | I had mistakenly assumed ContextVars only worked for async
             | code.
        
             | nerdwaller wrote:
             | I was going to say the same, I combine ContextVars with
             | LoggerAdapters[1].
             | 
             | [1] https://docs.python.org/3/library/logging.html#logging.
             | Logge...
        
           | 1337shadow wrote:
           | Take a look at pylogctx maybe it'll do what you need.
        
           | memco wrote:
           | The contextual support is lacking but it can be done. See
           | this comment from a previous discussion where I briefly
           | described how I've used the record factory to help:
           | https://news.ycombinator.com/item?id=31484527
        
         | helsinki wrote:
         | It's rock solid, at least. I've also had the displeasure of
         | working in the logging package. Totally outdated. I created a
         | JSON formatter (also support for context). Huge pain, but
         | always works.
        
         | 1337shadow wrote:
         | Maybe worth trying pylogctx and python-logging-loki or see how
         | it's done there.
        
         | doliveira wrote:
         | > there's a built in way to spawn a socket server that allows
         | the logging configuration to be updated remotely via a bespoke
         | protocol
         | 
         | That sounds like there's a Log4shell-like vulnerability waiting
         | to be found. But I couldn't even find it, I found the
         | SocketHandler but that's just a log destination.
         | 
         | EDIT: shockingly, I did find it:
         | https://docs.python.org/3/howto/logging-cookbook.html#config...
        
         | emmelaich wrote:
         | That might explain why it lacks a NOTICE level. That bugged the
         | hell out of me with Java.
         | 
         | Presumably these levels derive from syslog or older. I've no
         | idea why they omitted NOTICE.
        
           | miohtama wrote:
           | You can quite easily add your own logging levels - it is not
           | hardcoded. I think it's even covered in the documentation, as
           | I have done it before.
        
         | miohtama wrote:
         | Not to defend the legacy of good old logging, but if you have
         | ever tried to log in Node.js applications Python logging starts
         | to feel godsent.
        
         | rgovostes wrote:
         | I've been writing Python daily for 15 years but when I have to
         | do any configuration beyond `basicConfig()`, I still groan. In
         | many instances I have wanted to do something like temporarily
         | turn on logging of HTTP requests, which turns into a much more
         | involved effort than it ought to be.
         | 
         | I just recently discovered that there's no documented way to
         | dump the current configuration
         | (https://stackoverflow.com/q/72624883) so you can see what
         | loggers exist and where their output is going.
        
         | drzoltar wrote:
         | Are there any good 3rd party alternatives that work better?
        
           | Jamie9912 wrote:
           | loguru perhaps
        
             | sage92 wrote:
             | This is the only library I that both can use out of the box
             | and extend to my needs.
        
           | seabrookmx wrote:
           | structlog
        
             | delijati wrote:
             | You always need some boilerplate to get structlog working.
             | I always use this (author) to have the needed defaults
             | https://github.com/peakiq/logma.
        
             | raverbashing wrote:
             | Yeah, it exists
             | 
             | But IMO it inherited most of loggers problems while not
             | offering a better interface
             | 
             | It is basically a kwargs to json or 'key=value' converter
             | 
             | (Also json logs are overrated. There I said it. Especially
             | one key per line, please don't)
        
         | formerly_proven wrote:
         | It's unfortunate how blind copying of log4j into many languages
         | seems to have inhibited improvements in logging in the last
         | ~20ish years.
        
         | MichaelRazum wrote:
         | Yep. Have to agree. It works, but is really messy.
        
         | LtWorf wrote:
         | I normally just import syslog or journald. The latter allows
         | for arbitrary fields to be defined and then searched, but it's
         | not in the stdlib.
        
           | 1337shadow wrote:
           | Great idea but how do you work this out in a container where
           | the python process is PID 1?
        
             | LtWorf wrote:
             | bind mount /dev/log in the container of course.
             | 
             | Also logging can be set up remotely, but it's easier to do
             | like that from the container and then configure the hos to
             | send everything to a single log machine.
        
         | sorokod wrote:
         | For whatever it's worth, java logging has mapped diagnostic
         | context (MDC).
        
       | efitz wrote:
       | Very good advice.
       | 
       | The advice I give to people is "think about logging like a
       | feature with customers, just like any other feature." If you
       | think about logging this way, you ideally put yourself in the
       | position of your logs' "customer"- a bleary-eyed teammate who
       | just got paged out of bed at 2am, or a security guy who isn't
       | intimately familiar with your code. Providing extra context for
       | those users is really kind, as they don't have a "secret decoder
       | ring".
       | 
       | Since the article discusses it, I also highly recommend AWS
       | CloudWatch logs. It's super easy to log to CloudWatch from Lambda
       | with the Python logger class. It's a little harder from EC2, but
       | not really that difficult. The advantage is that you don't need
       | to SSH somewhere and navigate to a file and then locate the
       | interesting stuff in that file - You just keep the CloudWatch
       | console open (or keep a link to it handy) and the latest stuff is
       | on top with easy filtering.
        
         | pid-1 wrote:
         | Cloudwatch is cheap (even free) for small environments and has
         | very small operational overhead.
         | 
         | The cost, however, can easily get out of hand and AWS does not
         | have per resource billing enabled by default. So it might not
         | be obvious which Log Group is responsible for for which charge.
         | 
         | I also think its console is way inferior than something like
         | Grafana for searching and visualisation.
        
         | jstarfish wrote:
         | My standard, to the extent possible: make your logs clear
         | enough that your _attorney_ would be able to understand them.
         | 
         | You may well find yourself wishing you did this if you also
         | output DEBUG logs in production.
        
         | dragonwriter wrote:
         | > It's super easy to log to CloudWatch from Lambda with the
         | Python logger class
         | 
         | Or even just print(): You've got a unique log group for a
         | lambda, CloudWatch logs autonatically timestamps entries, and
         | stdout & stderr from a lambda both go to the current CloudWatch
         | log stream in the lambda's CloudWatch logs group.
        
           | efitz wrote:
           | My experience with Python in Lambda was that my functions ran
           | significantly faster when I used logger rather than print for
           | logging. I agree that Lambda captures stdout to CloudWatch,
           | but my experience was that logger was better (and more
           | intentional).
           | 
           | ed: clarified that this relates to AWS Lambda only
        
             | [deleted]
        
         | AmpsterMan wrote:
         | I don't know if it's the way cloud watch works in my current
         | company or what, but I preferred SSH into the server and grep
         | for what I want lol.
         | 
         | I miss that I can't do that any longer
        
           | dygd wrote:
           | Using the official CLI (aws logs get-log-events) or
           | https://github.com/jorgebastida/awslogs is pretty close to
           | SSH-ing and grepping.
        
             | AmpsterMan wrote:
             | Oh thanks I'll have to give that a shot! It may also just
             | be that I need to not get so caught up in learning a new
             | way to do something.
        
           | efitz wrote:
           | I hear you, and when I did some research on this in the past,
           | going to get logs was the most common reason for engineers to
           | SSH to production hosts.
           | 
           | As a security guy, I hate SSH to production (the whole
           | "cattle, not pets" thing). In my last company we had an
           | internal tool to federate you to the AWS console. We had
           | runbooks in a wiki, and had links literally to the logs for a
           | particular component/service/region - the link would federate
           | you to the right account and take you directly to the target
           | log in cloudwatch logs in the appropriate region. Safer and
           | easier than ssh-to-prod.
        
       | alanwreath wrote:
       | given you've configured logging somewhere early on in the
       | bootstrap phase of your app - there's really few reason to go up
       | and above having more than these two lines in all your logging
       | modules                 import logging       logger =
       | logging.getLogger(__name__)
       | 
       | But boy oh have I seen some funny business where people attach
       | the logger to their objects like so                  class Wat:
       | def __init__(self):            self.logger =
       | logging.getLogger(...)
       | 
       | All for what? For WAT?!
        
         | 1337shadow wrote:
         | To inherit the logger attribute in subclasses?? Save an import?
        
         | Nimitz14 wrote:
         | That is a super common mistake from beginner python
         | programmers.
        
       | hotfixguru wrote:
       | Best way to find logs connected to one of your API requests is by
       | using correlation IDs, which are injected to each log message. Me
       | and my friend wrote libraries to do this for both Django[1] and
       | ASGI apps(starlette, FastAPI)[2]
       | 
       | [1] https://github.com/snok/django-guid
       | 
       | [2] https://github.com/snok/asgi-correlation-id
        
       ___________________________________________________________________
       (page generated 2022-07-01 23:00 UTC)