[HN Gopher] Guide to Concurrency in Python with Asyncio
       ___________________________________________________________________
        
       Guide to Concurrency in Python with Asyncio
        
       Author : LiamPa
       Score  : 130 points
       Date   : 2020-05-24 06:18 UTC (16 hours ago)
        
 (HTM) web link (www.integralist.co.uk)
 (TXT) w3m dump (www.integralist.co.uk)
        
       | theelous3 wrote:
       | Or just use curio or trio and have an infinitely better time with
       | async in python :)
        
         | nurettin wrote:
         | I see no reason to use curio or anything instead of standard
         | asyncio. Works really well for me. Queue tasks, they finish at
         | their leisure, create futures and locks to control flow, all
         | really easy and bugless code. And you get async redis, async
         | http, a lot of asyncio functions for free. So no, I will not
         | change it for some "really well architected library that takes
         | away all your pains" because there aren't any.
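
The workflow described above (queue tasks, let them finish at their leisure, use locks to control flow) can be sketched with the standard library alone. This is a minimal illustration, not the commenter's code: the `worker` and `main` names, the doubling "work", and the `asyncio.sleep(0)` stand-in for real I/O are all assumptions.

```python
import asyncio


async def worker(name, queue, results, lock):
    """Pull items off the queue until cancelled."""
    while True:
        item = await queue.get()
        try:
            await asyncio.sleep(0)  # stand-in for real I/O (an HTTP or Redis call)
            async with lock:  # serialize access to the shared list
                results.append((name, item * 2))
        finally:
            queue.task_done()


async def main():
    queue = asyncio.Queue()
    lock = asyncio.Lock()
    results = []

    for item in range(5):
        queue.put_nowait(item)

    workers = [
        asyncio.create_task(worker(f"worker-{i}", queue, results, lock))
        for i in range(2)
    ]

    await queue.join()  # wait until every queued item has been processed
    for task in workers:
        task.cancel()  # workers loop forever, so stop them explicitly
    await asyncio.gather(*workers, return_exceptions=True)

    return sorted(value for _, value in results)


doubled = asyncio.run(main())
```

The lock is not strictly needed for a plain `list.append` on a single-threaded loop; it is included only to show the primitive the comment mentions.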
        
           | gjvc wrote:
           | What do you use for HTTP message handling?
        
             | nurettin wrote:
             | Well, there is no request router in asyncio-http, so I have
             | to use async views in django or if I need something
             | lightweight I go with Quart. But if I was given the proper
             | time, I'd probably invent my own routing on top of asyncio-
             | http using google's re2 to parse the REST requests like I
             | did here some years ago. https://github.com/nurettin/pwned/
             | blob/master/server/server....
        
           | downerending wrote:
           | The reason is that _curio_ is well-designed and has a
           | reasonably compact story for how things work. It fits in my
           | head. And the "cool new things I can do with it" to "how
           | hard is it to learn" ratio is pretty high.
           | 
           | Asyncio on the other hand, feels like being beaten with a
           | stick. It's so complicated that normal humans are never
           | really going to understand it all.
           | 
           | On top of that, it's been morphing all through the 3.x
           | series, so there isn't even one spec to learn--it's more like
           | a set.
        
         | gjvc wrote:
         | curio is by David Beazley, [1] which should be reason enough to
         | investigate it.
         | 
         | [1] once memorably dubbed "the people's champion" in the
         | introduction to one of his PyCon speeches.
        
       | michalc wrote:
       | Shameless plug for what is essentially my own much shorter intro
       | to asyncio:
       | https://charemza.name/blog/posts/python/asyncio/I-like-pytho...
        
       | mkchoi212 wrote:
       | Every time I see concurrency and python together, I'm immediately
       | turned off by it. First of all, it probably won't be a true
       | "concurrency" due to GIL and by the point you are looking for
       | "advanced" libraries that allow you to do whatever you are trying
       | to do, maybe it's time to switch languages.
        
         | nurettin wrote:
         | Did you mean: not true parallelism?
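
The distinction drawn here can be illustrated with a small sketch: two coroutines make progress in an interleaved fashion (concurrency) while everything runs on one thread (no parallelism). The `step_through` helper and the log format are hypothetical names chosen for the example.

```python
import asyncio
import threading


async def step_through(name, log, thread_ids):
    for i in range(2):
        log.append(f"{name}{i}")
        thread_ids.add(threading.get_ident())
        await asyncio.sleep(0)  # yield control back to the event loop


async def main():
    log, thread_ids = [], set()
    await asyncio.gather(
        step_through("a", log, thread_ids),
        step_through("b", log, thread_ids),
    )
    return log, thread_ids


log, thread_ids = asyncio.run(main())
# The two coroutines alternate, yet only one thread ever ran them.
```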
        
       | mcdermott wrote:
       | Sorry, but after using Golang with its very simple and powerful
       | goroutines and channels approach to concurrency, this seems like
       | a convoluted mess. When I need concurrency, I certainly don't
       | think of Python as the right tool.
       | 
       | Python seems to have lost its way after the 2 to 3 shift and is
       | no longer what I'd consider "pythonic".
        
         | wegs wrote:
         | I think of programming language design as having a large random
         | component. New languages pop up every day, and it's really hard
         | to predict which constructs will resonate well with human
         | brains.
         | 
         | Guido was competent, but not nearly as brilliant as, for
         | example, Larry, but Python 2 was dramatically better than Perl.
         | I think this is mostly by chance. Hundreds of new languages
         | come out every year, and by whatever fluke, a few of them
         | really work. Python 2 was that.
         | 
         | In an ecosystem like that, past performance is no indication of
         | future performance. There are very few language designers who
         | did well more than once (Guy Steele is the only one I can think
         | of who wasn't a one-hit-wonder).
         | 
         | So I too felt like Python 3 was more of a step backwards than
         | forwards linguistically. But at this point, it has features I
         | need which Python 2 lacks, so I'm mostly working in Python 3.
         | But it definitely feels (unnecessarily) less clean.
        
         | chrstphrhrt wrote:
         | If you want to integrate with the data science ecosystem, would
         | you use modern Python for APIs as well, or do Go services with
         | minimal Python bits?
         | 
         | EDIT: coming from someone who has had great experiences with
         | FastAPI and Uvicorn combined with Dask and Redis.
        
       | adonese wrote:
       | There's a really interesting blog post, which I can't quite
       | recall, about nodejs's async concurrency style as compared to
       | Golang's. I cannot remember the author's name, but it goes into
       | the differences between regular functions and async ones at
       | length, including that an async function can only be called from
       | an async scope, and so on. I only remember that I found it
       | retweeted by Brad Fitz. I thought maybe someone here could point
       | me to it.
        
       | css wrote:
       | One thing I haven't seen any blogs write about is that
       | multiprocessing in 3.8.x uses `spawn()` and not `fork()`[0] on
       | macOS. Granted, most applications are not running on OS X Server,
       | but an update that changes a low-level API like that led to some
       | issues where code fails when run locally but works when run
       | remotely. The following code will run anywhere except on macOS on
       | 3.8.x, where it crashes with `_pickle.PicklingError` as it tries
       | to resolve the name `func` that is not in the new stack:
       | 
       |     import multiprocessing
       |     some_data = {1: "one", 2: "two"}
       |     func = lambda: some_data.get(2)
       |     process = multiprocessing.Process(target=func)
       |     process.start()
       | 
       | [0]: https://bugs.python.org/issue33725
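
A hedged sketch of why the snippet above fails under `spawn`: the child process is a fresh interpreter, so the `Process` target has to be pickled, and a lambda cannot be. The check below only exercises `pickle` and does not start a process; `is_picklable`, `lambda_target`, and `partial_target` are hypothetical names. Note that `spawn` is also the only start method on Windows, so the same constraint applies there.

```python
import functools
import operator
import pickle

some_data = {1: "one", 2: "two"}


def is_picklable(obj):
    """Return True if `obj` survives pickling, as a spawned child requires."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


# A lambda has no importable name, so pickle cannot serialize it by reference.
lambda_target = lambda: some_data.get(2)

# One workaround built only from picklable pieces: a partial over a builtin
# function plus plain data. A module-level `def` also works in a script.
partial_target = functools.partial(operator.getitem, some_data, 2)

lambda_ok = is_picklable(lambda_target)
partial_ok = is_picklable(partial_target)
restored = pickle.loads(pickle.dumps(partial_target))
```

In a real script the more common fix is probably a named module-level function passed as `target`, with the launch code under an `if __name__ == "__main__":` guard.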
        
         | jamestimmins wrote:
         | I'm in the process of adding 3.8 support to Airflow. This was
         | the first (but not the last) obstacle in doing so.
        
       | birdyrooster wrote:
       | To those trashing Python in favor of Golang: A compiled language
       | with no OO is not a replacement for Python. Let's talk about the
       | complexity of putting code generators in all of your projects.
       | I've seen real golang projects and the complexity gets moved into
       | your repo.
        
       | dilandau wrote:
       | Asyncio detractors have been drowned in a stream of assurances
       | that, really, it's not that bad.
       | 
       | Obviously it is, or this wouldn't be post #800 that promises to
       | finally make asyncio clear to newcomers.
       | 
       | Twisted sucks. It was a joke. Now it's been tacitly blessed to
       | the point you can sprinkle the `async` keyword all over the
       | place. Good luck. I hope you don't forget about any spots.
       | 
       | I truly think part of the reason Guido left is because he
       | couldn't defend this shit.
        
         | nessunodoro wrote:
         | Not to stoke the flame, but I never used Twisted, where did it
         | fail?
        
         | birdyrooster wrote:
         | Yes, Twisted had its drawbacks, but that was well over 15 years
         | ago. Asyncio is literally a couple of keywords and a library
         | with a few classes you will use for everything. No more
         | callback/errback chaining or function decorators on every single
         | method for inline callbacks.
         | 
         | No one in these comments is offering any palatable
         | alternatives. A compiled language with no OO is not a
         | replacement for Python's Asyncio.
        
       | jonahbenton wrote:
       | Forgive me, but this is such a ball of mud. All of the "easy"
       | introductions into these primitives are basically whitepaper
       | length, with long digressions into the high-level vs the low-
       | level and historical vs modern patterns. And nothing gets into
       | error cases or problems with cooperative scheduling or
       | debugging/troubleshooting or where the GIL applies. In 5 years,
       | debugging the rat's nest of concurrent python code that people are
       | writing right now will make clear that go really got this right.
       | So sad that python did not.
       | 
       | Edit: elsewhere on HN right now:
       | 
       | https://nullprogram.com/blog/2020/05/24/
        
         | pgwhalen wrote:
         | > go really got this right
         | 
         | It's also exciting to see that Java has taken the same view,
         | and is well on its way towards delivering it
         | (https://wiki.openjdk.java.net/display/loom), or even improving
         | on it (https://wiki.openjdk.java.net/display/loom/Structured+Co
         | ncur..., see also: https://vorpus.org/blog/notes-on-structured-
         | concurrency-or-g...)
        
         | sys_64738 wrote:
         | The first design constraint with so much python development is
         | the battle to stay single threaded. As you point out, debugging
         | concurrent python code is horrendous.
        
           | downerending wrote:
           | To be fair, debugging _all_ concurrent code is horrendous. If
           | you can use multiple processes instead of multiple threads,
           | you should.
        
             | ricw wrote:
             | Absolutely the worst. Ten years ago I encountered a bug in
             | a wireless networking stack that would crash the whole
             | network of nodes. It was in the worst possible location
             | that was highly dependent on timed sending where every
             | nanosecond mattered. This meant I could only use LEDs to
             | indicate what was going on, as a print or anything of the
             | like would screw the timers enough to break the network. Can't
             | actually remember what caused it in the end.
             | 
             | It took me 6 weeks to debug and fix. I did nothing but
             | debugging at the most primitive level. Probably could have
             | done it more efficiently, but I was young and
             | inexperienced.
             | 
             | Worst and most boring 6 weeks of my life caused by
             | concurrent debugging.
        
           | wegs wrote:
           | I think this could be done well, single-threaded, but Python
           | has reinvented the wheel a half-dozen times now, and I don't
           | think any of the wheels are that great. Having them all in
           | the system at the same time, and without great connections,
           | is what makes this a ball of mud.
           | 
           | There's very little glue between them. And none of it is duck
           | typed, like the rest of Python. If I switch between them, I
           | need to rethread the entire system.
        
         | devxpy wrote:
         | Why would the GIL apply to asyncio, which is strictly single
         | threaded?
        
           | zomglings wrote:
           | The post also discusses concurrent.futures.
        
             | devxpy wrote:
             | Can't you just use the ProcessPoolExecutor[1] for CPU heavy
             | stuff? AFAIK, GIL isn't much of an issue with I/O tasks.
             | 
             | [1] https://docs.python.org/3.8/library/concurrent.futures.
             | html#...
        
               | zomglings wrote:
               | You are right, GIL isn't an issue with I/O-bound tasks.
               | 
               | The article didn't limit its focus to I/O, though. It
               | felt more like an explanation of modern concurrency and
               | asynchronous options in Python (besides just threading
               | and multiprocessing).
               | 
               | The only practical issue with the GIL in Python is that
               | it forces you to use new (heavy) Python processes to
               | parallelize computationally heavy tasks. Not so much of a
               | concern depending on your application, but it is a gotcha
               | for people new to Python.
               | 
               | The real issue, though, as your parent poster mentioned,
               | is the number of different ways to access asynchronicity
               | and concurrency in modern Pythons. It really does run
               | counter to Python PEP20 [0] and can lead to some
               | communication difficulties among developers.
               | 
               | [0] https://www.python.org/dev/peps/pep-0020/
        
               | devxpy wrote:
               | Just an aside from the argument here, I think almost no
               | python API actually follows the zen of python. It's
               | certainly a nice idea, just incredibly hard to actually
               | live up to in the real world.
        
             | joshuamorton wrote:
             | Which is also single (OS) threaded.
        
               | devxpy wrote:
               | Why do you say that?
               | 
               | I believe concurrent.futures is meant as a way for
               | asyncio to spawn threads or processes -- which lets you
               | essentially call sync code from async and not have it
               | block the event loop.
               | 
               | From the docs[1] -
               | 
               | The concurrent.futures module provides a high-level
               | interface for asynchronously executing callables.
               | 
               | The asynchronous execution can be performed with threads,
               | using ThreadPoolExecutor, or separate processes, using
               | ProcessPoolExecutor. Both implement the same interface,
               | which is defined by the abstract Executor class.
               | 
               | [1] https://docs.python.org/3.8/library/concurrent.future
               | s.html
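
The pattern described above, calling sync code from async without blocking the event loop, can be sketched with `loop.run_in_executor`. This is an illustration under assumptions: `blocking_read` is a hypothetical stand-in for a blocking library call, and `heartbeat` exists only to show the loop staying responsive.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_read(n):
    """Hypothetical blocking call, e.g. a sync client library."""
    time.sleep(0.05)
    return n * n


async def heartbeat(ticks):
    # Keeps running while the blocking calls are in flight, showing
    # that the event loop itself is not blocked.
    for _ in range(3):
        ticks.append(time.monotonic())
        await asyncio.sleep(0.01)


async def main():
    loop = asyncio.get_running_loop()
    ticks = []
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = [loop.run_in_executor(pool, blocking_read, n) for n in range(3)]
        results, _ = await asyncio.gather(asyncio.gather(*futures), heartbeat(ticks))
    return results, len(ticks)


squares, tick_count = asyncio.run(main())
```

For CPU-heavy work the same call accepts a `ProcessPoolExecutor` instead, provided the function and its arguments are picklable.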
        
               | joshuamorton wrote:
               | Python's threadpoolexecutor is still single OS-threaded.
               | There will only ever be one thread executing in parallel
               | (due to the GIL). Basically, in python, asyncio and
               | threadpoolexecutor are almost entirely mappable to each
               | other. Neither can accomplish more than the other.
               | 
               | Multi-process code can, of course, address some of these
               | issues, but it comes at the cost of spinning up multiple
               | OS processes and costly inter-process communication etc.
        
               | zomglings wrote:
               | The only issue I see re: the GIL is that
               | ThreadPoolExecutor (by its name) can mislead people into
               | thinking that they are using proper threads when in fact
               | they are still constrained by the GIL.
               | 
               | It's a confusion that has been quickly resolved every
               | time a team mate of mine has run into the issue, but I
               | have seen it happen multiple times (at different places).
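
A small sketch of the distinction at issue: in standard GIL-enabled CPython, `ThreadPoolExecutor` workers are separate threads that can all be blocked at the same time, but only one of them executes Python bytecode at any given moment. The code below demonstrates only the first half (distinct threads); it does not measure the GIL. `report_thread` and `WORKERS` are hypothetical names.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

WORKERS = 4
barrier = threading.Barrier(WORKERS, timeout=5)


def report_thread(_):
    # Every task blocks here until all four are waiting at once, which is
    # only possible if each task is running on its own thread.
    barrier.wait()
    return threading.get_ident()


with ThreadPoolExecutor(max_workers=WORKERS) as pool:
    thread_ids = list(pool.map(report_thread, range(WORKERS)))

distinct_threads = len(set(thread_ids))
```

Blocking waits such as `barrier.wait()`, `time.sleep()`, and socket reads release the GIL, which is why thread pools still help with I/O-bound work.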
        
         | j88439h84 wrote:
         | Agree, asyncio is a mess; Trio gets it right.
         | 
         | https://trio.readthedocs.io/
         | 
         | It's a much simpler model that's just as powerful, and makes it
         | easy to get things right. It eliminates the concepts of
         | futures, promises, and awaitables. It has only one way to wait
         | for a task: await it.
         | 
         | For a theoretical explanation of its "structured concurrency"
         | model see the now-famous essay https://vorpus.org/blog/notes-
         | on-structured-concurrency-or-g...
        
           | jonahbenton wrote:
           | Thanks for the pointer, hadn't heard of trio. Will check it
           | out.
        
         | brian_cloutier wrote:
         | It is absolutely a ball of mud; asyncio is an embarrassment and
         | what these "easy" introductions never get around to telling you
         | is that it's essentially impossible to write correct asyncio
         | code.
         | 
         | Luckily, Trio now exists. Trio has a consistent and simple
         | story for cleanly handling errors and cancellations. Debugging
         | is still a little difficult, but at least code written with
         | Trio has radically fewer problems to debug.
         | 
         | Here's a link on the Trio design:
         | https://trio.readthedocs.io/en/stable/design.html
         | 
         | There are some more great articles and talks, but I don't have
         | any links at the ready, sorry.
        
         | HumblyTossed wrote:
         | I'm a long time fan of Python (and still am), but I completely
         | agree that Go got concurrency right.
        
       | parhamn wrote:
       | This looks great. I wish python would have made the event loop
       | generally more transparent to the end user. Why didn't they just
       | use a single global loop? Sorta like javascript and golang. It
       | would have been more pythonic too. Anyone doing heavier context
       | switching could've had their own loop management.
       | 
       | Making the event loop self-managed added a ton of clunkiness in
       | aio apis (use-your-own-loop, loop lifetime management, etc) and
       | becomes mentally complex for newcomers. There's also the issue
       | that huge aio frameworks have emerged that rewrite the same TCP
       | clients and differ only in their loop management patterns.
        
         | jordic wrote:
         | Since version 3.7 you don't need to touch the loop for anything,
         | and some old signatures have deprecated the loop param... It's
         | almost the same as if you work with node, but with a bit more
         | sugar (tasks, gather, wait, wait_for). When you work with
         | asyncio daily it's so natural and fun.
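
A minimal sketch of the loop-free style mentioned above, using only `asyncio.run` plus the helpers named in the comment (`create_task`, `gather`, `wait_for`). The `fetch` coroutine and its delays are hypothetical; the sleep stands in for a network call.

```python
import asyncio


async def fetch(name, delay):
    """Hypothetical I/O-bound operation; the sleep stands in for a network call."""
    await asyncio.sleep(delay)
    return name


async def main():
    # create_task schedules a coroutine to run in the background.
    background = asyncio.create_task(fetch("background", 0.01))

    # gather runs several awaitables concurrently and keeps their order.
    gathered = await asyncio.gather(fetch("a", 0.02), fetch("b", 0.01))

    # wait_for puts a deadline on a single awaitable.
    try:
        await asyncio.wait_for(fetch("slow", 1.0), timeout=0.05)
        timed_out = False
    except asyncio.TimeoutError:
        timed_out = True

    return await background, gathered, timed_out


# No explicit loop object anywhere: asyncio.run creates and closes it.
result = asyncio.run(main())
```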
        
           | skrtskrt wrote:
           | Agreed, Python's async model is the easiest one for me to
           | think about by far and there are so many high quality libraries.
        
             | t-writescode wrote:
             | Have you used C#'s or Akka's or Go's?
             | 
             | I find 2 of those to be easier to reason about and all to
             | be superior.
        
               | jordic wrote:
               | yes, goroutines are fast and easy... but this thread is
               | about python. I love both languages, and go is fast and
               | safe, but, I also like python (it's fast developing... )
        
         | nemothekid wrote:
         | > _Why didn't they just use a single global loop? Sorta like
         | javascript and golang._
         | 
         | I imagine the problem, which neither JS nor Golang has to deal
         | with, is that Python supports plain old threads. If you have a
         | single global event loop, and then you try to execute a future,
         | what should happen?
         | 
         | 1. Should every thread have its own loop? Then how would you
         | send futures to another thread?
         | 
         | 2. Should the global loop only exist on a single thread? Then
         | you would have to pay a synchronization tax every time you
         | pushed a task to the loop.
         | 
         | Not saying the Python solution is 100% right (I've only thought
         | about it for 5 minutes), but I can see how being compatible
         | with threads has caused this situation. I believe the default
         | now is you don't have to think about loops, and Python does the
         | right thing for 99% of use cases.
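
For the "how would you send futures to another thread" question, asyncio does provide a thread-safe entry point, `asyncio.run_coroutine_threadsafe`. The sketch below assumes a loop running on a dedicated daemon thread; `add` and `start_loop_in_thread` are hypothetical helpers written for the example.

```python
import asyncio
import threading


async def add(a, b):
    await asyncio.sleep(0)
    return a + b


def start_loop_in_thread():
    """Run a fresh event loop forever on a daemon thread and return both."""
    loop = asyncio.new_event_loop()
    thread = threading.Thread(target=loop.run_forever, daemon=True)
    thread.start()
    return loop, thread


loop, thread = start_loop_in_thread()

# Called from the main thread: the coroutine is handed to the loop's thread,
# and we get back a concurrent.futures.Future we can block on.
future = asyncio.run_coroutine_threadsafe(add(2, 3), loop)
total = future.result(timeout=5)

# Stopping must also go through the thread-safe entry point.
loop.call_soon_threadsafe(loop.stop)
thread.join(timeout=5)
loop.close()
```

Every cross-thread submission goes through the loop's thread-safe queue, which is the kind of synchronization cost the second option above alludes to.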
        
           | jordic wrote:
           | If you need to do CPU bound computations use a thread pool.
        
             | vazamb wrote:
             | Why this is being downvoted:
             | 
             | You need to use a process pool in Python, thread pools are
             | for I/O bound tasks and cannot use more than 1 core.
        
               | nemothekid wrote:
               | I didn't downvote him, but his response isn't at all
               | relevant to what was being discussed. I was discussing
               | the API design of asyncio and how they had to add the
               | "loop" parameter to ensure the reactor would be thread
               | safe by default.
        
         | bluedays wrote:
         | The more time I spend with Python the more reasons I find to
         | hate it.
         | 
         | However, it's still my main language. Everything I end up
         | making just ends up being written in Python because that's the
         | language I know the best.
         | 
         | I could learn another language, and I have learned other
         | languages, but Python seems to be the language that mentally
         | clicks with me the most. I think it's because it's the first
         | language I learned.
         | 
         | Suffice to say, there are so many things wrong with Python and
         | they're all compromises that were made to appeal to a wider
         | audience. Unfortunately, I don't think that the "wider
         | audience" is those who are looking for concurrency.
        
           | parhamn wrote:
           | We agree. I think python is great. I actually asked because
           | python usually makes things dead simple and straightforward.
           | 
           | > Unfortunately, I don't think that the "wider audience" is
           | those who are looking for concurrency.
           | 
           | I don't buy this. Many network heavy shops use Python so
           | concurrency is relevant. The language is used in web,
           | finance, cloud, data pipelining/etl, and so many more.
           | Dropbox, Robinhood, Spotify, Instagram, Uber, Google, etc,
           | are the "wider audience" and they certainly care about
           | concurrency.
        
           | maallooc wrote:
           | Me too. The more I learn about languages and computers, the
           | more I hate it, but I end up doing 85% of my projects in it
           | because with it I can focus on ideas, not implementation
           | details.
        
       ___________________________________________________________________
       (page generated 2020-05-24 23:00 UTC)