[HN Gopher] Guide to Concurrency in Python with Asyncio
___________________________________________________________________
Guide to Concurrency in Python with Asyncio

Author : LiamPa
Score  : 130 points
Date   : 2020-05-24 06:18 UTC (16 hours ago)

(HTM) web link (www.integralist.co.uk)
(TXT) w3m dump (www.integralist.co.uk)

| theelous3 wrote:
| Or just use curio or trio and have an infinitely better time with async in python :)

| nurettin wrote:
| I see no reason to use curio or anything instead of standard asyncio. Works really well for me. Queue tasks, they finish at their leisure, create futures and locks to control flow, all really easy and bugless code. And you get async redis, async http, a lot of asyncio functions for free. So no, I will not change it for some "really well architected library that takes away all your pains" because there aren't any.

| gjvc wrote:
| What do you use for HTTP message handling?

| nurettin wrote:
| Well, there is no request router in asyncio-http, so I have to use async views in django or if I need something lightweight I go with Quart. But if I was given the proper time, I'd probably invent my own routing on top of asyncio-http using google's re2 to parse the REST requests like I did here some years ago. https://github.com/nurettin/pwned/blob/master/server/server....

| downerending wrote:
| The reason is that _curio_ is well-designed and has a reasonably compact story for how things work. It fits in my head. And the "cool new things I can do with it" to "how hard is it to learn" ratio is pretty high.
|
| Asyncio on the other hand, feels like being beaten with a stick. It's so complicated that normal humans are never really going to understand it all.
|
| On top of that, it's been morphing all through the 3.0 series, so there isn't even one spec to learn--it's more like a set.

| gjvc wrote:
| curio is by David Beazley, [1] which should be reason enough to investigate it.
|
| [1] once memorably dubbed "the people's champion" in the introduction to one of his PyCon speeches.

| michalc wrote:
| Shameless plug for what is essentially my own much shorter intro to asyncio:
| https://charemza.name/blog/posts/python/asyncio/I-like-pytho...

| mkchoi212 wrote:
| Every time I see concurrency and python together, I'm immediately turned off by it. First of all, it probably won't be a true "concurrency" due to the GIL and by the point you are looking for "advanced" libraries that allow you to do whatever you are trying to do, maybe it's time to switch languages.

| nurettin wrote:
| Did you mean: not true parallelism?

| mcdermott wrote:
| Sorry, but after using Golang with its very simple and powerful goroutines and channels approach to concurrency, this seems like a convoluted mess. When I need concurrency, I certainly don't think of Python as the right tool.
|
| Python seems to have lost its way after the 2 to 3 shift and is no longer what I'd consider "pythonic".

| wegs wrote:
| I think of programming language design as having a large random component. New languages pop up every day, and it's really hard to predict which constructs will resonate well with human brains.
|
| Guido was competent, but not nearly as brilliant as, for example, Larry, but Python 2 was dramatically better than Perl. I think this is mostly by chance. Hundreds of new languages come out every year, and by whatever fluke, a few of them really work. Python 2 was that.
|
| In an ecosystem like that, past performance is no indication of future performance. There are very few language designers who did well more than once (Guy Steele is the only one I can think of who wasn't a one-hit-wonder).
|
| So I too felt like Python 3 was more of a step backwards than forwards linguistically. But at this point, it has features I need which Python 2 lacks, so I'm mostly working in Python 3. But it definitely feels (unnecessarily) less clean.
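On the concurrency-versus-parallelism exchange between mkchoi212 and nurettin above, a small sketch may help. It assumes `asyncio.sleep` as a stand-in for a real network call, and the names `fetch` and `main` are made up for illustration. It shows three waits overlapping on a single thread, which is concurrency without parallelism.

```python
import asyncio
import threading
import time


async def fetch(name, delay):
    # asyncio.sleep stands in for an I/O wait; it hands control back to the loop.
    await asyncio.sleep(delay)
    return name, threading.get_ident()


async def main():
    started = time.perf_counter()
    # gather schedules all three coroutines and returns results in call order.
    results = await asyncio.gather(
        fetch("a", 0.2), fetch("b", 0.2), fetch("c", 0.2)
    )
    elapsed = time.perf_counter() - started
    return results, elapsed


results, elapsed = asyncio.run(main())
thread_ids = {tid for _, tid in results}
print(f"{len(results)} tasks on {len(thread_ids)} thread(s) in {elapsed:.2f}s")
```

The waits overlap, so the total time should be close to one delay rather than three, yet every task reports the same thread id. CPU-bound work would not overlap this way, which is where the GIL discussion applies.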
| chrstphrhrt wrote:
| If you want to integrate with the data science ecosystem, would you use modern Python for APIs as well, or do Go services with minimal Python bits?
|
| EDIT: coming from someone who has had great experiences with FastAPI and Uvicorn combined with Dask and Redis.

| adonese wrote:
| There's this really interesting blog post I'm not remembering about nodejs and async concurrency style as compared to Golang's one. I cannot remember the author's name, but it goes into length comparing async functions and the differences between normal regular func and the async ones, and calling async func only on async scope and so on. I only remember that I found it retweeted by Brad Fitz. I thought maybe someone could guide me to it here.

| css wrote:
| One thing I haven't seen any blogs write about is that multiprocessing in 3.8.x uses `spawn()` and not `fork()`[0] on MacOS. Granted, most applications are not running on OS X Server, but an update that changes a low level API like that led to some issues where running code locally will fail when running it remotely will work. The following code will run anywhere except on MacOS on 3.8.x, where it crashes with `_pickle.PicklingError` as it tries to resolve the name `func` that is not in the new stack:
|
|     import multiprocessing
|     some_data = {1: "one", 2: "two"}
|     func = lambda: some_data.get(2)
|     process = multiprocessing.Process(target=func)
|     process.start()
|
| [0]: https://bugs.python.org/issue33725

| jamestimmins wrote:
| I'm in the process of adding 3.8 support to Airflow. This was the first (but not the last) obstacle in doing so.

| birdyrooster wrote:
| To those trashing Python in favor of Golang: A compiled language with no OO is not a replacement for Python. Let's talk about the complexity of putting code generators in all of your projects. I've seen real golang projects and the complexity gets moved into your repo.
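A minimal sketch of one way around the pickling failure css describes above. It assumes the cause is that the `spawn` start method has to pickle the `target`, and a lambda cannot be pickled by name. The names `lookup` and `is_picklable` are made up for illustration and are not part of the comment or of the multiprocessing API. Windows also defaults to `spawn`, so the same failure would be expected there.

```python
import multiprocessing
import pickle

some_data = {1: "one", 2: "two"}


def lookup():
    # A named module-level function can be pickled by reference,
    # which is what the spawn start method needs for `target`.
    return some_data.get(2)


def is_picklable(obj):
    # Hypothetical helper: reports whether pickle can serialize obj.
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
    return True


if __name__ == "__main__":
    # The guard matters under spawn: the child re-imports this module,
    # and unguarded code would try to start a process again.
    ctx = multiprocessing.get_context("spawn")
    process = ctx.Process(target=lookup)
    process.start()
    process.join()
    print("child exit code:", process.exitcode)
```

Requesting the `spawn` context explicitly is a design choice for the sketch: it makes the behaviour the same on Linux as on macOS 3.8, so a lambda target would fail in local testing instead of only on some machines.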
| [deleted]

| dilandau wrote:
| Asyncio detractors have been drowned in a stream of assurances that, really, it's not that bad.
|
| Obviously it is, or this wouldn't be post #800 that promises to finally make asyncio clear to newcomers.
|
| Twisted sucks. It was a joke. Now it's been tacitly blessed to the point you can sprinkle the `async` keyword all over the place. Good luck. I hope you don't forget about any spots.
|
| I truly think part of the reason Guido left is because he couldn't defend this shit.

| nessunodoro wrote:
| Not to stoke the flame, but I never used Twisted, where did it fail?

| birdyrooster wrote:
| Yes Twisted had its drawbacks but that was well over 15 years ago. Async io is literally like a couple keywords and a library with a few classes you will use for everything. No more callback/errback chaining or function decorators on every single method for inline callbacks.
|
| No one in these comments is offering any palatable alternatives. A compiled language with no OO is not a replacement for Python's Asyncio.

| jonahbenton wrote:
| Forgive me, but this is such a ball of mud. All of the "easy" introductions into these primitives are basically whitepaper length, with long digressions into the high-level vs the low-level and historical vs modern patterns. And nothing gets into error cases or problems with cooperative scheduling or debugging/troubleshooting or where the GIL applies. In 5 years debugging the rat's nest of concurrent python code that people are writing right now will make clear that go really got this right. So sad that python did not.
|
| Edit: elsewhere on HN right now:
|
| https://nullprogram.com/blog/2020/05/24/

| pgwhalen wrote:
| > go really got this right
|
| It's also exciting to see that Java has taken the same view, and is well on its way towards delivering it (https://wiki.openjdk.java.net/display/loom), or even improving on it (https://wiki.openjdk.java.net/display/loom/Structured+Concur..., see also: https://vorpus.org/blog/notes-on-structured-concurrency-or-g...)

| sys_64738 wrote:
| The first design constraint with so much python development is the battle to stay single threaded. As you point out, debugging concurrent python code is horrendous.

| downerending wrote:
| To be fair, debugging _all_ concurrent code is horrendous. If you can use multiple processes instead of multiple threads, you should.

| ricw wrote:
| Absolutely the worst. Ten years ago I encountered a bug in a wireless networking stack that would crash the whole network of nodes. It was in the worst possible location that was highly dependent on timed sending where every nanosecond mattered. This meant I could only use LEDs to indicate what was going on, as a print or anything the like would screw the timers enough to break the network. Can't actually remember what caused it in the end.
|
| It took me 6 weeks to debug and fix. I did nothing but debugging at the most primitive level. Probably could have done it more efficiently, but I was young and inexperienced.
|
| Worst and most boring 6 weeks of my life caused by concurrent debugging.

| wegs wrote:
| I think this could be done well, single-threaded, but Python reinvented the wheel a half-dozen times now, and I don't think any of the wheels are that great. Having them all in the system at the same time, and without great connections, is what makes this a ball of mud.
|
| There's very little glue between them. And none of it is duck typed, like the rest of Python.
| If I switch between them, I need to rethread the entire system.

| devxpy wrote:
| Why would the GIL apply to asyncio, which is strictly single threaded?

| zomglings wrote:
| The post also discusses concurrent.futures.

| devxpy wrote:
| Can't you just use the ProcessPoolExecutor[1] for CPU heavy stuff? AFAIK, GIL isn't much of an issue with I/O tasks.
|
| [1] https://docs.python.org/3.8/library/concurrent.futures.html#...

| zomglings wrote:
| You are right, GIL isn't an issue with I/O-bound tasks.
|
| The article didn't limit its focus to I/O, though. It felt more like an explanation of modern concurrency and asynchronous options in Python (besides just threading and multiprocessing).
|
| The only practical issue with the GIL in Python is that it forces you to use new (heavy) Python processes to parallelize computationally heavy tasks. Not so much of a concern depending on your application, but it is a gotcha for people new to Python.
|
| The real issue, though, as your parent poster mentioned, is the number of different ways to access asynchronicity and concurrency in modern Pythons. It really does run counter to Python PEP20 [0] and can lead to some communication difficulties among developers.
|
| [0] https://www.python.org/dev/peps/pep-0020/

| devxpy wrote:
| Just an aside from the argument here, I think almost no python API actually follows the zen of python. It's certainly a nice idea, just incredibly hard to actually live up to in the real world.

| joshuamorton wrote:
| Which is also single (OS) threaded.

| devxpy wrote:
| Why do you say that?
|
| I believe concurrent.futures is meant as a way for asyncio to spawn threads or processes -- which lets you essentially call sync code from async and not have it block the event loop.
|
| From the docs[1] -
|
| The concurrent.futures module provides a high-level interface for asynchronously executing callables.
|
| The asynchronous execution can be performed with threads, using ThreadPoolExecutor, or separate processes, using ProcessPoolExecutor. Both implement the same interface, which is defined by the abstract Executor class.
|
| [1] https://docs.python.org/3.8/library/concurrent.futures.html

| joshuamorton wrote:
| Python's threadpoolexecutor is still single OS-threaded. There will only ever be one thread executing in parallel (due to the GIL). Basically, in python, asyncio and threadpoolexecutor are almost entirely mappable to each other. Neither can accomplish more than the other.
|
| Multi-process code can, of course, address some of these issues, but it comes at the cost of spinning up multiple OS processes and costly inter-process communication etc.

| zomglings wrote:
| The only issue I see re: the GIL is that ThreadPoolExecutor (by its name) can mislead people into thinking that they are using proper threads when in fact they are still constrained by the GIL.
|
| It's a confusion that has been quickly resolved every time a team mate of mine has run into the issue, but I have seen it happen multiple times (at different places).

| j88439h84 wrote:
| Agree, asyncio is a mess; Trio gets it right.
|
| https://trio.readthedocs.io/
|
| It's a much simpler model that's just as powerful, and makes it easy to get things right. It eliminates the concepts of futures, promises, and awaitables. It has only one way to wait for a task: await it.
|
| For a theoretical explanation of its "structured concurrency" model see the now-famous essay https://vorpus.org/blog/notes-on-structured-concurrency-or-g...

| jonahbenton wrote:
| Thanks for the pointer, hadn't heard of trio. Will check it out.

| brian_cloutier wrote:
| It is absolutely a ball of mud; asyncio is an embarrassment and what these "easy" introductions never get around to telling you is that it's essentially impossible to write correct asyncio code.
|
| Luckily, Trio now exists. Trio has a consistent and simple story for cleanly handling errors and cancellations. Debugging is still a little difficult, but at least code written with Trio has radically fewer problems to debug.
|
| Here's a link on the Trio design: https://trio.readthedocs.io/en/stable/design.html
|
| There are some more great articles and talks, but I don't have any links at the ready, sorry.

| HumblyTossed wrote:
| I'm a long time fan of Python (and still am), but I completely agree that Go got concurrency right.

| parhamn wrote:
| This looks great. I wish python would have made the event loop generally more transparent to the end user. Why didn't they just use a single global loop? Sorta like javascript and golang. It would have been more pythonic too. Anyone doing heavier context switching could've had their own loop management.
|
| Making the event loop self-managed added a ton of clunkiness in aio apis (use-your-own-loop, loop lifetime management, etc) and becomes mentally complex for newcomers. There's also the issue that these huge aio frameworks rewriting the same TCP clients have emerged only differentiated by their loop management patterns.

| jordic wrote:
| Since version 3.7 you don't need to touch the loop for anything, also some old signatures have deprecated the loop param... It's almost the same as if you work with node, but with a bit more sugar (tasks, gather, wait, wait_for). When you work with asyncio daily it's so natural and fun.

| skrtskrt wrote:
| Agreed, Python's async model is the easiest one for me to think about by far and there's so many high quality libraries

| t-writescode wrote:
| Have you used C#'s or Akka's or Go's?
|
| I find 2 of those to be easier to reason about and all to be superior.

| jordic wrote:
| yes, goroutines are fast and easy... but this thread is about python. I love both languages, and go is fast and safe, but, I also like python (it's fast developing...)

| nemothekid wrote:
| > _Why didn't they just use a single global loop? Sorta like javascript and golang._
|
| I imagine the problem, which neither JS nor Golang has to deal with, is that Python supports plain old threads. If you have a single global event loop, and then you try to execute a future, what should happen?
|
| 1. Should every thread have its own loop? Then how would you send futures to another thread?
|
| 2. Should the global loop only exist on a single thread? Then you would have to pay a synchronization tax every time you pushed a task to the loop.
|
| Not saying the Python solution is 100% right (I've only thought about it for 5 minutes), but I can see how being compatible with threads has caused this situation. I believe the default now is you don't have to think about loops, and Python does the right thing for 99% of use cases.

| jordic wrote:
| If you need to do CPU bound computations use a thread pool.

| vazamb wrote:
| Why is this being downvoted:
|
| You need to use a process pool in Python, thread pools are for I/O bound tasks and cannot use more than 1 core.

| nemothekid wrote:
| I didn't downvote him, but his response isn't at all relevant to what was being discussed. I was discussing the API design of asyncio and how they had to add the "loop" parameter to ensure the reactor would be thread safe by default.

| bluedays wrote:
| The more time I spend with Python the more reasons I find to hate it.
|
| However, it's still my main language. Everything I end up making just ends up being written in Python because that's the language I know the best.
|
| I could learn another language, and I have learned other languages, but Python seems to be the language that mentally clicks with me the most. I think it's because it's the first language I learned.
|
| Suffice to say, there are so many things wrong with Python and they're all compromises that were made to appeal to a wider audience.
| Unfortunately, I don't think that the "wider audience" is those who are looking for concurrency.

| parhamn wrote:
| We agree. I think python is great. I actually asked because python usually makes things dead simple and straightforward.
|
| > Unfortunately, I don't think that the "wider audience" is those who are looking for concurrency.
|
| I don't buy this. Many network heavy shops use Python so concurrency is relevant. The language is used in web, finance, cloud, data pipelining/etl, and so many more. Dropbox, Robinhood, Spotify, Instagram, Uber, Google, etc, are the "wider audience" and they certainly care about concurrency.

| maallooc wrote:
| Me too. As I know more and more about languages and computers, the more I hate it but I end up doing 85% of my projects in it because with it I can focus on ideas, not implementation details.
___________________________________________________________________
(page generated 2020-05-24 23:00 UTC)