[HN Gopher] Django Async: What's new and what's next?
___________________________________________________________________

Django Async: What's new and what's next?

Author : sanketsaurav
Score  : 86 points
Date   : 2020-08-14 17:21 UTC (5 hours ago)

(HTM) web link (deepsource.io)
(TXT) w3m dump (deepsource.io)

| djstein wrote:
| I've seen lots of blog posts, even the Django docs, saying async
| is available, but I still haven't seen any real-world examples
| yet. Do any exist?
|
| Also, I still haven't seen how the async addition will work with
| class-based views. And Django Rest Framework is still only
| considering whether to spend time on support. Until these two use
| cases are viable, many users won't benefit.

| Bedon292 wrote:
| I love Django / Django Rest Framework and have used it for a long
| time, but we recently dumped it from a project in favor of
| FastAPI.
|
| There are just so many layers of magic in Django that it was
| becoming impossible for us to improve the performance to an
| acceptable level. We isolated the problems to serialization /
| deserialization. Going from DB -> Python object -> JSON response
| was taking far more time than anything else, and just moving over
| to FastAPI has gotten us a ~5x improvement in response time.
|
| I am excited to see where Django async goes, though. It's
| something I had been looking forward to for a while now.

  | buttersbrian wrote:
  | What did you use in place of the DRF serialization to get from
  | DB -> JSON response?

    | Bedon292 wrote:
    | FastAPI uses Pydantic under the hood for Python objects. And
    | we have been tinkering with orjson for the actual JSON
    | serialization, since it appears to be the winner in JSON
    | serialization at the moment.

      | dec0dedab0de wrote:
      | Was there still a significant speedup using the standard
      | library json module?
      |
      | For the DB requests, are you writing SQL directly, using a
      | different ORM, or something like SQLAlchemy Core that makes
      | SQL Pythonic without being an ORM?
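  A minimal sketch of the stack Bedon292 describes above: a
  pydantic model served by FastAPI, with orjson doing the
  serialization via FastAPI's ORJSONResponse (the orjson package
  must be installed). The model, route, and values are
  illustrative, not from the thread.

      from fastapi import FastAPI
      from fastapi.responses import ORJSONResponse
      from pydantic import BaseModel

      app = FastAPI()

      class Item(BaseModel):
          id: int
          name: str

      @app.get("/items/{item_id}", response_model=Item,
               response_class=ORJSONResponse)
      async def read_item(item_id: int) -> Item:
          # stand-in for a real DB lookup; FastAPI validates the
          # result against Item and serializes it with orjson
          return Item(id=item_id, name="example")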
  | bredren wrote:
  | How detailed was the profiling on this? Reason I ask is I've
  | faced this myself and had to spend a lot of time on both query
  | and serializer optimization.

    | Bedon292 wrote:
    | We used `silk` a lot to profile the app. Basically all the
    | time was being spent inside Django, somewhere between getting
    | the data from the DB and spitting out the response. We would
    | have things like 15ms in the DB but 250ms to actually create
    | the response, on simple things. Some of our responses ran
    | into multiple seconds (large amounts of data) but still spent
    | only maybe 150ms in the DB. There were at least two weeks
    | spent, on and off, trying to improve it before we finally
    | decided we had to go somewhere else. And that's after having
    | to redo some of our queries by hand because the ORM was doing
    | something like 15 left joins.

  | mixmastamyk wrote:
  | So how does it get from DB --> JSON response? SQLAlchemy or
  | DB-API?

    | Bedon292 wrote:
    | Yeah, FastAPI uses SQLAlchemy underneath, along with pydantic
    | to define schemas with typing. And we just started tinkering
    | with orjson for the JSON serialization; it seems to be the
    | fastest library at the moment.
    |
    | I have also been experimenting with encode/databases for
    | async DB access. It still uses the SA Core functions, which
    | is nice, but that means it does not do the nice relationship
    | stuff that SA has built in when it handles everything. At
    | least not that I have found. However, it does allow for
    | things like gets without relationships, updates of single
    | records, and stuff like that quite nicely.

      | mixmastamyk wrote:
      | I see, thanks. Is it required to define models twice, as
      | this page seems to recommend?
      |
      | https://fastapi.tiangolo.com/tutorial/sql-databases/

        | Bedon292 wrote:
        | Yeah, you do end up defining everything more than once:
        | once for SA, and then for pydantic. Create, Read, and
        | Update may all be different pydantic models as well; they
        | define what comes in and out of the actual API. Your
        | create request may not have the id field yet and may have
        | some optional fields, the response has everything, and an
        | update may have everything optional except the id. Only
        | been using it a few weeks now, but liking it a lot so
        | far.
        |
        | https://fastapi.tiangolo.com/tutorial/sql-databases/#create-...

  | StavrosK wrote:
  | You can use the two together:
  | https://www.stavros.io/posts/fastapi-with-django/

    | Bedon292 wrote:
    | That is quite interesting. There are a lot of things, like
    | management commands, tests, and such, that I love and miss
    | from Django. Going to have to really think about this.
    |
    | Edit: Although, now that I think a little more, it's not that
    | surprising. Our initial tests did literally just define
    | FastAPI schemas on top of our existing DB. The co-mingling
    | while actually running is an interesting concept, though.

| lmeyerov wrote:
| We ended up with 2 Python layers:
|
| -- Boring Code: business logic, CRUD, management, security, ...:
| Django
|
| -- Performance Code: JWT services on another stack (GPU, Arrow
| streaming, ...)
|
| So stuff is either Boring Code or Performance Code. Async is
| great b/c Boring Code can now simply await Performance Code :)
| Boring Code gets predictability & the general ecosystem, and
| Performance Code does wilder stuff where we don't worry about the
| non-perf ecosystem, just perf ecosystem oddballs. We've been
| systematically dropping node from our backend, where we tried to
| have it all; IMO that was too much lift for most teams.

  | innomatics wrote:
  | I like this idea. I am also looking at a separate GraphQL stack
  | alongside Django for flexible access points.

  | simonw wrote:
  | "Async is great b/c Boring Code can now simply await
  | Performance Code" - that's really smart, I like that
  | philosophy.

  | mjhea0 wrote:
  | Interesting.
  |
  | Have any interest in expanding this into a blog post? I've been
  | working on a similar post. Maybe we can compare notes. I'm at
  | michael at testdriven dot io, if interested.

| ArtDev wrote:
| I had a bad experience with Django. I found it cluttered and
| slow. I really wanted to like it. It might seem funny, but a more
| straightforward framework like Symfony didn't get in the way and
| ended up much faster. Python should be much, much faster than
| PHP, but I guess the framework matters a lot too.

| abledon wrote:
| What's the most elegant way for cutting-edge Django to do
| websockets? Is it still to 'tack on' the channels package [0]?
|
| Compared to FastAPI [1], I really don't want to use it. I only
| miss the ORM, since in FastAPI it looks like you have to manually
| write the code to insert stuff [2].
|
| [0] https://realpython.com/getting-started-with-django-channels/
|
| [1] https://fastapi.tiangolo.com/advanced/websockets/#create-a-w...
|
| [2] https://fastapi.tiangolo.com/tutorial/sql-databases/#create-...
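  For reference, websockets in Django still do go through the
  channels package abledon mentions. A minimal sketch of a
  consumer, assuming channels is installed and the consumer is
  wired into the project's ASGI routing (ProtocolTypeRouter /
  URLRouter); the class and its echo behavior are illustrative
  only.

      # consumers.py
      from channels.generic.websocket import AsyncWebsocketConsumer

      class EchoConsumer(AsyncWebsocketConsumer):
          async def connect(self):
              # accept the websocket handshake
              await self.accept()

          async def receive(self, text_data=None, bytes_data=None):
              # echo back whatever the client sent
              await self.send(text_data=text_data)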
| [deleted]

| [deleted]

| leafboi wrote:
| Wasn't there an article about how the async syntax was
| benchmarked to actually be _slower_ than the traditional way of
| using threads? What's the current story on Python async?
|
| reference: http://calpaterson.com/async-python-is-not-faster.html

  | tomnipotent wrote:
  | The built-in event loop has meh performance; would love to see
  | the benchmarks re-run using libuv - that would help close some
  | of the gap.

    | calpaterson wrote:
    | Hi, I am the author of the above article. Libuv was used -
    | for example, by the uvicorn-based versions.

    | leafboi wrote:
    | They max out the speed with tests. It does use libuv; Uvicorn
    | is the indicator, as it uses libuv underneath.
    |
    | If you've heard of Gunicorn, Uvicorn is the version of
    | Gunicorn with libuv, hence the name.

  | toxik wrote:
  | None of the other replies acknowledge this, but it seems you
  | are conflating concurrency and asynchrony. An asynchronous
  | program can be executed sequentially; they are distinct
  | concepts.

  | pdonis wrote:
  | The "slower" is not really the problem--as the article notes,
  | the sync frameworks it tested have most of the heavy lifting
  | being done in native C code, not Python bytecode, whereas the
  | async frameworks are all pure Python. Pure Python is always
  | going to be slower than native C code. I'm actually surprised
  | that the pure Python async frameworks managed to do as well as
  | they did in throughput. But of course this issue can be solved
  | by coding async frameworks in C and exposing the necessary
  | Python API using bindings, the same way the sync frameworks do
  | now. So the comparison of throughput isn't really fair.
  |
  | The real issue, as the article notes, is latency variation.
  | Because async frameworks rely on cooperative multitasking,
  | there is no way for the event loop to preempt a worker that is
  | taking too long in order to maintain reasonable latency for
  | other requests.
  |
  | There is one thing I wonder about with this article, though.
  | The article says each worker is making a database query. How is
  | that being done? If it's being done over a network, that worker
  | should yield back to the event loop while it's waiting for the
  | network I/O to complete. If it's being done via a database on
  | the local machine, and the communication with that database is
  | not being done by something like Unix sockets, but by direct
  | calls into a database library, then that's obviously going to
  | cause latency problems, because the worker can't yield during
  | the database call. The obvious way to fix that is to have the
  | local database server exposed via a socket instead of direct
  | library calls.

    | leafboi wrote:
    | > whereas the async frameworks are all pure Python.
    |
    | No, it's not pure Python; it's a combination. The underlying
    | event loop uses libuv, the C library that makes up the core
    | of Node.js. "Uvicorn" is an indicator of this, as Uvicorn
    | uses uvloop, a wrapper around libuv.
    |
    | Overall the benchmark is testing a bit of both: the event
    | loop runs in C, but it has to execute a bit of Python code
    | when handling each request.
    |
    | > If it's being done via a database on the local machine, and
    | > the communication with that database is not being done by
    | > something like Unix sockets, but by direct calls into a
    | > database library, then that's obviously going to cause
    | > latency problems
    |
    | I am almost positive it is being done with some form of
    | non-blocking sockets. The only other way to do this without
    | sockets is to write to a file and read from a file.
    |
    | There are no "direct library calls", as the database server
    | is a separate process from the server process. Here's what
    | occurs:
    |
    | 1. Server makes a socket connection to the database.
    |
    | 2. Server sends a request to the database.
    |
    | 3. Database receives the request and reads from the database
    | file.
    |
    | 4. Database sends information back to the server.
    |
    | Any library call you're thinking of here is likely a
    | "client-side" library, meaning that the library actually
    | makes a socket connection to the SQL server.
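  A minimal sketch of the socket flow leafboi describes, using
  only the standard library; the host, port, and wire format are
  placeholders (a real driver speaks the database's actual
  protocol). The point is that the coroutine yields to the event
  loop at each await, so other requests are serviced while this
  one waits on the database socket.

      import asyncio

      async def query_db(sql: bytes) -> bytes:
          # 1. connect to the (hypothetical) database server
          reader, writer = await asyncio.open_connection("db.internal", 5432)
          # 2. send the request
          writer.write(sql)
          await writer.drain()
          # 3./4. the DB reads its files and replies; we yield
          # to the event loop while waiting
          response = await reader.read(4096)
          writer.close()
          await writer.wait_closed()
          return response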
  | reticents wrote:
  | Wow, thank you for this link. I appreciate it when my
  | assumptions are challenged like this, particularly given that I
  | have a tendency to take benchmark synopses like FastAPI's [1]
  | for granted. I'll have to be more conscious of the ways in
  | which authors hamstring the competition to game their results.
  |
  | [1] https://fastapi.tiangolo.com/benchmarks/

  | bob1029 wrote:
  | I think the story with async is always "it depends", unless we
  | are questioning whether the specific implementation is broken.
  |
  | For some web applications, it might actually be faster (in
  | meaningful aggregate volume) to service a complete request on
  | the calling thread rather than deferring to the thread pool
  | periodically throughout execution. I think the break-even point
  | between sync and async comes down to how much I/O (database)
  | work is involved in satisfying the request. If each request
  | only hits the database 1-2 times on average, incurring a few
  | milliseconds of added latency, sync all the way down might be
  | better than any amount of added context switching. If each
  | request may take 100-1000 milliseconds to complete overall, due
  | to various long-running I/O operations, then async is certainly
  | one good approach for maximizing the number of possible
  | concurrent requests.
  |
  | In most of my applications (C#/.NET Core) I default to
  | async/await for backend service methods, because 9/10 times I
  | am going to the database multiple times for something, and I
  | cannot always guarantee that it will return quickly under heavy
  | load. For other items, I explicitly go wide on parallelizable
  | CPU-bound tasks. All of these are handled as a blocking call
  | against a Parallel.ForEach(). Never would a CPU-bound task be
  | explicitly wrapped with async/await, but one may be included as
  | part of a larger asynchronous operation.
  |
  | This stuff used to confuse the hell out of me, until I finally
  | wrapped my head around the 2 essential code abstractions:
  | async/await for I/O, and Parallel.For() (et al.) for CPU-bound
  | tasks which have parallelism opportunities. Never try to
  | Task.Run or async/await your way out of something that is
  | CPU-bound and is blocking the flow of execution. Try to
  | leverage asynchrony responsibly when delays >1ms are possible
  | in large concurrent volumes.

    | syndacks wrote:
    | Why is this being downvoted? Seems like a fair counter-point
    | to me.

      | ghostwriter wrote:
      | I didn't downvote it, but apart from the fact that async IO
      | is not meant to be faster (it's all about throughput, after
      | all), the benchmark is flawed, and it's been discussed in
      | full before: https://news.ycombinator.com/item?id=23496994

        | leafboi wrote:
        | asyncio is meant to be "faster" for IO-heavy tasks with
        | low compute. The benchmark tests requests per second,
        | which is indeed directly testing what you expect it to
        | test.
        |
        | It's been discussed before, but the outcome of that
        | discussion (in the link you brought up) was divided.
        | Highly, highly divided. There was no conclusion, and it
        | is not clear whether the benchmark was flawed.
        |
        | The discussion is also littered with people who don't
        | understand why async is fast for only certain types of
        | things and slow for others. It's also littered with
        | assumptions that the test focused on compute rather than
        | IO, which is very evidently not the case.
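  A rough Python analogue of bob1029's two abstractions above, for
  contrast with the C# version: await your I/O, and go wide on
  CPU-bound work with a process pool, Python's closest stand-in
  for Parallel.ForEach (threads would serialize on the GIL for
  pure-Python work). The hashing is just an illustrative stand-in
  for real CPU-bound work.

      import hashlib
      from concurrent.futures import ProcessPoolExecutor

      def cpu_heavy(data: bytes) -> str:
          # deliberately CPU-bound work
          return hashlib.pbkdf2_hmac("sha256", data, b"salt", 200_000).hex()

      if __name__ == "__main__":
          items = [b"a", b"b", b"c", b"d"]
          # fan the CPU work out across processes and block on it
          with ProcessPoolExecutor() as pool:
              for digest in pool.map(cpu_heavy, items):
                  print(digest)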
  | theptip wrote:
  | This article doesn't evaluate the case that you actually want
  | ASGI for, so I don't think it's very useful. (Or at least, it
  | confirms something that should have already been clear.)
  |
  | If you're compute-bound, then Python async (which uses
  | cooperative scheduling, similar to green threads) isn't going
  | to help you. You get concurrency, but not parallelism, from
  | this programming model; only one logical thread of execution is
  | running on the CPU at a time (per process), so this can only
  | slow you down if you are CPU-constrained.
  |
  | The standard use case of a sync API backed by a local DB with
  | low request latency is typically going to be compute-bound.
  |
  | This is covered in the Django async docs
  | (https://docs.djangoproject.com/en/3.1/topics/async/) and also
  | in green-threading libraries like gevent
  | (http://www.gevent.org/intro.html#cooperative-multitasking).
  |
  | The case where async workers are interesting is I/O-bound
  | workloads. Say you're building an API gateway, or your
  | monolithic API starts to need to call out to other API
  | services, particularly external ones like the Google Maps API.
  | In this case, the worst-case result is that the proxied HTTP
  | request times out; this could block your Django API's worker
  | thread for many seconds.
  |
  | In the async / green-threaded model, this case is fine; you
  | have a green thread / async function call per request, and if
  | that green thread is blocked on an upstream I/O operation, the
  | event loop will just start working on a different API call
  | until the OS gives a response on the network socket.
  |
  | Essentially, there's no reason to use Django async if you're
  | doing a traditional monolithic DB-backed application. It's
  | going to give you benefits in use cases where the standard sync
  | model struggles.
  |
  | (Note, there's an argument that you might want green threads
  | even in a normal monolith, to guard against cases like
  | "developer accidentally wrote a chunky DB query that takes 60
  | seconds to run for some inputs", but most DB engines don't
  | support one DB connection per HTTP connection. There was a
  | bunch of discussion on this topic a few years ago, with the
  | SQLAlchemy author arguing that async is not useful for DB
  | connections:
  | https://techspot.zzzeek.org/2015/02/15/asynchronous-python-a...
  | although asyncio support was later added:
  | https://docs.sqlalchemy.org/en/14/orm/extensions/asyncio.htm...)

    | leafboi wrote:
    | The tests aren't compute-bound. They are testing requests per
    | second; the test is biased towards IO, not compute. Please
    | read the article.

| dec0dedab0de wrote:
| I think the article and some of the comments are not really
| looking at this the right way.
|
| For most things you're probably better off "doing the work" in a
| celery task, regardless of whether it is IO-bound or CPU-bound.
| Then use websockets just for your status updates / progress bar,
| instead of having your front end poll on a timer.
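  A minimal sketch of theptip's upstream-call case as a Django
  3.1 async view; the httpx client library and the upstream URL
  are assumptions for illustration, not from the article.

      import httpx
      from django.http import JsonResponse

      async def proxy_view(request):
          # while this await is pending, the event loop can service
          # other requests; a sync worker would sit blocked instead
          async with httpx.AsyncClient(timeout=10.0) as client:
              resp = await client.get("https://maps.example.com/geocode")
          return JsonResponse(resp.json(), safe=False)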
| hyuuu wrote:
| Time and time again, whenever I start a new project, Django has
| been my go-to choice after analyzing the alternatives. I've
| worked on everything from large-scale, mono-repo, billion-user
| systems to weekend side projects; Django really stays true to the
| batteries-included philosophy.

| silviogutierrez wrote:
| Great article. But I think this part may need a second look:
|
| > If your views involve heavy-lifting calculations or
| > long-running network calls to be done as part of the request
| > path, it's a great use case for using async views.
|
| That seems true for long-running network calls (IO). But for
| heavy-lifting calculations? I thought that was _the_ canonical
| example of a situation async won't improve. CPU-bound and
| memory-bound, after all.

  | Znafon wrote:
  | You are correct: async will only help for long-running network
  | calls, which happen when calling another service or querying a
  | database.
  |
  | When doing a long computation the CPU is not idle, so there is
  | no free compute power to use for something else.
  |
  | Finally, when doing IO calls in Python the GIL is usually
  | released, so the kernel can already schedule another thread
  | while waiting for IO. It is therefore not certain that
  | converting to async will yield an improvement, and you should
  | benchmark if you plan on converting an existing program.

    | pdonis wrote:
    | > _when doing IO calls in Python the GIL is usually released,
    | > so the kernel can already schedule another thread while
    | > waiting for IO_
    |
    | This is true, but scheduling another thread through the
    | kernel can have higher overhead, since it requires context
    | switches. Running multiple threads also has other potential
    | issues, like lock contention; how problematic they are will
    | depend on the use case.
    |
    | The potential advantage of scheduling another thread is, of
    | course, that it can do CPU-bound work; but in Python,
    | unfortunately, doing that means the GIL doesn't get released,
    | so that thread will prevent any further network I/O while
    | it's running, the same as would happen in an async framework
    | if a worker did a lot of CPU work. So Python doesn't really
    | let you realize the advantages of threads in this context.

      | Znafon wrote:
      | > doing that means the GIL doesn't get released, so that
      | > thread will prevent any further network I/O while it's
      | > running, the same as would happen in an async framework
      | > if a worker did a lot of CPU work. So Python doesn't
      | > really let you realize the advantages of threads in this
      | > context.
      |
      | I don't think that's true; the GIL gets released for many
      | compute-intensive or IO-bound tasks in Python. For example,
      | when reading from a socket the GIL gets released at
      | https://github.com/python/cpython/blob/e822e37946f27c09953bb...

  | dec0dedab0de wrote:
  | I thought the only reason to use ASGI is to use websockets, and
  | the only reason to use websockets is to avoid making multiple
  | requests for things where it doesn't matter if a particular
  | message is lost.

  | ghostwriter wrote:
  | Perhaps they meant that heavy long-running calculations could
  | be offloaded to a worker pool with the help of
  | concurrent.futures and run_in_executor():
  |
  | - https://docs.python.org/3/library/concurrent.futures.html
  |
  | - https://docs.python.org/3/library/asyncio-eventloop.html#asy...

    | pdonis wrote:
    | This will only help if the workers are separate processes.
    | Thread workers will hold the GIL in Python and prevent
    | network I/O while they are doing CPU-bound tasks.

      | ghostwriter wrote:
      | Sure, the pool only cares about the
      | concurrent.futures.Executor interface; an implementation
      | could be processes or cloud resources.
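  A minimal sketch of the offloading ghostwriter suggests, using a
  process pool per pdonis's caveat (processes, not threads, so the
  GIL in the workers can't stall the event loop's I/O). The
  calculation and its input are illustrative stand-ins; a real app
  would share one pool across requests rather than spawn one per
  call.

      import asyncio
      from concurrent.futures import ProcessPoolExecutor

      POOL = ProcessPoolExecutor()  # shared across requests

      def heavy_calculation(n: int) -> int:
          # stand-in CPU-bound work
          return sum(i * i for i in range(n))

      async def handler() -> int:
          loop = asyncio.get_running_loop()
          # run in the pool; await without blocking the event loop
          return await loop.run_in_executor(POOL, heavy_calculation,
                                            10_000_000)

      if __name__ == "__main__":
          print(asyncio.run(handler()))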
      | dr_zoidberg wrote:
      | > Thread workers will hold the GIL in Python and prevent
      | > network I/O while they are doing CPU-bound tasks.
      |
      | Using Cython:
      |
      |     with nogil:
      |         # whatever you need to do, as long as it
      |         # doesn't touch a Python object
      |
      | If you're doing heavy calculations from Python you should
      | at least be considering Cython.
___________________________________________________________________
(page generated 2020-08-14 23:00 UTC)