[HN Gopher] Django Async: What's new and what's next?
       ___________________________________________________________________
        
       Django Async: What's new and what's next?
        
       Author : sanketsaurav
       Score  : 86 points
       Date   : 2020-08-14 17:21 UTC (5 hours ago)
        
 (HTM) web link (deepsource.io)
 (TXT) w3m dump (deepsource.io)
        
       | djstein wrote:
       | I've seen lots of blog posts, even the Django docs, saying async
       | is available but still haven't seen any real world examples yet.
       | Do any exist?
       | 
        | Also, I still haven't seen how the async additions will work
        | with class-based views. And Django Rest Framework is still
        | deciding whether to spend time on support. Until these two use
        | cases are viable, many users won't benefit.
        
       | Bedon292 wrote:
       | I love Django / Django Rest Framework and have used it for a long
       | time, but we recently dumped it from a project in favor of
       | FastAPI.
       | 
        | There are just so many layers of magic in Django that it was
        | becoming impossible for us to improve performance to an
        | acceptable level. We isolated the problem to serialization /
        | deserialization: going from DB -> Python object -> JSON
        | response was taking far more time than anything else, and just
        | moving over to FastAPI has given us a ~5x improvement in
        | response time.
       | 
        | I am excited to see where Django async goes, though. It's
        | something I have been looking forward to for a while now.
        
         | buttersbrian wrote:
         | What did you use in place of the DRF serialization to get from
         | DB -> json response?
        
           | Bedon292 wrote:
            | FastAPI uses Pydantic under the hood for Python objects.
            | And we have been tinkering with orjson for the actual JSON
            | serialization, since it appears to be the winner in JSON
            | serialization at the moment.
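The DB -> Python object -> JSON path under discussion can be sketched with the stdlib alone. The `User` model and `serialize` helper below are illustrative stand-ins, not from either framework: in the stack described, a pydantic model would take the dataclass's place and orjson.dumps() would replace json.dumps().

```python
import json
from dataclasses import asdict, dataclass

# Stand-in model: a pydantic model would replace this dataclass in
# the FastAPI stack described above.
@dataclass
class User:
    id: int
    name: str

def serialize(rows):
    # DB row -> Python object -> JSON response body; orjson.dumps()
    # would be a drop-in replacement for json.dumps() here.
    users = [User(*row) for row in rows]
    return json.dumps([asdict(u) for u in users])

body = serialize([(1, "ada"), (2, "grace")])
```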
        
             | dec0dedab0de wrote:
             | Was there still a significant speedup using the standard
             | library json module?
             | 
              | For the DB requests, are you writing SQL directly, using
              | a different ORM, or something like SQLAlchemy Core that
              | makes SQL pythonic without being an ORM?
        
         | bredren wrote:
         | How detailed was the profiling on this? Reason I ask is I've
         | faced this myself and had to spend a lot of time on both query
         | and serializer optimization.
        
           | Bedon292 wrote:
            | We used `silk` a lot to profile the app. Basically all the
            | time was being spent inside Django, somewhere between
            | getting the data from the DB and spitting out the
            | response. We would have things like 15ms in the DB, but
            | 250ms to actually create the response. On simple things.
            | Some of our responses ran into multiple seconds (large
            | amounts of data) but still only spent maybe 150ms in the
            | DB. There were at least two weeks spent on and off trying
            | to improve it before we finally decided we had to go
            | somewhere else. And that's after having to redo some of
            | our queries by hand because the ORM was doing something
            | like 15 left joins.
        
         | mixmastamyk wrote:
         | So how does it get from DB --> JSON response? SQLAlchemy or
         | dbapi?
        
           | Bedon292 wrote:
            | Yeah, FastAPI uses SQLAlchemy under it, along with
            | pydantic to define schemas with typing. And we just
            | started tinkering with orjson for the JSON serialization;
            | it seems to be the fastest library at the moment.
           | 
            | I have also been experimenting with encode/databases for
            | async DB access. It still uses the SA Core functions,
            | which is nice, but that means it does not get the nice
            | relationship handling that SA has built in when it manages
            | everything. At least not that I have found. It does,
            | however, handle things like gets without relationships and
            | updates of single records quite nicely.
        
             | mixmastamyk wrote:
             | I see, thanks. Is it required to define models twice as
             | this page seems to recommend?
             | 
             | https://fastapi.tiangolo.com/tutorial/sql-databases/
        
               | Bedon292 wrote:
               | Yeah, you do end up defining everything more than once,
               | once for SA, and then for pydantic. Create, Read, and
               | Update may all be different pydantic models as well. They
               | are for defining what comes in and out of the actual API.
                | Your create request may not have the id field yet, and
                | may have some optional fields; the response has
                | everything; and an update may have everything optional
                | except the id. I've only been using it a few weeks
                | now, but I'm liking it a lot so far.
               | 
               | https://fastapi.tiangolo.com/tutorial/sql-
               | databases/#create-...
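The Create/Read/Update split described above can be sketched with stdlib dataclasses standing in for pydantic models. The `Item*` classes are hypothetical, chosen for illustration, not taken from FastAPI's docs.

```python
from dataclasses import dataclass
from typing import Optional

# In FastAPI these three would be pydantic models; stdlib dataclasses
# keep the sketch self-contained.
@dataclass
class ItemCreate:
    # Request body for creation: no id yet, some fields optional
    name: str
    notes: Optional[str] = None

@dataclass
class ItemRead:
    # Response body: everything is present, including the server id
    id: int
    name: str
    notes: Optional[str]

@dataclass
class ItemUpdate:
    # Partial update: everything optional except the id
    id: int
    name: Optional[str] = None
    notes: Optional[str] = None

created = ItemRead(id=1, name="widget", notes=None)
```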
        
         | StavrosK wrote:
         | You can use the two together:
         | https://www.stavros.io/posts/fastapi-with-django/
        
           | Bedon292 wrote:
            | That is quite interesting. There are a lot of things, like
            | management, tests, and such, that I love and miss from
            | Django. I'm going to have to really think about this one.
            | 
            | Edit: Although, now that I think about it a little more,
            | it's not that surprising. Our initial tests did literally
            | just define FastAPI schemas on top of our existing DB. The
            | co-mingling while actually running is an interesting
            | concept, though.
        
         | lmeyerov wrote:
         | We ended up with 2 python layers:
         | 
         | -- Boring code - Business logic, CRUD, management, security,
         | ...: django
         | 
         | -- Perf: JWT services on another stack (GPU, Arrow streaming,
         | ...)
         | 
          | So stuff is either Boring Code or Performance Code. Async is
          | great b/c Boring Code can now simply await Performance Code
          | :) Boring Code gets predictability & the general ecosystem,
          | and Performance Code does wilder stuff where we don't worry
          | about non-perf ecosystem concerns, just perf ecosystem
          | oddballs. We've been systematically dropping Node from our
          | backend, where we tried to have it all; IMO that was too
          | much lift for most teams.
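A minimal sketch of the "Boring Code awaits Performance Code" split. `performance_service` and `boring_view` are hypothetical names; in practice the perf layer would be an HTTP/RPC call into the other stack rather than a local coroutine.

```python
import asyncio

async def performance_service(payload):
    # Stand-in for the perf stack (GPU, Arrow streaming, ...); the
    # await models waiting on the other service's response
    await asyncio.sleep(0.01)
    return {"result": payload.upper()}

async def boring_view(request_data):
    # Boring Code: validate, delegate, shape the response
    if not request_data:
        raise ValueError("empty request")
    return await performance_service(request_data)

response = asyncio.run(boring_view("hello"))
```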
        
           | innomatics wrote:
           | I like this idea. Also I am looking at a separate GraphQL
           | stack alongside Django for flexible access points.
        
           | simonw wrote:
           | "Async is great b/c now Boring Code can now simply await
           | Performance Code" - that's really smart, I like that
           | philosophy.
        
         | mjhea0 wrote:
         | Interesting.
         | 
         | Have any interest in expanding this into a blog post? I've been
         | working on a similar post. Maybe we can compare notes. I'm at
         | michael at testdriven dot io, if interested.
        
       | ArtDev wrote:
        | I had a bad experience with Django. I found it cluttered and
        | slow. I really wanted to like it. It might seem funny, but a
        | more straightforward framework like Symfony didn't get in the
        | way and ended up much faster. Python should be much faster
        | than PHP, but I guess the framework matters a lot too.
        
       | abledon wrote:
        | What's the most elegant way for cutting-edge Django to do
        | websockets? Is it still to 'tack on' the channels package [0]?
        | 
        | Compared to FastAPI [1], I really don't want to use it; I'd
        | only miss the ORM, since in FastAPI it looks like you have to
        | manually write the code to insert stuff [2].
       | 
       | [0] https://realpython.com/getting-started-with-django-channels/
       | 
       | [1] https://fastapi.tiangolo.com/advanced/websockets/#create-
       | a-w...
       | 
       | [2] https://fastapi.tiangolo.com/tutorial/sql-
       | databases/#create-...
        
         | [deleted]
        
         | [deleted]
        
       | leafboi wrote:
        | Wasn't there an article about how the async syntax was
        | benchmarked to actually be _slower_ than the traditional way
        | of using threads? What's the current story on Python async?
       | 
       | reference: http://calpaterson.com/async-python-is-not-faster.html
        
         | tomnipotent wrote:
          | The built-in event loop has meh performance; I would love to
          | see the benchmarks re-run using libuv - that would help
          | close some of the gap.
        
           | calpaterson wrote:
           | Hi, I am the author of the above article. Libuv was used -
           | for example by the uvicorn-based versions.
        
           | leafboi wrote:
            | They max out the speed in the tests, and they do use
            | libuv - Uvicorn is the indicator, as it uses uvloop (built
            | on libuv) underneath.
            | 
            | If you have heard of Gunicorn, Uvicorn plays a similar
            | role but is built on libuv, hence the name.
        
         | toxik wrote:
          | None of the other replies acknowledge this, but it seems you
          | are conflating concurrency with asynchrony. An asynchronous
          | program can be executed sequentially; they are distinct
          | concepts.
        
         | pdonis wrote:
         | The "slower" is not really the problem--as the article notes,
         | the sync frameworks it tested have most of the heavy lifting
         | being done in native C code, not Python bytecode, whereas the
         | async frameworks are all pure Python. Pure Python is always
         | going to be slower than native C code. I'm actually surprised
         | that the pure Python async frameworks managed to do as well as
         | they did in throughput. But of course this issue can be solved
         | by coding async frameworks in C and exposing the necessary
         | Python API using bindings, the same way the sync frameworks do
         | now. So the comparison of throughput isn't really fair.
         | 
         | The real issue, as the article notes, is latency variation.
         | Because async frameworks rely on cooperative multitasking,
         | there is no way for the event loop to preempt a worker that is
         | taking too long in order to maintain reasonable latency for
         | other requests.
         | 
         | There is one thing I wonder about with this article, though.
         | The article says each worker is making a database query. How is
         | that being done? If it's being done over a network, that worker
         | should yield back to the event loop while it's waiting for the
         | network I/O to complete. If it's being done via a database on
         | the local machine, and the communication with that database is
         | not being done by something like Unix sockets, but by direct
         | calls into a database library, then that's obviously going to
         | cause latency problems because the worker can't yield during
         | the database call. The obvious way to fix that is to have the
         | local database server exposed via socket instead of direct
         | library calls.
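The preemption point above can be demonstrated with a small stdlib sketch (`hog` and `well_behaved` are illustrative names): a coroutine that blocks without ever awaiting stalls every other coroutine on the loop, because the event loop cannot preempt it.

```python
import asyncio
import time

async def well_behaved():
    await asyncio.sleep(0.05)  # yields to the event loop while waiting
    return "fast"

async def hog():
    # CPU-style blocking call: never yields, so nothing else on the
    # loop can run until it finishes
    time.sleep(0.2)
    return "slow"

async def main():
    start = time.monotonic()
    results = await asyncio.gather(hog(), well_behaved())
    return results, time.monotonic() - start

# well_behaved's latency balloons past 0.2s even though it only
# needs 0.05s of "network wait"
results, elapsed = asyncio.run(main())
```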
        
           | leafboi wrote:
           | >whereas the async frameworks are all pure Python.
           | 
            | No, it's not pure Python; it's a combination. The
            | underlying event loop uses libuv, the C library that makes
            | up the core of Node.js. The "Uvicorn" label is an
            | indicator of this, as Uvicorn uses uvloop, a libuv
            | wrapper.
            | 
            | Overall the benchmark is testing a bit of both: the event
            | loop runs in C, but it has to execute a bit of Python code
            | when handling each request.
           | 
           | >If it's being done via a database on the local machine, and
           | the communication with that database is not being done by
           | something like Unix sockets, but by direct calls into a
           | database library, then that's obviously going to cause
           | latency problems because the worker can't yield during the
           | database call.
           | 
            | I am almost positive it is being done with some form of
            | non-blocking sockets. The only other way to do this
            | without sockets would be to write to and read from a file.
           | 
            | There are no "direct library calls", as the database
            | server is a separate process from the server process.
            | Here's what occurs:
            | 
            |     1. Server makes a socket connection to the database.
            |     2. Server sends a request to the database.
            |     3. Database receives the request and reads from the
            |        database file.
            |     4. Database sends the information back to the server.
           | 
            | Any library call you're thinking of here is likely a
            | "client side" library, meaning that the library itself
            | makes a socket connection to the SQL server.
        
         | reticents wrote:
         | Wow, thank you for this link. I appreciate it when my
         | assumptions are challenged like this, particularly given the
         | fact that I have a tendency to take benchmark synopses like
         | FastAPI's [1] for granted. I'll have to be more conscious of
         | the ways in which authors hamstring the competition to game
         | their results.
         | 
         | [1] https://fastapi.tiangolo.com/benchmarks/
        
         | bob1029 wrote:
         | I think the story with async is always "it depends", unless we
         | are questioning whether the specific implementation is broken.
         | 
         | For some web applications, it might actually be faster (in
         | meaningful aggregate volume) to service a complete request on
         | the calling thread rather than deferring to the thread pool
          | periodically throughout execution. I think the break-even
          | point between sync and async comes down to how much I/O
          | (database) work is involved in satisfying the request. If
          | each request only hits the database 1-2 times on average,
          | incurring a few milliseconds of added latency, staying sync
          | all the way down might be better than adding any amount of
          | context switching. If each request may take 100-1000
          | milliseconds to complete overall, due to various long-
          | running I/O operations, then async is certainly one good
          | approach for maximizing the number of possible concurrent
          | requests.
         | 
         | In most of my applications (C#/.NET Core) I default to
         | async/await for backend service methods, because 9/10 times I
         | am going to the database multiple times for something and I
         | cannot always guarantee that it will return quickly under heavy
         | load. For other items, I explicitly go wide on parallelizable
         | CPU-bound tasks. All of these are handled as a blocking call
         | against a Parallel.ForEach(). Never would a CPU-bound task be
         | explicitly wrapped with async/await, but one may be included as
         | part of a larger asynchronous operation.
         | 
         | This stuff used to confuse the hell out of me, and then I
         | finally wrapped my head around the 2 essential code
          | abstractions: async/await for I/O, and Parallel.For() (et
          | al.) for CPU-bound tasks that have parallelism
          | opportunities. Never try
         | to Task.Run or async/await your way out of something that is
         | CPU-bound and is blocking the flow of execution. Try to
         | leverage asynchrony responsibly when delays >1ms are possible
         | in large concurrent volumes.
        
         | syndacks wrote:
         | Why is this being downvoted? Seems like a fair counter-point to
         | me.
        
           | ghostwriter wrote:
            | I didn't downvote it, but apart from the fact that async
            | IO is not meant to be faster (it's all about throughput,
            | after all), the benchmark is flawed, and it has been
            | discussed in full before:
            | https://news.ycombinator.com/item?id=23496994
        
             | leafboi wrote:
              | asyncio is meant to be "faster" for IO-heavy, low-
              | compute tasks. The benchmark tests requests per second,
              | which is indeed directly testing what you expect it to
              | test.
             | 
              | It's been discussed before, but the outcome of that
              | discussion (in the link you brought up) was divided.
              | Highly divided. There was no conclusion, and it is not
              | clear whether the benchmark was flawed.
              | 
              | The discussion is also littered with people who don't
              | understand why async is fast only for certain types of
              | work and slow for others, and with assumptions that the
              | test focused on compute rather than IO, which is quite
              | evidently not the case.
        
         | theptip wrote:
         | This article doesn't evaluate the case that you actually want
         | ASGI for, so I don't think it's very useful. (Or at least, it
         | confirms something that should have already been clear).
         | 
          | If you're compute-bound, then Python async (which uses
          | cooperative scheduling, similar to green threads) isn't
          | going to help you. You get concurrency, but not parallelism,
          | from this programming model; only one logical thread of
          | execution is running on the CPU at a time (per process), so
          | this can only slow you down if you are CPU-constrained.
         | 
          | The standard use case of a sync API backed by a local DB
          | with low request latency is typically going to be compute-
          | bound.
         | 
         | This is covered in the Django async docs
         | (https://docs.djangoproject.com/en/3.1/topics/async/) and also
         | in green threading libraries like gevent
         | (http://www.gevent.org/intro.html#cooperative-multitasking).
         | 
         | The case where async workers are interesting is for I/O-bound
         | workloads. Say you're building an API gateway, or your
         | monolithic API starts to need to call out to other API
          | services, particularly external ones like the Google Maps
          | API. In this case, the worst-case result is that the proxied
          | HTTP request times out; this could block your Django API's
          | worker thread for many seconds.
         | 
         | In the async / green-threaded model, this case is fine; you
         | have a green thread/async function call per request, and if
         | that gthread is blocked on an upstream I/O operation, the event
         | loop will just start working on a different API call until the
         | OS gives a response on the network socket.
         | 
          | Essentially, there's no reason to use Django async if you're
          | building a traditional monolithic DB-backed application.
          | It's going to give you benefits in use cases where the
          | standard sync model struggles.
         | 
         | (Note, there's an argument that you might want green threads
         | even in a normal monolith, to guard against cases like
         | "developer accidentally wrote a chunky DB query that takes 60
         | seconds to run for some inputs", but most DB engines don't
         | support one-DB-connection-per-HTTP-connection. There was a
         | bunch of discussion on this topic a few years ago, with the
         | SQLAlchemy author arguing that async is not useful for DB
         | connections:
         | https://techspot.zzzeek.org/2015/02/15/asynchronous-python-a...
         | although asyncio support was added: https://docs.sqlalchemy.org
         | /en/14/orm/extensions/asyncio.htm...)
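The I/O-bound gateway case described above can be sketched with the stdlib, using asyncio.sleep as a stand-in for proxied HTTP calls (the upstream names are made up; a real gateway would use something like httpx or aiohttp):

```python
import asyncio
import time

async def call_upstream(name, delay):
    # Stand-in for a proxied HTTP call; the await yields control to
    # the event loop while "waiting on the network"
    await asyncio.sleep(delay)
    return f"{name}: ok"

async def gateway_view():
    # Fan out to three upstreams concurrently; wall time is roughly
    # max(delay), not sum(delay)
    return await asyncio.gather(
        call_upstream("maps", 0.05),
        call_upstream("billing", 0.05),
        call_upstream("search", 0.05),
    )

start = time.monotonic()
results = asyncio.run(gateway_view())
elapsed = time.monotonic() - start  # ~0.05s, not ~0.15s
```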
        
           | leafboi wrote:
            | The tests aren't compute-bound; they measure requests per
            | second, and the testing is biased towards IO, not compute.
            | Please read the article.
        
       | dec0dedab0de wrote:
       | I think the article and some of the comments are not really
       | looking at this the right way.
       | 
        | For most things you're probably better off "doing the work" in
        | a celery task, regardless of whether it is IO-bound or CPU-
        | bound, and then using web sockets just for your status
        | updates/progress bar, instead of having your front end poll on
        | a timer.
        
       | hyuuu wrote:
        | Time and time again, whenever I start a new project, Django
        | has been my go-to choice after analyzing the alternatives.
        | I've worked on everything from large-scale, mono-repo,
        | billion-user systems to weekend side projects, and Django
        | really stays true to the batteries-included philosophy.
        
       | silviogutierrez wrote:
        | Great article. But I think this part may need a second look:
        | 
        |     If your views involve heavy-lifting calculations or long-
        |     running network calls to be done as part of the request
        |     path, it's a great use case for using async views.
        | 
        | That seems true for long-running network calls (IO). But for
        | heavy-lifting calculations? I thought that was _the_ canonical
        | example of a situation async won't improve. CPU-bound and
        | memory-bound, after all.
        
         | Znafon wrote:
          | You are correct; async will only help with long-running
          | network calls, which happen when calling another service or
          | querying a database.
          | 
          | When doing a long computation, the CPU is not idle, so there
          | is no free compute power to use for something else.
          | 
          | Finally, when doing IO calls in Python, the GIL is usually
          | released, so the kernel can already schedule another thread
          | while waiting for IO. It is therefore not certain that
          | converting to async will yield an improvement, and you
          | should benchmark if you plan on converting an existing
          | program.
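The point about the GIL being released during blocking IO can be seen in a small stdlib sketch. Here time.sleep stands in for a blocking socket read; like real socket IO, it releases the GIL while waiting, so the five waits overlap rather than serialize:

```python
import threading
import time

def blocking_fetch(results, i):
    # Stand-in for a blocking socket read; the GIL is released while
    # waiting, so all five threads wait concurrently
    time.sleep(0.1)
    results[i] = i

results = {}
start = time.monotonic()
threads = [threading.Thread(target=blocking_fetch, args=(results, i))
           for i in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.monotonic() - start  # ~0.1s total, not 5 * 0.1s
```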
        
           | pdonis wrote:
           | _> when doing IO calls in Python so GIL is usually released
           | so the kernel can already schedule another thread while
           | waiting for IO_
           | 
           | This is true, but scheduling another thread through the
           | kernel can have higher overhead since it requires context
           | switches. Running multiple threads also has other potential
           | issues with lock contention; how problematic they are will
           | depend on the use case.
           | 
           | The potential advantage of scheduling another thread is, of
           | course, that it can do CPU bound work; but in Python,
           | unfortunately, doing that means the GIL doesn't get released
           | so that thread will prevent any further network I/O while
           | it's running, the same as would happen in an async framework
           | if a worker did a lot of CPU work. So Python doesn't really
           | let you realize the advantages of threads in this context.
        
             | Znafon wrote:
             | > doing that means the GIL doesn't get released so that
             | thread will prevent any further network I/O while it's
             | running, the same as would happen in an async framework if
             | a worker did a lot of CPU work. So Python doesn't really
             | let you realize the advantages of threads in this context.
             | 
              | I don't think that's true; the GIL gets released for
              | many compute-intensive or IO-bound tasks in Python. For
              | example, when reading from a socket the GIL gets
              | released at
              | https://github.com/python/cpython/blob/e822e37946f27c09953b
              | b...
        
         | dec0dedab0de wrote:
          | I thought the only reason to use ASGI was to use web
          | sockets, and the only reason to use web sockets was to avoid
          | making multiple requests for things where it doesn't matter
          | if a particular message is lost.
        
         | ghostwriter wrote:
          | Perhaps they meant that heavy, long-running calculations
          | could be offloaded to a worker pool with the help of
          | concurrent.futures and run_in_executor()
         | 
         | - https://docs.python.org/3/library/concurrent.futures.html
         | 
         | - https://docs.python.org/3/library/asyncio-
         | eventloop.html#asy...
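A minimal sketch of that pattern, with a hypothetical heavy_calculation. A thread pool is used so the example stays self-contained, though (as noted elsewhere in the thread) a ProcessPoolExecutor is what actually sidesteps the GIL for CPU-bound Python code:

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

def heavy_calculation(n):
    # CPU-bound work that would otherwise block the event loop
    return sum(i * i for i in range(n))

async def handler():
    loop = asyncio.get_running_loop()
    # For truly CPU-bound work, swap in ProcessPoolExecutor so the
    # GIL in this process is not held by the worker
    with ThreadPoolExecutor() as pool:
        # The await suspends this coroutine until the worker is done,
        # leaving the event loop free to serve other requests
        return await loop.run_in_executor(pool, heavy_calculation, 10_000)

result = asyncio.run(handler())
```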
        
           | pdonis wrote:
           | This will only help if the workers are separate processes.
           | Thread workers will hold the GIL in Python and prevent
           | network I/O while they are doing CPU bound tasks.
        
             | ghostwriter wrote:
              | Sure - the pool only cares about the
              | concurrent.futures.Executor interface; an implementation
              | could use processes or cloud resources.
        
             | dr_zoidberg wrote:
             | > Thread workers will hold the GIL in Python and prevent
             | network I/O while they are doing CPU bound tasks.
             | 
              | Using Cython:
              | 
              |     with nogil:
              |         # whatever you need to do, as long as it
              |         # doesn't touch a python object
              | 
              | If you're doing heavy calculations from Python, you
              | should at least be considering Cython.
        
       ___________________________________________________________________
       (page generated 2020-08-14 23:00 UTC)