[HN Gopher] Ban 1+N in Django
       ___________________________________________________________________
        
       Ban 1+N in Django
        
       Author : Suor
       Score  : 42 points
       Date   : 2023-03-26 12:03 UTC (10 hours ago)
        
 (HTM) web link (suor.github.io)
 (TXT) w3m dump (suor.github.io)
        
       | spapas82 wrote:
       | This is very useful! I'm gonna start integrating it with my
       | projects. However, having a way to allow/not allow n+1 queries
       | (like a context manager) would be much better.
       | 
       | The thing is that there are times where n+1 isn't a big problem
       | and fixing it would be a form of premature optimisation. I'd
       | prefer to be in control and decide if I care about the n+1 query
       | situation or not for some specific view.
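
A minimal sketch of the opt-out spapas82 describes, in plain Python rather than Django. The names `allow_n_plus_one`, `lazy_load`, and `LazyLoadBanned` are hypothetical and are not taken from the article or from any library; the sketch only assumes that a thread-local flag plus a context manager is enough to lift the ban for one block of code.

```python
import contextlib
import threading

# Hypothetical per-thread switch; lazy loads are banned unless it is set.
_state = threading.local()


class LazyLoadBanned(Exception):
    """Raised when a per-row lazy load happens outside an allowed block."""


def lazy_loads_allowed():
    return getattr(_state, "allowed", False)


@contextlib.contextmanager
def allow_n_plus_one():
    """Temporarily permit lazy loads, restoring the previous setting on exit."""
    previous = lazy_loads_allowed()
    _state.allowed = True
    try:
        yield
    finally:
        _state.allowed = previous


def lazy_load(loader):
    """Stand-in for the place where an ORM would issue the extra query."""
    if not lazy_loads_allowed():
        raise LazyLoadBanned("lazy load attempted outside allow_n_plus_one()")
    return loader()
```

A view that does not care about the extra queries would wrap its body in `with allow_n_plus_one():`, while everything else keeps failing loudly.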
        
       | nickjj wrote:
       | There is a case where having N+1 queries is beneficial.
       | 
       | In Rails terms, it's when you perform Russian doll caching, but
       | you can do this in any framework. The idea is you can cache a
       | specific X thing which might make a query to an associated Y
       | thing. This is a textbook N+1 query case (i.e. a list of posts
       | (X) that each get the author's name (Y)).
       | 
       | If you render the view without any cache with 10 things then
       | you'd perform 20 queries but after the cache is warm you'd
       | perform 0 queries. If item 5's Y gets updated then you only need
       | to bust the cache for item 5 and query only item 5's Y
       | association. Performing a preloaded query to get all X 10 things
       | with their Y associated things could be an expensive query.
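
The query arithmetic above can be sketched with a toy fragment cache. This is an illustration under stated assumptions, not Rails or Django code: the tables are in-memory dicts, `query`, `render_post`, and `render_page` are hypothetical names, and each uncached item is assumed to cost two queries (one for the post, one for its author), which is one way to arrive at 20 queries for 10 items.

```python
# In-memory stand-ins for two database tables.
POSTS = {i: {"title": f"Post {i}", "author_id": 100 + i} for i in range(1, 11)}
AUTHORS = {100 + i: {"name": f"Author {i}"} for i in range(1, 11)}

query_count = 0
fragment_cache = {}  # post id -> rendered fragment


def query(table, key):
    """Pretend database lookup that counts every call."""
    global query_count
    query_count += 1
    return table[key]


def render_post(post_id):
    """Render one item, hitting the 'database' only on a cache miss."""
    if post_id in fragment_cache:
        return fragment_cache[post_id]
    post = query(POSTS, post_id)
    author = query(AUTHORS, post["author_id"])
    html = f"{post['title']} by {author['name']}"
    fragment_cache[post_id] = html
    return html


def render_page():
    """Render all items and report how many queries this render needed."""
    global query_count
    query_count = 0
    page = [render_post(post_id) for post_id in sorted(POSTS)]
    return page, query_count
```

With an empty cache `render_page()` reports 20 queries, a second call reports 0, and after `fragment_cache.pop(5)` only the two queries for item 5 are repeated.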
        
         | fiddlerwoaroof wrote:
         | The downside here is a potential thundering herd issue if
         | you're forced to clear the cache.
        
         | zdragnar wrote:
         | That's great if you can fit a lot of your database in your
         | server's memory, but seems like a terrible headache once you
         | get a decent number of users.
         | 
         | Personally, I'd much rather have sane queries in the first
         | place, but Rails isn't really my cup of tea either, so take my
         | opinion with a large pinch of salt if you do.
        
           | Kamq wrote:
           | > That's great if you can fit a lot of your database in your
           | server's memory, but seems like a terrible headache once you
           | get a decent number of users.
           | 
           | You'd surely care about getting a significant chunk of your
           | usage in server memory rather than what percentage of total
           | data that is, no?
           | 
           | To take the site we're on as an example, I'd be willing to
           | bet the 30 things on the front page have one or two orders of
           | magnitude more traffic than anything else (and probably a few
           | more orders of magnitude more than the median post).
        
         | tveita wrote:
         | You'd ideally want to do something like dataloader, where you
         | look up your N Xs in a single cache query, and then do a single
         | database lookup for the (N-C) Xs that weren't in cache. You can
         | then either eagerly load the Ys with the Xs like you said, or
         | do a secondary cache lookup for every Y, and potentially
         | another single database query for the Ys not in cache.
         | 
         | Unfortunately this pattern gets really hairy if you're not
         | using promises and an event loop.
         | 
         | https://www.npmjs.com/package/dataloader
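
A rough synchronous sketch of the lookup tveita describes: one pass over the cache for all N keys, then a single backend call for the keys that missed. `load_many` and `fetch_many` are hypothetical names, and the cache is assumed to be a plain dict. The npm dataloader package collects keys implicitly through the event loop, whereas here the caller has to gather the keys up front, which is where the hairiness without promises comes from.

```python
def load_many(keys, cache, fetch_many):
    """Return {key: value} using one cache pass and at most one backend call.

    cache      -- dict-like cache, updated in place with fetched values
    fetch_many -- callable taking a list of keys and returning {key: value}
    """
    # Everything already cached is served without touching the backend.
    found = {key: cache[key] for key in keys if key in cache}

    # Deduplicate the misses while keeping their original order.
    missing = [key for key in dict.fromkeys(keys) if key not in found]

    if missing:
        fetched = fetch_many(missing)  # a single batched lookup
        cache.update(fetched)
        found.update(fetched)
    return found
```

The same function can be applied a second time to the Y keys collected from the loaded Xs, giving at most two backend calls in total.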
        
           | stephen wrote:
           | +1. The JS event loop auto-monad-izing Promises into Haxl
           | [1]-esque trees of implicitly-batched loads has been a big
           | win for us building on JavaScript/TypeScript.
           | 
           | If I had to move to another language, I'd really want to find
           | a "powered by the event loop / dataloader" framework, i.e.
           | Vert.x for Java.
           | 
           | Also, per dataloader, a shameless plug for our ORM that has
           | dataloader de-N+1-ing built natively into all object graph
           | traversals:
           | 
           | https://joist-orm.io/docs/goals/avoiding-n-plus-1s
           | 
           | [1]: https://github.com/facebook/Haxl
        
       | code_biologist wrote:
       | On an unrelated note, Python folks should check out OP's library
       | funcy [1]: "A collection of fancy functional tools focused on
       | practicality. Inspired by clojure, underscore and my own
       | abstractions."
       | 
       | Thanks for the library Suor!
       | 
       | [1] https://github.com/Suor/funcy
        
       | jonatron wrote:
       | See also django-zen-queries
       | (https://github.com/dabapps/django-zen-queries), which can make
       | it impossible for changes to a template to trigger queries.
        
       | Dachande663 wrote:
       | Laravel has had a similar feature for a while[0]. It's been
       | useful to enable this in development but only warn in production
       | to prevent breaking things unexpectedly.
       | 
       | [0] https://laravel.com/docs/10.x/eloquent-relationships#prevent...
        
       | lr4444lr wrote:
       | I don't disagree with the author in principle, but I find that
       | once the data gets big enough for it to make a difference, I've
       | already shifted to using ".values()" to avoid the overhead of
       | model creation, and the KeyErrors that get thrown if I leave the
       | query lazy are tantamount to the solution he describes.
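
A small illustration of the KeyError behaviour mentioned above, using plain dicts as a stand-in for the rows that Django's `.values()` yields. The field names and the `author_names` helper are made up for the example; the point is that a dict row has no way to go back to the database, so a column that was never selected fails immediately instead of triggering an extra query per row.

```python
# Stand-in for something like Post.objects.values("id", "title"):
# each row is a plain dict holding only the columns that were requested.
rows = [
    {"id": 1, "title": "Ban 1+N in Django"},
    {"id": 2, "title": "Another post"},
]


def author_names(rows):
    """Read a related column that the query above never selected."""
    return [row["author__name"] for row in rows]


def titles(rows):
    """Read a column that was selected; this works without any extra query."""
    return [row["title"] for row in rows]
```

Calling `author_names(rows)` raises `KeyError: 'author__name'` on the first row, which is the loud failure the comment relies on.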
        
       | btown wrote:
       | My Chaotic Good take on this: one could implement a
       | qs.auto_fetch_deferred() that emits model instances with weak
       | back-references to a WeakSet of all instances emitted, and on a
       | deferred get on ANY instance, it prefetches that attribute onto
       | ALL of the instances... so that it doesn't just complain, but
       | actually _fixes_ your 1+N issue. But here lies absolute
       | madness...
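
A toy version of btown's idea in plain Python, to show the mechanics only. `Row`, `emit_rows`, and the `bulk_fetch` callback are hypothetical, and nothing here is Django API (`qs.auto_fetch_deferred()` does not exist). In this sketch each row holds the shared `WeakSet` directly, and the first access to a missing field on any row fills that field on every sibling that is still alive.

```python
import weakref


class Row:
    """Minimal stand-in for a model instance with lazily loaded fields."""

    def __init__(self, pk, siblings):
        self.pk = pk
        self._siblings = siblings  # WeakSet shared by all rows of one result
        self._loaded = {}

    def get(self, field, bulk_fetch):
        """Return a field, loading it for all live siblings on first access."""
        if field not in self._loaded:
            peers = list(self._siblings)
            # One batched call for every live row instead of one per row.
            values = bulk_fetch(field, [peer.pk for peer in peers])
            for peer in peers:
                peer._loaded[field] = values[peer.pk]
        return self._loaded[field]


def emit_rows(pks):
    """Create rows that all share one WeakSet of their siblings."""
    siblings = weakref.WeakSet()
    rows = [Row(pk, siblings) for pk in pks]
    for row in rows:
        siblings.add(row)
    return rows
```

Because the set only holds weak references, rows that the caller has already dropped are not kept alive and are not included in later batches.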
        
       | binarymax wrote:
       | This is why I always advocated against ORMs. It's so easy to fall
       | into traps like this without even knowing it, and while you can
       | work around it in some ORMs it is not obvious.
       | 
       | Writing SQL is not that hard, and mapping the results to a type
       | isn't that hard either. So with an ORM you might end up saving
       | several hours of work up front for lots of pain later.
        
         | mattbillenstein wrote:
         | You're on the right side of the bell-curve meme my friend, but
         | there are a lot more people in the thick-framework camp who
         | spend their days getting lost in the complexity of ORMs and
         | related tech...
        
       | gus_massa wrote:
       | There is a problem with the URL. I think this is the correct one
       | https://suor.github.io/blog/2023/03/26/ban-1-plus-n-in-djang...
        
         | Suor wrote:
         | Thanks
        
         | Suor wrote:
         | HN keeps automatically replacing the URL. It used to be a
         | redirect before, but not anymore
        
           | dang wrote:
           | As gus_massa pointed out, HN's software uses canonical URLs
           | when it finds them. The canonical URL on the page you
           | submitted was
           | http://hackflow.com/blog/2023/03/26/ban-1-plus-n-in-django, so
           | our software used that. I've fixed it above now.
        
           | gus_massa wrote:
           | HN has a feature that uses the canonical address in the
           | webpage. Is your page configured correctly?
           | 
           | Looking at the source:
           | 
           |     <link rel="canonical" href="http://hackflow.com/blog/2023/03/26/ban-1-plus-n-in-django">
           | 
           | Don't repost it again (for now). Try fixing the canonical
           | link, and send an email to the mods hn@ycombinator.com with a
           | short explanation and a link to this post
           | https://news.ycombinator.com/item?id=35313565 to save them a
           | few minutes searching. They may fix it using admin magic or
           | ask you to repost again once the problem is fixed.
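
For checking a page before resubmitting, here is a small sketch that pulls the canonical URL out of an HTML string with the standard library. It is not how HN's software does it, which the thread does not describe; `CanonicalFinder` and `find_canonical` are names made up for this example.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Remember the href of the first <link rel="canonical"> tag seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        # rel may hold several space-separated tokens, or no value at all.
        rels = (attributes.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rels and self.canonical is None:
            self.canonical = attributes.get("href")


def find_canonical(html):
    """Return the canonical URL declared in the HTML, or None if absent."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical
```

Feeding it the tag quoted above returns the hackflow.com address, which is the URL the submission was rewritten to.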
        
       | Waterluvian wrote:
       | On the topic of accidentally doing lots of extra queries, I love
       | using Model.objects.raw to control exactly how it's retrieving
       | model data. I love how it keeps the results inside the Model
       | realm but gives you careful control.
       | 
       | I also like that if you access fields you didn't ask for, it'll
       | go get them. But I wish it screamed louder when this happened.
       | "Your logic works but we had to do extra queries. This might be
       | an error!" So the featured article is incredibly valuable. This
       | ought to be built-in.
       | 
       | Django Silk has been critical for discovering these cases.
       | They're too easy to do.
       | 
       | I love the middle ground of "ORM but you write the SQL."
        
       ___________________________________________________________________
       (page generated 2023-03-26 23:00 UTC)