[HN Gopher] Negative dentries, 20 years later
       ___________________________________________________________________
        
       Negative dentries, 20 years later
        
       Author : bitcharmer
       Score  : 23 points
       Date   : 2022-04-11 19:28 UTC (3 hours ago)
        
 (HTM) web link (lwn.net)
 (TXT) w3m dump (lwn.net)
        
       | alyandon wrote:
        | So much fun it is to have multiple servers with 1.5 TB of RAM
        | slowly fill up with negative dentries until the kernel finally
        | decides that memory pressure is a thing and purges them all at
        | once - which leaves the server locked up and unresponsive for
        | about three minutes. Oh yeah, and there are no tunables to
        | control the negative-dentry-specific behavior.
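        | 
        | For what it's worth, here's a rough way to watch this from
        | userspace (a sketch, assuming Linux; on kernels from roughly
        | 5.0 on, the fifth field of dentry-state is the negative-dentry
        | count, on older kernels it's just a dummy):
        | 
        |     # nr_dentry, nr_unused, age_limit, want_pages, nr_negative
        |     cat /proc/sys/fs/dentry-state
        | 
        |     # closest existing knob: biases reclaim of dentries/inodes
        |     # versus the page cache, but isn't negative-dentry specific
        |     sysctl vm.vfs_cache_pressure
        | 
        |     # blunt instrument (as root): drop clean dentries/inodes now
        |     echo 2 > /proc/sys/vm/drop_caches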
        
       | bjourne wrote:
       | How could a process accumulate hundreds of negative dentries
        | during normal operation? If that happens, something fishy is
        | going on.
        
         | pdonis wrote:
         | The third paragraph in the article addresses this. Most of the
         | negative dentries a process generates are going to be invisible
         | to the user.
        
           | Groxx wrote:
           | Yeah, my immediate thought was of stuff like "ask nginx to
           | give me this misspelled webpage" -> look up that file -> bam,
           | dentry.
           | 
           | Expose practically anything on the internet that leads to a
           | piece of user input causing a filesystem operation (which is
           | going to be _extremely_ common and often unavoidable), and
           | "hundreds" isn't the concern. Millions to billions is.
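            | 
            | A quick way to see the effect (a sketch; it assumes an
            | nginx on localhost serving static files straight off the
            | filesystem, plus the dentry-state interface mentioned
            | above):
            | 
            |     cat /proc/sys/fs/dentry-state
            |     # request a pile of pages that don't exist on disk;
            |     # each one is a failed lookup in the docroot
            |     for i in $(seq 100000); do
            |         curl -s -o /dev/null "http://localhost/nope-$i.html"
            |     done
            |     cat /proc/sys/fs/dentry-state
            |     # the dentry counts should jump by roughly 100k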
        
         | teraflop wrote:
         | The article gives some examples, but here's another one:
         | consider what happens when you run "git status" on a large
         | repository. In order to determine the status of every file, git
         | needs to check each subdirectory for a ".gitignore" file, even
         | though the vast majority of the time there's only one at the
         | root. All of those nonexistent files can become negative
         | dentries.
         | 
         | In theory, git could do its own caching, or it could be
         | refactored so as to move the .gitignore scanning inline with
         | the main directory traversal. In practice, it doesn't do so (at
         | least as of 2.30.2, which is the version I just tested). And I
         | don't think it's reasonable to expect every program to contort
         | itself to minimize the number of times it looks for nonexistent
         | files, when that's something that can reasonably be delegated
         | to the OS.
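          | 
          | (You can watch it happen with strace - a sketch, and the
          | exact syscall and count will vary by platform and git
          | version:
          | 
          |     strace -f -e trace=openat git status 2>&1 \
          |         | grep -c 'gitignore.*ENOENT'
          | 
          | which counts how many times git tried to open a .gitignore
          | that wasn't there.)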
        
           | bjourne wrote:
           | Well, recursive configuration is a misfeature of git. No sane
            | program should work that way. But regardless, if git scans
            | every subdirectory, then the kernel should have already
            | cached all "positive" dentries, obviating the need for any
           | negative ones. And that cache must be an order of magnitude
           | larger than any negative cache.
        
           | Groxx wrote:
           | The only way git could cache this is with a file-system
            | watcher... and tbh my experience with git fswatchers has
            | been _utterly abysmal_. They miss things / have phantom
           | changes routinely, on the order of a couple times per week,
           | so I just disable them outright now.
           | 
           | Without a truly reliable filesystem watching technique, git
           | _cannot_ cache this information. Any cached knowledge needs
           | to be verified because it could have changed, which defeats
           | the whole purpose of a cache here. It could change its
            | strategy, e.g. only check at the repo root or require a
            | list of "enabled" .gitignores, but it doesn't do that
            | currently.
        
             | teraflop wrote:
             | Oh, I just meant caching within the lifetime of a single
             | git command's execution.
             | 
             | As it stands, when you run "git status", the git executable
             | goes through each directory in your working copy, lists all
             | of its directory entries, and then _immediately afterward_
             | calls open() to see if a .gitignore file exists in that
              | directory -- even if there wasn't one in the list it just
             | read.
             | 
             | That's what I meant by saying that in principle, there's no
             | reason git couldn't just cache the file's presence/absence
             | on its own, just for those few microseconds (maybe "cache"
             | was a poor choice of words). In practice, I can understand
             | that the implementation complexity might not be worth it.
        
             | bsder wrote:
              | Does _not_ caching this really help, though?
             | 
             | What happens when the .gitignore above you changes while
             | you are in the midst of scanning a subdirectory? Is that
             | not the same problem?
             | 
             | The problem is that git operations aren't transactional
             | with respect to the filesystem, no? Sure, the window of
             | uncertainty changes, but it's never zero.
        
       | PaulHoule wrote:
       | ... hmmm, makes me want to write a script that looks up 5 files
       | that aren't there in every directory in the system.
        
         | zokier wrote:
          | something like?
          | 
          |     find / -type d -exec sh -c \
          |       'for i in $(seq 5); do
          |          stat "$1/$(head -c 15 /dev/urandom | base32)" 2> /dev/null
          |        done' sh {} \;
        
       | pdonis wrote:
       | Aren't there already well-tested LRU cache algorithms out there?
        | It seems like adapting one of those to kernel cache management
        | would be a good way to address the problem described in the
        | article.
        
         | MBCook wrote:
          | LRU works by using a specific cache size. How big should that
          | be? Will that number work for small microcontrollers and
          | giant servers? Is that per disk? Per directory? Per system?
          | Per kernel?
          | 
          | Should that size expand and contract over time somehow?
         | 
         | The article gets into a little bit of that stuff, but they're
         | all real problems.
        
         | Groxx wrote:
          | tbh this was my thought too. No need for background processing
          | either; there are quite a few options for amortized-constant-
          | time old-value deletion at lookup time. Or do that cleanup
          | while checking the disk on a cache miss - if you have no
          | misses, the cache is working _fantastically_ and probably
          | shouldn't be pruned.
         | 
         | Or is there some reason those aren't acceptable in-kernel,
         | while apparently a lack of cleanup is fine? Maybe CPU use is
         | too high...?
        
         | tedunangst wrote:
         | And how many entries will you keep in the LRU?
        
       | jjoonathan wrote:
       | NXDOMAIN caching has been the bane of my existence recently and I
       | have been meaning to survey how other cache implementations deal
       | with the problem. Looks like everyone suffers from it. I've heard
       | edgy teenagers say that existence is suffering, but I suppose
       | caches teach us that non-existence is suffering too.
        
         | trashcan wrote:
          | NXDOMAIN at least has a TTL, so the cache will not grow
          | forever. What problem are you referring to - not being able
          | to invalidate a cached NXDOMAIN? Or something else?
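          | 
          | For anyone wondering where that TTL comes from: the negative
          | answer carries the zone's SOA record in the authority
          | section, and per RFC 2308 resolvers cache the NXDOMAIN for
          | (roughly) the smaller of that SOA record's TTL and its
          | MINIMUM field. A sketch, assuming the name genuinely doesn't
          | exist:
          | 
          |     # show the SOA that came back with the NXDOMAIN
          |     dig +noall +authority this-does-not-exist.example.com
          |     # rdata: mname rname serial refresh retry expire minimum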
        
       ___________________________________________________________________
       (page generated 2022-04-11 23:00 UTC)