[HN Gopher] Cache made consistent: Meta's cache invalidation sol...
       ___________________________________________________________________
        
       Cache made consistent: Meta's cache invalidation solution
        
       Author : uvdn7
       Score  : 143 points
       Date   : 2022-06-08 17:46 UTC (5 hours ago)
        
 (HTM) web link (engineering.fb.com)
 (TXT) w3m dump (engineering.fb.com)
        
       | Hnrobert42 wrote:
       | You can't even visit the FB blog while using NordVPN. Amazing.
        
       | [deleted]
        
       | gwbas1c wrote:
       | > Based on consistency tracing, we know the following happened...
       | 
       | I'm a little confused:
       | 
       | - Is the database updated notification going out while the
       | transaction is still in-progress? Doesn't it make more sense to
       | delay the notification until after the transaction commits?
       | 
       | - If a cache tries to update itself while there's an open
       | transaction against the data it's trying to read, shouldn't the
       | database block the operation until the write is complete? And
       | then shouldn't another notification after the transaction commits
       | trigger the cache to update?
        
         | uvdn7 wrote:
         | > Is the database updated notification going out while the
         | transaction is still in-progress?
         | 
         | No, that's not what happened.
         | 
         | > If a cache tries to update itself while there's an open
         | transaction against the data it's trying to read, shouldn't the
         | database block the operation until the write is complete?
         | 
         | Yes. Generally speaking, having a read transaction here would
         | be better. There are unrelated reasons why it's done this way.
         | The point of the example is that it's really intricate and we
         | can still identify the bug regardless.
        
       | pdevr wrote:
       | From the article:
       | 
       | >>Polaris pretends to be a cache server
       | 
       | De facto, it is a cache server, with the associated performance
       | decrease. Isn't the main purpose of caching to improve
       | performance?
       | 
       | >>We deploy Polaris as a separate service so that it will scale
       | independently from the production service and its workload.
       | 
        | Assuming the scaling is horizontal - to synchronize among the
        | service instances, what do you do? Create another meta-Polaris
        | service? Not rhetoric or sarcasm - hoping for an open
        | discussion.
        
         | [deleted]
        
         | uvdn7 wrote:
         | > De facto, it is a cache server, with the associated
         | performance decrease. Isn't the main purpose of caching to
         | improve performance?
         | 
          | Polaris only receives cache invalidation events, and doesn't
         | serve any client queries.
         | 
         | > so then, to synchronize among the service instances
         | 
         | It doesn't synchronize among the service instances. It's
         | actually basically stateless. It pretends to be a cache server
         | only for receiving invalidation events, and acts as a client,
         | and assumes no knowledge of the cache internals.
        
           | pdevr wrote:
           | Thanks for the clarification. However, in that case, won't
           | the querying of all the cache copies/replicas turn out to be
           | the bottleneck at high volumes? Because you are going to have
           | to check all the existing copies/replicas somewhere, right?
        
             | uvdn7 wrote:
              | Yep. Polaris checks can be sampled, but still cover pretty
              | much all types of workloads and scenarios/races over time.
        
       | bklaasen wrote:
       | > Phil Karlton famously said, "There are only two hard things in
       | computer science: cache invalidation and naming things."
       | 
       | ...and off-by-one errors.
        
       | [deleted]
        
       | Trufa wrote:
        | This seems to be a pure clickbait title; cache invalidation will
        | always be hard since it's basically a balanced give-and-take
        | issue which simple logic dictates can't be avoided.
        
         | uvdn7 wrote:
         | Can you elaborate?
         | 
         | I went into details in the post about what I believe is the
         | root cause of why cache invalidation is hard, which seems
         | different than what you are saying.
        
       | orf wrote:
       | > Take TAO, for example. It serves more than one quadrillion
       | queries a day
       | 
       | That's 11,574,074,074 requests per second.
        
       | irrational wrote:
       | Phil Karlton famously said, "There are only two hard things in
       | computer science: cache invalidation and naming things and off-
       | by-one errors."
        
       | ntoskrnl wrote:
       | Correct me if I'm wrong, but the title seems a little clickbaity.
       | "Cache invalidation might no longer be hard" --> "We built a
       | tracing library and manually fixed the bugs it uncovered"
        
         | kilburn wrote:
         | "A little clickbaity" if you are generous. Short-sighted and
         | grandiloquent if you are not.
         | 
         | Short-sighted because it narrows the original problem (cache
         | invalidation) to a specific version ( _detecting_ cache
         | invalidation errors _past an arbitrary time-frame_ ).
         | 
         | Grandiloquent because it then claims to have "solved" the
         | general problem, whereas they haven't even solved the narrowed
         | version. Notice their own premise:
         | 
         | > For any anomaly in a stateful service, it is an anomaly only
         | if clients can observe it one way or the other. Otherwise, we
         | argue that it doesn't matter at all.
         | 
         | This is a practical attitude that can yield great results, and
         | probably even good engineering practice. However, it will never
          | lead to _solving_ a problem, because a problem isn't solved
         | until you _prove_ that it is.
        
           | uvdn7 wrote:
           | > For any anomaly in a stateful service, it is an anomaly
           | only if clients can observe it one way or the other.
           | Otherwise, we argue that it doesn't matter at all.
           | 
           | Thanks for the comment! I stand by this claim; I would love
           | to hear more about why an anomaly (any anomaly) matters if it
            | can't possibly be observed by any client of a stateful
            | service.
        
             | tofuahdude wrote:
             | It is just the difference between "matters" and "solved".
        
             | kilburn wrote:
             | The claim itself is fine because it is kind of
             | tautological. An error that _cannot be observed in any way_
             | is probably not an error. However, you _have_ to prove that
              | it can't be observed.
             | 
             | This is quite different than what the described system does
             | though. The system flags errors that _have_ been observed.
             | The correct claim to go with what the article does would
             | thus be:
             | 
             | > For any anomaly in a stateful service, it is an anomaly
             | only if the current clients running their current
             | operations happen to observe it. Otherwise, we argue that
             | it doesn't matter at all.
             | 
             | It's much more noticeable this way, isn't it?
        
               | uvdn7 wrote:
               | This is a fair point!
               | 
               | The exact same argument applies to normal exceptions and
               | errors. It might be the case that out of quadrillions of
               | queries a day, just none of them hit the error path, and
               | hence we won't observe any issues.
               | 
                | I can see that in this case, whether it matters or not
                | becomes somewhat subjective. It's a good point though!
        
         | uvdn7 wrote:
         | I am the author; so I am obviously biased here.
         | 
         | I am serious when I say cache invalidation might no longer be a
         | hard thing in computer science. In the post, I explained why
         | cache invalidation is hard; and how we solve/manage its unique
         | challenge and complexity. By my definition, we are solving the
         | cache invalidation problem.
         | 
         | The analogy I have is Paxos. It's notoriously hard to implement
         | Paxos correctly. Google published a paper on Paxos Made Live
         | just on how they managed the implementation and
         | productionization complexity.
        
           | continuational wrote:
           | Please be careful with such bold claims. You don't really
           | address the hard part of cache invalidation, which is to
           | figure out _when_ to do it.
        
             | uvdn7 wrote:
             | > which is to figure out when to do it.
             | 
             | Can you elaborate?
        
               | continuational wrote:
               | Sure - you typically cache the result of some expensive
               | query.
               | 
               | The hard part of cache invalidation is to detect _when_
               | an update somewhere in your system is going to affect the
               | result of that query, such that you need to trigger an
               | invalidation.
        
               | uvdn7 wrote:
               | Good point!
               | 
               | We have memcache which is a look-aside cache that serves
               | this type of workload. What you described can be solved
               | by adding one level of abstraction and letting reads and
               | writes all go through it.
               | 
                | Now on your read path, you can construct arbitrarily
                | complex SQL queries or whatnot, but it must take some
               | kind of input to filter on. Those become part of the
               | "keys".
               | 
               | The invariant is that as long as the "context/filter" you
               | encode covers all the mutations which would impact your
               | cache data, you should be good. Based on our experience,
                | it has been fairly manageable.
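                | 
                | Roughly, a sketch of the idea (the names below are made
                | up for illustration, not our actual API): the read path
                | derives its cache key from the filter inputs, and the
                | write path drops every key whose filter covers the
                | mutation.
                | 
                |   cache = {}   # stand-in for memcache
                |   db = {}      # user_id -> [(friend_id, age)]
                | 
                |   def get_friends(user_id, min_age):
                |       key = ("friends", user_id, min_age)  # filter = key
                |       if key not in cache:                 # look-aside
                |           rows = db.get(user_id, [])
                |           cache[key] = [f for f, a in rows if a >= min_age]
                |       return cache[key]
                | 
                |   def add_friend(user_id, friend_id, age):
                |       db.setdefault(user_id, []).append((friend_id, age))
                |       # invariant: drop every key whose filter covers
                |       # this mutation
                |       for key in list(cache):
                |           tag, uid, lo = key
                |           if tag != "friends" or uid != user_id:
                |               continue
                |           if age >= lo:
                |               del cache[key]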
        
               | irrational wrote:
               | >The invariant is that as long as the "context/filter"
               | you encode covers all the mutations which would impact
               | your cache data, you should be good.
               | 
               | Well, isn't this the truly hard part of cache
               | invalidation?
        
               | uvdn7 wrote:
                | I can see that. One can argue that the "difficulty" is
                | front-loaded. Whether or not you can identify the list of
                | "context" is a local question. I guess you can probably
                | come up with complicated dependencies and argue it's hard
                | to capture them, and I would agree with you.
               | 
               | Now getting back to the "what's really hard about cache
               | invalidation" part. Even with a much simpler model. Say
               | you just have a k/v store, no joins, nothing. Is cache
               | invalidation simple in that case?
               | 
               | I went into details about why it's still insanely hard.
               | And the big example at the end might help make that
               | point.
               | 
               | And this is one level beneath challenges from tracking
               | dependencies, and I argue that's what makes cache
               | invalidation hard.
               | 
               | Now going back to your example with complicated
               | dependencies. Maybe TTL is a better solution. With many
               | dependencies, any changes from the dependency list might
                | trigger invalidation. At some point, just doing TTL
                | would be simpler.
        
               | ahahahahah wrote:
               | Yeah, the entire discussion here is fucked because the
               | poster doesn't understand what the original quote even
               | meant.
        
           | darig wrote:
        
         | dang wrote:
         | Yes - we changed it now and I posted
         | https://news.ycombinator.com/item?id=31672562.
        
       | londons_explore wrote:
       | > In other words, 99.99999999 percent of cache writes are
       | consistent within five minutes.
       | 
       | Those are the kind of rates that lead to very hard to find bugs.
        | You won't write code in downstream systems to properly
        | handle that failure case, and suddenly someone will get paged at
       | 3am because it has just happened years after the code was
       | deployed and nobody really understands how this corner case
       | suddenly happened!
       | 
       | When I'm building systems, every happy codepath should either
       | happen frequently or never.
       | 
       | Having a cache whose design allows it to be stale but only very
       | rarely seems like a bad idea. If I was building something on top
       | of this cache I would have a layer on top to force staleness for
       | 1 in 1000 requests just to force the application built on top to
       | be designed to handle that properly.
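        | 
        | Something like this hypothetical wrapper is what I mean - on a
        | tiny fraction of reads it deliberately serves the previous
        | value, so callers have to tolerate staleness from day one:
        | 
        |   import random
        | 
        |   class StalenessInjectingCache:
        |       def __init__(self, stale_rate=1 / 1000):
        |           self.data = {}        # current values
        |           self.previous = {}    # value before latest write
        |           self.stale_rate = stale_rate
        | 
        |       def set(self, key, value):
        |           if key in self.data:
        |               self.previous[key] = self.data[key]
        |           self.data[key] = value
        | 
        |       def get(self, key):
        |           # occasionally serve the stale copy on purpose
        |           if (key in self.previous and
        |                   random.random() < self.stale_rate):
        |               return self.previous[key]
        |           return self.data.get(key)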
        
         | uvdn7 wrote:
         | I agree with everything you said.
         | 
         | > Having a cache whose design allows it to be stale but only
         | very rarely seems like a bad idea.
         | 
         | And that's exactly why we first brought this number from 6 9's
         | to more than 10 9's; and we are not done yet. Also the cache is
         | not "designed" to be inconsistent. It's like Paxos is designed
         | to solve the consensus problem, but when implemented, most
         | Paxos implementations do not work properly.
        
       | VWWHFSfQ wrote:
       | As naive as it is, I always kinda liked MySQL's query cache
       | invalidation strategy: blow away the whole cache anytime a
       | mutating statement arrives at the server.
       | 
       | Simple. Effective. But it obviously doesn't work for everything.
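        | 
        | As a hypothetical sketch (not MySQL internals), the whole
        | strategy fits in a few lines:
        | 
        |   class QueryCache:
        |       def __init__(self, run_sql):
        |           self.run_sql = run_sql   # callable hitting the DB
        |           self.results = {}
        | 
        |       def select(self, sql):
        |           if sql not in self.results:
        |               self.results[sql] = self.run_sql(sql)
        |           return self.results[sql]
        | 
        |       def mutate(self, sql):
        |           self.run_sql(sql)
        |           self.results.clear()   # blow away the whole cache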
        
         | Trufa wrote:
          | Cause who needs scaling anyway!? Joking aside, the strategy
          | seems logical for smaller stuff.
        
       | judofyr wrote:
       | > At a high level, Polaris interacts with a stateful service as a
       | client and assumes no knowledge of the service internals. This
       | allows it to be generic. We have dozens of Polaris integrations
       | at Meta. "Cache should eventually be consistent with the
       | database" is a typical client-observable invariant that Polaris
       | monitors, especially in the presence of asynchronous cache
       | invalidation. In this case, Polaris pretends to be a cache server
       | and receives cache invalidation events.
       | 
       | I'm a bit confused here. Polaris behaves as a regular cache
       | server which receives invalidation events, but isn't the typical
       | bug related to cache invalidation that a service _forgets_ to
       | contact the cache server for invalidation? So this will only
       | catch cases where (1) you remember to contact Polaris, but you
       | forgot to contact other cache servers [which Polaris happens to
        | know about], OR (2) you're not handling errors during
        | invalidation requests to the cache server [and the request to
        | Polaris was successful]? Or are your cache servers "smart" and
        | might they have internal logic which "ignores" an invalidation?
       | 
       | What am I missing?
       | 
       | EDIT: Reading through "A real bug we found and fixed this year"
       | and I'm still a bit confused. It seems like a very contrived bug
        | tied directly to how you deal with versioning (e.g. you allow
       | the latest version to be present with stale metadata, or
       | something?). My main concern with cache invalidation is _what_ to
       | invalidate at what time.
        
         | uvdn7 wrote:
         | > isn't the typical bug related to cache invalidation that a
         | service forgets to contact the cache server for invalidation?
         | 
         | That's not the case based on our experience at FB. At-least-
          | once delivery is basically a solved problem.
         | 
          | But you are absolutely right that if there's an issue in the
          | invalidation delivery, it's possible that Polaris won't receive
          | the event either. Polaris actually supports a separate event
          | stream (all the way from client-initiated writes) to cover this
          | case.
        
           | benlivengood wrote:
           | It helps that TAO is a write through cache so clients can't
           | really forget to invalidate, correct? If someone were to
           | directly write to MySQL shards there would be stale data
           | ~indefinitely.
           | 
           | I'm assuming the versioning is what ensures proper write,
           | invalidation, and fetch ordering so that e.g. slow mysql
           | writes/replication don't cause remote TAO clusters to receive
           | an invalidation message and then read a stale value from a
           | replica?
        
             | uvdn7 wrote:
             | > It helps that TAO is a write through cache so clients
             | can't really forget to invalidate, correct?
             | 
             | That's not the case.
             | 
             | > If someone were to directly write to MySQL shards there
             | would be stale data
             | 
             | For the sake of this discussion, you can assume this is the
             | setup, and everything should still stand.
             | 
             | > I'm assuming the versioning is what ensures proper write,
             | invalidation, and fetch ordering so that e.g. slow mysql
             | writes/replication don't cause remote TAO clusters to
             | receive an invalidation message and then read a stale value
             | from a replica?
             | 
             | You should join us :p This is getting into the details of
             | our cache invalidation protocol and our cache consistency
             | model. Maybe another post on another day!
        
         | uvdn7 wrote:
         | > EDIT: Reading through "A real bug we found and fixed this
         | year" and I'm still a bit confused.
         | 
         | yeah that specific bug is convoluted.
         | 
         | > It seems like a very contrived bug directed directly to how
         | you deal with versioning (e.g. you allow the latest version to
         | be present with stale metadata, or something?).
         | 
          | This is mostly for performance reasons, which I won't go into
          | in too much detail here. Version is mostly an internal concept
         | that clients don't care about. So it's OK-ish.
         | 
         | > My main concern with cache invalidation is what to invalidate
         | at what time.
         | 
         | Can you elaborate? Our experience on cache invalidation is that
         | you invalidate on mutation, and in terms of "what to
         | invalidate" it depends on the computation, which you might have
         | to give me a little more details.
        
       | benlivengood wrote:
       | Do you guarantee referential integrity in TAO? Last I heard the
       | answer is no; complex queries are not supported and clients would
       | have to do their own work with internal versions to achieve a
       | consistent view on the whole graph (if such a consistent view
       | even exists). But it seems to work fine since global referential
       | integrity doesn't seem to be a big deal; there aren't a lot of
       | cases where dependency chains are long enough or actions quick
       | enough that it matters (e.g. a group admin adds a new user, makes
       | them an admin, who adds an admin, and then everyone tries to
       | remove the admin who added them [always locally appearing as if
       | there is >=1 remaining admin], but causes the group to ultimately
       | have no admins when transactions settle). Run into any fun issues
       | like that?
       | 
       | Contrasting that with Spanner where cache coherency is solved by
       | reading from the recent past at consistent snapshots or by
       | issuing global transactions that must complete without conflict
       | within a brief truetime window. I am guessing the cost of Spanner
       | underneath the social graph would be a bit too much for whatever
       | benefits might be gained, but curious if anyone looked into using
       | something similar.
        
         | uvdn7 wrote:
         | For the fun issues you described, read-modify-write (either
         | using optimistic concurrency control or pessimistic) can work,
         | if I understand your question correctly.
         | 
         | Spanner is awesome and one of my favorite systems/papers. I
          | think it would be very computationally and power expensive to
          | run the social graph workload on a Spanner-like system. Do you
          | have an estimate of how many megawatts (if not more) are needed
          | to support one quadrillion queries a day on Spanner?
        
       | lucideer wrote:
       | This would have been a really good article in itself, without the
       | frankly unbelievable hubris of the author.
       | 
       | TL;DR this is a blogpost about some very interesting problems
       | with distributed cache consistency and observability. The article
       | doesn't address cache invalidation and - based on their replies
       | here - the author doesn't seem to understand the cache
       | invalidation problem at all.
        
         | uvdn7 wrote:
         | I am glad that you liked the content.
         | 
         | On the definition of cache invalidation, and specifically why
         | it's hard. Can you send me a link to any definition of it? This
          | is what's in Wikipedia and I think it's reasonable.
         | 
         | > Cache invalidation is a process in a computer system whereby
         | entries in a cache are replaced or removed.
         | 
         | And I think I am describing that process, and what's hard about
         | it. Some comments here explicitly talk about dependencies,
          | and I can see why that is hard. My point is that even without
         | dependencies, cache invalidation remains a hard problem. Now
         | about dependency tracking, some of my thoughts are captured
         | here https://news.ycombinator.com/item?id=31674933.
        
           | lucideer wrote:
           | Numerous replies to your comments here have pointed out why
           | cache invalidation is hard. You've responded to them by
           | saying "Good point!" (always with an exclamation point), and
           | then proceeded to demonstrate in your response that you
           | didn't understand their point.
           | 
           | This comment describes the "hard" part of cache invalidation
           | best: https://news.ycombinator.com/item?id=31674251 - your
           | response to them makes very little sense.
           | 
           | First, you acknowledge that they're correct, though you use
           | obtuse language to say so: you seem to like using the very
           | abstract term "dependency" to represent the very simple
           | concept of detecting updates.
           | 
           | In your second paragraph you then go off on an unrelated
           | tangent by saying:
           | 
           | > _Now getting back to the "what's really hard about cache
           | invalidation" part._
           | 
           | No. You're not "getting back" to that - you're changing the
           | subject back to the topic of your article, which is unrelated
           | to cache invalidation.
           | 
           | > _Say you just have a k /v store, no joins, nothing._
           | 
           | Cache invalidation is about invalidation - it's not about
           | your store architecture. It's not about the implementation of
           | logic that processes the clearing/overwrite of values, it's
           | about the _when_ and nothing else.
           | 
           | > _just doing TTL, would be simpler._
           | 
           | Yes. It is simpler. If you want to avoid solving _the_ hard
           | problem, you can use a TTL. Now... how long should it be?
        
             | uvdn7 wrote:
             | First of all, I do think you folks make good points. Let me
             | try again.
             | 
             | > The invariant is that as long as the "context/filter" you
             | encode covers all the mutations which would impact your
             | cache data, you should be good.
             | 
              | I wrote this line, which the comment you linked refers to
              | as exactly why cache invalidation is hard.
             | 
             | > it's about the when and nothing else.
             | 
              | Let's talk about that. Not to over-generalize this, with a
              | simpler cache model (say you just cache a single item), do
              | you agree that solving the "when" problem is very
              | manageable? If not, I would like to be enlightened.
             | 
              | Now take this very simple cache model, where we have
              | "magically" solved the "when" problem. Do you think cache
              | invalidation is solved, or simple? After knowing when, you
              | still need to actually update the cache, and not leave it
              | in an inconsistent state (against the source of truth). Is
              | that simple?
             | 
              | Let's essentially break cache invalidation into a few
              | parts: 1. knowing when/who to invalidate, and 2. actually
              | processing the invalidation.
              | 
              | My argument is that #1 can be very manageable with simpler
              | data models. #2 can't be avoided; and #2 is very hard.
        
       | pyrolistical wrote:
       | I don't get it. If you already have the version in the cache,
       | then when a client does GET X, the cache can reply X@v4.
       | 
       | As long as the client submits transactions with "I calculated
       | this based off of X@v4" then the database can reject if there is
       | a newer version of X.
       | 
       | The client can then replay the transactions against the cache and
       | by then there will be a X@v5.
       | 
       | With this scheme you can track the transaction replay rate
       | instead of having to build a new system to read-back the cache
       | for consistency.
       | 
       | To get the traceability on which cache node is stale, the client
       | can forward a header from the cache node that identifies it. Then
       | using the same transaction rejection rate ops has visibility on
       | which cache node isn't being updated.
       | 
       | No cache invalidation needed. Always just cache forever
       | X@version, it's just that the cache allows an unstable query for
       | GET X.
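        | 
        | A rough sketch of the scheme (all names here are hypothetical,
        | just to show the shape of it):
        | 
        |   class Store:                # source of truth
        |       def __init__(self):
        |           self.rows = {}      # key -> (value, version)
        | 
        |       def read(self, key):
        |           return self.rows.get(key, (None, 0))
        | 
        |       def write(self, key, value, based_on):
        |           _, current = self.rows.get(key, (None, 0))
        |           if based_on != current:
        |               return False    # reject; client must replay
        |           self.rows[key] = (value, current + 1)
        |           return True
        | 
        |   class Cache:                # never invalidated, only refreshed
        |       def __init__(self, store):
        |           self.store = store
        |           self.entries = {}
        | 
        |       def get(self, key):     # the "unstable query for GET X"
        |           if key not in self.entries:
        |               self.entries[key] = self.store.read(key)
        |           return self.entries[key]
        | 
        |       def refresh(self, key):
        |           self.entries[key] = self.store.read(key)
        | 
        |   def increment(store, cache, key):
        |       while True:             # replay rate is what you monitor
        |           value, version = cache.get(key)
        |           if store.write(key, (value or 0) + 1, version):
        |               return
        |           cache.refresh(key)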
        
         | jitl wrote:
         | Tracking all data dependencies used for all writes in a large
         | system seems rather challenging.
         | 
         | Plus what about read workloads, like "pull message from queue
         | and send it as a push notification to user Y"? I guess it's
         | fine if the push is re-delivered due to stale cache?
        
         | dragontamer wrote:
         | Versions of cache-data aren't numbers, they're vectors-of-
         | numbers (at a minimum).
         | 
         | Here's a bunch of versions of variable "X":
         | 
         | X@Alice-V1, X@Alice-V2, X@Bob-V1, X@Alice-V2/Bob-V2, X@Carol-V1
         | 
         | Which version of "X" should you read?
         | 
         | EDIT: Alice, Bob, and Carol are three different servers. What
         | has happened here is that Alice and Bob are closer together, so
         | their updates are faster between each other (synchronizing the
         | writes). Carol is slower for some reason (bad connection?), and
         | is going to update off of old data.
         | 
         | In this case, the two valid caches are X@Alice-V2/Bob-V2, and
         | X@Carol-V1. The other cached data is old and can be discarded.
         | 
         | Things get complicated when you have not only reads (such as in
         | your simplified question), but also writes that are cached
         | (such as what happens in practice).
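          | 
          | A small illustration of picking the valid copies (sketch;
          | each dict is a version vector keyed by server):
          | 
          |   def dominates(a, b):
          |       # a has seen at least everything b has seen
          |       keys = set(a) | set(b)
          |       return a != b and all(
          |           a.get(k, 0) >= b.get(k, 0) for k in keys)
          | 
          |   def frontier(versions):
          |       # keep only copies not dominated by any other copy
          |       return [v for v in versions
          |               if not any(dominates(o, v) for o in versions)]
          | 
          |   copies_of_x = [
          |       {"Alice": 1},             # X@Alice-V1
          |       {"Alice": 2},             # X@Alice-V2
          |       {"Bob": 1},               # X@Bob-V1
          |       {"Alice": 2, "Bob": 2},   # X@Alice-V2/Bob-V2
          |       {"Carol": 1},             # X@Carol-V1
          |   ]
          |   print(frontier(copies_of_x))
          |   # [{'Alice': 2, 'Bob': 2}, {'Carol': 1}]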
        
       | moralestapia wrote:
       | ???
       | 
       | We seem to have different concepts with regards to the cache
       | invalidation problem.
       | 
       | The one I know has to do with "Is there a change in the source
       | data? How should I check this? How often?" and such things.
       | 
       | Yours seems to be "In my distributed system, I cannot guarantee
       | the order of the messages that get exchanged, and thus, different
       | nodes may end up having different information at times".
       | 
       | IMO you have a consensus/distributed data problem, not a cache
       | invalidation problem (the inconsistency between your cache nodes
       | is a consequence of the underlying design of your system).
        
         | throwdbaaway wrote:
         | Exactly. The race condition problem from the example below
         | still exists even without any caching, and perhaps still exists
         | even without any replica.
         | 
         | > Imagine that after shuffling Alice's primary message store
         | from region 2 to region 1, two people, Bob and Mary, both sent
         | messages to Alice. When Bob sent a message to Alice, the system
         | queried the TAO replica in a region close to where Bob lives
         | and sent the message to region 1. When Mary sent a message to
         | Alice, it queried the TAO replica in a region close to where
         | Mary lives, hit the inconsistent TAO replica, and sent the
         | message to region 2. Mary and Bob sent their messages to
         | different regions, and neither region/store had a complete copy
         | of Alice's messages.
         | 
         | A solution to this problem is to update Alice's primary message
         | store to be "region 2;region 1" for a period of time, and get
         | Alice to retrieve messages from both message stores during that
         | period.
         | 
         | However, since you have a caching layer that doesn't have an
         | explicit TTL, such that any cache inconsistency can last
         | indefinitely, you now have a much bigger distributed data
         | problem.
        
         | tremon wrote:
          | AFAIK the original cache invalidation problem is always seen
         | from the point of view of a writer in a multiple-writer
         | distributed system. It's not about determining the freshness of
         | data as seen from a reader/consumer, but about atomically (!)
         | updating all data copies by a write action in a distributed
         | system.
         | 
         | If you assume loosely-coupled, eventually-consistent data, you
         | haven't solved the problem at all -- you've just side-stepped
         | it. If you assume a single-writer, multiple-reader
         | architecture, you don't have the original problem at all.
        
           | uvdn7 wrote:
           | > the original cache invalidation problem
           | 
           | Do you have any reference or links about it? Thanks.
           | 
            | > point of view of a writer in a multiple-writer distributed
            | system
            | 
            | > If you assume a single-writer, multiple-reader
            | architecture, you don't have the original problem at all.
           | 
            | It starts to sound like you are describing Paxos?
        
         | uvdn7 wrote:
         | I think https://news.ycombinator.com/item?id=31672541 might
         | address your question.
        
           | moralestapia wrote:
           | No, mine was not a question, it was a statement.
           | 
           | You misnamed the problem you are trying to solve.
        
             | uvdn7 wrote:
             | OK. I am here to learn. Say we go with your definition of
             | cache invalidation problem. Do you think that's THE hard
             | part of cache invalidation?
             | 
             | Say, you are just caching simple k/v data (no joins,
             | nothing). Are you claiming cache invalidation is simple in
             | that case?
             | 
             | Also the reason why I didn't mention the cache invalidation
             | dependencies is that _I believe_ it's a solved problem (see
             | the link above, we do operate memcache at scale). I am
             | happy to discuss if and why it wouldn't work for your case.
        
               | moralestapia wrote:
               | >Are you claiming cache invalidation is simple in that
               | case?
               | 
               | No.
               | 
               | >I believe it's a solved problem
               | 
               | It's not.
               | 
               | Suppose you have source-of-truth A (doesn't really matter
               | if it's a key-value store or whatever, it could even be a
               | function for all intents) and a few clients B1, B2, B3,
               | ... that rely on the data from A. You have to keep them
               | in sync. When should B* check if A has changed? Every
               | time they need it? Every minute? Every hour? Every day?
               | This is the cache invalidation problem; which IMO is not
               | even a problem but a tradeoff, but whatever.
               | 
               | Epilogue: With all due respect, you have an unbelievable
               | career, IBM, Autodesk, Google, Box and now Meta. None of
               | those companies would give me five minutes of their time
               | because I am self-taught, yet here we are :)
        
               | unboxingelf wrote:
               | Epilogue: With all due respect, you have an unbelievable
               | career, IBM, Autodesk, Google, Box and now Meta. None of
               | those companies would give me five minutes of their time
               | because I am self-taught, yet here we are :)
               | 
               | Correlation does not imply causation. I'm also self
               | taught and some of those companies have given me a lot
               | more than 5 minutes of their time.
               | 
               | Here's some unsolicited feedback - I interpreted your
               | comment above as implying you know more than $peer, and
               | it comes off a little snarky with the emoji face. That
               | approach doesn't demonstrate professionalism to me. On
               | the other end, $peer focuses on the technical challenge.
               | Most of the gig boils down to communicating ideas.
        
               | uvdn7 wrote:
               | > Suppose you have source-of-truth A (doesn't really
               | matter if it's a key value store or whatever, it could be
               | a function for all intents) and a few clients B1, B2, B3,
               | ... that rely on the data from A. You have to keep them
               | in sync. When should B* check if A has changed? Every
               | time they need it? Every minute? Every hour? Every day?
               | That is the cache invalidation problem.
               | 
                | A few things to clarify here: what you are referring to as
               | clients are cache hosts (as they keep data). You seem to
               | imply that the cache is running on the client? I was
               | referring to cache servers (think memcache, Redis, etc.),
               | for which the membership can be determined. So on update
               | (e.g. when you mutate A, you know all the Bs to
               | invalidate).
               | 
               | Now continuing with your example, with cache running on
                | the client. Assuming we are talking about the same
                | concept when we say "client", the membership is non-
                | deterministic. Clients can come and go (connect and
               | disconnect as they wish). There are some attempts to do
               | invalidation-based cache on clients, but they are hard
               | because of the reason I just mentioned. So usually client
               | cache is TTL'ed. E.g. the very browser you are using to
               | see this comment has a lot of things cached. DNS is not
                | going to send an invalidate event to your browser. It's
                | TTL-based.
               | 
               | I guess what I am saying is that cache invalidation
                | rarely applies to client-side caches as far as I know.
               | Maybe you have a different example, which we can discuss.
        
               | moralestapia wrote:
               | Caches, clients, Facebooks, hosts, Metas, Redises(?), ...
               | all those things don't really matter.
               | 
               | What matters is B* reads from A, how often should that
               | happen? That's it, literally. That's the whole problem.
        
               | teraflop wrote:
               | It doesn't matter whether the cache is co-located with
               | the "client" that ultimately uses the data. Say A is a
               | database, and B1, B2, B3... are memcached servers. The
               | exact same situation applies.
               | 
               | > So on update (e.g. when you mutate A, you know all the
               | Bs to invalidate).
               | 
               | But "knowing" this is a big part of what people mean when
               | they say cache invalidation is hard! If the value in B is
               | dependent on a complicated function of A's state, then it
               | may be difficult to automatically determine, for any
               | given mutation to A, which parts of B's cache need to be
               | invalidated.
               | 
               | > There are some attempts to do invalidation-based cache
               | on clients, but they are hard because of the reason I
               | just mentioned. So usually client cache is TTL'ed.
               | 
               | Given this statement, the original title of this
               | submission is even more baffling. If you recognize that
               | data can be cached in clients, and that invalidating
               | those caches is so hard that most systems -- including
               | yours -- just completely abandon the goal of being able
               | to do it correctly/reliably, then how can you claim your
               | system makes it no longer a hard problem?
        
               | uvdn7 wrote:
               | Good points. I will try to address them.
               | 
               | > But "knowing" this is a big part of what people mean
               | when they say cache invalidation is hard!
               | 
               | I can see that. Memcache is a look-aside cache we have at
               | scale. There are abstractions on top to manage this
               | complexity; and it has been fairly managable. I am sure
               | you can come up with a complicated dependency tree that
               | things are not obvious at all. But when you do have a
               | very large dependency tree, any change in them can
               | trigger cache invalidation, at which point, caching with
               | TTL will be a better option IMO. I can see where you are
               | coming from.
               | 
               | > If you recognize that data can be cached in clients,
               | and that invalidating those caches is so hard that most
               | systems
               | 
               | My reasoning for this being hard is different than yours
               | I think. In my comment, it's due to indeterminism of the
               | cluster membership. I think in that case, we are talking
               | about different problems.
        
         | uvdn7 wrote:
         | It's a good observation that out-of-orderness and asynchrony
         | contribute a lot to the challenge.
         | 
         | > the inconsistency between your cache nodes is a consequence
         | of the underlying design of your system
         | 
         | I disagree with this. First of all, imposing order and
         | synchrony is expensive, slow and unreliable. Even if we do
         | that, cache fill and invalidation will never be externally
         | coordinated; and will collide on the cache host.
        
           | wildmanx wrote:
           | > > the inconsistency between your cache nodes is a
           | consequence of the underlying design of your system
           | 
           | > I disagree with this. First of all, imposing order and
           | synchrony is expensive, slow and unreliable. Even if we do
           | that, cache fill and invalidation will never be externally
           | coordinated; and will collide on the cache host.
           | 
           | This is a weird reply. GP writes "A implied B" and you
           | disagree by arguing that you had to do A. Sure. But it's
            | still A that implied B. Quite a fundamental logic flaw, no?
        
             | uvdn7 wrote:
             | I don't see that. Yes I did say I had to do A. Then I said
             | that it doesn't imply B. Because even if the underlying
             | system is strongly ordered, we can't order external events.
             | Did I miss something?
        
               | wildmanx wrote:
               | A == "Design of your system"
               | 
               | B == "Inconsistency between your caches"
               | 
               | GP: A -> B
               | 
               | You: "Imposing order and synchrony is [bad things]" ==
               | "If you don't do A then [bad things]", i.e., (not A) ->
               | [bad things]
               | 
               | Also you: "Even if we did, then [other issues]" == "Even
               | if you don't do A then [other issues]", i.e., (not A) ->
               | ..what? B?
               | 
               | You are certainly not arguing against the A -> B
               | implication.
        
               | uvdn7 wrote:
               | I really appreciate your comment. I am learning as we
               | speak.
               | 
               | I was essentially saying !A !-> !B - even if we have a
               | strongly ordered system, there will still be
               | inconsistencies/races/problems.
               | 
               | And I can see why it's a bad/weird argument. Thank you
               | for pointing it out. Really appreciate it!
        
         | seqizz wrote:
         | Yeah I was also intrigued by the title and couldn't find what I
         | expected tbh.
        
         | cratermoon wrote:
         | Yeah it reads to me like they redefined "cache consistency" to
         | mean something different.
        
           | uvdn7 wrote:
           | That's not intended. What's your definition of cache
           | consistency?
        
       | rossmohax wrote:
       | How both of these can be true?
       | 
       | > Data in cache is not durable, which means that sometimes
       | version information that is important for conflict resolution can
       | get evicted.
       | 
       | > We also added a special flag to the query that Polaris sends to
       | the cache server. So, in the reply, Polaris would know whether
       | the target cache server has seen and processed the cache
       | invalidation event.
       | 
        | To make the special flag work, the cache server needs to track
        | not only the current version state but also past versions. If it
        | tracks past versions, then conflicts can be resolved at the cache
        | server level, but the whole premise of the article is that cache
        | servers can't resolve conflicts by themselves.
        
         | uvdn7 wrote:
         | Great question!
         | 
         | I will try to answer this one without too much implementation
         | details.
         | 
         | First of all, cache items can be evicted and are not durable,
         | which is just a fact. But that doesn't mean we can't track
         | progress (in terms of "time" or "logical time"/"order" in
         | distributed system terms). The social graph is sharded (not
          | surprisingly). We can keep track of each cache host's progress
          | per shard, which is just a map kept in memory that doesn't get
          | evicted.
         | 
         | I hope this answers your question.
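          | 
          | As a hypothetical sketch (not our actual implementation), the
          | idea is roughly:
          | 
          |   class CacheHost:
          |       def __init__(self):
          |           self.items = {}       # key -> value, evictable
          |           self.watermark = {}   # shard -> version applied
          | 
          |       def apply_invalidation(self, shard, key, version):
          |           self.items.pop(key, None)   # entry may be gone
          |           if version > self.watermark.get(shard, 0):
          |               self.watermark[shard] = version
          | 
          |       def has_processed(self, shard, version):
          |           # what a "have you seen this event?" query checks
          |           return self.watermark.get(shard, 0) >= version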
        
       | isodev wrote:
        
         | yodsanklai wrote:
         | Totally off-topic, but...
         | 
         | https://www.facebook.com/help/152637448140583
         | 
         | > No, we don't sell your information. Instead, based on the
         | information we have, advertisers and other partners pay us to
         | show you personalized ads on the Facebook family of apps and
         | technologies.
         | 
         | A lot can be said about Meta but I find it counterproductive to
         | make up facts. It's not correct that they "sell personal
         | details".
        
         | gtirloni wrote:
         | Your comment does not contribute anything useful to this
         | technical subject.
        
       | uvdn7 wrote:
       | I am the author of the blog post. I believe the methodology
       | described should be applicable to most if not all invalidation-
       | based caches. I am serious when I say that cache invalidation
       | might no longer be a hard thing in computer science. AMA!
        
         | robmccoll wrote:
         | Saying a problem isn't hard to solve because you have tools to
         | analyze the correctness of your solution seems like a stretch.
        
           | uvdn7 wrote:
           | But that's not what I am saying though ...
           | 
           | > you have tools to analyze the correctness of your solution
           | 
            | That's half of it. Cache invalidation is hard not only
            | because of the complexity of cache coherence protocols, some
            | of which are not very complicated. Cache invalidation also
            | introduces countless races that manage to introduce cache
            | inconsistencies in ways that are just hard to imagine ahead
            | of time (in my experience). IMO, that's the harder part of
            | cache invalidation - when cache inconsistencies happen,
            | answering the "why" question is much harder than having a
            | cache invalidation protocol (you can have TLA+ for one if you
            | will). And answering "why is my cache inconsistent" is
            | the problem we solved, which I think is the harder part of
            | the cache invalidation problem.
        
             | latchkey wrote:
             | > answering the "why my cache is inconsistent" is the
             | problem we solved
             | 
             | That should be the title and focus of your post. Instead,
             | it feels like grandiose claims about solving cache
             | invalidation itself.
        
               | uvdn7 wrote:
               | > Instead, it feels like grandiose claims about solving
               | cache invalidation itself.
               | 
               | That is definitely something I worried about. But at the
               | same time, I do think we solved the harder part of the
                | cache invalidation problem. TAO and Memcache serve
                | quadrillions of queries a day. Based on our experience,
               | answering the question of "why my cache is inconsistent"
               | is the hardest thing about cache invalidation.
               | 
               | Cache invalidation protocols can be complicated, but some
                | are pretty manageable. You can also verify it using TLA+
               | if you will. But once the rubber hits the road, some
               | cache entries tend to be inconsistent.
               | 
               | I definitely worry about people taking this the wrong
               | way, but at the same time, I stand by the claim of "cache
               | invalidation might no longer be a hard thing in computer
               | science".
        
               | latchkey wrote:
                | The gist of what I got from the post was that you created
               | a tool which monitors caches and that helped find bugs
               | that cause inconsistent cache issues. How is that solving
               | a computer science problem?
        
               | uvdn7 wrote:
               | The tool that monitors cache consistency is easy to
               | build. That by itself doesn't solve anything major. The
                | most important contribution is a novel approach to
                | consistency tracing that helps find out "why" caches are
                | inconsistent - pinpointing a bug is much harder than
                | saying "there is a bug". This is based on the key insight
                | that a cache inconsistency can only be introduced (for an
                | invalidation-based cache) in a short time window after
                | the write/mutation; this is what makes the tracing
               | possible.
               | 
               | > How is that solving a computer science problem?
               | 
               | It depends on your definition of a computer science
               | problem. I am definitely not solving P = NP. By your
               | definition, does Google's Paxos Made Live paper solve a
               | computer science problem?
               | 
               | The claim is more a play on Phil Karlton's quote, as the
               | work here makes cache invalidation much easier (in my
               | opinion). Also Phil Karlton's quote doesn't necessarily
               | _make_ a problem a computer science problem, don't you
               | think? I think it's a good quote and there's a lot of
               | truth in it.
        
         | jitl wrote:
         | How do you handle caching derived data assembled from multiple
         | individual records? For example, how would you maintain a cache
         | for a query like getFriendsOfFriends(userId)?
         | 
         | My context is working on Notion's caches for page data. Our
         | pages are made out of small units called "blocks" that store
         | pointers to their child content. To serve all the blocks needed
         | to render a page, we need to do a recursive traversal of the
         | block tree. We have an inconsistent cache of "page chunks"
         | right now, how would you think about making a consistent cache?
        
           | uvdn7 wrote:
            | This is a fantastic question! We face something very similar
            | (if not identical).
            | 
            | There are two parts to solving this problem: 1. monitoring
            | and measuring how consistent the cache is, and 2. figuring
            | out why entries are inconsistent.
            | 
            | I will focus on #1 in this comment. You can build something
            | very similar to Polaris (mentioned in the blog) that tails
            | your database's binlog so it knows when e.g. "friendship"
            | data is mutated, and then performs the computation to figure
            | out which cache entries "should have been" updated. E.g. if
            | Alice just friended Bob, then Alice's friends-of-friends and
            | Bob's friends-of-friends should reflect the change. Your
            | monitoring service will "observe" that and alert on
            | anomalies.
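            | 
            | In rough pseudocode (every name here is hypothetical, just
            | to show the shape of such a checker):
            | 
            |   def check_friends_of_friends(binlog, cache, db, alert):
            |       # tail the binlog, recompute what should have
            |       # changed, then verify purely as a cache client
            |       for event in binlog:
            |           if event["table"] != "friendship":
            |               continue
            |           users = (event["user_a"], event["user_b"])
            |           for user in users:
            |               expected = db.friends_of_friends(user)
            |               observed = cache.friends_of_friends(user)
            |               if observed != expected:
            |                   # in practice, re-check after a delay
            |                   alert(user, expected, observed)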
        
             | ahahahahah wrote:
             | So if you just implement the cache invalidation logic in
             | your monitoring system, you can tell if you got the cache
             | invalidation logic correct in your caching system. That
             | sounds really helpful!
        
               | uvdn7 wrote:
               | That's not the case though. Polaris acts as a client and
               | only monitors client observable effect, and assumes no
               | knowledge of the server internals.
               | 
               | I am trying to help.
        
         | continuational wrote:
         | With the risk of stating the obvious - the hard part of cache
         | invalidation is to know _when_ to invalidate the cache.
        
           | uvdn7 wrote:
           | We invalidate cache upon mutations. When else would you do
           | it?
        
             | continuational wrote:
             | Please see https://news.ycombinator.com/item?id=31672541
        
         | rajesh-s wrote:
         | Off topic but what tool did you use to create those cache
         | hierarchy diagrams?
        
         | simonw wrote:
         | Is your argument here that cache invalidation may no longer be
         | hard because you can implement systems like Polaris which
         | continually test your cache implementation to try and catch
         | invalidation problems so you can then go and fix them?
         | 
         | EDIT: Already answered in this reply:
         | https://news.ycombinator.com/item?id=31671794
        
           | uvdn7 wrote:
            | Polaris is actually fairly simple to build. The harder
            | question to answer is "why is the cache inconsistent" and how
            | you debug it. The second half of the post talks about
            | consistency tracing, which tracks all cache data state
            | mutations.
            | 
            | Distributed systems are state machines. With consistency
            | tracing keeping track of all the state transitions, debugging
            | cache inconsistencies in an invalidation-based cache is very
            | actionable and manageable, based on our experience.
        
         | politician wrote:
         | I read the article. It seems to be suggesting that cache
         | invalidation might no longer be a hard thing in computer
         | science because of the insights uncovered by your tracing and
         | observability solution. IOW, now that you can measure and audit
         | the system, you're able to find and fix bugs in the system. Is
         | that the correct take-away?
        
           | uvdn7 wrote:
           | Yep. You're exactly right. And the approach is generic and I
           | think it should work for everyone.
           | 
           | The idea is that with all these observability capabilities,
           | debugging cache inconsistencies is getting very actionable
            | and close to how we debug an error with a message telling us
           | exactly where an exception happened.
        
         | ArrayBoundCheck wrote:
         | I didn't understand the tracing part. Is tracing 100% inside of
         | polaris? If not does it start at the database? The first cache?
         | 
         | Does the invalidation need to be predictable before you can use
         | tracing? What kind of parameters do you have so you don't have
         | too much logging or too little?
        
           | uvdn7 wrote:
            | Tracing is not in Polaris. It's a separate library that runs
            | on every cache host. Its main job is logging every cache
            | state mutation for traced writes. So when Polaris detects
            | cache inconsistencies, we know what happened and why the
            | cache is inconsistent.
            | 
            | It starts all the way from the client-initiated write (where
            | we mint a unique ID and plumb it all the way through).
           | 
           | > What kind of parameters do you have so you don't have too
           | much logging or too little?
           | 
           | This is the key question! If you go to the second half of the
           | post, it talks about an insight about how we managed to log
           | only when and where cache inconsistencies _can_ be
           | introduced. There's only a small window after mutation/write
           | where cache can become inconsistent due to invalidation. So
           | it's very cheap to trace and provide the information we need.
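            | 
            | A toy sketch of that shape (names hypothetical; the real
            | library is more involved):
            | 
            |   import time, uuid
            | 
            |   WINDOW = 60    # seconds; the "small window" after a write
            |   traces = {}    # trace_id -> write start time
            | 
            |   def start_traced_write():
            |       # minted at the client-initiated write, plumbed through
            |       trace_id = str(uuid.uuid4())
            |       traces[trace_id] = time.time()
            |       return trace_id
            | 
            |   def log_mutation(trace_id, host, key, old, new):
            |       # runs on every cache state change on every host;
            |       # only mutations inside the risky window get logged
            |       started = traces.get(trace_id)
            |       if started is None:
            |           return      # not a traced write
            |       if time.time() - started > WINDOW:
            |           return      # outside the window; skip logging
            |       print(trace_id, host, key, old, "->", new)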
        
         | [deleted]
        
       | dang wrote:
       | The submitted title ("Cache invalidation might no longer be a
       | hard thing in Computer Science") broke the site guidelines badly.
       | They say:
       | 
       | " _Please use the original title, unless it is misleading or
        | linkbait; don't editorialize._" -
       | https://news.ycombinator.com/newsguidelines.html
       | 
       | Editorializing the original title to make it _more_ linkbait is
        | going the wrong way down a one-way street. Please don't.
       | 
       | (If you're the author, then please either use the original title
       | or, if it's linkbait, change the HN title to make it less so.)
        
         | gjs278 wrote:
        
         | bigbillheck wrote:
         | As far as I'm concerned I haven't seen any change.
        
         | uvdn7 wrote:
          | Accepted. Just to be clear, it's not link bait as far as I
          | understand it, for two reasons: 1. I am dead serious about
          | making cache invalidation a simpler problem; 2. "Cache
          | invalidation might no longer be a hard problem in Computer
          | Science" is the subtitle of the corresponding Systems @Scale
          | talk I just gave.
         | 
         | I respect the guidelines and I am OK with the rename. I just
         | want to clarify that I don't think the old title is misleading.
        
           | PKop wrote:
           | It's still massively click-bait until such time as the claim
           | isn't pure speculation. Also, casually optimistic claims
           | about complex and hard problems are the essence of click-
           | bait. There isn't much unique about _your_ instance of it
        
             | uvdn7 wrote:
             | > until such time as the claim isn't pure speculation
             | 
             | Which I don't think it is.
             | 
             | > casually optimistic claims about complex and hard
             | problems are the essence of click-bait
             | 
             | A group of people at Facebook worked on this for many
                | years. TAO and Memcache serve quadrillions of queries a
             | day. I wanted to make this claim - cache invalidation might
             | no longer be a hard problem in Computer Science - years
                | ago; but I didn't because I didn't want to jump the gun. We
             | baked the solution for years. There's nothing casual about
             | this.
        
               | pvg wrote:
               | Just about all papers and write-ups that describe big
               | advances on important problems aren't titled 'Big advance
               | in Problem X' or 'X now an ex-problem'.
        
               | uvdn7 wrote:
               | This is a fair point. Lesson learned. Thank you :D
        
               | o_____________o wrote:
               | Small comment of appreciation for your attitude and quick
               | adjustment, nice to see on the internet!
        
               | tomcam wrote:
               | Agreed. Gracefully handled. I happened to agree with the
               | arguments, but treasure civil discourse.
        
       | AtNightWeCode wrote:
       | So basically, classic timestamp versioning with some consistency
       | checking. Might work. Cache invalidation is hard. The only
        | problem I ever faced while working in the biz that I could not
        | solve even at the theory level was cache-related.
       | 
       | What I thought the article would tackle is the dependencies
       | between stored objects. Some solve it with super complicated
       | graphs others by just flushing the entire cache.
       | 
        | At Meta, why not just use flat files or a store that acts as
        | one, pub/sub the changes, listen for the changes, and update
        | aggregated views as needed and store them? Then just short-term
        | cache everything on the edge.
        
         | uvdn7 wrote:
         | > What I thought the article would tackle is the dependencies
         | between stored objects.
         | 
         | Like tracking dependencies, so it knows what to invalidate on
         | mutations?
        
           | AtNightWeCode wrote:
            | Yes. Say the boss changes his name from Mark. It will
            | directly hit a lot of caches. It will need an update of all
            | bosses that materialized this, then all the staff that
            | materialized the boss of the boss, and so on. This is a
            | simple example because it is a tree. There can be circular
            | dependencies as well.
        
       ___________________________________________________________________
       (page generated 2022-06-08 23:00 UTC)