[HN Gopher] Millions of Tiny Databases [pdf]
       ___________________________________________________________________
        
       Millions of Tiny Databases [pdf]
        
       Author : aratno
       Score  : 80 points
       Date   : 2020-02-14 19:02 UTC (3 hours ago)
        
 (HTM) web link (assets.amazon.science)
        
       | vimota wrote:
       | Tangentially related, BigQuery uses a similar usage-based
       | approach to place and replicate data in a manner that's likely to
       | be available for users:
       | 
       | https://cloud.google.com/blog/products/data-analytics/how-bi...
        
       | ignoramous wrote:
       | Abstract at: https://www.amazon.science/publications/millions-of-
       | tiny-dat...
       | 
       | > _...Physalia is a transactional key-value store, optimized for
       | use in large-scale cloud control planes, which takes advantage of
       | knowledge of transaction patterns and infrastructure design to
       | offer both high availability and strong consistency to millions
       | of clients. Physalia uses its knowledge of data center topology
       | to place data where it is most likely to be available. Instead of
       | being highly available for all keys to all clients, Physalia
       | focuses on being extremely available for only the keys it knows
       | each client needs, from the perspective of that client._
       | 
       | > _...We believe that the same patterns, and approach to design,
       | are widely applicable to distributed systems problems like
       | control planes, configuration management, and service discovery._
       | 
       | It'd be interesting to contrast this approach with Route53's or
       | IAM's datastore which need to be globally-replicated with time-
       | bounded eventually-consistent reads, and transactional but
       | verifiable writes.
       | 
       | I hope AWS begins publishing about S3, now. One can look at the
       | patents AWS engineers author to get a feel for some of the
       | internals, but they are (intentionally?) hard to read.
       | 
       | For instance, patents filed by two of the many S3 founding-
       | engineers:
       | https://patents.google.com/?inventor=James+Christopher+Soren...
       | 
       | Also see:
       | 
       | https://aws.amazon.com/builders-library/
       | 
       | https://research.google/pubs/
        
         | mjb wrote:
         | It's not really about the design of S3, but if you're
         | interested in some of the philosophy and thinking behind S3 you
         | might enjoy "Beyond eleven nines: Lessons from Amazon S3
         | culture of durability"
         | https://www.youtube.com/watch?v=DzRyrvUF-C0
        
       | kthejoker2 wrote:
       | Thought for sure this'd be a thinkpiece on Excel in the
       | enterprise ...
       | 
       | Seriously, though, this whole paper uses an amazing amount of
       | terminology - blast radius, colony, color, game day, split brain
       | - and an awesome biological metaphor of the Portuguese man o'war.
       | 
       | Great read even if you don't care about fault tolerance, CAP
       | theorem, or distributed balancing at AWS-scale.
       | 
       | One sample quote of the value of cheap heuristics over full-blown
       | number-crunching:
       | 
       | > Globally optimizing the placement of Physalia volumes is not
       | feasible for two reasons, one is that it's a non-convex
       | optimization problem across huge numbers of variables, the other
       | is that it needs to be done online because volumes and cells come
       | and go at a high rate in our production environment. Figure 11
       | shows the results of using one very rough placement heuristic: a
       | sort of bubble sort which swaps nodes between two cells at random
       | if doing so would improve locality. In this simulation, we
       | considered 20 candidates per cell. Even with this simplistic and
       | cheap approach to placement, Physalia is able to offer
       | significantly (up to 4x) reduced probability of losing
       | availability.
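
The swap heuristic quoted above can be sketched in a few lines. This is only a toy illustration, not the paper's simulation: the locality metric (count of replicas outside the client's location), the integer location ids, and the names `cell_cost`, `total_cost`, and `improve_placement` are all assumptions made here, and the "20 candidates per cell" detail is not modeled.

```python
import random


def cell_cost(client_loc, node_locs):
    # Assumed locality metric: number of replicas not in the client's location.
    return sum(1 for loc in node_locs if loc != client_loc)


def total_cost(cells, clients):
    # Sum of the per-cell locality cost over all cells.
    return sum(cell_cost(c, nodes) for c, nodes in zip(clients, cells))


def improve_placement(cells, clients, steps, rng):
    """Randomly swap nodes between two cells, keeping a swap only if it helps.

    cells:   list of lists; cells[i] holds the locations of cell i's nodes
    clients: clients[i] is the location of the client that uses cell i
    """
    for _ in range(steps):
        # Pick two distinct cells and one node from each, all at random.
        i, j = rng.sample(range(len(cells)), 2)
        a = rng.randrange(len(cells[i]))
        b = rng.randrange(len(cells[j]))

        before = cell_cost(clients[i], cells[i]) + cell_cost(clients[j], cells[j])
        cells[i][a], cells[j][b] = cells[j][b], cells[i][a]
        after = cell_cost(clients[i], cells[i]) + cell_cost(clients[j], cells[j])

        # Undo the swap unless it strictly improved locality.
        if after >= before:
            cells[i][a], cells[j][b] = cells[j][b], cells[i][a]
    return cells


if __name__ == "__main__":
    rng = random.Random(0)
    clients = [0, 1, 2, 3]
    # Start from a deliberately poor placement: no cell has a local node.
    cells = [[1, 2, 3], [0, 2, 3], [0, 1, 3], [0, 1, 2]]
    print("cost before:", total_cost(cells, clients))
    improve_placement(cells, clients, steps=500, rng=rng)
    print("cost after: ", total_cost(cells, clients))
```

Because a swap is kept only when it lowers the combined cost of the two cells involved, the total cost never increases, and each step touches just two cells, which is what makes this kind of heuristic cheap enough to run online.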
        
       | mjb wrote:
       | This is my paper (along with Tao and Fan). It's a great feeling
       | to have this published and available, and I'm super proud of the
       | team behind Physalia.
       | 
       | There's a lighter-weight introduction to the work here:
       | https://www.amazon.science/blog/amazon-ebs-addresses-the-cha...
       | and for those attending NSDI, I'll be talking about Physalia in
       | the "Deployment Experience" session on Wednesday.
        
         | maxmcd wrote:
         | Very cool. I haven't read the whole paper yet, but from a quick
         | overview it seems somewhat similar to SLOG (which also deals
         | with world-scale replication by trying to keep data closer to
         | the nodes that use it):
         | 
         | - http://www.vldb.org/pvldb/vol12/p1747-ren.pdf
         | 
         | - https://blog.acolyer.org/2019/09/04/slog/
         | 
         | Any thoughts on this comparison?
        
           | mjb wrote:
           | Very interesting, I hadn't seen SLOG before. It seems like
           | there's a fairly similar core insight: placing data near
           | where it's used can help with some system properties. They
           | appear to be more latency-focused and we were (primarily)
           | aiming at availability.
           | 
           | The other different part of Physalia is our focus (again, for
           | availability) on placement for 'blast radius'. That means we
           | try to limit the number of cells that any one failure
           | (software, infrastructure, etc) can touch. Geo-replicated systems can
           | have similar concerns, but I haven't seen the same level of
           | focus on it as a key design goal.
        
         | revertts wrote:
         | RE: NSDI - Is the Firecracker paper available somewhere, or not
         | yet?
         | 
         | Edit: I'm an idiot -
         | https://www.amazon.science/publications/firecracker-lightwei...
        
       ___________________________________________________________________
       (page generated 2020-02-14 23:00 UTC)