[HN Gopher] Metastable Failures in Distributed Systems [pdf]
___________________________________________________________________
Metastable Failures in Distributed Systems [pdf]

Author : zekrioca
Score  : 70 points
Date   : 2021-10-04 17:52 UTC (5 hours ago)

(HTM) web link (sigops.org)
(TXT) w3m dump (sigops.org)

| ctlachance wrote:
| This paper introduced me to a new concept in system architecture.
| Thanks for posting it!
| mjb wrote:
| I think this paper is super important, and anybody who designs or
| runs big systems should read it and take the core point to heart.
| As system designers, we're very used to thinking about systems as
| 'stable' and 'unstable', where stability is good, and instability
| is bad. What this paper points out is that many kinds of
| distributed systems have multiple 'stable' modes, some of which
| are modes where the system is stable (in a control theory sense),
| but not doing any useful work from the client's perspective. This
| is dangerous, because the system won't kick itself out of this
| "stable but down" mode without something changing: human input, a
| control plane taking action, etc.
|
| I don't think this paper covers anything particularly new, but
| writing it down in this form, with the evidence they present, is
| very valuable. Hopefully this paper will deepen the conversation
| about applying control theory to distributed systems design and
| control problems, and allow a more theoretical approach to be
| taken to the design of these systems to avoid common causes of
| instability and bistability.
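The "stable but down" mode described above can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not something taken from the comment or a reproduction of the paper's experiments: it assumes a retry feedback loop as the sustaining mechanism, an invented goodput curve in which an overloaded server wastes work on requests that time out, and hypothetical names and numbers throughout (`goodput`, `simulate`, `retry_fraction`, `shed_retries_at`, a capacity of 100, a base load of 80).

```python
def goodput(offered, capacity):
    """Useful work completed in one tick (assumed model).

    At or below capacity every request succeeds. Above capacity the
    server wastes effort on requests that time out, so useful work
    falls as the overload grows.
    """
    if offered <= capacity:
        return offered
    return capacity * capacity / offered


def simulate(ticks, base_load, capacity, retry_fraction, spike,
             shed_retries_at=None):
    """Return the goodput for each tick.

    spike: (start, end, load); new load is `load` while start <= t < end.
    retry_fraction: share of failed requests retried on the next tick.
    shed_retries_at: tick at which pending retries are dropped once,
        standing in for an operator or control plane stepping in.
    """
    start, end, spike_load = spike
    failed = 0.0
    history = []
    for t in range(ticks):
        new_load = spike_load if start <= t < end else base_load
        retries = 0.0 if t == shed_retries_at else retry_fraction * failed
        offered = new_load + retries
        served = goodput(offered, capacity)
        failed = offered - served
        history.append(served)
    return history


# Same short spike in both runs; only the second gets an intervention.
stuck = simulate(100, 80, 100, 0.9, spike=(10, 13, 300))
rescued = simulate(100, 80, 100, 0.9, spike=(10, 13, 300),
                   shed_retries_at=50)
print(round(stuck[5]), round(stuck[-1]), round(rescued[-1]))  # 80 15 80
```

In this sketch the spike lasts three ticks, yet the first run settles at a goodput of about 15 with the same base load of 80 that was served in full beforehand. Both the healthy and the degraded operating points are stable under these assumptions, and only the one-off intervention in the second run moves the system back.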
|
| One of the authors has a great summary of the paper on his blog:
| http://charap.co/metastable-failures-in-distributed-systems/
|
| I wrote a summary and discussion too:
| https://brooker.co.za/blog/2021/05/24/metastable.html
| dang wrote:
| Discussed 4 months ago:
|
| _Metastable Failures in Distributed Systems_ -
| https://news.ycombinator.com/item?id=27506167 - June 2021 (11
| comments)
|
| ...but on a day like today we dare not mark it as a dupe.
___________________________________________________________________
(page generated 2021-10-04 23:00 UTC)