[HN Gopher] What we learned after I deleted the main production ...
___________________________________________________________________
 
What we learned after I deleted the main production database by
mistake
 
Author : fernandopess1
Score : 36 points
Date : 2022-09-19 20:19 UTC (2 hours ago)
 
(HTM) web link (medium.com)
(TXT) w3m dump (medium.com)
 
| sulam wrote:
| Being one click away from a DELETE vs. a GET sounds like a
| serious foot-gun that I would wrap a check around. "Are you
| sure? This operation will delete 17M entries."
| fabian2k wrote:
| I'd be seriously scared of putting any production credentials
| with write access into my Postman/Insomnia/whatever. Those
| tools are meant for quickly experimenting with requests; they
| don't have any safety barriers.
| partdavid wrote:
| I mean, it shouldn't really be very easy to even _get_ a
| read-write token to a production database, unless you're a
| correctly-launched instance of a publisher service. This
| screams to me that they're ignorant of, and probably very
| sloppy with, access control up and down their stack.
| layer8 wrote:
| This is the Postman HTTP method selection dropdown that you can
| see on the screenshots on this page ("GET"):
| https://learning.postman.com/docs/sending-requests/requests/...
|
| Postman doesn't know that sending a single DELETE request to
| that URL will delete 17 million records.
|
| Arguably, REST interfaces shouldn't allow deleting an entire
| collection with a single parameterless DELETE request.
| theptip wrote:
| Honestly, I'd make the case for writing a simple Python script
| for this kind of thing.
|
| `requests.get(url)` is a lot harder to mis-type as
| `requests.delete(url)`.
|
| At $dayjob we would sometimes do this sort of one-off request
| using Django ORM queries in the production shell, which could
| in principle do catastrophic things like delete the whole
| dataset if you typed `qs.delete()`. But if you write a one-off
| 10-line script, and have someone review the code, then you're
| much less likely to make these sorts of "mis-click" errors.
|
| Obviously you need to find the right balance of safety rails
| vs. moving fast. It might not be a good return on investment to
| turn the slightly-risky daily 15-min ask into a safe 5-hour
| task. But I think with the right level of tooling you can make
| it into a 30-min task that you run/test in staging, and then
| execute in production by copy/pasting (rather than deploying a
| new release).
|
| I would say that the author did well by having a copilot;
| that's the other practice we used to avoid errors. But a
| copilot looking at a complex UI like Postman is much less
| helpful than one looking at a small bit of code.
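|
| A minimal sketch of that kind of one-off script (the
| /products/stale endpoint and the count_only parameter are
| hypothetical, purely to illustrate the shape):
|
|     # one_off_delete.py -- reviewable, and testable in staging first.
|     import requests
|
|     BASE_URL = "https://internal-api.example.com"  # hypothetical
|     TARGET = f"{BASE_URL}/products/stale"
|
|     def main() -> None:
|         # Dry run: ask the service how many entries would match.
|         resp = requests.get(TARGET, params={"count_only": "true"},
|                             timeout=30)
|         resp.raise_for_status()
|         count = resp.json()["count"]
|
|         # The check sulam suggests above: make the blast radius
|         # explicit and require a deliberate confirmation.
|         if input(f"Delete {count} entries? Type 'delete': ") != "delete":
|             print("Aborted.")
|             return
|
|         resp = requests.delete(TARGET, timeout=30)
|         resp.raise_for_status()
|         print("Deleted", count, "entries:", resp.status_code)
|
|     if __name__ == "__main__":
|         main()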
| alexjplant wrote:
| I once worked on an app where the staging database was used for
| local testing, all devs used the same shared credentials with
| write access, and you switched environments by changing hosts
| file entries (!!!). This resulted in me accidentally nuking the
| staging database during my first week on the job, because I ran
| a dev script containing some DROPs from my corporate Windows
| system and failed to flush the DNS cache.
|
| I had already called out how sub-optimal this entire setup was
| before the incident occurred, but it rang hollow from then on,
| since it sounded like me just trying to cover for my mistake.
| The footguns were only half-fixed by the time I ended up
| leaving some time later.
| Johnny555 wrote:
| _An old discussion arose about the need for backups. We had
| backups for most databases, but no process was implemented for
| ElasticSearch databases. Also, that database was a read model
| and by definition it wasn't the source of truth for anything.
| In theory, read models shouldn't have backups; they should be
| rebuilt fast enough to cause little or no impact in case of a
| major incident. Since read models usually hold information
| inferred from somewhere else, it is debatable whether they
| compensate for the associated monetary cost of maintaining
| regular backups._
|
| My biggest concern about restoring that Elasticsearch backup
| would be that the restored backup would be inconsistent with
| the real source of truth, and it might be hard to reconcile to
| bring it up to date.
| soco wrote:
| While everything there is true, why not have a backup anyway? I
| have Elasticsearch backups and even used one (with success)
| when I terraformed the index away. The delta was then sourced
| on the fly.
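|
| Setting that up is mostly one-time configuration via
| Elasticsearch's snapshot API. A minimal sketch over plain HTTP
| from Python (the repository name, filesystem path, and index
| name are made up; the location must be whitelisted via path.repo
| in elasticsearch.yml):
|
|     import requests
|
|     ES = "http://localhost:9200"  # assumed cluster address
|
|     # Register a shared-filesystem snapshot repository (once).
|     requests.put(f"{ES}/_snapshot/nightly", json={
|         "type": "fs",
|         "settings": {"location": "/backups/es"},
|     }).raise_for_status()
|
|     # Take a snapshot of the read-model index and wait for it.
|     requests.put(
|         f"{ES}/_snapshot/nightly/snap-2022-09-19",
|         params={"wait_for_completion": "true"},
|         json={"indices": "products"},
|     ).raise_for_status()
|
|     # Restoring later is one call (with the index closed/deleted):
|     # requests.post(f"{ES}/_snapshot/nightly/snap-2022-09-19/_restore")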
| antisthenes wrote:
| The backup only needs to hold up until the production database
| is rebuilt from the source of truth and swapped back in as the
| current search database.
|
| In other words, it only has to be good enough for a few days
| (ideally, hours).
| glintik wrote:
| <<We had backups for most databases but no process was
| implemented for ElasticSearch databases.>> - that's all you
| need to know.
| benjaminpv wrote:
| Funny to think that the issue here is just a relative of the
| 'no-preserve-root' feature rm (now) has: it's easy to let the
| user apply the same actions to the branches of a hierarchy as
| to the leaves, but _should_ they?
|
| Pretty recently corporate changed something on my work laptop
| that resulted in a bunch of temporary files generated during
| the build getting redirected to OneDrive. I went in and nuked
| the temp files and shortly thereafter got a message from OD
| saying 'hey, noticed you trashed a ton of files, did you mean
| to do that?'
|
| The developer side of me thought 'of course I did, duh', but I
| can imagine that's useful information for most users who made
| an innocent yet potentially costly mistake.
| duxup wrote:
| Having an endpoint that can just delete... everything seems
| kinda risky.
| SoftTalker wrote:
| > In the fifteen minutes I had before the next meeting, I
| quickly joined with one of my senior members to quickly access
| the live environment and perform the query.
|
| Don't do stuff in a rush like this. That's when I almost always
| make my worst mistakes. If there is a "business urgency", then
| cancel or get excused from the upcoming meeting so you can
| focus and work without that additional pressure. If the meeting
| is urgent, then do the other task afterwards.
| racl101 wrote:
| Now this meeting will beget many more urgent meetings.
| PeterisP wrote:
| For me, an interesting statement was "However, it took 6 days
| to fetch all data for all 17 million products." In my
| experience of DB systems, 17 million entries is significant but
| not particularly much; it's something that fits in the RAM of a
| laptop and can be imported/exported/indexed/processed in
| minutes (if you do batch processing, not a separate transaction
| per entry), perhaps hours if the architecture is lousy, but
| certainly not days.
| thayne wrote:
| That kind of depends on how big each record is. And it sounds
| like these records are denormalized from multiple sources, so
| you probably have several transactions for each record. It's
| possible to do batching in that situation, but it definitely
| isn't always easy.
| fabian2k wrote:
| I think this is a very clear disadvantage of the microservice
| architecture they chose in this case, and the post does allude
| to that. To recreate this data they needed to query several
| different microservices that would not have been able to
| sustain a higher load.
|
| If I calculated this right, the time they mention comes down to
| about 30 items per second (17 million items over 6 days). That
| is maybe not unreasonable for something that queries a whole
| bunch of services via HTTP, but it is kinda ridiculous if you
| compare it to directly querying a single RDBMS.
|
| You could probably fix this by scaling _everything_
| horizontally, if that is possible. But the real solution would
| be, as you say, to have bulk processing capabilities.
| PeterisP wrote:
| Yes, adding a "return X items" mode to the same microservices
| is often a way to get a significant performance boost with only
| minor changes: even if your main use case needs only one item,
| it enables mass processing without incurring the immense
| overhead of a separate request per item.
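|
| A sketch of what that looks like from the consumer side (the
| product-service URL and its ids parameter are hypothetical):
|
|     import requests
|
|     BASE = "https://product-service.example.com"  # hypothetical
|
|     # One round trip per product: at tens of milliseconds of
|     # overhead per call, 17M items lands in the multi-day
|     # regime described above.
|     def fetch_one_by_one(ids):
|         return [requests.get(f"{BASE}/products/{i}").json()
|                 for i in ids]
|
|     # One round trip per 500 products amortizes that overhead.
|     def fetch_batched(ids, batch_size=500):
|         out = []
|         for start in range(0, len(ids), batch_size):
|             chunk = ids[start:start + batch_size]
|             resp = requests.get(f"{BASE}/products", params={
|                 "ids": ",".join(map(str, chunk))})
|             resp.raise_for_status()
|             out.extend(resp.json())
|         return out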
| gtirloni wrote:
| _> Any kind of operation was done through an HTTP call; what
| you would otherwise do with a SQL script, in ElasticSearch you
| would do with an HTTP request._
|
| There you go.
| motoboi wrote:
| People, please don't post things on Medium, because it wants
| people to sign up. Use GitHub Pages or anything else, really.
| ThunderSizzle wrote:
| I'm torn on this, honestly.
|
| We want an internet with fewer ads, but good writers deserve to
| get paid. They can get paid via Medium (though how much, I
| don't know) through subscriptions. Is that worse than ads or
| newspapers?
| bachmeier wrote:
| What you say is true, but that doesn't mean it should be posted
| to HN. The purpose of this site is to discuss articles. This
| one's behind a paywall. Even if you sign up for a free account,
| you may have used up your two free articles per month. That
| invites people to comment without reading the article. That's
| not why HN exists. (I actually checked the comments hoping
| someone had posted a copy of the article.)
| Victerius wrote:
| I enjoy good writing, but the only writing I'm willing to pay
| for is print books (I just bought a copy of J.R.R. Tolkien's
| "The Fall of Gondolin", the hardcover, illustrated one by
| HarperCollins). I don't want to pay for newspapers, for
| investigative journalism, or for long-form article magazines
| like The Atlantic or The New Yorker. Never mind Medium of all
| places, because Medium has no barrier to entry. No gatekeeping
| (and, given how easy it is to merely write a blurb of text, I
| have rather high standards for what I choose to pay to read).
| I'd rather consume from the likes of Amazon and have them run
| these writing platforms (e.g. WaPo) at a loss. Which means I'm
| paying for writers, in the end, just in a very indirect way.
| This sits well with me.
|
| But if the choice before me were to pay for writers directly
| (like Medium), or let non-book writers as a profession
| disappear, I'd opt for the latter. You may criticize this
| attitude. I assume the responsibility for that, and I'm being
| honest.
| jacooper wrote:
| There is also hashnode.com
| rlewkov wrote:
| Exactly. Won't read because it requires sign-up.
| gumby wrote:
| archive.ph cuts through the Medium paywall too.
| contravariant wrote:
| As does basic cookie hygiene.
|
| At least I assume that's what's happening; I haven't seen a
| Medium paywall yet.
| baal80spam wrote:
| If you're on Firefox, there's an extension to bypass this (only
| for Medium's free articles):
| https://gitlab.com/magnolia1234/bypass-paywalls-firefox-clea...
| demindiro wrote:
| There is also LibRedirect [1], which automatically redirects to
| an alternative frontend.
|
| [1] https://github.com/libredirect/LibRedirect
| metadat wrote:
| I'm not keen on playing the browser-plugin escalation game with
| fundamentally UX-hostile sites like Medium. They clearly have
| no respect for the human being at the end of the line trying to
| simply read a document.
| thatguy0900 wrote:
| This is extremely melodramatic. They literally just want money
| so they don't have to run ads.
___________________________________________________________________
(page generated 2022-09-19 23:00 UTC)