[HN Gopher] I Accidentally Deleted 7TB of Videos Before Going to Production Author : thevinter Score : 445 points Date : 2022-05-05 10:00 UTC (13 hours ago) web link (blog.thevinter.com) | lnxg33k1 wrote: | But are you a junior dev with less than one year of experience | working by yourself alone at a company? No tech lead/help? | birdyrooster wrote: | Is 7TB a lot? Peers have personal arrays orders of magnitude | greater. | iamben wrote: | I like these stories. I think they resonate well for 'the rest of | us'. I've made plenty of mistakes like this - you learn and grow, | right? | | One of the best things about HN is that so many incredible, | talented people post. It's incredibly inspiring to raise your own | game, to see what the best are doing. But sometimes it's equally | important to realise we all fuck up, and for every unicorn dev | there's another thousand of us grinding away. | | OP - well done for sorting the problem and telling us all about | it! | rossdavidh wrote: | Amen | JacobiX wrote: | > It involves bad practices and errors from multiple parties in a | world that might seem | | > foreign to the "Silicon Valley" world but paints an accurate | picture of what | | > development is for small IT companies around the world | | Everybody makes mistakes even in the "Silicon Valley" world, but | such problems could be easily caught by testing (which he did but | it was restricted to the first page) and performing a simple | dry-run. | crispyambulance wrote: | Exactly, _everyone_ makes mistakes. Sometimes huge ones. In | hindsight or on the sidelines it's always easy to point out a | few technical things that WOULD HAVE avoided catastrophe, but | does that help? I think not (aside from a cautionary parable | for interns).
| | Things are complicated, people are human and forget things, | there are pressures to "get it done" and override the | guardrails. Everybody has horror stories. Some worse than | others. Welcome to the OP's day of horror. I would think | "Silicon Valley" dev-ops horror stories make this one seem like | a triviality. | batch12 wrote: | It's like the first time you run rm -rf | /path/to/delete/ * | | And realize it is taking too long... | SnowHill9902 wrote: | Can you explain? I feel like it removes / but not sure why. | switch007 wrote: | The error is the space before the asterisk. The original | intention was to delete the contents of the folder | /path/to/delete/. Instead, the asterisk enumerates files in | the current directory and they get deleted | KarlKode wrote: | Besides recursively deleting /path/to/delete/ the command | also deletes all (non-hidden) content of the current | directory (note the * at the end of the line). I assume the | correct command would be /path/to/delete/*. | Tesl wrote: | It removes everything in the current directory | pwg wrote: | rm -rf /path/to/delete/ * | | Note the space between the last / and the * | | This will recursively remove the directory /path/to/delete | and remove every file/directory that matches * in the current | directory where 'rm' is being run. | | When what was most likely meant was: rm -rf | /path/to/delete/* | | Note the lack of a space between the last / and *. This will | remove all files that match * that reside in the | /path/to/delete/ directory. | progx wrote: | Now you learned what a backup is. | lpointal wrote: | How can any enterprise rely only on such online services and | not keep copies of their work on their own storage? | | At least store it on large TB hard disks connected with a SATA | adapter when needed, and put them in a case in a safe place | (better: two copies, stored in two places). What is the price of | the HDs and copy time relative to the production work?
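For readers still puzzling over the stray space in the `rm -rf /path/to/delete/ *` comments above: the shell splits a command line into words before `rm` ever runs, so the directory and the `*` arrive as two separate targets. A minimal Python sketch of that splitting, using the standard library's `shlex.split` as a stand-in for the shell (an approximation: it follows POSIX-style quoting rules but does not expand globs, and `rm_targets` is a hypothetical helper name):

```python
import shlex

def rm_targets(command_line):
    """Return the non-option words that rm would receive (globs left unexpanded)."""
    words = shlex.split(command_line)
    # words[0] is the program name; skip option words such as -rf.
    return [w for w in words[1:] if not w.startswith("-")]

with_space = rm_targets("rm -rf /path/to/delete/ *")
without_space = rm_targets("rm -rf /path/to/delete/*")

print(with_space)     # two targets: the directory itself, plus a bare *
print(without_space)  # one target: the glob stays attached to the path
```

With the space, `*` is its own word, which the shell then expands against the current directory; without it, the glob only matches inside the intended path.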
| stareatgoats wrote: | A great success story as far as I'm concerned, even if it doesn't | reflect well on Vimeo support. But a good reminder to have | someone doublecheck your logic if you aim to delete massive | amounts of data from production. And to check if the backups are | working (producing restorable data) on a regular basis. Sometimes | they just seem to be working, as I have learned the hard way... | legalcorrection wrote: | [deleted] | JauntyHatAngle wrote: | I'm baffled by this too. Unnecessary bridge burning I'd call | it. | | It's not even necessary to the story. | dkersten wrote: | Its explained on the first line: " I'm a Junior Developer | with less than one year of actual experience. Some of the | things that might seem obvious to some might not be so for | me". I guess it applies to this, too, not just the technical | aspects. | dsego wrote: | I might've missed it, but I don't think that line existed | when this was first posted. | thevinter wrote: | You're right and I edited the company's name (might be too late | but better this way). That said I'm not very happy with the | experience of working for TheCompanyTM anyways so I'm in the | process of switching jobs. | | Thanks for the comment :) | philliphaydon wrote: | I would take down the post entirelly. | | Your current job is linked in your CV. | legalcorrection wrote: | And try emailing the hackernews mods asking them to take | this post down. | yowlingcat wrote: | As sibling comments indicate, I would advise emailing HN mods | to take this post down and remove it from your blog and post | it on an anonymous one. Here are the problems you will face: | | 1) Your current blog has your current employer + client | linked to it. 2) Your github has your real name. 3) All of | these have been crawled/archived. | | None of this bodes well for your career in the future. 
While | I think your blog post is a great war story, it's really not | a good idea to post it on your main account which can be | traced back to your real name and CV because it will come up | the next time you apply for a job. | | Unfortunately, even if it illustrates a great deal of | ingenuity and creativity on your part in fixing a mess you | made, many folks will take one look at it and be judgmental. | You have to manage your reputation online and be careful. | legalcorrection wrote: | You're welcome and good luck! | KingOfCoders wrote: | Talking bad about your employer is great for finding a new | job. Companies are eager to hire people who bad-talk them. | breakfastduck wrote: | He doesn't talk bad about his employer. He talks bad about | his employers client. | mkr-hn wrote: | Tech is like any other human endeavor. People talk. | People change jobs and still like the people in the place | they left. | urbandw311er wrote: | Would you have had the courage to post this here if you hadn't | been able to fix it? | tomkwong wrote: | First, I want to say that this is a great post. You always grow | stronger when you make mistakes. Writing it up solidify | understanding in the learning process. | | This story resonates with many people here because many | experienced engineers had done something similar before. For me, | destructive batch operations like this would be two distinct | steps: | | 1. Identify files that need to be deleted; 2. Loop through the | list and delete them one by one. | | These steps are decoupled so that the list can be validated. Each | step can be tested independently. And the scripts are idempotent | and can be reused. | | Production operations are always risky. A good practice is to | always prepare an execution plan with detailed steps, a | validation plan, and a rollback plan. And, review the plan with | peers before the operation. | notyourday wrote: | > 1. Identify files that need to be deleted; 2. 
Loop through | the list and delete them one by one. | | > These steps are decoupled so that the list can be validated. | Each step can be tested independently. And the scripts are | idempotent and can be reused. | | This is the most underrated comment. | | I'm saying it as someone who had the ultimate oversight of | deleting hundreds of TBs per day spread over billions of files on | different clouds and local storage. | dsego wrote: | > but at the time the code seemed completely correct to me | | It always does. | | > Well, it teaches me to do more diverse tests when doing | destructive operations. | | Or add some logging and do a dry run and check the results, | literally simple print statements: |
|     print("-----")
|     print(f"Downloading video ids from url: {url}")
|     print(ids)  # the list of ids
|     ...
|     # delete() dangerous action commented out until I'm sure it's right
|     print(f"I'm about to delete video {id}")
|     print(f"Deleted {count} videos")  # maybe even assert
|     ...
| | Then dump out to a file and spot check it five times before | running for real. | aqme28 wrote: | Rather than commenting it out, I suggest adding a --live-run | flag to scripts and checking the output of --live-run=false (or | omitted) before you run it "live." | sdevonoes wrote: | But then you have double the chances of introducing a bug for | the specific scenario we are talking about: | | Before: there is a chance there is a bug in my "delete" use | case | | Now: what we had before plus the chance that there is a bug | in my "--live-run" flag | aqme28 wrote: | You can make automated tests for your flag. You can't make | automated tests for your code comments. | mbiondi wrote: | Agreed, I've also been burned doing stupid things like this and | always print out the commands and check them before actually | doing the commit. | | As they say, measure twice, cut once. | | Don't feel bad, I think every professional in IT goes through | something similar at one time or another.
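The two ideas in this subthread, deciding first and acting second, with a dry run as the default, fit in a few lines. What follows is only an illustration under assumptions: the function names, the `--suffix` option and the throwaway demo directory are made up here, and `--live-run` is modelled as a plain on/off switch rather than `--live-run=false`.

```python
import argparse
import tempfile
from pathlib import Path

def plan_deletions(root, suffix):
    """Step 1 (decide): return a sorted list of files to delete; touches nothing."""
    return sorted(p for p in Path(root).rglob("*") if p.is_file() and p.suffix == suffix)

def apply_deletions(paths, live_run=False):
    """Step 2 (act): delete only when live_run is True; otherwise just report."""
    deleted = 0
    for path in paths:
        if live_run:
            path.unlink()
            deleted += 1
            print(f"deleted {path}")
        else:
            print(f"[dry run] would delete {path}")
    return deleted

def main(argv):
    parser = argparse.ArgumentParser(description="Delete files by suffix (dry run by default).")
    parser.add_argument("root")
    parser.add_argument("--suffix", default=".tmp")
    parser.add_argument("--live-run", action="store_true", help="actually delete")
    args = parser.parse_args(argv)
    plan = plan_deletions(args.root, args.suffix)
    print(f"{len(plan)} file(s) selected")
    return apply_deletions(plan, live_run=args.live_run)

# Demo on a throwaway directory so nothing real is touched.
demo_dir = tempfile.mkdtemp()
for name in ("a.tmp", "b.tmp", "keep.mp4"):
    (Path(demo_dir) / name).write_text("x")

dry_count = main([demo_dir])                 # default: nothing is deleted
live_count = main([demo_dir, "--live-run"])  # explicit opt-in
remaining = sorted(p.name for p in Path(demo_dir).iterdir())
```

Because the plan is an ordinary list, it can be printed, diffed or unit-tested without ever reaching the code that deletes, which is also the answer to the worry about bugs in the flag itself.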
| lifthrasiir wrote: | Human-in-the-loop is so important concept in ops and yet | everyone (that's including me) seems to learn it the hard way. | GordonS wrote: | It's amazing the number of times I look at some simple code and | think "nah, this is so simple it doesn't need a test!", add | tests anyway (because I know I should)... and immediately find | the test fails because of an issue that would have been | difficult to diagnose in production. | | Automated tests are awesome :) | dncornholio wrote: | Dry run really is key here. Most automated tests wouldn't find | this bug. | pc86 wrote: | I just want to say as someone currently working on a script to | delete approximately 3.2TB of a ~4TB production database, this | subthread is pure gold. | hayd wrote: | I'd make sure those include WARN or ERROR (I'd use logging to | do that), that way you can grep for those. Spot checking might | be difficult if the logs get long. | V__ wrote: | This was my first thought too. Another think I like to do, is | to limit the loop to say one page or 10 entries and check after | each run that it was correctly executed. It makes it a half- | automated task, but saves time in the long run. | hinkley wrote: | Condensed to aphorism form: Decide, then act. | | There's a whole menagerie of failure modes that come from | trying to make decisions and actions at the same time. This is | but one of them. | | Another of my favorites is egregious use of caching, because | traversing a DAG can result in the same decision being made | four or five times, and the 'obvious' solution is to just add | caches and/or promises to fix the problem. | | As near as I can tell, this dates back to a time when | accumulating two copies of data into memory was considered a | faux pas, and so we try to stream the data and work with it at | the same time. We don't live there anymore, and because we | don't live there anymore we are expected to handle bigger | problems, like DAGs instead of lists or trees. 
These | incremental solutions only work with streams and sometimes | trees. They don't work with graphs. | | Critically, if the reason you're creating duplicate work is | because you're subconsciously trying to conserve memory by | acting while traversing, then adding caches completely | sabotages that goal (and a number of others). If you build the | plan first, then executing it is effectively dynamic | programming. Or as you've pointed out, you can just not execute | it at all. | | Plus the testing burden is so drastically reduced that I get | super-frustrated having to have this conversation with people | over and over again. | password4321 wrote: |
|     SELECT COUNT(1) FROM table
|     -- UPDATE table SET col='val'
|     WHERE 1=1
| worble wrote: |
|     BEGIN TRANSACTION
|     UPDATE table SET col='val' WHERE 1=1
|     ROLLBACK
| password4321 wrote: | Definitely better, when you can afford the overhead! | tomrod wrote: | Exactly! | mipmap04 wrote: | I do this, too, but I also take a count of the expected number | of items to be deleted as well. If the collection I'm iterating | over doesn't have exactly the number of objects I expect, I | don't proceed. | kortex wrote: | This is why I like to always write any sort of user-script | batch-job tools (backfills, purges, scrapers) with a "porcelain | and plumbing" approach: The first step generates a fully | declarative manifest of files/uris/commands (usually just json) | and the second step actually executes them. I've used a | --dry-run flag to just output the manifest, but I just read some | folks use a --live-run flag to _enable_, with dry-run being | the default, and I like that much better so I'll be using that | going forward. | | This pattern has the added benefit that it makes it really easy | to write unit tests, which is something often sorely lacking in | these sorts of batch scripts. It also makes full automation | down the line a breeze, since you have nice shearing layers | between your components.
| | http://www.laputan.org/mud/mud.html#ShearingLayers | InfoSecErik wrote: | I tend towards a --dry-run flag for creative actions and | --confirm for destructive actions. Probably slightly annoying | that the commands end up seemingly different, but it sure | beats accidentally nuking something important. | gilleain wrote: | Yes, I find command line tools that have a "--dry-run" flag to | be very helpful. If the tool (or script or whatever) is | performing some destructive or expensive change, then having | the ability to ask "what do you think I want to do?" is great. | | It's like the difference between "do what I say" and "do what I | mean"... | bzxcvbn wrote: | That's what I like about PowerShell. Every script can include | a "SupportsShouldProcess" [1] attribute. What this means is | that you can pass two new arguments to your script, which have | standardized names across the whole platform: | | - -WhatIf to see what would happen if you run the script; | | - -Confirm, which asks for confirmation before any | potentially destructive action. | | Moreover these arguments get passed down to any command you | write in your script that supports them. So you can write | something like: |
|     [CmdletBinding(SupportsShouldProcess)]
|     param ([Parameter()] [string] $FolderToBeDeleted)
|     # I'm using bash-like aliases but these are really powershell cmdlets!
|     echo "Deleting files in $FolderToBeDeleted"
|     $files = @(ls $FolderToBeDeleted -rec -file)
|     echo "Found $($files.Length) files"
|     rm $files
| | If I call this script with -WhatIf, it will only display the | list of files to be deleted without doing anything. If I call | it with -Confirm, it will ask for confirmation before each | file, with an option to abort, debug the script, or process | the rest without confirming again. | | I can also declare that my script is "High" impact with the | "ConfirmImpact = High" switch. This will make it so that the | user gets asked for confirmation without explicitly passing | -Confirm.
A user can set their $ConfirmPreference to High, | Medium, Low, or None, to make sure they get asked for | confirmation for any script that declare an impact at least | as high as their preference. | | [1]: https://docs.microsoft.com/en- | us/powershell/scripting/learn/... | spookthesunset wrote: | I'm a bit confused (because I didnt read the docs)... does | calling it with "--whatif" exercise the same code path as | calling without, only the "do destructive stuff" | automagically doesn't do anything? Or is it a separate | routine that you have to write? | | Cause if it is an entirely separate code path, doesn't that | introduce a case where what you say you'll isn't exactly | what actually happens? | bzxcvbn wrote: | It's the first option. And yes, sometimes you have to be | careful if you want to implement SupportsShouldProcess | correctly, it's not something you can add willy-nilly. | For example, if you create a folder, you can't `cd` there | in -WhatIf mode. | FriedrichN wrote: | All my tools that have a possible destructive outcome use | either a interactive stdin prompt or a --live option. I like | the idea of dry running by default. | rjh29 wrote: | Going further, make it dry run by default and have an | --execute flag to actually run the commands: this encourages | the user to check the dryrun output first. | mmcclimon wrote: | The rule we have is that anything that is not idempotent and | not run as a matter of daily routine must dry-run by default, | and not take action unless you pass --really. This has saved | my bacon many times! | maweki wrote: | Deleting actually is idempotent. Doing it twice wont be | different from doing it once. | maccard wrote: | Deleting * may not be though. Your selection needs to be | idempotent. | maweki wrote: | idempotency means that f(X) = f(f(X)). Modifying the X | inbetween is not allowed. Is there really an initial | environment where rm * ; rm * ; does something different | than rm * once? 
| einsty wrote: | In the case of any live system, i would say yes. | Additional, and different, files could have appeared on | the file system in between the times of each rm *. | mikeryan wrote: | * is just short hand for a list of files. Calling rm with | the same list of files will have the same results if you | call it multiple times. That's idempotent. | | Your example is changing the list of files, or arguments | to rm between runs. Same as pc85's example where the | timestamp argument changes. | pc86 wrote: | In addition to what einsty said (which is 100% accurate), | if you're deleting aged records, on any system of | sufficient size objects will become aged beyond your | threshold between executions. | jameshart wrote: | Right. You can kind of consider the state of a filesystem | on which you occasionally run rm * purges to be a system | whose state is made up of 'stuff in the filesystem' and | 'timestamp the last purge was run'. | | If you run rm * multiple times, the state of the system | changes each time because that 'timestamp' ends up being | different each time. | | But if instead you run an rm on files older than a fixed | timestamp, multiple times, the resulting filesystem is | idempotent with respect to that operation, because the | timestamp ends up set to the same value, and the | filesystem in every case contains all the files added | later than that timestamp. | hansel_der wrote: | > Is there really an initial environment where rm * ; rm | * ; does something different than rm * once? | | if * expands to the rm binary itself, maybe. | maweki wrote: | How is the system different after the first and after the | second call? 
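The disagreement above is mostly about what counts as the input. A toy model may help; it assumes a file system can be represented as a Python set of names, and both helper functions are hypothetical. Deleting a frozen list gives the same result however often it is repeated, while re-evaluating a selection such as `*` against a live system can pick up files that appeared in between.

```python
def delete_listed(files, targets):
    """Delete a fixed, pre-computed list of names; names already gone are ignored."""
    return files - set(targets)

def delete_matching(files, predicate):
    """Re-evaluate the selection against whatever exists right now, like `rm *`."""
    return {name for name in files if not predicate(name)}

def is_log(name):
    return name.endswith(".log")

fs = {"old1.log", "old2.log", "notes.txt"}
targets = ["old1.log", "old2.log"]   # the frozen list, decided once

# Same list applied once or twice: identical result, i.e. f(f(x)) == f(x).
once = delete_listed(fs, targets)
twice = delete_listed(delete_listed(fs, targets), targets)

# A live system: a new file shows up between two runs.
after_first = delete_matching(fs, is_log)
with_newcomer = after_first | {"new.log"}
rerun_selection = delete_matching(with_newcomer, is_log)    # newcomer is deleted too
rerun_fixed_list = delete_listed(with_newcomer, targets)    # newcomer survives
```

Both sides of the argument show up here: the operation on a fixed list is idempotent, but a script that recomputes its selection on every run is applying a different list each time.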
| jgoldshlag wrote: | If there is an rm executable in the current directory, | and also one later in your PATH, the second run might use | a different rm that could do whatever it wants to | zrail wrote: | Early in my career I used --yes-i-really-mean-it and then a | coworker removed it with the commit message "remove | whimsy". | | T'was a sad day. | inglor_cz wrote: | Yeah, that is what I recommend too. | | Instead of performing the dangerous action outright, just log a | message to screen (or elsewhere) and watch what is happening. | | Alternatively, or subsequently, chroot and try that stuff on | some dummy data to see if it actually works. | thunderbong wrote: | That is called experience. | | Good decisions come from experience. Experience comes from | making bad decisions. | dkersten wrote: | I was involved with archiving of data that was legally required | to be retained for PSD2 compliance. So it was pretty important | that the data was correctly archived, but it was just as | important that it was properly removed from other places due to | data protection. | | This is basically the approach that was taken: log before and | after every action exactly what data or files is being acted on | and how. Don't actually do it. Then have multiple people | inspect the logs. Once ok'd, run again, with manual prompts | after each log item asking to continue, for the first few | files/bits of data. Only after that was ok'd too did it run the | remainder. | | In other things I've worked on, I've taken the terraform-style | plan first, then apply the plan approach, with manual | inspection of the plan in between. | dredmorbius wrote: | mv then rm is another idiom. So long as you have the space. | | For database entries, flag for deletion, then delete. | | In the files case, the move or rename also accomplishes the | result of breaking any functionality which still relies on | those file ... whilst you can still recover. 
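A rough Python sketch of the mv-then-rm idiom, with a staging directory playing the role of a trash bin. Everything named here (`quarantine`, `restore`, `purge`, the `to_delete` directory) is an assumption for illustration, and the flat staging layout refuses same-named files rather than handling them.

```python
import shutil
import tempfile
from pathlib import Path

def quarantine(paths, quarantine_dir):
    """Stage 1: move files aside instead of deleting them (still recoverable)."""
    quarantine_dir = Path(quarantine_dir)
    quarantine_dir.mkdir(parents=True, exist_ok=True)
    moved = {}
    for path in map(Path, paths):
        target = quarantine_dir / path.name
        if target.exists():
            # Flat layout: refuse to overwrite a same-named file already staged.
            raise FileExistsError(target)
        shutil.move(str(path), str(target))
        moved[target] = path  # remember where each file came from
    return moved

def restore(moved):
    """Undo stage 1 if something turned out to depend on the files."""
    for target, original in moved.items():
        shutil.move(str(target), str(original))

def purge(quarantine_dir):
    """Stage 2: delete for real, once nothing has missed the files."""
    shutil.rmtree(quarantine_dir)

# Demo in a throwaway directory.
base = Path(tempfile.mkdtemp())
videos = [base / "a.mp4", base / "b.mp4"]
for video in videos:
    video.write_text("x")

staged = quarantine(videos, base / "to_delete")
gone_from_original = not any(v.exists() for v in videos)
restore(staged)
restored = all(v.exists() for v in videos)

staged = quarantine(videos, base / "to_delete")
purge(base / "to_delete")
purged = not (base / "to_delete").exists()
```

Note that `shutil.move` falls back to copy-and-delete when the staging directory is on a different file system, which is where the "so long as you have the space" caveat bites.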
| | Way back in the day I was doing filesystem surgery on a Linux | system, shuffling partitions around. I meant to issue 'rm -rf .' | in a specific directory, but I happened to be in root. | | However ... | | - I'd booted a live-Linux version. (This was back when those | still ran from floppy). | | - I'd mounted all partitions _other_ than the one I was | performing surgery on '-ro' (read-only). | | So what I bought was a reboot, and an opportunity to see what | a Linux system with an active shell, but no executables, | looks like. | | Plan ahead. Make big changes in stages. Measure twice (or 3, | or 10, or 20 times), cut once. Sit on your hands for a minute | before running as root. Paste into an editor session (C-x C-e | Readline command, as noted elsewhere in this thread). | | Have backups. | marcosdumay wrote: | You mean cp then rm? | | And yes, copy, verify, delete. And make sure by the code | structure that you either do the three on the same files, | or they fail. | | Also, do it slowly, with just a bit of data on each | iteration. That will make the verification step more | reliable. | | Anyway, for a huge majority of cases, only having backups | is enough already. Just make sure to test them. | andi999 wrote: | I think mv then rm is probably meant as 'windows trash | bin' style. | csours wrote: | Make a plan, check the plan, [fix the plan, check the plan | (loop)], do the plan | | See PDCA for a more time-critical decision loop. | https://en.wikipedia.org/wiki/PDCA | zeristor wrote: | Yes, I love the idea of the Plan Apply. | crispyambulance wrote: | > ... Then have multiple people inspect the logs. Once ok'd, | run again, with manual prompts after each log item asking to | continue... | | This sort-of reminds me of some "critical" work I had to do a | couple of decades ago.
I was in a shop that used this | horrifically tedious tool for designing masks for special | kinds of photonic devices-- basically it was tracing out | optical waveguides that would be placed on a crystal that was | processed much like a silicon IC. | | The process was for TWO of us to sit in front of computer and | review the curves in this crazy old EDA layout tool called | "L-edit" before it got sent to have the actual masks made | (which were very expensive). It took HOURS to check | everything. | | The first hour was tolerable but then boredom started to | creep in and we got sloppy. The whole reason TWO people got | tasked with this was because it was thought that we would | keep each other focused-- 2 pairs of eyes are better than | one, right?. Instead, it just underscored the tedium of it | all. One day someone walked in and found us BOTH in DEEP | SLEEP in front of the monitor. Having two people didn't | decrease the waste caused by mistakes, it just bored the hell | out of more people. | foota wrote: | How many mistakes did you catch? | Freestyler_3 wrote: | From his story I can tell he found one big mistake. The | tedious work itself. | mmmm2 wrote: | Another good approach is do deletions slowly. Put sleeps | between each operation, and log everything. That way if you | realize something is broken, you have a chance of catching it | before it's too late. | water8 wrote: | It never hurts to ask for another set of eyes to review. At | the least if something goes awry, the blame isn't solely on | you. | tauwauwau wrote: | Once we get used to doing same thing multiple times a day, it | doesn't matter if the log shows that we're about to take a | destructive action, we'll still do it. Only thing that is | foolproof is to not take the destructive action because | people make mistake, it's human nature. I don't know how this | can be implemented, may be encrypt the files, take a backup | in some other location (which may not be allowed). 
| | Multiple reviewers here didn't catch the mistake | | https://www.bloombergquint.com/markets/citi-s-900-million- | mi... | dkersten wrote: | > Multiple reviewers here didn't catch the mistake | | Sure, but we can only do so much. I find its good bang for | buck and alternatives that might prevent that are not | always available, so we do the best we can. You gotta make | a call on whether its enough or not. | slaymaker1907 wrote: | I'm a fan of doing things temporally so data is very rarely | actually deleted from the database. Most of the time, you | just update the "valid_to" field to the current time. | Sometimes real deleted are required such as with privacy | requests, but I think that sort of thing is pretty rare. | | If your application has space concerns, you can modify this | approach to be like a recycle bin where you delete records | which are no longer valid and have been invalid for over a | month (or whatever time frame is appropriate for your | application). However, I think this is unnecessary in most | cases except for blob/file storage. | Danieru wrote: | That form had a couple weird checkboxes with odd wording. | It is a famous mistake, but also rather understandable just | because the form was cryptic. | irrational wrote: | Because everyone assumes that everyone else is looking at | it more closely than they are. "I'll just do a cursory look | since I'm sure everyone else is doing a in-depth look." | Narrator: nobody did an in-depth search. | HowardStark wrote: | While this is a huge issue, a solution (well, a partial | mitigation) I've seen and used is the "Pointing and | Calling" technique. The basic idea is that you incorporate | more actions beyond reading and typing or pressing a button | --generally by having people point at something and say | aloud what it is they're doing and what they expect to | happen. 
| | It's used rather extensively in safety-critical public | transportation in Japan [1] and to a lesser extent in New | York (along with many other countries) [2]. This can easily | extend to software without overcomplicating by just setting | the expectation that engineers, Q&A, etc. do this even when | alone. | | [1] https://www.atlasobscura.com/articles/pointing-and- | calling-j... | | [2] https://en.wikipedia.org/wiki/Pointing_and_calling | emerged wrote: | "I'm removing that semicolon!" (Pointing) | bbarnett wrote: | Parent meant this sort of pointing. | | https://t.co/TjfX5K54H7 | akavel wrote: | I heard of this technique, but unfortunately I don't see | how it can be easily applied in software | engineering/devops. | | Also, I now realized that aviation checklists seem to | tend to be done similarly with gestures - at least from | what I saw on YouTube, not sure if that's representative | or only used during education (?) | samus wrote: | Spelling out loudly the command you are about to execute | and explaining the reasoning behind it can help a lot | too. | samhw wrote: | Hell, GitHub does that to an extent, with the "type the | name of this repository to delete it" prompts. Typing the | name of the repository isn't exactly perfect, but it's an | interesting direction. | Blackcatmaxy wrote: | There was a thread recently about a repo that | accidentally went private and lost all of its stars | because of confusion with GH teams vs GH profile readme | repo naming. I think this type of prompt is very useful | for explicitly preventing the rare worst case scenarios | but the problem is making any type of prompt "routine" so | that our brains fail to process it. | lostlogin wrote: | This is it I think. 
| https://news.ycombinator.com/item?id=31033758 | swid wrote: | The suggestion in that post about how to fix it is good, | and mirrors one I read in the Rachael by the Bay blog - | type the number of machines to continue: | | https://rachelbythebay.com/w/2020/10/26/num/ | | The take away by both is there is actually something to | do which can wake people up when the stakes are high, and | they might not be doing what they expect. | oauea wrote: | And most importantly, don't let yourself get into the | habit of copy pasting the value | underwater wrote: | I wonder if your could print some non visible characters | in there to taint the copied value in some detectable | way. | skrtskrt wrote: | I always copy-paste into that box as well, they should | probably make at least an attempt at disabling pasting | into it | JadeNB wrote: | > Then have multiple people inspect the logs. | | I think that this is the most important part of any check. | Your parent refers to checking the log five times, but, at | least in my experience, I won't catch any more errors on the | fifth time than the first--if I once saw what I expected | rather than what was there, I'll keep doing so. Of course | everyone has their blind spots, but, as in the famous Swiss- | cheese approach, we just hope that they don't line up! | veltas wrote: | Yep, even writing a simple wildcard at command-line I will | 'echo' before I 'rm'. | pjerem wrote: | On computers I own, I always install "trash-cli" and i even | created an alias for rm to trash. It's like rm, but it goes | to the good old trash. It will not save your prod but it's | pretty useful on your own computer at least. | sam0x17 wrote: | Indeed. I would say that framework or even language-level | support for putting things in "dry-run" mode is something | sorely missed from many modern frameworks and languages, that | old C libraries used to do. 
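The "type the number to continue" idea from the linked post could look roughly like this. The prompt wording and function names are assumptions for illustration, and the `ask` parameter exists only so the interactive prompt can be swapped out in tests.

```python
def confirmed(expected_count, typed):
    """True only if the operator typed exactly the number of affected items."""
    return typed.strip() == str(expected_count)

def approve_deletion(items, ask=input):
    """Return the items approved for deletion, or an empty list if not confirmed."""
    items = list(items)
    prompt = f"About to delete {len(items)} item(s). Type that number to continue: "
    if not confirmed(len(items), ask(prompt)):
        print("Aborted; nothing deleted.")
        return []
    return items

videos = ["v1", "v2", "v3"]
approved = approve_deletion(videos, ask=lambda prompt: "3")
reflex_yes = approve_deletion(videos, ask=lambda prompt: "y")
wrong_count = approve_deletion(videos, ask=lambda prompt: "30")
```

A reflexive "y" does nothing here; the operator has to read the count and agree with it, which is the point of the technique. It does not help if the number is pasted rather than typed, as noted above.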
| OrwellianTimes wrote: | Experience is the best teacher(tm) | rawgabbit wrote: | To ensure that the files are actually downloaded (step1) | before deleting the original (step2), I would make step1 | an input to step2. That is, step2 cannot work without step1. | Something like: |
|     (step1) Download video from URL. Include the Id in the filename.
|     (step2) Grab the list of files that have been downloaded and parse to get the Id. Using the Id, delete the original file.
| bambax wrote: | Yes. Also, maybe not have a delete action in the middle of a | script. It's usually better to build a list of items to be | deleted. In that case, two lists: items to be deleted, items to | be kept. Then compare the lists: | | - make sure the sum of their lengths == number of total current | items | | - make sure items_to_be_kept.length != 0 | | - make sure no item appears in both lists | | - check some items chosen at random to see if they were sorted | in the correct list | | At this point the only possible mistake left is to confuse the | lists and send the "to_be_kept" one to the delete script; a dry | run of the delete list can be in order. | pc86 wrote: | I've had good success with this approach, have two distinct | scripts generate the two lists, then in addition to your | items here also checking that every item appears in one of | the lists. | ectopod wrote: | This. The original approach can fail horribly if there's a | problem on the server when you run the script for real. Your | code can be perfect but that's no guarantee the server will | always return what it ought to. | ufo wrote: | What do you recommend, to not get into trouble if there are | spaces or newlines in the file names? | marcosdumay wrote: | Try not to delete stuff with Bash. | | This is the most reliable way. Bash has a few niceties for | error handling, but if you are using them, you would | probably fare better in another language.
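One way to follow that advice, sketched in Python under the assumption of a POSIX file system (where names may contain spaces and newlines): keep the list of targets as JSON rather than as whitespace-separated text, and delete with `pathlib` so that no shell ever re-parses the names. The demo directory and file names are invented for the example.

```python
import json
import tempfile
from pathlib import Path

demo = Path(tempfile.mkdtemp())
awkward_names = ["plain.txt", "with space.txt", "with\nnewline.txt"]
for name in awkward_names:
    (demo / name).write_text("x")

# Step 1: write the targets to an inspectable manifest. JSON escapes the
# newline, so every name stays one unambiguous string.
manifest = json.dumps(sorted(str(p) for p in demo.iterdir()))

# Step 2: read the manifest back and delete exactly those paths, no shell involved.
deleted = []
for entry in json.loads(manifest):
    Path(entry).unlink()
    deleted.append(Path(entry).name)

left_over = list(demo.iterdir())
```

The manifest is also the natural place for the sanity checks suggested earlier (expected counts, no overlap with the keep list) before step 2 is allowed to run.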
| | If you do insist on Bash, quote everything, and use the | "${var}" syntax instead of "$var". Also, make sure you | handle every single possible error. | ricardobeat wrote: | `set -e` will abort on any error, anywhere in the | pipeline. It's a must for any critical script. | kevinmgranger wrote: | Don't use a shell script. | ufo wrote: | Do you mean, always pass the list directly to the next | script via function calls, without writing it to an | intermediate file / pipeline? | plonk wrote: | Yes, use the list argument to Python's subprocess.run for | example. It's much easier to not mess up if your | arguments don't get parsed by a shell before getting | passed. | mkr-hn wrote: | This sounds like a "do nothing script." | | https://news.ycombinator.com/item?id=29083367 | | It defaults to not doing anything so you can gradually and | selectively have it do something. | | Learned about when I posted my command line checklist tool on | HN: https://github.com/givemefoxes/sneklist | | (https://news.ycombinator.com/item?id=25811276) | | You could use it to summon up a checklist of to-dos like "make | sure the collection in the dictionary has the expected number | of values" before a "do you want to proceed? Y/n" | jagged-chisel wrote: | This is how I do it in compiled code. In shell, I print the | destructive command for dry runs - no conditions around whether | to print or not, I go back to remove echo and printf to | actually run the commands. | zrail wrote: | Another technique that I've used with good success is to write | a script that dumps out bash commands to delete files | individually. I can visually inspect the file, analyze it with | other tools, etc and then when I'm happy it's correct just | "bash file_full_of_rms.sh" and be confident that it did the | right thing. | cruano wrote: | That was our SOP for running DELETE SQL commands on | production too, a script that generates a .sql that's run | manually. 
It saved our asses a fair number of times | ineedasername wrote: | Yeah, wish I'd learned that the easy way. Fresh into one of | my first jobs I was working with a vendor's custom | interface to merge/purge duplicate records. It didn't have | a good method of record matching on inserts from the | customer web interface so a large % of records had | duplicates. | | Anyway, I selected what I thought was a "merge all | duplicates" option without previewing results. What I had | _actually_ done was "merge all selected". So, the system | proceeded to merge a very large % of the database... Into | One. Single. Record. | | Luckily the vendor kept very good backups, and so I kept my | job. Because I also luckily had a very good boss and I had | already demonstrated my value in other ways, he just asked | me "Well, are you going to make that mistake again?". I | wisely said no, and he just smiled and said "Then I think | we're done here." | | I have been particularly fortunate throughout my career to | have very good managers. As much as managers get a lot of | flak here on HN, done well they are empowering, not a | hindrance, and I attribute a lot of success in my career to | them. | JadeNB wrote: | > Yeah, wish I'd learned that the easy way. | | I think that, if you've only learned something like that | the easy way, then you haven't learned it yet. As long as | everything's only ever gone right, it's easy to think, | I'm in a rush this one time, and I've never really needed | those safety procedures before, .... | karlding wrote: | At a previous job the DB admin mandated that everyone had | to write queries that would create a temporary table | containing a copy of all the rows that needed to be | deleted. This data would be inspected to make sure that it | was truly the correct data. Then the data would be deleted | from the actual table by doing a delete that joined against | the copied table.
If for some reason it needed to be | restored, the data could be restored from the copy. | XorNot wrote: | At the point you're doing this, you should be using a proper | programming language with better defined string handling | semantics though. In every place it comes up you'll have | access to Python and can call the unlink command directly and | much more safely - plus a debugging environment which you can | actually step through if you're unsure. | zrail wrote: | Eh, I think that misses the point a bit. Use whatever you | want to generate the output, but make the intermediary | structure trivial to inspect and execute. If you're | actually taking the destructive actions within your | complicated* logic then there's less room to stop, think, | and test. | | You could always generate an intermediary set, | inspect/test/etc, and then apply it with Python. I've done | that too, works just as well. The important thing is to | separate the planning step from the apply step. | | * where "complicated" means more complicated than, for ex, | `rm some_path.txt` or `DELETE FROM table WHERE id = 123`. | KMnO4 wrote: | Ah, I'm glad I'm not the only one who did this. It also means | that you can fix things when they break halfway. Say you get | an error when the script is processing entry 101 (perhaps | it's running files through ffmpeg). Just fix the error and | delete the first 100 lines. | hinkley wrote: | I tend to write one script that emits a list of files, and | another that takes a list of files as arguments. | | It's simple to manually test corner cases, and then when | everything is smooth I can just script1 | | xargs script2 | | It's also handy if the process gets interrupted in the | middle, because running script1 again generates a shorter | list the second time, without having to generate the file | again. | | When I'm trying to get script1 right I can pipe it to a file, | and cat the file to work out what the next sed or awk script | needs to be. 
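The plan-then-apply separation described in this subthread, combined with the list checks suggested earlier (lengths add up, keep list not empty, no item in both lists, every item in one of the lists), might look roughly like this. `validate_plan` and the example ids are hypothetical, for illustration only; the actual delete step is deliberately left out:

```python
def validate_plan(all_ids, to_delete, to_keep):
    """Return a list of problems; an empty list means the plan looks sane."""
    problems = []
    if len(to_delete) + len(to_keep) != len(all_ids):
        problems.append("delete + keep does not add up to the total")
    if not to_keep:
        problems.append("keep list is empty")
    overlap = set(to_delete) & set(to_keep)
    if overlap:
        problems.append(f"ids in both lists: {sorted(overlap)}")
    missing = set(all_ids) - set(to_delete) - set(to_keep)
    if missing:
        problems.append(f"ids in neither list: {sorted(missing)}")
    return problems


# Planning step: build both lists first, touch nothing yet.
all_ids = ["a", "b", "c", "d"]
problems = validate_plan(all_ids, to_delete=["a", "b"], to_keep=["c", "d"])
if problems:
    raise SystemExit("refusing to delete: " + "; ".join(problems))
print("plan looks sane; the delete list can go to the apply step")
```

The checks cannot tell whether the two lists were swapped, which is why a dry run of the delete list is still worth doing afterwards.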
| francis-io wrote: | This was taught to me in my first Linux admin job. | | I was running commands manually to interact with files and | databases, but was quickly shown that even just writing all | the commands out, one by one, gives room to personally review | them and get a peer review, and also helps with typos. I could ask a | colleague "I'm about to run all these commands on the DB, do | you see any problem with this?". It also reduces the blame if | things go wrong if it managed to pass approval by two | engineers. | | While I'm thinking back, another little tip I was told was to | always put a "#" in front of any command I paste into a | terminal. This stops accidentally copying a carriage return | and executing the command. | koolba wrote: | > This stops accidentally copying a carriage return and | executing the command. | | For a one-liner sure, but a multi-line command can still be | catastrophic. | | Showing the contents of the clipboard in the terminal | itself (eg via xclip) or opening an editor and saving the | contents to a file are usually better approaches. The | latter lets you craft the entire command in the editor and | then run it as a script. | afiori wrote: | From [0]: | | [For Bash] Ctrl + x + Ctrl + e : launch editor defined by | $EDITOR to input your command. Useful for multi-line | commands. | | I have tested this on Windows with a MINGW64 bash, it | works similarly to how `git commit` works; by creating a | new temporary file and detecting* when you close the | editor. | | [0] https://github.com/onceupon/Bash-Oneliner | | * Actually I have no idea how this works; does bash wait | for the child process to stop? does it do some posix | filesystem magic to detect when the file is "free"? I | can't really see other ways | mh- wrote: | It does create and give a temporary file path to the | editor, but then simply waits for the process to exit | with a healthy status. | | Once that happens, it reads from the temporary file that | it created.
| remram wrote: | The 'enable-bracketed-paste' setting is an easier and more | reliable way to deal with that: | https://unix.stackexchange.com/a/600641/81005 | | It will prevent any number of newlines from running the | commands if they're pasted instead of typed. | | You can enable it either in .inputrc or .bashrc (with `bind | 'set enable-bracketed-paste on'`) | ineedasername wrote: | _> literally simple prints statements_ | | Yes, that can be a simple but powerful live on screen log. I | developed a library to use an API from a SaaS vendor, in much | the same way as the author. It was my first such project & I | learned the hard way (wasted time, luckily no data loss or | corruption) that print() was an excellent way to keep tabs on | progress. On more than one occasion it saved me when the | results started scrolling by and I did an _oh sh*t!_ as I | rushed to kill the job. | krono wrote: | The No. 2 philosophy! | | Make sure you got everything out and off before you pull up | your pants, or else you better be prepared to deal with all the | shit that might follow! | ElCapitanMarkla wrote: | Nice work :D I tend to always add a `--dryrun` flag to any | scripts like this these days so that when we move it to | production we can run an extra test there just to be sure. | mikotodomo wrote: | > Some of the things that might seem obvious to some might not be | so for me, thanks! | | > my mind thought that url would refresh itself as soon as the | page variable changed | | This is what I thought too when I read the code. I don't think | it's obvious at all! | xmprt wrote: | That's actually surprising to me. In most languages that I've | worked with, strings are immutable so the fact that url doesn't | update is more obvious to me and I'd be surprised if it did | update. | shantnutiwari wrote: | What negativity and arrogance in the comments here. Jeez, it's | like no one HN ever made a mistake, a bunch of 10xers ninja | programmers here. 
Please read this: | | >I also want to preface this whole post by saying that I'm a | Junior Developer with less than one year of actual experience. | Some of the things that might seem obvious to some might not be | so for me, thanks! | | It's just some kid sharing a mistake they made and owning up. | Ease up on the "LOL what an idiot" attitude | nicbou wrote: | More importantly, this person is helping us learn from their | mistake. This is something that should be encouraged, not | mocked. | JacobiX wrote: | Just to be fair also to some commenters, I think that the post | had been edited after posting from what I remember ... so maybe | the older comments are not very relevant. | thevinter wrote: | To clarify, I only removed the company name and added the top | disclaimer | [deleted] | [deleted] | noufalibrahim wrote: | I think it was a great post. Reveals a knack for clarity in | explanations. The mistake is simple enough and natural for a | junior. If it were just one video or something, it would | probably not even be noteworthy. I think the developer learned | from the incident too. So all good. | | I do think Vimeo was irresponsible in the whole affair though. | snowwrestler wrote: | I'm impressed by their commitment to automation. If that was | me, once I realized that manually uploading from Gdrive to | Vimeo would fix the problem, I probably would have just | committed myself to manually doing that all weekend. It would | feel safer and serve as a sort of penance for screwing up the | automation the first time. | | But nope, they went right back to scripting and got it done. | KrishnaShripad wrote: | I have made a lot of such blunders myself. Accidentally deleted | my unchecked code and had to re-write everything from memory. | | I envy those who claim to make no mistakes at all. | boygobbo wrote: | Don't envy them - they are deluding themselves. | aeroplanetext wrote: | I've been there! At least when you write it the second time | it goes more quickly.
| FunnyLookinHat wrote: | I was actually really impressed with this individual! For | someone who has less than a year of experience, they're showing | quite a bit of initiative, drive, and curiosity - which really | are what make or break engineers as they develop. Taking the | time to do a blog post (effectively a post-mortem) and share it | is even better! | | And yes - I've literally done this exact same error (with TB of | video data!). Spending the following week remediating all of | that data loss was a great lesson in patience and attention to | detail. :-) | | OP: If you're ever looking for a job be sure to send me a | message. Contact info in profile. | Moru wrote: | My mistake was on floppy disc with source code, other text | files and images. Was hand editing (in hex disc editor) the | floppy to get back the data, sector by sector. Fun times. Not | going back there though :-) | nso wrote: | Mine was a DELETE FROM Users; WHERE... Fun was had. | codegeek wrote: | Usually the recommendation is to not start writing the | DELETE query first. Write the SELECT query first and see | the results. If you miss the WHERE clause, you will see | that immediately. Then change SELECT * to DELETE. But I | assume you have learned that lesson already :) | Moru wrote: | Yes, but it can't be stressed enough, always the first | time for someone. | tasuki wrote: | Wrt "less than one year of experience", looking at Nikita's | CV and GitHub, despite the title, they aren't really a junior | developer :) | franciscop wrote: | True, he's been teaching programming since at least 2018, I | was in a similar boat where I'd been programming for almost | 5-7 years for fun and profit before my first official | fulltime job. | [deleted] | 692 wrote: | there's an argument that the best people around are the people | who have already (or almost) made some big mistakes. 
| | I have made a couple of huge ones - luckily I kept my job | comprev wrote: | When interviewing candidates I always enquire about their | professional mistakes. Their reply often is the decider | between hiring/rejecting. | | I want to have colleagues who admit fault, be truthful about | actions which lead to the issue, and learn from it. The | learning includes organisations perhaps putting additional | measures in place to prevent future issues. | | One candidate told of a story how he was On-Call early in his | career and was told situations happened so rarely, just to | continue living life as normal. | | Unfortunately for him, his pager went off at 02:00am while he | was high as a kite on drugs - but felt he had to take action | (mostly due to arrogance!). | | He promptly deleted production data and things only got worse | when he tried to rectify the situation. | | Of course he was fired for his actions but ever since he's | been stone cold sober when on-call.... just in case. | | He learned a valuable lesson about professional | responsibilities. | vsareto wrote: | >When interviewing candidates I always enquire about their | professional mistakes. | | "You see, my biggest mistake was programming in the first | place! Since then, it's just been an apology tour" | avgcorrection wrote: | It's funny how so many managers on this board are like, | yeah I focus disproportionately much on this one factor. | Why? Because my intuition and experience says so. | DoubleDerper wrote: | Don't fire for the mistake. Fire for the inability of | someone to own it, cover it up, or point fingers at others. | comprev wrote: | His honesty of admitting to being off his nut while on- | call led to his firing, not the action of deleting | things. 
| BolexNOLA wrote: | >His honesty of admitting to being off his nut | | This is now my favorite euphemism for being high | YorickPeterse wrote: | I currently have about 12 years of experience, and a few years | back I accidentally cleaned up GitLab's database a bit too | well. I wouldn't be surprised if the people being dismissive | simply never worked on a moderately complex and large system, | and thus don't understand how easy it is to make these kinds of | mistakes. | nspattak wrote: | LOL! | | I have multiple years more experience than this man and still I | could *very* *too* *easily* make a 7TB mistake (or likely more | :P ) | grumple wrote: | This sort of mistake happens all the time when you write in | multiple languages. A key solution is code review, a standard | practice which doesn't seem to have happened here (and | certainly isn't the fault of a junior). | [deleted] | aristus wrote: | Hey, everyone, ease up. I have: 1) dropped a production database | because I thought it was the test database. 2) screwed up a print | job costing $100,000 in today's money and had to do it again 3) | crashed all of Facebook with a C++ bug. 4) crashed Facebook photo | uploads, with a JavaScript bug, in my first month. 5) literally | killed a startup's cash flow and caused them to lose their | merchant account because I over-focused on the wrong bugs. | paintman252 wrote: | You worked at Facebook, we get it | hbn wrote: | At my first development job (paid internship at a | moderately-sized, though fast-growing business - maybe 300 people at the | time?) I introduced a bug that didn't appear until a certain | microservice stopped working (my code defaulted in the wrong | direction when the ms failed) and as far as I can tell they may | have lost or almost lost a pretty big account from it. In an | after-hours meeting regarding the issue, one of the higher-ups | ended up storming out and never showing up again.
| | In my defence, we had to get 2 PR approvals before anything was | merged! But I definitely learned a thing or two from that | experience | [deleted] | JasonFruit wrote: | I believe if we're honest, we've all done stupid things we should | have avoided. I remember a group of about 3000 emails that went | out to insurance agents saying that policy #123456789 for Someone | Funky was going to be cancelled by underwriting. I also remember | very quickly figuring out how to automate Outlook's email recall | feature. | | We've all made big dumb mistakes. Recover and learn. | hexsprite wrote: | when doing migrations/conversions I always write a script in dry- | run mode first. I exhaustively check the results to make sure | they are expected. Then try to do a real conversion/transfer of | only the 1st file and make sure that worked. Then do a couple | more. Etc. Only then do I feel confident to do the whole thing. | uptown wrote: | Junior Dev: "I'm under an NDA" | | Also Junior Dev: "Here's my source code" | [deleted] | bufferoverflow wrote: | Always do a dry run when deleting many things with code. | | - Captain Obvious | mastazi wrote: | > Vimeo doesn't provide an easy way of doing it. I wrote to the | support team around October asking them if it was possible to do | a migration, and they told us that they "will look into it" | without letting us know anything ever since. [...] At one point, | without letting us know anything, Vimeo decided it was a great | idea to comply with our request and dumped all the videos present | on OTT onto the new platform. No questions were asked [...] they | were duplicating videos that were already uploaded. | | Oh yes Vimeo, the crappy company that won't let you play videos | unless you enable autoplay in your browser[1]. | | Selecting them as a provider was the actual mistake. | | [1] https://askubuntu.com/questions/777489/vimeo-video-not- | playi... | 0xbadcafebee wrote: | This is more common than you think. 
Not just losing data, but not | having a good handle on where the important parts of the system | are, and how close you are to catastrophe. I find diagrams really | help. I can recall a visual map of the system when I work on some | component, and think, "OH, I remember seeing this component | connected to a really critical thing, I need to check something | first." | | Start by creating one empty page for every component of your | system. You won't remember them all, but over time you can add | missing ones. Each page is the authoritative source of info on | that component. If you need more pages for one component, put | them in a directory of the same name as the page and add ".d" to | the directory name, and link to them from the first page. | Finally, create a diagram (however you want) that includes every | component you have a page for. Add the count of components to the | top of the diagram. If the count on the diagram doesn't match the | number of documents, time to update the diagram. If you ever add, | remove or rename a page, time to update the diagram. If you do | this the same way for every different system you have, you can | link them all together and get both small and large scale | diagrams. (p.s. don't waste time automating this unless you find | the system changing constantly or you have a very big system) | fedeb95 wrote: | in my opinion any process that isn't preceded by another | identical and automated process that varies only by the data | involved is very risky to do in production. your management | hopefully had a big reality check? or not because of backups? | chanandler_bong wrote: | Experience is directly proportional to the amount of equipment | ruined or data lost. | | Even though you were fortunate not to lose any data, you gained a | lot of experience! 
| rexreed wrote: | A big part of the reason for the problem in this post is that | Vimeo made it impossible to move videos from one Vimeo product to | another Vimeo product: "There were roughly 500 videos on VimeoOTT | that had to be transferred to Enterprise and Vimeo doesn't | provide an easy way of doing it." | | I have found working with Vimeo to be very frustrating, | especially recently. They have a great video solution, especially | for streaming, but they seem to put up these unnecessary and | frustrating roadblocks that make me constantly question my | decision to use Vimeo. From the inability to move videos from one | place to another, requiring complete uploads (resulting in | problems like this post) to nonsensical limits and pricing, | especially on their new webinar offering, which has a limit of | 100 registered attendees. For anyone who has run webinars before, | this makes no sense since 100 registered attendees usually means | 20-30% of those people actually attend, so you're capped at 20-30 | live attendees. They should price it like most event sites and | charge per live attendance rather than registration. | | Regardless, I've been very frustrated with Vimeo since it could | be so much better if they didn't have these roadblocks in place. | If they could have easily enabled moving videos from one product | to another, the post (and 7TB of lost videos) would never have | happened. It wasn't always this way with Vimeo, but they went IPO | in May 2021 and it's no surprise they're turning the screws on | their product offering and pricing now. | beeforpork wrote: | > I Accidentally Deleted 7TB of Videos ... | | Spoiler: | | But there was a backup that could be reuploaded in time and | everything was fine in the end. | nix23 wrote: | ZFS -> Snapshot....always!!
Before touching writable-data (my | personal mantra) ;) | hnlmorg wrote: | I love ZFS too but that's not really relevant to this | discussion because the deleted items were on a video hosting | platform and the company did already have local copies. | nix23 wrote: | Yes and? Make a snapshot on live. Again, never touch data | before snapshot. | volume wrote: | This reminds of some IRC threads. You post a question and | someone's answer assumes you are going to rip out and | replace your existing prod setup just so you can use their | pet tool. | hnlmorg wrote: | At risk of sounding snarky, you do understand how video | hosting platforms work? Customers, even enterprise ones, | don't have shell access let alone control over what file | system is used. | | There are a hundred ways this problem could have been | prevented but ZFS isn't one of them. | whiplash451 wrote: | So, "i am under NDA" but I reveal my client's name and a lot of | sensitive details about what we are doing. LOL. | dewey wrote: | Where do you see the clients name? I only see Vimeo being | mentioned. | ceejayoz wrote: | It has been edited. | | https://news.ycombinator.com/item?id=31271836 | daniel-cussen wrote: | Well at least deleting the secret is a step back toward the | NDA he left behind. | Closi wrote: | It still breaks the NDA: | | * Firstly, you don't have to name the company to break the | NDA anyway (you are still disclosing information you aren't | supposed to disclose regardless of if it can be linked back | to the company). | | * Secondly, the client is still named on the front page of | the website. | | * Thirdly, OP posted this with his real name that trivially | links back to the dev shop he is working for. The site also | has his CV which lists the client again, with a description | of the project to link it to the post. | | * Finally, The client can trivially be identified by | googling the description in the second paragraph (i.e. 
just | search the named countries in operation plus the word Gym). | 12ian34 wrote: | Not all NDAs have the same terms. I could write up and | serve an NDA right now that still counts as an NDA yet | permits everything in your list. | Closi wrote: | All contracts vary in terms, but I've never seen an NDA | that says "you can talk about the content under NDA as | long as you don't mention the businesses name, and just | identify who they are in a roundabout way instead". | | "Well i'm under an NDA, so I can tell you all the | specifics of the project, but I can't tell you the | companies name. I _can say_ they own the largest search | engine though, and have a market cap of 1.5 trillion, and | rhyme with "Roogle", but I really can't say who they | are. Anyway, here is some code I wrote for them and a | description of how we nearly ruined their project along | with me calling them incompetent..." | dewey wrote: | Got it. To be honest I'd be hesitant to publish a blog post | like that with your name + current company name attached to | it. | | It's a bit different to share a fun story a few years later | about that time you almost wiped production. | [deleted] | unfocused wrote: | I'm currently working with FOIA software, and a regular user can | only delete one document at a time from the information that they | verify/redact before sending out. They can't even multi select! | Only an admin can delete multiple documents at one time. | | I'm guessing users accidentally deleted multiple documents one | too many times, and now it's baked in. | qwertox wrote: | Aaaahhh, the feeling you get when you notice that you fucked up. | Everything gets quiet, body motion stops, cheeks get hot, heart | starts to beat and sinks really low, "fuck, fuck, fuck, fuck, | fuck, fuck, fuck, fuck, fuck, fucking shit". Pause. Wait. Think. | "Backups, what do I have, how hard will it be to recover? What is | lost?". 
Later you get up and walk in circles, fingers rolling the | beard, building the plan in the head. Coffee gets made. | wonderwonder wrote: | lol, its amazing how fast the blood leaves your face when your | mind transitions from "cool that worked well" to "Oh no, what | have I done?" | | That backups comment sounds very familiar. | | I accidentally deleted a clients products table from the | production database in my early years as a solo dev. There was | only a production database. Luckily I had written a feature to | export the products to an excel sheet a while before and | happened to have an excel copy from the prior day. I managed to | build an export to ingest the excel and repopulate the table in | record speed while waiting for my phone to ring and the client | to be furious. Luckily they never found out. | [deleted] | gwerbret wrote: | I had this experience when, years ago on my first day as group | lead at $JOB, I was being shown a RAID 5 production server that | held years of valuable, irreplaceable data (because there were | no backups. Let me repeat that there were no backups). For some | bizarre reason, I thought "oh cool, hot-swappable drives" and | pulled one out of the rack. This naturally resulted in loud, | persistent beeping from the machine, which everyone ignored on | the assumption that the fellow who was just hired as the group | lead knew what the f he was doing. | | While I _didn 't_ know what I was doing, I did manage to get | the beeping to stop, and had to come in at 5 a.m. the next day | to restripe the drive I'd yanked out. | | Did I mention there were no backups? When I was a little bit | more seasoned on the job, I raised a polite but persistent | issue with management of the need for durable backups. Although | I kept at it for months, they thought about it, talked about | it, and ultimately did nothing. A few months after I left, the | entire array failed. 
Since the group's work relied on the | irreplaceable data, all work ground to a halt for the several | months it took for an off-site company to recover the data. | ycmjs wrote: | My previous boss stores company data this same way. I begged | him to approve the $5 per month cost for Backblaze on the | computers I used. He approved it for some, but not all (about | half of the ten computers). He completely rejected the idea | for the company's data. After all, it was already protected | by RAID. | ricardobeat wrote: | Isn't RAID 5 supposed to survive a single disk being taken | out? | windsurfer wrote: | If a second drive fails after the first while rebuilding | (which happens more often with larger and slower drives), | the data is lost. | arminiusreturns wrote: | Theoretically but there are often other things at play. I | know the story is older but since about 2015 raid5 has been | dead to me, mostly because at current drive sizes a raid5 | rebuild takes so long your chance of a cascade failure and | losing a second drive which makes it a "send to a recovery | lab" risk. Anywhere you would use raid5 just do raid6. | cntrl wrote: | damn, your description is spot on and reading this triggered | PTSD in me... Last time I had this feeling was two years ago | when I destroyed one of our development servers because of a | failed application update. I know exactly how I wished Ctrl + Z | to exist in real life... We had backups of the machine, but it | was still kind of a humiliating feeling to tell everybody and | ask for restore from backup (everybody was cool though in the | end) | Taylor_OD wrote: | God the feeling of having your body temp rise based purely on | realizing you fucked up is so relatable. | deltarholamda wrote: | Pffft, it's not a real panic until you weigh the pros and cons | of leaving the country with nothing but the clothes on your | back and becoming a illegal immigrant shepherd in a nation with | too many consonants in its name. 
| | (Your description is so, so, spot on.) | beardedetim wrote: | Ah, the goat farmer fantasy that always seems to come _at the | cusp_ of the solution. | CapmCrackaWaka wrote: | The worst panic I've felt actually took me over the precipice | into peaceful oblivion. I started simply saying to myself "oh | well... It's just a job". | sergiotapia wrote: | I lost 1hr and 30 minutes of a Slack like app (chat messages). | Luckily at the time we were pretty small so not much data was | lost but holy shit did that make me almost throw up. | | Thank God my automatic backups were so close to the mistake I | made and I didn't lose 24 hours. | | Haven't made a mistake like that since and I don't destroy DB | records like that anymore. | Oarch wrote: | Poetic! Love it | Helitio wrote: | Just a note: being able to click yourself a server at Google, AWS | etc. Might be cheap enough even paying for 15tb of traffic. | DonHopkins wrote: | >... the "Silicon Valley" world ... | | To rebillionizing! | | https://www.youtube.com/watch?v=wGy5SGTuAGI&t=369s | | ...yeah, the Tres Commas bottle was on the DELETE key. The corner | of it was just, it juuuust got on there... | lesgobrandon wrote: | [deleted] | [deleted] | dclowd9901 wrote: | His solution reminds me of how I used Cypress to generate test | accounts on our local admin dashboard for Cypress tests, since | our api was inadequate (it didn't do the billing signoff required | to create accounts that last longer than a month... don't | ask...). | SnowHill9902 wrote: | Related: is there any HTTP API model that supports transactions | with commit and rollback? Also isolation levels? Usually one | wants to set_stock(get_stock() + 10) but there may be competing | from various clients between both calls, resulting in races. | Usual web APIs seem vulnerable to this. | jffry wrote: | Wouldn't the model be to expose an increment_stock(10) type | HTTP endpoint instead, and the backend can ensure it's atomic? 
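One way to read the increment_stock(10) suggestion: the endpoint hands the whole read-modify-write to the database as a single statement. A small sketch of what the backend side could look like, using Python's built-in sqlite3 module; the `stock` table, its columns and the `increment_stock` function are assumptions made up for this example, not anything from the original post:

```python
import sqlite3


def increment_stock(conn, item, amount):
    """Add amount to an item's stock with one UPDATE statement.

    The addition happens inside the database, so two clients calling
    this at the same time cannot overwrite each other's result the way
    set_stock(get_stock() + 10) can.
    """
    with conn:  # commits on success, rolls back if an exception escapes
        cur = conn.execute(
            "UPDATE stock SET qty = qty + ? WHERE item = ?", (amount, item)
        )
        if cur.rowcount != 1:
            raise KeyError(item)  # unknown item, nothing was updated


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stock (item TEXT PRIMARY KEY, qty INTEGER NOT NULL)")
conn.execute("INSERT INTO stock VALUES ('widget', 5)")
conn.commit()

increment_stock(conn, "widget", 10)
print(conn.execute("SELECT qty FROM stock WHERE item = 'widget'").fetchone()[0])  # 15
```

This only covers the single-counter case from the question; it says nothing about multi-call transactions or isolation levels over HTTP.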
| LinAGKar wrote: | Shouldn't that be `page={page}` rather than `page{page}`? Or | better yet, use the requests `params` argument. | hanly_paul wrote: | I am also a junior with 1 year's experience, just in Python but | none with the requests module or web development. If the 'page' | variable is being changed, was the error something specific to | this module, not refreshing the page? | orange_puff wrote: | As everyone else has already pointed out, better testing would | have been very useful here. For instance, print(len(our_ids)) | would have been a dead giveaway that something was up | | I am also a junior dev and completely empathize with being given | a lot of responsibility and potentially messing up. I think for | someone with < 1 year of experience, to solve the problems you | created as fast as you did is really impressive. Thankfully your | story ends well :) | AtNightWeCode wrote: | The conclusion should include that backup at separate locations | is key. Also, that the backups are tested and work. I worked with | clients that had everything from lightning strikes destroying | servers to ransomware to people making mistakes. No problem with | solid backups. There is a difference between a good process and | skill. | thisNeeds2BeSad wrote: | The only thing that I can remember helping against such actions, | is the exponential need for confirmation by intent. | | That means, if you delete one small file you need one confirmation, if | you delete thousands, you need an intent stating "I expect thousands of | files to be deleted". Same goes for size. So not an OK button, | but instead a form allowing you to enter the dimension of the | intended outcome. 100 files max, 1 GB max deleted. | | If the request goes over the intent, the system aborts. | dncornholio wrote: | What is the f doing in | | url = f"https://api.ourservice.com/media?page{page}&step=100 ? | throwaway744678 wrote: | It's a Python f-string [0].
A way of formatting a string by | directly including a Python expression between curly braces. | | [0] https://docs.python.org/3/tutorial/inputoutput.html#tut-f- | st... | qwertox wrote: | "f-strings", a (new) way to format strings. | jraph wrote: | f for format ("formatted string"). | | It does the same thing as | `https://api.ourservice.com/media?page${page}&step=100` [sic] | in JavaScript, or | "https://api.ourservice.com/media?page$page&step=100" in Bash, | PHP, Perl or Groovy (and other languages). It opts you into | variable substitution / interpolation in the string literal. | | In Python these string literals are called f-strings if you | want to look it up. They are defined in PEP 498 - Literal | String Interpolation [1] and available since Python 3.6. | | [1] https://peps.python.org/pep-0498/ | | [sic] there is probably a missing '=' in this url after | "?page" | fifticon wrote: | if it's python, it's the formatting/interpolation string | marker. | vjust wrote: | So much wisdom in these comments, people have different styles of | being careful, and each makes sense in a nuclear "go" situation | p0d wrote: | For many years I have had a private blog. I like to write but | realised 99% of us are not interesting to read. This is a young | guy processing his thoughts. Not "teaching" the rest of us as he | frames it. This should have stayed in-house and personal. The | company can then decide which clients, authorities to contact if | necessary. There is a book in all of us as they say. For most of | us it should stay there. | donalhunt wrote: | fwiw I would probably have turned to rclone.org for this. It | doesn't have support for vimeo out of the box but the Vimeo API | seems sane enough that it would be trivial to implement uploads | quickly. | | Previously used rclone for doing massive transfers between cloud | providers using "cheap" on-demand servers which provide unlimited | data transfer (the public clouds make this very expensive).
| ghoomketu wrote: | The more I read about vimeo the more I wonder what's up with | these guys. | | Only recently they made some god aweful policy changes for | content creators(1), but it looks like they treat their | enterprise customers just the same. | | Surely, there must be better alternatives for hosting videos than | being at the mercy of a company who couldn't care less about big | paying customers. | | (1) https://www.theverge.com/2022/3/18/22985820/vimeo- | bandwidth-... | pfista wrote: | mux.com seems like a great alternative and is super developer | focused. | bbbush wrote: | scary. maybe as well just pay vimeo to restore data. | IYasha wrote: | So, apparently, vimeo has better support than youtube (not | informative, but at least they DO something). Duly noted. | aasasd wrote: | After having read about plenty of such cases over the years, I | have a persistent dread of pulling something like that myself, to | the point of being nervous with '*' in the terminal, and | generally checking everything twice. (And also have some kind of | mild horror-high from corporate snafu stories, weirdly | reminiscent of Ballard's 'Crash'). | | So: I never feed the data straight from the gathering script into | the modifying script, at least not in the first runs. Instead, I | dump the whole list of items into a file, count them in there, | gawk at them to see that they're right, and compare with the | source data by hand until I begin to annoy myself. Then I feed | that file to the second script. | Peleus wrote: | Under NDA but I'll give rough details of what's occurring while | also naming my client and disparaging them to the public. | | Well that's a brave move... | searchableguy wrote: | They said they are a junior developer with not much experience. | I'm afraid they may not know what is and isn't covered under | NDA. | KingOfCoders wrote: | My tip would be: read what you sign. 
| thevinter wrote: | Just to clarify, my company is under an NDA and not | personally me. It also encompasses only the actual project | details so a post like this is legally compliant. (Not a | lawyer, might be wrong) | KingOfCoders wrote: | So you're not under an NDA as you wrote. | | I don't know your position but I would assume a NDA is | part of your freelancer or employee contract. | mkr-hn wrote: | OP might at least want to consult with a contract lawyer | in Italy to make sure. | Closi wrote: | You likely have a confidentiality clause in your | contract. | | If your company is under an NDA, your company will have | an obligation to ensure that _you_ also do not disclose | information. | | Companies are mostly just collections of people, and an | NDA is mostly meant to stop people working on the project | from talking about the project. | bluehatbrit wrote: | In every contract I've ever signed, part of the NDA | clause with my employer is that I'm also bound by NDA's | my employer is bound by, so if the employer signs an NDA | with a customer, I would also be bound by that. It might | be worth checking your contract, otherwise having a | company sign an NDA doesn't hold much weight if their | staff are free to go around sharing the information | themselves. | [deleted] | photon-torpedo wrote: | Apart from all the advice on how to do such destructive | operations more safely, I think there's also a lesson to be | learned about communicating more actively: | | 1. Vimeo responds to the original request with "will look into | it", then... nothing happens? This may depend on culture, but at | least from my experience in the UK, this is a very non-committal | response, and if you really want them to do something, you'll | need to chase them. Wait a few days and inquire if they have any | estimate for when it might get done, or if they need more | information. 
I find that the "looking into it" response is | sometimes used to gauge how important the request is to you. | | 2. Once you go with your own solution, just drop a quick message | to Vimeo: "Hey, just wanted to let you know we've found our own | solution for this, and won't require your help any more. Sorry if | you've already committed any resources for this task. Have a nice | day, yada yada." This not just avoids what happened here, but is | also a courtesy to them. | mbostleman wrote: | Related: The change is fine, it's only one line. | amtamt wrote: | A computer lets you make more mistakes faster than any invention | in human history, with the possible exceptions of handguns and | tequila. | mindcrime wrote: | Imagine coding while drinking tequila... | johnklos wrote: | We can all poke at this person for doing things incorrectly, but | one has to wonder what mindset could lead to any programmer ever | thinking that: 1) parsing a web page shouldn't be | considered incredibly fraught with problems 2) that | reloading web pages should be part of (1) 3) that this | should ever possibly be run without validating the list of files | that would be deleted | | So forget the specifics. Where are people learning these things, | and what do we do to teach them better things? | bsder wrote: | "rm -rf" blowing you foot off is a Unix Right of Passage(tm). | | You _will_ do it at least once in your career. If you 're old | enough you will do it twice. If you're really old, you get the | joy of doing it a third time. | | The subtlety increases each time because you _do_ learn. | dboreham wrote: | College? Parents? In my experience it runs pretty deep so not | sure it can be easily trained out. This mindset is probably | quite useful in evolutionary terms: rush at the attacking bear | without thinking, for example. | plonk wrote: | > rush at the attacking bear without thinking, for example | | Would that work? 
I don't see a bear backing down and I don't | see the human winning either. | qayxc wrote: | > Where are people learning these things, and what do we do to | teach them better things? | | Learn to learn and learn to work carefully. It starts in school | and should be part of a proper college/university education or | vocational training. | | There's several ways of learning the specifics: by experience | on-the-job, which can be hard if mistakes can get you fired; or | by putting in the work in your free time. | | If your job is to work with certain web frameworks and you're | not very experienced, either ask senior devs to assist/review | before going live with critical changes. Alternatively, | practice at home. Unpopular, but you need to get experience | from somewhere. OSS projects are a great way to do that - be | that by creating your own or by contributing to an existing | one. | dncornholio wrote: | Some mistakes can only be learned by making them. Sometimes you | can tell someone a hundred times something, they won't learn | until they experience it. | | The point is not to prevent these mistakes, but to keep the | consequences low. | | Have backups, have version control, etc. | ufmace wrote: | True, and worth remembering why. Most of us are constantly | getting warned about the dire potential consequences of huge | numbers of things, most of which are either massively | unlikely to ever happen or not actually that bad, or both. | It's very difficult to tell which of the things we get warned | about are actually high risk until something bites us. | Mo3 wrote: | Seriously.. also, looking at these code snippets... | | If someone delivers code that looks like that, especially if | intended for a production system, I'm firing immediately. | | It's a miracle nothing has happened sooner. 
| ziddoap wrote: | From the article: | | > _I'm a Junior Developer with less than one year of actual | experience._ | | > _The bad news is that this was on Friday, and we needed to | have the videos back up at most for Tuesday morning._ | | You say: | | > _If someone delivers code that looks like that, especially | if intended for a production system, I'm firing immediately_ | | Fire immediately? What a miserable sounding place to work. | Mo3 wrote: | In this case - seeing how they let them have direct access | to production - I agree on the miserable sounding place to | work and repeat myself - | | It's a miracle nothing happened sooner | ziddoap wrote: | I was referring to your workplace. | Mo3 wrote: | At least we don't let junior developers with close to | zero experience anywhere near production.. | | I didn't quite read the part about his experience in the | article, I agree firing over that wouldn't be fair, but | that just raises other questions. | DeathArrow wrote: | There's a thing called unit tests. | muglug wrote: | The root of this particular issue was Vimeo's failure to do this | migration for their customers. | | Vimeo OTT has a codebase written in Rails, whereas the main Vimeo | application is written in PHP. At the time Vimeo acquired Vimeo | OTT's codebase, the Vimeo OTT codebase was small -- around 10,000 | lines of Ruby. Rewriting that codebase inside the Vimeo PHP | application would have been a tough technical challenge for the | all-Ruby team, and they'd have likely lost some people along the | way and missed out on some content deals, so they decided instead | to maintain two separate codebases and two separate login | systems. | | The video-playback and video-storage infra has since been | unified, but all the business logic is still siloed. | conductr wrote: | He wasn't asking them to refactor their internal code bases.
| But they should be able to whip up the 20 lines of code needed | to do this between APIs (or just directly on their servers). | Essentially what author was trying to do when he screwed up. | For the author this was disposable code, for Vimeo this would | have been a reusable utility. | | I know how these things happen. Support ticket queues and all. | And while I don't fully know the difference in cost, I would | assume a customer upgrading to an Enterprise plan would get a | better support experience. | | Whoever within authors company negotiated the upgrade to | Enterprise (or didn't) and failed to embed some agreement | around OTT to Enterprise transition assistance was the one who | made the first mistake. | macspoofing wrote: | >The root of this particular issue was Vimeo's failure to do | this migration for their customers. | | Yes and No. At the end of the day, you as a business have to | insulate yourself from your infrastructure provider. | notyourday wrote: | Vimeo is the only infrastructure provider providing that | service. It is impossible to insulate a business from it. | chernevik wrote: | Per the post, Vimeo DID do it -- without telling the customer! | And then wouldn't help uncluster the situation. | macspoofing wrote: | > but at the time the code seemed completely correct to me | | I venture this kind of (misplaced) over-confidence is not | atypical of many junior developers. As someone with a few years | under my belt, I don't care how sure I was of the code I wrote | that deletes important data, I would have gone through the code | over and over again, and at least ran a simulation (by maybe | logging the generated delete urls for manual verification). | | It's a rite of passage and we all went through something like | this. It's how you learn and grow. | | >It also should probably teach something to Vimeo | | No. Even if Vimeo could have made things better, it's still your | fault. You have to take responsibility for your business. 
At the | end of the day, if this causes the closure of your company, Vimeo | is still fine. | wumms wrote: | Not completely off topic (as one of my scripts deleted files | recently which dates were off by one): | | > Fri May 06 2022 | | > I'm currently working [...] in Italy | masswerk wrote: | Controversial opinion: And this is why block syntax by white | space is not for production. | krit_dms wrote: | This is hardly a whitespace issue | masswerk wrote: | Ah, yes, I just noticed the difference in indentation. In | actuality, the error about the mental model of variable | states. | havkom wrote: | The company was lucky to have someone like you that could | actually sort out real problems efficiently. I would bring up | this story when negotiating for a raise. | davbryn1 wrote: | "What does this teach us? Well, it teaches me to do more diverse | tests when doing destructive operations. It also should probably | teach something to Vimeo and to my contractor but I doubt it will | (and yes, the upload for some reason is still manual to this day. | Go figure!)" | | So you wrote bad code, didn't test it properly, ran it on | production on the Friday before a release and are blaming Vimeo | and [name redacted]? | | And your resolution was yet another cobbled together script that | you probably didn't test? | | This isn't a great article to have attached your name to | gala8y wrote: | Not to mention that he _deleted_, but not _lost_ videos. | Nothing to see here. | oneepic wrote: | Earlier in the article, the author does call out that it's bad | code, so he's not entirely blaming these companies. Anyway: You | should not be afraid of thinking about what _each_ party could | have done better. Not just yourself, but other people too. When | I look back on times where I only blamed myself for prod | issues, it was less of a learning experience, and more focused | on beating myself up for no good reason. 
That approach shows | that I 'm afraid of the consequences, and it's an effective way | to feel isolated from the team instead of improving. | nickkell wrote: | Better to do it before the release then afterwards. I'm | assuming this way nobody noticed the issue. | | Also, would you rather everyone only ever posted about all the | times they were successful? | chopin wrote: | I'd hire this guy if only being for this frank about his | mistake. He owned it and that is what I would look for. | | After deletion, what should he have done? Postpone the go-live? | That's often not a a cost-effective option. As for a risk- | analysis the worst what could happen was deletion of the | remaining videos. I don't think that that makes big difference | in this situation. And to do the right thing, you have to have | the infrastructure in place, if you are in a hurry. I doubt | that's the case for a 10 heads shop. | GordonS wrote: | Aye, this is how you learn and make sure it doesn't happen | again. | | I did a similar thing ~20 years ago when I first started my | career, accidentally deleting a production database because I | thought I was working on the test database. | | I owned it, learned lessons from it, and it's never happened | again. | davbryn1 wrote: | Owning the mistake would be fine if he did that - he did'nt. | He blamed the company he was contracting for. That's a big no | from me | esquivalience wrote: | It's as if we read different articles. He literally writes | that he made "A series of mistakes that could've probably | been easily prevented." | thevinter wrote: | I'm sorry if it came off like that. The mistake in this | case was completely mine (bad code and bad testing). The | detour on the other two companies was mostly because this | way of deleting/recovering stuff should've probably been | avoided in the first place, other than that I'm absolutely | not blaming anyone else! 
| davbryn1 wrote: | Don't worry about all that - there isn't a developer | worth their salt that hasn't made a mistake. But I'd | consider having this blog post and HN post retracted | purely for future internet checks. It isn't a reflection | on you, and your honesty is fantastic. But there is a lot | to be said about using a pseudonym when it comes this | close to your employers | desarun wrote: | I'd probably make your GitHub profile private for a while | as well. Or at least remove your real name from it. | [deleted] | malexbone wrote: | Agree 100%. Acknowledged mistake, moved forward to find a | solution. Reflected on lessons learned. Shared valuable | lesson. | | To me this indicates intelligence, competence, integrity, | grit and generosity. Technical proficiency is much easier to | come by than integrity, grit and generosity. I would trust | the author to deliver on commitments. | honksillet wrote: | Agreed. But I'd also fire him from this job. | Beltiras wrote: | For having got into a sticky situation and out of it? | [deleted] | SparkyMcUnicorn wrote: | "Recently, I was asked if I was going to fire an employee | who made a mistake that cost the company $600,000. No, I | replied, I just spent $600,000 training him. Why would I | want somebody to hire his experience?" | | -- Thomas J. Watson | yohannparis wrote: | Doesn't make sense. Their employer literally paid them to | learn from their mistake. | | Now, you think they should be fired? So that another | employer reaps the benefits of that learning experience. | kwertyoowiyop wrote: | Will every developer who has never checked in bad code on | Friday, or accidentally deleted the wrong data, please raise | their hand? | | 'Judgment comes from experience, and experience comes from poor | judgment.' | | :-) | dang wrote: | (Since the OP redacted the company name from the post, I've | done the same in your comment here. I hope that's ok.)
| | (We do this sort of thing to protect users, usually as the | result of an emailed request, and you can tell when we've done | it because of the word 'redacted' in square brackets.) | jasonlotito wrote: | > This isn't a great article to have attached your name to | | A million times better than your comment. | davbryn1 wrote: | All I did was give advice. If you don't like it it's fine. | smokey_circles wrote: | Oof, we wouldn't work well together. Very rarely is someone | good enough to be this obnoxious. | davbryn1 wrote: | I very much doubt you would ever work with or for me. | [deleted] | breakfastduck wrote: | Vimeo completed a major migration of videos between accounts | with no confirmation or communication before committing it, then | refused to reverse the change. Hardly the best service. | | The article hardly comes across as 'blaming' them for the core | issue but they were definitely not helpful. | wruza wrote: | Code without constant logging of "utc [who] does what exactly" has | been a no-go for me for a long time. Also, if you have to be | destructive, replace the <rm/sell/halt> with log() for at least | one time (aka --verbose --dry-run) and check your expectations. | One-shot scripts like this are screaming disaster. | | (The problematic line lacks the closing ", probably a typo? I | thought it closed in an unexpected location) | ge96 wrote: | The product I work on, I can watch the events occur afterwards | (videos of people using it) and it's so embarrassing watching it | fail. The wasted time. Ahh... I've gotten better at checking deps | and running a full automated E2E test every time new code is deployed | (before/after diff envs). | | Still things happen. Hopefully you have a large enough client | base where some bad experience doesn't define the whole thing. | BillyTheKing wrote: | For larger 'live' production changes I've now started to rely on | generative programming.
I've got one script in some 'normal' | programming language like javascript, or python, which in turn | generates a script that contains a list of curl or other cli | commands which do the actual deletion, modification, addition, | etc. | | This allows me to run a small sub-set of commands and test those | under a live-environment before running all commands at once. In | addition, this also functions as a complete log of what has been | changed manually in production. | RankingMember wrote: | I'm impressed you went with an automated solution (PlayWright) | for 500 videos after all that, considering they could be cross- | loaded from Google Drive almost instantaneously. I'm glad it | worked, but coding around a screw-up under the gun seems like a | high-risk operation compared to spending 4 hours doing the task | manually (albeit being super bored the whole time), but with the | benefit of knowing it's being done correctly instead of hurriedly | writing a script to potentially do something else wrong very | efficiently and dig your hole deeper. | leokennis wrote: | Actually I was surprised reading that the person wrote a script | to delete 900 videos. | | If you need to do it once, it's probably 2-3 hours of work? | That is identifying a duplicate video and then clicking the | button(s) to delete it once every 20 seconds. | | Reminds me of https://xkcd.com/1205/ | bruhbruhbruh wrote: | +1 to this. After the few major screw-ups I've caused at work, | my self-confidence in my coding ability is rocked, and I tended | to react by erring towards manual cleanup, rather than coding | some scalable solution for fixing the issues | alkaloid wrote: | Does anyone else get that deep, dark, disturbing feeling in their | gut when they know they have done something bad like this? | | This is why I use so many print statements and comment out | destructive actions! Lots of experience with these feelings! 
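The print-first habit described above can be sketched as a dry-run flag rather than commented-out lines. This is a minimal sketch, not the OP's script: `delete_videos`, `delete_fn` and `max_expected` are hypothetical names, and `delete_fn` stands in for whatever call actually removes a video. The default is to only print what would be deleted, and an optional cap aborts the run if the list is larger than the operator said it should be, an idea raised elsewhere in this thread.

```python
def delete_videos(video_ids, delete_fn, dry_run=True, max_expected=None, log=print):
    """Delete the given IDs via delete_fn, or only report them when dry_run is True.

    delete_fn is a hypothetical callable taking one video ID.
    max_expected is an optional upper bound on how many deletions are intended.
    Returns the list of IDs that were actually deleted.
    """
    ids = list(video_ids)
    log(f"{len(ids)} videos selected for deletion")

    # Abort before touching anything if the selection is bigger than intended.
    if max_expected is not None and len(ids) > max_expected:
        raise ValueError(
            f"refusing to delete {len(ids)} videos, expected at most {max_expected}"
        )

    deleted = []
    for video_id in ids:
        if dry_run:
            log(f"[dry-run] would delete {video_id}")
            continue
        delete_fn(video_id)
        deleted.append(video_id)
    return deleted
```

A first pass would be `delete_videos(ids, delete_fn)` with the default `dry_run=True`, reading the output by eye; only after that would the same call be repeated with `dry_run=False` and a `max_expected` matching the count just reviewed.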
| arein3 wrote: | You can automate using Puppeteer or Selenium | dsego wrote: | The author used Playwright in the end to automate uploads. | Using e2e tools for automating tasks is clever, I'm not sure I | would've thought of it. | chopin wrote: | It's clever, but also brittle. And might have disastrous | error conditions (like hitting "Delete" instead of "Continue" | if the wrong UI part has focus). | andreagrandi wrote: | It should really be something like: "a flaw in our system allowed | me to delete 7 TB of videos". Not entirely your fault. | mrkwse wrote: | System and/or development processes | desarun wrote: | Oh dude, we've all been there. | | 9 years ago I was working for a major broadcasting company in the | arse end of London as a junior dev, building one of their Android | apps. | | We'd roll features out months before & enable them with feature | flags via a JSON file we'd manually push to a prod server at a | later date. | | We'd just built a huge new feature letting you request content to | be downloaded to your set top box remotely & it had a 250k | marketing campaign to go along with the launch. | | Senior dev trusted me with prod deployment rights. | | I pushed the wrong JSON config to prod, launching the feature | weeks before the marketing campaign. | | Thank god I was a junior perm, that was definitely a firing | offence. | hayd wrote: | > Senior dev trusted me with prod deployment rights. | | That part's crazy! If you think it was a firing offence | wouldn't they have been fired? (I don't think it is, but | obviously requires system changes/explanation.) | BurningPenguin wrote: | I accidentally deleted a printer from the printserver by using a | Python script. The docs weren't exactly clear, so I thought it | would only remove the local printer connection. After reading | this post I feel better now. My fuckup wasn't that bad in | comparison. :) | furyofantares wrote: | Great post and great attitude.
| | I think I would reflect on why this is a script to begin with. | It's run once and with only 500 items could be done manually, | though 500 is certainly a bit much. | | But it's not a massive time saver; the point of the script should | be almost entirely to increase accuracy. I think I would write | one script to generate the list of videos to delete; that's the | part that's actually difficult, and a human can then verify the | list. I would probably just delete them by hand after that, but | if I really wanted a script for that part too, it would be a | separate script that uses a list that has been vetted by a human | even if initially created by the first script. | Reason077 wrote: | > _" What does this teach us? Well, it teaches me to do more | diverse tests when doing destructive operations."_ | | I think it also teaches us that adversity sometimes leads to | better solutions. I love that the OP made a hacky script that did | in 4 hours what a guy was paid to do manually over several | months! | KingOfCoders wrote: | "I'm under an NDA" | | Don't write a blog post. | franciscop wrote: | This is a great technical write up, I'd love to hear the human | side of this story as well! When did you tell the higher ups that | you deleted production? Was no one more senior on call to try to | fix it? Did they want you to learn how to fix it? Or were you the | most senior responsible for this whole area? Or did they don't | know? | thevinter wrote: | The first part of my write up slightly explains it but the | point is that HN is the top 1%. In my current company we have | 10 developers, most of them without a technical degree. They | know how to do what they've been doing for the past 10 years | but (as with most small companies here in Italy) people don't | know what best practices are used in the industry, what a | pipeline is or what a dry-run is (I learned about it today | myself!). 
| | What happened is that no one knew how to react and I was | probably the best suited for it, we don't really have seniority | in office. | | That said when I deleted the videos I immediately told my boss. | He was kind of scared but his reaction was mostly "Well, now we | have to re-upload them immediately, find a way. The people that | uploaded them once won't be doing it twice". I was basically | left on my own to find a solution (which I luckily did). | | Please note that I'm in no way blaming my company or accusing | it of something, this is the standard knowledge base and way of | dealing with things in many places, contrary to what working in | big tech or reading HN might make you believe! | franciscop wrote: | Thanks for the explanation, that makes a lot of sense! | | > "HN is the top 1%" + "this is the standard knowledge base | and way of dealing with things in many places, contrary to | what working in big tech or reading HN might make you | believe!" | | I'm in fact from Spain and now live in Japan, and I believe | the practices in Spain would be as bad as Italy, and in Japan | they are def worse (great at hardware, horrible at software), | so I do understand a lot of what you are saying. FWIW, in | Spain I've seen whole dev teams composed only of interns! | | > "we landed a big contract for one of the biggest gym | companies in Italy, the UK and South Africa" + "we don't | really have seniority in office" | | Maybe now that seems like you have the budget it's a good | time to go to management and suggest to hire some senior devs | who can mentor the rest into learning best practices? You can | sell it like a reinvestment in the company to management if | they want to take it as pure profit. If Italy is like Spain, | many devs won't really even want to learn these things, but | some will and then those will become seniors at some point. 
| Sirikon wrote: | Everyone makes mistakes, juniors and seniors alike, but I | think you have the right mindset and problem-solving skills that | will make you thrive :) | ricardobayes wrote: | Any process that makes a junior directly access prod | codebase/database is flawed. No matter how small of a company you | are, you can set up a proper CI/CD pipeline. | thevinter wrote: | 90% of IT companies in Italy don't even know what a CI/CD | pipeline is. That said I don't think it's something we could've | integrated in our pipeline as it's an error that originated | from an external service! | Fritsdehacker wrote: | This is why you have backups. Good on you for having them! | | When I just started as a junior dev at a small company I made the | classic mistake of emptying the prod db instead of my local dev | db. This was a small and in hindsight insignificant project. But | Google was our customer, so it didn't feel insignificant at the | time. | | In this case my inexperience was partly my savior. All the data | was inputted by people via a web form. Normally you're supposed | to use POST to submit a form. But I was quite clueless at the | time, so I had used GET. This meant all requests were still in | the Apache logs. I could simply replay all requests. | | I still feel my heart pounding when I think about the moment I | realized what had happened. I was really relieved when everything | was back! | | What I learned from this incident: | | - make automated backups | | - no access to prod db from anywhere but prod | cassandratt wrote: | Yea, I've wiped out an entire government's form library once. | Backups are a career saver. | NikolaNovak wrote: | Honestly, this is positively representative of any junior | developer with comparable experience. Depending on their | background and how much production work they had, there's an | overwhelming sense of eagerness and enthusiasm. Quick to script | and perhaps a bit too quick to execute.
| | A friendly team will harness that enthusiasm and tame the | quickness / encourage respect for production. We all made a | massive doo doo and it's how you proceed that'll define your | career. | RcouF1uZ4gsC wrote: | This is one of those times that even if you don't use a fully | functional language, trying to make as much of your program logic | pure functions would be helpful. | | It also makes it more testable. Instead of putting the delete | call right in the loop, split it into four functions.
|     function getAllVimeoVideos()
|     function getAllDbVideos()
|     function getVideosToDelete(vimeo_videos, db_videos)
|     function deleteVideos(videos_to_delete)
| | Your core logic lives in getVideosToDelete which is simply a set | difference. | | Given that there are only a few hundred videos, it is easy to run | the getter functions above and quickly verify they are returning | what you expect. | acutis_fan wrote: | Yes, that's fun. A
|     List<Foo> getFoosToUpdate(List<Foo> foos, List<Bar> bars)
| | function was the first time I thought about time complexity in | my job. | | Say Foo and Bar have fields in common, such that you can say a | Foo object "equals" or "matches to" a Bar object, like if they | have name and dateOfBirth fields or something else that are the | same (nothing like a common ID between the two). Now say there | are some other fields too, like amountSpentThisYearOnDogFood | that you know is always accurate for Bars, but might be out of | date for Foos. How do you get the list of all the Foos to | update? | | Initially I did the nested for loop solution that's like
|     List<Foo> getFoosToUpdate(List<Foo> foos, List<Bar> bars)
|     {
|         List<Foo> returnList = new List<Foo>();
|         foreach (var foo in foos)
|         {
|             foreach (var bar in bars)
|             {
|                 // check if "equal" or "matching" based on some criteria
|                 // if equal, update foo dog food expenditure with bar dog
|                 // food expenditure, add to returnList, and break
|             }
|         }
|         return returnList;
|     }
| | but that's O(n^2), right?
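| | For the Vimeo case in the parent comment, the same idea (a hash-based lookup instead of a nested scan) can be sketched in a few lines of Python. This is only an illustration, not the script from the article: it assumes each video is identified by a string ID and that the intent is to delete videos present on Vimeo but absent from the database. The names get_videos_to_delete, delete_videos and dry_run are hypothetical, and the real delete call is deliberately left out.

```python
from typing import Iterable, List, Set


def get_videos_to_delete(vimeo_ids: Iterable[str], db_ids: Iterable[str]) -> Set[str]:
    """Pure function: IDs that exist on Vimeo but are not in the database."""
    # Set membership checks are O(1) on average, so the difference costs
    # roughly O(n + m) instead of the O(n * m) of a nested loop.
    return set(vimeo_ids) - set(db_ids)


def delete_videos(video_ids: Iterable[str], dry_run: bool = True) -> List[str]:
    """Return the IDs that would be deleted; act only when dry_run is False."""
    planned = sorted(video_ids)
    if not dry_run:
        # Hypothetical placeholder: the real API call would go here.
        raise NotImplementedError("wire in the real delete call")
    return planned


# Assumed sample data, standing in for the two getter functions.
vimeo = ["a", "b", "c", "d"]
db = ["b", "d"]

to_delete = get_videos_to_delete(vimeo, db)
print(delete_videos(to_delete))  # dry run: only lists what would be removed
```

Because get_videos_to_delete has no side effects, it can be unit tested and its result inspected before anything is removed, which is the dry run several commenters recommend.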
| | The solution with a Dictionary is obviously better. All you | need to ensure is that you have a method for both the Foo and | Bar classes that will produce the equivalent hash for both, if | they would be considered equal or matching by whatever criteria | you are using. | | So you could have something like
|     int GetHashOfFoo(Foo foo)
|     {
|         string firstName = foo.FirstName;
|         string lastName = foo.LastName;
|         DateTime dob = foo.Dob;
|         return (firstName, lastName, dob).GetHashCode(); // convenient c# method
|     }
|
|     int GetHashOfBar(Bar bar)
|     {
|         string firstName = bar.FirstName;
|         string lastName = bar.LastName;
|         DateTime dob = bar.Dob;
|         return (firstName, lastName, dob).GetHashCode();
|     }
| | These two functions will return the same value if those fields | are the same. So then you can do something like
|     List<Foo> getFoosToUpdate(List<Foo> foos, List<Bar> bars)
|     {
|         List<Foo> returnList = new List<Foo>();
|         // note: keying on the hash code alone means a hash collision
|         // between two different people would cause a false match
|         Dictionary<int, Bar> barsByHash = new Dictionary<int, Bar>(bars.Count);
|         foreach (var bar in bars)
|         {
|             int barHash = GetHashOfBar(bar);
|             barsByHash[barHash] = bar;
|         }
|         foreach (var foo in foos)
|         {
|             int fooHash = GetHashOfFoo(foo);
|             if (barsByHash.ContainsKey(fooHash))
|             {
|                 returnList.Add(foo.CopyWith(dogFoodExpenditure: barsByHash[fooHash].DogFoodExpenditure));
|             }
|         }
|         return returnList;
|     }
| | Which is faster because you only have to go through the bars list | once. | | I actually messed up something like OP with this, but with | doing undesired additions instead of undesired deletions. | | You can think of it as having two endpoints, both expecting a | .csv with rows being the things you were | updating/changing/deleting. | | The problem was, there was a column to indicate (with a | character) whether the row was for an edit, or addition, or | deletion, but this was only with one of these endpoints.
For | the other, there was only addition functionality, but I thought | changes and deletions were also options for the other kind of | .csv due to some unwise assumptions on my part (thinking that | the other .csv would have the same options as the other). | That's how we accidentally put in over 100 additions that | should have been changes that had to be manually deleted. | Luckily I had a list of all the mistaken additions. | tomhallett wrote: | This was going to be my exact recommendation. By "separating | the concerns", you make it easier on my pretty much every | dimension: testing in unit tests, doing a dry run in | production, ability to read the code (you and code reviews), | and in some cases your code will be written in a more | functional way reducing variable scoping issues. | DeathArrow wrote: | This wouldn't be an issues if providers like Vimeo would soft | delete and hard delete the items after a period of time, allowing | recovery between. | | Everywhere I have to implement a delete operation, I never hard | delete data on first call. | kirillzubovsky wrote: | Mistakes happen. Kudos to the author on taking it as a learning | opportunity. I am friends with a lot of smart devs, and many of | them have dropped a production db at least once, and if not then, | then accidentally emailed 10k people ...etc. It happens. Work to | avoid it, but plan for what to do when it inevitably happens. | -\\_(tsu)_/- ___________________________________________________________________ (page generated 2022-05-05 23:00 UTC)