[HN Gopher] Tell HN: I salute everyone on call/working support t...
___________________________________________________________________
Tell HN: I salute everyone on call/working support through the holidays

Thank you for keeping systems available and safe. I've been there many
times in the past, including having to fly at the last minute to a
non-internet-connected data center in NJ to babysit an emergency
production bug fix that took the entire holiday to create, install,
verify, and monitor.

Author : waynesoftware
Score  : 370 points
Date   : 2023-12-21 18:54 UTC (4 hours ago)

| maerF0x0 wrote:
| Yes, absolutely, thanks to all who keep our world running when no
| one is looking. To keep the yule log on YouTube, to keep our
| Christmas tree lights on, to keep a fresh glass of water from the
| tap, warm natural gas to keep the freezing cold outside etc.
| Thank you for keeping society ticking away :)
| akomtu wrote:
| Let's not confuse on-call firefighters or a water facility
| staff with the on-call admins that maintain money-making
| machines monetizing the attention of billions. The latter is a net
| negative on society.
| kridsdale1 wrote:
| Not a fan of the YouTube Yule Log, I see.
| krallja wrote:
| Yeah, how dare Netflix provide entertainment on-demand and
| for cheaper than the other entertainment companies?
| op00to wrote:
| I am currently viewing this on ethically sourced rfc1149
| (birds gave consent via a scientifically proven "brain
| electrode interface"), manually decoding packets using an
| abacus made out of various animal droppings foraged on the
| forest floor. If I can't view your content this way, it
| should not be on the internet.
| oceanplexian wrote:
| That firefighter is probably using YouTube or scrolling
| through Instagram to unwind while they're stuck at the
| station waiting for a call. Just because someone works in
| entertainment or ads doesn't mean that the economic puzzle
| piece they represent isn't valuable to society.
| kridsdale1 wrote:
| I'll give a shout out too to everyone in the military
| monitoring warning systems and maintaining stance to protect us
| from being killed while we're with our families.
| sleazebreeze wrote:
| My wakeup alarm this morning was 9am when OpsGenie let me know
| I'm on-call today. Praying for peace.
| frakt0x90 wrote:
| In a similar vein, I'm grateful for the people who maintain the
| foundational pieces of our digital world that often go unnoticed
| like date & time systems.
| isoprophlex wrote:
| Big up the on call heroes! Hope you're getting paid well, hope
| you get no red lights on the bug hotlines.
| chasd00 wrote:
| > Thank you for keeping systems available and safe.
|
| There's that word "safe" again. What systems are dangerous
| otherwise? Do you mean like traffic lights or something? The API
| serving ads to your mobile game isn't dangerous.
| dotnet00 wrote:
| 'Safe' in the context of systems can mean safe from hacking
| attempts, data leaks, and other emergencies relating to the
| system that may arise. It can refer to things that are
| dangerous for the system itself.
| tecleandor wrote:
| On call till 31st, so please don't hit refresh too much these days
| ;-)
| Scoundreller wrote:
| Me too, but they pay a few bucks an hour to carry the phone so
| at least that adds up.
|
| Ultimate trick is to have a diverse team. Someone that doesn't
| care about Christmas but absolutely needs some random day off
| in March (cool with us!). Someone that celebrates New Year's
| some other time.
| kridsdale1 wrote:
| Speaking of diversity, if you don't do Christmas dinner I
| strongly recommend ordering takeout from a Chinese place on
| the 25th! There are lots of happy photos of Chinese chefs and
| Jewish customers doing a Christmas fist-bump.
| rollcat wrote:
| Two live video productions, including one on the evening of 31st.
| I managed to push back on last minute infra/workflow changes (:
| mkhnews wrote:
| Thanks, been there many seasons, and same to you all.
| datpuz wrote:
| For some of us, we look forward to the peace and quiet
| tetha wrote:
| Yeah, we discourage production changes starting the first or second
| week of December, and start freezing changes the third week
| until it's frozen solid from the fourth week of December until the
| second week of January.
|
| December tends to be hell for our customers, so stability should
| be a priority there.
|
| And honestly, no one wants to work on holidays. So let's just wrap
| everything starting in December, maybe use the third week for
| some unnoticed issues and then just lay down the tools. Use that
| time for documentation, or shorter days, quite frankly.
|
| That way we minimize the on-call situations occurring. Let's hope
| it goes well for the engineer this year as well. We have a streak
| to keep.
| ainiriand wrote:
| We do the same, I work in logistics software and we usually
| freeze early November up until Christmas.
| bertil wrote:
| I think that's a great policy as it's clearly intended to help
| people when they need it, and get people to unplug when it's
| valued by their loved ones.
|
| _However_ (that part is probably best bookmarked until Jan
| 2nd), it also betrays that your system is brittle and can be
| broken by a bad commit. Don't do it because you want people to
| grind until Dec 24th at 6 pm. Do it because it's great the rest
| of the year, too. I'd recommend you look into (or ask me about)
| feature flags, alerting, and automated roll-backs.
|
| The short version is: there's a meta-system on top of your
| release process that can tell (if you are using roll-back, not
| feature flags):
| - commits until xyzsdf are fine;
| - roll-outs starting from commit abcdef have a 2% error rate,
|   80% on Android;
| - revert to xyzsdf, send a message (low-priority, email) to the
|   DevOps on call and the author of abcdef that it happened;
| - for all commits after abcdef: if there are no conflicts
|   with xyzsdf, re-try to roll them out;
| - if there is a conflict because they were on top of abcdef,
|   send a message (low-priority email) to the authors that there
|   is a conflict.
|
| There are more sophisticated versions that can do things like,
| if you use feature flags, flagging Android users to use the
| previous version. Another way to do this is to scale who has
| access to abcdef gradually: say 1% every hour, and revert if
| you detect issues.
|
| All those seem daunting to teams that haven't worked like this
| before, but in my experience, they come to love it very fast.
| yardstick wrote:
| How do you detect errors like this?
|
| What is an error? Is a business logic bug going to be picked
| up by this process automatically, or are some manual steps
| involved?
|
| I.e., a point of sale app releases an update that automatically
| halves the amount to charge, but displays the full amount to
| the merchant in the UI. Unit tests pass (because an engineer
| made a human mistake). Backend calls are correctly used, no
| errors thrown, simply the wrong amount is used.
|
| How would this be automatically detected and reverted?
|
| Would anyone writing point of sale software want to risk this
| over one of the biggest trading periods of the year?
| codebolt wrote:
| Yeah, that model may work for many public facing apps, but
| probably less so for enterprise systems that are heavy in
| business logic.
| bertil wrote:
| As you point out, it really depends on what is an error.
| Most of the companies I know of that have a holiday freeze are
| video games, casual ones, even. Changes are minor fixes and
| optimization--glitches that a player likely won't notice,
| but you want to detect them early to avoid losing your
| ability to detect more.
|
| Back-end tools are different, and I definitely see reasons
| other than bugs to not change business logic this month.
| tetha wrote:
| We use these systems liberally at other times of the year and
| no one notices, usually. If they do, downtime and
| interruption budgets handle this.
|
| /However/, let me counter with the point: Just one of our
| customers has 8000 FTEs working with our system. During
| hell-time (aka, December and Christmas shopping and shipping),
| each of those dudes spends their shift taking customer calls
| lasting 2-4 minutes, which in turn require a few requests
| into our systems.
|
| Due to the stress of their customers^2 (because it's
| Christmas and holidays and such), if an agent of a customer
| is unable to access our systems, they cannot handle the use
| case of the customer^2 and that will piss off the customer of
| the customer.
|
| So if we push a bad change during this time, we're going to
| piss off hundreds of customers^2 per minute for that one
| customer alone. Even with a fast automatic rollback, that's a
| long time during hell-time. And they have people who know how
| to yell at vendors in nasty ways who don't like that.
|
| I enjoy moving software fast and enabling moving software
| quickly, but customer focus and customer orientation means
| understanding when to move slow as well.
|
| And hey, if that means more quiet holidays for the hard
| working operators on my team, who's gonna complain?
| bertil wrote:
| You are a lot further ahead than most companies.
|
| I've worked for too many places where the Christmas break
| was because of a lack of tooling. I'm glad you are two
| steps ahead.
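
The rollback steps bertil lists in this subthread (find the first bad commit, revert to the last good one, then retry or flag the later commits) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not anyone's actual tooling: the `Rollout` and `RollbackPlan` types, the `applies_cleanly` callback, the 1% default threshold, and the function names are all hypothetical. It only looks at error rates, so it would not catch the silent business-logic bug yardstick describes.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Rollout:
    commit: str        # commit identifier; the input list is ordered oldest first
    error_rate: float  # observed fraction of failing requests (0.02 means 2%)


@dataclass
class RollbackPlan:
    revert_to: str                                   # last commit observed healthy
    culprit: str                                     # first commit over the threshold
    retry: List[str] = field(default_factory=list)   # later commits to roll out again
    notify: List[str] = field(default_factory=list)  # later commits that conflict


def plan_rollback(
    rollouts: List[Rollout],
    applies_cleanly: Callable[[str, str], bool],
    threshold: float = 0.01,  # hypothetical default; tune per system
) -> Optional[RollbackPlan]:
    """Return a rollback plan if any rollout exceeds the threshold, else None."""
    for i, rollout in enumerate(rollouts):
        if rollout.error_rate <= threshold:
            continue
        if i == 0:
            raise ValueError("no known-good commit to revert to")
        plan = RollbackPlan(revert_to=rollouts[i - 1].commit, culprit=rollout.commit)
        # Commits after the culprit are retried only if they apply on the good base.
        for later in rollouts[i + 1:]:
            if applies_cleanly(later.commit, plan.revert_to):
                plan.retry.append(later.commit)
            else:
                plan.notify.append(later.commit)
        return plan
    return None


def next_exposure(current_pct: int, healthy: bool, step: int = 1) -> int:
    """Gradual rollout: widen access by `step` percent, or drop to 0 on issues."""
    return min(100, current_pct + step) if healthy else 0
```

With a history of `xyzsdf` (healthy), `abcdef` (2% errors), and two later commits, `plan_rollback` would name `xyzsdf` as the revert target and sort the later commits into retry and notify lists according to whatever `applies_cleanly` reports. Calling `next_exposure` once per hour mirrors the "1% every hour, and revert if you detect issues" variant.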
| ok_dad wrote:
| The place I work for pushed v2 of their software, a full
| rewrite (nothing from the old system, not even databases) by a
| new team, into production this week for several customers.
| Mostly they did it so they could say they met their made-up
| 2023 KPIs for the v2 rewrite. There was no good reason to push
| it out now other than that, and there were several reasons not
| to, such as it wasn't well tested and it's fucking December
| 20th. Anyways, I'm not really on call so I can't complain much,
| but my poor coworkers have to support this over the holidays
| now.
| hotsauceror wrote:
| Ugh. Several years ago I spent an entire Christmas vacation,
| including all day Christmas Day, putting out fires because a
| team couldn't be bothered to do five minutes of cursory load
| testing. As a consequence, multiple production systems went
| down under load.
|
| Later, after we regrouped after a month of this brutality,
| they wandered around the office bragging like they'd hung the
| fucking moon after they fixed the crippling, obvious design
| issue they'd released. I confronted the dev lead with the
| fact that they would have seen this after 30s of load testing
| and he just laughed, I think he literally said "LOL". A giant
| middle finger, that's what Ops got from Dev for Christmas
| that year.
|
| Here's to the people who KTLO. My people.
| lynx23 wrote:
| I announced a downtime for a smallish GPU Cluster starting from
| Christmas Eve just a few hours ago. It is just the perfect time
| to schedule a day or two of downtime for a system like that. And
| if IPMI doesn't fail me, I can get a lot of things done without
| leaving the comfort of my home. I scheduled this without pressure
| from my boss. It was a totally voluntary decision... Although I was
| raised as a Christian, this time of the year is for me more about
| the solstice than about the Christian celebration. A time to enjoy
| the comfort of a heated home.
| A time to celebrate that the days
| are going to be longer from now on again. A time to reflect on
| the past year. And all of this is easily done while having a few
| terminals open and waiting for remote stuff to complete...
| wavemode wrote:
| Just barely started my current job too recently to be in the
| on-call rotation yet. Lucked out! Props to those keeping the wheels
| turning.
| RationPhantoms wrote:
| As if this week had attempted to take a measure of blood from my
| body, I'll be on-call next week. Looking forward to all things
| quiet on the HEP network front.
| 6stringmerc wrote:
| Agreed, there are some gigs that just really require support to
| exist - I know this first-hand from working at a Zoo (very large
| exotic animal rescue basically). Animals do not take holidays.
| They need to eat and do animal things in spite of our costumes
| that day.
|
| On the flip side, having worked Cinema on Christmas Day two years
| I think, there is no amount of Grace and Patience I can give that
| is enough to those earning their living. Still have a hat and
| polo. Why? I had to buy them!
| blast wrote:
| https://www.youtube.com/watch?v=zB1T3zgne5Y
| bertil wrote:
| Always be kind, and say it's your fault.
|
| If you don't do it for the sake of the person you are asking for
| help, do it because it works better. That's the most practical
| advice [0] ever given by Hans Rosling [1], the Fact master
| himself:
|
| > In fact, I have the secret to how to get the best help
| immediately from any customer service, like the phone company or
| the bank or anything. I have the best line, it always works. You
| want to know what it is? When I call, I say, "Hello. I am Hans
| Rosling and I have made a mistake." People immediately want to
| help you when you put it this way. You get much more when you
| don't offend people.
|
| [0]: Unless you are in charge of a developing country's budget
| and have to decide between education and healthcare.
|
| [1]: https://blog.ted.com/qa_with_hans_ro_1/
| chunkymilk wrote:
| > Always be kind, and say it's your fault.
|
| I do this with internal teams at work. I've found approaching
| other teams with issues with their library/framework in a "this
| could be our mistake" manner really helps in keeping them from
| getting defensive and stonewalling.
| steve_adams_86 wrote:
| I do something similar. Hey, I'm pretty sure I'm doing
| something wrong. Can you help me figure it out?
|
| Then be grateful for the help, because it truly isn't granted
| or a given that people have to drop everything and figure
| things out for you, even if you work together. And even if
| the mistake was actually theirs. Gratitude is huge.
| smoyer wrote:
| I'm going to try that but will need more information about
| Hans Rosling to get through the identity verification ...
| caminante wrote:
| You forgot the rest of the story!
|
| _> "Hello. I am Hans Rosling and I have made a mistake."_
|
| continues on as...
|
| _> "I foolishly chose to rely on <insert your service>. I just
| spent 7 minutes hopping around your phone tree with dead-end
| voicemail terminuses and an outdated monologue starting with
| "Due to high call volumes..." that has been running since
| before Covid19. Finally, I found the right combination to talk
| to you. A human! There's no option to cancel my service online
| and the help menu threw an error after filling out a detailed
| form. Can I please reset my password?"_
| whalesalad wrote:
| Meanwhile a huge number of us (non-religious? introverted kernel
| compiling cave dwellers?) treat this period no differently than
| any other week in the year. I'll be here keepin the servers
| runnin :horns:
|
| It's actually my favorite time of the year. Everyone is gone, it
| is quiet, and I can get shit done.
| kaashif wrote:
| > non-religious
|
| Or a member of one of the religions that don't celebrate
| Christmas.
| muzani wrote:
| It's me.
| But we still have a holiday period at the end of the
| year - normally financial targets are hit and it's a 4-day
| leave to get 10 days off.
| sneak wrote:
| Holidays are special because they're special, both the winter
| solstice festival (rebranded for Christianity) and the spring
| equinox one (same deal) can be treated differently for cultural
| variety by the non-observant.
|
| I'm a militant proselytizing atheist raised by a Jew and I
| still have a tree with pretty lights, give presents, and drink
| and eat some things I only drink/eat once per year (never make
| homemade eggnog if you ever want to enjoy it guilt free again,
| you're basically drinking a megacalorie of heavy cream, yum).
| It's fun to celebrate the generic concept of "holiday" - a time
| that is different from other times.
|
| You're allowed to feel nice about peppermint candy (and/or
| chocolate gelt, I go for both) at the end of December without
| bringing the supernatural into the equation. :)
|
| \m/
| whalesalad wrote:
| Oh ya same I love the smell of evergreen wreaths and trees
| and enjoy partaking in festive activities. A 4K cracklin'
| Yule log goes a long way too.
| iddan wrote:
| FYI Israelis are not on holiday - our holidays are on whole
| different dates. Hire Israelis and experience no down time while
| working with Silicon Valley level talent
| liorsbg wrote:
| True story
| loloquwowndueo wrote:
| Just hope stuff doesn't break on sabbath, they ain't touching
| no computer that day :)
| INTPenis wrote:
| They'd need a goy on-call to flick the switch.
| cco wrote:
| I'm not sure if there is something in the water this year, but
| this week, Dec 18th to Dec 21st (only a partial week), has been
| our busiest week of all time already.
|
| Sweating over here trying to make it through the week and praying
| that it slows at least for the first half of next week.
| yardstick wrote:
| Just remember this time of year is often peak vulnerability time.
| When attackers exploit that teams are at reduced strength and off
| guard. Slower response times to investigate and fix issues etc.
| timwaagh wrote:
| I wouldn't mind honestly. Seems like a good excuse to skip the
| social obligations.
| smoyer wrote:
| I'm on call 12 hours a day and hoping things are very quiet next
| week. Best wishes to everyone else too!
| acedTrex wrote:
| We've been on freeze for weeks now in preparation for the holiday
| season.
| er0k wrote:
| Thanks for the salute, but we also accept cash :)
| mateusfreira wrote:
| Thanks for all the great work; I hope no one has an outage this
| holiday and has time to enjoy family and time alone.
|
| Keep up the good work, folks
| comprev wrote:
| I salute those in the startup world - the ones in a team of 5 and
| they're the only Ops person who always gets paged.
|
| Been there, done that.
| itqwertz wrote:
| Holidays are excellent times for hackers to take advantage. It's
| not just Christmas or other Western holidays, either. Extend this
| principle to any holiday/world conflict/anniversary of conflict
| made into holiday/calendar new year and then adjust your time of
| attack.
|
| protip: US companies with offshore groups are usually
| underfunded, understaffed, and underskilled. Time to see if that
| disaster recovery environment works!
|
| Happy holidays to those who encounter system stress tests. Can't
| spell salary without some elements of slavery...
| dallas wrote:
| Managers, if you're reading this, and you have
| engineers/developers "on call" but not contractually (off book),
| make it so, because it slightly sucks when you're having
| Christmas drinks but can't enjoy yourself because you might need
| to drive somewhere and climb up a ladder to tend to a product.
___________________________________________________________________
(page generated 2023-12-21 23:00 UTC)