[HN Gopher] I booted Linux 293k times in 21 hours ___________________________________________________________________ I booted Linux 293k times in 21 hours Author : jandeboevrie Score : 560 points Date : 2023-06-14 13:54 UTC (9 hours ago) (HTM) web link (rwmj.wordpress.com) (TXT) w3m dump (rwmj.wordpress.com) | ineedasername wrote: | If there is a platonic ideal of 'uptime' then this has got to be | its opposite. | w-m wrote: | 292,612 is not an interesting number, it's not contained in any | known integer sequence. The search in OEIS only brings up | sequence A292612 | (https://oeis.org/search?q=292612&fmt=data&sort=number). | akira2501 wrote: | 2 * 2 * 191 * 383 | | Which is mildly interesting. | w-m wrote: | Neat indeed. | | 3 * 2 ^ {0, 0, 6, 7} - 1 | | And all of them are palindromes. | jwilk wrote: | For people confused by the above notation: | | 2 = 3 x 2^0 - 1 | | 191 = 3 x 2^6 - 1 | | 383 = 3 x 2^7 - 1 | high_pathetic wrote: | > it's not contained in any known integer sequence | | I think this makes it interesting! | w-m wrote: | Ah yes, the good old | https://en.wikipedia.org/wiki/Interesting_number_paradox | Dylan16807 wrote: | Then your standard is too low. | | And I mean that objectively. That standard would not allow an | uninteresting number. | adverbly wrote: | Make sure you add it to the integration test suite so it doesn't | get re-introduced later ;) | vintagedave wrote: | > I found the culprit, a regression in the printk time feature: | https://lkml.org/lkml/2023/6/13/733 | | The issue hasn't been fixed yet, but if it affects you the | proximate cause is known and can be reverted locally. | efitz wrote: | I told him not to turn on Windows Update. | Laremere wrote: | Here they mention that each bisect ran a large number of times to | try and catch the rare failure. Reminds me of a previous | experience: | | We had a large integration test suite. It made calls to an | external service, and took ~45 minutes to fully run. 
Since it | needed an exclusive lock on an external account, it could only | run a few tests at a time. We started getting random failures, so | we were in a tough spot: bisecting didn't work because the | failure wasn't consistent, and you couldn't run a single version | of a test enough times to verify that a given version definitely | did or didn't have the failure in any practical way. I ended up | triggering a spread of runs overnight, and then used Bayesian | statistics to home in on where the failure was introduced. I felt | mighty proud about figuring that out. | | Unfortunately, it turns out the tests were more likely to pass at | night when the systems were under less strain, so my prior for | the failure rate was off and all the math afterwards pointed to | the wrong range of commits. | | Ultimately, the breakage got worse and I just read through a | large number of changes trying to find a likely culprit. After | finally finding the change, I went to fix it only to see that the | breakage had been fixed by a different team an hour or so before. | It turned out to be one of our dependencies turning on a feature | by slowly increasing the probability it was used. So when the | feature was on it broke our tests. | ambicapter wrote: | > Ultimately, it turned out to be one of our dependencies | turning on a feature by slowly increasing the probability it | was used. | | Wow. I feel like this dependency should be named and shamed. | thehappypm wrote: | Isn't this how multi-armed bandits work? | rootsudo wrote: | algo 101, but I can see how it can be nifty for | $internalapp. | Laremere wrote: | Big company internal dependency. So nothing for the public to | care about. | vamega wrote: | What company? I've seen this being done (and my team does | it a lot at Amazon) but curious to know if others are doing | it at build time too. 
| | If done in a company with a monorepo I'd be especially | interested in hearing more | aeyes wrote: | > If done in a company with a monorepo I'd be especially | interested in hearing more | | Are there any big companies left which haven't adopted a | monorepo? | [deleted] | PartiallyTyped wrote: | AWS. We probably have the worst build systems :( | n49o7 wrote: | Probabilistic feature flags! Love it. | thehappypm wrote: | Multi-armed bandits utilize this | Thorrez wrote: | Always base the probability on something stable, such as hash | of the username. | IshKebab wrote: | Bug report: changing my username breaks $product. | | Yeah no thanks. It's probably better than completely random | but software should be predictable and unsurprising. | btilly wrote: | I've used the hash of username+string trick before for a | flag. I used it to replace a home-grown heavyweight A/B | testing framework which had turned into a performance | bottleneck. | | It worked quite well. | burnished wrote: | The important part is the stability - if your usernames | can change then they aren't stable so you don't select | it. | | I think it is a good reminder that most things you think | of as being unchanging that are also directly related to | a person.. aren't unchanging. Or at least any conceivable | attribute probably has some compelling reason why some | one will need to change it. | dietr1ch wrote: | That's why you have internal user ids instead of using | data directly provided by users. | | Will it cost an extra lookup? It's cheap, and if you | really need to, you could embed the lookup in some | encrypted cookie so you can verify you approved some | name->id mapping recently without doing a lookup. | robocat wrote: | > changing my username breaks $product. | | https://m.youtube.com/watch?v=r-TLSBdHe1A&t=14m10s | | Discussing a performance regression due to longer | username due to username being in ENVIRONMENT variable | which changes memory layout of process. 
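The stable-hash rollout Thorrez and btilly describe above can be sketched in a few lines. This is a hypothetical illustration, not any particular company's flag system; the function name and bucketing scheme are made up for the example:

```python
import hashlib

def in_rollout(user_id, feature, rollout_pct):
    """Deterministically decide whether a feature is on for this user.

    Hashing a stable user id together with the feature name gives
    every flag its own stable bucket: a user's assignment never flips
    between requests, and raising the dial from 10% to 20% only adds
    users, it never removes any.
    """
    digest = hashlib.sha256(f"{feature}:{user_id}".encode()).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64  # uniform in [0, 1)
    return bucket < rollout_pct
```

Mixing the feature name into the hash is what keeps flags independent: a user who lands in the first 50% for one feature isn't automatically in the first 50% for every other feature. As dietr1ch notes, hashing an internal user id rather than a mutable username avoids the "changing my username breaks $product" failure mode.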
| [deleted] | painted-now wrote: | Man, this story sounds like you could be on my team :-) Pretty | much experienced the same stuff working at BigCo! | | In the end, I think the real problem is that you can't test all | combinations of experiments. I don't trust "all off" or "all | on" testing. In my book, you should indeed sample from the true | distribution of experiments that real users see. Yes, you get | flaky tests, but you also actually test what matters most, i.e. | what users will - statistically - see. | joosters wrote: | This sounds like a situation that would benefit from using an | approach like all-pairs testing - | https://en.wikipedia.org/wiki/All-pairs_testing | | Basically, if you have N different features (let's assume | they are all on/off switches, but it works for multi-values | too), in theory you'd need to run 2^N tests to cover them | all, which would become completely impractical. But, you can | generate a far, far smaller set of test setups that guarantee | that every pair of features gets tested together. Run those | tests and you'll probably encounter most feature-interaction | bugs in a much quicker time. | cscheid wrote: | All-pairs is for _pairs of features_. For subsets you're in | much deeper trouble because of the exponential dependence | on N. For a fixed polynomial dependence, you can get clever | and let tail bounds eventually work for you, but for | exponentially growing hypothesis sets, that won't work. | yojo wrote: | Yikes! | | FWIW, I think best practice here is to hardcode all feature | flags to off in the integration test suite, unless explicitly | overwritten in a test. Otherwise you risk exactly these sorts | of heisenbugs. | | At a BigCo that's probably going to require coordinating with | an internal tools team, but worth getting it on their backlog. | All tests should be as deterministic as possible, and this goes | double for integration tests that can flake for reasons outside | of the code. 
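joosters' all-pairs idea above can be made concrete with a small greedy construction. This is a toy sketch assuming pure on/off flags (real tools such as PICT do this far more carefully), not any production test harness:

```python
import random
from itertools import combinations, product

def pairwise_suite(n_flags, seed=0):
    """Greedily build a small set of flag configurations such that every
    pair of flags is exercised in all four on/off combinations."""
    rng = random.Random(seed)
    # every (flag i, flag j, value a, value b) combination still to cover
    uncovered = {(i, j, a, b)
                 for i, j in combinations(range(n_flags), 2)
                 for a, b in product((0, 1), repeat=2)}
    suite = []
    while uncovered:
        # sample random candidates, keep whichever covers the most new pairs
        best, best_gain = None, -1
        for _ in range(50):
            cfg = tuple(rng.randint(0, 1) for _ in range(n_flags))
            gain = sum(cfg[i] == a and cfg[j] == b
                       for i, j, a, b in uncovered)
            if gain > best_gain:
                best, best_gain = cfg, gain
        if best_gain == 0:
            # random sampling missed every remaining pair; force-cover one
            i, j, a, b = next(iter(uncovered))
            cfg = list(best)
            cfg[i], cfg[j] = a, b
            best = tuple(cfg)
        suite.append(best)
        uncovered -= {(i, j, a, b) for i, j, a, b in uncovered
                      if best[i] == a and best[j] == b}
    return suite
```

For 10 flags this yields on the order of ten configurations instead of 2^10 = 1024, at the cost of only guaranteeing pairwise coverage; higher-order interactions stay exponential, which is the limitation cscheid points out.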
| btilly wrote: | No, the best practice is that on each test run, every feature | flag used implicitly or explicitly needs to be captured AND | it must be possible to re-run the test with the same set of | feature flags. | | That way when you get a failure, you can reproduce it. And | then one of the easy things to do is test which features may | have contributed to it. | nosefrog wrote: | But then you won't catch the bug before it hits production :) | dmoy wrote: | Also you end up with some strange long term test behavior. | Because people will often leave feature flags in place long | after full release ( _years_ sometimes), you end up with a | default-off-in-tests only testing behavior with everything | newer than N years since the last feature flag cleanup | disabled. | | Yes it's kinda fractal of bad practices that have to align | for this problem to occur, but that's the nature of tech | debt. | linuxdude314 wrote: | You are both misunderstanding the post. | | He's not saying to alter any of the feature flags used | for the test, but simply to record which were used during | the test. | | Simply logging doesn't introduce any of the issues you | are describing. | ASinclair wrote: | This is my daily life at BigCo. These bugs are the worst. | anotherhue wrote: | Excellent 'obsessed detective' story | hinkley wrote: | I used to think I was amazing at performance tuning and | debugging but after working with a few hundred different people | it turns out I'm just really fucking stubborn. I am not going | to shrug at this bug again. You are going down. I do have a | better way of processing concurrency information in my head, | but the rest is just elbow grease. | | I had a friend in college who was dumb as a post but could | study like nobody's business. Some of us skated through, some | of us earned our degree, but he really _earned_ his. 
We became | friends over computer games and for a long time I wondered if | games and fiction were the only things we had in common. Turns | out there's maybe more to that story than I thought at the | time. | allenrb wrote: | I think you're absolutely right. Some of the things I've been | most proud of have been products of stubbornly refusing to | give up. On the other hand, some vast oceans of wasted time | have been another result. It's tricky to know _when_ to be | tenacious! | hinkley wrote: | In my defense, I am a strong proponent of refactoring to | make all problems shallow. So there are classes of bug that | I will see before anyone else because I move the related | bits around and it becomes obvious that there are missing | modes in the decision tree. | | I tend to believe that discipline and tenacity are separate | traits. Often appearing in the same people, but different | skills with different exercises. | allenrb wrote: | Bingo, that is very well put. Discipline is where I'll | tend to fall short. :-) | 7ewis wrote: | Reminds me how cosmic rays were noted to have caused computer | glitches. [0] | | Impressive that they managed to discover this bug. | | [0] - https://www.bbc.com/future/article/20221011-how-space- | weathe... | Musky wrote: | In the speed running community there is a pretty famous clip | [0], where a glitch caused a Super Mario speed runner to | suddenly teleport to the platform above him, saving him some | valuable time. | | Of course people tried to find ways to reproduce the bug | reliably, as saving even milliseconds can mean everything in a | speed run. They went as far as replicating the state of the | game from the original occurrence 1:1, but AFAIK no one has | been able to reproduce the glitch without messing with the | games memory. | | For that reason it is speculated that a cosmic ray caused a | bit-flip in the byte that stores the players y coordinate, | shooting him up into the air and onto the next platform. 
| | [0] - https://youtu.be/o3Cx2wmFyQQ?t=16 | DerekBickerton wrote: | Before clicking I thought someone kept note of how many times | Linux booted in regard to their computing habits, and not testing | software. I know for me I boot roughly 3 times a day into | different machines, do my work, shutdown, then rinse & repeat. | | Then you have those types who put their machine into | hibernate/sleep with 100+ Chrome tabs open and never do a full | boot ritual. Boggles my mind that people do that. | Tubru3dhb22 wrote: | > Boggles my mind that people do that. | | Why? I only restart my (linux) laptop every 3-4 months when I | update software. | | I can't think of any downside that I've experienced from this | practice. I do a lot of work with data loaded in a REPL, so | it's certainly saved me time having everything restored to as I | left it. | bbarn wrote: | I had a developer that I inherited from a previous manager some | years ago. Made tons of excuses about his machine, the | complexity of the problem, etc. I offered to check his machine | out and he refused because it had "private stuff" on it. He had | the same machine as the rest of the team, so since he hadn't | made a commit in two weeks on a relatively simple problem, | refused help from anyone, etc., we ultimately let him go. | | When we looked at his PC to see if there was anything useful | from the project, his browser had around a thousand tabs open. | Probably 80% of them were duplicates of other tabs, linking to | the same couple stack overflow and C# sites for really basic | stuff. The other 20% were... definitely "private stuff". | hinkley wrote: | I'm at the other extreme of "private stuff". Nothing work | related should live on my work machine. It should all be | pushed to git or dumped in the wiki (personal pages if | nothing else). | | On one of my largest projects the IT dept made bulk orders | for hardware and doled them out to new hires. 
18 months into | our new project someone's hard drive died. | | Everyone acted like his dog died. I said no problem let's go | through the onboarding docs. The longest step by far was that | the company mandated Whole Disk Encryption but IT hadn't put | it in their old inventory yet. So that was 2/3 of setup time. | We found some issues with the docs and fixed them. | | Every two to four weeks that summer, someone else's drive | would go. You see, we got all of these machines from the same | production run. So the hard drives came from the same | production run, which was apparently faulty. The process got | a little faster as we went. By the end of the summer it was | my turn, and people still looked at me like I needed | condolences. I got a faster machine for a few hours worth of | work. I'm not sad. All my stuff was in the network already. I | lost a couple hours' of work, tops. | opello wrote: | This is the best way to reduce bus factor and not fall | behind documenting key details! | teachrdan wrote: | > Nothing work related should live on my work machine. | | I thought this was a typo at first. Love this as an | engineering koan. | noSyncCloud wrote: | And the corollary - nothing personal should be on your | work machine, either | canucker2016 wrote: | "Nothing work related should live ONLY on my work | machine." is the intent. | sureglymop wrote: | He was let go after two weeks? No confrontation nothing? | | Sounds very american. In European working culture if you | don't show up for two weeks people will be worried that | something happened to you and try to work it out with you. | This type of all or nothing reaction is a bit sporadic imo. | mikestew wrote: | _Sounds very american._ | | Yeah, it's not like that part of the story was condensed | and might have left out a bunch of details that weren't | important to the story. So let's give OP a hard time and | make judgements about a situation for which we have not | even the slightest bit of context. 
| sureglymop wrote: | Oh absolutely, you're right. I am saying that despite | whatever may have happened, two weeks is very short. I | feel like it would be at least a month here regardless. | RandallBrown wrote: | He was let go after two weeks of not doing any work, | despite the manager offering to help him. | JohnFen wrote: | > he refused because it had "private stuff" on it. | | There's a huge red flag. "Private stuff" (embarrassing or | otherwise) shouldn't be on company machines in the first | place. | dijit wrote: | I agree completely. | | However if anyone touches my computer: don't you dare | f*%king touch my private key. | | (ditto for my browsers sessions database, google cloud | credentials directory etc;) | | I'm paranoid about it, but not enough to buy a yubikey, | apparently. | lostlogin wrote: | > However if anyone touches my computer: don't you dare | f*%king touch my private key. | | Touch the computer, sure, but please don't touch the | screen with your filthy grease fingers. | mdpye wrote: | My work laptop has a touchscreen. I've never used it, but | other people use it by accident fairly often. Usually | only once each though, the look of shock is sometimes | even worth the fingerprint :D | JohnFen wrote: | I'm unusually strict about maintaining a separation | between work and personal (for instance, I would never | allow my personal smartphone to connect to my employer's | WiFi), so I wouldn't use personal keys on a work machine | at all. | | But if those keys (or passwords, etc.) are generated for | work purposes, I consider them to be as much company | property as the machine itself, so I'm no more protective | of them than I am of any other sensitive company data. | dijit wrote: | Interesting thought. | | How do you feel about giving your colleague your | password? 
| | My personal opinion is that I can hold someone legally | culpable if _their account_ does something like leak | financial information; you have a professional | responsibility to secure your account from absolutely | everyone. | | Administrators acting on your account must of course be | heavily logged and audited, which is the case. | JohnFen wrote: | > How do you feel about giving your colleague your | password? | | I usually don't, mostly just out of good security habits, | but also because most employers specifically prohibit | doing that. | | Almost always, your colleague can be given his own access | to whatever the password is for anyway. If that's not | possible, then I'll share the password and change it | immediately after my colleague doesn't need access | anymore. | | > you have a professional responsibility to secure your | account from absolutely everyone. | | I agree -- that's part of treating credentials the same | way as all other sensitive company data. But it's still | my employer's data, not mine. | | If I quit the company or if my supervisor wants to see | the contents of my machine, I'm fine with that. The | machine and everything on it belongs to the company | anyway. | chucksmash wrote: | > If I quit the company or if my supervisor wants to see | the contents of my machine, I'm fine with that. The | machine and everything on it belongs to the company | anyway. | | I'm fine with that, but I still will not share my | passwords. I'd be happy to reset the passwords for them | if they can't access the data by other means, but as | another commenter pointed out, the fact that anything | needs to be recovered from my^H^H _not my_ laptop | indicates mistakes were made. | StillBored wrote: | Isn't this largely the point of company directory | services? The machines/routers/applications/etc are all | doing their authentication against the directory service, | and permissions are granted and revoked there. 
Its a | large part of running a company with more than a couple | employees because when someone leaves you don't need to | run around changing passwords and wondering if they still | have access to the AWS account to spin stuff up, or punch | through the VPN. The account in the directory service is | just deactivated and with it all access. | | By default this should be what is happening on all but | the most ephemeral of machines/testing platforms/etc. And | even then if its a formal testing system it should | probably be integrated too. | | Directory service integration BTW is the one feature that | clearly delineates enterprise products from the rest. | dijit wrote: | Ok, but your private key, session tokens and CLI access | tokens (kube configs, gcloud etc;) _are_ your password in | those situations. | | They tie to your identity, thus you must not treat them | the same as company secrets, they are professional | _personal_ secrets which should not be disclosed or | allowed to fall into anyone elses hands (less they be | revoked and cycled). | | It's not just good security posture it could affect your | career quite badly or lead to legal issues. | JohnFen wrote: | I agree. I don't think I've said anything counter to that | (or perhaps I wasn't being clear?) | | > thus you must not treat them the same as company | secrets, they are professional personal secrets | | They are company secrets that are tied to my identity. | The company owns those secrets, not me. Just like my | keycard to get into the building. | dijit wrote: | > I agree. I don't think I've said anything counter to | that (or perhaps I wasn't being clear?) | | I think given the context of the thread (don't touch my | secrets), saying that you don't have anything you would | consider confidential towards your employer or colleagues | is a direct contradiction to what I stated. | | That's why I'm "arguing" because my employer/colleagues | should not have access to my private key, ever. 
| JohnFen wrote: | Ah, OK. Then we do disagree to an extent. | | There are several very legitimate times when my employer | needs to have access to my keys. If I'm leaving the | company, for an obvious instance. | | But my core point is that such keys/passwords aren't | really mine, they're the company's and in the end, the | company gets to decide what I'm to do with them. | | I think the building access keycard is a perfect analogy. | I'd never let anyone borrow mine on my own volition, but | if the company wants to retrieve it from me, that's their | prerogative. It's theirs, after all. | brazzledazzle wrote: | If an employer needs someone's particular keys something | probably went wrong or there's bad processes in place. | But that aside I think the default course of action | should be to aggressively guard your secrets and tokens | since they represent you. Not as personal or private | property but to keep someone (be it a fellow employee or | a 3rd party attacker) from impersonating you without | authorization. | | There are exceptions but the circumstances where an | employer would need to retrieve my keys without my | assistance are extremely rare and in those instances it's | unlikely I'd still be an employee anyway. | dijit wrote: | We disagree. | | The handing of the keycard is necessary to ensure it's | destroyed and can't be used as a "proof" you work | somewhere (most access cards these days have your name, | face and the company logo printed on the front). | | The keycard will be removed from the access list to the | building even when it's destroyed, they're not considered | reusable by most companies. | | Your private key is not reusable, it should be destroyed | and revoked from all system when you leave a company. | lmm wrote: | We could destroy the keycard with both parties present, | that seems safest. 
I don't mind turning in a private key | permanently and getting a receipt at the time, but it | needs to be very clear that it's no longer my | responsibility. | JohnFen wrote: | > but to keep someone (be it a fellow employee or a 3rd | party attacker) from impersonating you without | authorization. | | Aside from a third party attacker (which is well-covered | by my normal practices), that's a threat model that I'm | personally not worried about at all, really. In part | because I've never seen or heard of that happening and in | part because if it did, I am confident that there are | enough records to be able to prove it. | ryanjshaw wrote: | I used to shutdown regularly, then the power situation here in | South Africa got so bad that we'd regularly have about 3 hours | of power between interruptions. | | Restoring all my work every couple of hours was becoming a | pain, so I decided to re-enable hibernation support on Windows | for the first time in 10 years... And surprisingly it works | absolutely flawlessly. | | Even on my 12yr old hardware, even if I'm running a few virtual | machines. I honestly haven't seen any reason to reboot other | than updates. | lelanthran wrote: | > I used to shutdown regularly, then the power situation here | in South Africa got so bad that we'd regularly have about 3 | hours of power between interruptions. | | I'm in SA too, and I used to have 100s of days uptime (one | even over a year and a half) ... until the regular blackouts. | | Had to stop using a desktop, I've resigned myself to using a | laptop, purely so that I don't have to boot the thing all the | time and lose my context. | pessimizer wrote: | This thread is like reading that someone is shocked that | other people don't burn their beds every morning after they | wake up. | rmbyrro wrote: | I get anxious just to think that restoring from | sleep/hibernation may fail and I lose all my workspace state... 
| | If there was no boot failure, nor the need to reboot after some | upgrade, I'd never, ever reboot my system. | eertami wrote: | Sleep uses almost 0 power and works flawlessly. I'm never going | to waste my time, however short, waiting for a machine to boot. | vbezhenar wrote: | I think that there are two types of people. One set of people | (I guess, relatively small) don't trust software and prefer to | reboot OS and even periodically reinstall it to keep it | "uncluttered". Another set of people prefer to run and repair | it forever. | | I'm from the first set of people and the only reason I stopped | shutting down my macbook is because I'm now keeping its lid | closed (connected to display) and there's no way to turn it on | without opening a lid which is very inconvenient. I still | reboot it every few days, just in case. | ComputerGuru wrote: | I'm in the second group (avoid reboots like the plague) but | for the reason you attribute to the first: I never trust that | my Windows machine - currently working - will reboot | successfully and into the same working condition between OS | update regressions, driver issues, etc. | coldtea wrote: | > _Then you have those types who put their machine into | hibernate /sleep with 100+ Chrome tabs open and never do a full | boot ritual. Boggles my mind that people do that._ | | If the OS and hardware drivers properly support sleep, you | almost never need to do otherwise (except to install a new | kernel driver or similar). | | In macOS for example it hasn't been the case that you need | reboot in your regular OS use for over 10+ years. | | The "100+ Chrome tabs" or whatever mean nothing. They're paged | out when not directly viewed anyway, and if you close just | Chrome (not reboot the OS) the memory will be freed in any | case... | [deleted] | moron4hire wrote: | > If the OS and hardware drivers properly support sleep... | | That's like the biggest of big IFs. 
| tom_ wrote: | I've found sleep very reliable on macOS, and both sleep and | hibernate reliable on Windows. | | I once had my work PC unhibernate and not pop up the login | box. The computer appeared to be running normally | otherwise; I just couldn't log in, and I had to tap the | power button to shut it down. This stuck in my mind due to | its rarity. | | Can't remember ever having a serious issue on macOS. A | couple of my programs sometimes don't survive the | sleep/wake cycle, but it's intermittent, and I'm always in | the middle of something else when it happens. I've never | lost any meaningful work. | andrekandre wrote: | > Can't remember ever having a serious issue on macOS. | | macos is fine for the most part, but there are some edge | cases, such as some sketchy corporate required "security | software" that eats up kernel memory or cpu for some | unknown reason, a reboot can fix performance issues there | | also if you are a dev and apps (like xcode, android | studio etc) fill your drive with cache files* or have | weird background daemons that eat up cpu, at the least a | logout/login (or a reboot) can fix some of those weird | things | | you could manually delete them without a reboot but ymmv | tasuki wrote: | > Boggles my mind that people do that. | | Why? | | It boggles my mind that you'd reboot needlessly. My uptime is | usually in the hundreds of days. | | Sleep is good: I just close the lid. Next time I open the lid | it immediately picks up where I left off. _Why_ on earth would | you want any other behaviour? | 2b3a51 wrote: | Full drive encryption on Linux. | | I close down my laptop when I'm moving around or when I leave | it somewhere while I'm in another part of the building. | tom_ wrote: | I reboot most weeks, just to make sure the right stuff | happens when I do. (I try to do it in the middle of the day, | so there's time to sort out any matters arising.) 
| | A couple of times I've discovered I've forgotten to set stuff | to auto-run on login, or things turn out to have lost their | settings, or stuff doesn't work for whatever reason - I'd | much rather discover this at a time of my own choosing! | rolandog wrote: | Security-wise: encryption at rest? In high security scenarios | you may be required to shutdown so you're forcing "attackers" | to go through several layers: motherboard password, disk | password, encryption password, OS user password + 2FA, etc. | JohnFen wrote: | On my personal machines? I don't shut them down or reboot | very often. | | At work, however, I have to use Windows. In that case, I shut | it down at the end of every workday, in part because that | prevents weird issues Windows tends to develop when running | too long. | | Mostly, though, it's because of those damned forced updates. | Since I can't trust Windows to not reboot itself at any | random point in time, having the habit of shutting down at | the end of the day at least ensures that I won't accidentally | lose my state overnight or over the weekend. | tom_ wrote: | How to stop Windows installing updates behind your back: | https://news.ycombinator.com/item?id=18157968 | | If you don't/won't/can't use the group policy editor, I got | a lot of mileage out of hibernating the PC and powering it | off at the mains. You can't leave it running something | overnight, but you can at least quickly get back to exactly | where you left things the previous day. | | (Powering it off at the mains ensures that even if you have | a device connected that could wake the PC up - thus putting | your computer in a state where WIndows Update can reboot it | - it can't. You can turn this feature off on a per-device | basis with powercfg, but then one day you'll plug something | new in and leave it plugged in and it'll wake the PC up | while you're away and Windows Update will do its thing.) | jameson71 wrote: | Security patching? 
| pessimizer wrote: | What do you need to reboot to patch other than the kernel? | I just restart things. | cannonpalms wrote: | Can all be done online, no? | mcculley wrote: | A long time ago, I had desktops with huge uptimes. The world | has changed. I will no longer go that long without a security | update. Too much is now passing through my machine. | sieabahlpark wrote: | I just have it running 24/7 and never restart for weeks. I | don't even have the 100 tab problem, I just like having the | immediate availability without waiting for startup. | 5e92cb50239222b wrote: | Unless you're on solar, does wasting electricity not bother | you? I used to seed a lot of stuff for years (with typical | uptime measured in months), but the CO2 impact, however tiny | it is in the grand scheme of things, does not seem to worth | it anymore. | sieabahlpark wrote: | [dead] | pessimizer wrote: | If you're shutdown or hibernating, is the power draw | anything compared to a lightbulb? | Hikikomori wrote: | My desktop uses 2w in sleep mode. Likely less if i disable | the motherboard RGB. | aeyes wrote: | > Boggles my mind that people do that. | | :( I only reboot when my machine freezes or when updates | require a reboot. I did a lot of on-call in my life and I saved | tons of time by leaving everything open exactly as I left it | during the day. ~> w 11:19 up 18 days, | 17:03, 9 users, load averages: 3.87 2.96 2.39 | ComputerGuru wrote: | You haven't properly kept a machine alive until the clock | rolls over. | | I logged into a firewalled Windows VM on EC2 that's been | running an internal micro service that was acting up and it | caught my eye that task manager showed an uptime of 6 days | making my mind immediately think it might be a bug caused by | the recent reboot or perhaps the update that triggered it. | | It turns out no reboot had taken place and in fact, the | uptime counter had merely rolled over - and not for the first | time! 
Bug was unrelated to the machine and it's still (afaik) | ticking merrily away. | | (Our `uptime` tool for Windows [0] reported the actual time | the machine was up correctly.) | | [0]: https://neosmart.net/uptime/ | exikyut wrote: | Okay, what was the actual uptime? :) (:E) | andrewaylett wrote: | Conversely, it boggles _my_ mind that people think 100+ tabs is | a lot. I've got >500 open in Firefox at the moment; they won't | go away just because I reboot or upgrade. I'll probably not | look at most of them again, but they're not doing any harm just | sitting there waiting to be cleaned up. | db48x wrote: | That's because in Firefox an open tab that you haven't | recently viewed uses no memory. | drbawb wrote: | >Then you have those types who put their machine into | hibernate/sleep with 100+ Chrome tabs open and never do a full | boot ritual. | | I would never suspend to RAM or disk, far too error-prone in my | experience. (Plus serializing out 128GiB of RAM is not great.) | I just leave my machine running "all the time." My most | recently retired disks (WD Black 6TB) have 309 power cycles | with ~57,382 power-on hours. Seems like that works out to | rebooting a little less than once per week. That tracks: I | usually do kernel updates on the weekend, just in case the | system doesn't want to reboot unattended. | trashburger wrote: | > Then you have those types who put their machine into | hibernate with 100+ Chrome tabs open and never do a full boot | ritual. Boggles my mind that people do that. | | Hey, I'm that guy (although I put it to sleep instead)! It | honestly works really well and is in stark contrast to how | Linux and sleep mode interacted just ~10 years ago. It's | amazing for keeping your workspace intact. | | (FWIW, I also don't reboot or shut down my desktop where it acts | as a mainframe for my "dumb" laptop.) | bregma wrote: | > Boggles my mind that people do that.
$ | uptime 15:39:13 up 359 days, 2:02, 16 users, load | average: 0.09, 0.08, 0.15 | | 16 users is 16 tmux sessions, all me doing different tasks. | exikyut wrote: | _[Cries in outdated kernel]_ | | One of the fascinating curiosities you're missing out on is | Pressure Stall Information | (https://docs.kernel.org/accounting/psi.html). Here's what | the PSI gauges look like in htop when kernel support is | available: PSI some CPU: 0.37% 0.78% | 1.50% PSI some IO: 0.38% 0.33% 0.25% PSI | full IO: 0.38% 0.31% 0.23% PSI some memory: | 0.02% 0.04% 0.00% PSI full memory: 0.02% 0.04% | 0.00% | jchw wrote: | I have found that my MicroPC fails on some newer kernels: when | GDM starts up, the machine locks up and the LCD goes wonky. I'm | not particularly looking forward to the bisect, but at least it | won't take 292,612 reboots. | StillBored wrote: | In some ways an early-boot kernel-only failure is easier. Late | boot failures like that could just as well have been something | changing in wayland/X/gdm/mesa/dbus/whatever at the same time. | And then if it turns out everything but the kernel is constant, | it's easy to take a wild guess and look for something in say the | DRM/GPU driver in use vs the entire kernel. Although the last time | I did that, it turned out it wasn't even in the GPU-specific code | but a refactoring in the generic display mgmt code. Still ended | up doing a bisect across like 5 kernel revisions after | everything else failed. Which points to the fact that if Linux | had a less monolithic tree it would be possible to a/b test | just the kernel modules and then bisect their individual trees, | rather than adjusting each bisect point to the closest related | commit if you're sure it's a driver-specific problem. There is a | very good chance that if say a particular monitor config + GPU | stops working on my x86, the problem is likely in /drivers/gpu | rather than all the commits in arch/riscv that are also mixed | into the bisect.
Ideally the core kernel, arch-specific code, | and driver subsystems would all be independent trees with | fixed/versioned ABIs of their own. That way one could upgrade | the GPU driver to fix a bug without having to pull forward | btrfs/whatever and risk breaking it. | jchw wrote: | Since I'm in NixOS, I can at least emphatically confirm it is | JUST the kernel. | | Though, given the way the LCD panel wonks out, I'm actually | concerned it's power management related. It looks like what | happens to an LCD panel when the voltage goes too low. (Or at | least, I think that's what that effect is, based on what I've | seen with other weird devices with low battery.) Since | MicroPC is x86, though, I doubt the kernel is driving any of | the voltages too directly, so who knows. | rjmunro wrote: | I wonder if bisect is the optimal algorithm for this kind of | case. Checking for the error still existing takes an | average of ~500 iterations before a fail, checking for the | error not existing takes 10,000 iterations, 20 times longer, so | maybe biasing the bisect to only skip 1/20th of the remaining | commits, rather than half of them, would be more efficient. | pacaro wrote: | Biasing a binary search would only be beneficial if you know | something about the distribution of the search space. | bgirard wrote: | If the factor in one direction is large enough then a linear | search becomes more efficient. Say you have 20 commits | remaining and the factor is 1,000x more costly to make it | easier to picture. You're better off doing a linear search | which guarantees you'll spend less than 2,000x searching the | space. | | That suggests that for a larger search space with a large | enough difference, the optimal bisection point is probably | not always the midpoint even if you know nothing about the | distribution. | | Perhaps someone can find the exact formula for selecting the | next revision to search?
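[Editorial aside: the trade-off described above can be made concrete with a toy dynamic program. This is a sketch under stated assumptions, not anything posted in the thread: the bad commit is assumed equally likely to be any of the n candidates, observing the hang on a bad commit is assumed to cost ~500 boots on average, and a commit is only declared good after the full 10,000 boots (the costs quoted in the comments); the names `best` and `midpoint` are hypothetical.]

```python
# Sketch: expected-cost-minimizing split point for a bisect whose two test
# outcomes have very different costs. Assumptions (not from the thread):
# uniform prior over which commit is bad, deterministic verdict per test.
from functools import lru_cache

C_BAD = 500       # average boots before a bad commit hangs (thread's estimate)
C_GOOD = 10_000   # boots run before trusting that a commit is good

@lru_cache(maxsize=None)
def best(n):
    """Minimum expected boots, and best split index, to isolate 1 bad commit among n."""
    if n <= 1:
        return (0.0, 0)
    options = []
    for k in range(1, n):
        p_bad = k / n  # probability the bug is in the first k commits, so the test hangs
        cost = p_bad * (C_BAD + best(k)[0]) + (1 - p_bad) * (C_GOOD + best(n - k)[0])
        options.append((cost, k))
    return min(options)

@lru_cache(maxsize=None)
def midpoint(n):
    """Expected boots for a plain bisect that always splits at the middle."""
    if n <= 1:
        return 0.0
    k = n // 2
    return (k / n) * (C_BAD + midpoint(k)) + (1 - k / n) * (C_GOOD + midpoint(n - k))

biased_cost, split = best(64)
```

For 64 commits the optimal first test lands well past the midpoint: the cheap "it still hangs" outcome is made the likely one, which is exactly the skip-only-a-fraction bias suggested above.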
| jwilk wrote: | > You're better off doing a linear search which guarantees | you'll spend less than 2,000x searching the space. | | _Almost_. If only the last commit is slow, binary search | is still faster. | bgirard wrote: | > better off | | Better off as in expected/average case. Good point, but | only marginally better in the worst case. | electroly wrote: | There's an additional stopping problem here that isn't | present in a normal binary search. Binary search assumes you | can do a test and know for sure whether you've found the | target item, a lower item, or a higher item. If the test | itself is stochastic and you don't know how long you have to | run it to get the hang, I'd think you'd get results faster by | running commits randomly and excluding them from | consideration when they hang. Effectively, you're running all | the commits at the same time instead of working on one commit | and not moving on until you've made a decision on it. Then at | any time you will have a list of commits that have hung and | a list of commits that have not hung yet, and you can keep | the entire experiment running arbitrarily long to catch the | long-tail effects rather than having to choose when to stop | testing a single non-hanging commit and move onto the next | one. | pacaro wrote: | I can see some interesting approaches here. Given n | threads/workers you could divide the search space into n | sample points (for simplicity let's divide it evenly) and | run the repeated test on each point. When a point hangs, | that establishes a new upper limit: all higher search | points are eliminated and the workers reassigned in the | remaining search space. | | Given the uncertainty I can see how this might be more | efficient, especially if the variance of the heisenbug is | high. | mortehu wrote: | Each boot updates your empirical distribution.
As a trivial | example, if you have booted a version 9999 times with no | hanging, a later version will likely give you more | information per boot. | coldtea wrote: | Still, why would they need to reboot 292,612 times? | | Is that supposed to be the log of the commit messages space? | remram wrote: | If they boot it 10,000 times for revisions that don't fail, | and ~1,000 times for revisions that do fail, you can reach | this number with log2(revisions) of about 30. | x86x87 wrote: | read the article. they booted so many times to show that it | was not reproducing. it's overkill but you don't need to boot | 200k times | rwmj wrote: | I didn't mention it in the blog, but Paolo Bonzini was | helping me and suggested I run the bootbootboot test for 24 | hours, to make sure the bug wasn't latent in the older | kernel. I got bored after 21 hours, which happened to be | 292,612 boots. | | Maybe it would have failed on the 292,613th boot ... | quickthrower2 wrote: | I think your p value is pretty good here | opello wrote: | I've been on a similar quest for hard to reproduce, | timing/hardware/... bugs, and if you're facing any kind | of skepticism (your own or otherwise) it can be very | comforting to have a 10x or even 100x no-failure-occurred | confidence. | | It's particularly comforting when the reason for the | failure/fix/change in behavior isn't completely | understood. | bsilvereagle wrote: | If the bug occurs reasonably often, say usually once | every 10 minutes, you can model the intervals between | the bug triggering with an exponential distribution | and then use the distribution to "prove" the bug is fixed | in cases where the root cause isn't clear: | https://frdmtoplay.com/statistically-squashing-bugs/ | ajb wrote: | There is actually a Bayesian version which I wrote: | https://github.com/ealdwulf/bbchop | | Basically it calculates the commit to test at each step which | gains the most information, under some trivial assumptions.
The | calculation is O(N) in the number of commits if you have a | linear history, but it requires prefix-sum which is not O(N) on | a DAG so it could be expensive if your history is complex. | | Never got round to integrating it into git though. | muxator wrote: | Hidden gem! Thanks! | defen wrote: | That's a cool idea. Would also be interesting to consider the | size of the commit - a single 100-line change is probably | more likely to introduce a bug than 10 10-line changes. | phist_mcgee wrote: | You haven't met the developers at my last company. | [deleted] | dumbaccount123 wrote: | [flagged] | TechBro8615 wrote: | This reminded me of another story [0] (discussed on HN [1]) about | debugging hanging U-Boot when booting from 1.8 volt SD cards, but | not from 3.0 volt SD cards, where the solution involved a kernel | patch that actually _introduced_ a delay during boot, by | "hardcoding a delay in the regulator setup code | (set_machine_constraints)." (In fact it sounded so similar that I | actually checked if that patch caused the bug in the OP, but they | seem unrelated.) | | The story is a wild one, and begins with what looks like a patch | with a hacky workaround: | | > The patch works around the U-Boot bug by setting the signal | voltage back to 3.0V at an opportune moment in the Linux kernel | upon reboot, before control is relinquished back to U-Boot. | | But wait... it was "the weirdest placebo ever!" Turns out the | only reason this worked was because: | | > all this setting did was to write a warning to the kernel | log... the regulator was being turned off and on again by | regulator code, and that writing that line took long enough to be | a proper delay to have the regulator reach its target voltage. | | The full story is well worth a read. | | [0] | https://kohlschuetter.github.io/blog/posts/2022/10/28/linux-... 
| | [1] https://news.ycombinator.com/item?id=33370882 | headline wrote: | Very interesting, I wonder the _why_ | mgsouth wrote: | Disclaimer: not a kernel dev, opinion based upon very cursory | inspection. | | The patch references the "scheduler clock," which is a | high-speed, high-resolution monotonic clock used to schedule future | events. For example, a network card driver might need to reset | a chip, wait 2 milliseconds, and then do another initialization | step. It can use the scheduler to cause the second step to be | executed 2 milliseconds in the future; the "scheduler clock" is | the alarm clock for this purpose. | | Measuring the "current time" is pretty complicated when you're | dealing with multiple-core variable-frequency processors, need | a precise measurement, and can't afford to slow things down. | The "scheduler clock" code fuses together time sources and | elapsed-time indicators to provide an estimated current time | which has certain guarantees (such as: code running on a | particular core will never see time go backwards, it will be | accurate within particular limits, and it won't need global | locks). The sources and elapsed-time indicators it has | available vary by computer architecture, vendor, and chip | family; therefore the exact behavior on an Intel Core i5 will | differ from that of an Arm M7. | | The patch in question changes the behavior of local_clock(); | this is the function used by code which wants to know what the | current time is on its particular core. The patch tries to make | local_clock() return a sane value if the scheduler clock hasn't | been fully initialized but is at least running. | | As you can imagine, there are a lot of things that can go wrong | with that. I _think_ the problem is that | sched_clock_init_late() is marking the clock as "running" | before it should. I could very well be wrong.
Regardless, it's | pretty clear that there's some kind of architecture-dependent | clock initialization race condition that once in a while gets | triggered. | cryptonector wrote: | Great thinking. I'll also note that `sched_clock_register()` | uses `pr_debug()`, which can be an alias of `printk()`, | though I don't think that's it. | rwmj wrote: | If anyone would like to try reproducing the bug, I have a fairly | solid reproducer here: | | https://lore.kernel.org/lkml/20230614173430.GB10301@redhat.c... | | You will need a vmlinux or vmlinuz file from Linux 6.4 RC. | | If these are the last two lines of output then congratulations, | you reproduced the bug: [ 0.074993] Freeing | SMP alternatives memory: 48K *** ERROR OR HANG *** | | You could also try reverting f31dcb152a3 and rerunning the test | to see if you get through 10,000 iterations. | Twirrim wrote: | I've been having flashbacks to troubleshooting some | particularly thorny unreliable boot stuff several years ago. In | the end I tracked that one down to the fact that device order was | changing somewhat randomly between commits (deterministically, | though, so the same kernel from the same commit would always | have devices return in the same order), and part of the early | boot process was unwittingly dependent on particular network | device ordering due to an annoying bug. The kernel has never | made any guarantees about device ordering, so the kernel was | behaving just fine. | | That one was... fun. First time I've ever managed to identify | dozens of commits widely dispersed within a large range, all | seeming to be the "cause" of the bug, while clearly having nothing | to do with anything related to it, and having commits all | around them be good :) | chenxiaolong wrote: | I gave that reproducer a try and it failed after 1968 | iterations.
| | * CPU: Intel(R) Core(TM) i9-9900KS | | * qemu: qemu-kvm-7.2.1-2.fc38.x86_64 | | * host kernel: 6.3.6-200.fc38.x86_64 | | * guest kernel: 6.4.0-0.rc6.48.fc39.x86_64 (grabbed latest from | mirrors.kernel.org/fedora since fedoraproject.org DNS is down | and I can't access koji) | | Log: <...> 1966... 1967... 1968... | [ 0.075343] LSM: initializing | lsm=lockdown,capability,yama,bpf,landlock,integrity [ | 0.075514] Yama: becoming mindful. [ 0.075514] LSM | support for eBPF active [ 0.075514] landlock: Up and | running. [ 0.075514] Mount-cache hash table entries: | 4096 (order: 3, 32768 bytes, linear) [ 0.075514] | Mountpoint-cache hash table entries: 4096 (order: 3, 32768 | bytes, linear) [ 0.075514] x86/cpu: User Mode | Instruction Prevention (UMIP) activated [ 0.075514] | Last level iTLB entries: 4KB 0, 2MB 0, 4MB 0 [ | 0.075514] Last level dTLB entries: 4KB 0, 2MB 0, 4MB 0, 1GB 0 | [ 0.075514] Spectre V1 : Mitigation: usercopy/swapgs | barriers and __user pointer sanitization [ 0.075514] | Spectre V2 : Mitigation: Enhanced / Automatic IBRS [ | 0.075514] Spectre V2 : Spectre v2 / SpectreRSB mitigation: | Filling RSB on context switch [ 0.075514] Spectre V2 | : Spectre v2 / PBRSB-eIBRS: Retire a single CALL on VMEXIT | [ 0.075514] RETBleed: Mitigation: Enhanced IBRS [ | 0.075514] Spectre V2 : mitigation: Enabling conditional | Indirect Branch Prediction Barrier [ 0.075514] | Speculative Store Bypass: Mitigation: Speculative Store Bypass | disabled via prctl [ 0.075514] TAA: Mitigation: TSX | disabled [ 0.075514] MMIO Stale Data: Vulnerable: | Clear CPU buffers attempted, no microcode [ | 0.075514] SRBDS: Unknown: Dependent on hypervisor status | [ 0.075514] Freeing SMP alternatives memory: 48K *** | ERROR OR HANG *** | | I'll try reverting f31dcb152a3 and testing again later. Happy | to test anything else if needed. | rwmj wrote: | Yup, that's the bug. If it goes away after reverting the | commit, that would be interesting too.
I don't have any other | suggestions. | chenxiaolong wrote: | I tested with 6.4.0-0.rc6.48.fc39.x86_64 + f31dcb152a3 | revert and all 10000 iterations succeeded (same hardware | and environment as my previous post). | | To guarantee that there's absolutely no other difference | between the two tests, I took the source RPM, added the | commit f31dcb152a3 diff + `%patch -P 2 -R`, and built the | kernel RPM with mock. | swordbeta wrote: | I wasn't able to reproduce this with 10k iterations on arch, | I'm probably doing something wrong. Does the host kernel | matter? | | Host kernel: 6.1.33 | | Guest kernel: 6.4-rc6 | | Guest config: http://oirase.annexia.org/tmp/config-bz2213346 | | QEMU: 8.0.2 | | Hardware: AMD Ryzen 7 3700X CPU @ 4.2GHz | [deleted] | rwmj wrote: | > Does the host kernel matter? | | Honestly I don't know! We've seen it appear with host kernel | 6.2.15 | (https://bugzilla.redhat.com/show_bug.cgi?id=2213346#c5) but | I'm not aware of anyone either reproducing or not reproducing | it with earlier host kernels. All your other config looks | right. | garaetjjte wrote: | vmlinuz-6.4.0-0.rc6.48.fc39.x86_64 failed on my 6.0.0 host | after 249 iterations. | rwmj wrote: | We had another report that it happens on RHEL _8_ host, | which is a very much older (franken) kernel. | [deleted] | allanrbo wrote: | Running binary search on something that's flaky is a pain. "Noisy | binary search" or "robust binary search" can help here: | https://github.com/adamcrume/robust-binary-search | hoten wrote: | That README is light on details. How is this different from | selecting some N (and hoping it is high enough) and repeating | your test case that many times? You just don't have to select a | value for N using this tool? | | EDIT: I missed the link to the white paper. | IshKebab wrote: | The paper lists the algorithm (which is relatively simple) | but basically it is much more efficient than repeating test | cases. 
| | You can see that that must be possible fairly easily. | Consider two algorithms: | | 1. Classic binary search - test each element once and 100% | trust the result. | | 2. Overkill - test each element 100 times because you don't | trust the result one bit. | | The former will clearly give you the wrong result most of the | time, and the latter is extremely inefficient. There's | clearly a solution that's more efficient without sacrificing | accuracy in-between. | | Skimming the algorithm, it looks like they maintain Bayesian | probabilities for each element being "the one", then test | the element at the 50% probability point each iteration and update | the probabilities accordingly. Basically a Bayesian version | of the traditional algorithm. | allanrbo wrote: | Good explanation! And in the case of "I booted Linux 293k | times in 21 hours" it wasn't just 100 times, it was 10,000 | :-) | allanrbo wrote: | You do still have to select an N, but it's not as critical | that the N gives 100% guarantee of the flaky failure (which | can be really difficult or even impossible to achieve). | Unlike regular binary search, robust binary search doesn't | permanently give up on the left or right half based on just a | single result. | NelsonMinar wrote: | What a fantastic bug report writeup this is. Both the linked post | and the backing LKML and QEMU bug report. | [deleted] | [deleted] | sp332 wrote: | To save anyone clicking through the email thread: there is no | resolution in there so far. | loeg wrote: | Bisect points at this commit, even if the cause isn't known | yet: | https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/lin... | [deleted] | parentheses wrote: | It makes sense to n-sect (rather than bisect) as long as the tests can | be run in parallel. For example, if you're searching 1000 | commits, a 10-sect will get you there with 30 tests, but only 3 | iterations. OTOH, a 2-sect will take more than 3x the time (10 | iterations), but requires only 10 tests.
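[Editorial sketch of the n-sect arithmetic above, with a hypothetical helper (the comment's "30 tests" is the looser count of ~10 probes per round; an n-way split strictly needs n - 1 probes per round):]

```python
# Sketch: rounds of parallel testing, and total probe builds, needed to
# narrow a range of candidate commits with an n-way split.
import math

def nsect(commits, n):
    """Return (rounds, total probes) for an n-way search over `commits`
    candidates, running n - 1 probe builds in parallel each round."""
    rounds, remaining = 0, commits
    while remaining > 1:
        remaining = math.ceil(remaining / n)  # keep the one segment containing the bug
        rounds += 1
    return rounds, rounds * (n - 1)
```

With 1000 commits this gives 3 rounds (27 probes) for a 10-sect versus 10 rounds (10 probes) for a plain bisect.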
| | There's ofc always some sort of Bayesian approach mentioned in | other answers. | eichin wrote: | Yeah, I did a 4-way search like this on gcc back in the Cygnus | days - way before git, and the build step involved "me setting | up 4 checkouts to build at once and coming back in a few hours" | so it was more about giving the human more to dig into at | comparison time than actual computer time and usage. (It always | amazes me that people _have_ bright-line tests that make the | fully automated version useful, but I've also seen "git bisect | exists" used as encouragement to break up changes into more | sensible components...) | eknkc wrote: | No disrespect to Peter Zijlstra, I'm sure he has been a lot more | impactful on the open source community than I will ever be but | his immediate reply caught my attention: | | >> [Being tracked in this bug which contains much more detail: >> | https://gitlab.com/qemu-project/qemu/-/issues/1696 ] | | > Can I please just get the detail in mail instead of having to | go look at random websites? | | Maybe it's me but if I had booted Linux 292,612 times to find | a bug, you might as well click a link to a repository of a major | open source project on a major git hosting service. | | Is it really that weird to ask people online to check a website? | Maybe I don't know the etiquette of these mailing lists so this is a | genuine question. I guess it is better to keep all conversation | in a single place - would that be the intention? | dezgeg wrote: | Many kernel people really are stuck in their ways like that. | They don't want to leave their Mutt (e-mail client) at any | cost. I recall some are even to this day running in a text | console (ie. no X11 or Wayland). | donalhunt wrote: | Don't blame them. I'm fed up with browsers using gigs of ram to | display kb of data. :( | CommitSyn wrote: | I am only guessing here, but I assume it's so the content of | the mailing list archive remains.
If a linked website goes down | or changes at any time in the future, then that archive is no | longer fulfilling its purpose of archiving important | information. | zxexz wrote: | I'm pretty much 100% sure that's the reason, and a good one | at that. Mailing lists are the lifeblood of a lot of big open | source projects. | cjsawyer wrote: | This is the same logic as avoiding link-only answers on Stack | Overflow. They're both good rules. | sidfthec wrote: | The irony being that he presumably wants more information on | the mailing list to keep a good archive, while not giving | enough information for people to understand that and follow | the advice later. | kevincox wrote: | If that was the reason, it would have been best to state that | in the request. | | > Can I please just get the detail in mail so that it is | archived with the list? | | Of course you can't expect every email written to be perfect; | it is generally treated as an informal medium in these | settings. But stating the reason helps people understand your | motives and serve them better. | enedil wrote: | I think that hardcore kernel devs already know the reasons, | and there is no point in raising it again. For you it might | seem like a random requirement, but it's because of lack of | familiarity. | Szpadel wrote: | I think in that case an explanation is needed even more: if | you are a hardcore dev, then no one needs to remind you | about such a rule; on the other hand, if you are not so | familiar with those rules yet, an explanation would be very | helpful | actionfromafar wrote: | Maybe it's so the mail threads keep the full records. | aidenn0 wrote: | My suspicion is that it's not about reading the bug info once, | but having the information in the mailing-list, which is the | archive of record for kernel bugs. | dale_glass wrote: | It's LKML.
The volume of that list is insane, and technical | discussion is very much the point, so they'd expect you to | explain the problem right there, where people can quote parts | of it, and comment on each part separately. | nroets wrote: | Many of the participants may also be reading it in a terminal | emulator with no web browser nearby. | _zoltan_ wrote: | maybe those people should rethink how to do stuff in 2023. | mulmen wrote: | You're welcome to go tell the Linux kernel devs what they | are doing wrong. Fuck around and find out, as the kids | say. Or start the Zolnux project and see how far that | goes chasing shiny objects. | owenmarshall wrote: | Their software, their workflow. "Bend to it or pick | something else" seems entirely fine to me. | _zoltan_ wrote: | this is not really true for open source, I think. Since | it's collaborative, I think it's fair to expect people to | be able to open a GitHub link | snapcaster wrote: | You're wrong. Instead you should adopt the standards of | the group you're attempting to join. Getting "tourist who | complains about customs of country they visit" vibes from | this comment | owenmarshall wrote: | I run OpenBSD on most of my systems. The OpenBSD | development team collaborates using CVS instead of git | because it fits their workflow well. If I wanted to | collaborate with them, I'd use CVS too - and if I wanted | to move them to git I'd do it _after_ becoming a core | contributor, not before. If I'm going to send bug | reports & patches here and there, I'm going to do it in a | way that makes it easy for Theo and team to review. | | This is very much a Chesterton's fence topic, I think. | Linux developers have settled on a workflow that works | for them, and if you want to get time from the people who | are doing the bulk of the work it's fair to expect _you_ | to work within their requests. | mulmen wrote: | It's a GitLab link, not GitHub. And it isn't reasonable | in this context.
GitHub hosts a lot of open source | projects but it is not the only place where open source | happens. That's kinda the point of open source, and | especially of git. | | Git itself is a satellite project of the Linux kernel. It | can work without the web at all. That someone EEE'd it so | hard that even Microsoft couldn't resist is no reason to | expect the kernel devs to change their workflow. | rblatz wrote: | Are they on a PDP-11 or a dumb terminal? | treeman79 wrote: | https://en.m.wikipedia.org/wiki/Lynx_(web_browser) | | Used this daily for many years. Was great when connecting | to the internet was only practical via a shell. | Dylan16807 wrote: | Did you try it on this site? | | All of the comments/updates on the bug report are loaded | by javascript and don't work for me in lynx or elinks. | aabbcc1241 wrote: | Do you mean hacker news as "this site"? HN seems to be | server side rendered, so it should display well without | Javascript. | jwilk wrote: | I think they meant <https://gitlab.com/qemu- | project/qemu/-/issues/1696>. | inetknght wrote: | > _Are they on...?_ | | I've met people who seriously do use dumb terminals and | other people who have seriously discussed using a PDP-11. | | So, while your question might sound sarcastic, the answer | is definitely yes. | | Nerds gonna nerd. Nothing wrong with that. | | I personally don't like going to gitlab or github because | I don't like the businesses behind them. That's another | point irrespective of whether I'm browsing in a terminal | or ancient device. | rwmj wrote: | I was a bit short in the original description, but luckily | we've since reached an understanding on how to try to reproduce | this bug. | | Unfortunately he's not been able to reproduce it, even though I | can reproduce it on several machines here (and it's been | independently reproduced by other people at Red Hat). 
We do | know that it happens much less frequently on Intel hardware | than AMD hardware (likely just because of subtle timing | differences), and he's of course working at Intel. | mulmen wrote: | Asking to click a link in an email is unreasonable in this | context. The email list is the official channel and project | participants are expected to use it. They are not expected to | have a web browser. The popularity of the linked site is | irrelevant. Part of filing good bug reports is understanding a | project's communication style. A link to supplementary | information is fine. But like a Stack Overflow answer, the email | should stand on its own. | sigzero wrote: | Yes, he should have just gone and looked there. GitHub is not a | "random website". | mulmen wrote: | The link is to GitLab, not GitHub. But any website is | inappropriate in this context because it's not permanent. The | email list is, at least as far as the project is concerned. | gfiorav wrote: | I once had to bisect a Rails app between major versions and | dependencies. Every bisect would require me to build the app, fix | the dependency issues, and so on. | | And I thought I had it bad! | hoten wrote: | > For unclear reasons the bisect only got me down to a merge | commit, I then had to manually test each commit within that which | took about another day. | | Having hit this before myself... does anyone know how to finagle | git bisect to be useful for non-linear history? | voytec wrote: | What was the title editorialized for, a few hours after posting, | with "21 hours" (not important, clickbait-ish)? It was not | breaking any of the guidelines[1] to my understanding. | | [1] https://news.ycombinator.com/newsguidelines.html ___________________________________________________________________ (page generated 2023-06-14 23:00 UTC)