[HN Gopher] Those Win9x Crashes on Fast Machines
___________________________________________________________________

Those Win9x Crashes on Fast Machines

Author : abbeyj
Score  : 173 points
Date   : 2020-06-02 15:20 UTC (1 day ago)

(HTM) web link (www.os2museum.com)
(TXT) w3m dump (www.os2museum.com)

| mwcampbell wrote:
| Do drivers or other kernel code still have this kind of delay
| loop in current operating systems, or is everything interrupt-
| driven now?
| garaetjjte wrote:
| If they need to wait in atomic context, they probably do.
|
| e.g. for Linux:
| https://www.kernel.org/doc/Documentation/timers/timers-howto...
|
| _ATOMIC CONTEXT: You must use the *delay family of functions.
| These functions use the jiffie estimation of clock speed and
| will busy wait for enough loop cycles to achieve the desired
| delay._
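To make the quoted rule concrete, here is a minimal sketch of a
driver busy-waiting in atomic context - it holds a spinlock, so it
must not sleep and uses udelay() instead. The device, registers,
and timing are hypothetical, for illustration only:

      #include <linux/delay.h>       /* udelay() */
      #include <linux/io.h>          /* writel() */
      #include <linux/spinlock.h>

      #define MYDEV_CTRL  0x00       /* hypothetical control register */
      #define MYDEV_RESET 0x01       /* hypothetical reset bit */

      struct mydev {
              spinlock_t lock;
              void __iomem *regs;
      };

      /* Pulse the (imaginary) reset bit. Sleeping is forbidden while
       * the spinlock is held, so the wait is a udelay() busy-wait,
       * which spins for a calibrated number of loop iterations. */
      static void mydev_reset(struct mydev *dev)
      {
              unsigned long flags;

              spin_lock_irqsave(&dev->lock, flags);
              writel(MYDEV_RESET, dev->regs + MYDEV_CTRL);
              udelay(10);            /* pretend the part needs ~10us */
              writel(0, dev->regs + MYDEV_CTRL);
              spin_unlock_irqrestore(&dev->lock, flags);
      }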
| lordnacho wrote:
| I've written this kind of code myself, where you measure a time
| delta and divide something by the delta. It's always something
| that sticks out, though: you might divide by zero (especially
| if you did it in Java!).
|
| The article says it would have been picked up in code review, and
| I agree. But it just seems odd that it wasn't changed right
| there. Why not just write the loop so that it keeps looping as
| long as the divisor is below some number like 10ms? You also want
| to minimise the estimation error, which is easier to do if you
| divide by a slightly larger number. Consider a loop that takes
| between 1 and 2ms to finish: your estimate will be either x or
| 2x.
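A rough sketch of that suggestion: keep doubling the iteration
count until the measured delta is comfortably above the timer's
resolution, so the divisor can never be zero and the relative
error stays small. (An illustration of the idea only - ticks_ms()
is a hypothetical stand-in for a monotonic tick source, and this
is not the actual Windows code.)

      #include <stdint.h>

      /* Hypothetical monotonic millisecond tick source. */
      extern uint64_t ticks_ms(void);

      /* Calibrate how many delay-loop iterations run per millisecond.
       * Timing a fixed iteration count can finish within 0 ticks on a
       * fast CPU, causing a divide by zero; instead, grow the count
       * until the run takes at least 10 ms. */
      static uint64_t loops_per_ms(void)
      {
          uint64_t count = 1024;

          for (;;) {
              uint64_t start = ticks_ms();
              for (volatile uint64_t i = 0; i < count; i++)
                  ;                   /* the delay loop being measured */
              uint64_t elapsed = ticks_ms() - start;

              if (elapsed >= 10)      /* divisor is now safely large */
                  return count / elapsed;
              count *= 2;             /* too fast: time a longer run */
          }
      }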
| qwerty456127 wrote:
| I have used Windows 98 SE on CPUs up to various Pentium 3s. There
| was a problem with large amounts of RAM (above 512MB), but it was
| easy to solve.
|
| I was only forced to switch to Windows XP when I upgraded to a
| Pentium M (Dothan) - besides Safe Mode, I could find no way to
| run Windows 98 on it.
|
| I would gladly return to Windows 98 now if my hardware and
| software supported it.
| Wowfunhappy wrote:
| Why do you prefer Windows 98 over XP?
| phire wrote:
| The original Mac does a calibration of the floppy drive motor
| during boot to measure jitter.
|
| If you are implementing an emulator, you must insert some jitter
| into the emulated floppy drive.
|
| Because if there is no jitter, the ROM's calibration code does a
| division by zero and crashes.
| matthewhartmans wrote:
| Now I know why hitting the computer always seemed to fix it :)
| Wowfunhappy wrote:
| While perhaps not so great from a defensive programming
| perspective, Mac OS feels like a different case, since it's only
| designed to run on specific hardware.
|
| Modern Mac OS also has all sorts of "bugs" that Hackintosh
| users need to patch or otherwise work around. Since we're doing
| something that was never intended, I don't really see these as
| flaws in the OS.
| vonseel wrote:
| I'm about to build my third Hackintosh, although it will be
| my first on OpenCore. Can you expand upon why you call these
| "bugs" and which patches you are referring to?
| Wowfunhappy wrote:
| Well, one specific thing I was thinking of was the limit of
| 15 USB ports in El Capitan and later. There's no reason for
| that to exist in an absolute sense, but no real Macs have
| enough ports to run into trouble.
| thekyle wrote:
| What about USB hubs? Are you limited to 15 USB ports
| total?
| Wowfunhappy wrote:
| No, only ports on the motherboard. I think the limit is
| technically per-controller, but I'm not sure and I don't
| want to say something wrong. If you add ports via a PCIe
| card, those don't count against the limit either.
|
| That said, the limit is more problematic than it
| initially appears, because USB 3 ports count twice: once
| for USB 2 devices, and once for USB 3 devices. Some
| motherboards also use USB under the hood for things like
| Bluetooth, and USB headers which aren't connected to
| anything will take up space if you don't explicitly
| exclude them.
| kelnos wrote:
| I would still consider them to be timebomb bugs, though. Even
| if you're developing for a restricted set of hardware, newer
| versions of that hardware could very easily violate some
| corner-cutting assumptions in the future. I would rather
| spend a little more time now to get something right and
| future-proof than pass the problem on to future-me, who
| likely won't have the context anymore to find the issue
| quickly, or, worse, future-someone-else, who doesn't have the
| context at all.
| yjftsjthsd-h wrote:
| Yeah, over a long enough time window, I think skimping on
| portability and correctness will always come back to bite you.
| Apple could've saved time by making Darwin only handle one
| processor nicely, but then the Intel transition and ARM
| additions (iOS is still Darwin, after all) would've hurt
| more. Windows coasted on x86 for a while, but now that
| they're targeting ARM, I'll bet they're pretty glad that it
| was originally built to be portable. Code that only works
| on the exact system you need today might be good enough
| sometimes, but if you survive long enough, you'll want it to
| be more flexible than that.
|
| EDIT: I should add, this applies to applications, not just
| OSes. If you're an Android dev - will your app run on
| Android 15? Will it work on ChromeOS? Will it run on
| Fuchsia? If you're writing Windows software - will it run
| on ARM? If you're making webapps - do they work on
| Firefox? And maybe it's not worth the effort, especially if
| you don't plan to be selling the same software in 5 years,
| or maybe you think you can just deal with those things when
| you get there, but if you plan to still be in business in a
| decade then you should plan accordingly.
| korethr wrote:
| Holy shit. I feel like this neatly explains why Windows 95 was an
| utter crash-fest on the computer I bought just before my freshman
| year of high school. With an AMD K6-2 running at 350MHz, it was
| the first computer I had that was all new components, instead of
| the franken-systems built from a dumpster-dived base with other
| dumpster-dived components grafted on. The shop I bought it from
| initially put 95 OSR2 on it. And it did like to crash. It wasn't
| until I started using Windows 98SE that I saw anything resembling
| stability and stopped needing to re-install every other month.
|
| If only I had known about AMDK6UPD.EXE back then and been able to
| understand the reasons behind the crashes and why the patch fixed
| things.
| graton wrote:
| I have to admit I find this type of article about old computers
| and software quite interesting, as I recently discovered a backup
| of mine that contained source code I wrote in 1993. I was writing
| assembly language back then, using a really great library called
| Spontaneous Assembly - first version 2.0 and then 3.0.
| Spontaneous Assembly 3.0 added support for easily writing TSR
| (Terminate and Stay Resident) code.
|
| Back in the early 1990s I was in college and working in the
| computer lab, so I wrote various little DOS utilities to help us
| better manage the computers and their interaction with Novell
| NetWare.
|
| Due to this reminiscing I have even purchased a few tech books
| from that time: the MS-DOS Encyclopedia, Peter Norton's
| Programmer's Guide to the IBM PC, and some others.
|
| I only wish I still had a copy of Spontaneous Assembly 3.0, as it
| would be fun to recompile some of my old code!
| codys wrote:
| It looks like there's a copy in the Library of Congress [1].
| Unclear how one would go about making a copy.
|
| 1: https://www.worldcat.org/title/spontaneous-assembly-for-cc-a...
| clan wrote:
| For those who got curious like me, have a look at:
| http://300m.us/docs/computing/SA-3.0a/TOC.htm
|
| I am not familiar with how libraries work in the US. Can anyone
| get a library card with the Library of Congress? They have the
| floppy images:
|
| https://www.worldcat.org/title/spontaneous-assembly-for-cc-a...
|
| EDIT: http://300m.us/docs/computing/ has a purchase link which
| 404s, but the site itself is still up. Maybe Kevin is the
| friendly type?
| nikomen wrote:
| Anyone can get a reader card at the Library of Congress if
| they have a photo ID and are at least 16 years old. I'm not
| sure how you access computer files there, though. The reader
| card has to be obtained in person, and with the Library of
| Congress closed to visitors because of COVID-19, I imagine
| that's not possible right now.
| LeoPanthera wrote:
| There's a patch for this problem, which is particularly useful if
| you want to run Windows 95 in a virtual machine.
| https://winworldpc.com/download/c39dc2a0-c2bf-693e-0511-c3a6...
|
| Indeed, there's a pre-made VirtualBox image pinned to the top of
| Reddit's /r/windows95 if you are lazy.
| ghewgill wrote:
| This is exactly the same timer loop problem as was found in Turbo
| Pascal around the same era:
| https://retrocomputing.stackexchange.com/q/12111
| RcouF1uZ4gsC wrote:
| I think you could solve the problem by pushing the "turbo"
| button on the computer case, which would reduce your CPU
| frequency to something like 8 MHz.
| unilynx wrote:
| All turbo buttons I remember specifically clocked down to
| 4.77 MHz - apparently the original 8088 frequency?
| einr wrote:
| The turbo button originates from Taiwanese "Turbo XT"
| clones that would run an 8088 or V20 at 8, 10, 12 or even
| 16 MHz with turbo engaged and 4.77 with it off.
|
| Later 386 and 486 systems implemented turbo logic in
| different ways: some by reducing bus speed, some by
| disabling CPU caches, some by inserting wait states for
| memory access.
| torgoguys wrote:
| It depends. On later computers - around the time frame we're
| talking about, when the Turbo Pascal CRT bug was showing up -
| the turbo button, where it still existed on computers of the
| day, often just enabled/disabled the L2 cache near the
| processor.
| TwoBit wrote:
| > "It was somewhat unfortunate that this was called an "AMD fix"
| (the file containing the solution was called AMDK6UPD.EXE), even
| though Microsoft was clear that this was not a problem in AMD
| CPUs but rather in their own code."
|
| I'll bet the AMD name was suggested by producers and/or
| management over the protests of engineering, with the argument
| that the public knows this as an AMD problem, so it's better to
| call it that regardless of the technical reality. I've seen this
| logic many times in my career and I do understand there's some
| rationale to it.
| Nextgrid wrote:
| Could it simply be that, since the bug primarily affected AMD
| CPUs at the time, it was easier for everyone to call it the
| "AMD update", as opposed to some cryptic name like "network
| stack delay loop update"?
| TwoBit wrote:
| > "The issue also illustrates how seemingly solid assumptions
| made by software and hardware engineers sometimes aren't.
| Software engineers look at the currently available CPUs, see how
| the fastest ones behave, and assume that CPUs can't get faster by
| a factor of 100 anytime soon."
|
| Disagree. Where I've worked (Oculus/Facebook and EA) we would
| never allow such assumptions in code reviews, regardless of how
| unlikely the failure may be. You never allow div/0 unless it's
| mathematically provable to be impossible. I'm sure other orgs
| have the same code review policy, and static analysis these days
| would also catch it.
| outworlder wrote:
| > we would never allow such assumptions in code reviews
|
| Right.
|
| Today we have the benefit of hindsight; we know how fast
| processors have become. In the Win3.1 era, no one sane would
| have predicted this. Even Moore's Law applied to transistor
| counts, not processor speeds.
|
| What you should ask is: what other assumptions are you
| implicitly making that you are not currently aware of?
| Dylan16807 wrote:
| > In the Win3.1 era, no one sane would have predicted this.
|
| That's a bold claim!
|
| We went from 4-8MHz 286 chips to 20-50MHz 486 chips in the
| decade leading up to Win3.1's first release. By the time we
| were approaching Windows 95, Pentiums were up to 133MHz.
|
| Those chips _already_ had a 2-cycle branch instruction.
|
| So you're already staring down the barrel of the calibration
| taking 15 milliseconds. It's a reasonably obvious step to
| consider LOOP being a cycle faster than branching, which takes
| you all the way down to 7 milliseconds.
|
| So taking that all together: x86 clock speeds have doubled 3-4
| times in the last dozen years. A chip could come out tomorrow
| that takes 15 or even 7 milliseconds on the calibration loop.
| Your code breaks if it hits 2.
|
| I think someone sane could have predicted the problem.
| anyfoo wrote:
| That's simplifying things a little. The 90s were a completely
| different time in computing, still somewhat pioneering when it
| came to "modern" operating systems in personal computing. What
| came before on home computers was usually tied to the actual
| hardware and its implementation in a very thorough way, and way
| more outrageous (but at the time, widely accepted) assumptions
| were made. For example, which memory location to write to for
| direct display on the screen _from your application code_. Or, a
| few years earlier, _the absolute time that a particular
| instruction takes_.
|
| Computers became more powerful and more diverse, we added
| abstractions, we abolished assumptions.
|
| And still, I'm pretty sure that even at Oculus (to pick up your
| example - I know nothing about it), there are bound to be a
| great many assumptions in the code that cease to be valid with
| later versions of the products.
| anyfoo wrote:
| By the way, it just dawned on me that preventing the division
| by 0 doesn't even solve the problem. What then - just set the
| delay to the biggest representable delay? But on a machine
| with a 1000x faster CPU, that can still be off by an order of
| magnitude or two. And depending on what the delay is used for,
| _that_ could then cause much harder to debug problems later
| on. Some assumptions about reasonable ranges had to be made,
| just like the assumption that 32 bits was a reasonable address
| size back then. But a more obvious error message would have
| been nice (something the article mentions as well).
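A sketch of that dilemma, using a hypothetical calibration helper:
clamping the result avoids the crash but can silently be off by
orders of magnitude on a much faster CPU, whereas failing loudly
at least leaves a diagnosable trail:

      #include <stdint.h>
      #include <stdio.h>
      #include <stdlib.h>

      /* Hypothetical: 'iterations' delay-loop passes were timed at
       * 'elapsed_ms' milliseconds by a calibration run. */
      static uint64_t loops_per_ms_checked(uint64_t iterations,
                                           uint64_t elapsed_ms)
      {
          if (elapsed_ms == 0) {
              /* Option 1: clamp instead of dividing by zero. No
               * crash, but on a CPU 1000x faster than anticipated
               * the real value is still unknown, and delays derived
               * from the clamp may be wrong by orders of magnitude -
               * a subtler bug than a clean failure. */
              /* return UINT64_MAX; */

              /* Option 2: fail loudly with an obvious message. */
              fprintf(stderr, "timer calibration failed: delay loop "
                              "too fast for timer resolution\n");
              exit(EXIT_FAILURE);
          }
          return iterations / elapsed_ms;
      }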
| raverbashing wrote:
| This reminds me of some discussion about the evolution of games
| (can't find it right now; it was probably about id Software).
|
| Computers today are _literally_ 1000x better than PCs 30 years
| ago: 1000x (or even more) faster, 1000x more RAM, not to mention
| storage and other capabilities.
| Dylan16807 wrote:
| And yet latency to RAM goes almost unchanged, which has a lot
| of very interesting effects.
| netsharc wrote:
| Huh, my first computer had 640KB of RAM (does it count as a
| computer?), and the third one had either 4 or 8 MB. My current
| one has 16GB, so you're right - that is actually 2048 (or 4096)
| times more...
| thrownaway954 wrote:
| I love this site. OS/2 was such a huge part of my life in the
| 90s and the sole reason I loved computers back then. It's great
| that this site has preserved so much of its history.
| kzrdude wrote:
| Serious question: what was OS/2 and who used it?
| TazeTSchnitzel wrote:
| https://en.wikipedia.org/wiki/OS/2
| LeoPanthera wrote:
| Wikipedia's OS/2 article is comprehensive.
|
| https://en.wikipedia.org/wiki/OS/2
|
| tl;dr: A graphical OS developed by IBM that succeeded DOS and
| competed with Windows. Notably, it featured pre-emptive
| multitasking before Windows did. It was not a success in the
| home market but was reasonably successful in big business,
| especially finance, for a short amount of time.
| skissane wrote:
| As a tween/teen, I learnt a lot from OS/2. Up until then I had
| only used DOS and Windows 3.x. And then my Dad bought me a
| copy of OS/2 2.0, and also the Walnut Creek Hobbes OS/2
| CD-ROM. And I discovered EMX (the OS/2 equivalent of Cygwin).
| And I started playing with bash, Emacs, GCC, etc. Next thing
| you know, I was installing Slackware Linux, at which point I
| largely lost interest in OS/2. But EMX was an important
| stepping stone for me in getting into Linux.
| Lammy wrote:
| And it still exists today as ArcaOS!
|
| https://www.arcanoae.com/
| projektfu wrote:
| Much better name than eComStation.
| Narishma wrote:
| I think the first version wasn't graphical.
| LeoPanthera wrote:
| Actually, that's right! The GUI, called "Presentation
| Manager", debuted with OS/2 1.1.
| kelnos wrote:
| I think it's important to note (even in a tl;dr) that for a
| time OS/2 was a joint venture between IBM and Microsoft, and
| that MS sabotaged that relationship while secretly working on
| WinNT.
|
| On a related note, "Showstopper!: The Breakneck Race to Create
| Windows NT and the Next Generation at Microsoft" is a
| surprisingly entertaining story, and reads more like a novel
| than a documentary/memoir.
| WalterGR wrote:
| https://en.m.wikipedia.org/wiki/OS/2 :
|
| _As a result of a feud between the two companies over how to
| position OS/2 relative to Microsoft's new Windows 3.1
| operating environment, the two companies severed the
| relationship in 1992 and OS/2 development fell to IBM
| exclusively._
|
| https://en.m.wikipedia.org/wiki/Windows_NT :
|
| _Windows 3.0 was eventually so successful that Microsoft
| decided to change the primary application programming
| interface for the still unreleased NT OS/2 (as it was then
| known) from an extended OS/2 API to an extended Windows API.
| This decision caused tension between Microsoft and IBM, and
| the collaboration ultimately fell apart._
| thrownaway954 wrote:
| An operating system made through a joint venture between
| Microsoft and IBM; it was the predecessor to WinNT. It could
| run DOS, Win16, Win32, and POSIX apps as well as native OS/2
| apps. It really was an amazing operating system at the time,
| with a VERY passionate community behind it. Watch some of the
| videos for a good take on it:
|
| http://www.os2museum.com/wp/os2-history/os2-videos-1987/
| som33 wrote:
| Those were the days when people owned their own software and
| DRM had not yet made its way into games; since then, the
| internet has enabled PC game theft on a massive scale, by
| Valve, EA and Activision.
|
| OS/2 was an alternative operating system oriented towards
| businesses that could run apps from different operating
| systems under one unified framework.
| walterbell wrote:
| People who wanted an object-oriented graphical desktop.
|
| Also ATMs.
| Wowfunhappy wrote:
| Would it have been possible for Microsoft to test for something
| like this, or would it be possible today? For example, is it
| feasible to slow down time to simulate an impossibly fast CPU?
| traverseda wrote:
| You can do that for Linux userspace apps using the "faketime"
| utility. It just intercepts the calls that try to find out the
| actual system time. Not sure how that would work for kernel
| space, since the kernel is sort of the thing that decides what
| time actually _is_.
| Wowfunhappy wrote:
| > Not sure how that would work for kernel space, since the
| kernel is sort of the thing that decides what time actually
| is.
|
| Yes, I'm imagining you'd need to be in a virtualized/emulated
| environment of some sort.
| quickthrower2 wrote:
| For a unit test, you could reduce the loop counter to, say,
| 1024.
| londons_explore wrote:
| The reverse (speeding up time) is done pretty frequently to
| check software for bugs that might only occur after it's been
| running for a few years.
|
| It finds things like "the daily check for updates leaves a few
| logfiles, and after 30 years there are enough logfiles that
| the disk is full".
|
| Normally you need to fake or mock all time-related APIs.
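A sketch of one way to make that kind of mocking possible: route
every time read through an injectable clock, so a test can advance
30 simulated years in a fraction of a second of real time. The
names and structure here are illustrative, not from any particular
framework:

      #include <stdint.h>

      /* Code under test reads time through a function pointer rather
       * than calling the OS clock directly. */
      typedef uint64_t (*now_ms_fn)(void);

      /* Test double: a fake clock the test can move at any speed. */
      static uint64_t fake_now_ms;
      static uint64_t fake_clock(void) { return fake_now_ms; }

      /* Drive a hypothetical daily-update routine through 30 years
       * of simulated days, e.g. to catch "disk fills up with
       * logfiles" long before it happens in the field. */
      static void simulate_thirty_years(void (*daily_update)(now_ms_fn))
      {
          const uint64_t day_ms = 24ull * 60 * 60 * 1000;

          for (uint64_t day = 0; day < 30ull * 365; day++) {
              fake_now_ms += day_ms;      /* one day passes instantly */
              daily_update(fake_clock);   /* test code sees fake time */
          }
      }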
| jeffbee wrote:
| We still face a related class of problem today. The x86 PAUSE
| instruction has wildly varying throughput. On most Intel parts it
| is 1/8 or so, but on Skylake Xeon it's 1/141. On Ryzen it's 1/3.
| I've seen code that makes assumptions about how much real time
| must have passed based on PAUSE loops.
| acqq wrote:
| > I've seen code that makes assumptions about how much real
| time must have passed based on PAUSE loops.
|
| Note: here, the PAUSE instruction is not the problem at all,
| but the "code that makes assumptions."
|
| Because the "seen" code is not named, I assume it's something
| internal to some company?
| jeffbee wrote:
| Yes, private code. Basically there was some mutex fairness
| thing that was written on an SKX; on a Zen CPU, where PAUSE is
| 50x faster, it didn't have good fairness - it was too tight.
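The fragile pattern being described, next to a variant that bounds
the wait with an actual clock reading so the cost of PAUSE doesn't
matter - a sketch assuming x86 and a compiler that provides the
_mm_pause()/__rdtsc() intrinsics:

      #include <stdint.h>
      #include <immintrin.h>   /* _mm_pause(); __rdtsc() on most compilers */

      /* Fragile: assumes each PAUSE burns a fixed slice of real time.
       * At ~140 cycles per PAUSE on Skylake Xeon vs ~3 on Zen, this
       * "wait" differs by roughly 50x across machines. */
      static void backoff_by_count(void)
      {
          for (int i = 0; i < 1000; i++)
              _mm_pause();
      }

      /* Sturdier: PAUSE is only a politeness/power hint inside the
       * loop; the wait itself is bounded by the timestamp counter. */
      static void backoff_by_clock(uint64_t tsc_ticks)
      {
          uint64_t deadline = __rdtsc() + tsc_ticks;

          while (__rdtsc() < deadline)
              _mm_pause();
      }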
| userbinator wrote:
| The last time I benchmarked it, which was at the beginning of the
| i7 era, LOOP was just as fast (within the margin of error) as
| dec/jnz - Intel probably doesn't want to be seen as slower than
| AMD and didn't care about that timing loop anymore.
| MauranKilom wrote:
| Couldn't they just microcode it to do that?
___________________________________________________________________
(page generated 2020-06-03 23:00 UTC)