[HN Gopher] Those Win9x Crashes on Fast Machines
       ___________________________________________________________________
        
       Those Win9x Crashes on Fast Machines
        
       Author : abbeyj
       Score  : 173 points
       Date   : 2020-06-02 15:20 UTC (1 day ago)
        
 (HTM) web link (www.os2museum.com)
 (TXT) w3m dump (www.os2museum.com)
        
       | mwcampbell wrote:
       | Do drivers or other kernel code still have this kind of delay
       | loop in current operating systems, or is everything interrupt-
       | driven now?
        
         | garaetjjte wrote:
         | If they need to wait in atomic context, they probably do.
         | 
         | e.g. for linux:
         | https://www.kernel.org/doc/Documentation/timers/timers-howto...
         | 
          |  _ATOMIC CONTEXT: You must use the *delay family of functions.
         | These functions use the jiffie estimation of clock speed and
         | will busy wait for enough loop cycles to achieve the desired
         | delay_
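          | 
          | A minimal sketch of what that looks like in a driver (the
          | device and register are hypothetical; udelay() and the
          | locking calls are the real kernel APIs):
          | 
          |     #include <linux/delay.h>    /* udelay() */
          |     #include <linux/io.h>       /* writel() */
          |     #include <linux/spinlock.h>
          |     
          |     static DEFINE_SPINLOCK(dev_lock);
          |     
          |     static void strobe_reset(void __iomem *reg)
          |     {
          |         unsigned long flags;
          |     
          |         /* Atomic context: sleeping (msleep() etc.) is
          |          * illegal here, so the 10us wait must busy-wait. */
          |         spin_lock_irqsave(&dev_lock, flags);
          |         writel(1, reg);
          |         udelay(10);
          |         writel(0, reg);
          |         spin_unlock_irqrestore(&dev_lock, flags);
          |     }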
        
       | lordnacho wrote:
       | I've written this kind of code myself, where you measure a time
       | delta and divide something by the delta. It's always something
       | that sticks out though, that you might divide by zero (especially
       | if you did it in Java!).
       | 
        | The article says it would have been picked up in code review, and
        | I agree. But it just seems odd that it wasn't changed right
        | there. Why not just write the loop so that it keeps looping as
        | long as the divisor is below some number like 10ms? You also want
        | to minimise the estimation error, which is easier to do if you
        | divide by a slightly larger number. Consider a loop that takes
        | between 1 and 2ms to finish: your estimate will be either x or
        | 2x.
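        | 
        | Something like this sketch (hypothetical; POSIX clock_gettime
        | stands in for whatever timer the calibration reads):
        | 
        |     #include <stdint.h>
        |     #include <time.h>
        |     
        |     /* Keep doubling the work until the measured interval is
        |      * comfortably above the timer resolution: the divisor can
        |      * never be zero and the relative error stays small. */
        |     static double iterations_per_ms(void)
        |     {
        |         uint64_t iters = 1000;
        |         for (;;) {
        |             struct timespec t0, t1;
        |             clock_gettime(CLOCK_MONOTONIC, &t0);
        |             for (volatile uint64_t i = 0; i < iters; i++)
        |                 ;                      /* the timed busy loop */
        |             clock_gettime(CLOCK_MONOTONIC, &t1);
        |             double ms = (t1.tv_sec - t0.tv_sec) * 1e3 +
        |                         (t1.tv_nsec - t0.tv_nsec) / 1e6;
        |             if (ms >= 10.0)            /* divisor is >= 10ms */
        |                 return iters / ms;
        |             iters *= 2;                /* too fast: more work */
        |         }
        |     }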
        
       | qwerty456127 wrote:
       | I have used Windows 98 SE on CPUs up to various Pentium 3s. There
        | was a problem with large (above 512MB) amounts of RAM, but it
        | was easy to solve.
       | 
        | I was only forced to switch to Windows XP when I upgraded to a
        | Pentium M (Dothan) - besides Safe Mode, I could find no way to
        | run Windows 98 on it.
       | 
       | I would gladly return to Windows 98 now if my hardware and
       | software supported it.
        
         | Wowfunhappy wrote:
         | Why do you prefer Windows 98 over XP?
        
       | phire wrote:
       | The original Mac does a calibration of the floppy drive motor
       | during boot to measure jitter.
       | 
        | If you are implementing an emulator, you must insert some jitter
        | into the emulated floppy drive.
       | 
       | Because if there is no jitter, the ROM's calibration code does a
       | division by zero and crashes.
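        | 
        | A minimal sketch of the workaround (all numbers hypothetical;
        | presumably the ROM divides a difference of measurements that
        | comes out as exactly zero when revolutions are identical):
        | 
        |     #include <stdint.h>
        |     #include <stdlib.h>
        |     
        |     /* Emulated drives are perfectly regular, so perturb each
        |      * reported revolution slightly to keep the ROM's
        |      * calibration math away from a zero difference. */
        |     static uint32_t floppy_revolution_us(void)
        |     {
        |         const uint32_t nominal = 200000;       /* ~300 RPM  */
        |         return nominal + (rand() % 200) - 100; /* +/-100us  */
        |     }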
        
         | matthewhartmans wrote:
         | Now I know why hitting the computer always seemed to fix it :)
        
         | Wowfunhappy wrote:
         | While perhaps not so great from a defensive programming
         | perspective, Mac OS feels like a different case since it's only
         | designed to run on specific hardware.
         | 
         | Modern Mac OS also has all sorts of "bugs" that Hackintosh
         | users need to patch or otherwise work around. Since we're doing
         | something that was never intended, I don't really see these as
         | flaws in the OS.
        
           | vonseel wrote:
           | I'm about to build my third hackintosh, although it will be
           | my first on OpenCore. Can you expand upon why you call these
           | "bugs" and which patches you are referring to?
        
             | Wowfunhappy wrote:
             | Well, one specific thing I was thinking of was the limit of
             | 15 USB ports in El Capitan and later. There's no reason for
             | that to exist in an absolute sense, but no real Macs have
             | enough ports to run into trouble.
        
               | thekyle wrote:
               | What about USB hubs? Are you limited to 15 USB ports
               | total?
        
               | Wowfunhappy wrote:
               | No, only ports on the motherboard. I think the limit is
               | technically per-controller, but I'm not sure and I don't
               | want to say something wrong. If you add ports via a PCIe
               | card, those don't count against the limit either.
               | 
               | That said, the limit is more problematic than it
               | initially appears, because USB 3 ports count twice--once
               | for USB 2 devices, and once for USB 3 devices. Some
               | motherboards also use USB under the hood for things like
               | Bluetooth, and USB headers which aren't connected to
               | anything will take up space if you don't explicitly
               | exclude them.
        
           | kelnos wrote:
           | I would still consider them to be timebomb bugs, though. Even
           | if you're developing for a restricted set of hardware, newer
           | versions of that hardware could very easily violate some
           | corner-cutting assumptions in the future. I would rather
           | spend a little more time now to get something right and
           | future-proof, rather than pass the problem onto future-me,
           | who likely won't have the context anymore to find the issue
           | quickly, or, worse, future-someone-else, who doesn't have the
           | context at all.
        
             | yjftsjthsd-h wrote:
             | Yeah, over a long enough time window I think portability
             | and correctness will always come back to bite you. Apple
             | could've saved time by making Darwin only handle one
             | processor nicely, but then the Intel transition and ARM
             | additions (iOS is still Darwin, after all) would've hurt
             | more. Windows coasted on x86 for a while, but now that
             | they're targeting ARM I'll bet they're pretty glad that it
             | was originally built to be portable. Code that only works
             | on the exact system you need today might be good enough
             | sometimes, but if you survive long enough you'll want it to
             | be more flexible than all that.
             | 
             | EDIT: I should add, this applies to applications, not just
             | OSs. If you're an Android dev - will your app run on
             | Android 15? Will it work on ChromeOS? Will it run on
             | Fuchsia? If you're writing Windows software - will it run
              | on ARM? If you're making webapps - do they work on
             | Firefox? And maybe it's not worth the effort, especially if
             | you don't plan to be selling the same software in 5 years,
             | or maybe you think you can just deal with those things when
             | you get there, but if you plan to still be in business in a
             | decade then you should plan accordingly.
        
       | korethr wrote:
       | Holy shit. I feel like this neatly explains why Windows 95 was an
       | utter crash-fest on the computer I bought just before my freshman
        | year of high school. With an AMD K6-2 running at 350 MHz, it was
        | the first computer I had that was all new components instead of
        | the franken-systems built from a dumpster-dive base and other
       | dumpster-dived components grafted on. The shop I bought it from
       | initially put 95 OSR2 on it. And it did like to crash. It wasn't
       | until I started using Windows 98SE that I started to see anything
       | resembling stability, and not need to re-install every other
       | month.
       | 
       | If only I had known about AMDK6UPD.EXE back then and been able to
       | understand the reasons behind the crash and why the patch fixed
       | things.
        
       | graton wrote:
       | I have to admit I find this type of article about old
        | computers/software quite interesting, as I recently discovered a
       | backup of mine that contained source code I wrote in 1993. I was
        | writing assembly language back then, using a really great library
        | called Spontaneous Assembly - first version 2.0 and then 3.0.
        | Spontaneous Assembly 3.0 added support for easily writing TSR
       | (Terminate and Stay Resident) code.
       | 
       | Back in the early 1990s I was in college and working in the
       | computer lab. So I wrote various little DOS utilities to help us
       | better manage the computers and the interaction with Novell
       | Netware.
       | 
       | Due to this reminiscing I have even purchased a few tech books
       | from that time. MS-DOS Encyclopedia, Peter Norton's Programmers
       | Guide to the IBM PC, and some others.
       | 
        | I only wish I still had a copy of Spontaneous Assembly 3.0 as it
       | would be fun to recompile some of my old code!
        
         | codys wrote:
          | It looks like there's a copy in the Library of Congress [1].
         | Unclear how one would go about making a copy.
         | 
         | 1: https://www.worldcat.org/title/spontaneous-assembly-for-
         | cc-a...
        
         | clan wrote:
          | For those who got curious like me, have a look at:
         | http://300m.us/docs/computing/SA-3.0a/TOC.htm
         | 
         | I am not familiar with how libraries work in the US. Can anyone
          | get a library card with the Library of Congress? They have the
         | floppy images:
         | 
         | https://www.worldcat.org/title/spontaneous-assembly-for-cc-a...
         | 
          | EDIT: http://300m.us/docs/computing/ has a purchase link that
          | 404s but points to a live site. Maybe Kevin is the friendly
          | type?
        
           | nikomen wrote:
           | Anyone can get a reader card at the Library of Congress if
           | they have a photo id and are at least 16 years old. I'm not
           | sure how you access computer files there, though. The reader
           | card has to be obtained in person. With the Library of
           | Congress closed to visitors because of COVID-19, I imagine
           | it's not possible right now.
        
       | LeoPanthera wrote:
       | There's a patch for this problem, which is particularly useful if
       | you want to run Windows 95 in a virtual machine.
       | https://winworldpc.com/download/c39dc2a0-c2bf-693e-0511-c3a6...
       | 
       | Indeed, there's a pre-made VirtualBox image pinned to the top of
       | Reddit's /r/windows95 if you are lazy.
        
       | ghewgill wrote:
       | This is exactly the same timer loop problem as was found in Turbo
       | Pascal around the same era:
       | https://retrocomputing.stackexchange.com/q/12111
        
         | RcouF1uZ4gsC wrote:
          | I think you could solve the problem by pushing the "turbo"
          | button on the computer case, which would reduce your CPU
          | frequency to something like 8 MHz.
        
           | unilynx wrote:
           | All turbo buttons I remember specifically clocked down to
            | 4.77 MHz - apparently the original 8088 frequency?
        
             | einr wrote:
             | The turbo button originates from Taiwanese "Turbo XT"
             | clones that would run an 8088 or V20 at 8, 10, 12 or even
             | 16 MHz with turbo engaged and 4.77 with it off.
             | 
             | Later 386 and 486 systems implemented turbo logic in
             | different ways. Some by reducing bus speed, some by
             | disabling CPU caches, some by inserting wait states for
             | memory access.
        
             | torgoguys wrote:
              | It depends. On later computers--around the time frame
              | we're talking about, when the Turbo Pascal CRT bug was
              | showing up--the turbo button, where it still existed on
              | computers of the day, often just enabled/disabled the L2
              | cache near the processor.
        
       | TwoBit wrote:
       | >"It was somewhat unfortunate that this was called an "AMD fix"
       | (the file containing the solution was called AMDK6UPD.EXE), even
       | though Microsoft was clear that this was not a problem in AMD
       | CPUs but rather in their own code."
       | 
       | I'll bet the AMD name was suggested by producers and/or
        | management over the protests of engineering, with the argument that
       | the public knows this as an AMD problem and so it's better to
       | call it that regardless of the technical reality. I've seen this
       | logic many times in my career and do understand there's some
       | rationale to it.
        
         | Nextgrid wrote:
          | Could it simply be that, since the bug primarily affected AMD
          | CPUs at the time, it was easier for everyone to call the
          | update the "AMD update" as opposed to some cryptic name like
          | "network stack delay loop update"?
        
       | TwoBit wrote:
       | >"The issue also illustrates how seemingly solid assumptions made
       | by software and hardware engineers sometimes aren't. Software
       | engineers look at the currently available CPUs, see how the
       | fastest ones behave, and assume that CPUs can't get faster by a
       | factor of 100 anytime soon."
       | 
       | Disagree. Where I've worked (Oculus/Facebook and EA) we would
       | never allow such assumptions in code reviews, regardless of how
       | unlikely the failure may be. You never allow div/0 unless it's
       | mathematically provable to be impossible. I'm sure other orgs
       | have the same code review policy, and static analysis these days
       | would also catch it.
        
         | outworlder wrote:
         | > we would never allow such assumptions in code reviews
         | 
         | Right.
         | 
          | Today we have the benefit of hindsight; we know how fast
          | processors have become. In the Win3.1 era, no one sane would
          | have predicted this. Even Moore's Law applied to transistor
          | counts, not processor speeds.
         | 
         | What you should ask is: what other assumptions are you
         | implicitly making that you are not currently aware of?
        
           | Dylan16807 wrote:
            | > In the Win3.1 era, no one sane would have predicted this.
           | 
           | That's a bold claim!
           | 
            | We went from 4-8MHz 286 chips to 20-50MHz 486 chips in the
            | decade leading up to Win3.1's first release. By the time we
            | were approaching Windows 95, Pentiums were up to 133MHz.
           | 
           | Those chips _already_ had a 2-cycle branch instruction.
           | 
           | So you're already staring down the barrel of calibration
           | taking 15 milliseconds. It's a reasonably obvious step to
           | consider LOOP being a cycle faster than branching, which
           | takes you all the way down to 7 milliseconds.
           | 
            | So taking that all together, x86 clock speeds have doubled
           | 3-4 times in the last dozen years. A chip could come out
           | tomorrow that takes 15 or even 7 milliseconds on the
           | calibration loop. Your code breaks if it hits 2.
           | 
           | I think someone sane could have predicted the problem.
        
         | anyfoo wrote:
         | That's simplifying things a little. The 90s were a completely
          | different time in computing, still somewhat pioneering when it
         | came to "modern" operating systems in personal computing. What
         | came before on home computers was usually tied to the actual
         | hardware and its implementation in a very thorough way, where
         | way more outrageous (but at the time, widely accepted)
         | assumptions were made. For example, what memory location to
         | write into for direct display on the screen _from your
         | application code_. A few years earlier, _the absolute time that
         | a particular instruction takes_.
         | 
         | Computers became more powerful and more diverse, we added
         | abstractions, we abolished assumptions.
         | 
         | And still I'm pretty sure that even in Oculus (to pick up your
         | example, I know nothing about that), there are bound to be a
         | great deal of assumptions in the code that cease to be valid
         | with later versions of the products.
        
           | anyfoo wrote:
           | By the way, it just dawned on me that preventing the division
           | by 0 is not even solving the problem. What then, just set the
           | delay to the biggest representable delay? But on a machine
           | with a 1000x faster CPU, that can still be off by an order of
           | magnitude or two. And depending on what the delay is used
           | for, _that_ could then cause much harder to debug problems
           | later on. Some assumptions about reasonable ranges had to be
           | made, just like the assumption that 32 bit was a reasonable
           | address size back then. But a more obvious error message
           | would have been nice (something the article mentions as
           | well).
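            | 
            | To illustrate, a sketch of that non-fix (hypothetical
            | names):
            | 
            |     #include <stdint.h>
            |     
            |     /* Clamping avoids the divide-by-zero fault, but on a
            |      * much faster CPU the result is still off by orders
            |      * of magnitude: delays silently become too short
            |      * instead of crashing loudly. */
            |     static uint32_t loops_per_tick(uint32_t loop_count,
            |                                    uint32_t ticks)
            |     {
            |         if (ticks == 0)
            |             ticks = 1;  /* no crash, garbage calibration */
            |         return loop_count / ticks;
            |     }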
        
         | raverbashing wrote:
         | This reminds me of some discussion about the evolution of games
         | (can't find it right now, it was probably about ID Software).
         | 
         | Computers today are _literally_ 1000x better than PCs 30 years
          | ago: 1000x (even more) faster, 1000x more RAM, not to mention
          | storage and other capabilities.
        
           | Dylan16807 wrote:
           | And yet latency to RAM goes almost unchanged, which has a lot
           | of very interesting effects.
        
           | netsharc wrote:
            | Huh, my 1st computer had 640KB of RAM (does it count as a
           | computer?), the 3rd one had either 4 or 8 MB. My current one
           | has 16GB, so you're right, that is actually 2048 (or 4096)
           | times more...
        
       | thrownaway954 wrote:
        | I love this site. OS/2 was such a huge part of my life in the
        | 90s and the sole reason I loved computers back then. It's great
        | that this site has preserved so much of its history.
        
         | kzrdude wrote:
         | Serious question, what was OS/2 and who used it?
        
           | TazeTSchnitzel wrote:
           | https://en.wikipedia.org/wiki/OS/2
        
           | LeoPanthera wrote:
           | Wikipedia's OS/2 article is comprehensive.
           | 
           | https://en.wikipedia.org/wiki/OS/2
           | 
           | tl;dr A graphical OS developed by IBM that succeeded DOS and
           | competed with Windows. Notably, it featured pre-emptive
           | multitasking before Windows did. It was not a success in the
           | home market but was reasonably successful in big business,
           | especially finance, for a short amount of time.
        
             | skissane wrote:
             | As a tween/teen, I learnt a lot from OS/2. Up until then I
             | had only used DOS and Windows 3.x. And then my Dad bought
             | me a copy of OS/2 2.0, and also the Walnut Creek Hobbes
             | OS/2 CD-ROM. And I discovered EMX (the OS/2 equivalent of
             | Cygwin). And I started playing with bash, EMACS, GCC, etc.
             | Next thing you know, I was installing Slackware Linux. At
             | which point I largely lost interest in OS/2. But EMX was an
              | important stepping-stone for me in getting into Linux.
        
             | Lammy wrote:
             | And still exists today as ArcaOS!
             | 
             | https://www.arcanoae.com/
        
               | projektfu wrote:
               | Much better name than eComStation
        
             | Narishma wrote:
             | I think the first version wasn't graphical.
        
               | LeoPanthera wrote:
               | Actually that's right! The GUI, called "Presentation
               | Manager", debuted with OS/2 1.1.
        
             | kelnos wrote:
             | I think it's important to note (even in a tl;dr) that for a
             | time OS/2 was a joint venture between IBM and Microsoft,
             | and that MS sabotaged that relationship while secretly
             | working on WinNT.
             | 
             | On a related note, "Showstopper!: The Breakneck Race to
             | Create Windows NT and the Next Generation at Microsoft" is
             | a surprisingly entertaining story, and reads more like a
             | novel than a documentary/memoir.
        
               | WalterGR wrote:
               | https://en.m.wikipedia.org/wiki/OS/2 :
               | 
               |  _As a result of a feud between the two companies over
               | how to position OS /2 relative to Microsoft's new Windows
               | 3.1 operating environment, the two companies severed the
               | relationship in 1992 and OS/2 development fell to IBM
               | exclusively._
               | 
               | https://en.m.wikipedia.org/wiki/Windows_NT :
               | 
               |  _Windows 3.0 was eventually so successful that Microsoft
               | decided to change the primary application programming
               | interface for the still unreleased NT OS /2 (as it was
               | then known) from an extended OS/2 API to an extended
               | Windows API. This decision caused tension between
               | Microsoft and IBM and the collaboration ultimately fell
               | apart._
        
           | thrownaway954 wrote:
            | An operating system made through a joint venture between
            | Microsoft and IBM. It was the predecessor to WinNT. It could
            | run DOS, Win16, Win32, POSIX as well as OS/2 native apps. It
            | really was an amazing operating system at the time with a
            | VERY passionate community behind it. Watch some of the
            | videos for a good take on it:
           | 
           | http://www.os2museum.com/wp/os2-history/os2-videos-1987/
        
           | som33 wrote:
            | Those were the days when people owned their own software and
            | DRM had not made its way into games - since then, the
            | internet has enabled PC game theft on a massive scale by
            | Valve, EA and Activision.
           | 
            | OS/2 was an alternative operating system oriented towards
            | businesses that could run apps from different operating
            | systems under one unified framework.
        
           | walterbell wrote:
        | People who wanted an object-oriented graphical desktop.
        | 
        | ATMs.
        
       | Wowfunhappy wrote:
       | Would it have been possible for Microsoft to test for something
       | like this, or would it be possible today? For example, is it
       | feasible to slow down time to simulate an impossibly-fast CPU?
        
         | traverseda wrote:
          | You can do that for Linux userspace apps using the "faketime"
          | utility. It just intercepts the calls that try to find out the
          | actual system time. Not sure how that would affect kernelspace,
          | since the kernel is sort of the thing that decides what time
          | actually _is_.
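          | 
          | Usage looks something like this (the date string is
          | arbitrary):
          | 
          |     faketime '2008-12-24 08:15:42' date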
        
           | Wowfunhappy wrote:
            | > Not sure how that would affect kernelspace, since the
            | kernel is sort of the thing that decides what time actually
            | is.
           | 
           | Yes, I'm imagining you'd need to be in a virtualized/emulated
           | environment of some sort.
        
         | quickthrower2 wrote:
          | For a unit test, you could reduce the loop counter to, say,
          | 1024.
        
         | londons_explore wrote:
         | The reverse (speeding up time) is done pretty frequently to
          | check software for bugs that might only occur after it's been
         | running for a few years.
         | 
         | It finds things like "The daily check for updates leaves a few
         | logfiles, and after 30 years there are enough logfiles that the
         | disk is full".
         | 
          | Normally you need to fake or mock all time-related APIs.
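          | 
          | A minimal sketch of such a seam (hypothetical names; the
          | point is one swappable clock function):
          | 
          |     #include <stdint.h>
          |     #include <time.h>
          |     
          |     /* Route every time query through one function pointer
          |      * so a test can swap in a clock that runs, say, one
          |      * simulated day per iteration. */
          |     typedef uint64_t (*now_fn)(void);
          |     
          |     static uint64_t real_now(void)
          |     {
          |         struct timespec ts;
          |         clock_gettime(CLOCK_REALTIME, &ts);
          |         return (uint64_t)ts.tv_sec;
          |     }
          |     
          |     static uint64_t fake_seconds;
          |     static uint64_t fake_now(void) { return fake_seconds; }
          |     
          |     static now_fn current_time = real_now;
          |     
          |     /* In a test: current_time = fake_now; then bump
          |      * fake_seconds by 86400 per loop to simulate 30 years
          |      * of daily update checks in seconds of real time. */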
        
       | jeffbee wrote:
        | We still face a related class of problem today. The x86 PAUSE
        | instruction has wildly varying throughput. On most Intel parts
        | it is 1/8 or so, but on Skylake Xeon it's 1/141. On Ryzen it's
        | 1/3. I've seen code that makes assumptions about how much real
        | time must have passed based on PAUSE loops.
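        | 
        | The robust pattern is to bound spin-waits by a deadline rather
        | than an iteration count. A sketch (hypothetical; _mm_pause()
        | and clock_gettime() are the real APIs):
        | 
        |     #include <immintrin.h>   /* _mm_pause() */
        |     #include <stdbool.h>
        |     #include <time.h>
        |     
        |     /* One PAUSE can cost a few cycles or ~140 depending on
        |      * the microarchitecture, so never infer elapsed time from
        |      * the number of PAUSE iterations. */
        |     static bool spin_until(volatile int *flag, long timeout_ns)
        |     {
        |         struct timespec t0, now;
        |         clock_gettime(CLOCK_MONOTONIC, &t0);
        |         while (!*flag) {
        |             _mm_pause();     /* yield pipeline resources */
        |             clock_gettime(CLOCK_MONOTONIC, &now);
        |             long ns = (now.tv_sec - t0.tv_sec) * 1000000000L +
        |                       (now.tv_nsec - t0.tv_nsec);
        |             if (ns > timeout_ns)
        |                 return false;             /* timed out */
        |         }
        |         return true;
        |     }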
        
         | acqq wrote:
         | > I've seen code that makes assumptions about how much real
         | time must have passed based on PAUSE loops.
         | 
         | Note: here, the PAUSE instruction is not the problem at all,
         | but the "code that makes assumptions."
         | 
         | Because the "seen" code is not named, I assume it's something
         | internal for some company?
        
           | jeffbee wrote:
            | Yes, private code. Basically there was some mutex fairness
            | logic that was written on an SKX machine; on a Zen CPU,
            | where PAUSE is ~50x faster, it didn't have good fairness -
            | the loop was too tight.
        
       | userbinator wrote:
       | The last time I benchmarked it, which was at the beginning of the
       | i7 era, LOOP was just as fast (within the margin of error) as
        | dec/jnz - Intel probably didn't want to be seen as slower than
        | AMD and no longer cared about that timing loop.
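        | 
        | For reference, that kind of measurement looks roughly like
        | this (GCC/Clang inline asm on x86-64; a crude TSC timing, not
        | a rigorous benchmark):
        | 
        |     #include <stdint.h>
        |     #include <stdio.h>
        |     #include <x86intrin.h>   /* __rdtsc() */
        |     
        |     static uint64_t time_loop(void)
        |     {
        |         uint64_t n = 100000000, t0 = __rdtsc();
        |         /* LOOP implicitly decrements rcx and branches */
        |         __asm__ volatile("1: loop 1b" : "+c"(n));
        |         return __rdtsc() - t0;
        |     }
        |     
        |     static uint64_t time_dec_jnz(void)
        |     {
        |         uint64_t n = 100000000, t0 = __rdtsc();
        |         __asm__ volatile("1: dec %0\n\tjnz 1b" : "+r"(n));
        |         return __rdtsc() - t0;
        |     }
        |     
        |     int main(void)
        |     {
        |         printf("loop:    %llu cycles\n",
        |                (unsigned long long)time_loop());
        |         printf("dec/jnz: %llu cycles\n",
        |                (unsigned long long)time_dec_jnz());
        |     }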
        
         | MauranKilom wrote:
          | Couldn't they just implement it that way in microcode?
        
       ___________________________________________________________________
       (page generated 2020-06-03 23:00 UTC)