[HN Gopher] Intel and AMD Contemplate Different Replacements for...
___________________________________________________________________

Intel and AMD Contemplate Different Replacements for x86 Interrupt
Handling

Author : eklitzke
Score  : 140 points
Date   : 2021-06-04 18:10 UTC (4 hours ago)

(HTM) web link (www.eejournal.com)
(TXT) w3m dump (www.eejournal.com)

| korethr wrote:
| Somewhat off topic from the main thread of the article, but I have
| always wondered about the multiple privilege levels. What's the
| expected/intended use for them? The only thing I can think of is
| separating out hardware drivers (assuming ring 1 can still directly
| read/write I/O ports or memory addresses mapped to hardware) so
| they can't crash the kernel should the drivers or hardware turn out
| to be faulty. But I don't think I've ever heard of such a design
| being used in practice. It seems everyone throws their driver code
| into ring 0 with the rest of the kernel, and if the driver or
| hardware faults and takes the kernel with it, too bad, so sad.
| Ground RESET and start over.
|
| What I find myself wondering is _why_? It seems like a good idea on
| paper, at least. Is it just a hangover from other CPU architectures
| that only had privileged/unprivileged modes, with programmers
| sticking to what they were already familiar and comfortable with?
| Was there some painful gotcha about multiple privilege modes that
| made them impractical to use, like the overhead of switching
| privilege levels making it impossible to meet some hardware
| deadline? Silicon-level bugs? Something else?
| monocasa wrote:
| It works with the call gates. You could have built a sort of nested
| microkernel out of it, if it weren't such a mess, with each ring
| able to include the less-privileged rings' address spaces. And it's
| not a free-for-all: the kernel really is just the control plane,
| but it can set up all sorts of per-process descriptor tables (the
| LDTs).
|
| So you'd have a tiny kernel at ring 0 which could R/W everything
| but wasn't responsible for much.
|
| Under that you'd have drivers at ring 1 that can't see the kernel
| or other drivers, but can R/W user code at rings 2 and 3.
|
| Under that you'd have system daemons at ring 2 that can R/W regular
| programs but not other daemons at the same level, nor drivers or
| the kernel.
|
| And then under that you'd have regular processes at ring 3 with
| generally the same semantics as today's processes.
|
| Each process at any ring can export a syscall-like table through
| the call gates, so user code could directly invoke drivers or
| daemons without going through the kernel at all. Basically IPC with
| about the same overhead as a C++ virtual method call.
|
| So what happened? The exact underlying semantics didn't match the
| OSes anyone wanted to build (particularly the OSes anyone cared
| about in the late '80s and early '90s). And you can enforce similar
| semantics all in software anyway, with the exception of the cheap
| IPC.
| dmitrygr wrote:
| I think OS/2 used every feature x86 had, including call gates and
| at least 3 privilege levels (0, 2, 3). That is why OS/2 is such a
| good test for any aspiring x86 emulator developer.
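For readers who haven't met call gates: the mechanism monocasa
describes hangs off an ordinary descriptor in the GDT or LDT. Below
is a rough sketch in C of the 80386 32-bit call-gate layout; the
struct and field names are invented for illustration (and C bitfield
order is compiler-dependent), so treat it as a picture of the
encoding, not a header file.

    /* Sketch of a 32-bit call-gate descriptor (80386 and later). */
    #include <stdint.h>

    struct call_gate {
        uint16_t offset_low;     /* target EIP, bits 0-15           */
        uint16_t selector;       /* code segment to enter           */
        uint8_t  param_count:5;  /* dwords copied to the new stack  */
        uint8_t  zero:3;
        uint8_t  type:5;         /* 0x0C = 32-bit call gate (S=0)   */
        uint8_t  dpl:2;          /* least-privileged ring allowed   */
        uint8_t  present:1;
        uint16_t offset_high;    /* target EIP, bits 16-31          */
    } __attribute__((packed));

A far call through a selector that refers to such a gate enters the
target segment only at the gate's fixed entry point, switching stacks
on the way in, and the CPU allows the call only if the caller's ring
number does not exceed the gate's DPL. That hardware-enforced entry
point is what makes the "IPC about as cheap as a virtual method call"
idea work.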
| vlovich123 wrote:
| Given the multi-core, NUMA, and Spectre/Meltdown reality we're
| living in, and the clear benefits of the io_uring approach, why not
| just have dedicated core(s) to handle "interrupts" which are
| nothing more than entries in a shared memory table?
| devit wrote:
| There must be a way to alter a core's instruction pointer from
| another core or from hardware, to support killing processes running
| untrusted machine code, and to support pre-emptive multithreading
| without needing compilers to add a preemption check on all backward
| branches and calls.
|
| These features are well worth the hassle of providing this
| capability (known as IPIs, inter-processor interrupts), and once
| you have it, hardware interrupts become pretty much free to support
| using the same mechanism. The OS/user can then decide whether to
| dedicate a core to them, pin them to a core, load-balance them
| among all cores, or disable the interrupts and poll instead.
| vlovich123 wrote:
| I was thinking that rather than mucking with instruction pointers
| you would just send a message back to the other CPU saying "pause &
| switch to context X". Technically an interrupt, but one that can be
| handled internally within the CPU.
| [deleted]
| electricshampo1 wrote:
| This is essentially the approach taken by
|
| https://www.dpdk.org/ (network) and https://spdk.io/ (storage)
|
| Anything trying to squeeze perf out of IO-intensive work should
| switch to this model (context permitting, of course).
| knz42 wrote:
| This is the approach proposed in
| http://doi.org/10.1109/TPDS.2015.2492542
|
| (preprint:
| https://science.raphael.poss.name/pub/poss.15.tpds.pdf )
| DSingularity wrote:
| And old systems like Corey OS.
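In miniature, the DPDK/SPDK model looks like the sketch below: one
core polls a shared-memory ring that a device (or another core)
appends to, so an "interrupt" is just an entry to consume. All names
and the ring layout are invented for illustration; neither library's
real API looks exactly like this.

    /* Toy polled-event loop in the DPDK/io_uring spirit. */
    #include <stdatomic.h>
    #include <stdint.h>

    struct event { uint32_t source; uint32_t payload; };

    struct ring {
        _Atomic uint32_t head;       /* advanced by the producer  */
        uint32_t         tail;       /* advanced by the consumer  */
        struct event     slots[256];
    };

    void handle_event(const struct event *e);  /* dispatch routine */

    /* A core dedicated to "interrupt" handling never takes a real
     * interrupt; it just spins on the ring. */
    void poll_loop(struct ring *r)
    {
        for (;;) {
            uint32_t head = atomic_load_explicit(&r->head,
                                                 memory_order_acquire);
            while (r->tail != head) {
                handle_event(&r->slots[r->tail % 256]);
                r->tail++;
            }
            /* a pause/monitor-wait hint could go here to save power */
        }
    }

The acquire load stands in for the interrupt: once the producer
publishes a new head with a release store, the consumer is guaranteed
to see the filled slots, and no trap or privilege transition happens
on the hot path.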
| eklitzke wrote:
| This approach works for I/O devices (and for things like network
| cards the kernel will typically poll them anyway), but I/O isn't
| the only thing that generates interrupts. For instance, a processor
| fault (e.g. divide by zero) should be handled immediately and
| synchronously, since the CPU core generating the fault can't do any
| useful work until the fault is handled.
| vlovich123 wrote:
| Is that actually true? Wouldn't this imply you could launch a DoS
| attack against cloud providers just by generating divisions by
| zero?
| bonzini wrote:
| You would only attack yourself. The CPU time you're paying for
| would be spent processing division-by-zero exceptions.
| vlovich123 wrote:
| Then the CPU isn't stopping; it's moving on and doing other work.
| Meaning the divide by zero could be processed by a background CPU
| and doesn't require immediate handling. Same for page faults.
| bananabreakfast wrote:
| Incorrect. The "CPU" is stopping and handling the fault. There is
| no background CPU from your perspective. In a cloud provider you
| are always in a virtual environment, using a vCPU which is
| constantly being preempted by the hypervisor.
|
| You cannot DoS a hypervisor just by bogging down your virtualized
| kernel.
| bonzini wrote:
| Your instance can still be preempted by the hypervisor.
| imtringued wrote:
| Serving a site that renders HTML is more computationally expensive
| than serving one that does nothing but divide by zero.
| justincormack wrote:
| Faults for divide by zero are a terrible legacy thing. Arm etc. do
| not do this: you test for zero if you want to; otherwise you just
| get a value.
| DblPlusUngood wrote:
| A better example: a page fault for a non-present page.
| rwmj wrote:
| At university we designed an architecture[1] where you had to test
| for page-not-present yourself. It was all about seeing if we could
| make a simpler architecture where all interrupts could be handled
| synchronously, so you'd never have to save and restore the
| pipeline. Division by zero didn't trap either - you had to check
| before dividing. IIRC the conclusion was that it was possible but
| somewhat tedious to write a compiler for[2], plus you had to have a
| trusted compiler, which is a difficult sell.
|
| [1] But sadly we didn't implement it in silicon! FPGAs were much
| more primitive back then.
|
| [2] TCG in modern qemu has similar concerns, in that it also needs
| to worry about when code crosses page boundaries, and it also has a
| kind of "trusted" compiler (in as much as everything must go
| through TCG).
| warkdarrior wrote:
| Interesting. So what happens if the program does not test for the
| page? Doesn't the processor have to handle that as an exception of
| sorts?
| vlovich123 wrote:
| Could be handled by having the CPU switch to a different process
| while the kernel CPU faults the data in.
| ben509 wrote:
| The CPU doesn't know what processes are; that's handled by the OS.
| So there still needs to be a fault.
| vlovich123 wrote:
| You're thinking about computer architecture as designed today.
| There's no reason there couldn't be a common data structure defined
| that the CPU uses to select a backup process, much as it uses
| page-table data structures in main memory to resolve TLB misses.
| pjc50 wrote:
| Just to make it explicit for the people having trouble: the
| mechanism for switching processes in a pre-emptive multitasking
| system _is_ interrupts.
| DblPlusUngood wrote:
| In some cases, yes (if there are other runnable threads on this
| CPU's queue).
| eklitzke wrote:
| That was just an example; there are many other things the CPU can
| do that will generate a fault (for example, trying to execute an
| illegal instruction).
| mschuster91 wrote:
| Yes, but this one is so hard-baked into everything that it would
| kill any form of backwards compatibility.
| bonzini wrote:
| Or interprocessor interrupts for flushing the TLB, terminating the
| scheduling quantum, or anything else.
| bostonsre wrote:
| I would guess it would be complicated and difficult to update the
| kernel to support something like that, and I'm not sure Linus would
| entertain PRs for custom boards that do something like that. It
| would probably need an industry-wide push. But just speculation...
| vlovich123 wrote:
| We're talking about the CPU architecture here, not custom one-off
| ARM boards. Think x86 or ARM, not a Qualcomm SoC.
|
| And yes, of course Linus' opinion would be needed.
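Concretely, justincormack's "you test for zero if you want to" and
rwmj's check-before-dividing design both come down to the guard
below, which the compiler (or programmer) has to emit when the
hardware won't trap. A minimal sketch; note that on AArch64 the check
is optional because UDIV/SDIV architecturally define x/0 = 0.

    /* Division without a #DE fault: both trapping cases on x86
     * (divide by zero, and INT32_MIN / -1 overflowing) are checked
     * in software instead. */
    #include <stdint.h>

    int checked_div(int32_t a, int32_t b, int32_t *quot)
    {
        if (b == 0)
            return -1;   /* caller decides what x/0 should mean */
        if (a == INT32_MIN && b == -1)
            return -1;   /* quotient would not fit in int32_t   */
        *quot = a / b;
        return 0;
    }

The cost is that every division site pays for these branches, which
is exactly the "somewhat tedious to write a compiler for" part rwmj
mentions.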
| bogomipz wrote:
| The author states:
|
| >'The processor nominally maintained four separate stacks (one for
| each privilege level), plus a possible "shadow stack" for the
| operating system or hypervisor.'
|
| Can someone elaborate on what the "shadow stack" is and what it's
| for exactly? This is the first time I've heard this nomenclature.
| ChuckMcM wrote:
| Argh, why do authors write stuff like this -- _"It is, not to put
| too fine a point on it, a creaking old bit of wheezing ironmongery
| that, had the gods of microprocessor architecture been more
| generous, would have been smote into oblivion long ago."_
|
| Just because a technology is "old" doesn't mean it is useless or
| needs to be replaced. I'm all in favor of fixing problems, and of
| refactoring to improve flow and remove inefficiencies. I am _not_ a
| fan of re-inventing the wheel because gee, we've had this
| particular wheel for 50 years and it's doing fine, but hey, let's
| reimagine it anyway.
|
| That said, the kink in the x86 architecture was put there by "IBM
| PC Compatibility" and a Windows/Intel monopoly that went on way too
| long. But even knowing _why_ the thing has these weird artifacts
| just means the engineers were working under constraints you don't
| understand; it doesn't give you license to dismiss what they've
| done as needing to be "wiped away."
|
| We are in a period where enthusiasts can design, build, and operate
| a completely bespoke ISA and micro-architecture on dense, low-cost
| FPGAs. Maybe they don't run at multi-GHz speeds, but if you want to
| contribute positively to the question of computer architecture,
| there has never been a better time. You don't even have to build
| the whole thing! You can just add your idea into an existing
| architecture and compare how you do against it.
|
| Want to do flow-control-colored register allocation for speculative
| instruction retirement? You can build the entire execution unit in
| an FPGA, throw instructions at it to your heart's content, and
| publish an analysis of the results.
|
| Okay, enough ranting. I want AARCH64 to win so we can reset the
| problem set back to a smaller number of workarounds, but I think
| the creativity of people trying to advance the x86 architecture
| given its constraints is not something to be belittled; it is to be
| admired.
| gumby wrote:
| Also, "smitten" is the past participle of "smite"; "smote" is the
| simple past. This bothered me through the whole article.
|
| I suppose the author is not a native English speaker. Like the
| person who titled a film "Honey, I Shrunk the Kids". Makes me wince
| to even type it.
| Dylan16807 wrote:
| I propose that they are quite familiar with English and want to
| avoid the love-related connotations of "smitten".
| Scene_Cast2 wrote:
| It might be there to put a playful twist on things. Kind of like
| "shook" (slang) or "woke" (when it first appeared).
| failwhaleshark wrote:
| _And I woke from a terrible dream_
|
| _So I caught up my pal Jack Daniel's_
|
| _And his partner Jimmy Beam_
|
| https://www.independent.co.uk/news/uk/home-news/woke-meaning...
|
| _AC/DC - You Shook Me All Night Long (Official Video)_
|
| https://youtu.be/Lo2qQmj0_h4
|
| (Rock n' roll stole the best vernacular.)
| gumby wrote:
| Those are both correct ("conventional" to descriptivists like me)
| uses of the respective terms.
| thanatos519 wrote:
| Ah, "not a native English speaker", usually written as "American".
|
| </troll>
| coldtea wrote:
| > _Just because a technology is "old" doesn't mean it is useless or
| needs to be replaced._
|
| Sure. But if on top of old it is "a creaking old bit of wheezing
| ironmongery" that, "had the gods of microprocessor architecture
| been more generous, would have been smote into oblivion long ago",
| then it does need to be replaced.
|
| And the article gives reasons why (it doesn't stop at mere "old"),
| and Intel/AMD share them.
| nwmcsween wrote:
| > Just because a technology is "old" doesn't mean it is useless or
| needs to be replaced. I'm all in favor of fixing problems, and of
| refactoring to improve flow and remove inefficiencies. I am not a
| fan of re-inventing the wheel because gee, we've had this
| particular wheel for 50 years and it's doing fine, but hey, let's
| reimagine it anyway.
|
| Can't get promotions if you don't NIH a slow, half-broken postgres
| edoceo wrote:
| > slow half broken postgres
|
| Don't bring Postgres into this. It's fast and closer to zero-broken
| than to half-broken.
|
| ;)
| dkersten wrote:
| I think you misunderstood the comment. It wasn't calling Postgres
| slow and half-broken; it was taking a stab at home-brewed (NIH, Not
| Invented Here: https://en.wikipedia.org/wiki/Not_invented_here)
| databases, calling them slow, half-baked copies of Postgres, and
| implying that their authors should have just used Postgres, but
| that doing so wouldn't get them promotions.
| justicezyx wrote:
| I suspect that if you walked across the aisle to the engineers on
| another team and described an obscure legacy issue to them, there's
| a decent chance they'd say something like "how could that be?",
| "was the original engineer dumb/incompetent?", "did the original
| team get reorged?", etc....
|
| Not saying the tech journalist is better in any sense. But let's be
| honest, there is no reason they should be doing better...
| failwhaleshark wrote:
| Sssh! "Old = bad" means job security for millions of engineers. We
| need yet another security attack surface, I mean, "improved
| interrupt handling."
|
| I don't understand why people, engineers of all people, fall for
| the Dunning-Kruger/NIH/ageism-in-all-teh-things consumerism fallacy
| that everything that came before is dumb, unusable, or can
| _always_ be done better.
|
| Code magically rusts after being exposed to air for 9 months, donja
| know? If it's not trivially edited every 2 months, it's a "dead"
| project.
| lazide wrote:
| Part of it, I think, is that for many people it's more fun to build
| something than to maintain something. It's also easier to write
| code than it is to read it (most of the time).
|
| So why not do the fun and easy thing? Especially if they aren't the
| one writing the checks!
| lazide wrote:
| Well, no one gets promoted/a raise by writing "and everything is
| actually fine", right?
|
| On the engineering side it's similar, more often than not. You get
| promoted by solving the "big problem". The really enterprising (and
| the ones you need to watch) often figure out how to make the big
| problem the one they are trying/well situated to solve - even if it
| isn't really a problem.
| Sebb767 wrote:
| > Well, no one gets promoted/a raise by writing 'and everything is
| actually fine', right?
|
| Well, that's actually not true. There are quite a few life coaches
| and the like making their living on positive writing. And there are
| quite a few writers, like Scott Alexander, who, while not being all
| positive, definitely don't need or want to paint an overly dark or
| dramatic picture.
|
| In the more conventional news sector, on the other hand, this is
| probably true.
| herpderperator wrote:
| > we've had this particular wheel for 50 years and it's doing fine,
| but hey, let's reimagine it anyway
|
| How is it doing fine? Apple is doing laps over Intel because -- it
| seems -- of their choice to use ARM. Would they have been able to
| design an x86/amd64 CPU just as good?
| yyyk wrote:
| > Apple is doing laps over Intel because -- it seems -- of their
| choice to use ARM.
|
| AMD processors are essentially equivalent to the M1 in performance
| while still keeping x86, so probably yes (there is an ARM advantage
| in instruction decoding, but judging by the performance
| differences, it's probably small).
|
| Apple's advantage is mostly that Apple (via TSMC) can use a smaller
| process node than Intel and optimize the entire Mac software stack
| for their processors.
| Someone wrote:
| > and optimize the entire Mac software stack for their processors.
|
| And vice versa.
| cycomanic wrote:
| I would argue the author is not really saying "old" is "bad", but
| rather that we have been piling more and more cruft onto the old,
| so that it has now become a significant engineering exercise to
| work around the idiosyncrasies of the old system every time you
| want to do something else.
|
| To use your wheel analogy: it's sort of like starting with the
| wheel of a horse cart and adding bits and pieces to that same wheel
| to make it somehow work as the landing-gear wheel of a jumbo jet.
| At some point it might be a good idea to simply design a new wheel.
| gwbas1c wrote:
| Why continue with x86? Given how popular ARM is, why not just join
| the trend?
| ronsor wrote:
| x86 is not going anywhere, for performance computing, backwards
| compatibility, and a plethora of other reasons.
| young_unixer wrote:
| Linus' opinion:
| https://www.realworldtech.com/forum/?threadid=200812&curpost...
| bryanlarsen wrote:
| tl;dr: AMD's position is "fix the spec bugs"; Intel's is "replace
| it with a better approach". Linus: do both, please!
| bogomipz wrote:
| Thanks for posting this link. I was curious about a, b and d:
|
| >"(a) IDT itself is a horrible nasty format and you shouldn't have
| to parse memory in odd ways to handle exceptions. It was
| fundamentally bad from the 80286 beginnings, it got a tiny bit
| harder to parse for 32-bit, and it arguably got much worse in
| x86-64."
|
| What is it about the IDT that requires parsing memory in odd ways?
| What is odd about it?
|
| >"(b) %rsp not being restored properly by return-to-user mode."
|
| Does anyone know why this is? Is this a historical accident or
| something else?
|
| >"(d) several bad exception nesting problems (NMI, machine checks
| and STI-shadow handling at the very least)"
|
| Is this an issue only when these exceptions nest inside one
| another, or whenever any one of them appears in the interrupt
| chain? Is there any good documentation on this?
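On bogomipz's question about (a): the 64-bit IDT entry keeps the
field positions of its 286/386 ancestors, so the handler address ends
up scattered across three non-adjacent fields. The sketch below uses
invented field names (it is not from any real kernel header), but the
byte layout follows the x86-64 architecture manuals.

    /* An x86-64 interrupt gate descriptor, 16 bytes. */
    #include <stdint.h>

    struct idt_gate64 {
        uint16_t offset_0_15;    /* handler address, bits 0-15      */
        uint16_t selector;       /* kernel code-segment selector    */
        uint8_t  ist;            /* bits 0-2: Interrupt Stack Table */
        uint8_t  type_attr;      /* gate type, DPL, present bit     */
        uint16_t offset_16_31;   /* handler address, bits 16-31...  */
        uint32_t offset_32_63;   /* ...and bits 32-63               */
        uint32_t reserved;
    } __attribute__((packed));

    /* Recovering the entry point means stitching three pieces back
     * together -- the "parsing memory in odd ways": */
    static inline uint64_t gate_target(const struct idt_gate64 *g)
    {
        return (uint64_t)g->offset_0_15
             | (uint64_t)g->offset_16_31 << 16
             | (uint64_t)g->offset_32_63 << 32;
    }

The split exists because each extension had to leave the older 16-
and 32-bit fields where they were; Intel's proposed fixed entry
points would do away with the table lookup entirely.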
| PopePompus wrote:
| Since cloud servers are a bigger market than users who want to run
| an old copy of VisiCalc, why doesn't either Intel or AMD produce a
| processor line that has none of the old 16- and 32-bit
| architectures (and long-forgotten vector extensions) implemented in
| silicon? Why not just make a clean (or as clean as possible) 64-bit
| x86 processor?
| jandrewrogers wrote:
| Intel did this with the cores for the Xeon Phi. While it was
| x86-compatible, they removed a bunch of the legacy modes.
| th3typh00n wrote:
| Because the number of transistors used for that functionality is
| absolutely negligible, so removing it has virtually no benefit.
| Symmetry wrote:
| The number of transistors, sure. But the engineer time to design
| new features that don't interfere with old features is high, and
| the verification time to make sure every combination of features
| plays sensibly together is extremely high. To the extent that Intel
| and AMD are limited by the costs of employing and organizing large
| numbers of engineers, it's a big deal. Though that's also the
| reason they'll never make a second, simplified core.
| failwhaleshark wrote:
| It's never going to happen. The ISA is a hardware contract.
| PopePompus wrote:
| When things get to the point where AMD is considering making
| nonmaskable interrupts maskable (as the article states), maybe it's
| time to invoke "force majeure".
| PopePompus wrote:
| Even so, doesn't having a more complex instruction set, festooned
| with archaic features needed by very few users, increase the attack
| surface for exploits and increase the likelihood of bugs being
| present? Isn't it a bad thing that the full boot process is
| understood in depth by only a tiny fraction of the people
| programming for x86 systems (I'm certainly not one of them)?
| jcranmer wrote:
| Not really. Basically none of these features can be used outside of
| the kernel anyway, which means that the attacker already has _far_
| more powerful capabilities they can employ.
| failwhaleshark wrote:
| Yep. Benefit = the value of minimalism at every expense minus the
| cost of incompatibility (in zillions): breaking things that cannot
| be rebuilt, breaking every compiler, breaking every debugger,
| breaking every disassembler, adding more feature flags, and it's no
| longer the Intel 64 / EM64T ISA. Hardware != software.
| lizknope wrote:
| You mean like the Intel i860?
|
| https://en.wikipedia.org/wiki/Intel_i860
|
| Or the Intel Itanium?
|
| https://en.wikipedia.org/wiki/Itanium
|
| Or the AMD Am29000?
|
| https://en.wikipedia.org/wiki/AMD_Am29000
|
| Or the AMD K12, which was a 64-bit ARM?
|
| https://www.anandtech.com/show/7990/amd-announces-k12-core-c...
|
| All of these were either rejected by the market or didn't even make
| it to the market.
|
| Binary compatibility is one of the major, if not the major, reasons
| that x86 has hung around so long. In the 1980s and '90s x86 was
| slower than the RISC workstation competitors, but Intel and AMD
| really took the performance crown around 2000.
| fredoralive wrote:
| I think he's suggesting something more like the 80376, an obscure
| embedded 386 variant that booted straight into protected mode. So
| you'd have an x86-64 CPU that boots straight into long mode and
| thus could drop things like real mode and virtual 8086 mode. AFAIK
| with UEFI it's the boot firmware that handles switching to
| 32/64-bit mode, not the OS loader or kernel, so it would be
| transparent to the OS and programs.
|
| But in order not to break a lot of stuff on desktop Windows (and
| ancient unmaintained custom software on corporate servers) you'd
| still have to implement the "32-bit software on a 64-bit OS"
| support. That probably means you don't actually simplify the CPU
| much.
|
| Of course, some x86 extensions do get dropped occasionally, but
| only things like AMD's 3DNow! (I guess AMD's market share meant few
| used it anyway) and that Intel transactional memory thing that was
| just broken.
| defaultname wrote:
| Binary compatibility kept x86 dominant, coupled with competing
| platforms not offering enough of a performance or price benefit to
| make them worth the trouble.
|
| That formula has completely changed. With the tremendous
| improvement in compilers, and the agility of development teams, the
| move has long been underway. People are firing up their Graviton2
| instances at a blistering pace, and my Mac runs on Apple Silicon
| (already, just months in, with zero x86 apps -- I thought Rosetta 2
| would be the life vest, but everyone transitioned so quickly that I
| could do fine without it).
|
| It's a very different world.
| ben509 wrote:
| GHC[1] is almost there >_<
|
| [1]: https://www.haskell.org/ghc/blog/20210309-apple-m1-story.htm...
| monocasa wrote:
| No, there are a lot of PCisms that can be removed while still
| allowing for x86 cores. User code doesn't really care about PC
| compat anymore (see the PS4 Linux port for the specifics of a
| non-PC x86 platform that runs regular x86 user code like Steam,
| albeit one arguably worse designed than the PC somehow). Cleaning
| up ring 0 in a way that ring 3 code under a vaguely modern kernel
| can't tell the difference could be a huge win.
| raverbashing wrote:
| > Rather than use the IDT to locate the entry point of each
| handler, processor hardware will simply calculate an offset from a
| fixed base address
|
| So, wasn't the 8086 like this? Or at least some microprocessors
| that jump to $BASE + OFFSET, to a point where more or less one JMP
| fits?
| sounds wrote:
| I have no idea how Intel's proposal handles this, but the 8086 jump
| was to a fixed location, i.e. BASE was always 0.
___________________________________________________________________
(page generated 2021-06-04 23:00 UTC)