[HN Gopher] The RISC Wars Part 1: The Cambrian Explosion
       ___________________________________________________________________
        
       The RISC Wars Part 1: The Cambrian Explosion
        
       Author : tim_sw
       Score  : 40 points
       Date   : 2023-04-30 18:20 UTC (4 hours ago)
        
 (HTM) web link (thechipletter.substack.com)
 (TXT) w3m dump (thechipletter.substack.com)
        
       | mepian wrote:
       | >Patterson and colleagues first looked to build a modified RISC
       | design to run the Smalltalk programming language (in a project
       | known as SOAR, for Smalltalk On A RISC) and then to form the
       | basis of a desktop workstation (known as SPUR, for Symbolic
       | Processing Using RISC).
       | 
       | "Symbolic Processing" here actually meant running Lisp:
       | https://www2.eecs.berkeley.edu/Pubs/TechRpts/1986/6083.html
       | 
       | "The restricted processor count also allows us to build powerful
       | RISC processors, which include support for Lisp and IEEE
       | floating-point, at reasonable cost."
       | 
       | "SPUR features include a large virtually-tagged cache, address
       | translation without a translation buffer, LISP support with
       | datatype tags but without microcode..."
        
         | marcosdumay wrote:
          | As a rule, "symbolic processing" reduces to "branch and
          | fetch/store heavy code" in the context of hardware design.
        
           | dfox wrote:
            | Another thing that was somewhat relevant at the time was the
            | ability to do arithmetic directly on tagged fixnum values.
            | For example, SPARC includes instructions for that (somewhat
            | unusable unless you write the Lisp implementation as a whole
            | OS).
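            | 
            | As a rough sketch in C of what that hardware support buys
            | you (illustrative only, not SPARC code; the 2-bit tag layout
            | and helper names here are my own invention): fixnums carry a
            | 00 tag in the low two bits, so two of them can be added
            | without untagging, and all the fast path has to check is the
            | tag bits and overflow. If I recall correctly, SPARC's
            | taddcc/taddcctv fold roughly that check into a single
            | instruction.
            | 
            |   #include <stdbool.h>
            |   #include <stdint.h>
            |   #include <stdio.h>
            | 
            |   /* Low two bits are a type tag; 00 means fixnum. */
            |   #define TAG_MASK 0x3
            | 
            |   typedef int32_t tagged;
            | 
            |   static tagged fixnum(int32_t n)
            |   {
            |       return (tagged)((uint32_t)n << 2);
            |   }
            | 
            |   static int32_t untag(tagged t)
            |   {
            |       return t >> 2;  /* arithmetic shift drops the tag */
            |   }
            | 
            |   /* Fast-path add: fails (caller takes the slow path) if
            |      either operand is not a fixnum or the sum overflows. */
            |   static bool tagged_add(tagged a, tagged b, tagged *out)
            |   {
            |       if ((a | b) & TAG_MASK)
            |           return false;   /* not both fixnums */
            |       /* gcc/clang builtin; the sum of two tagged fixnums
            |          is already a correctly tagged fixnum */
            |       return !__builtin_add_overflow(a, b, out);
            |   }
            | 
            |   int main(void)
            |   {
            |       tagged r;
            |       if (tagged_add(fixnum(40), fixnum(2), &r))
            |           printf("%d\n", untag(r));  /* prints 42 */
            |       return 0;
            |   }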
        
       | ChuckMcM wrote:
       | I dislike the framing of this period as "wars" but understand the
       | colloquialism.
       | 
       | We are entering another interesting period with people building
        | RISC-V processors and SoCs on budget FPGA hardware. The distance
       | between a working FPGA design and a working bit of custom silicon
       | is a lot shorter than the distance between a simulated design and
        | silicon. What I find amusing, however, is that we don't have a
        | big "killer app" like the PC was at the time. We have cutthroat,
        | cost-reduced embedded designs, which are unforgiving.
       | 
       | Still, if you are someone who dreamed of designing your own
       | bespoke processor architecture, now is a great time to be alive.
        
         | gumby wrote:
         | > we don't have a big "killer app" like the PC
         | 
         | I think we do, though at a layer down the stack: do _more_ at
         | the edge using _less_ power in the process. Just as the CPU
          | <->memory pipeline led to all sorts of interesting on-chip
          | development (huge multi-level cache architectures), network
          | bandwidth in portable devices will increasingly be a bottleneck
         | (it takes power to run the radio, and more and more devices
         | will be competing for that bandwidth).
        
           | tyingq wrote:
           | Another thing that's different is the concentration of
           | compute within a few big vendors like AWS, Azure, GCP. That
           | makes it easier to introduce new server chips, as you don't
           | have to convince as many different customers. And the big 3
           | can abstract away some of the work.
        
             | convolvatron wrote:
              | The real downside of this market dynamic is that if you
              | can't get one of those players interested in your product,
              | you don't have a business. And if one of those deep-
              | pocketed agents feels like replicating your work instead of
              | buying it, there's not much you can do.
              | 
              | Overall I think this means a lot less interesting work
              | lower down in the stack.
        
               | gumby wrote:
               | Yeah, the term for this is "monopsony" -- like a monopoly
               | but on the buy side. Antitrust law focuses on this less.
        
       | Taniwha wrote:
        | I think the big thing the RISC vs CISC wars articles always seem
        | to miss or play down is the effect that large caches, near or on
        | chip, had on the way people made trade-offs in CPU design.
        | 
        | CISC (heavily encoded) instruction sets made sense when memory
        | was very expensive. In the late 70s I worked somewhere where we
        | spent >US$1M for 1.5MB of actual core for a mainframe; reading a
        | word took 1us, and reading an instruction was destructive, so
        | you had to write it back. Heavily encoded instructions made
        | sense because of the realities of that memory hierarchy.
       | 
        | As we started to move CPUs onto single chips the numbers started
        | to change: DRAM RAS/CAS cycles were roughly 100-300ns, and
        | caches in particular, initially off chip and eventually on chip,
        | were in the 10ns sort of range. Suddenly it was reasonable to
        | revisit the old tradeoff of heavily encoding instructions to
        | make them small, spending decode time to save fetch time.
        
         | dfox wrote:
          | One notable point in there is that the thing everyone thinks
          | of when you say "RISC pipeline" (i.e. MIPS) simply has to have
          | an I-cache, because without one you do not have the memory
          | bandwidth to do anything other than read instructions.
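          | 
          | (Back-of-envelope with illustrative numbers, not figures from
          | any particular part: a classic 5-stage RISC fetching one
          | 32-bit instruction per cycle at, say, 25 MHz needs about
          | 100 MB/s of fetch bandwidth alone, while a single 32-bit-wide
          | DRAM bank with a ~250 ns cycle delivers roughly 16 MB/s.
          | Without an I-cache, or very wide or interleaved memory, the
          | pipeline would spend nearly all of its time waiting on
          | instruction fetches before loads and stores even get a turn.)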
         | 
          | An interesting historical fact is that early ARM cores (and I
          | suspect Berkeley RISC is similar) are what one would today
          | call a CISC microarchitecture, with a datapath strikingly
          | similar to, say, the m68k, but with significantly simpler
          | sequencing logic.
        
           | Taniwha wrote:
            | I'm pretty sure the first-generation MIPS/ARM/etc. had off-
            | chip (but very close) caches.
        
       | deviantbit wrote:
        | RISC vs CISC is a silly comparison now. I know Patterson still
        | likes to beat that drum. Modern ARM RISC processors have
        | multitudes more instructions than prior-generation CISC
        | processors. It is as if the speed demons and the brainiacs
        | traded places.
       | 
        | By modern standards, the 80486 would be a RISC processor. The
        | CISC processors (what we used to call the brainiacs) are now the
        | speed demons: their clock speeds are incredible. ARM has become
        | the power-efficient brainiac, but is still behind the CISC
        | processors in performance (this may change one day, but they've
        | been saying that since the 1970s).
       | 
        | Intel/AMD opcodes are translated, so knowing the exact number of
        | real opcodes is difficult, and we've blurred the lines between
        | instructions and opcodes. Many instructions are executed by
        | firmware, and are not real instructions in the sense they were
        | when I was part of CPU architecture design.
       | 
        | The days of opening a CPU software developer manual and finding
        | the exact number of cycles each operation takes are long gone.
        | They all seem to execute in a single cycle, or the pipelining,
        | branching, etc. are so advanced that it doesn't matter.
       | 
       | All in all it is incredible to have seen where we came from, to
       | where we are today. This has been a magnificent period, and I'm
       | glad to have been part of it, and witnessed many of the amazing
       | milestones many take for granted.
       | 
        | Whether you're a RISC or CISC fan, both are derived from von
        | Neumann's architecture. This man deserves more credit than he
       | gets for our modern society. We should make statues of him, not
       | these idiot politicians, criminals, and religious figures.
        
         | msla wrote:
          | > Modern ARM RISC processors have multitudes more instructions
          | than prior-generation CISC processors.
         | 
         | That isn't and wasn't the thing that RISC processors reduced.
         | 
         | What was reduced was the complexity of the instructions
         | themselves, not the number of instructions. For example, having
         | complex addressing modes is CISC, whereas only touching the ALU
         | in register-register operations is RISC; the theme is improved
         | pipelineability, so you want less ALU state to have to back out
         | if the memory operation takes a page fault. A number of things
         | (delay slots, for example) follow from that focus on making the
         | instructions as simple and pipelinable as possible.
         | 
          | Here's John Mashey, who helped design MIPS, on the distinction:
         | 
         | https://userpages.umbc.edu/~vijay/mashey.on.risc.html
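          | 
          | To make the addressing-mode point concrete (a hedged sketch;
          | the exact instructions a compiler emits will vary), the same
          | C statement becomes one read-modify-write instruction on x86
          | but a load/operate/store sequence on a classic RISC like
          | MIPS, where only loads and stores touch memory:
          | 
          |   #include <stdio.h>
          | 
          |   /* One C statement, two styles of ISA (in the comments). */
          |   static void bump(int *p)
          |   {
          |       *p += 1;
          |       /* x86 (CISC): the memory operand is folded into the
          |        * ALU op, e.g.
          |        *     addl $1, (%rdi)
          |        * MIPS-style RISC: the ALU is register-register only,
          |        * so the compiler emits something like
          |        *     lw    $t0, 0($a0)
          |        *     addiu $t0, $t0, 1
          |        *     sw    $t0, 0($a0)
          |        * Each piece pipelines trivially, and a fault on the
          |        * lw/sw leaves no half-done ALU state to unwind. */
          |   }
          | 
          |   int main(void)
          |   {
          |       int x = 41;
          |       bump(&x);
          |       printf("%d\n", x);  /* prints 42 */
          |       return 0;
          |   }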
        
           | mst wrote:
           | I tend to call ARM a load/store architecture to emphasise
           | that difference and it seems to get my point across without
           | derailing the conversation into a RISC-or-not debate.
           | 
           | (Note that I'm endorsing this as working in human to human
           | communication rather than for absolute correctness)
        
           | ilyt wrote:
           | I wouldn't call modern ARM instructions not complex either
        
             | msla wrote:
             | That still doesn't make the 80486 a RISC in any sense.
        
               | dfox wrote:
               | 486 vs ARM is a good comparison as both are the extreme
                | cases of CISC and RISC designs. So much so that for a
                | real instruction mix generated by reasonably modern
                | compilers (think gcc3 and up), the 486 is actually more
                | of a RISC than ARM with its bunch of ldm/stm variants.
        
               | chasil wrote:
               | I agree; all instructions should be the same size in
               | classic RISC.
               | 
               | The quantity of instructions is not as relevant.
        
       | gumby wrote:
        | It's interesting to look at two early developments highlighted
        | early in the article. The first was register windows: one of
        | those "seems like a good idea at first" ideas that turned out
        | not to be worth the transistors. You can understand the
        | motivation:
       | memory access was slow and expensive and IIRC the cache
       | architectures of the day, when there even was one, were commonly
       | just an I-cache or a D-cache even though the machines were von
       | Neumann devices.
       | 
       | OTOH the lack of interlock was a kind of inverse: why waste
       | transistors on something known at compile time anyway (same
       | motivation for delay slots in branches, another idea that turned
       | out not to be worth it). In reality, computation is dynamic, so
       | runtime branch prediction (and later speculative execution)
       | turned out to be a much better performance win, regardless of the
       | design cost.
       | 
       | Edited to remove a comment about the 801, to which the author
       | replied below.
        
         | klelatti wrote:
         | Author here. Thanks for the comment and feedback. Really
         | interesting point on MIPS vs RISC-I approaches.
         | 
         | I was a bit puzzled by the IBM 801 comment as the second para
         | of the previous post mentions the 801 and links to an earlier
         | post that is all about the 801. Was there something you think I
         | missed in these earlier posts?
        
           | gumby wrote:
           | > I was a bit puzzled by the IBM 801
           | 
           | That didn't show as a link when I read it (shows as a link on
           | my current device). Apologies for the oversight; I edited my
           | comment.
           | 
            | The 801 was an insightful jump from then-current trends in
            | processor design, and that insight, to me, is what RISC is
            | all about. There were lean and orthogonal instruction sets
            | already (most notably the PDP-6/10 and Seymour Cray's work
            | at CDC), but the central idea of offloading a lot of heavy
            | lifting to the compiler (and recognizing Multics' insight of
            | writing an OS in an HLL, pointing to a practically assembly-
            | language-free future) was groundbreaking, a kind of Special
            | Relativity of computing.
        
         | chasil wrote:
         | Here is salient commentary on register windows, and other bad
         | ideas from early RISC (from the perspective of debugging
         | assembly language).
         | 
         | https://www.jwhitham.org/2016/02/risc-instruction-sets-i-hav...
        
           | gumby wrote:
           | A good post, though I have a couple of minor criticisms:
           | 
           | > Nobody writes assembly code any more, except when they do,
           | 
           | His very first sentence turns out to be the primary
           | assumption of RISC: essentially nobody writes assembly code
           | so don't worry about making that easy, and depend on the HLL
           | compiler to do a lot of heavy lifting. The small amount of
           | assembly is just to boot the processor, boot a process, and
           | some small glue in the OS, and as that stuff is heavily used
            | but almost never written, you don't have to worry about
            | alleviating those people's suffering.
           | 
           | Oh, and compiler and debugger writers, and he does point out
           | how some of these decisions make things more complex for
           | them!
           | 
           | > CISC had _always_ involved decoding instructions into
           | micro-instructions.
           | 
            | That isn't actually true; into the '60s instructions were
            | implemented directly in hardware, and the roots of CISC lie
            | there -- essentially, some quintessentially CISC
            | instructions (BCD support, string handling, etc.) were
            | subroutines implemented in hardware because they were so
            | common, and that made the computer easier to sell. (As an
            | aside: back in those days, when the machines had high-level
            | languages, they were often unique to the vendor or even the
            | specific machine model!)
           | 
           | Then again, even tiny machines like the 8080 that weren't
           | really RISC _or_ CISC had microcode, but that was later.
        
       ___________________________________________________________________
       (page generated 2023-04-30 23:00 UTC)