[HN Gopher] I decided to build a nine-bit computer
       ___________________________________________________________________
        
       I decided to build a nine-bit computer
        
       Author : mad_ned
       Score  : 139 points
       Date   : 2022-02-09 13:44 UTC (9 hours ago)
        
 (HTM) web link (madned.substack.com)
 (TXT) w3m dump (madned.substack.com)
        
       | googamooga wrote:
       | The ternary-logic computer Setun, with 9-bit "bytes", was
       | developed in the late fifties in the USSR. Not much info about it
       | in English, unfortunately.
       | 
       | https://en.wikipedia.org/wiki/Setun
        
         | buescher wrote:
         | Ternary on binary systems usually uses a 2-bit "trit" with an
         | extra potential state. That's even how the later Setun machines
         | did it, from what I understand. Oh yes, and the Soviets even
         | developed their own trinary Forth dialect for them.
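
A rough sketch of the 2-bit-per-trit idea described above, assuming balanced ternary digits (-1, 0, +1). The bit patterns chosen here (0b00 = 0, 0b01 = +1, 0b10 = -1, 0b11 unused) and all function names are illustrative assumptions, not taken from any Setun documentation.

```python
# Hypothetical 2-bit encoding of balanced-ternary digits; 0b11 is the unused fourth state.
TRIT_TO_BITS = {0: 0b00, 1: 0b01, -1: 0b10}
BITS_TO_TRIT = {bits: trit for trit, bits in TRIT_TO_BITS.items()}


def int_to_trits(value, width):
    """Convert an integer to `width` balanced-ternary digits, least significant first."""
    trits = []
    for _ in range(width):
        remainder = value % 3      # 0, 1 or 2
        if remainder == 2:         # 2 is written as -1 with a carry into the next digit
            remainder = -1
        trits.append(remainder)
        value = (value - remainder) // 3
    if value != 0:
        raise OverflowError("value does not fit in the requested number of trits")
    return trits


def trits_to_int(trits):
    """Evaluate balanced-ternary digits (least significant first) as an integer."""
    return sum(trit * 3 ** position for position, trit in enumerate(trits))


def pack_trits(trits):
    """Pack trits into an integer, two bits per trit, first trit in the lowest bits."""
    word = 0
    for position, trit in enumerate(trits):
        word |= TRIT_TO_BITS[trit] << (2 * position)
    return word


def unpack_trits(word, width):
    """Recover `width` trits from a packed word, rejecting the unused 0b11 pattern."""
    trits = []
    for position in range(width):
        bits = (word >> (2 * position)) & 0b11
        if bits not in BITS_TO_TRIT:
            raise ValueError("0b11 is not a valid trit encoding")
        trits.append(BITS_TO_TRIT[bits])
    return trits
```

With this layout, three trits cover -13 through +13 and occupy six bits, so one of every four bit patterns per trit goes unused.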
        
           | kragen wrote:
           | Even the original Setun machine worked that way, as it turned
           | out, if we believe Willis Ware's contemporary report on
           | Soviet computers.
        
       | giovannibajo1 wrote:
       | Nintendo 64 had a 9-bit RAM (Rambus RDRAM). Only 8 bits of each
       | byte were accessible from the MIPS CPU for obvious reasons; the
       | 9th bit was only used by the GPU (called "RDP") to store extra
       | information while rendering (being a UMA architecture, the GPU
       | used the same RDRAM used by the CPU). Typically it contained a
       | flag called "coverage" that was used to discriminate pixels on
       | the edge of polygons, that were later subject to antialiasing. By
       | reading back pixels using the CPU, you would be unable to see the
       | coverage flag.
        
         | wolfram74 wrote:
         | Yet another weird bit of N64 lore, I'm amazed mupen64 works as
         | well as it does.
        
           | oh_sigh wrote:
           | > weird bit
           | 
           | Quite literally
        
         | klodolph wrote:
         | The 9th bit was also used by the depth buffer.
         | 
         | To add to this, the reason that the RAM was available with 9
         | bits in the first place is so that it could be used to make
         | systems with ECC. It's just that you didn't have to use that
         | 9th bit for error correction, you could use it for extra data,
         | if you designed the system to use it that way.
        
       | Koshkin wrote:
       | Wrote an emulator of a simple 12-bit CPU once, and ran a few
       | examples (coded in binary, of course) on it. It's a fun exercise
       | - highly recommended!
        
       | JoachimS wrote:
       | I was hoping that he was building a CPU that worked on Strong
       | Kleenean logic. https://en.wikipedia.org/wiki/Three-valued_logic
        
       | teekert wrote:
       | I read it because of the steam powered machine at the top...
       | turns out it's an fpga.
        
       | pjdesno wrote:
       | Long ago I interned for the group supporting the C-30 ARPANET
       | IMP. At one point the IMP was a 16-bit machine, emulating the old
       | Honeywell (?) minicomputer that the original IMP code was written
       | for. At some point they needed more memory, so they lashed on
       | another 4-bit bit slice, and it became a 20-bit machine.
       | 
       | There was an alternate microcode load for it, which implemented
       | an instruction set similar to that of a PDP-11, and could run an
       | ancient version of Unix. (maybe not so ancient back in 1985, but
       | definitely pre-BSD) We used one or two of those for our
       | development machines, and it was my job to write software tools
       | on them, using C with 20-bit words and 10-bit bytes.
       | 
       | Man, it was a pain in the ass.
        
         | larsbrinkhoff wrote:
         | The Unix machine was called C70, right?
        
         | googamooga wrote:
         | I have a working PDP-11/23 in my possession, and for the last
         | couple of months I've been trying to convince myself that I have
         | enough soldering skills to solder four additional memory lanes
         | on its backplane to increase the memory limit from 256KB to
         | 4096KB. Otherwise, even if I install a processor able to work
         | with 22-bit addresses, only 18-bit addressing will be available.
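
The 256KB and 4096KB figures above follow directly from the address widths, assuming byte addressing. A minimal sketch of the arithmetic (the function name is made up for illustration):

```python
def addressable_kib(address_bits):
    """KiB reachable with byte addressing and the given number of address bits."""
    return (1 << address_bits) // 1024


print(addressable_kib(18))  # 256
print(addressable_kib(22))  # 4096
```

Each extra address line doubles the reachable memory, so four more lines give a factor of 16.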
        
       | IshKebab wrote:
       | > The answer to why you would still want to build an FPGA system
       | is (and always has been) speed.
       | 
       | > So I quickly gave up on creating something that could only
       | exist on my FPGA board
       | 
       | I've been doing some FPGA stuff and I think that's the wrong way
       | to look at it. Yes FPGAs are often useful when you need raw speed
       | but that's not the only advantage over CPUs. You also get
       | extremely low latency and direct control of IO pins. With
       | software you are limited to the existing hardware peripherals,
       | but with an FPGA you can make your own!
        
         | horsawlarway wrote:
         | I agree with you.
         | 
         | I had a brief stint in hardware design, and an FPGA is almost
         | always going to be worse than dedicated hardware for a task,
         | but it's extraordinarily flexible.
         | 
         | Most workflows I saw - you design the hardware on the FPGA
         | (hugely useful for quickly testing and prototyping) then you
         | outsource and actually build a custom chip if you really want
         | speed.
         | 
         | It's also a great polyfill tool - since it can take the place
         | of a lot of other hardware peripherals at a moment's notice.
        
       | onion2k wrote:
       | I took my Macbook to pieces and there were lots more than 64
       | bits.
        
         | errcorrectcode wrote:
         | 2^(>64) combinations to put it back together. I find the empty
         | set plus a Hackintosh seems to have fewer bits but work much
         | faster. It must be those repairable qubits.
        
       | jacquesm wrote:
       | https://en.wikipedia.org/wiki/UNIVAC_1100/2200_series
       | 
       | Had a 36 bit word length resulting in a 9 bit 'byte'.
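
As an illustration of how a 36-bit word divides into four 9-bit fields, here is a small sketch. The field order (most significant first) and the helper names are assumptions made for the example, not a description of the actual UNIVAC instruction set.

```python
WORD_BITS = 36
BYTE_BITS = 9
BYTE_MASK = (1 << BYTE_BITS) - 1  # 0o777, i.e. nine 1-bits


def split_word(word):
    """Split a 36-bit word into four 9-bit fields, most significant first."""
    if not 0 <= word < (1 << WORD_BITS):
        raise ValueError("word does not fit in 36 bits")
    shifts = range(WORD_BITS - BYTE_BITS, -1, -BYTE_BITS)  # 27, 18, 9, 0
    return [(word >> shift) & BYTE_MASK for shift in shifts]


def join_word(fields):
    """Reassemble four 9-bit fields (most significant first) into one word."""
    word = 0
    for field in fields:
        if not 0 <= field <= BYTE_MASK:
            raise ValueError("field does not fit in 9 bits")
        word = (word << BYTE_BITS) | field
    return word


word = 0o123456701234  # twelve octal digits = 36 bits
print([oct(field) for field in split_word(word)])
# ['0o123', '0o456', '0o701', '0o234']
```

Because nine bits are exactly three octal digits, each field lines up with a group of three digits in the octal notation of the word.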
        
         | uvesten wrote:
         | I went way down the rabbit hole on this one. Seems that they
         | are still made and used, fascinating!
        
           | jacquesm wrote:
           | They are pretty impressive machines. The loadable microcode
           | store is especially interesting, they allow you to emulate an
           | arbitrary CPU. Diagnostics in 'IBM' mode was a real
           | possibility on these!
        
         | malkia wrote:
         | I was coming to mention this, though I think my memory goes
         | back to some LISP machine (and was related to car/cdr and
         | related encoding if I'm not mistaken)
        
         | klodolph wrote:
         | 36-bit was a common enough word length. Not just UNIVAC, but
         | IBM 7090, PDP-6/PDP-10, and some others. Convenient both for
         | octal (multiple of 3 bits) and working with pre-ASCII, 6-bit
         | character encodings (multiple of 6 bits).
         | 
         | Which is why we have UTF-9 and UTF-18, as defined in RFC 4042.
         | 
         | https://datatracker.ietf.org/doc/html/rfc4042
         | 
         | (Spoiler: It's an April Fool's joke.)
        
       | thehappypm wrote:
       | Unbelievably tangential but your dog is very cute and I want to
       | see more pictures!
        
         | mad_ned wrote:
         | Wish granted! @the.bessie.report on Instagram
        
           | errcorrectcode wrote:
           | Hey, we don't need any Aladdin djinns showing off their
           | magical puppers here. Definitely against guidelines and
           | regulations. ;)
        
           | thehappypm wrote:
           | I love this account! My dog also is a great pup, with some
           | behaviors we're working on, like barking and reactivity. From
           | the scenery it looks like central MA if I had to guess too!
        
       | bencollier49 wrote:
       | Related question - is there an FPGA simulator / designer which
       | works on OS X?
        
         | jecel wrote:
         | This one is written in Java:
         | 
         | https://github.com/hneemann/Digital
         | 
         | You can export your project as a Verilog file that can be used
         | in the various FPGA tools.
        
       | einpoklum wrote:
       | > I decided to build a nine-bit computer
       | 
       | Somehow I was sure that sentence was going to end with "... in
       | MineCraft!"
        
         | IncRnd wrote:
         | Here is a computer created in MineCraft! Wow, it's actually
         | v5.0. The intro starts at 1:22. [1]
         | 
         | [1] https://www.youtube.com/watch?v=SbO0tqH8f5I
        
           | einpoklum wrote:
           | 8-bit... amateurs :-)
        
       | krallja wrote:
       | I hope your address bus is three nonads wide (3^3 bits = 128MiB
       | address space).
       | 
       | It would be thematically even better if you used ternary logic,
       | but I'm not sure that FPGA can handle more than two voltage
       | levels.
        
         | tyingq wrote:
         | >I'm not sure that FPGA can handle more than two voltage levels
         | 
         | There is a high-Z (high impedance) state you can set I/O pins
         | to for a third state, but no way to detect that high impedance
         | state from the FPGA. It's just used to share an output line
         | with more than one pin. You could make a peripheral that could
         | detect the three states though, with a voltage divider and an
         | analog input.
        
         | mad_ned wrote:
         | haha yes, excellent suggestions! I did think about ternary
         | logic actually but I don't know of an FPGA that supports it. I
         | considered creating like a primitive that burns 2 register bits
         | to approximate it even, and just throw away the 4th state and
         | pretend I have 3-state logic on all the layers above. But I
         | have enough on my hands just trying to get the stupid timing
         | working on a simple CPU. I'm not actually a CPU designer so I
         | don't really know what I'm doing lol.
        
           | buescher wrote:
           | This is a fantastic hobby project. Have you thought about
           | doing something with the "extra" bit along the lines of
           | tagging bytes for type or garbage collection or whatever like
           | the lisp machines?
        
           | enriquto wrote:
           | Throwing away 25% of your bits sounds wasteful... what you
           | need is a moderately large power of 2 that is very close to a
           | power of 3. These can be found by computing the continued
           | fraction of log(3)/log(2). The sequence of convergents starts
           | thus 2/1, 3/2, 5/3, 8/5, 11/7, 19/12, 46/29, 65/41, 84/53.
           | Some good choices seem to be 2^8-3^5=13 (loses 5%) or
           | 2^46≈3^29 (loses 2.5%).
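
The loss figures quoted above can be checked with a few lines of integer arithmetic. This is only a sketch of the packing cost, with made-up helper names; it does not compute the continued fraction itself.

```python
def min_bits(trits):
    """Smallest number of bits whose 2**bits states can hold 3**trits values."""
    return (3 ** trits - 1).bit_length()


def wasted_fraction(bits, trits):
    """Fraction of the 2**bits bit patterns left unused when storing `trits` trits."""
    capacity, needed = 2 ** bits, 3 ** trits
    if needed > capacity:
        raise ValueError("not enough bits for that many trits")
    return (capacity - needed) / capacity


for trits in (1, 5, 29):
    bits = min_bits(trits)
    print(f"{trits} trits in {bits} bits: {wasted_fraction(bits, trits):.1%} unused")
```

One trit in two bits leaves 25% of the patterns unused, five trits in eight bits leaves 13 of 256 (about 5%), and 29 trits in 46 bits leaves about 2.5%.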
        
             | amelius wrote:
             | You can also detect Z state by driving the input high,
             | reading, then driving the input low, reading. If both reads
             | are different, then you have a Z state. Otherwise, the
             | input is the read state.
             | 
             | Of course, drive the input through a resistor.
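
The two-read trick above can be modelled in software. This is a pure simulation under simplifying assumptions: the external line is either strongly driven high ("H"), strongly driven low ("L"), or floating ("Z"), and a weak drive through the resistor only wins when the line floats. The names `read_pin` and `detect_state` are hypothetical.

```python
def read_pin(external, weak_drive):
    """Simulated pin read: a strong external driver wins, a floating line follows the weak drive."""
    if external == "Z":
        return weak_drive
    return 1 if external == "H" else 0


def detect_state(external):
    """Classify the line as 'H', 'L' or 'Z' from two reads with opposite weak drives."""
    read_with_pull_high = read_pin(external, weak_drive=1)
    read_with_pull_low = read_pin(external, weak_drive=0)
    if read_with_pull_high != read_with_pull_low:
        return "Z"  # the line followed our weak drive, so nothing else is driving it
    return "H" if read_with_pull_high else "L"


print([detect_state(state) for state in ("H", "L", "Z")])  # ['H', 'L', 'Z']
```

On real hardware the two reads happen at different times, so this only works if the external state is stable across both of them.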
        
       | vient wrote:
       | See also cLEMENCy 9-bit middle-endian (sic) arch from DEF CON CTF
       | 2017
       | 
       | https://2017.notmalware.ru/89dc90a0ffc5dd90ea68a7aece686544/...
       | (link from
       | https://blog.legitbs.net/2017/07/the-clemency-architecture.h...)
        
         | nneonneo wrote:
         | Ah, I have fond memories of hacking on that architecture for
         | DEF CON. We wrote a lot of tools for it: by the end (less than
         | 3 days after getting the spec), we had disassemblers,
         | debuggers, binary rewriters, and even rudimentary decompilation
         | support. It was quite a fun journey :)
        
       | sillyquiet wrote:
       | Regarding the interesting bit (to me) in there about the
       | advantages of FPGAs over an SBC like the Pi (speed): does anybody
       | know of any blogs or projects where an FPGA's speed _helped_ in a
       | hobby project where software running on an SBC wasn't fast
       | enough? I can _imagine_ a few, mostly real-time projects
       | involving expensive computations (image or pattern recognition
       | maybe?), but I would love to see some concrete examples.
        
         | undersuit wrote:
         | There's a growing community using an Altera Cyclone SBC to
         | create faithful recreations of retro gaming machines. Software
         | emulation by the similarly sized Raspberry Pi limits you, and the
         | MiSTer is much more compact than a desktop computer that does
         | have the power for accurate software emulation.
         | 
         | https://www.retrorgb.com/mister.html
        
         | PragmaticPulp wrote:
         | Basically anything with significant real-time requirements or
         | high bandwidth requires an external FPGA or microcontroller.
         | 
         | Embedded Linux is great, but if you're trying to do something
         | like read from a high-speed ADC then the only way to do it is
         | with an FPGA. The FPGA reads from the ADC at precise intervals
         | and buffers the data. The embedded Linux system can then
         | periodically read the buffer with all of the jitter and
         | latencies that come with using Linux.
         | 
         | Virtually every Linux-based software defined radio,
         | oscilloscope, and logic analyzer works on this architecture. For
         | lower speeds you can get away with a microcontroller running
         | bare metal code to do the buffering, but the high speed stuff
         | enters the domain of FPGAs.
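
The buffering arrangement described above can be sketched as a toy simulation: a fixed-depth FIFO gets exactly one sample per tick (the hardware side) and is drained whenever the software side happens to run. The function name, the tick model, and the drop-when-full policy are all assumptions made for illustration.

```python
from collections import deque


def simulate(total_ticks, read_ticks, depth):
    """Return (samples received by software, number of samples dropped)."""
    fifo = deque()
    received, dropped = [], 0
    read_ticks = set(read_ticks)
    for tick in range(total_ticks):
        # Hardware side: one sample arrives every tick, exactly on schedule.
        if len(fifo) < depth:
            fifo.append(tick)
        else:
            dropped += 1  # buffer full, the sample is lost
        # Software side: drain the whole buffer whenever it gets scheduled.
        if tick in read_ticks:
            received.extend(fifo)
            fifo.clear()
    return received, dropped


jittery_reads = [3, 4, 12, 19]  # irregular wake-ups, longest gap is 8 ticks
print(simulate(20, jittery_reads, depth=8)[1])  # 0 dropped
print(simulate(20, jittery_reads, depth=4)[1])  # 7 dropped
```

In this model no samples are lost as long as the buffer is at least as deep as the longest gap between reads, which is why the reader's jitter stops mattering.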
        
           | andai wrote:
           | Noob question, in this instance would a realtime OS or a
           | unikernel also solve the problem?
        
             | emteycz wrote:
             | The problem is, machine code is not the lowest level, there
             | is also processor microcode. Machine code doesn't give you
             | a hard real-time guarantee, the execution is still too
             | approximate. FPGA enables you to work on/below the
             | microcode level.
        
             | tyingq wrote:
             | It's better, but still not the same level of timing
             | guarantees. I suppose, left to right, you would have
             | something like:
             | 
             | SBC/Linux -> SBC/Real-time OS -> General Purpose MCU ->
             | Specialized MCU (Parallax Propeller, for example) ->
             | FPGA/CPLD/DSP
             | 
             | With perhaps some additions to the diagram to account for
             | bit-banging vs actual drivers, speeds where some portion of
             | the left side just isn't fast enough to even kind-of work,
             | slow clock MCUs vs fast clock MCUs, etc.
        
           | coryrc wrote:
           | > read from a high-speed ADC
           | 
           | You just have the peripheral DMA and flag/interrupt when
           | done. If you need an "immediate" reaction you use a DSP.
           | There are only so many useful calculations you can do with a
           | single input stream and DSP can handle them.
        
         | FredFS456 wrote:
         | RF/radio is one solid application. Can't really do the signal
         | processing fast enough on an SBC.
        
           | carlsonmark wrote:
           | Not just the signal processing on the received data, but if
           | you want to transmit something, you will probably be using
           | one or more DDS channels to do so. Those may be in the FPGA,
           | or external chips. Either way, if you are mixing the outputs
           | of the DDS, being off by a single clock cycle can cause your
           | transmitted data to be complete garbage.
           | 
           | With an FPGA and external DDS chips, this is difficult to do
           | just because of mismatches in PCB trace lengths and/or small
           | temperature fluctuations. With a microcontroller, it is
           | nearly impossible to do even when using DMA because of memory
           | bus contention.
        
           | sillyquiet wrote:
           | Ohh good one. mmRadar projects I guess will fall into this
           | category too.
        
         | IshKebab wrote:
         | I think you won't find many because an FPGA that is as fast at
         | computation as a Raspberry Pi will be thousands of pounds. The
         | real advantage is latency and low-level control.
        
           | tyingq wrote:
           | There are cases where an FPGA is used to make a faster CPU.
           | Old CPUs, of course, but it's still a pretty active niche.
           | There are soft cores for Z80s, 6502, and other old CPUs that
           | run circles around the real hardware.
        
         | coryrc wrote:
         | Only useful if a microcontroller peripheral doesn't already
         | exist for the thing you want to do and you have some sub-
         | millisecond latency requirements. If a calculation can be
         | vectorized a CPU or GPU is really fast.
         | 
         | At normal speeds, an image can take 10-15 ms to clock out of
         | the sensor. At that point, there's little reason not to run
         | your image processing on a $3 CPU rather than $$$ FPGA because
         | what's another < 30ms at that point and what would need a
         | reaction that quick anyway?
        
         | al2o3cr wrote:
         | This is a commercial product so it's not _quite_ what you asked
         | for, but the production volume is pretty low (100s) and the
         | implementation is literally an FPGA dev board mounted to the
         | back of the interface panel.
         | 
         | https://intellijel.com/shop/eurorack/cylonix-rainmaker/
         | 
         | In this module, the FPGA's ability to do LOTS of computations
         | in parallel is used to produce 16 taps of pitch-shiftable delay
         | along with a 64-tap comb filter.
        
         | tyingq wrote:
         | Driving LED matrix displays is a good example, since they
         | require good adherence to timing on the output signal.
         | Especially at high refresh rates. There's lots of hobby
         | projects that get away with just using the CPU, but you're
         | throwing a lot of horsepower at something a cheap CPLD could
         | handle fine. There's also solutions like using the "PRU" in a
         | Beaglebone to drive the display...the PRU is basically a
         | microcontroller that can share memory with the CPU, but can
         | work in a more real-time fashion.
         | 
         | So it's not always raw speed, per se, but anything that's
         | sensitive to timing. Linux on a Pi can be busy doing something
         | else and miss a critical time to have output (or read)
         | something. An FPGA based solution is working with known
         | loop/io/etc times that don't change.
        
           | marktangotango wrote:
           | That's interesting, looks like the PRU is built into the
           | AM3358 SOC, is that correct?
        
             | tyingq wrote:
             | Yes, I believe it's in all (most?) of the products within
             | the "Sitara" line, or at least AM33XX models.
             | 
             | Lowrisc.org also has a similar plan for what they call
             | "minion cores" in their RISCV based product, whenever that
             | happens. Some NXP processors also have something called an
             | "eTPU" that seems similar.
        
       ___________________________________________________________________
       (page generated 2022-02-09 23:00 UTC)