[HN Gopher] Inside the HP Nanoprocessor
___________________________________________________________________
Inside the HP Nanoprocessor
Author : parsecs
Score : 85 points
Date : 2020-09-02 16:37 UTC (6 hours ago)
(HTM) web link (www.righto.com)
(TXT) w3m dump (www.righto.com)
| jecel wrote:
| The masks show how critical alignment is in metal gate
| transistors. The green, magenta and light blue have to just
| touch. Too much overlap or too far apart and you don't have a
| working transistor.
|
| With polysilicon gates the equivalent of the green would be one
| big rectangle, but since it would come after the gate (instead of
| being the first step like here) it would actually become two
| separate rectangles just touching the gate on each side.
| EvanAnderson wrote:
| I particularly liked the description of the HP clock module
| referenced in the article:
|
| >The design of the clock module was rather unusual. To preserve
| the time when the computer was powered-down, the clock module was
| built around a digital watch chip with a backup battery.
| Inconveniently, the digital watch chip wasn't designed for
| computer control: it generated 7-segment signals to drive an LED,
| and it was set through three buttons. To read the time, the
| Nanoprocessor had to convert the 7-segment display outputs back
| into digits. And to set the time, the Nanoprocessor had to
| simulate the right sequence of button presses to advance through
| the digits.
|
| That's quite the convoluted bit of interfacing, but no doubt
| using the off-the-shelf digital watch chip made it a "win". It's
| pleasingly Rube Goldberg.
| hinkley wrote:
| I just broke someone's brain by relating this fact to them.
|
| Layers of abstraction, even in the hardware.
| djmips wrote:
| Very much so, it reminded me of when you see some 'clever' code
| working around some legacy APIs.
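As a rough illustration of the interfacing EvanAnderson quotes above, converting 7-segment outputs back into digits is essentially a table lookup. The sketch below is Python, not Nanoprocessor code. The segment bit order (a in bit 0 through g in bit 6), the exact glyph shapes for 6 and 9, and the names `DIGIT_SEGMENTS`, `encode` and `decode_digit` are assumptions for illustration; none of them come from the HP firmware or the watch chip's datasheet.

```python
# Common 7-segment glyphs, one letter per lit segment (a = top, b = upper
# right, c = lower right, d = bottom, e = lower left, f = upper left,
# g = middle). Some displays draw 6 and 9 without the tail segment; that
# variant is not handled here.
DIGIT_SEGMENTS = {
    0: "abcdef",
    1: "bc",
    2: "abdeg",
    3: "abcdg",
    4: "bcfg",
    5: "acdfg",
    6: "acdefg",
    7: "abc",
    8: "abcdefg",
    9: "abcdfg",
}


def encode(segments: str) -> int:
    """Pack segment letters into a 7-bit value (assumed order: a = bit 0)."""
    value = 0
    for letter in segments:
        value |= 1 << (ord(letter) - ord("a"))
    return value


# Reverse table: 7-bit segment pattern -> decimal digit.
SEGMENTS_TO_DIGIT = {encode(s): d for d, s in DIGIT_SEGMENTS.items()}


def decode_digit(pattern: int) -> int:
    """Return the digit shown by a 7-bit segment pattern.

    Raises ValueError for patterns that are not one of the ten digits,
    for example a blanked display.
    """
    try:
        return SEGMENTS_TO_DIGIT[pattern & 0x7F]
    except KeyError:
        raise ValueError(f"not a digit pattern: {pattern:#04x}") from None
```

On a processor without general-purpose RAM or an adder, the same idea would presumably be a chain of compares or a ROM table rather than a dictionary, but the mapping itself is the same.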
| jhallenworld wrote:
| I saw some projects in the 70s where they added a math
| coprocessor to a standard 8-bit CPU by interfacing to an
| off-the-shelf calculator chip, with all the same issues. I'm sure
| it would be slower, but maybe the physical size might be the same
| as the ROM chips (1702s?) that would be required for the
| floating point code.
| duskwuff wrote:
| Back in the 90s, Intel ran a print advertisement for the
| Intel 387 portraying their competitors' math coprocessors as
| pocket calculators:
|
| https://i.pinimg.com/originals/fd/1d/01/fd1d012149d9e7d67371...
|
| I guess there was something to that? :)
| kens wrote:
| That's an interesting ad, but a bit ironic given Intel's
| later Pentium floating point division bug.
|
| On the topic of using calculator chips as coprocessors,
| National Semiconductor introduced the MM57109 Number
| Cruncher Unit (that is the real name) in 1977. It was
| essentially a repackaged 12-digit scientific calculator
| chip, operating on binary-coded decimal values with values
| entered in Reverse Polish Notation. This chip was absurdly
| slow; a tangent, for instance, could take over a second.
|
| http://www.projects.scorchingbay.nz/dokuwiki/_media/electron...
| SomeoneFromCA wrote:
| Makes me think about https://en.wikipedia.org/wiki/Casio_F-91W,
| "the gitmo watch".
| kens wrote:
| Author here for all your Nanoprocessor questions. It's an unusual
| processor, lacking the ability to add or subtract. Even so, it
| was used in HP equipment, not just as a controller, but parsing
| strings and doing calculations.
| Zenst wrote:
| The whole aspect of each chip's voltage being so variable that
| they had to test them and hand-write the operating voltage,
| making any use of the chip down to matching that voltage -
| certainly making drop-in replacements interesting for repairs.
|
| Then the last number on the chip to indicate speed.
|
| All that hands-on work for each chip and selling for $15 at that
| time - makes you wonder how much they made on them with all
| that manual binning needed.
|
| Any idea on the margins back then for this chip?
| kens wrote:
| Since the chip was used in HP products, there wasn't a margin
| as such. Much of the benefit was that they weren't paying
| margin to another company.
|
| As for repairs, each product's service manual has a table
| specifying the correct resistor value for each Nanoprocessor
| bias voltage. So you'd need to change the resistor if you
| replaced the processor.
| kencausey wrote:
| Keep in mind that $15 USD in 1974 is more similar to $80 USD
| today. So budget appropriately.
| gumby wrote:
| Did they sell it on the open market or was it an in-house
| device?
| kens wrote:
| It was an in-house device, not something HP sold as a
| product.
| gumby wrote:
| Thanks, that's what I had assumed but read some comments
| from people who thought otherwise...
| formerly_proven wrote:
| Thanks for doing this, I remember stumbling over this mystery
| controller in some piece of HP equipment I bought and at the
| time there was basically zero information about these around.
| monocasa wrote:
| The processor was covered recently here as well.
| https://news.ycombinator.com/item?id=24109437
|
| One neat aspect is it was intended to allow the use of an
| off-chip, MMIO ALU if the design required it (and was still faster
| than a 6502 even with the separate ALU).
| kens wrote:
| Yes, the HP voltmeter used two 74LS181 ALU chips so it could do
| error and scaling calculations.
|
| The ALU was accessed through four I/O ports: two for the
| arguments, one for the operation and carry-in, and one to read
| the result. It wasn't memory-mapped, but I/O mapped since the
| Nanoprocessor didn't have memory operations (except reading
| instructions from ROM).
|
| Instead of memory-mapped I/O, the Nanoprocessor had I/O-mapped
| memory.
| The real time clock module had 256 bytes of RAM that
| were accessed through I/O ports.
| monocasa wrote:
| What's the distinction you're making between MMIO and I/O
| mapped? That it only has absolute addressing? Or that it just
| calls it I/O?
| kens wrote:
| Memory and I/O were separate spaces with separate pins and
| separate operations. The Nanoprocessor had 11 address lines
| for reading instructions from a 2K ROM. It had 4 I/O device
| select lines for accessing 15 I/O devices.
|
| So if you added RAM (as in the real time clock), the RAM
| was accessed through I/O instructions. You'd write the
| address to one port and read the data through another port.
| It ended up looking a lot like microcode, with memory
| accesses split into two pieces.
| [deleted]
| mmastrac wrote:
| @kens: small typo in the article:
|
| "lacking even a mentioned on Wikipedia"
| kens wrote:
| Thanks, fixed.
| pugworthy wrote:
| So when are you going to write the Wikipedia page, and
| correct the article again? :)
| SomeoneFromCA wrote:
| The earliest AVRs (the family of MCUs used in Arduino) had no RAM
| either, only 32 8-bit regs. One of these was AT90S1200. AFAIK it
| had higher max clock frequency than AT90S2313, the one with SRAM.
| gumby wrote:
| It doesn't have an ALU but can do other critical arithmetic,
| notably increment/decrement and, crucially, indexing in the
| addressing unit. Also bit manipulation. So for a state machine
| that's mostly lookup tables it's not worth building an ALU.
|
| I was surprised by the two-instruction skip -- skip was still
| pretty common in those days, but I haven't seen two before. I
| suppose it would be useful for setting a flag before branching,
| but I wonder how valuable it was in the end.
| kens wrote:
| The two-byte skip was typically used to skip over a jump
| instruction, giving you a conditional branch. But in many
| cases, two instructions were enough to implement the
| conditional case.
|
| The two-instruction skip could also be used in tricky ways to
| implement two entry points to a function. E.g.
|
|     Entry 1: Set Accumulator bit 1
|              If accumulator bit 1 set, skip two instructions
|     Entry 2: Set something different for entry 2
|              More setup for entry 2
|              Code continues for both entry 1 and entry 2
| trasz wrote:
| It's much later, but Arm's "IT (If-then) makes up to four
| following instructions conditional (known as the IT block). The
| conditions can all be the same, or some can be the logical
| inverse of others. IT is a pseudo-instruction in ARM state."
| aidenn0 wrote:
| Note that it has sufficient instructions to emulate addition
| and subtraction, since it has compare and decrement/increment.
| Would take O(N) instructions to add or subtract by N.
| kens wrote:
| This is the algorithm the HP clock module uses to combine two
| BCD digits into one byte. It adds the two values by
| incrementing one and decrementing the other in a loop. Since the
| BCD digit is at most 9, this is fairly quick.
|
| I think you could implement a faster addition algorithm by
| testing the high order bit of the arguments, incrementing the
| result as needed, and then shifting. Repeating this 8 times
| should give you the sum, compared with up to 255 steps for the
| simple algorithm.
| DudeInBasement wrote:
| Teacher: you'll never not need addition and subtraction.
|
| HP: hold my -2 voltage
| djmips wrote:
| The resistor compensating the manufacturing process differences
| reminds me of when I worked on the 3DFx Voodoo and there was a
| chain of transistors that sat inline with the clock but you could
| select which output would be sent to the remote TMUs which were
| clocked by this line. Code in the start up would draw textured
| test patterns and examine the frame buffer to adjust the clock
| timing by nanoseconds using the chain of transistors. This was
| actually necessary because of variances in the manufacturing.
| When 3DFx switched to a completely new chip maker our boards
| failed and we had to fix our startup code because it didn't have
| enough margin. Thankfully there were more transistors in the
| chain we weren't using before. Crisis averted. The reason our
| boards were more susceptible than the reference design is that we
| had one of our TMUs slightly further away from the FBI.
| kens wrote:
| It's interesting to hear that the 3DFx adjusted the clock that
| way. Coincidentally, I was just reading about similar clock
| adjustment in the Pentium II and 4. They had "adaptive
| deskewing", where a phase comparator would adjust the clock
| delay as needed. It sounds like 3DFx did the adjustment at
| startup, but the Pentium did it during use so it could
| compensate for temperature drift. The Itanium 2 had similar
| deskewing, except the value was set during manufacturing by
| blowing fuses.
|
| Source: "CMOS VLSI Design", page 806.
| ChuckNorris89 wrote:
| IIRC Intel does similar but way more advanced automatic
| deskewing black magic in the Thunderbolt controllers. That's
| how they can carry high speed PCIe signals so effortlessly
| across your average copper cable (it was originally supposed
| to be optical).
| p_l wrote:
| That's a somewhat standard part of transceivers, with the
| addition that PCI-E lower layers implement forced skew
| themselves - even on the motherboard.
|
| While I don't exactly know the case with Thunderbolt,
| "normal" DisplayPort uses PCI-E physical layer, just
| unidirectional and with different protocol on top.
|
| On networking equipment, the necessary signal corrections
| are part of why (other than DRM) it's more expensive to use
| full transceivers that accept cables, vs. fixed-length
| cables with fixed transceivers vs. direct-attach cables
| which have minimal logic for signal quality.
| foobiekr wrote:
| This is an amazing story. Thank you for sharing it.
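The startup calibration djmips describes in the sub-thread above can be pictured as a sweep over the delay taps: try each tap, run a test, and settle on a tap that leaves margin on both sides. The Python sketch below only illustrates that idea. The thread does not say how the real Voodoo startup code chose its tap, so picking the middle of the longest passing run is an assumption, as are the names `pick_delay_tap`, `calibrate` and `test_tap`.

```python
from typing import Callable, Sequence


def pick_delay_tap(passes: Sequence[bool]) -> int:
    """Return the tap index in the middle of the longest run of passing taps.

    passes[i] is True if the test pattern came back correct with tap i
    selected. Choosing the middle of the run is an assumed policy meant
    to leave timing margin on both sides.
    """
    best_start, best_len = -1, 0
    start = None
    # A trailing False acts as a sentinel that closes a run ending at the
    # last tap.
    for i, ok in enumerate(list(passes) + [False]):
        if ok and start is None:
            start = i
        elif not ok and start is not None:
            if i - start > best_len:
                best_start, best_len = start, i - start
            start = None
    if best_len == 0:
        raise ValueError("no delay tap passed the test pattern")
    return best_start + (best_len - 1) // 2


def calibrate(num_taps: int, test_tap: Callable[[int], bool]) -> int:
    """Try every tap with a caller-supplied test and pick one with margin.

    test_tap is a hypothetical hook standing in for "select this tap, draw
    a textured test pattern, and check the frame buffer".
    """
    results = [test_tap(tap) for tap in range(num_taps)]
    return pick_delay_tap(results)
```

A policy like this also shows why a process change could break a board: if the passing window shifts toward taps the sweep never tries, the search has to be widened, which matches the "more transistors in the chain we weren't using before" part of the story.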
___________________________________________________________________ (page generated 2020-09-02 23:00 UTC)