[HN Gopher] MiSTer, an open-source FPGA gaming project ___________________________________________________________________ Author : tediousdemise Score : 189 points Date : 2021-04-11 17:59 UTC (5 hours ago) web link (github.com) | fooblat wrote: | I love my MiSTer build! | | The USB controllers made for the recent mini systems (NES, SNES, | Genesis, etc.) make great accessories for the MiSTer. Add a couple | of USB arcade sticks and you can really play almost any classic | retro game as it was meant to be played. | | And then there are all the classic computer cores, even including | the PDP-1! | vardump wrote: | Yup, same! Can wholeheartedly recommend it for those who want | something between emulation and real hardware. | | The Amiga core is fun: AGA, 2 MB chip, 384 MB fast. It supports | hard disk images, so you can do a hard disk based Workbench | installation and load games and demos practically instantly | (and safely exit to Workbench) using WHDLoad. | | Arcade cores are fun as well. Just like in childhood, but less | hungry for quarters. :) Recently played with the arcade Gauntlet | core a bit for example. | xigency wrote: | Interesting. | | I have a couple of FPGA boards on their way to me in the mail | which I intend to use for some homebrew video game projects. | Besides getting the development environment working, it can be | tricky outputting video from an FPGA because of the precise | timing involved. I will have to look through these resources to | see if there are any good tricks to use here. | phendrenad2 wrote: | MiSTer is an amazing phenomenon. The MiSTer itself is just an | Intel FPGA devkit, which many believe to be sold at a loss | (because it's a training tool and not Intel's main source of FPGA | revenue). The amazing thing is the aftermarket for addons.
There | are many possible combinations of addon boards that add RAM with | deterministic latency, USB hubs, cooling fans, cases, retro | controller ports, etc. All custom-made for this ecosystem. | tinybear1 wrote: | It is definitely being sold at a loss, the Cyclone V SoC being | used costs more than the entire development board.[0] I wonder | if Intel will ever take notice due to MiSTer's growing | popularity and quit subsidizing the board. | | [0] | https://www.digikey.com/en/products/detail/intel/5CSEBA6U23I... | | Edit: it was erroneous of me to state the board was being sold | at a loss; rather, I meant that the board was definitely | being subsidized by companies such as Intel and their partners | such as Panasonic. My mistake. I also wasn't meaning to convey | that the consumer DigiKey pricing was the same as that paid by large | volume manufacturers such as Terasic. Rather I meant to | demonstrate and agree with the OP on the astounding situation | that MiSTer currently exists in, owing to the lack of economic | viability for someone to produce a low volume commercial FPGA | emulation machine for a niche audience without any | subsidization. | tverbeure wrote: | There is absolutely no way they're sold at a loss. Your | DigiKey price of $245 proves this, because a factor of 10 is | a good starting point as a ratio between volume and one-off | DigiKey pricing of any type of complex silicon. | | A better way to approach this is as follows: what's the die | size of an FPGA like this? What's the production cost of the | die? Then check the historic gross margin percentage of FPGA | companies. Xilinx is around 68%, and that includes high-end | products which carry the highest markups, unlike this cookie | cutter thing. | | That should give you a good ballpark number. | | DigiKey charges what they do because nobody else is willing | to sell these things in low volume, and they have very high | inventory costs.
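As a rough illustration of the ballpark reasoning in the comment above, here is a minimal Python sketch. The $245 DigiKey price, the factor of 10, and the 68% gross margin come from the thread; applying the gross margin to the estimated volume price is an assumption made here for illustration, and the function names are hypothetical.

```python
# Ballpark sketch of the pricing argument above. The rules of thumb are the
# commenter's; applying the margin to the volume price is an assumption.

DIGIKEY_UNIT_PRICE = 245.0   # one-off DigiKey price quoted in the thread (USD)
VOLUME_RATIO = 10.0          # "factor of 10" between one-off and volume pricing
GROSS_MARGIN = 0.68          # approximate historic Xilinx gross margin (thread)


def estimate_volume_price(unit_price: float, ratio: float) -> float:
    """Estimate the per-chip price a volume buyer might pay."""
    return unit_price / ratio


def estimate_production_cost(selling_price: float, gross_margin: float) -> float:
    """Back out the cost of goods from a selling price and a gross margin."""
    return selling_price * (1.0 - gross_margin)


volume_price = estimate_volume_price(DIGIKEY_UNIT_PRICE, VOLUME_RATIO)
production_cost = estimate_production_cost(volume_price, GROSS_MARGIN)

print(f"Estimated volume price:    ${volume_price:.2f}")
print(f"Estimated production cost: ${production_cost:.2f}")
```

Under these assumptions the chip would cost a volume buyer roughly a tenth of the one-off price, which is the point the commenter makes about not reading distributor prices as board costs. The numbers are only as good as the rules of thumb behind them.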
| andrewcchen wrote: | DigiKey pricing is not indicative of actual volume pricing, | especially for FPGAs where they are often many times | overpriced when buying from distributors. I doubt the board | is sold at a loss, probably sold at a small profit, not | that it's really significant for a low volume dev board. | tverbeure wrote: | The price of a DE10-Nano is $135 ($115 for academic use). | | Anyone who thinks that Terasic sells these at a loss doesn't | have a clue about volume pricing of FPGAs. And as a special | Intel partner, there's little doubt that Terasic has access to | this kind of pricing. | Jorge1o1 wrote: | I don't really know much about game emulation so I was curious | about what differentiates this FPGA game project vs traditional | CPU emulation. | | From their GitHub page [1]: | | > Traditional emulators on CPUs execute code sequentially. This is | a tricky method of emulation because real hardware has many chips | and all of them work in parallel...This requires a lot of CPU | power to emulate even an old and slow retro computer. Sometimes | even a modern CPU working at 100 times the speed of the retro | computer is not enough, so the emulator has to use approximation, | skip emulation of some less important parts, or assume some | standard work of the emulated system without extraordinary usage. | | > FPGA doesn't need high frequencies to emulate retro computers; | it works at much lower frequencies than traditional emulators | require. Since everything in FPGA works in parallel, it is no | problem to handle any possible usage of the emulated system. | | [1] https://github.com/MiSTer-devel/Main_MiSTer/wiki/Why-FPGA | | (Edited for formatting) | TillE wrote: | byuu wrote a good article about this, unfortunately it's no | longer available, but basically it should be self-evident that | there's nothing _inherently_ more accurate about hardware | emulation.
| | If you've actually decapped the original chips and duplicated | them exactly in an FPGA, that's pretty cool. But otherwise it's | just another approximation. The lower power requirements are | nice, of course. | zokier wrote: | I think the big differentiator is that it is easier to get | predictable latencies with an FPGA where you control almost | everything, compared to a general-purpose PC which is not | really that well optimized for hard real-time operation. So I | believe "race the beam" style things are more easily | accomplished with FPGAs, and also having tight audio-video | sync. Although the PC emulation scene has also been doing | some fairly incredible things. | valec wrote: | you can find it here https://archive.is/fWosI | emodendroket wrote: | That's true, but I think it's also true that you could trim a | bit more lag if you do it well. | tyingq wrote: | Quite a few of the FPGA soft cores related to 8 bit gaming | are reverse engineered from either schematics, decapped | chips, or both. Or they take pains to at least use the same | number of cycles for each instruction, etc. | mbalyuzi wrote: | Actually decapping the original chips is very much a thing. | See for example Chris Smith's work mapping out the innards of | the ZX Spectrum ULA - | http://www.zxdesign.info/book/insideULA.shtml . | near wrote: | It is, but these cores are almost exclusively not being | done that way. Not yet at least. I hope that they will be, | that would be really awesome. I paid $1200 last year for | the SNES PPUs to be decapped for this purpose, but it's a | truly enormous undertaking to map out those chips and then | recreate them in Verilog. You're talking thousands of hours | of work per chip. If anyone reading this is able to help | with that effort, please do let me know, we could really | use the help. | tediousdemise wrote: | By decapped, do you mean delidded?
| | Theoretically it would be possible to automate this with | a couple of things: | | - USB electron microscope to image the transistor | topology | | - CV lib to identify connections and generate | corresponding Verilog code | klodolph wrote: | "Decapping" is a more intense version of delidding where | you use chemical agents or something similarly extreme | (laser, plasma, milling) to remove the package (ceramic, | plastic). | | My understanding is that there are people who do it often | enough that it is automated in the way you describe, but | you still need someone with a lot of skill to spend | serious time on it. Computer vision works wonders but | there are errors which must be identified and fixed. | | A lot of the chips people care about can just be done | optically, no electron microscope needed. | tediousdemise wrote: | Ah, that's a good distinction. I'd be pretty scared of | damaging the hardware by doing that, but I'm sure there | are some really experienced folks out there that would | appreciate the hardware donation. | FPGAhacker wrote: | Not that this is necessarily helpful to you in the short | term, but it strikes me as a good problem for machine | learning (going from die pictures to transistor | schematic). | bcrl wrote: | That's exactly what's happening. There are loads of projects | going on right now decapping old chips and reverse | engineering them. From old CPUs like the 6502 to the Amiga | Alice chip. It's just a matter of time before most of the | retro systems are fully reverse engineered and documented. | tediousdemise wrote: | On FPGAs (depending on the hardware mapping), you get the | benefit of lower latency. I consider this to be timing | accuracy. | | Say you have two implementations of an LED controlled by a | switch: one which uses an FPGA and one which uses a | microcontroller.
The uC implementation must continuously poll | peripherals connected to its GPIO pins at a set frequency; it | must check the state of the switch, and then change the state | of the LED. The FPGA, on the other hand, _physically_ wires | the switch to the LED; there is no lag when the state of the | switch changes. | | The FPGA implementation can be scaled to connect however many | additional lights and switches you want (limited by the size | of the fabric), with zero overhead lag. This is the | parallelization benefit of FPGAs that you may hear about. For | the uC implementation, you must add additional switches and | lights to the polling loop, which brings down performance in | linear time, O(n). This is the drawback of sequential | processing. | klodolph wrote: | Most game consoles don't do any of this, though. The | gamepad is polled by software. | | On the NES and SNES, the buttons are connected to a shift | register (e.g. 4021). The CPU triggers a latch and then | reads out the shift register one bit at a time. | mikepurvis wrote: | This would be less about a user peripheral like the | gamepad (which is obviously going to be read out exactly | once per frame anyway) and more about getting subtle | interactions between the CPU, memory, and specialized | systems for graphics/audio correct. And not just correct | after thousands of hours of work to smoke out the exact | sources of specific title bugs, but correct essentially | for free. | | See for example the tale of an absolutely wild mGBA | investigation that was posted here a while ago: | | "What happens if an interrupt gets raised between | prefetch and the data load? Will it start prefetching the | interrupt vector before the invalid memory access? I | quickly mocked this up in mGBA, turned on interrupts in | the test ROM, and sure enough it broke out of the loop. | So I tried the same test ROM on hardware and...it did not | break out of the loop. So there goes that theory. 
| Eventually I realized something. You saw that asterisk | earlier I'm sure, so yes, there is one thing that can | happen in between prefetch and the memory access, but | only if the memory bus gets queried by something other | than the CPU between the prefetch and invalid memory | access." | | https://mgba.io/2020/01/25/infinite-loop-holy-grail/ | djmips wrote: | This was given as an example, not for you to straw man | about the gamepad. | rtkwe wrote: | There was a good article from Ars Technica a decade ago that | pointed out why you need so much more power to get perfect | emulation. To get exact emulation takes a lot of power because | there are a few games which use odd tricks that are hard to | document and precisely reimplement in software. FPGA emulation | gets around that by more directly emulating the hardware. | | https://arstechnica.com/gaming/2011/08/accuracy-takes-power-... | dang wrote: | Related thread from 2018: | | _MiSTer: Run Amiga, SNES, NES and Genesis on an FPGA_ - | https://news.ycombinator.com/item?id=18721594 - Dec 2018 (30 | comments) | hyperpl wrote: | I'd really like to see a portable/handheld leverage this | technology for on-the-go gaming. | craigjb wrote: | I built the Gameslab around this concept, but haven't worked on | it much lately. | | https://craigjb.com/2019/11/26/gameslab-overview/ | tediousdemise wrote: | The Analogue Pocket[1] is exactly this (albeit proprietary). | Out of the box it recreates GB, GBC, and GBA using the Altera | Cyclone-V platform. | | [1] https://www.analogue.co/pocket | drewblaisdell wrote: | I wonder, why is there no DIY Analogue Pocket-style MiSTer | project? Is the DE10-Nano too large or inefficient for this? | jamespo wrote: | The limited market is probably covered with Odroid Go / | GPD XD / RG350M etc. MiSTer leverages an off the shelf FPGA | board that would require a lot more work in a handheld | form.
| tediousdemise wrote: | I'd reckon it's the same reason that there isn't much of a | custom laptop scene. The open ended nature of stuffing a | screen, battery, and input peripherals into a chassis seems | an order of magnitude more difficult than just making a | headless box to plug into your TV. | | But with some effort, it would be awesome. | jonny_eh wrote: | Physical design is also a lot more important. Getting | "feel" just right is very hard and expensive, especially | when it comes to game controllers. | duskwuff wrote: | The DE10-Nano itself is a bit large for a handheld device, | and hasn't been optimized for power consumption. (It's | designed as a development board, not as a component of a | finished product.) There's nothing stopping someone from | using the Cyclone-V SoC in a handheld device, though. | GekkePrutser wrote: | This isn't really new, right? I've heard of this years ago. | | But it is an amazing project. Instead of emulating, they actually | rebuilt the old custom ICs (which 8-bit computers were full of) | in an FPGA. Really impressive. | jonny_eh wrote: | Old projects get reshared many times. It's always new to | someone. | tediousdemise wrote: | Yeah, it really is an amazing application for FPGAs--preserving | computing and gaming history. 
The list of cores available for | MiSTer is simply staggering: | | > Computers - Classic | | * Acorn Archimedes * Acorn Atom * Alice MC10 * Altair 8800 * | Amiga * Amstrad CPC 6128 * Amstrad PCW * ao486 (PC 486) * | Apogee * Apple I * Apple II+ * Apple Macintosh Plus * Aquarius | * Atari 800XL * Atari ST/STe * BBC Micro B,Master * BK0011M * | Color Computer 2, Dragon 32 * Commodore 16, Plus/4 * Commodore | 64, Ultimax * Commodore PET * Commodore VIC-20 * DEC PDP-1 * | EDSAC * Galaksija * Jupiter Ace * Laser 310 * MSX * MultiComp * | Orao * Oric 1 & Atmos * SAM Coupe * Sharp MZ Series * Sinclair | QL * Specialist/MX * TI-99/4A * TRS-80 Model 1 * TSConf * | Vector 06C * X68000 * ZX Spectrum * ZX Spectrum Next * ZX81 | | > Consoles - Classic | | * Astrocade * Atari 2600 * Atari 5200 * Atari Lynx * AY-3-8500 | * ColecoVision, SG-1000 * Gameboy, Gameboy Color * Gameboy | Advance * Genesis/Megadrive * SMS, Game Gear * MegaCD * NeoGeo | * NES * Odyssey2 * SNES * TurboGrafx 16 / PC Engine * Vectrex | | > Other Systems | | * Arduboy * Chess * CHIP-8 * Epoch Galaxy II * Flappy Bird * | Game of Life * TomyTronic Scramble | timbit42 wrote: | I'm still waiting for the KENBAK-1 core. | tediousdemise wrote: | Is there good documentation or ICDs out there that | adequately describe the architecture? Looks like there's | only 50 that were ever made, and only 14 believed to exist | today. | zokier wrote: | http://kenbakkit.com/manuals.html | | Seems pretty well documented. Considering the simplicity | of the computer, feels like it would be relatively easy | project to get to MiST | pomian wrote: | Interesting project would be to dig out some old cassettes | from, let's say, commodore 64. Try to load them into a | present day computer by patching wires/cables? - and see if | they run in this system. I remember writing for example: a | mining program, to calculate, overburden, volume and tonnage, | at different slopes, different rock types, etc. 
The science | behind the calculations is still valid, but we could likely | improve load times and calculation times. | near wrote: | It is indeed an amazing project, especially its open source | nature. It provides some impressive power savings and latency | reductions that are very hard to match with general purpose | CPUs. | | But in most cases, it is emulation, as the lead developer will | attest. | | https://github.com/MiSTer-devel/Main_MiSTer/wiki/Why-FPGA | | "From my point of view, if the FPGA code is based on the | circuitry of real hardware (along with the usual tweaks for | FPGA compatibility), then it should be called replication. | Anything else is emulation, since it uses different kinds of | approximation to meet the same objectives. Currently, it's hard | to find a core that can truly be called a replica - most cores | are based on more-or-less functional recreations rather than | true circuit recreation. The most widely used CPU cores - the | Z80 (T80) and MC68000 (TG68K) - are pure functional emulations, | not replications. So it's okay to call FPGA cores emulators, | unless they are proven to be replicas." | | But there's nothing wrong with emulation for preservation, | until we get to a point where we can wide-scale clone these | older chips down to the transistor level through analysis of | delayered decap scans. And even then, emulation will be useful | for artificial enhancements as well as for understanding how | all those transistors actually worked at a higher level. | | It's also not a total solution: by taking many more transistors | to programmatically simulate just one, it limits the maximum | scale and frequency of what it can support. N64/PS1/Saturn has | not yet been fully supported and is still theoretical, but | likely, to be possible. Going beyond that is not possible at | this time. | | Software emulation and FPGA devices should be seen as | complementary approaches, rather than competitive.
The | developers of each often work together, and new knowledge is | mutually beneficial. | floatboth wrote: | Well, yeah, it's not replication if it's not an exact | hardware replica, but the word "emulation" has very | "software" connotations. I guess let's call it.. recreation? | (That word is even in the quote above!) | someperson wrote: | "FPGA re-implementation" may be a better term | jamespo wrote: | So it's not perfect but it's better than emulators... | near wrote: | In latency and power usage, yes. In compatibility and | accuracy, no. Both are Turing complete, so there's nothing | you can do with one that you can't do with the other. | | If you take the SNES core, my software emulator has 100% | compatibility and no known bugs, and synchronizes all | components at the raw clock cycle level. It also mitigates | most of the latency concern through a technique known as | run-ahead. But it does require more power to do this. | stormbrew wrote: | I'm really curious where you got "better" out of the quoted | text. Because it's not there or implied, but people keep | reading this into anything about fpga recreations of chips. | There's nothing inherently better about doing emulation on | an fpga or a cpu, other than basically the amount of | electricity involved in doing it. | | But people keep presuming an improved accuracy that there's | no basis for. | emodendroket wrote: | Probably the marketing copy for Super NT and similar | products... harder to get people to part with hundreds of | dollars if your pitch is "lower power draw and reduced | input delay" | cmrdporcupine wrote: | Lower latency is definitely a thing. With FPGA it's | possible to 'chase the beam' like the original hardware, | and have much reduced input latency from devices, etc. | With an emulator you're going to be fighting the OS and | the frameworks you built on top of. 
Even if you go "bare | metal" (like my friend's BMC64 project which runs a C64 | emulator like a unikernel on the RPi with no OS) you are | still dealing with hardware built for usage patterns very | different from the classic systems. You're always going | to be one or more frames behind. | near wrote: | That is true. There are, however, techniques software | emulators can use like run-ahead that can get you lower | latency than even the original hardware on a PC: | https://near.sh/articles/input/run-ahead | | The caveat is that it doesn't _always_ work, and it makes | the power requirements even more unbalanced. Some might | also see it as a form of cheating to go below the | original game's latency. If you want to match the | original game's latency precisely, FPGAs are the way to | go right now for sure. | tediousdemise wrote: | Run-ahead seems pretty cool, great technical write-up. | How would you compare this to the feature called | frame-skipping that I often see implemented in software | emulators? | mschuster91 wrote: | > It's also not a total solution: by taking many more | transistors to programmatically simulate just one, it limits | the maximum scale and frequency of what it can support. | N64/PS1/Saturn has not yet been fully supported and is still | theoretical, but likely, to be possible. Going beyond that is | not possible at this time. | | The limiting factor here is the amount of stuff you can throw | into a single FPGA, correct? | | So in theory, shouldn't it be possible to tie a bunch of | FPGAs together, with two beefy ones being responsible for | replicating CPU / GPU functionality, a couple smaller ones | for sound and other "helper" processors, and some | bog-standard ARM SoC to provide the bitstreams to the FPGAs and | emulate storage (game cartridges, save cards) and input | elements (mainly "modern" controllers)? | near wrote: | There's both a cost and a speed barrier to it.
FPGAs are | often used to design, simulate, and test modern circuits at | sub-realtime speeds. No amount of FPGAs will get you a PS2 | emulator at playable speeds right now, let alone a | PS3/Switch emulator. PCs can do that today by taking | shortcuts such as dynamic recompilation and idle loop | skipping. | vardump wrote: | Hmm... looking at the frequencies and gate counts, I | think PS2 is well within realm of possibility to run on a | not-so-cheap FPGA (or several). But PS3 generation | consoles definitely not. | duskwuff wrote: | > The limiting factor here is the amount of stuff you can | throw into a single FPGA, correct? | | And the speed that you can get your design to run at. | Something like the Game Cube (PPC750 @ 485 MHz) would be | difficult to implement in an FPGA, for example. | GekkePrutser wrote: | Ah ok I wasn't aware of this. I thought it was spot on. | | And yeah I hope we can easily order small batches of ICs (at | big pitch of course) in a few years, in a similar way to how | creating PCBs has become so simple now. | | I mean I remember how much of a PITA it was in the 80s. | Drawing on overhead sheets. All the acids and other | chemicals. Drilling. And now we get super-accurate 10x10cm | boards dual-layer, drilled, soldermasked and silkscreened for | a buck a pop with a minimum of 10. Wow. I really hope this | trend continues down to the scale of ICs (or that FPGAs | simply get better/easier). | | By the way, emulating a CPU is pretty easy and very accurate | anyway. The big problem with accurate emulation is with some | of the peripheral ICs which used hard to emulate stuff like | analog sound generators. ___________________________________________________________________ (page generated 2021-04-11 23:00 UTC)