[HN Gopher] Nyuzi - An Experimental Open-Source FPGA GPGPU Proce...
___________________________________________________________________
Nyuzi - An Experimental Open-Source FPGA GPGPU Processor
Author : peter_d_sherman
Score  : 128 points
Date   : 2021-02-14 14:37 UTC (8 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| ourlordcaffeine wrote:
| Are there open-source OpenCL-to-FPGA compilers?
|
| If you're playing with FPGAs, you might as well directly compile
| the kernel into a circuit, rather than building a GPU on an FPGA
| and running your kernel on that.
|
| Proprietary solutions like the Altera OpenCL compiler exist.
| antman wrote:
| Layman here! I see a lot of posts on this subject lately, so I
| need to ask: can someone design a RAM chip?
| pjc50 wrote:
| Possibly, but why would you want a less efficient RAM chip that
| costs more compared to something that's a commodity you can
| buy?
| ComputerGuru wrote:
| There isn't really any magic sauce in RAM chip _design_. The
| production process (which is largely independent of the
| commonly discussed "node size" production for CPU/GPU-related
| tech) is where the magic happens.
| detaro wrote:
| What do you mean specifically by "design a RAM chip"?
| (Obviously RAM chips that you can buy are designed before they
| are made, so that's probably not what you are after?)
|
| FPGAs typically do contain dedicated RAM areas, because
| implementing it out of FPGA logic slices is terribly
| inefficient.
| tcherasaro wrote:
| FPGA designer here. Just wanted to point out that
| "efficiency" is highly context-sensitive in FPGA design.
| Everything is an area / speed / power trade-off. If you only
| need a RAM that is 8 bits wide and 64 words deep, then it
| might be very inefficient to waste a dedicated 18 kbit block
| RAM on it when it would fit better into 2 LUTs.
| This is why Xilinx, for one, provides pragmas such as
| RAM_STYLE to help guide synthesis:
|
| (* ram_style = "distributed" *) reg [data_size-1:0] myram [2**addr_size-1:0];
|
| block: Instructs the tool to infer RAMB type components.
|
| distributed: Instructs the tool to infer LUT RAMs.
|
| registers: Instructs the tool to infer registers instead of
| RAMs.
|
| ultra: Instructs the tool to use the UltraScale+(TM) URAM
| primitives.
|
| See: https://www.xilinx.com/support/documentation/sw_manuals/xili...
|
| edit: formatting*
| sitkack wrote:
| Yes, designing RAM is a lower-level operation compared to
| designing logic via an HDL, and needs to directly take the
| process (chemistry, optics, mechanics) of the fab into
| account.
|
| https://openram.soe.ucsc.edu/
| bserge wrote:
| DRAM chips are fascinating.
|
| Instead of going with the much more expensive SRAM, someone
| decided that refreshing billions of capacitors hundreds of
| times a second while performing read and write operations is
| an acceptable way of _storing_ data (even if only while
| powered).
|
| I wonder what the managers who first heard the idea must've
| thought :D
|
| And it works so well! It's probably one of the most reliable
| components in a computer.
| pjc50 wrote:
| Many of the early RAM systems were non-persistent (mercury
| delay lines, phosphor) and some were destructive-read (core
| memory).
|
| DRAM appears to have been invented by Dennard of Dennard
| Scaling: https://www.thoughtco.com/who-invented-the-intel-1103-dram-c...
| kleiba wrote:
| I'm a total layperson here, but my understanding is that
| designing a new processor is very challenging these days
| because of the patent situation. That is, so much in hardware
| design is patented that you're bound to run into problems if
| you don't know what you're doing.
|
| Is this true, and is it of relevance here?
| 10000truths wrote:
| Yes, IP cores are very expensive to license, if they're even
| available for licensing at all.
| This is part of the appeal of RISC-V - an open-spec,
| royalty-free processor architecture that is free of charge
| for chip designers to implement.
| lkcl wrote:
| unfortunately, if you make modifications and you want them to
| be "upstreamed" (using libre/open project terminology as an
| analogy) you cannot do that without participating in the
| RISC-V Foundation. you can implement APPROVED (Authorized)
| parts of the RISC-V specification. you cannot arbitrarily go
| changing it and still call it "RISC-V"; that's a Trademark
| violation.
| admax88q wrote:
| RISC-V is not an IP core, just an instruction set
| architecture.
|
| Any implementation of it has the exact same patent minefield
| to navigate as any other ISA. Most of the patents are around
| implementation techniques, not the instruction set.
| jecel wrote:
| The RISC-V instruction set was carefully designed not to
| require the use of any currently valid patents to do an
| implementation. It is up to each processor designer to not
| violate any patents in their project.
| lkcl wrote:
| this is unfortunately not true (that the RISC-V ISA was
| designed not to require currently-valid patents). people
| may _believe_ that to be the case, but it's not. from
| third hand i've heard that IBM has absolutely tons of
| patents that RISC-V infringes. whether IBM decides to take
| action on that is another matter. they're a bit of a
| heavyweight, so there would have to be substantial harm
| to their business for the "800 lb gorilla" effect to kick
| in.
| astrange wrote:
| It's not actually possible to do this, though; it's up to
| the other side's lawyers to decide if they're going to
| sue you, and the answer is yes if they can afford it. You
| don't have a jury on hand to evaluate every patent that
| ever exists.
|
| Besides that, engineers in large companies are explicitly
| told not to look up any patents so they won't be
| accused of willful infringement.
| vmception wrote:
| Yes, but there are a lot of profitable applications which
| don't need to be advertised. You run it in-house and make
| money on the output, e.g. ML farms or mining. You don't take
| preorders for the hardware at all and just have boutique
| custom units, and nobody knows the architecture, even if you
| offer some remote rental/SaaS tool.
| pkaye wrote:
| This processor seems to be a barrel processor architecture
| from my quick look, so not entirely new.
|
| https://en.wikipedia.org/wiki/Barrel_processor
| ChuckNorris89 wrote:
| Not sure how relevant it is here, but yes, GPU architectures
| are bound by tons of patents, so you can bet your a$$ that if
| you were to commercially launch your own GPU IP, you'll have
| Nvidia's and AMD's lawyers knocking on your door in under 10
| seconds.
|
| IIRC most companies out there selling GPU IP are still paying
| royalties to AMD for their patents on shader architecture,
| which they got from their acquisition of ATI, which in turn
| came from their acquisition of ArtX, which was founded by
| people who worked at the long-defunct SGI (Silicon Graphics).
|
| The funny thing is, if you backtrack through all GPU
| innovations, most stem from former SGI employees.
|
| When 3Dfx went under, even though Nvidia's GPU tech was
| already superior to anything 3Dfx had, Nvidia immediately
| swept in and picked their carcass clean, mostly for their
| patents in this space, so they would have more ammo/leverage
| against competitors going forward.
|
| Regardless of how you feel about patents, with their pros and
| cons, hardware engineering is a capital-intensive business,
| and without patents to protect your expensive R&D, it
| wouldn't be a viable business.
| bserge wrote:
| Aren't patents supposed to expire?
|
| Isn't that the idea? You have a patent for 10-20 years, build
| your business (which AMD/Nvidia did, very successfully), then
| everyone is free to use it, possibly leading to innovation?
| I'm poorly versed in this, so if anyone with more knowledge
| could share some thoughts, that would be appreciated.
| lkcl wrote:
| only if the patent holder does not create an "improvement"
| on the old one. then the older (referenced) patent is
| extended. Bosch have done this specifically so that they
| can hold on to the original CAN Bus patent.
| HideousKojima wrote:
| Correct, patents in the US expire after 20 years.
| JPLeRouzic wrote:
| And if I remember correctly (I wrote my last patent 10
| years ago), there are annual fees that would invalidate
| any right if not paid.
| arithmomachist wrote:
| > Nvidia immediately swept in and picked their carcass
| clean, mostly for their patents in this space, so they
| would have more ammo/leverage against competitors going
| forward.
|
| That's surely not a healthy situation either. Courts should
| never be a central part of competition among businesses.
| joshspankit wrote:
| To clarify what I think is the relevance, as well as to
| explore my own questions:
|
| If someone were to clean-room design their own GPU chip, how
| likely is it that Nvidia and AMD would come down on them
| anyway, simply by virtue of the fact that they (presumably)
| have patents on everything that you could think of putting
| in that chip?
|
| In essence: do you now have to be an expert in what you're
| _not_ allowed to put in before you even start?
| raphlinus wrote:
| So here's what I would do if I were in this situation. I
| wouldn't build a graphics processing unit per se, but
| instead would build a highly parallel SIMD CPU organized in
| workgroups, and with workgroup-local shared memory. These
| cores could be relatively simple in some respects (they
| wouldn't need complex out-of-order superscalar pipelines or
| sophisticated branch prediction), but should have good
| simultaneous multithreading to hide latency effectively.
| Then, if you wanted to run a traditional rasterization
| pipeline, you'd do it basically in software, using
| approaches similar to cudaraster (which is BSD-licensed!).
| The paper on that suggests it would be on the order of 2x
| slower than optimized GPU hardware for triangle-centric
| workloads, but that might be worth it. The good news is this
| story gets better the more the workload diverges from what
| traditional GPUs are tuned for - in particular, the more
| sophisticated the shaders get, the more performance depends
| on the ability to just evaluate the shader code efficiently.
|
| It would of course be very difficult to make a chip that is
| competitive with modern GPUs (the engineering involved is
| impressive by any standards), but I think a lot would be
| gained from such an effort.
|
| I should probably disclaim that this is _definitely_ not
| legal advice. Anyone who wants to actually play in the GPU
| space should plan on spending some quality time with a team
| of top-notch lawyers.
| jeffbush wrote:
| (Project author here.) That is pretty close to the approach
| this project has taken, although my motivation was not so
| much avoiding IP as exploring the line between hardware
| acceleration and software.
| lkcl wrote:
| allo jeff, nice to see you're around :) thank you so much
| for the time you spend guiding me through nyuzi. also for
| explaining the value of the metric "pixels / clock" as a
| measure for iteratively being able to focus on the
| highest bang-per-buck areas to make incremental
| improvements, progressing from full-software to
| high-performance 3D.
|
| have you seen Tom Forsyth's fascinating and funny talk
| about how Larrabee turned into AVX512 after 15 years?
|
| https://player.vimeo.com/video/450406346
| https://news.ycombinator.com/item?id=15993848
| raphlinus wrote:
| Great to hear! I've poked around a little and seen that,
| and in any case wish you success and hope that we can all
| learn from it.
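[Editor's note: the software-rasterization approach raphlinus describes can be sketched concretely. Below is a minimal, scalar Python sketch of the edge-function coverage test at the heart of cudaraster-style software rasterizers; a SIMD machine of the kind discussed here would evaluate it for many pixels per instruction. All names are illustrative, not taken from cudaraster.]

```python
# Minimal edge-function triangle rasterizer: a scalar sketch of the
# inner loop that SIMD software-rendering pipelines run per lane.
def edge(ax, ay, bx, by, px, py):
    # Signed area of triangle (a, b, p); non-negative when p lies
    # on or to the left of the directed edge a->b (CCW winding).
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax)

def rasterize(tri, width, height):
    (x0, y0), (x1, y1), (x2, y2) = tri  # CCW vertex order assumed
    covered = []
    for y in range(height):
        for x in range(width):
            px, py = x + 0.5, y + 0.5  # sample at pixel center
            w0 = edge(x1, y1, x2, y2, px, py)
            w1 = edge(x2, y2, x0, y0, px, py)
            w2 = edge(x0, y0, x1, y1, px, py)
            # Pixel is inside the triangle iff all three edge
            # functions are non-negative.
            if w0 >= 0 and w1 >= 0 and w2 >= 0:
                covered.append((x, y))
    return covered

# Right triangle covering the lower-left half of an 8x8 tile.
pixels = rasterize(((0, 0), (8, 0), (0, 8)), 8, 8)
```

A real pipeline would add tile binning, barycentric interpolation, and depth testing on top of this test, but the per-pixel work stays this uniform and branch-light, which is what makes it map well onto wide SIMD.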
| ChuckNorris89 wrote:
| To clarify further, Nvidia and AMD (and probably other
| smaller players like ARM, Qualcomm, Imagination) own the
| patents on core shader tech, which are the building blocks
| of any modern GPU design.
|
| If you want to design a GPU IP that works around all their
| patents, you probably can, but unless you're a John Carmack
| x10, your resulting design would be horribly inefficient and
| not competitive enough to be worth the expensive silicon it
| will be etched on, and probably not compatible with any
| modern API like Vulkan or DirectX.
|
| But if you just want to build your own meme GPU for
| education/shits and giggles, one that doesn't follow any
| patents or APIs, then you can, and some people already did:
|
| https://www.youtube.com/watch?v=l7rce6IQDWs
| ericbarrett wrote:
| I am not in the graphics space, but I am quite familiar with
| tech business practices.
|
| I think the chance you would be sued is near 100%. If you
| released and showed any market traction at all, you would
| immediately become a threat to the duopoly; they surely
| remember the rise of 3Dfx. Don't bother arguing the merits
| of the patents, because it would be a business decision, not
| a technical one - this is the kind of thing that's decided
| at the C-level and then justified (or cautioned against) by
| the company's legal team, not the other way around. Patents
| are merely leverage to effect the defense of the business,
| and you can be sure they'll be used.
| joshspankit wrote:
| I agree with you (and it's definitely a conversation worth
| having), but for the sake of this thread let's pretend that
| legal action would only be taken when a patent was actually
| matched with what was put in the chip.
| lkcl wrote:
| if it were done, say, as a Libre/Open processor, say, with
| the backing of NLnet (a Charitable Foundation), where the
| "Bad PR ju-ju" for trying it on was simply not worth the
| effort
|
| if it were done,
| say, as a Libre/Open processor, say, with the backing of
| NLnet (a Charitable Foundation), where NLnet has access to
| over 450 Law Professors more than willing to protect
| "Libre/Open" projects from patent trolls by running
| crowd-funded patent-busting efforts
|
| if it were done as a Libre/Open Hybrid Processor, based on
| extending an ISA such as ooo, I dunno, maybe OpenPOWER,
| which has the backing of IBM, with a patent portfolio
| spanning several decades, who would be very upset if tiny
| companies like NVidia or AMD tried it on against a
| Charitably-funded project.
|
| that would be a very interesting situation, wouldn't it? i
| wonder if there's a project around that's trying this as a
| strategy? hmmm, hey, you know what? there is! it's called
| http://libre-soc.org
| ericbarrett wrote:
| I learned GL in the 1990s on SGI systems. Shaders didn't
| exist, poly counts were in the 100s, and textures were a
| massive processing burden. The rendering pipeline of course
| was quite different. And yet so much is the same! Code
| organization, data types - all is quite familiar, whether
| it's OpenGL or DirectX or whatnot. The achievements of SGI
| engineers have benefited generations.
| lkcl wrote:
| Jeff's evaluation of GPLGPU is fascinating:
| https://jbush001.github.io/2016/07/24/gplgpu-walkthrough.htm...
|
| you are absolutely correct in that everything has moved on
| from the "Fixed Function" pipeline of SGI, and from how
| GPLGPU works (worked) - btw it's NOT GPL-licensed: Frank
| sadly made his own license, "GPL words but with
| non-commercial tacked onto the end", which... er... isn't
| GPL... _sigh_ - but everything commercial has now moved on
| to Shader Engines.
|
| that basically means Vulkan.
|
| however you may be fascinated to know, from Jeff's
| evaluation, that there are still startling similarities in
| basic functionality between not-GPL GPLGPU and modern
| designs targeted at Shader Engines.
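[Editor's note: the fixed-function-versus-shader-engine shift discussed here boils down to whether the per-fragment computation is hard-wired into the pipeline or supplied as a user program that the GPU merely dispatches. A toy Python sketch of the programmable model - all names are hypothetical, and a real shader engine runs these invocations in parallel across SIMD lanes rather than in a scalar loop:]

```python
# Toy fragment stage: the hardware supplies the dispatch loop,
# while the color computation is an arbitrary user-supplied
# program - the essence of a shader engine vs. fixed function.
def run_fragment_stage(shader, width, height):
    # Evaluate shader(u, v) once per pixel center, with u and v
    # as normalized [0, 1) coordinates, like one fragment-shader
    # invocation per covered pixel.
    fb = []
    for y in range(height):
        row = []
        for x in range(width):
            u, v = (x + 0.5) / width, (y + 0.5) / height
            row.append(shader(u, v))
        fb.append(row)
    return fb

# A trivial "user program": horizontal grayscale gradient, 0..255.
gradient = lambda u, v: int(u * 255)
fb = run_fragment_stage(gradient, 4, 4)
```

Swapping `gradient` for any other function changes the rendered result without touching the dispatch machinery, which is why (as noted upthread) shader-era performance hinges on how fast the shader program itself executes.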
| ComputerGuru wrote:
| I don't see how patents acquired from SGI could possibly
| still be protected and require licensing.
| peter_d_sherman wrote:
| Related:
|
| Ben Eater - Let's build a video card!
|
| https://eater.net/vga
|
| Embedded Thoughts Blog - Driving a VGA Monitor Using an FPGA
|
| https://embeddedthoughts.com/2016/07/29/driving-a-vga-monito...
|
| Ken Shirriff - Using an FPGA to generate raw VGA video:
| FizzBuzz with animation
|
| http://www.righto.com/2018/04/fizzbuzz-hard-way-generating-v...
|
| Clifford Wolf - SimpleVOut - A Simple FPGA Core for Creating
| VGA/DVI/HDMI/OpenLDI Signals
|
| https://github.com/cliffordwolf/SimpleVOut
|
| PDS: Also, this looks interesting, from SimpleVOut:
|
| > "svo_vdma.v
|
| A _video DMA controller_. Has a read-only AXI4 master
| interface to access the video memory."
| fortran77 wrote:
| Yeah, but these people aren't doing GPGPU computation.
| phendrenad2 wrote:
| Or even anything resembling 2D graphics acceleration.
| FPGAhacker wrote:
| One of the things that interests me (of many) is the use of
| cmake.
|
| Does anyone have good references on extending cmake to new
| tools that don't produce executables per se, or otherwise
| work in non-traditional ways?
| code-scope wrote:
| Very cool project!
|
| I love GPGPU. I git-cloned it and am trying to understand
| the code better here:
| https://www.code-scope.com/s/s/u#c=sd&uh=0f2c2fa280a2&h=afe7a329&di=-1&i=38
|
| It looks like a 5-stage FP (FP32?) pipeline, with
| NUM_VECTOR_LANES=16 and NUM_REGISTERS=32.
|
| Are you writing your own kernel from scratch? If so, which
| CPU does it run on - some embedded CPU inside the FPGA?
|
| In the mandelbrot.c code, it has the following:
|
| #define vector_mixi __builtin_nyuzi_vector_mixi
|
| How does this get translated to vector operations on the
| FPGA? Where is the code that implements the __builtin_*?
|
| Thanks a lot - a very interesting project.
| marcodiego wrote:
| There are people keeping OpenVGA alive[1].
| With the failure of the Open Graphics Project[2], are there
| any known promising projects besides libregpu[3]?
|
| [1] https://github.com/elec-otago/openvga
|
| [2] https://en.wikipedia.org/wiki/Open_Graphics_Project
|
| [3] https://libre-soc.org/3d_gpu/
| phkahler wrote:
| >> are there any known promising projects besides libregpu?
|
| I think the most useful thing right now would be a
| high-quality version of the "easy" parts of a GPU: basic
| scan-out, possibly overlays, color-space conversion, buffer
| handling. This would allow ANY open processor project to
| have framebuffer graphics and run LLVMpipe for basic
| rendering and desktop compositing. This may be slow, but it
| is required for every open GPU project, while an SoC can
| live without the actual GPU for some applications.
|
| IMHO, first things first.
| lkcl wrote:
| this is easy to chuck together in a few days, literally,
| from pre-existing components found on the internet:
|
| * litex (choose any one of the available cores)
|
| * richard herveille's excellent rgb_ttl / VGA HDL
| https://github.com/RoaLogic/vga_lcd
|
| * some sort of "sprite" graphics would do
| https://hackaday.com/2014/08/15/sprite-graphics-accelerator-...
|
| the real question is: would anyone bother to give you the
| money to make such a project? and the question before that
| is: can you tell a sufficiently compelling story to get
| customers - _real_ customers with money - to write you a
| Letter of Intent that you can show to investors?
|
| if the answer to either of those questions is "no" then,
| with many apologies for pointing this out, it's a waste of
| your time, unless you happen to have some other reason for
| doing the work - basically one with zero expectation
| up-front of turning it into a successful commercial product.
|
| now, here's the thing: even if you were successful in that
| effort, it's so trivial (Richard Herveille's RGB/TTL HDL
| sits as a peripheral on the Wishbone Bus) that it's like...
| why are you doing this again?
|
| the _real_ effort _is_ the 3D part - Vulkan compliance,
| Texture Opcodes, Vulkan Image format-conversion opcodes
| (YUV2RGB, 8888 to 1555, etc.), SIN/COS/ATAN2, Dot Product,
| Cross Product, Vector Normalisation, Z-Buffers and so on.
| phkahler wrote:
| Seriously? VGA with DVI outputs? And a link to a sprite
| engine?
|
| We need HDMI output, preferably 4K-capable. I also mentioned
| color-space conversion. I should have also said to "just
| throw in" video decoders for VP9 and AV1 if those are
| available. The point is that the likes of SiFive and other
| RISC-V SoC vendors should be making desktop chips, not just
| headless Linux boards or ones with proprietary GPUs.
|
| Like I said, the "easy" part should be done and available -
| not theoretically assemblable from various pieces.
|
| If this were readily available, I'd be able to buy it from
| someone today. There IS a market for it, and it will be
| growing fast. Add a real GPU and things look even better.
| marcodiego wrote:
| Yeah. I also miss the small-but-firm-steps approach.
___________________________________________________________________
(page generated 2021-02-14 23:00 UTC)