[HN Gopher] AMD to Acquire Xilinx ___________________________________________________________________ AMD to Acquire Xilinx Author : ajdlinux Score : 487 points Date : 2020-10-27 10:52 UTC (12 hours ago) (HTM) web link (www.amd.com) (TXT) w3m dump (www.amd.com) | fxgx99 wrote: | Interesting fact: apparently the CEOs of AMD and NVIDIA are | cousins. Think about that for a second. | baybal2 wrote: | Should we call it a "tit-for-tat" acquisition? | saagarjha wrote: | What's next, Nvidia acquiring Lattice Semiconductor? | baybal2 wrote: | May well be. Even if they have no current use for them, one | would really want to have a patent bulletproof vest in case | the king-of-the-hill battle intensifies. | jl2718 wrote: | This is a very good point. IP protection may be the only | thing keeping margins alive in this sector when the DUV | gains run out. | ancharm wrote: | That or Achronix | snvzz wrote: | I hope not. I actually like Lattice's lineup. | imtringued wrote: | I would rather see Processing in Memory (PIM) become mainstream | than FPGAs. FPGAs are basically an assembly line that you can | change overnight: excellent at one task, and they minimize end-to-end | latency, but if it's actually about performance you are | entirely dependent on the DSP slices. | | With PIM your CPU resources grow with the size of your memory. | All you have to do is partition your data and then just write | regular C code, with the only difference being that it is executed | by a processor inside your RAM. | | Having more cores is basically the same thing as having more DSP | slices. Since those cores are directly embedded inside memory | they have high data locality, which is basically the only other | benefit FPGAs have over CPUs (assuming the same number of DSPs and | cores). Obviously it's easier to program than either GPUs or | FPGAs.
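The PIM programming model described above -- partition the data, run the same plain scalar kernel on each bank's embedded processor, and merge the results -- can be sketched as a toy software simulation (the bank layout and kernel names here are purely illustrative; no real PIM API is assumed):

```python
# Toy sketch of the PIM programming model: data is partitioned across
# memory banks, each bank's embedded processor runs the same scalar
# kernel on its local slice, and the host only merges the per-bank
# results. (Purely illustrative; the "banks" are just Python lists.)

def bank_kernel(local_data):
    """The per-bank 'regular C code': an ordinary scalar loop."""
    total = 0
    for x in local_data:
        total += x * x  # e.g. a sum of squares
    return total

def pim_sum_of_squares(data, num_banks):
    # Host side: partition the data across banks...
    banks = [data[i::num_banks] for i in range(num_banks)]
    # ...each bank computes independently, with high locality...
    partials = [bank_kernel(b) for b in banks]
    # ...and the host performs only the final reduction.
    return sum(partials)

print(pim_sum_of_squares(list(range(10)), num_banks=4))  # 285
```

The point of the sketch: the host does no per-element work, so compute capacity scales with the number of banks, i.e. with the size of the memory.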
| nickff wrote: | You're comparing two completely different paradigms. | | FPGAs are not an assembly line at all; the assembly line | analogy applies much more closely to a processor's pipeline. | | FPGAs are just a massive set of very simple logic units which | can be interconnected in many different ways. FPGAs are best | used in situations where you want to perform a series of simple | operations on a massive incoming dataset, in parallel, | especially in real-time situations. Performing domain | transforms on data coming in from sensor arrays is one very | good application for FPGAs. | qwertox wrote: | Are FPGAs rewritable at will with almost no degradation (for | example a rewrite every minute over many days), or do they | suffer the same degradation problems as EEPROMs (like the | ones in Arduinos)? | calacatta wrote: | FPGAs use SRAM to store their program, while CPLDs (complex | programmable logic devices) use flash. Some clever | marketeers here & there will stretch this distinction but | it's an established convention. The internal architecture | between FPGAs and CPLDs is typically different, based on | cost of memory vs. logic and typical use cases. FPGAs tend | to be used for higher-capacity computations but require | more life support; CPLDs tend to serve smaller, true glue | logic applications, where the low config overhead (just | apply power) and quicker & simpler power-up is a strong | pull. | | So CPLDs will have some kind of NVRAM wear-out concern, and | this is almost always specified as a number of maximum | erase & program cycles. | charlesdaniels wrote: | I think GP meant in the sense that reconfiguration time is | large. FPGAs cannot be effectively time-division multiplexed, | as a full reconfiguration can take up to tens of seconds. | | GP is also correct that DSP/SRAM blocks are critical to | performance. FPGAs are not very efficient at raw compute if | you have to synthesize everything out of LEs. 
| | The performance benefit of FPGAs, which PIMs also share (in | theory; there aren't any PIMs ready for real-world deployment | AFAIK), is that they can leverage much larger memory | bandwidths than general-purpose CPUs can. An FPGA might run | at a lower clock rate (low 100s of MHz), but be able to | operate on several kilobits per clock cycle. This can work really | well when paired with off-chip logic to convert high-rate | serial interfaces to lower-clock-rate parallel interfaces, | then back after the FPGA is done processing. | | There is also a lot of work going on in the space of | time-division multiplexing FPGAs effectively. The two main | approaches are overlay architectures and partial | reconfiguration. The former implements another high-level | fabric on top of the FPGA which will be less general-purpose, | but can be reconfigured faster. The latter is a feature | vendors have added to some high-end chips where specific | regions of the FPGA can be reconfigured without affecting | other regions. | nickff wrote: | I agree with your statements regarding reconfiguration and | TDM, though I still think GP (and to a lesser extent, your | comment) is very focused on traditional computing | paradigms. FPGAs are much more promising for real-time | systems, particularly those with very large incoming | datasets to transform or otherwise process in parallel. | Thinking about FPGAs in terms of how 'quickly' they process | data is really missing the point IMO. | | One common and very good application for FPGAs is | Active Electronically Scanned Array radar, sonar, or | camera image processing. You can perform parallel filtering | and transforms with various frequency and phase settings, | which would be impossible for a similarly-sized processor | to do. | | FPGAs have the potential to revolutionize sensor arrays, by | making them much more useful and affordable. | charlesdaniels wrote: | I agree, yes.
"Traditional computing paradigms" are (IMO) | not all that interesting as research topics at this | point. As far as I know, most of the work in that space | is in branch prediction and cache replacement policies. | | FPGAs are what you really want when you need to deal with | high-resolution data that is coming in at very high data | rates. Often even a very fast general-purpose processor | with hand-tuned assembly simply won't have even the | theoretical memory throughput to process your data | without "dropping frames". They also have the benefit of | deterministic performance, which with modern | caching/branch prediction systems you can't guarantee | (AFAIK, my computer architecture knowledge isn't that | cutting edge). | | They can also work really well if you have some | computation you want to do that is so far off the beaten | path for general-purpose processors (or so memory bound) | that FPGAs can take the cake. | | There is also some work in sprinkling even more hard logic | into the FPGA dies, like processors or accelerator cores | for various applications. FPGAs are great for | implementing the glue logic to move data between those. | nickff wrote: | I think you touched on one of the biggest things about | FPGAs in your comment, which is that they are perfect for | computation that does not involve branches. If you've got | a lot of data, and you're doing transforms, you usually | don't need to branch, so being able to crunch everything | through in parallel is a massive benefit. | | Also agree that additional hard logic or peripherals will | be a game-changer for FPGAs, though they would make each | design more domain-specific. Alternatively, we may see a | shift in how the interconnects are done, which allows for | flexible use of these 'modules'. It's also possible that | we'll see continual increases in LE counts which make | more specialized hardware unnecessary. I don't know which | way things will go.
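The branch-free transform pattern nickff describes can be sketched with a toy 4-tap FIR filter (plain Python with illustrative coefficients; on an FPGA each tap would map to a dedicated multiplier and all output windows could be computed in parallel):

```python
# A minimal sketch of the branch-free, fully parallel computation
# pattern discussed above: a 4-tap FIR filter. Every output sample is
# the same multiply-accumulate with no data-dependent branches, so a
# fixed hardware pipeline can stream outputs through in parallel.

TAPS = [0.25, 0.25, 0.25, 0.25]  # illustrative moving-average coefficients

def fir(samples):
    n = len(TAPS)
    # Each output is an independent multiply-accumulate over a window:
    # no ifs, no early exits -- ideal for a fixed hardware pipeline.
    return [sum(TAPS[k] * samples[i + k] for k in range(n))
            for i in range(len(samples) - n + 1)]

print(fir([1, 1, 1, 1, 5, 1, 1, 1]))  # a spike gets smeared across 4 outputs
```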
| legulere wrote: | Having memzero and memcpy happen in memory without polluting | caches would already be a huge gain. | AnthonyMouse wrote: | Though you could presumably just do the same thing with new | instructions, i.e. have an instruction for secure zeroing | which zeros the data in any memory or cache where it exists | but doesn't cause the zeros to be cached anywhere they | weren't already. | ohazi wrote: | Please, God, no. | | The surface area for security vulnerabilities is already | impossibly high. Do we really want to add "firmware running | on a DIMM exfiltrating key material" to that list? | eloff wrote: | There are security problems with every architecture. There | is no fundamental reason PIM should be less secure than | what we do now. This is just fear of the unknown talking. | saagarjha wrote: | Hmm, this was rumored, but I guess now it is actually happening. | Nice bump on the share price there, I guess; it's currently | trading at around $115 and it seems it will be converted to $143 in | AMD stock. I assume this is to help AMD push more into the server and | ML compute spaces? | teleforce wrote: | For those in the ASIC and chip design industry, the two | largest chip companies, namely Intel and AMD, buying two of the | largest FPGA companies was inevitable; it was just a matter of | "when" rather than "if". | | I think the more interesting news is what they are going to do | pro-actively with these mergers rather than just sitting on them. | | I really hope their respective CEOs will take a page from the | open source Linux/Android and GCC/LLVM revolutions. I'd say the | chip maker companies are the ones that benefit most | from these open source movements, not the end | users. To understand this situation we need to understand the | economics of complementary goods [1].
| | In the case of chip makers, if the price of | designing/researching/maintaining OSes like Linux/Android and the | compiler infrastructure is minimized (i.e. close to zero), they | can basically sell their processor hardware at a premium | price with handsome profits. If, on the other hand, the OSes and the | compilers are expensive, their profit will be inversely | proportional to the complementary elements' (e.g. OSes & | compilers) prices. | | Unfortunately, as of now, the design tools and CAD software for | hardware design and programming, and also parallel processing | design tools, are prohibitively expensive, disjointed and | cumbersome (hence expensive manpower), and if you're in the | industry you know that it's not an exaggeration. | | Having said that, I think it's best for Intel/AMD and the | chip design industry to fund and promote robust free and open | source software development tools for their ASIC design, including | CPU/GPU/TPU/FPGA combo design. | | IMHO, ETH Zurich's LLHD [2] and Chris Lattner's LLVM effort on | MLIR [3] are moving in the right direction for pushing the | envelope and consolidating these tools (i.e. one design tool | to rule them all). If any Intel or AMD folks are reading this, you | need to knock on your CEO/CTO's door and convince them to | make these complementary commodities (design and programming tools) | as good and as cheap as possible, or better, free. | | [1] https://www.jstor.org/stable/2352194?seq=1 | | [2] https://iis.ee.ethz.ch/research/research-groups/Digital%20Ci... | | [3] https://llvm.org/devmtg/2019-04/slides/Keynote-ShpeismanLatt... | MayeulC wrote: | I don't recall where I read this, but hardware vendors have | been trying to commoditize software, and vice versa. | | It's really obvious when you think about it. If you sell nails, | you want to make sure that everyone has or can afford a hammer, | and hammer manufacturers like to make sure that there is a | large supply of compatible nails.
| | As much as I would like to see it, I am not sure the equation | is that simple in the case of CAD software. Sure, that would | make it easier to use FPGAs, but it would also make it easier | to create competing products, at a stretch. | | I still think it's worth it, and wish the bitstream format were | documented, at the very least. | gpderetta wrote: | Interestingly, Xilinx owns Solarflare. I wonder if that was part | of the appeal. | tutanchamun wrote: | Yeah, I thought so too, since Nvidia owns Mellanox and Intel | has its own NICs, OmniPath, etc. | varispeed wrote: | I hope they will not drop their CPLD chips. They were made | obsolete at least once, but Xilinx fortunately decided to extend | support for a couple more years. CPLDs are very useful for | repairing vintage gear where logic components fail and are no | longer available (for example custom-programmed PALs): you can | describe the logic in Verilog and often solder the CPLD in place of | multiple chips. If they drop it, then the only way to do this would | be to use a full-blown FPGA, which is a bit wasteful. | thrtythreeforty wrote: | I would be very interested in reading a blog post about this. | Is there one that I can read, or would you be willing to write one? | varispeed wrote: | Have a look at the materials linked here: | http://dangerousprototypes.com/blog/2011/04/17/replacing-dis... | petra wrote: | What happened recently in AMD's market? | | AWS's ARM-based processors look to be widely deployed in the | cloud. Nvidia, the leader in GPU compute, is buying ARM. Intel, | which has suffered deeply because of its 10nm fab problems, is | going to work with TSMC. And AMD's P/E ratio is at 159, higher | than Amazon's! | | So maybe AMD is looking to convert some inflated stock into a | predictable business. | | And it's better to invest in a predictable business that may have | possible synergies with yours; otherwise it looks bad to the | stock market. | | And Xilinx is probably the biggest company AMD can buy.
| kbumsik wrote: | Yup, 2020 marks a new era for the semiconductor | industry. | Traster wrote: | AMD's P/E is high, but that's based on the fact that AMD's | earnings are $390m (2020Q3) vs. $6bn (2020Q3) for Intel - | essentially people are pricing in AMD being the obvious | alternative to Intel in the data centre, and the potential | profit from that is _enormous_ compared to AMD's current | market share. | paulmd wrote: | Or another way of putting that is investors are jumping the | gun and pricing in years and years of expected market-share | growth that hasn't happened yet. | | It's a relatively safe bet now that Intel has more or less | conceded leadership through 2023, but it's not zero risk. The | market generally doesn't have an appreciation of that; P/E | was still nuts even before the release of Zen 2, when AMD's | success was far less clear (Zen 1/Zen+ were far less appealing | products and scaled far less well into server class). It's a | lot of amateurs (see: r/AMD_Stock on reddit) buying it | because they like the company rather than trading on the | fundamentals. | | Right now the stock market is just nuts in general though; | there's so much money from the Fed's injections sloshing | around and looking for any productive asset, and tech | companies look like a good bet when everyone is stuck at | home, building home offices, consuming tech hardware and | electronic media. Housing is getting even more weird as well. | 01100011 wrote: | I don't know, but I been told... AMD's margins aren't great. | AMD is making phenomenal products lately, but if they're | giving them away to gain market share then they may never be | as profitable as investors would like. | | If Intel gets their house in order in a couple of years, AMD | won't have much time to gain market share and raise prices. I've | rooted for AMD since the K6 days, but I think there's a risk | that they'll always be #2 (or less).
| gvb wrote: | Everybody seems to view this as AMD mimicking Intel when it | acquired Altera. (That acquisition has not borne visible fruit.) | | My contrarian speculation is that this is a move driven by Xilinx | vs. Nvidia, given Nvidia's purchase of Arm and Xilinx's push into | AI/ML. Xilinx is threatened by Nvidia's move given its | dependence on Arm processors in its SoC chips and their ongoing | fight in the AI/ML (including autonomous vehicles) product space. | My speculation is that this gives Xilinx alternative | high-performance AMD64 (and possibly lower-performance, lower-power x86) | "hard cores" to displace the Arm cores. | | Interesting times. | jl2718 wrote: | I don't think NVidia/ARM would affect Xilinx much. Given what | the bulk of FPGAs are doing in the data center, I think AMD was | looking more at NVidia/Mellanox and of course Intel/Altera, but | for networking, not compute. For Xilinx, this gives a path to | board-level integration with x86. | andy_ppp wrote: | Or package-level integration... | gumby wrote: | Possible, but I suspect heat and area would be problems, at | least for the CPUs. | | A smaller AMD core could be supplied as a hard core on the | Xilinx part, but would that really be worth it? | pclmulqdq wrote: | Intel had some heat problems when they tried this. The | FPGAs weren't able to use their heat budget dynamically, | and as a result the whole SiP had bad performance. | m0zg wrote: | I think you're onto something here. AMD is likely seeing the | end of the road for their CPU business within the next decade, | since it will run up against physics and truly insane cost | structures that will come after 5nm. At the same time we're far | past the practical limit wrt ISA complexity (as evidenced by | periodic lamentations about AVX512 on this site).
The only real | way to go past all of that right now is specialized compute, | reconfigurable on demand, deployment of which is hampered by | the fact that it's very expensive and not integrated into | anything, so the decision to use it is very deliberate, which | in practice means it rarely ever happens at all. Bundle a | mid-size FPGA as a standardized chiplet on a CPU, integrate it | well, provide less painful tooling, and that will change. Want | hardware FFT? You got it. Want a hardware TPU for bfloat16? You | got it. Want it for int8? You got it. Think of just being able | to add whatever specialized instruction(s) you want to your | CPU. | | I'm not sure this is worth $35B, but if Lisa Su thinks so, it | probably is. She's proven herself to be one of the most capable | CEOs in tech. | ohazi wrote: | Also, the advantage Altera supposedly got after being acquired | by Intel was better fab integration with what was then the best | process technology available (high-end FPGAs genuinely need | good processes). | | 1. That's no longer the case, so sucks for Altera / Intel | | 2. AMD doesn't have a fab, so any advantages are necessarily on | the design / architecture / integration side. | andromeduck wrote: | My read on the Altera acquisition was that Intel needed to shore up | fab volumes in the face of their foundry customers jumping to | TSMC first chance they could. As the capital required per | node continues to rise exponentially, they need more and more | volume to amortize it over. This is also why they're trying | to get into GPUs again. | samps wrote: | To slightly refine this, Intel didn't have many "foundry | customers" before Altera. Via Wikipedia | (https://en.wikipedia.org/wiki/Intel#Opening_up_the_foundries...), the need to | fill up the manufacturing lines was engendered by poor x86 | CPU sales around ~2013, not poor third-party fab runs. In | 2013, Intel was still ahead of TSMC with 22 nm.
| dogma1138 wrote: | Intel got their 3D/2.5D stacking tech from Altera, and the FPGA | in Xeon sockets is also doing as well as it can considering | the niche market. | person_of_color wrote: | What is the difference between hard and soft cores? | KSteffensen wrote: | Soft cores use the configurable logic matrix of the FPGA. You | can choose to implement them or not, depending on your use | case. They can also be tuned to the use case, adding or | modifying CPU instructions, cache structure, etc. This | involves writing RTL code, with all the design, verification | and backend synthesis work that comes with that. Tools like | Synopsys ASIP Designer try to help with this effort. | | Hard cores are not part of the configurable logic matrix but | are separate resources on the FPGA. That means they can't be | tuned to the use case in the same way as a soft core. The | trade-off is that they are typically better optimized with | regard to clock frequency and power consumption, since the | components are made to be a CPU and not generic configurable | logic. One example of an FPGA with a hard-core CPU would be | the Xilinx Zynq devices. | dreamcompiler wrote: | An FPGA is (to a very crude approximation) just a bunch of | static RAM organized in an unusual way. If you think of | normal static RAM as "address wires go in one side and data | wires come out the other", in an FPGA there are no "address | wires" -- it's all data in/data out. The memory cells are | still just memory cells; what we label the wires is merely a | matter of engineering perspective. In a Xilinx memory cell we | choose labels for the wires typically used for logic gates. | | Anyway, in a Xilinx chip the bits of data you put in the | memory cells determine what logic function gets executed.
| That works because, in general, any particular stored memory -- | in any computer, anywhere -- is (conceptually) just a logic | function, and conversely all logic is implementable with the | stuff we conventionally call memory. | | But we typically don't do that, because "real" logic made of | fixed-function transistors is much faster than logic built | with changeable memory cells. However, there's a market for | fully-changeable logic -- even if it's slower -- and that's what | Xilinx chips are. | | Every CPU is just a bunch of registers and logic. If you hand | me a few million discrete NAND gates, I can use them to build | an X86, a RISC-V, an ARM, or whatever. It will be the size | of a house and it will be very slow, but it will run the | binary code for that processor. With a Xilinx chip, you have | a few million NAND gates (or NOR gates or inverters or | whatever you like) at your disposal, and they're all on one | chip and you can wire them up however you want with nothing | but software. Bingo: you can build an X86 out of pure logic, | and it's all on one chip rather than being the size of a | house. That's a soft core. | | The nice thing about soft cores is that you can build | whatever CPU functions you want and leave off the functions | you don't need. If you want to change the design, you just | download a bunch of new bits to the Xilinx memory cells. Thus | you can change an ARM into an X86 in an instant, without | changing any hardware. | | Soft cores are very flexible, but they're also slow, because | implementing logic with static RAM cells is slower than doing | it with dedicated transistors. | | That's where hard cores come in: a hard core is a dedicated | area of silicon on the Xilinx chip carved out to _only_ | implement an ARM chip or a PowerPC or other CPU with | fixed-function transistors. So it's fast. The downside is you | can't change its functionality on the fly.
If you decide | you'd rather have a PowerPC than an ARM chip, you have to | change the whole chip. | | In both types of cores, you still have a bunch of memory | cells left over that you can program to do whatever kind of | logic you like. | AareyBaba wrote: | How much slower would an FPGA (soft core) implementation of, | say, an ARM core be compared to the hard core implementation? | acallan wrote: | A soft core is a CPU that is programmed into an FPGA instead | of a "regular" core made of dedicated, fixed-function logic. | gmueckl wrote: | Why not license other soft cores, e.g. from SiFive? | gvb wrote: | The performance of soft cores is significantly lower than that of | hard cores. | | Xilinx already has a RISC soft core in their MicroBlaze | architecture, so they don't have a pressing need for a low-power, | reasonable-performance RISC soft core. Ref: | https://en.wikipedia.org/wiki/MicroBlaze | | AMD has high-performance CPUs being fabbed by TSMC (the same | foundry as Xilinx), so (theoretically) AMD CPUs can be | grafted onto the Xilinx FPGA as a hard core. | | With AMD and the MicroBlaze, they have the high-performance | and low-power processor spectrum covered with no need for 3rd-party | licensing costs. | brandmeyer wrote: | FPGAs are to ASICs as interpreted languages are to compiled | languages. I don't mean that literally, but I do mean it in | the performance sense. At the same process node, an FPGA is | over 50x the power and 1/20th the speed of a dedicated ASIC, | and it isn't getting any better. | tails4e wrote: | I agree with the sentiment, but the numbers are off. It's | about 10x the power worst case (maybe 5x for some DSP-heavy | apps) and also around 5 to 10x for speed. An FPGA can | easily run at 100s of MHz, up to 500 with good design | pipelining, so suggesting an ASIC could do 20x that speed | implies 500 MHz x 20 = 10 GHz, definitely beyond most ASICs, | so I think 5x is more reasonable.
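dreamcompiler's point a few comments up -- that an FPGA's logic elements are just small memories addressed by their inputs -- can be sketched as a lookup table (a toy model that ignores routing and timing):

```python
# Sketch of the LUT idea: an FPGA logic element is a tiny memory whose
# "address" is the gate's inputs and whose stored bits are the truth
# table. Rewriting the memory changes the logic function -- no hardware
# change needed.

def make_lut(truth_table):
    """truth_table[i] is the output when the inputs, read as a binary
    number, equal i -- i.e. the bits loaded into the config SRAM."""
    def lut(*inputs):
        addr = 0
        for bit in inputs:
            addr = (addr << 1) | bit
        return truth_table[addr]
    return lut

# The same 4-entry memory cell array, "configured" two different ways:
nand = make_lut([1, 1, 1, 0])
xor  = make_lut([0, 1, 1, 0])

print(nand(1, 1), xor(1, 0))  # 0 1
```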
| brandmeyer wrote: | My experience is that to get those "high" clock | frequencies, the work per cycle has to be extremely | small. If you normalize to total circuit delay in units | of time, then you still end up many times worse, because | you need many extra pipeline cycles to get the Fmax that | high. | dragontamer wrote: | That's a decent analogy. | | Note however that Xilinx has a DSP slice (UltraScale) which | is a prefabbed adder / multiplier. This would be PyTorch in | the analogy. | | FPGA LUTs cannot compete against ASICs, so modern FPGAs | have thousands of dedicated multipliers to compete. | | The LUTs compete against software, while the dedicated | UltraScale DSP48 slices compete against GPUs or tensor cores. | | -------- | | It's not easy, and it's not cheap. But those UltraScale | DSP48 units are competitive vs GPUs. | | It's still my opinion that GPUs win in most cases, due to | being more software-based and easier to understand. It is | also cheaper to make a GPU. But I can see the argument for | Xilinx FPGAs if the problem is just right... | TomVDB wrote: | In my experience, the 20x performance number is after | taking the DSPs into account. | dragontamer wrote: | Just looking at raw FLOPs: the 7nm Xilinx Versal series | tops out at 8 32-bit TFlops (DSP cores only), plus | whatever the CPU core and LUTs can do (but I assume the CPU | core is for management, and the LUTs are for routing and not | dense compute). | | In contrast, the NVidia A100 has 19 32-bit TFlops. Higher | than the Xilinx chip, but the Xilinx chip is still within | an order of magnitude, and has the benefits of the LUTs | still. | | ----- | | It should be noted that the Xilinx Versal "AI Engine" is a | VLIW SIMD architecture: | https://www.xilinx.com/support/documentation/white_papers/wp..., | effectively an ASIC-GPU hardwired into the FPGA. | tails4e wrote: | Yes, and FPGAs can be better than GPUs for some | applications, even more power-efficient and cost-effective.
| banjo_milkman wrote: | Raw FLOPs is completely misleading, which is why Nvidia | focuses on it as a metric. The GPU can't keep those ops | active - particularly during inference, when most of the | data is fresh so caches don't help. It's the roofline | model. | | In my experience FPGA > GPU for inference, if you have | people who can implement good FPGA designs. And inference | is more common than training. Much of this is due to | explicit memory management and more on-chip memory on an FPGA. | dragontamer wrote: | Well, my primary point is that the earlier assertion, | "GPUs are 20x faster than FPGAs", is nowhere close to the | theory of operations, let alone reality. | | ASICs (in this case: a fully dedicated GPU) obviously | win in the situation they are designed for. The A100, and | other GPU designs, probably will have higher FLOPs than | any FPGA made on the 7nm node. | | But not a "lot" more FLOPs, and the additional | flexibility of an FPGA could really help in some | problems. It really depends on what you're trying to do. | | ------ | | At best, a 7nm top-of-the-line GPU is ~2x more FLOPs than a | 7nm top-of-the-line FPGA under today's environment. In | reality, it all comes down to how the software was | written (and FPGAs could absolutely win in the right | situation). | TomVDB wrote: | > GPUs are 20x faster than FPGAs | | The original comment by brandmeyer said "ASIC", not | "GPU". | | Take the same RTL. Synthesize it for ASIC and for FPGA. | Observe a 20x difference after normalizing for power, | area, and clock speed. | davrosthedalek wrote: | The question is, how much does your algorithm get from | the 19 TFlops of a GPU, and how much from the 8 of the | Versal? I'm sure many algos fit GPUs fine, but some | don't, and might get more out of an FPGA. | XMPPwocky wrote: | Also note Xilinx's next-gen parts have dedicated VLIW "AI | accelerators" (full hard CPUs!)
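The roofline model banjo_milkman mentions can be sketched in a few lines (the peak and bandwidth figures below are illustrative, loosely modeled on the A100 numbers discussed above):

```python
# Sketch of the roofline model: attainable FLOP/s is capped by either
# peak compute or (memory bandwidth x arithmetic intensity), whichever
# is lower. Low-intensity kernels -- like inference over fresh data --
# are bandwidth-bound and never reach the advertised peak.

def roofline(peak_tflops, bandwidth_tbps, intensity_flops_per_byte):
    """Attainable TFLOP/s for a kernel with the given arithmetic intensity."""
    return min(peak_tflops, bandwidth_tbps * intensity_flops_per_byte)

# A hypothetical 19-TFLOP GPU with 1.5 TB/s of memory bandwidth:
for intensity in (1, 4, 16, 64):
    print(intensity, roofline(19.0, 1.5, intensity))
# Only at high arithmetic intensity does the kernel become
# compute-bound and see the full peak.
```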
| dragontamer wrote: | Those AI accelerators aren't really "full CPUs", since | there's no cache coherence, really. They're tiny 32kB | memory slabs + decoder + ALUs + networking to connect to | the rest of the FPGA. | | But it's certainly more advanced than a DSP slice (which | was only somewhat more complicated than a multiply-and-add | circuit). | | ------- | | I guess you can think of it as a tiny 32kB SRAM + CPU, | though. But it's still missing a bunch of parts that most | people would consider "part of a CPU" -- even a GPU | provides synchronization primitives that let its cores | communicate and synchronize with each other. | brandmeyer wrote: | With one notable detail: it's much easier to stream data | through a GPU than it is through an FPGA. And I say | that fully knowing how much of a (relative) PITA it is to | stream data through a GPU. | | I think it also works better to think of the DSP | resources as a big systolic array with lots of local | connectivity and memory and only sparse remote | connectivity. The SIMD model doesn't really apply. | XMPPwocky wrote: | What's annoying is there's no real reason it has to be | harder to stream through an FPGA - it's largely just that | the ecosystem and FPGA vendor tooling is so utterly | garbage that installing it will probably attract raccoons | to your /opt. | NOGDP wrote: | > But those UltraScale DSP48 units are competitive vs | GPUs. | | Are they really competitive from a price / performance | perspective? Based on my limited understanding, Nvidia | GPUs, for example, are several times cheaper for similar | performance? | dragontamer wrote: | Mass-produced commodity processors will always win in | price/performance. That's why x86 won, despite "more | efficient" machines (Itanium, SPARC, DEC Alpha, PowerPC, | etc.) being developed. | | One of the few architectures to beat x86 in | price/performance was ARM, because ARM aimed at even | smaller and cheaper devices than even x86's ambitions.
Ultimately, ARM "out-x86'd" the original x86 business | strategy. | | ------------- | | GPUs managed to commoditize themselves thanks to the | video game market. Most computers have a GPU in them | today, if only for video games (iPhone, Snapdragon, | normal PCs, and yes, game consoles). That's an | opportunity for GPU coders, as well as supercomputers that | want a "2nd architecture" more suited for a 2nd set of | compute problems. | | ----- | | FPGAs will probably never win in price/performance | (unless some "commodity purpose" is discovered, which I find | highly unlikely). Where FPGAs win is absolute | performance, or performance/watt, in some hypothetical | tasks that CPUs or GPUs don't do very well (ex: BTC | mining, or deep learning systolic arrays, or... whatever | is invented later). | | Computers are so cheap that even a $10,000 FPGA may save | more in electricity than an equivalent GPU over the 3-year | lifespan of their usage. Electricity costs of data | centers are pretty huge. | | The ultimate winner is, of course, ASICs: a dedicated | circuit for whatever you're trying to do (ex: Deep | Blue's chess ASIC, or Alexa's ASIC to interpret voice | commands). But FPGAs serve as a stepping stone between | CPUs and ASICs. | | ------ | | If you have a problem that's already served by a | commodity processor, then absolutely use a standard | computer! FPGAs are for people who have non-standard | problems: weird data movement, or compute so DENSE that | all those cache layers in the CPU (or GPU) just get in | the way. | dnautics wrote: | I kind of love this crude analogy. | rjsw wrote: | I'm guessing you didn't mean to write "softcores"; licensing | a SiFive design to be a hard core connected to the FPGA | fabric would be one option. | gmueckl wrote: | Argh, you are right! Thanks for pointing it out. I did pick | the wrong wording. | duskwuff wrote: | Xilinx already has a number of devices with ARM hard cores, | though (like the Zynq series).
There's no compelling reason | for them to switch away from that. | baybal2 wrote: | I do not believe it makes sense to spend so much money on | a niche within a niche like AI/ML chips. | | And I believe AMD are good with using calculators. | datameta wrote: | Being early on the ML hardware acceleration boat is going | to pay off astronomically. Embedded inferencing is going | to be a society-defining technology by the time we hit | mid-decade. It's already being used for predictive | maintenance of machinery in IIoT with huge payoffs via a | decrease in unforeseen total machine failures or the need | for heavy overhauls. | hacknat wrote: | It really blows my mind how many people are still bearish | on ML. It's fair to argue timelines (although even that | is becoming less true), but I think the evidence is | firmly on the side of the bulls now. | ansible wrote: | And for their products that include hard cores, maybe | they will switch to RISC-V like with the Microsemi | PolarFire. | | I'm still debating on getting the Icicle development kit. | phendrenad2 wrote: | Years ago, when Intel acquired Altera and announced Xeon | CPUs with on-chip FPGAs, I was optimistic that eventually | they would add FPGAs to more low-end desktop CPUs (or at | least Xeons in the sub-$1000 zone). But it never | materialized. I'm slightly optimistic this time around | too, but I suspect that the fact that Intel didn't do it | hints at some fundamental difficulty. | ianhowson wrote: | It's the usual "fundamental difficulty" with FPGAs -- | CPUs and GPUs are faster and more power-efficient for | compute-intensive tasks. An algorithm on FPGA needs to | overcome the 20x worse architectural efficiency just to | break even with a CPU or GPU. | | The big benefit of having the FPGA closely attached to | the CPU is that you can access the memory and internal | buses quickly. Transferring stuff over PCIe hurts a lot.
So you could make an | argument for jobs using small work units requiring fast | turnaround; CUDA kernels take milliseconds to launch. | | I worked with some of the early Xeon+FPGA parts and there | just wasn't that much we could do with them. There wasn't | enough fabric to build anything meaningful and we had an | abundance of CPU cores, so the best we could do was | specialized I/O accelerators. | Symmetry wrote: | I think the more relevant comparison here would be ASICs. | Softcores on FPGAs are indeed terrible, but if you're | implementing some algorithm directly at the gate level | for cryptography or signal processing or whatever, then | being able to arrange inputs and outputs into dataflows | is a big win, with no roundtrips to general-purpose | registers or bypass networks. Not having to fetch | instructions or be limited in parallelism is also a big | win. And generally if you're doing something like mining | bitcoin you should expect an FPGA to perform somewhere | between an ASIC and a GPU. | | The problem is that if a task is common then someone is | just going to make an ASIC to do it. And if it's uncommon | then the terrible FPGA software ecosystem and low | prevalence of general-purpose FPGAs in the wild mean that | people will just do it on a CPU or GPU. | ianhowson wrote: | > if you're implementing some algorithm directly at the | gate level for cryptography or signal processing or | whatever, then being able to arrange inputs and outputs | into dataflows is a big win, with no roundtrips to | general-purpose registers or bypass networks | | This is true, but keep in mind that that sort of | algorithm runs _insanely_ well on any CPU or GPU because | they, too, do not want to touch main memory. You would be | blown away by how much work a CPU can do if you can keep | the working set within L1 cache. | | Re.
ASICs, it's a continuum: | | - "flexible, low performance, cheap in small quantities" | (CPUs) | | - "reasonably flexible, better performance, cheap-ish in | small quantities" (GPUs) | | - "inflexible, best performance, expensive in small | quantities" (ASICs) | | FPGAs fit somewhere between GPUs and ASICs -- poor | flexibility, maybe great performance, moderate small- | quantity price. | | If your problem is too big for GPUs, as you say, | sometimes it's easiest to jump straight to an ASIC. But | it's such a narrow window in the HPC landscape. The vast | majority of customers, even with large problems, are just | buying a lot of GPUs. They're using off-the-shelf | frameworks even though a custom CUDA kernel would give | them 10x performance and 10% cost. The cost to go to an | FPGA is too great and the performance gain simply isn't | there. | ip26 wrote: | It absolutely seems like there are some incredible | opportunities in the high end. But as far as I know, | FPGAs are quite area-hungry, which makes them inherently | expensive. It's hard to think you'd find FPGAs of | meaningful size included in $60 desktop CPUs, unless the | harvesting opportunity is significant. | baybal2 wrote: | On-package, not on-chip | Nokinside wrote: | Nokia designed their ReefShark 5G SoC chipset with a | significant FPGA component and used Intel as their | supplier. Intel couldn't deliver what they promised. It | was a complete disaster. | | They had to redesign ReefShark and cancel dividends. It | was a huge setback. | noki_throway wrote: | This is utter bullshit. Nokia f*cked up because they | over-engineered their FPGA solution for 5G. They took the | largest FPGA on the market and couldn't squeeze their | design into it. | | It was not a Nokia SoC, just a plain Stratix 10. They | moved to their own SoC after that glorious project. | rathel wrote: | Apparently after the Altera acquisition they sought | "synergies" in all the different divisions.
My friend was an intern who was | tasked with porting some of the network protocol stack to | SystemVerilog. Apparently it did work, and SystemVerilog | was the right HDL to use because of support for structs | that can map to packet headers. I'm not sure it's being | used in production. | | It'd be interesting to see how AMD will execute and | integrate this acquisition, considering they are less of | a madhouse company than Intel. | saddlerustle wrote: | They never ended up shipping the high end ones either. | QuixoticQuibit wrote: | I'm skeptical as well. The primary reason IMO is the | software. How do you easily reconfigure your FPGA to | efficiently run whatever computationally intensive and/or | specialized algorithm you have? | Someone wrote: | Also, for the way most modern CPUs are used: how do you | task switch? If the hardware is large enough, you can | deploy multiple configurations at a time, but does | software support that? Is it possible to have relocatable | configurations? | | In theory, you could even page out code, but I guess the | speed of that will be slow. Also, paging in probably | would be challenging because the logical units aren't | uniform (if only because not all of them will be | connected to external wires) | varispeed wrote: | This could be used with a client-server model: if there | are enough free cells and I/O available on the FPGA, it | could install the configuration, and then any application | could communicate with it concurrently, maybe with some | basic auth. | Someone wrote: | But from what I understand of FPGAs, fragmentation would | be a serious issue. You may have the free cells and I/O | you need to implement some circuit, but if they're | dispersed over your FPGA, or even connected but in the | wrong shape for the circuit you're building, that's | useless. | | An enormous crossbar could solve that, but I would think | that would be way too costly, if practically possible at | all.
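The fragmentation worry above can be made concrete with a deliberately crude toy model (my own sketch, not how any real FPGA place-and-route works): treat the fabric as a 1-D strip of identical cells with first-fit allocation, and a request can fail even though enough cells are free in total.

```python
# Toy model of FPGA fabric fragmentation. Real fabrics are 2-D and
# heterogeneous (LUTs, DSP slices, BRAM columns); a 1-D strip of
# identical cells only illustrates the allocation problem.

def first_fit(cells, size):
    """Return the start of the first contiguous run of `size` free cells, or None."""
    run = 0
    for i, free in enumerate(cells):
        run = run + 1 if free else 0
        if run == size:
            return i - size + 1
    return None

def allocate(cells, size):
    start = first_fit(cells, size)
    if start is not None:
        for i in range(start, start + size):
            cells[i] = False
    return start

def release(cells, start, size):
    for i in range(start, start + size):
        cells[i] = True

fabric = [True] * 10           # 10 free cells
a = allocate(fabric, 4)        # occupies cells 0-3
b = allocate(fabric, 4)        # occupies cells 4-7
release(fabric, a, 4)          # free the first block again

# 6 cells are free in total, but split into runs of 4 and 2,
# so a 5-cell design has nowhere to go:
print(sum(fabric))             # 6
print(first_fit(fabric, 5))    # None
```

The "enormous crossbar" idea in the parent comment amounts to removing the contiguity constraint entirely, which is exactly why it would be so expensive.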
| rjsw wrote: | You can reconfigure just part of the FPGA; it isn't used | all that often, though. | nomercy400 wrote: | It is doable. I've seen it during my Computer Engineering | courses 14 years ago. | | Basically you analyze the code for candidates, select a | candidate, upload your custom hardware design, run your | operation on the hardware, and repeat. | | The difficult part is that uploading your hardware to the | FPGA is on the order of tenths of seconds, which is ages | compared to the nano- and microseconds your CPU works in. | So your specific operation must be worthwhile to upload. | | A bit of FPGA on your CPU makes it more flexible; for | example, you could set a profile such as 'crypto' or | 'video' to add some specific hardware acceleration to | your general-purpose CPU. | | Imagine your CPU being able to switch your embedded GPU | into another CPU core. | hajile wrote: | Codecs are a great example. | | Let's say the current Zen 2 had an FPGA onboard. AMD | could sell you an upgraded design with AV1 support for a | few dollars. Most people aren't going to buy a new CPU on | the basis of a video decoder, but they'll buy an upgrade | to the chip that auto-"installs" itself. That's a sale | AMD otherwise wouldn't have made. | dboreham wrote: | Except the new codec won't fit into the FPGA they put on | that chip that's in the field. | eqvinox wrote: | The codec is gonna get nowhere near to filling a "CPU- | class" FPGA, so if anything you get fewer parallel | instances of it. | threatripper wrote: | I would see it being used more like a GPU than a CPU. | dragontamer wrote: | An actual GPU or CPU will always run circles around an | FPGA CPU or FPGA GPU. | | Where FPGAs win are new architectures, like systolic | engines. Entirely different computer designs from the | ground up. | gmueckl wrote: | Even GPUs multitask all the time, even though it's less
Cooperative multitasking in this context means | setting up and executing different shaders/kernels. The | overhead involved in this is quite manageable. | | Repurposing FPGAs to different tasks means loading a new | bitstream into the device every time. So it is much more | efficient to grant exclusive access to each user of the | device for long stretches of time. The proper pattern for | that is more like a job queue. | wtetzner wrote: | I believe there is some amount of support in OpenCL for | FPGAs. If only we could get companies to properly support | OpenCL, we'd have a nice software interface to pretty | much any kind of compute resource on a machine. | SSLy wrote: | My armchair amateur brain immediately thought about | something CUDA-like. | numpad0 wrote: | FPGA code takes hours to compile, and the output is | product/model specific | FPGAhacker wrote: | You would use precompiled modules or compositions of | these modules (pipeline or parallel). | | This can be a relatively fast operation. Seconds or less | depending on complexity. | simias wrote: | You're not wrong, but I expect they'd make it so that the | various models would be similar enough (at least within a | given CPU generation) that you could use mostly | precompiled artifacts instead of rerouting everything | from scratch. | | I've always been pretty skeptical of their approach | though; in order to be usable they'd need excellent | tooling to support the feature, and if there's one thing | that existing FPGA software isn't, it's "excellent". | | Getting FPGAs to perform well is often an art more than a | science ("hey guys, let's try a different seed to see if | we get better timings"), so the idea that non-hardware | people would start to routinely generate FPGA bitstreams | for their projects is so implausible that it's almost | comical to me. | | Maybe one day we'll have a GCC/LLVM for FPGAs and it'll | be a different story.
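The reconfiguration-cost tradeoff raised upthread (loading a bitstream takes a large fraction of a second, while the CPU works in nano- and microseconds) reduces to a simple amortization question. A sketch with made-up numbers of my own, purely illustrative:

```python
import math

# Offloading a task to an on-package FPGA only pays off once the per-call
# speedup has amortized the one-time cost of loading the bitstream.

def breakeven_calls(reconfig_s, cpu_s_per_call, fpga_s_per_call):
    """Smallest call count for which reconfig + FPGA time beats staying on the CPU."""
    gain = cpu_s_per_call - fpga_s_per_call
    if gain <= 0:
        return None  # the FPGA path never wins
    return math.floor(reconfig_s / gain) + 1

# Made-up numbers: 0.5 s to load a bitstream, 1 ms/call on the CPU,
# 0.1 ms/call on the FPGA -> only worth it past ~556 calls.
print(breakeven_calls(0.5, 1e-3, 1e-4))   # 556
```

With seconds-scale reconfiguration, only long-running or frequently repeated workloads clear the bar, which is why a rarely switched profile ('crypto', 'video') is plausible while per-task reconfiguration is not.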
| pclmulqdq wrote: | Beyond the GCC/LLVM, you also really need a standard | library. Nobody is talking about that. Today, if you want | a std::map on an FPGA, you have to either pay $100k or | build it yourself. That's untenable. | himinlomax wrote: | I wonder how much of the delay in FPGA tech adoption is | due to the utterly hilarious disaster that are the | toolchains. They look like huge brittle proprietary | monstrosities, incompatible with modern development | methodologies. | rowanG077 wrote: | I think pricing is also an issue. Anyone with 5 dollars | in their pocket can buy an Arduino clone and go to town. | And many people do, as can be seen by the huge hobbyist | scene. You want to try FPGA development and do anything | that is not blinking an LED? Good luck shelling out | hundreds to thousands of dollars for the shittiest | software known to this planet. | bsder wrote: | A Max10 T-Core board from Terasic is $55 academic and | tools are free for the Max10 class. | | You only start paying for FPGA tools when you need the | really big FPGAs. | | And, I'll go out on a limb, but, at this point, I think | Arduino causes more harm to beginning embedded developers | than good. Yeah, the ecosystem is wonderful if you aren't | a developer. | | However, Arduino is now weird compared to mainstream | embedded development. Most things have converged to | 32-bit instead of 8-bit. Arm Cortex-M is now mainstream, | so your architectural understanding is useless. 5V causes | a lot of grief given that everybody else in the world is | at 3V/3.3V. | | A developer basically has to _unlearn_ a bunch of things | to move up from an Arduino. I still recommend Arduino to | non-developers or somebody just trying to throw together | a project, but I no longer recommend them to someone | actually trying to learn embedded development. | rowanG077 wrote: | What does USB 3, gigabit Ethernet, or PCIe IP cost? Is it | free using Intel?
| KSteffensen wrote: | And they are based on SystemVerilog and TCL, two of the | worst programming languages in serious use. | | Those toolchain disasters are not quite as hilarious when | you have to use them daily.... | himinlomax wrote: | TCL itself is not _that_ bad for the purpose IMO; it's | more the stuff around it, the proprietary binary formats, | the gooey crap, and the non-open nature thereof. | bsder wrote: | Oy. I'm a Python guy, but Tcl is _NOT_ that bad. Do not | blame the horrible software engineering at Altera and | Xilinx on Tcl. Those companies make more than enough | money that they could sit down with Tcl and Tk, spend | some time on the code, and have a quite decent tool. | Instead, they keep their bitstream completely closed to | lock out competitors and saddle the world with shitty | tools. | | I'm _really_ surprised that Lattice hasn't tried to go | around Xilinx and Altera by doing exactly that. You would | think that an open bitstream format and a couple million | dollars thrown at academic researchers (Lattice makes | about $200 million per quarter in gross profit) would | produce some real progress, but I digress ... | | SystemVerilog, on the other hand, was specifically | created because Verilog and SystemC got loose to the end | users and the EDA companies were not going to make that | mistake again. So, yeah, SystemVerilog is pretty bad. | eqvinox wrote: | I'm hoping/expecting a chip that goes into the Epyc/SP3 | socket and has the memory & PCIe & socket crossconnect as | hard IP but the CPU cores replaced with programmable | logic. If you have a use case for FPGAs, it's more likely | you want it in a concentrated form like this... not on | low-end or desktop systems :/ | | If I remember correctly, there was something similar back | in the early HyperTransport days... | einpoklum wrote: | Well, you want something more low-end for developers' and | hobbyists' machines, I would guess.
| eqvinox wrote: | As much as I agree with you and want one for myself too, | I doubt that this market segment is interesting to AMD at | all. The kinds of workloads that warrant going FPGA are | the kind of workloads where you just give your devs a | bunch of high-priced development systems. Those would | likely be close to identical to the production boxes, | just with more debug pieces plugged in. | hderms wrote: | Are you envisioning retaining at least a few cores? It | seems like you'd probably still want an OS running on | native silicon. | detaro wrote: | I assume they are thinking about a design for | multi-socket systems. | eqvinox wrote: | Yeah, I think it's both more effective and cheaper to | have dual/quad socket systems with 1 "normal" CPU and the | rest filled with FPGAs without CPU cores, just to max out | on the raw crunching ability. The PCIe block on the FPGA | chips could be flexible enough to (re-?)wire directly | into the programmable logic, maybe even reconfigurable to | other protocols (e.g. 100GE). Also, in "normal" NUMA | fashion, each FPGA would have the memory channels | associated with that socket (presumably through the | interconnect as if it were a CPU, so the CPU can access | it too.) | | I'm just looking at this from a logical chain of "who | needs FPGAs in their computers?" => "cases with loooots | of specific data crunching" => "want a | controlling/driving CPU for the complicated parts, but | then just concentrate as much FPGA in as possible." => | Multi-socket with 1 CPU & rest FPGAs. | | (There currently is no commodity quad-socket SP3 | mainboard; not sure if this is a design limitation or | just that no one has made one yet. I'd still say the | approach works great with only 2 sockets.) | zrm wrote: | I wouldn't expect to see anything like this on SP3 | anyway, since it would take some time to do the work and | by then the current generation would likely be whatever | they replace SP3 with in order to support DDR5.
| dboreham wrote: | This opinion is unlikely to be popular, and it's been | decades since I was a full participant in the hardware | business, but... I just have never seen the use case for | FPGAs beyond niche prototyping / small run applications, | which by definition make no money. I suppose there are | also scenarios where you want to keep your design secret | from the fab and/or change it every week, but those seem | very niche too (NSA, GCHQ, ..?). | jleahy wrote: | Not long ago there was an FPGA inside the iPhone (an | iCE40). Hardly niche. | Answerawake wrote: | Really? What function did it serve in the iPhone? | ATsch wrote: | Likely just simple glue logic. Things like converting one | protocol into another, doing some multiplexing or some | simple pre-processing or filtering on some sensor data. | They're incredibly tiny (2x2mm) and use little power, so | they pop up in designs pretty regularly. | 908B64B197 wrote: | I wonder if they are reprogrammable, so if what's running | on these could ever be updated. | seany wrote: | For low volume (sub 100k units?) they're often the only | good way to do configurable* SERDES in any environment | that is latency sensitive. | | Configurable as in one SKU is in several products, but | not necessarily reconfigurable by the end user. | ljhsiung wrote: | Couple things-- | | 1) You underestimate how critical prototyping has become, | again likely since you say it's been a couple decades. | Time to market has become more important, and | verification has become harder as CPUs have gotten even | more complex. FPGAs enable cosimulation and emulation, | leading to faster iteration of both design and | verification efforts and thus better TTM. | | FPGAs are _so_ important in the hardware development | process that I would even say you're not a serious | hardware company if you _don't_ have any FPGA frameworks | to design silicon.
| | 2) As others have mentioned, FPGAs are also critical for | low-latency workloads that require constant tweaks -- | high-frequency trading (ugh...) comes to mind. The need | for "constant tweaks" could also be satisfied with just | "normal" software, but that has higher latency than an | FPGA, and FPGAs can get some crazy performance if you're | willing to pay the price (south of 7 figures). | | Overall sure, usage of FPGAs might be niche compared to, | idk, Javascript; but it's commonplace/practically | essential in hardware. | ATsch wrote: | It is very likely that the packets of this comment | traveled through several FPGAs to get from your computer | to my screen. Yes, they are definitely more niche than | CPUs. But niche products have really high margins and | people willing to pay for them. | | FPGAs are already incredibly popular. They're just mostly | in things you are unlikely to personally own or know | about. You're going to find at minimum one, but probably | more, FPGAs in things like big routers and other telecom | equipment, e.g. cell towers, firewalls, load balancers, | enterprise wifi controllers, video conferencing hardware, | test equipment like oscilloscopes, sensor buoys, | scientific instruments, MRI machines, LIDARs, high end | radio equipment, or even just glue logic tying together | other components, like in the iPhone. | moftz wrote: | Easily changing the design and being cheaper than ASICs | for small production runs are the two main uses for | FPGAs. You may be designing a box that can be configured | to do different things, so you may want to support | multiple FPGA images to switch back and forth depending | on the mission. You may just want to be able to easily | upgrade firmware for a complex design in the future.
For space DSP applications, the | FPGA is king and will probably be for a long time, simply | due to the ability to cram a lot of functionality into a | small space (DSP, microcontroller, combinational logic | circuits, and massive I/O banks all in one chip) | teleforce wrote: | I am not sure whether you are serious or trolling but I | will bite ;-) | | FPGAs are used in many types of applications where real- | time operation is necessary and non-recurring engineering | (NRE) cost needs to be minimized, for example here [1]. | | One classic example is that if you poke under the hood of | any signal generator like AWGs, you will probably find an | FPGA inside. As you are probably aware, since you are in | the hardware business, AWGs are probably one of the most | common pieces of equipment in any electronic and | electrical lab or company. | | [1] https://www.electronicdesign.com/technologies/fpgas/a | rticle/... | y2kenny wrote: | For those who are not familiar, AWG stands for Arbitrary | Waveform Generator. | bsder wrote: | > I just have never seen the use case for FPGAs beyond | niche prototyping / small run applications, which by | definition make no money. | | You are precisely correct. FPGAs are useful when your | volumes don't reach the point where an ASIC would get | amortized. | | Networking companies (Cisco, Juniper, etc.) are | classically big consumers of FPGAs. | | Tektronix seems to make quite a bit of money and there is | at least one FPGA in practically every test instrument | they make. This holds true for practically all test | instrument manufacturers. | | I know a _LOT_ of industrial automation and testing | companies that generally have FPGAs in their systems. | Both for latency and for legacy support (Yeah, GPIB still | exists ...). | | Yes, they aren't "Arm in a cell phone" type volumes, but | that doesn't mean they aren't quite profitable if you can | aggregate them. | eganist wrote: | I'm optimistic...
not so much because of the merits of the | acquisition, but more so because of AMD's history with | strategic actions. ATI kept them afloat through a CPU | performance drought, and divesting GlobalFoundries | secured necessary liquidity. These two alone essentially | saved AMD, so I've got faith in leadership being able to | make the appropriate strategic maneuvers. | | But maybe I'm being overly optimistic. (Probably because | -- disclosure -- I'm long AMD. Been long for years.) | MrXOR wrote: | But FPGAs are the future. They will put the CPU out of | work almost entirely. | | https://www.nextplatform.com/2020/01/31/when-will-fpgas-outw... | | https://www.nextplatform.com/2018/03/19/fpga-maker-xilinx-sa... | FPGAhacker wrote: | Could be interesting. I prefer an independent Xilinx, but | maybe competition with Intel will stimulate the whole | reconfigurable computing revolution that fizzled out. | mmrezaie wrote: | I understand that they need a big push in the DPU market, | but I do not understand why companies as big as AMD do | not invest and build what they need in house. If anyone | can gather the talent, it is AMD. Everyone was talking | about future data centers, and as far as I can tell I | have been hearing about heterogeneous I/O since 2009 (and | that's just me; I was hearing it while working on Xen). | | To answer my own question: maybe the market is so | volatile that they cannot do strategic planning like | that? | dspillett wrote: | _> why companies as big as AMD do not invest and build | what they need in house?_ | | Often it is an expertise thing, especially when buying | smaller companies. See | https://en.wikipedia.org/wiki/Acqui-hiring | | With larger purchases like this one that can still be | part of the equation, though there is also the matter of | lead times needed to bring a significant team and related | infrastructure for the project(s) online and up to speed.
| | Also, if a company is seen as ripe for buying, it can | sometimes be done in part to stop a competitor getting a | chance at the above advantages. | | I suspect a mix of all three is at play here. | Guthur wrote: | Hiring talent is hard and risky, especially in an area in | which you are actually expert. | | The less risky option is to just acquire a company that's | done it all for you. | | Of course that does leave the merging part. But on paper | it looks fast and easy. | mmrezaie wrote: | Discipline of the company is more important than the | talent. Oracle has a lot of talent, but they lack the | discipline to generate anything novel. They buy the idea. | Anyhow, the semiconductor industry is different. Apple | and Amazon played it better in my opinion, although in | vastly different markets. | CodeArtisan wrote: | Xilinx is the holder of more than 4000 patents. | mmrezaie wrote: | This is a good point, but it mostly matters when you are | in the middle of developing the strategy, so that you can | protect it and have a robust plan. Maybe they are! | nfriedly wrote: | I'd like to see consumer-level CPU + GPU + FPGA products | that emulators could take advantage of. I'm thinking of | floating point math for the PS2 right now, but I'm sure | there are other examples where an FPGA could be | beneficial. | beezle wrote: | So who are left in the FPGA space? Lattice? | duskwuff wrote: | Not a lot. Actel was acquired by Microsemi in 2010, and | Microsemi was in turn acquired by Microchip in 2018. | | There's a couple of upstarts in China like Gowin and | Anlogic, but they haven't made much of an impact in the | larger market yet. | ohazi wrote: | You can still buy Altera FPGAs, and you'll still be able | to buy Xilinx FPGAs -- they're not going to just throw | away a three billion dollar business. | | Lattice is probably the next biggest. There's also | Microchip (< Microsemi < Actel), QuickLogic, and Gowin.
| | Nobody really came close to competing with Altera / | Xilinx at the high end, though. | xiphias2 wrote: | This comment is off topic, but while I'm listening to the | earnings call, I don't hear anything specific about | official PyTorch and TensorFlow support for AMD graphics | cards. All the questions and answers are generic, with | buzzwords like AI and "doubling down on our software | support", but it doesn't give me confidence to change my | NVIDIA GPU to an AMD one for the foreseeable future. | | I remember the time when Elon Musk said to an analyst | that he's asking boring questions to fill in his | spreadsheet, and I'm feeling the same thing while | listening to the earnings call. | ksec wrote: | Yes, I posted another link [1] for the earnings call but | it didn't get much traction / upvotes. Although I have to | admit acquiring Xilinx is much bigger news. | | Judging from watching the financial news and reading | analysts' comments for years, my feeling was that their | job was not to push for hard questions or honest answers. | Their job is to push whatever interest they have in the | company. So a spin for better long term prospects and | downplaying of risk. | | I was happy with the Enterprise results ( +116% YoY ) | until I read this: | | > Revenue was higher year-over-year and quarter-over- | quarter due to higher semi-custom product sales and | increased EPYC processor sales. | | Semi-custom is definitely PS5 and Xbox. | | Basically I still don't see EPYC making enough inroads in | the server market. And this is worrying: while the | stocks, reviews, and hype are all going to AMD, no | results so far have shown that Intel is hurt or that AMD | is making big gains in market share and revenue share. | | The only good part I guess is Ryzen Mobile's contribution | to the Computing and Graphics segment. | | [1] https://news.ycombinator.com/item?id=24906314 | hajile wrote: | The server market moves and replaces slowly.
Even when Intel | was beating AMD by 30% or more, it still took years for | AMD's percentages to drop. | | AMD EPYC on AWS is 10.42% cheaper than Intel per hour | across the board for m5 instances (9.6% cheaper for t3 | instances). 7nm EPYC saves more than 10% power vs Intel | 14nm, and per-chip savings from buying AMD are way more | than 10% vs similar Intel offerings. Why can Amazon keep | prices that high? Because people will pay and still | consider it a deal. | | AMD's main issue still seems to be availability, due to | so much competition for 7nm and TSMC being reluctant to | build new fabs. | Teever wrote: | Do you have any idea why TSMC is reluctant to build new | fabs? | dogma1138 wrote: | Because outside of very specific cases like solar panels, | displays, and LEDs, spinning up new fabs seems to be | quite risky: the investment cost is huge, and it seems | that TSMC is very much content to make major bucks with | fewer customers that are chasing the latest and greatest | node. | | The more fabs they have, the slower their node | progression will be, and the cost of each new node these | days seems to be almost exponential. | JackMcMack wrote: | Earnings calls never seem to have any hard questions. | There was a projections miss a few quarters ago (very | rare for AMD), and even then the questions felt more like | PR. | | Semi-custom is indeed PS5 and Xbox. This increase was | expected of course. It's lower margins than other | segments though. | | EPYC adoption is indeed slower than I would've liked. But | so far it has matched or beaten short and long term | projections from AMD. | | Enterprise is weird. AMD has better price and | performance? Let's buy Intel. Meltdown lowers | performance? Let's compensate by buying more Intel. Intel | is supply constrained? Let's complain, and still buy | Intel. | | My guess is Intel is still pressuring OEMs to favor | Intel. Cloud providers are increasing adoption though, | and there have been a few nice HPC wins.
| | And let's not forget that AMD is selling every chip they | can; TSMC production is fully booked. | smallnamespace wrote: | Earnings calls _do_ have hard questions in them, but you | might miss them because they're asked with a big dose of | circumlocution. | | If you look at the incentives a bit, management gets to | decide which analysts can ask questions, so analysts need | to stay on management's good side. | | Analysts know the company's figures inside and out (they | often have a 10-tab spreadsheet with an extensive | operational model of the company), and are asking | questions to tweak key model assumptions. | | So analysts ask pointed questions in shared jargon with | management. You don't ask 'are you seeing a big sales | drop because the crypto bubble blew up?', you say 'can | you provide some color on when your excess inventory will | clear from the channel?' | | Analysts get their answers. Management avoids bad | headlines written by casual listeners. | | There's an additional layer, which is that the analysts | know the industry and company very well, so general bad | things are already background knowledge (there's no | reason to ask about them). If you want to know what the | analyst already knows, then pay for their report -- | they're not reporters fishing for a sound bite. | JackMcMack wrote: | My investment in AMD is so small that it seems silly to | pay for analyst reports. I don't care about the short | term, and I have enough faith in my own research to keep | believing in a brighter long term. | | Are there any public resources dissecting an earnings | call? Doesn't have to be recent or AMD. | xiphias2 wrote: | You don't get much information from earnings calls, only | how analysts are thinking (usually short term). Product | reviews on HN, tech-oriented in-detail sites, youtube | reviews from gamers, long term analysts like Ark Invest, | and even github issues/comments are much better long term | predictors of success/failure.
| | Just as an example, I remember reading a lot about AirBnB | here on HN when it wasn't even known in Eastern Europe. I | suggested he use it to rent out the luxury apartment | he had just bought there, and he was the first one to | rent out a luxury apartment in the country (also somebody | from AirBnB's management flew there personally)...he made | lots of money from rental fees of course, but also AirBnB | became incredibly successful. There are lots of other | examples of course, this was just the least controversial | one I can write here :) | smallnamespace wrote: | For earnings calls specifically, Seeking Alpha sometimes | has decent discussion. | | Take everything you read with a big grain of salt though. | You get what you pay for, especially in finance--'free' | content usually comes with an agenda. | selectodude wrote: | From a personal point of view of somebody who just upgraded | to an AMD-based desktop from an Intel one, the CPUs work | great. It's the software support that feels half-baked. | Enabling memory integrity blue-screens my computer. I need | to update the chipset drivers every month because AMD is | still dialing in their scheduler and frequency scaling over | a year after release, and the idle power usage is | inexcusable. | | AMD's performance is fantastic, which is why I was fine | getting it for a home gaming desktop, but I'm not sure I'd | be willing to pull the trigger on AMD for an enterprise | server buildout. Intel still simply (actually or otherwise) | _feels_ more reliable. | xiphias2 wrote: | I can imagine that switching from Intel to AMD takes a bit of | time for servers, as supporting 2 architectures at the same | time is usually bad news for 99th-percentile latencies for web | services. At the same time x86 is a mature instruction set | (as long as you don't use 512-bit vectorization, where things | are getting tricky), so the transition shouldn't be that | hard.
| ckastner wrote: | My impression is that the accelerated computing side of AMD is | receiving far, far too little attention. For example, their | flagship GPU is still not officially supported by ROCm (AMD's | answer to CUDA) [1]. Imagine the 2080 Ti not being supported by | CUDA. | | I've become a huge AMD fan, both because of their hardware, and | because of their commitment to open source. But while the | battles they have won against Intel on the x86 side are | impressive, it seems that CUDA is leaving them far behind. | | [1] https://github.com/RadeonOpenCompute/ROCm/issues/887 | xiphias2 wrote: | "AMD ROCm is validated for GPU compute hardware such as AMD | Radeon Instinct GPUs. Other AMD cards may work, however, they | are not officially supported at this time." | | It seems like Lisa Su thinks that there's a separate "gamer | market" and "accelerator market". | | Jensen Huang understands that the same person can like to | play games and train machine learning models on the same | machine. | | I'd love to switch to an AMD CPU to have a portable laptop with | low resource usage, as I spend most of my time travelling, | but as GPUs in the cloud are overpriced (thanks to Jensen's | separate pricing for servers), and internet in hotels | is unpredictable, I don't want to train models in the cloud. | | Anyways, Lisa said that she reads all comments about AMD, so | I hope she'll listen :) | Tuna-Fish wrote: | It likely will never have support. AMD chose to bifurcate | their GPU designs into compute (CDNA) and gaming (RDNA) lines, | with different architectures. RDNA sheds all the fancy | features needed to support modern compute, thus gets more | efficient in games that do not use it, but also cannot | support the modern compute APIs. | xiphias2 wrote: | NVIDIA is adding more features, like super-sampling (DLSS), to | games, and machine learning models are improving faster | than Moore's law.
I expect those fancy features, like | tensor cores, to be a must for 4K gaming in the future. | | What's funny is that the same strategy (leaving out | specialized instructions from consumer-level hardware) that | worked extremely well for CPUs won't work for GPUs in my | opinion. | | If you look at ray tracing hardware (I have it on my RTX | 2070 Max-Q card in my laptop), it sucks right now, but it's | improving very fast as machine learning algorithms improve. | | I just found this: | | https://www.tomshardware.com/news/amd-big_navi-rdna2-all-we-... | | One thing that I forgot is that AMD can just focus on | inferencing hardware (INT16 operations), and leave out | tensor cores...so actually you are right, I'll just stay | with NVIDIA GPUs. | fizixer wrote: | AMD are clowns. | | Clowns are acquiring Xilinx? To turn Xilinx into a clown show? | Hikikomori wrote: | How are they clowns exactly? | fizixer wrote: | Uhhh ... by playing the kind of competitor that nVidia | couldn't have imagined in their wildest wet dreams? | johnwalkr wrote: | Xilinx Zynq and UltraScale series are multiple-GHz ARM cores plus | FPGA. They're incredibly useful for small-volume niche use cases | and, to give an example from my industry, becoming popular in | space applications. The reason is that hardware | qualification/verification is extremely expensive but a change to | FPGA fabric is not. | | My point is Xilinx have already proven ARM CPU+FPGA on one die, | and I think AMD CPU+FPGA is very likely to be a success. | | Between this, ARM adoption, Apple Silicon and similar offerings | (which kind of skipped ARM+FPGA for ARM+ASIC), and RISC-V, it's like | 1992 again with exciting architectures. Only this time software | abstraction is much better, so there is not a huge pressure to | converge on only 1-2 architectures. | panpanna wrote: | ARM + ASIC? Isn't that simply a SoC?
| | Edit: technically, the ARM part is also an ASIC, but you get what | I mean | mkhalil wrote: | Well, considering how Ryzen Master runs, I don't have much hope | of Xilinx's software getting better through this acquisition. | throwmemoney wrote: | It is really funny when you find out that Intel uses Xilinx FPGAs | for prototyping, as they cannot get what they acquired (Altera) | working in-house. | cjwinans79 wrote: | If true, ouch! Intel seems to be getting kicked every which way | these days. Too complacent when they once ruled the roost. | rustybolt wrote: | Source? | hongseleco wrote: | I second this, I want details! | GloriousKoji wrote: | I don't work for Intel but I do work for a semiconductor | company. While Xilinx FPGAs aren't directly used for | prototyping, there are a large number of third-party boxes | purchased to accelerate hardware simulations and they're | chock-full of FPGAs. | yvdriess wrote: | That's the more likely explanation. Altera vs Xilinx | isn't just the hardware, it's an entirely different | toolchain. It would be insane of Intel to demand that third | parties move all their technology over to Altera's. | andy_ppp wrote: | Could programmable AI chips compete with graphics cards and TPUs, | or is it futile to try? | ninjaoxygen wrote: | In the stock trading world, HFT on FPGA is closer to the edge | than GPU / TPU solutions, and they are using ML models, if you | count that as AI. When the logic and the NIC are on the same | hardware, it's really fast. An ASIC would be even faster, but | you can't really iterate on that. | heliophobicdude wrote: | Perhaps they would not make good competition. FPGAs have been | known to be slower than ASICs. But then again, perhaps some | other company will find a good use for rapidly changing IC | design. | galangalalgol wrote: | Their new Versal line puts a TPU on the die with the FPGA.
| Great for inference, especially if you want to use the FPGA to | quickly extract features and then infer in feature | space. | hehetrthrthrjn wrote: | This is a smart move, reflecting Intel's own, with an eye to the | datacenter, where the FPGA is seen as having a bright future. | voxadam wrote: | I'd love to know why Intel chose to buy Altera instead of the | industry leader Xilinx. | hehetrthrthrjn wrote: | It may have been Xilinx not wanting to get into bed with | Intel. Xilinx may have wanted a degree of technical | independence or freedom to carry out their own strategy that | was not forthcoming from Intel. | saagarjha wrote: | Wonder what AMD told them. | dbcooper wrote: | "We will use TSMC." | rjsw wrote: | Probably worth looking at where the different FPGA brands | were being fabbed. Xilinx is a better fit for AMD. | andrew_gen wrote: | I believe Altera was already manufacturing chips on Intel's process prior to the acquisition, while Xilinx is using TSMC? | saagarjha wrote: | Yes, I believe they switched off of TSMC in 2013 or so. | pclmulqdq wrote: | Altera Stratix V FPGAs actually had more market share than | Virtex 7s. They were better chips. That said, the production | delays around Arria 10 and Stratix 10 and the time lag caused | by the Intel acquisition totally killed their market | position. The only reasons to use Intel FPGAs now are (1) | 64-bit floating-point support or (2) if your Intel salesman | gives you a really good deal. | ksec wrote: | Both Altera and Xilinx were on TSMC. Altera wanted an edge | over Xilinx; at the time, Intel was committing (on paper) to | their Custom Foundry. Altera switched and bet on Intel Custom | Foundry. Nothing ever worked out with Intel Custom Foundry | because they were not used to working with others on foundry | process. Intel thought the problem was with Altera not being | part of the company, and they had too much cash, so they might as | well buy them for better synergy.
And it did: getting | internal access _seems_ to have (on paper, or at least on slides) sped | things up with product launches and the roadmap, until they hit | the Intel 10nm fiasco. | saagarjha wrote: | Has Intel done much with Altera? I haven't heard much of | anything come out of that partnership. (Then again, I'm not | plugged in to this stuff.) | mastax wrote: | I don't use FPGAs (tooling is too poor, languages are bad, | up-front costs are high) but I hang out on FPGA forums, and | the overwhelming consensus has been bad. Chipmakers, and | especially high-performance chipmakers, have always been | focused on high-volume and/or high-margin customers, but the | Intel acquisition has made Altera worse in that regard. Their | sales and support teams were integrated into Intel, and now | you can't get any support from them whatsoever even if you | spend $MM/yr. You need to funnel even basic questions and bug | reports through a distributor contact to have any chance. I | forget the specifics, but they made tooling even more | restrictive/expensive. The only new products out of it are a | few Xeons with built-in FPGA ($$$$$), good for HFT guys I | guess. | baybal2 wrote: | > This is a smart move, reflecting Intel's own, with an eye to | the datacenter where the FPGA is seen as having a bright | future. | | What in the world do FPGAs have to do in a datacentre? | coldtea wrote: | The "datacentre" mentioned is not necessarily an enterprise | datacenter or a web app backend datacenter. | | Think ML, networking and other such uses... | RantyDave wrote: | Virtual machines are very much a thing now, and | virtualisation has made it into network cards reasonably well | ... but pretty well nothing else. | | In our future datacentre we want to say how many cores, | connected to how much RAM, how much GPU resource, some NVMe, | etc., and there's going to be a whole lot of very | specialised switching and tunnelling going on.
This needs to | be as close to the cores/cache as possible, a good order of | magnitude faster than we run our present networking stuff, | and probably an area where there will be a significant pace | of development, i.e. a software-defined solution would be nice. | | So, a software-defined northbridge, in essence. And an FPGA | is pretty much the only thing we have right now that could do | the job. | einpoklum wrote: | Because an FPGA lets you optimize your "hardware" solution to | a computing problem without the hassle of fabricating a chip | of your own (although the performance with an FPGA is much | lower than with a custom chip). | sadiq wrote: | Microsoft have been using them in Bing and other projects for | a while: https://www.microsoft.com/en-us/research/project/project-cat... | unsigner wrote: | Word on the street is that this was a vanity project of a | VP, and never resulted in performance levels that couldn't | be achieved with a little bit of focused optimization of | boring old CPU work (threading + SIMD). | saagarjha wrote: | Aren't they already widely used as NICs? And many places are | beginning to offer them for ML workloads and such. | detaro wrote: | I don't think "widely", maybe in a few niches? "SmartNICs" | are becoming a bit of a thing again, but those are mostly | not FPGA-based as far as I know. | ATsch wrote: | There's been a recent trend to move more compute | capabilities into NICs. This has been going on for a while, | but has gained a new dimension with cloud providers. For | example, with their "Nitro" system, AWS can more or less run | their hypervisor entirely on the NIC and completely offload | the network and storage virtualization from their servers. | This development is likely to continue. FPGAs are going to | play a significant part in that because they allow the | customers to reconfigure this hardware according to their | needs.
| QuixoticQuibit wrote: | Can you expand on why Intel's move was smart (what did the | Altera acquisition do for them) and why FPGAs have a bright | future in the datacenter? | | From what little I've seen in this space, FPGAs have not made | large inroads in the ML space or datacenters in general. This | seems partly due to their inefficient nature compared to ASICs, | and moreover to their software. | | Unless AMD is planning something really ambitious (e.g., true | software-based hardware reconfiguration that doesn't require | HDL knowledge) and is confident they've figured it out, I'm | not sure what they hope to achieve here. | ansible wrote: | > _From what little I've seen in this space, FPGAs have not | made large inroads in the ML space or datacenters in | general._ | | I don't know that they have actually made large inroads into | those spaces, but Xilinx is indeed pushing hard for that. For | years now. | brandmeyer wrote: | > true software-based hardware reconfiguration that doesn't | require HDL knowledge | | This has been a holy grail for at least two decades. It's very | much like asking for a programming language that can be used | by non-programmers. | znpy wrote: | Hopefully we'll get better open source tools, a better Vivado | maybe? | amelius wrote: | They could have better open source tools simply by opening up | all their specs. | | The same holds for any other vendor. | galangalalgol wrote: | That is my hope as well. Also, the new Versal line compares | itself to a GPU, maybe we'll see a GPU/FPGA combo? | NotCamelCase wrote: | Open source or not, I think this'll (hope so!) lead to a much | better FPGA development flow within the Xilinx ecosystem in the | not-so-distant future. | thecureforzits wrote: | Why? What market is there in open source any more? | rcxdude wrote: | Just better tools would be nice (and open-sourcing would | bring some hope for that). FPGA tooling is atrocious, | especially if you're used to software tooling.
And the | difference in tooling can sell chips all on its own. | mhh__ wrote: | FPGA development tools are generally dated, very very | expensive, and one-way streets for customisation. | | From what I understand, open-sourcing the bitstream format in | its entirety will only do so much, but it would certainly | help. It's not just a matter of building GCC for FPGAs. | detaro wrote: | One that has tools that don't make your users hate you. | Seriously, the open-source FPGA toolchains are a breath of | fresh air to use, despite being small projects with few | contributors (although due to that and no vendor support they | are severely limited in supported targets and special | features). | duskwuff wrote: | Yep. The IceStorm toolchain for Lattice FPGAs is a real | breath of fresh air -- fast compile times, multiple sets of | interoperable tools, open file formats, development in the | open... it's great. I just wish something like this was | available for more than just Lattice parts. ___________________________________________________________________ (page generated 2020-10-27 23:00 UTC)