[HN Gopher] PCI-Sig Releases 256GBps PCIe 6.0 X16 Spec
___________________________________________________________________

PCI-Sig Releases 256GBps PCIe 6.0 X16 Spec

Author : ksec
Score  : 88 points
Date   : 2022-01-11 20:22 UTC (2 hours ago)

(HTM) web link (www.servethehome.com)
(TXT) w3m dump (www.servethehome.com)

| ChuckMcM wrote:
| Okay, that is just stupidly wide bandwidth :-). As a systems guy I really don't see terabytes per second of main-memory bandwidth coming in the next 5 years. GDDR6, the new "leading edge", is nominally 768GB/s, which sounds great, but how many transactions per second can you push through a CPU/GPU memory controller? Are we going to see 1152-bit-wide memory buses? (1024 bits of data plus ECC.) That is over 1/3 of all the pins on a high-end Xeon, and you're going to need probably another 512 pins' worth of ground return. HBM soldered to the top of the die? Perhaps, that is where GPUs seem to be headed, but my oh my.
|
| I'm sure on the plus side there will be a huge resale market for "combination computer and room heater" :-)

| synergy20 wrote:
| What a coincidence, my boss just asked me to buy the membership.
|
| Keywords: PAM4 and 128GBps per x16 link (full width, per my understanding, not 256GBps as the title says), which means it can be used to make a 1Tbps NIC, or to carry any traffic at that scale.
|
| From my reading, we will see PCIe 6 products in mid-2023.

| NKosmatos wrote:
| Good news for everyone, and hopefully this will not interfere with the PCIe 5.0 rollout, even though 4.0 is not fully adopted by the market yet.
|
| Not that I'd be able to use it, but it's a pity they make the spec available to members only ($4000/year membership) or sell it at ridiculous prices: https://pcisig.com/specifications/order-form
|
| I know that other specifications and even ISO standards are provided for a fee (https://www.iso.org/store.html), and perhaps something similar should be applied to open source software to avoid issues like faker.js and colors.js.

| wmf wrote:
| You might be better off with a book than the spec anyway. Unfortunately I only see books covering PCIe 3.0.

| AdrianB1 wrote:
| The problem with PCIe is not bandwidth, it is the limit on lanes in consumer PCs: 20 lanes from the CPU and a few more from the southbridge is not enough when the GPU usually takes a 16-lane connection. The easy way out is to reduce the GPU to 8 lanes; that leaves plenty of bandwidth for NVMe SSDs and maybe for 10 or 25 Gbps NICs (it's about time).
|
| For servers it is a different story, but the recent fast move from PCIe ver 3 to ver 5 improved the situation 4x; doubling again is nice, but it does not seem that big a deal. Maybe moving NICs from the usual 8 lanes to a lot fewer (8 lanes of ver 3 means 2 lanes of ver 5 or a single lane of ver 6) will also make some difference.

| addaon wrote:
| But doubling the bandwidth per lane allows one to use half as many lanes to a GPU and maintain bandwidth. As you mention, it allows an eight-lane GPU to be a viable option. And better yet, due to how PCIe handles a variable number of lanes between host and device, different users with the same CPU, GPU, and even motherboard can choose to run the GPU at eight lanes with a couple of four-lane SSDs, or at sixteen lanes for even more bandwidth if they don't need that bandwidth elsewhere.
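The 128GBps-versus-256GBps confusion above is just per-direction versus combined bandwidth. A rough sketch of the arithmetic (an illustration, not from the article; link-layer and FLIT/FEC overhead are ignored, so treat the figures as approximate):

    # Per-direction PCIe bandwidth: raw transfer rate x encoding efficiency x
    # lane count. Protocol overhead (TLP/DLLP framing, and FLIT/FEC on 6.0) is
    # ignored here, so real-world throughput lands somewhat lower.
    GENERATIONS = {
        # generation: (GT/s per lane, payload bits per transferred bit)
        "3.0": (8.0, 128 / 130),    # 128b/130b encoding
        "4.0": (16.0, 128 / 130),
        "5.0": (32.0, 128 / 130),
        "6.0": (64.0, 1.0),         # PAM4 + FLIT mode; FEC/CRC overhead ignored
    }

    def bandwidth_gbytes_per_dir(gen: str, lanes: int = 16) -> float:
        gt_per_s, efficiency = GENERATIONS[gen]
        return gt_per_s * efficiency * lanes / 8   # bits -> bytes

    for gen in GENERATIONS:
        per_dir = bandwidth_gbytes_per_dir(gen)
        print(f"PCIe {gen} x16: ~{per_dir:3.0f} GB/s per direction, "
              f"~{2 * per_dir:3.0f} GB/s both directions")

The same table shows why a 6.0 x4 link carries roughly what a 4.0 x16 did, which is what makes the narrower-GPU-link argument above work.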
| AdrianB1 wrote:
| An 8-lane GPU has been viable for a long time (benchmarks of PCIe x8 versus x16 show about a 5% perf difference), but that did not change the physical layout motherboard manufacturers use; you cannot use the existing lanes any way you want, on some motherboards you cannot even split them the way you want between physical connectors, and video card manufacturers continue to push x16 everywhere.

| johncolanduoni wrote:
| PCIe bandwidth increases have outstripped increases in GPU bandwidth use by games for a while now. Anything more than x8 is overkill unless you're doing GPGPU work: https://www.gamersnexus.net/guides/2488-pci-e-3-x8-vs-x16-pe...

| tjoff wrote:
| With the bandwidth you do get more flexibility though, and with PCI Express bifurcation you can get the bandwidth equivalent of four PCIe 4.0 x16 slots from a single PCIe 6.0 x16.
|
| And that's great, since the vast majority don't need that much bandwidth anyway.
|
| Today you typically have a whole slew of devices sharing an x4 link to the CPU. More bandwidth would open up room for more USB and perhaps cheaper onboard 10-gig Ethernet, etc.

| csdvrx wrote:
| Totally, and sTRX4 has a limited set of boards available.
|
| I was hoping AM4 would provide that many lanes on easy-to-buy motherboards, but it's a meager 28, so not even enough for 2x x16.

| yxhuvud wrote:
| At least AM5, which is coming soon, seems to improve the situation.

| csdvrx wrote:
| With a meagre extra 4 lanes.

| alberth wrote:
| Isn't this only an artificial problem Intel created to segment the market, a problem that AMD doesn't have?

| jeffbee wrote:
| It's not just segmentation. Laptop buyers are not going to pay for 64 lanes. A regular Intel SKU of the 12th generation has 28 PCIe 4.0/5.0 lanes. A Xeon has 64, does _not_ have 5.0, and costs way more, partly because it has 4189 pins on the bottom, which is insane.

| csdvrx wrote:
| Yes, they do have that problem; even the upcoming AM5 will only have 28 lanes, going by the announcements: https://www.hwcooling.net/en/more-on-amd-am5-tdp-to-reach-12...

| wmf wrote:
| AMD has the same or slightly more lanes than Intel.

| the8472 wrote:
| EPYCs have 128 PCIe Gen4 lanes, recent Xeons have 64 PCIe Gen4 lanes. And Intel introduced Gen4 later than AMD.

| tjoff wrote:
| It's an AMD problem as well. It's an absolute nightmare trying to research a computer today: what ports can you use in what circumstances, which slots go to the CPU directly and which go to a chipset.
|
| Which lanes are disabled if you use NVMe slot 2, which slot runs at which generation, etc. A proper nightmare.
|
| And while we are at it, dedicating PCIe lanes to NVMe slots must be one of the most boneheaded decisions in modern computers. Just use a PCIe card with up to four NVMe slots on it instead.

| FridgeSeal wrote:
| Maybe it's because I bought a "gaming" motherboard, but the manual was pretty clear (to my understanding at least) as to what configuration of M.2 drives and PCIe lanes would run at what version, what went to the CPU and what went to the chipset.

| alberth wrote:
| Netflix, I imagine, would love to have this kind of I/O bandwidth.

| mikepurvis wrote:
| I'd be surprised if any of this mattered for them, since their workload (at least the "copy movie files from disk to network" part of it) is embarrassingly parallel.
|
| Unless they're really squeezed on power or rack space budget, I would imagine they'd do just fine being a generation back from the bleeding edge.
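On the "which slot actually runs at which generation" problem tjoff describes above: on Linux the supported and negotiated link state of every PCIe device is visible through lspci. A minimal sketch that shells out to `lspci -vv` and prints the two side by side (it assumes the pciutils package is installed; run as root to see all capability blocks):

    import re
    import subprocess

    # LnkCap is what the device/slot supports, LnkSta is what was actually
    # negotiated; a mismatch usually means the slot is wired narrower, or to an
    # older-generation root port, than the connector suggests.
    out = subprocess.run(["lspci", "-vv"], capture_output=True, text=True).stdout

    device = "?"
    for line in out.splitlines():
        if line and not line[0].isspace():
            device = line.strip()   # e.g. "01:00.0 VGA compatible controller: ..."
        m = re.search(r"(LnkCap|LnkSta):.*Speed ([^,]+), Width (x\d+)", line)
        if m:
            kind, speed, width = m.groups()
            print(f"{device[:60]:60}  {kind}  Speed {speed}  Width {width}")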
| loeg wrote:
| They are very squeezed on rack space budget in at least some locations.

| extropy wrote:
| Yeah, using significantly cheaper but somewhat slower hardware works great if you can parallelize.
|
| Also, the cutting edge is usually very power hungry, and power/cooling costs are the majority of your expenses at data center scale.

| nijave wrote:
| > Unless they're really squeezed on power or rack space budget
|
| I think this is the case for their Open Connect appliances (or whatever they call them). They want to maximize throughput on a single device so they don't have to colocate so much equipment.

| SahAssar wrote:
| Isn't Netflix pretty much capped on network I/O, not disk I/O? All the posts I've read about them have been focused on network.

| willcipriano wrote:
| Isn't this just pure I/O? You could have a PCIe 6.0 RAID controller or network card.

| vmception wrote:
| What use case does this level of bandwidth open up, and why does the article think SSDs are one of them? PCIe already provides more than enough bandwidth for the fastest SSDs; am I missing some forward-looking advancement?

| jjoonathan wrote:
| CXL puts PCIe in competition with the DDR bus. The bandwidth was already there (now doubly so), but CXL brings the latency. That's exciting because the DDR bus is tightly linked to a particular memory technology and its assumptions -- assumptions which have been showing a lot of stress for a long time. The latency profile of DRAM is really quite egregious, it drives a lot of CPU architecture decisions, and the DDR bus all but ensures this tight coupling. CXL opens it up for attack.
|
| Expect a wave of wacky contenders: SRAM memory banks with ultra-low worst-case latency compared to DRAM, low-reliability DRAM (not a good marketing name, I know) where you live with 10 nines of reliability instead of 20 or 30 and in exchange can run it a lot faster or cooler, instant-persistent memory that blurs the line between memory and storage, and so on.

| user_7832 wrote:
| > CXL
|
| Thanks, that is quite an interesting technology I wasn't aware of. Apparently Samsung already made a CXL RAM module for servers in 2021 (1). I wonder how Intel Optane would have fared if it had used CXL (assuming it didn't).
|
| Side note, but AMD devices' (laptops/NUCs) lack of Thunderbolt or PCIe access is why I'm quite hesitant to buy a portable AMD device, which is quite unfortunate. I really hope AMD/their partners can offer a solution soon now that Thunderbolt is an open standard.
|
| 1. https://hothardware.com/news/samsung-cxl-module-dram-memory-...

| zionic wrote:
| 120hz Star Citizen

| [deleted]

| dragontamer wrote:
| A PCIe 4.0 x4 link provides enough bandwidth. Ish... we've already capped out with 8GBps SSDs, actually; PCIe 4.0 x4 is now the limiting factor.
|
| A PCIe 6.0 x1 link would provide the same bandwidth, meaning you run 1/4 as many wires and still get the same speed.
|
| Alternatively, a PCIe 6.0 x4 link will be 4x faster than 4.0, meaning our SSDs can speed up once more.

| AnotherGoodName wrote:
| Internally passing around multiple 4K video outputs would use this. Maybe you don't want the output via the port on the back of the card but want to pass it through internally to some other peripheral? I think this is how Thunderbolt ports work (happy to be corrected).

| adgjlsfhk1 wrote:
| Current top-of-the-line SSDs are close to maxing out 4 lanes of PCIe Gen 4.
| Gen 6 will make it a lot easier to find room for a few 100Gb/s Ethernet connections, which are always nice for faster server-to-server networking, as well as making it easier to use PCIe-only storage.

| jeffbee wrote:
| You can make SSDs arbitrarily fast just by making them wider/more parallel. The reason it seems like PCIe 4 or 5 is "fast enough" is because the SSDs are co-designed to suit the host bus. If you have a faster host bus, someone will market a faster SSD.

| Zenst wrote:
| > PCIe already provides more than enough bandwidth for the fastest SSDs
|
| Today, yes, and for a few tomorrows as well. But even when a standard is announced as finalized, it can be a long time (years, even) until it makes its way onto motherboards in the consumer space. By which time, the current goalposts may start looking closer than expected.
|
| I'm just glad they have one number, no endless revisions and renaming of past releases, and with that - thank you, PCI-Sig.

| smiley1437 wrote:
| From what I understand, internally an SSD's bandwidth can be easily scaled by spreading reads and writes across arbitrarily large numbers of NVRAM chips within the SSD.
|
| So, you can just create SSDs that saturate whatever bus you connect them to.
|
| In a sense, then, it is the bus specification itself that limits SSD throughput.

| StillBored wrote:
| Current SSDs, but there is literally nothing stopping people from putting PCIe switches in place and maxing out literally any PCIe link you can create.
|
| The limit then becomes the amount of RAM (or LLC cache, if you can keep it there) bandwidth in the machine, unless one is doing PCIe PtP. There are plenty of applications where a large part of the work is simply moving data between a storage device and a network card.
|
| But, returning to PtP, PCIe has been used as an accelerator fabric for a few years now, so a pile of GPGPUs all talking to each other can also swamp any bandwidth limits put in place between them for certain applications.
|
| Put the three together and you can see what is driving ever higher PCIe bandwidth requirements after PCIe was stuck at 3.0 for ~10 years.

| nwmcsween wrote:
| So my understanding is that DDR5 has on-chip ECC, needed because of the ever-increasing need for -Ofast; will/does PCIe have the same requirements?

| loeg wrote:
| PCIe packets have always had error detection at the DLLP layer.

| ksec wrote:
| Yes. Forward Error Correction (FEC) [1]. The AnandTech article wasn't in my feed when I submitted this; it offers much more technical detail.
|
| [1] https://www.anandtech.com/show/17203/pcie-60-specification-f...

| Taniwha wrote:
| Reading the articles, I think that FEC is being used to protect link integrity - it's different from ECC on DRAM, which is also protecting the contents (against things like rowhammer and cosmic rays).

| monocasa wrote:
| The on-chip ECC for DDR5 isn't because of faster memory; it's because of denser memory. It can rowhammer itself and foul things up silently. And the cheaper brands can start shipping chips with defects, like they do with flash, relying on the ECC to paper over them.

| rjzzleep wrote:
| How long does it usually take to get consumer products for new PCIe specs? Fast PCIe Gen 4 is only just getting affordable - like $350 for 2 TB NVMe SSDs.
|
| Also, I remember playing around with PCI implementation on FPGAs over a decade ago, and timing was already not easy. What goes into creating a PCIe Gen 4/5 device these days?
| How can you actually achieve that when you're designing it? Are people just buying the chipsets from a handful of producers because it's unachievable for normal humans?
|
| EDIT: What's in the spec differences between, say, Gen 3 and 6 that allows for so many more lanes to be available?

| willis936 wrote:
| I've not done PHY development personally, but these interfaces are called SerDes. SerDes is short for serializer-deserializer. Outside of dedicated EQ hardware, everything on the chips is done in parallel, so nothing needs to run at a multi-GHz clock.

| [deleted]

| Taniwha wrote:
| I think that these days there's a lot of convergence going on - everything is essentially SerDes in some form. Some chips just have N SerDes lanes and let you configure them for PCIe/ether/data/USB/etc. as you need them, much as more traditional SoCs configure GPIOs between a bunch of other functions like UARTs/SPI/I2C/I2S/PCM/...

| ksec wrote:
| > How long does it .......
|
| It is not just about getting a product out (i.e. a PCIe 6.0 SSD), but also the platform support (i.e. Intel / AMD motherboard support for PCIe 6.0).
|
| Product launches are highly dependent on platform support. So far Intel and AMD don't have any concrete plans for PCIe 6.0, but I believe Amazon could be ahead of the pack with their Graviton platform. Although I am eager to see Netflix's edge appliance serving up to 800Gbps, if not 1.6Tbps, per box.

| iancarroll wrote:
| I recently bought a few Zen 2 and Zen 3 HPE servers, and found out only via trial and error that HPE sells Zen 2 servers without Gen4 motherboard support!
|
| It seems they took the original Zen motherboards with Gen3 and just swapped out the CPU. Only the Zen 3 has a refreshed motherboard. Makes me now check things more carefully to be sure.

| dragontamer wrote:
| > How long does it usually take to get consumer products of new PCIe specs?
|
| Like 2 years.
|
| When PCIe 3.0 was getting popular, 4.0 was finalized. When 4.0 was getting popular, 5.0 was finalized. Now that PCIe 5.0 is coming out (2022, this year), PCIe 6.0 is finalized.

| formerly_proven wrote:
| There was a much bigger gap between 3.0 and 4.0. PCIe 3.0 was available with Sandy or Ivy Bridge, so 2011/2012. PCIe 4.0 was introduced with Zen 2 in 2019.
|
| We seem to be back to a faster cadence now, however.

| zamadatix wrote:
| The large delay between 3.0 and 4.0 was a gap between specifications (2010 to 2017), not a gap between specification and implementations (2017 to 2019).

| dragontamer wrote:
| With the rise of GPU compute, a lot of the supercomputers are playing around with faster I/O systems. IBM pushed OpenCAPI / NVLink with Nvidia, and I think that inspired the PCIe ecosystem to innovate.
|
| PCIe standards are including more and more coherent-memory options. It seems like PCIe is trying to become more like Infinity Fabric (AMD) / UltraPath Interconnect (Intel).

| jeffbee wrote:
| PCIe 5 was standardized in May 2019 and you could buy it at retail in late 2021. 2 years is a good rule of thumb.

| jiggawatts wrote:
| I love how exponential growth can be utterly terrifying or unfathomably amazing.
|
| Just a few years ago I was trying to explain to an IT manager that 200 IOPS just doesn't cut it for their biggest, most important OLAP database.
|
| He asked me what would be a more realistic number.
|
| "20,000 IOPS is a good start"
|
| "You can't be serious!"
|
| "My laptop can do 200,000."
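The jump from 200 to 200,000 IOPS in the exchange above falls out of per-operation latency and queue depth. A back-of-the-envelope sketch (the latencies below are rough illustrative assumptions, not measurements):

    # IOPS ~= queue depth / per-operation latency, up to whatever internal
    # limit the device or its interface hits first.
    def iops(latency_s: float, queue_depth: int = 1) -> float:
        return queue_depth / latency_s

    examples = [
        ("7200rpm HDD, 4K random, QD1 ", 5e-3, 1),    # ~5 ms seek + rotation -> ~200 IOPS
        ("SATA SSD,   4K random, QD32", 100e-6, 32),  # ~100 us flash read, overlapped
        ("NVMe SSD,   4K random, QD32", 80e-6, 32),
    ]

    for name, latency, qd in examples:
        rate = iops(latency, qd)
        print(f"{name}: ~{rate:>9,.0f} IOPS (~{rate * 4096 / 1e6:6.0f} MB/s at 4K)")

A spinning disk is stuck at a few hundred IOPS no matter the bus; flash with deep queues is what pushes the numbers into the range where the link itself starts to matter.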
| KennyBlanken wrote:
| > Just a few years ago
|
| > "My laptop can do 200,000."
|
| Only now (PCIe 4 and very recent controllers, etc.) are the very latest top-end NVMe drives hitting around 150k IOPS (which isn't stopping manufacturers from claiming ten times that; WD's NVMe drive tests at around 150-200k IOPS and yet they claim 1M), and only in ideal circumstances: reads and writes coming out of the SLC cache, which is typically under 30GB, often a lot smaller except in the highest-end drives.
|
| Many drives that claim to reach that sort of performance are actually using a host-backed cache, i.e. stealing RAM.
|
| IOPS on SSDs drops precipitously once you exhaust any host-backed cache, controller RAM, SLC cache, mid-level MLC cache... and start having to hit the actual QLC/TLC. In the case of a very large database, a lot of I/O would fall outside the cache (though certainly any index, transaction, logging, etc. I/O would likely be in cache).

| Cullinet wrote:
| I would love to pick up from the 200k IOPS laptop quote and demo a RAM drive, then saturate the RAM drive into swapping - I don't know how you could do this on stock distros or Windows, but it would make a great executive suite demo of the issues.

| jeffbee wrote:
| There are not more lanes available. The generations are getting faster just by increasing the transfer clock rate, up to PCIe 5, and in PCIe 6 by increasing the number of bits per transfer. The way they doubled the speed every generation was pretty basic: the timing tolerances were chopped in half every time. The allowable clock phase noise in PCIe 4 is 200x less than in PCIe 1. The miracle of progress, etc.
|
| That miracle is somewhat over. They're not going to be able to drive phase noise down below 1 femtosecond, so 6.0 changes tactics. They are now using a fancier encoding on the wire to double the number of bits per symbol. Eventually, it will look more like wifi-over-copper than like PCI. Ethernet faster than 1Gbps has the same trend, for whatever it's worth.

| bserge wrote:
| Speaking of which, when is 10Gbit Ethernet coming to laptops? Most have even lost the port, ffs.

| jeffbee wrote:
| Many laptops have a Thunderbolt port which serves a similar purpose. On TB4 I get 15Gbps in practice, and I can bridge it to Ethernet using either a dock or a PC (I use a Mac mini with a 10G port to bridge TB to 10GbE).

| rektide wrote:
| > _How long does it usually take to get consumer products of new PCIe specs?_
|
| Personally I'm expecting this spec to drive PCIe 5.0 adoption in the consumer space.
|
| TBH, consumers don't need this throughput. But given that the consumer space has remained stuck around 20 lanes off the CPU (plus some for the chipset), the 5.0 and 6.0 specs will be great for those wanting to build systems with more peripherals. An x1 16GBps link is useful for a lot.

| robbedpeter wrote:
| I'd be leery of dismissing the potential consumer demand. That much throughput could be put to good use for a myriad of personal and business functions, and software tends to fill whatever hardware can provide. It's like every prediction about users not needing X amount of RAM or CPU or DPI or network or storage space.
|
| Having that much throughput suggests paging and caching across multiple disks, or using giant models (ML or others) with precomputed lookups in lieu of real-time generation. At any rate, all it takes is a minor inconvenience to overcome and the niche will be exploited to capacity.
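To put a number on jeffbee's point above about bits per symbol: PCIe 6.0 keeps roughly the 5.0 symbol rate but signals with four voltage levels (PAM4) instead of two (NRZ), so each symbol carries two bits. A tiny sketch of that relationship (an illustration, not from the spec text):

    import math

    # Line rate = symbol rate x bits per symbol, and bits per symbol is
    # log2(number of signal levels). 5.0 and 6.0 both run at roughly 32 GBaud;
    # 6.0 gets its doubling from 4-level (PAM4) signalling instead of 2-level
    # (NRZ), at the cost of a smaller eye between levels - hence the added FEC.
    def line_rate_gt(symbol_rate_gbaud: float, levels: int) -> float:
        return symbol_rate_gbaud * math.log2(levels)

    print("PCIe 5.0 (NRZ): ", line_rate_gt(32, 2), "GT/s per lane")   # 32.0
    print("PCIe 6.0 (PAM4):", line_rate_gt(32, 4), "GT/s per lane")   # 64.0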
| cjensen wrote:
| PCIe Gen5 is now available in the latest Intel desktop processors. There are very few lanes, so it can really only run a single GPU, but that covers a lot of the potential market.

| eigen wrote:
| Looks like the 600-series desktop chipset just supports Gen 3 & 4 [1], and PCIe Gen5 ports are only available on Alder Lake desktop [2], not mobile [3], processors.
|
| [1] https://ark.intel.com/content/www/us/en/ark/products/series/...
|
| [2] https://ark.intel.com/content/www/us/en/ark/products/134598/...
|
| [3] https://ark.intel.com/content/www/us/en/ark/products/132214/...
___________________________________________________________________
(page generated 2022-01-11 23:00 UTC)