[HN Gopher] Marvell Announces First PCIe 5.0 NVMe SSD Controller...
       ___________________________________________________________________
        
       Marvell Announces First PCIe 5.0 NVMe SSD Controllers: Up to 14
       GB/S
        
       Author : ksec
       Score  : 198 points
       Date   : 2021-05-27 14:01 UTC (8 hours ago)
        
 (HTM) web link (www.anandtech.com)
 (TXT) w3m dump (www.anandtech.com)
        
       | harveywi wrote:
       | Marvell? I thought that DC had all the Flash controller rights.
        
         | FBISurveillance wrote:
         | I see what you did there.
        
         | deadcore wrote:
         | lol that made me chuckle
        
       | louwrentius wrote:
       | Try to imagine with all these developments how much performance
       | you can get from vertical scaling.
       | 
        | I bet that many applications don't need to care about
        | horizontal scaling, with all the burdens involved, before they
        | outgrow the performance of a 'single' box (aka Stack
        | Overflow).
        
       | PedroBatista wrote:
       | That ~10W tho ...
       | 
       | I think right now the real money is in somebody who can implement
       | PCI-E 5 efficiently, or soon we'll see every SSD with a mini fan
       | on it.
       | 
        | (It's not all PCI-E 5's fault; these controllers have been
        | getting more and more power hungry.)
        
         | wtallis wrote:
         | You can always wish for more efficiency, but it's important to
         | understand that SSDs _have_ actually been getting steadily more
          | efficient in the Joules per Byte transferred sense. We're just
         | seeing a simultaneous concentration and consolidation of
         | performance that is moving a bit faster, hence the need for
         | enterprise SSDs to abandon the M.2 form factor in favor of the
         | somewhat larger and easier to cool EDSFF family.
         | 
         | One way of looking at things is to realize that until now, a
         | storage controller handling 14 GB/s of traffic would have been
         | a full-height RAID card demanding 300 LFM of airflow, and now
          | we're putting that much throughput into an SSD with a heatsink
         | that's roughly 1U by 1in in cross section.
        
       | bmcahren wrote:
        | With the location of modern NVMe slots, you can't even exceed
        | 500 MB/s without the controller rate-limiting due to heat.
        
       | PinkPigeon wrote:
       | I mean, I love the insane GB/sec figures, but does anyone else
       | mostly care about IOPS? These state 1.8M read and 1M write, which
       | sounds quite impressive.
        
         | CoastalCoder wrote:
         | Related: anyone know of a good video or diagram for helping CS
         | students get an intuition regarding the interplay of bandwidth
         | and latency? Including how saturating bandwidth increases
         | latency by causing queueing bottlenecks?
         | 
         | I'm looking for something a bit more visual and dynamic than
         | the old "station wagon full of tapes going down the highway"
         | imagery.
         | 
         | [EDIT: Just for clarification, I feel like I already have a
         | pretty good grasp on these concepts. I'm looking for good ways
         | to help others at the ~ undergrad level.]
        
           | bombcar wrote:
           | Here's a post on relative latencies that may be useful:
           | https://danluu.com/infinite-disk/
           | 
           | There was another post I saw recently comparing the increase
           | in disk SIZE over the last 30 years vs the increase in disk
           | SPEED vs LATENCY (so size in GB, speed in GB/s, latency in
           | IOPS) - and how size increases far outstripped speed which
           | outstripped latency, though all had improved.
           | 
           | Found it! The key is IOPS/GB as a metric.
           | 
           | https://brooker.co.za/blog/2021/03/25/latency-bandwidth.html
        
           | louwrentius wrote:
            | Maybe this doesn't answer your question exactly, but I
            | addressed this topic in two blog posts; maybe they help.
           | 
           | https://louwrentius.com/understanding-storage-performance-
           | io...
           | 
           | https://louwrentius.com/understanding-iops-latency-and-
           | stora...
        
           | wtallis wrote:
           | I do some rather coarse measurements of random read
           | throughput vs latency as part of my SSD reviews. See eg. the
           | bottom of https://www.anandtech.com/show/16636/the-inland-
           | performance-...
           | 
           | Those graphs cut off the essentially vertical latency spike
           | that results from enqueuing requests faster than the
           | sustained rate at which the drive can serve them. For a
           | different view in terms of queue depth rather than
           | throughput, there are some relevant graphs from an older
           | review that predates io_uring:
           | https://www.anandtech.com/show/11930/intel-optane-ssd-
           | dc-p48...
           | 
           | Generally speaking, latency starts increasing long before
           | you've reached a drive's throughput limit. Some of this is
           | inevitable, because you have a relatively small number of
           | channels (eg. 8) and dies to access in parallel. Once you're
           | up to the throughput range where you have dozens of requests
           | in flight at a time, you'll have constant collisions where
           | multiple requests want to read from the same
           | plane/die/channel at once, and some of those requests have to
           | be delayed. But that's mostly about contention and link
           | utilization between the SSD controller and the NAND flash
           | itself. The PCIe link is pretty good about handling
           | transactions with consistently low latency even when on
           | average it's mostly busy.
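            | 
            | As a rough illustration of that near-vertical spike, a toy
            | M/M/1-style model (with a made-up 100 us service time, not a
            | number from any real drive) already gives the right shape:
            | 
            |     # toy M/M/1 sketch: mean latency vs. offered load
            |     s = 100.0                  # assumed service time, us
            |     for rho in (0.1, 0.5, 0.9, 0.99):
            |         # mean time in system: W = s / (1 - rho)
            |         w = s / (1.0 - rho)
            |         print(f"{rho:.0%} load -> ~{w:,.0f} us mean latency")
            | 
            | At 50% load the mean latency has only doubled, but at 99%
            | load it's two orders of magnitude worse, which is the
            | near-vertical part of the curve.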
        
           | uyt wrote:
           | Are you referring to Little's Law?
        
         | MrFoof wrote:
         | >... _but does anyone else mostly care about IOPS_
         | 
         | IOPS helps, but for the average user, hundreds of thousands is
         | already functionally infinite. What matters at this point is
         | latency. Where you really feel that is Queue Depth 1. Where you
         | read a file that points you to other files, that point you to
         | other files, etc. That is the exact case where the computer is
         | still making you wait.
         | 
         | This happens when you start your operating system, it starts
         | services, you launch apps, etc. Driving that latency down is
         | the biggest improvement you'll ever see past where we are today
         | in terms of IOPS and throughput.
         | 
          | This is where the latest Optane actually shines. Optane doesn't
          | win on IOPS or throughput, but where it shines is its crazy low
          | latency _(delivered at relatively low power levels)_, where
          | latencies are 10% of that of even the highest end PCIe 4.0 NVMe
          | SSDs. Do something like launch 20 applications at once, and
          | it'll be done in a fraction of the time compared to even
          | something like a Samsung 980 Pro, because of latency being more
          | around 10 microseconds instead of 100 microseconds.
          | 
          | PCIe 5.0 SSDs will cut latencies down to where Optane is today,
          | but driving latency under 1 microsecond is where we'll get into
          | a new level of crazy.
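          | 
          | A back-of-the-envelope sketch of why QD1 latency dominates a
          | dependent-read chain (the latencies here are assumed round
          | numbers, not benchmark results):
          | 
          |     # toy model: each read must finish before the next one
          |     # can even be issued, so total time = reads * latency
          |     def chain_ms(reads, lat_us):
          |         return reads * lat_us / 1000.0
          | 
          |     print(chain_ms(5000, 100.0))  # flash-ish: ~500 ms
          |     print(chain_ms(5000, 10.0))   # Optane-ish: ~50 ms
          | 
          | More throughput doesn't help here at all; only lower per-read
          | latency does.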
        
           | jiggawatts wrote:
           | I can't upvote this enough.
           | 
           | Related: Notice how the public cloud marketing material tends
           | to focus on scalability over other metrics? That's because
            | scaling horizontally for them is _easy_: They just plop down
           | more "stamps" -- a set of clusters and controllers that is
           | their unit of scale. Need 1,000 more servers in US East? Plop
           | down 10 more stamps of 100 servers. Easy!
           | 
           | Except of course this involves an awful lot of networking,
           | with long cable runs and many hops and virtualisation layers.
           | The end result is that you can't get _anywhere_ near the
           | underlying storage latency.
           | 
           | Azure's Premium SSD has "write flush" latencies of about 4
           | milliseconds according to my measurements, which is easily
           | 100x slower than what my laptop can do with a now very
           | outdated Samsung NVMe SSD.
           | 
           | Notice that if you go to their marketing page, they talk
           | about "low latency" and "lowest latency", but they have _no
            | numbers?_ Meanwhile the MB/s and IOPS are stated with
            | numbers: https://azure.microsoft.com/en-
           | us/pricing/details/managed-di...
        
           | Dylan16807 wrote:
           | Isn't launching 20 applications _at once_ the realm where
           | flash competes the best?
        
       | WesolyKubeczek wrote:
       | 1) is the actual NAND flash faster, or are we talking about going
       | from "awesome" to "abysmal" as soon as you run out of DRAM/SLC
       | caches or what have you? Which, given this kind of bandwidth, is
       | going to be sooner rather than later.
       | 
       | 2) this plus QLC cells, which, durability-wise, make TLC look
       | good, makes me anticipate headlines like "This new PCIe 5.0 SSD
       | ran out of its rated TBW in a week!"
        
         | NullPrefix wrote:
         | Chia is a godsend for storage prosumers. All you have to do is
         | look at the speed and check if the warranty is voided by Chia.
        
         | wtallis wrote:
         | These controllers are for enterprise drives where SLC caching
         | is almost unheard of and all the performance specs are for
         | sustained performance on a full drive. But the best performance
         | may only be attainable on a drive with higher capacity than you
         | can afford.
        
         | derefr wrote:
         | > Which, given this kind of bandwidth, is going to be sooner
         | rather than later.
         | 
         | I don't see why -- like NAND up until now, it keeps up by
         | adding more separately-addressable chips to the board and
         | striping writes across them. A 512GB SSD with this chip would
         | hit the wall pretty soon, but they wouldn't waste this
         | controller on a 512GB SSD. They'd use it for 16TB+ SSDs.
        
         | vbezhenar wrote:
         | > "This new PCIe 5.0 SSD ran out of its rated TBW in a week!"
         | 
          | If this drive sustains the claimed 9 GB/s write speed, you can
          | write 324 TB in 10 hours. Samsung's QLC warranty for a 1 TB
          | drive is 360 TBW.
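          | 
          | Spelling out that arithmetic (and assuming, generously, that
          | the drive could actually sustain the rate):
          | 
          |     written_tb = 9 * 3600 * 10 / 1000      # -> 324.0 TB
          |     hours_to_rating = 360_000 / 9 / 3600   # -> ~11.1 h
          |     print(written_tb, hours_to_rating)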
        
           | wtallis wrote:
           | 1 TB of QLC can't get anywhere close to 9 GB/s write speed.
           | The best is currently about 40 MB/s for a 1 Tbit die, so 320
           | MB/s for 1 TByte. The slow write speed of QLC generally
           | prevents you from burning out a drive in less than a few
           | weeks.
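            | 
            | At that kind of sustained rate the same arithmetic looks very
            | different (using the rough figures above):
            | 
            |     qlc_mb_s = 320                    # ~40 MB/s per Tbit die x 8
            |     days = 360e6 / qlc_mb_s / 86400   # 360 TBW = 360e6 MB
            |     print(days)                       # ~13 days non-stop
            | 
            | i.e. on the order of a couple of weeks of continuous
            | maximum-speed writing, even before any throttling.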
        
       | NikolaeVarius wrote:
        | Is there a theoretical "practical" limit for how fast these
        | things can physically get (in this particular form factor)?
       | 
       | My understanding is that NVMe interface is pretty much as close
       | as you can get to the CPU without being integrated. Is there a
       | world where these things can operate as fast as RAM?
        
         | dragontamer wrote:
         | > My understanding is that NVMe interface is pretty much as
         | close as you can get to the CPU without being integrated.
         | 
          | The best for consumer technology, yes. But future I/O protocols
          | continue to improve.
         | 
         | NVidia / IBM's collaboration on OpenCAPI (which is deployed in
         | Summit as a CPU/GPU interface) has 300GBps I/O between the CPU
         | and GPU, far faster than NVMe speeds (and even DDR4 RAM
         | bandwidth).
         | 
         | And future chips may go even faster. I/O is probably one of the
         | fastest growing aspects of a modern computer. PCIe 5.0, CXL,
         | OpenCAPI, etc. etc. Lots and lots of new technology coming into
         | play here.
         | 
          | There are even some products making Flash-RAM work on the
          | DDR4 interface. Non-volatile memory is what that's called.
          | Intel's Optane works pretty well on that. It's not very
          | mainstream, but I hear reports that it's impressive (slower than
         | real RAM of course, but storage that has the bandwidth of RAM
         | is still cool).
         | 
         | > Is there a world where these things can operate as fast as
         | RAM?
         | 
         | Well... yesish. Flash is a kind of RAM (random access memory).
         | 
         | To answer your fundamental question though: No. Flash is
         | fundamentally higher latency than DRAM (aka: DDR4). But with
         | enough parallelism / big enough flash arrays (or DRAM arrays),
         | you can continue to get higher and higher bandwidth.
         | 
         | At the top-of-the-line is SRAM, the RAM used to make L1 cache
         | and registers inside of CPUs. This is very expensive and rarely
         | used.
         | 
         | --------
         | 
         | Then you've got various non-mainstream RAMs: FRAM, MRAM,
         | Optane, etc. etc.
        
         | baybal2 wrote:
         | Flash chips themselves can't operate as fast as RAM. You can
          | get as much bandwidth as RAM, but you cannot get it as
          | physically fast as RAM, i.e. faster than 200 ns.
        
           | anarazel wrote:
           | Why? DMA can do transfers to/from CPU caches.
        
             | baybal2 wrote:
              | NAND flash cells themselves can't charge/discharge faster.
        
           | m4rtink wrote:
           | What about those non volatile RAM technologies (IIRC called
           | 3D XPoint) Intel is using for their Optane stuff ?
           | 
           | It seems to be kinda in between RAM and flash spec wise.
        
             | Der_Einzige wrote:
             | I actually was one of the people who did performance
             | benchmarking of 3D XPoint before it came out. In app direct
             | mode, you can maybe eek out 70% of the throughout and about
             | 3x worse latency. Also, not all apps support app direct
             | mode.
             | 
             | Many customers try to use 3D XPoint as another part of the
             | cache hierarchy in-between regular ssd and ram. It's
             | actually pretty neat for faas workloads which want
             | containers to be "warm" rather than "hot" or "cold"...
        
               | wtallis wrote:
               | To clarify for readers who aren't current on all the
               | lingo: app direct mode refers to one of the modes for
               | using the Optane persistent memory modules that connect
               | to the CPU's DRAM controller. It doesn't apply to the
               | Optane NVMe SSDs that use PCI Express and require the
               | software overhead of a traditional IO stack. In a few
               | years, something like CXL may allow for something close
               | to the app direct mode to be usable on persistent memory
               | devices with an SSD-like form factor, but we're not there
               | yet.
        
             | the8472 wrote:
              | Optane comes in two flavors: as NVMe storage and as NVDIMM.
              | The former sits a bit below flash in terms of latency. The
              | latter sits a bit above DRAM in terms of latency and is
              | byte-addressable.
        
             | baybal2 wrote:
              | Intel-Micron tried; we've seen how it went.
              | 
              | It wasn't really fast or wear resistant enough to replace
              | flash.
        
               | LinAGKar wrote:
               | Are you sure? That's not what I've heard. It's just
               | really expensive, and not very dense.
        
               | baybal2 wrote:
                | Rephrasing: the speed and write endurance weren't
                | jaw-dropping enough to beat flash's density and cost.
        
         | gameswithgo wrote:
          | Just put a battery on RAM and you have an SSD as fast as RAM,
         | in principle.
         | 
         | The big thing about SSDs is that while sequential reads and
         | writes have steadily improved, many workloads have not. From a
         | cheap SSD to the best, there is very minimal difference in
         | things like "how fast does my computer boot", "how fast does my
         | game or game level load" and "how fast does visual studio
         | load", or "how fast does gcc compile"
        
           | programmer_dude wrote:
           | A battery is not a solid state device. It often has
           | liquids/gels (electrolyte) in it.
        
           | toast0 wrote:
           | > From a cheap SSD to the best, there is very minimal
           | difference.
           | 
           | I've got a cheap SSD that will change your mind. It has
            | amazingly poor performance, though. I agree with the general
           | concept that while quantitative differences can be measured,
           | there's not a qualitative difference between competent SSDs.
        
           | kmonsen wrote:
            | Is your second paragraph really true? These exact workloads
            | are the reasons many have upgraded their SSDs; the PS5 will
            | only let you start new games from the built-in fast drive,
            | for example.
        
             | topspin wrote:
             | "Is your second paragraph really true?"
             | 
             | Partially, on a PC. The difference, on a PC, between a high
             | end NVMe SSD and a SATA SSD is minimal for most use cases;
             | small enough that an average user won't perceive much
             | difference. The workloads in question (booting, loading a
             | program, compiling code, etc.) involve a lot of operations
             | that aren't bound by IO performance (network exchanges,
             | decompressing textures, etc.) and haven't been optimized to
             | maximize the benefit of high performance storage so the
              | throughput and IOPS differences of the storage device don't
              | entirely dominate.
        
             | wincy wrote:
              | The PS5's operating system and even how they package game
             | files had to be developed from the ground up to realize
             | these gains. There's a good technical talk about it from
             | one of Sony's engineers.
             | 
             | https://m.youtube.com/watch?v=ph8LyNIT9sg
        
               | emkoemko wrote:
                | So Sony made their own OS for the PS5? They are not using
                | FreeBSD anymore?
        
               | wtallis wrote:
               | They might still be using BSD stuff, but they developed
               | new IO and compression offload functionality that doesn't
               | match any off the shelf capabilities I'm aware of.
        
               | touisteur wrote:
               | Might be close to spdk + compress accelerators?
        
             | wtallis wrote:
             | The PS5 and Xbox Series X were designed to offer high-
             | performance SSDs as the least common denominator that game
             | developers could rely on, so that game devs could start
             | doing things that aren't possible if you still have to
             | allow for the possibility of running off a hard drive.
             | That's still largely forward-looking; most games currently
             | available still aren't designed to do that much IO--but the
             | consoles will treat all new games as if they really rely on
             | that expectation of high performance IO, and that means
             | running them only from the NVMe storage.
        
           | blackoil wrote:
           | Are these things blocked on disk i/o?
        
           | vbezhenar wrote:
            | I second that. I have an M.2 SSD in my laptop and a SATA SSD
            | in my desktop. There's no perceived difference in disk
            | operations for me outside of corner cases like copying huge
            | files. But there's a very noticeable difference in price.
        
             | reader_mode wrote:
              | Build times should be better, no? Unless you have enough
              | RAM to keep it all in cache?
        
             | staticassertion wrote:
             | That may be due to the fact that your software (like your
             | OS) was built for a world of slow spinning disks, or maybe
             | semi-slow SSDs at best. Not a lot of code is written with
             | the assumption that disks are actually fast, and it's not
             | too common to see people organizing sequential read/writes
             | in their programs (except for dbs).
        
             | programmer_dude wrote:
             | +1, I installed an M.2 SSD in my desktop looking at the
              | quoted 10x difference in speed but I was taken aback by the
             | lack of any perceivable improvement. Money down the drain I
             | guess.
        
             | prutschman wrote:
             | Do you have an NVMe M.2 device specifically? The M.2 form
             | factor supports both NVMe and SATA (though any given device
             | or slot might not support one or the other).
             | 
             | I've got a workstation that has both SATA 3 and M.2 NVMe
             | SSDs installed.
             | 
             | The SATA 3 device can do sustained reads of about 550
             | MB/sec, fairly close to the 600 MB/sec line rate of SATA 3.
             | 
             | The NVMe device can do about 1.3 GB/sec, faster than
             | physically possible for SATA 3.
        
               | vbezhenar wrote:
               | Yes, I have an NVMe M.2 device. Samsung 970 EVO Plus.
        
           | branko_d wrote:
           | Similar effect existed for the HDDs of old.
           | 
           | As the platter density increased, the head could glide over
           | more data at the same spindle speed and you would get
           | increasingly higher sequential speed. But you would _not_ get
           | much better latency - physically moving the head to a
           | different track could not be done significantly faster. With
           | each new generation, you would get only marginally better
           | speed for many workloads that users actually care about. The
            | latest HDDs could saturate the SATA bus with sequential
            | reads, but would still dip into KB/s territory (not MB/s, let
            | alone GB/s) for sufficiently random-access workloads.
           | 
           | SSDs are similar in a sense that they can be massively
           | parallelized for the increased throughput, but the latency of
           | an individual cell is much harder to improve. Benchmarks will
           | saturate the I/O queue and reach astronomical numbers by
           | reading many cells in parallel, but for most desktop users,
           | the queue depth of 1 (and the individual cell latency) is
           | probably more relevant. That's why a lowly SATA SSD will be
           | only marginally slower than the newest NVMe SSD for booting
           | an OS or loading a game.
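            | 
            | A rough illustration of the QD1 point (the latencies here are
            | assumed, ballpark figures, not measurements):
            | 
            |     # at QD1, throughput is just block size / latency,
            |     # no matter how fast the interface is
            |     drives = {"SATA SSD": 90.0, "NVMe SSD": 80.0}
            |     for name, lat_us in drives.items():
            |         mb_s = 4096 / lat_us   # bytes/us ~= MB/s, 4K reads
            |         print(name, round(mb_s), "MB/s at QD1")
            | 
            | Both land in the same few tens of MB/s, far below what either
            | interface can carry.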
        
             | TwoBit wrote:
             | Doesn't it depend a lot on the game? A well designed game
             | could get a lot more out of a fast SSD.
        
         | digikata wrote:
          | NVMe is built on PCIe, so latency-wise they will be limited by
          | the PCIe latency (roughly an order of magnitude slower than the
          | memory bus, though the gap may be smaller with PCIe 5) plus
          | media latency. Throughput-wise, they are limited by the number
          | of PCIe lanes that the controller supports and that the system
          | it plugs into has been sized to allocate.
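          | 
          | For a sense of the per-direction ceiling on a typical x4 SSD
          | link (raw transfer rate times 128b/130b line encoding, ignoring
          | packet overhead):
          | 
          |     enc = 128 / 130
          |     for name, gts in (("3.0", 8), ("4.0", 16), ("5.0", 32)):
          |         gb_s = gts * 4 * enc / 8   # 4 lanes, 8 bits per byte
          |         print(f"PCIe {name} x4: ~{gb_s:.1f} GB/s")
          | 
          | That works out to roughly 3.9, 7.9 and 15.8 GB/s, which is why
          | a 14 GB/s controller needs a gen5 link.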
        
       | vmception wrote:
        | Which of your applications or your clients' applications running
       | at the same time are currently bottlenecked?
        
         | Strom wrote:
         | Even something as simple as grep is bottlenecked by disk speeds
         | right now.
        
       | api wrote:
       | That's getting to be RAM speeds, but of course not RAM latencies.
        
       | ChuckMcM wrote:
        | Pretty impressive. And if it can really do 4 GB/s[1] of random
        | write performance, that is super helpful in database
        | applications.
       | 
       | I am wondering where the "Cloud SSD" stuff takes it relative to
       | general purpose use. Does anyone have any insights on that?
       | 
       | [1] They quote 1 million IOPS and assuming a 4K block size, which
       | is a good compromise between storage efficiency and throughput,
       | gives the 4GB/s number.
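        | 
        | The footnote arithmetic, spelled out (using the 4K block size
        | assumed above):
        | 
        |     iops = 1_000_000
        |     block_bytes = 4096
        |     print(iops * block_bytes / 1e9, "GB/s")   # ~4.1 GB/s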
        
         | wtallis wrote:
         | Here's a version of the OCP NVMe Cloud SSD Spec from last year:
         | https://www.opencompute.org/documents/nvme-cloud-ssd-specifi...
         | 
         | It covers a lot of ground, but as far as I'm aware nothing in
         | there really makes drives less suitable for general server use.
         | It just tightens up requirements that aren't addressed
         | elsewhere.
        
           | ChuckMcM wrote:
           | Yeah, pretty much. Thanks I've added it to my specifications
           | archive.
        
       | ksec wrote:
       | 20x20mm Package. That is quite large.
       | 
        | ~10W Controller. Doesn't matter on a Desktop or Server. But we
        | have sort of hit the limit of what can be done in a laptop.
        | 
        | Having said that, it doesn't mention what node it was fabbed on.
        | I assume more energy efficiency could be squeezed out.
        | 
        | <6us Latency. Doesn't say what percentile that is or under what
        | sort of conditions. But Marvell claims this is 30% better than
        | their previous SSD controllers.
        | 
        | I think the ServeTheHome article [1] is probably better. (Can't
        | change it now >< )
        | 
        | We also have PCI-E 6.0 [2] finalised by the end of this year,
        | which we can expect in products in 2023/2024. SSD controllers
        | with 28 GB/s.
        | 
        | I am also wondering if we are approaching the end of the
        | S-curve.
       | 
       | [1] https://www.servethehome.com/marvell-bravera-
       | sc5-offers-2m-i...
       | 
       | [2] https://www.anandtech.com/show/16704/pci-
       | express-60-status-u...
        
         | jagger27 wrote:
         | 10W definitely matters on servers. You can easily have a dozen
         | of these in 1U. 120W isn't nothing to dissipate.
         | 
         | I wonder what node this controller is made on. If it's made on
         | TSMC N7 then they could cut power consumption roughly in half
         | by going to N5P. The package size makes me wonder if it's an
         | even older node however.
        
           | dogma1138 wrote:
            | It's not really an issue in servers. Look at how much power
            | high-end NICs consume; heck, 10GbE SFP+ modules can consume
            | >5W each and you can easily have >48 of those in a switch...
        
             | dylan604 wrote:
              | But a switch doesn't include a hairdryer, er, GPU
              | generating heat within the enclosure. Server cases
              | potentially have multiple GPUs and CPUs, plus now these 10W
              | controller chips.
        
               | lostlogin wrote:
               | You are correct.
               | 
                | Some switches could do with much better cooling. A PoE
                | switch, or one that is pushing 10Gb down a Cat 6 cable,
                | is very toasty though. I've got one that is borderline
                | too hot to touch. Thanks Ubiquiti.
        
               | touisteur wrote:
                | Oh you don't want your hand near a Mellanox ConnectX-6
               | then. 200GbE doesn't come cold...
        
             | zamadatix wrote:
              | Most switches don't allow a 5W pull in every port; those
              | levels are usually only found in 10G copper SFPs, which
              | can't reach full distance due to power requirements and so
              | typically pull the maximum allowed (or higher). Typical 10G
              | SFP+ SR or Twinax will consume about a watt per module. The
              | ASIC may be a couple hundred watts under load.
             | 
             | Servers typically have much higher power density unless
             | you're talking 400G switches compared to low end servers.
        
         | lmilcin wrote:
         | > ~10W Controller. Doesn't matter on a Desktop or Server.
         | 
          | Of course it does. For one, that's heat you have to efficiently
          | remove, or face the SSD throttling on you or degrading over
          | time. It does not make sense to buy expensive hardware if it is
          | going to show impaired performance due to thermal throttling.
          | 
          | This makes the business of putting together your PC that much
          | more complicated, because up until recently you only had to
          | take care of the CPU and GPU, and everything else was mostly an
          | afterthought.
          | 
          | We are already seeing motherboards with their own fans; now I
          | suppose it is time for SSDs.
        
           | dragontamer wrote:
           | Weren't 10,000 or 15,000 RPM hard drives like 15W or 20W or
           | something?
           | 
           | These 2U or 4U cases, or tower desktop cases, were designed
           | to efficiently remove heat from storage devices, as well as
           | the 3000W++ that the dual socket CPU and 8way GPUs will pull.
           | 
           | 10W is tiny for desktop and server. Barely a factor in
           | overall cooling plans.
        
             | lmilcin wrote:
              | They were, but 3.5" drives are large hunks of aluminium
              | with many times the surface area. Meaning that as long as
              | you have some airflow around them and the air is not too
              | hot, they are fine.
              | 
              | Also, the heat was mostly generated by the motor and
              | actuator, not the controller.
        
               | zamadatix wrote:
               | High performance m.2 drives come with removable finned
               | heatsinks for this reason. Without them they rely on
               | throttling during sustained heavy workload. Dissipating
               | 10 Watts in a desktop isn't the concern.
        
               | Matt3o12_ wrote:
               | The heatsink doesn't really work, though, and is
               | marketing for the most part [1].
               | 
                | I have a Crucial P1 NVMe SSD and I can make it overheat
                | pretty reliably. Pretty much any synthetic workload makes
                | it overheat if the SSD is empty (it reaches 70 C pretty
                | quickly and even starts throttling; once it reaches 80 C
                | the whole system starts stuttering because of the extreme
                | throttling it does so it doesn't damage itself). Although
                | I have not properly tested it, it seems that not using
                | any heatsinks from my motherboard actually makes the
                | temps better, but it still overheats.
               | 
               | The main reason it can overheat quickly is probably
                | because it's sitting in a really bad position where it
                | gets close to zero airflow despite being in an airflow-
                | focused case. Most motherboards place the NVMe slot
               | directly under the GPU. The main problem seems to be that
               | the controller is overheating when it's writing at close
               | to 2000 MB/s. It's also important to note that only the
               | controller (an actual relatively powerful ARM processor),
               | not the flash memory, seems to overheat.
               | 
               | Fortunately, this is mostly not an issue because it's a
               | QLC drive and the workload is unrealistic in the real
               | world. When writing to an empty drive at 2000MB/s (Queue
               | depth 4, 128k sequential writes), it takes 2 minutes
                | until the cache is full. The way it's currently used, it
                | takes 30 secs for the cache to become full and for write
                | speeds to drop to 150MB/s. The only time it has ever
                | overheated in the real world was during a game's loading
                | screen, when it quickly reached 78C (and I only noticed
                | it in the hardware monitor). If the GPU hadn't heated up
                | the NVMe drive before (it was sitting at
               | 65C mostly idle), and starved it for air, I doubt it
               | would have hit 60C.
               | 
                | So until motherboards start placing NVMe slots where they
                | can get some actual cooling, or they come with actually
                | functioning heatsinks, the drives' power usage can make a
                | difference.
               | 
               | [1]: https://www.gamersnexus.net/guides/2781-msi-m2-heat-
               | shield-i... but there are many more articles/forum posts
               | with similar issues.
        
               | zamadatix wrote:
               | The Crucial P1 is a budget SSD that doesn't come with a
               | proper heatsink, similar in quality to that god awful
               | "heat shield" in that linked review. When I say "High
               | performance m.2 drives come with removable finned
               | heatsinks" I mean an actual high performance drives that
               | come with a finned heatsinks like
               | https://www.amazon.com/Corsair-Force-
               | MP600-Gen4-PCIe/dp/B07S... not examples of budget drives
               | paired with flat pieces of metal.
               | 
               | Also your high performance SSD should be going in the
               | direct-to-CPU slot to the right of the GPU, not under it.
        
               | baybal2 wrote:
               | The thing is the flash chips themselves are relatively
               | cool.
               | 
                | It's the controller that generates most of the heat, from
                | running PCIe at top speed.
        
           | matheusmoreira wrote:
           | Would be cool to have a general liquid cooling solution for
           | all components. We'd install a radiator outside our homes
           | just like an air conditioning unit and then connect the
           | computers to it.
        
         | AtlasBarfed wrote:
         | Probably not a small node, remember flash gets more fragile the
         | smaller the process.
         | 
         | SSD makers are layering on larger nodes, and focusing on
         | multibit (they are basically at PLC/5bit flash for consumer or
         | non-heavy wear, which is frankly a bit nuts)
        
           | Dylan16807 wrote:
           | This is a controller, not flash.
        
         | bhouston wrote:
         | 2023/2024 I suspect you meant to write.
         | 
         | I do like the fact that SSD bandwidth may be approaching memory
         | bandwidth.
        
           | ksec wrote:
           | >2023/2024 I suspect you meant to write.
           | 
           | ROFL. Thanks. Keep thinking about the good old days.
        
           | CyberDildonics wrote:
           | SSD bandwidth is not really approaching modern memory
           | bandwidth - a computer with one of these will probably have 4
           | to 8 channels of DDR4 at 25GB/s each with DDR5 released about
           | a year ago. That is 100GB/s to 200GB/s currently and more by
           | the time PCIe 5 becomes a reality.
           | 
           | I am sure if you go back a few years you can find systems
           | that have the same or less memory bandwidth than this has
           | now, but they have both been moving forward enough that SSDs
           | are still not close. That being said, between bandwidth and
           | random access times, swapping out to disk backed virtual
           | memory is very pragmatic and isn't the death of performance
           | it used to be.
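            | 
            | Rough numbers for scale (assuming DDR4-3200 at ~25.6 GB/s per
            | channel, as above):
            | 
            |     per_channel = 25.6
            |     for ch in (4, 8):
            |         print(ch, "channels:", ch * per_channel, "GB/s")
            |     print("this SSD:", 14, "GB/s")   # roughly 7-15x less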
        
             | dekhn wrote:
             | I think a fairer comparison would be 1 DDR channel to one
             | SSD, or comparing the 8 channels to a multi-SSD striped
             | RAID array. SSDs are ~7GB/sec, so I would say that SSD
             | bandwidth is roughly within one or two orders of magnitude
             | slower than RAM. I certainly wouldn't want to replace an
             | app optimized to use all the RAM on a machine with one
             | running on a lower-RAM SSD machine, though.
        
               | muxator wrote:
               | Maybe I'm old. I never used to think that an application
               | that is able to use my whole ram is optimized. It evokes
               | exactly the opposite impression, indeed.
               | 
               | I think I understand what you mean, but the gut reaction
               | is that one.
               | 
               | Yeah, I am really old, after all.
               | 
               | Edit: let me explain. Something that is able to use the
               | whole space of a flat memory model is way less
               | sophisticated than something that is able to deal with a
               | complex memory hierarchy. Our machines are indeed a
               | complex pyramid of different subsystems with varying
               | bandwidth and latency characteristics. A program that is
               | able to embrace the inherent hierarchical nature of our
               | machines (or multi-node systems) is way more "optimized",
               | according to my sensibility.
        
               | dekhn wrote:
               | I'm talking about situations like search, where you hold
               | the entire index in ram. Total # of machines = size of
               | index / indexserver ram. Usually the apps that run on
               | these have, say, 96 cores and they're using about 80, and
               | the idle time is mostly instructions waiting for memory
               | fetches.
               | 
                | Typically that index fronts a disk repository which
                | wouldn't fit in RAM, although what fits in RAM, what
                | lives on disk, and what gets cached in RAM have changed
                | over time.
               | 
               | BTW, I'm probably of the same generation as you and the
               | single most important lesson I ever learned for computing
               | performance was "add more RAM"; in the days when I first
               | started using Linux with 4MB of RAM, it wasn't enough to
               | do X11, g++ and emacs all at the same time without
               | swapping, so I spent my hard-earned money to max out the
               | RAM, at which point it didn't swap and I could actually
               | do software development quickly.
        
             | terafo wrote:
             | If you have 8 of these on a single system, you are getting
              | very close to RAM bandwidth. Still, there are the issues of
              | latency and TBW.
        
               | jopsen wrote:
               | Now if we could just make a Beowulf cluster of these...
        
               | bhouston wrote:
               | Beowulf, that is a name i haven't heard in a long time.
        
             | TwoBit wrote:
             | IMO disk tech of 25 GB/s vs 100 GB/s for memory counts as
             | approaching.
        
             | anarazel wrote:
             | It really depends on the type of memory usage though. On
             | current Intel multi-socket the bandwidth a single process
              | can have for core-mediated memory accesses, even for node
             | local memory, is seriously disappointing. Often < 10GB/s.
             | Yes, it scales reasonably nicely, but it's very painful to
             | tune for that, given that client and low core count, single
             | socket, SKUs end up with > 35GB/s. And that this affects
             | you even from within VMs that are on a single socket.
             | 
              | I heard that Ice Lake SP improves a bit in this area, but I
              | haven't gotten access to one yet.
        
       | walrus01 wrote:
        | Given a theoretical either/or choice between a PCIe 5.0 SSD and
        | more PCI-E lanes using the tech we have now, I would rather have
        | a greater number of PCI-E 4.0 lanes in single-socket
        | consumer/workstation grade motherboards.
        | 
        | That leaves open the possibility of dual NVMe SSDs in a
        | workstation along with an x16 video card and 10Gbps NICs.
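        | 
        | Roughly the lane budget that implies (the per-device lane counts
        | are assumptions, not a specific board):
        | 
        |     # assumed device lane counts, not a specific board
        |     lanes = {"GPU x16": 16, "2x NVMe x4": 8, "10GbE NIC x4": 4}
        |     total = sum(lanes.values())
        |     print(total)   # 28, more than the ~16-24 CPU lanes of
        |                    # typical mainstream consumer platforms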
        
       | Synaesthesia wrote:
       | Still not good enough for PS5 hey Sony?
        
       | dvfjsdhgfv wrote:
       | It looks like we'll get fantastic speeds in a few years. Now it's
       | time to take care of durability.
        
       | dstaley wrote:
       | Obviously there's always a customer for faster speeds, but have
       | we even hit the upper threshold of PCIe 4 in the consumer market?
        
         | wmf wrote:
         | SN850 is getting close to the limit of PCIe 4.0.
         | https://www.anandtech.com/show/16505/the-western-digital-wd-...
        
         | wtallis wrote:
         | The first wave of consumer gen4 SSDs that all used the Phison
         | E16 controller were only good for about 5 GB/s out of the ~7
         | GB/s possible (and 3.5 GB/s on PCIe gen3). But the newer gen4
         | consumer drives that started hitting the market last fall come
         | a lot closer, and this summer a lot of those are getting
         | refreshed with newer, faster NAND that will have PCIe 4.0 as
         | thoroughly saturated as PCIe 3.0 has been for the past few
         | years. Phison has already clearly stated that their next high-
         | end controller after the recently-launched E18 will be a PCIe
         | gen5 chip, and the E18 is fast enough to finish out the gen4
         | era.
        
           | dstaley wrote:
           | Absolutely wild that PCIe 4.0 was saturated just two years
           | after introduction in the consumer market.
        
       ___________________________________________________________________
       (page generated 2021-05-27 23:00 UTC)