[HN Gopher] Hyperscale in your Homelab: The Compute Blade arrives
       ___________________________________________________________________
        
       Hyperscale in your Homelab: The Compute Blade arrives
        
       Author : mikece
       Score  : 52 points
       Date   : 2023-01-23 15:27 UTC (1 day ago)
        
 (HTM) web link (www.jeffgeerling.com)
 (TXT) w3m dump (www.jeffgeerling.com)
        
       | exabrial wrote:
       | I really want something like NVidia's upcoming Grace CPU in blade
       | format, but something where I can provision a chunk of SSD
        | storage off a SAN via some sort of PCI-E backplane. Same form
        | factor as the linked project.
       | 
        | I'm noticing that our JVM workloads execute _significantly_
        | faster on ARM. The execution times on our lowly first-gen M1
        | MacBooks are significantly better than on some of the best Intel
        | or AMD hardware we have racked. I'm guessing it all has to do
        | with memory bandwidth.
        
       | robotburrito wrote:
        | This is cool. But it's super hard to compete w/ a computer you
        | bought off craigslist for $25.
        
       | ilyt wrote:
        | I always wanted such a thing for various "plumbing" services
       | (DHCP/DNS/wifi controller etc) but lack of ECC and OOB management
       | kinda disqualifies it for anything serious.
       | 
        | > He's running forty Blades in 2U. That's:
        | >
        | >   160 ARM cores
        | >   320 GB of RAM
        | >   (up to) 320 terabytes of flash storage
        | 
        | > ...in 2U of rackspace.
       | 
        | Yay that's like... almost as much as a normal 1U server can do
       | 
       | Edit: I give up, HN formatting is idiotic
        
         | LeonM wrote:
         | > I always wanted such thing for various "plumbing" services
         | (DHCP/DNS/wifi controller etc)
         | 
         | You don't need a cluster for that, even a 1st gen Pi can run
         | those services without any problem.
        
           | guntherhermann wrote:
           | I can only speak for Raspi 3B+, but I agree.
           | 
           | I have multiple services running on it (including pihole,
           | qbittorrent, vpn) and it's at about 40% mem usage right now.
        
         | timerol wrote:
         | Also not noted: 320 TB in 40 M.2 drives will be extremely
         | expensive. Newegg doesn't have any 8 TB M.2 SSDs under $1000.
         | $0.12/GB is about twice as expensive as more normally-sized
         | drives, to say nothing of the price of spinning rust.
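          | 
          | A quick back-of-the-envelope check of that (a sketch; the 2 TB
          | drive price is an assumption, the 8 TB price is the figure
          | above):
          | 
          |     # rough cost-per-GB comparison for the 2U of blades
          |     blades = 40
          |     tb_per_drive = 8
          |     price_8tb_m2 = 1000   # roughly the cheapest 8 TB M.2
          |     price_2tb_m2 = 120    # assumed "normally-sized" 2 TB M.2
          | 
          |     total_tb = blades * tb_per_drive
          |     total_cost = blades * price_8tb_m2
          |     per_gb_8tb = price_8tb_m2 / (tb_per_drive * 1000)
          |     per_gb_2tb = price_2tb_m2 / (2 * 1000)
          | 
          |     print(f"{total_tb} TB of flash costs ~${total_cost:,}")
          |     print(f"${per_gb_8tb:.3f}/GB vs ${per_gb_2tb:.3f}/GB")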
        
         | bee_rider wrote:
         | Just the Pi's are $35 a pop, right? So that's $1400 of Pi's, on
         | top of whatever the rest of the stuff costs. Wonder how it
          | compares to, I guess, whatever the price-equivalent AMD
          | workstation chip is...
        
           | philsnow wrote:
           | It seems they're the ones with 8 GB of ram, so probably
           | closer to $75 each.
        
             | bee_rider wrote:
              | I'd be interested to see if anyone has an application
              | other than CI for Raspberry Pi programs; I really can't
              | see one.
        
         | singron wrote:
          | It's actually 3U, since the 2U of 40 Pis will need almost an
          | entire 1U 48-port PoE switch instead of plugging into the ToR.
         | The switch will use 35-100W for itself depending on features
         | and conversion losses. If each pi uses more than 8-9W or so
         | under load, then you might actually need a second PoE switch.
         | 
         | If you are building full racks, it probably makes more sense to
         | use ordinary systems, but if you want to have a lot of actual
         | hardware isolation at a smaller scale, it could make sense.
         | 
         | In some colos, they don't give you enough power to fill up your
         | racks, so the low energy density wouldn't be such a bummer
         | there.
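          | 
          | To make the power math concrete, a rough sketch (the 370 W PoE
          | budget is an assumption about a typical 48-port PoE+ switch,
          | not a number from the article):
          | 
          |     # does 40 Pis' PoE draw fit in one switch's budget?
          |     poe_budget_w = 370   # assumed 48-port PoE+ switch budget
          |     pis = 40
          | 
          |     for per_pi_w in (5, 8, 9, 10):
          |         load = pis * per_pi_w
          |         ok = load <= poe_budget_w
          |         print(f"{per_pi_w} W per Pi: {load} W PoE load -> "
          |               f"{'fits' if ok else 'needs a 2nd switch'}")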
        
         | marginalia_nu wrote:
         | I do think this is sort of fool's gold in terms of actual
         | performance. Even though the core count and RAM size is
         | impressive, those cores are talking over ethernet rather than
         | system bus.
         | 
         | Latency and bandwidth is atrocious in comparison, and you're
         | going to run into problems like no individual memory allocation
          | being able to exceed 8 GB.
         | 
         | Like for running a hundred truly independent jobs then sure,
          | maybe you'll get equivalent performance, but that's an unusual
          | scenario that is rare in the real world.
        
           | DrBazza wrote:
           | It probably lends itself to tasks where CPU time is much
            | greater than the network round trip. Maybe scientific
            | problems that are massively parallel. Way back in the 90s I
            | worked with
           | plasma physics guys that used a parallel system on "slow" Sun
           | boxes. I can't remember the name of the software though.
        
           | varispeed wrote:
            | I built such a toy cluster once to see for myself and gave
            | up. It is too slow to do anything serious. You are much
            | better off just buying an older post-lease server. Sure, it
            | will consume more power, but conversely you will finish more
            | tasks in a shorter time, so the advantage of using ARM in
            | that case
           | may be negligible. If it was Apple's M1 or M2, that would
           | have been a different story though. RPi4 and clones are not
           | there yet.
        
             | marginalia_nu wrote:
             | I overall think people tend to underestimate the overhead
              | of clustering. It's always significantly faster to run a
              | computation on one machine than to spread it over N
              | machines each with 1/N the power.
             | 
             | That's not always a viable option because of hardware
             | costs, and sometimes you want redundancy, but those
             | concerns are on an orthogonal axis to performance.
        
               | jbverschoor wrote:
               | Well, a complete M1 board, which is basically about as
               | large as half an iPhone mini, is fast enough. It's also
               | super efficient. So I'm still waiting for Apple to
               | announce their cloud.
               | 
               | They're currently putting Mx chips in every device they
               | have, even the monitors. It'll be the base system for any
               | electric device. I'm sure we'll see more specialized
               | devices for different applications, because at this
               | point, the hardware is compact, fast, and secure enough
               | for anything, as well as the software stack.
               | 
               | Hello Apple Fridge
        
               | marginalia_nu wrote:
               | Fast enough for what?
        
               | convolvatron wrote:
                | Lines get blurred when you are on a supercomputer
                | interconnect with a global address space, or even RDMA.
        
               | dekhn wrote:
                | The fastest practical interconnects are roughly 1/10th
                | the speed of local RAM. Because of that, if you use an
                | interconnect, you don't use it for remote RAM (through
                | virtual memory).
               | 
               | I don't think anybody in the HPC business really pursued
               | mega-SMP after SGI because it was not cost-effective for
               | the gains.
        
           | PragmaticPulp wrote:
           | >I do think this is sort of fool's gold in terms of actual
           | performance.
           | 
           | It's a fun toy for learning (and clicks, let's be honest).
           | 
           | It's not a serious attempt at a high performance cluster or
           | an exercise in building an optimal computing platform.
           | 
           | Enjoy the experiment and the uniqueness of it. Nobody is
           | going to be choosing this as their serious compute platform.
        
             | analognoise wrote:
              | In TFA, isn't JetBrains using it as a CI system?
        
               | lmz wrote:
               | Unless they need something Pi specific I don't understand
               | why this would be preferable versus just virtualizing
               | instances on a "big ARM" server. I'm sure those exist.
        
               | bee_rider wrote:
               | Tangential, but it is so funny to me that "TFA" has
               | become a totally polite and normal way to refer to the
               | linked article on this site. Expanding that acronym would
               | really change the tone!
        
               | OJFord wrote:
               | I'm not sure it is 'totally polite'? I usually read it as
               | having a 'did you even open it' implication that 'OP' or
               | 'the submission' doesn't. Maybe that's just me.
        
               | bee_rider wrote:
               | Maybe it isn't _totally_ polite, but it IMO it reads in
               | this case more like slight correction than "In the
               | fucking article," which would be pretty aggressive, haha.
        
         | PragmaticPulp wrote:
         | > but lack of ECC and OOB management kinda disqualifies it for
         | anything serious.
         | 
          | > Yay that's like... almost as much as a normal 1U server can do
         | 
         | It's a fun toy. _Obviously_ it isn't the best or most efficient
         | way to get any job done. That's not the point.
         | 
         | Enjoy it for the fun experiment that it is.
        
         | FlyingAvatar wrote:
         | I think the hardware isolation would be a selling point in some
         | cases. Granted, it's niche.
        
         | [deleted]
        
         | xattt wrote:
         | But does anyone remember the Beowulf trope(1) from Slashdot? Am
         | I a greybeard now?
         | 
         | (1) https://hardware.slashdot.org/story/01/07/14/0748215/can-
         | you...
        
           | neilv wrote:
           | To go along with "Imagine a Beowulf cluster of those!", don't
           | forget "Take my money!"
        
             | Koshkin wrote:
             | You can get off my lawn now.
        
           | iamflimflam1 wrote:
           | But does it run Doom?
        
             | mejutoco wrote:
             | Crysis
        
               | cptnapalm wrote:
               | What does Natalie Portman need to imagine a Beowulf
               | cluster of Dooms running Crysis? Grits?
        
           | mometsi wrote:
            | I like user _big.ears_' speculation on what someone could
            | possibly do with that much parallel compute:
            | 
            | I don't think there's any theoretical reason someone couldn't
           | build a fairly realistic highly-complex "brain" using, say,
           | 100,000,000 simplified neural units (I've heard of a guy in
           | Japan who is doing such a thing), but I don't really know
           | what it would do, or if it would teach us anything that is
           | interesting.
        
           | edoloughlin wrote:
           | I do and you are. I'm also imagining one covered in hot
           | grits...
        
           | trollied wrote:
           | Don't forget the Hot Grits & Natalie Portman.
        
           | pdpi wrote:
           | Beowulf clusters were those lame things that didn't have
           | wireless, and had less space than a nomad, right?
        
           | pjmlp wrote:
           | I do. And cool research OSes that did process migration.
        
             | nine_k wrote:
             | Ah, Plan 9.
        
               | wiredfool wrote:
                | And God help us, OS/2 Warp.
        
               | pjmlp wrote:
               | With much better tooling for OO ABI than COM/WinRT will
               | ever get (SOM).
        
               | zozbot234 wrote:
               | I'm not sure that Plan 9 does process migration out of
               | the box. It does have complete "containerization" by
               | default, i.e. user-controlled namespacing of all OS
               | resources - so snapshotting and migration could be a
               | feasible addition to it.
               | 
               | Distributed shared memory is another intriguing
               | possibility, particularly since large address spaces are
               | now basically ubiquitous. It would allow users to
               | seamlessly extend multi-threaded workloads to run on a
               | cluster; the OS would essentially have to implement
               | memory-coherence protocols over the network.
        
               | nine_k wrote:
               | If not Plan 9, then likely Inferno. (A pretty different
               | system, of course.)
        
             | infinite8s wrote:
             | It's too bad Apple never bought Gerry Popek's LOCUS
             | (https://en.wikipedia.org/wiki/LOCUS), which could do
             | process migration between heterogeneous hardware!
        
           | bradleyy wrote:
           | So, I once met a guy named Don.
           | 
           | We were hanging out in the garage of a mutual friend,
           | chatting. Got to the "what do you do" section of the
           | conversation, and he says he works in massively parallel
           | stuff at XYZ corp. Something something, GPUs.
           | 
           | I make the obvious "can you make a Beowulf cluster?" joke, to
           | which he responds (after a pregnant pause), "you... do know
           | who I am?"
           | 
            | Yep. Donald Becker. A slightly awkward moment that I'll
            | cherish forever.
        
           | Quequau wrote:
           | I do but in all fairness, I have an entirely grey beard.
        
             | KingOfCoders wrote:
             | Grey! Mine is white.
        
           | b33j0r wrote:
           | Sure do! It is interesting that these technologies evolve
           | more slowly than it seems, sometimes.
           | 
           | On the graybearding of the cohort, here's a weird one to me.
           | These days, I mention slashdot and get more of a response
           | from peers than mentioning digg!
           | 
           | In 2005, I totally thought digg would be around forever as
           | the slashdot successor, but it's almost like it never
           | happened (to software professionals... er, graybeards)
        
           | flyinghamster wrote:
           | You and me both. The funny thing is, I wound up writing a
           | program that would benefit from clustering, and felt my way
           | around setting up MPICH on my zoo. I laughed out loud when I
           | realized that, after all these years, I'd built an impromptu
           | Beowulf cluster, even though the machines are scattered
           | around the house.
           | 
           | Installing MPICH from source instead of from your
           | distribution is best if you can't have all your cluster
           | members running the same version of the same distro and/or
           | have multiple architectures to contend with. But it takes
           | forever to compile, even on a fast machine.
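            | 
            | For anyone curious, the smallest useful MPICH sanity check is
            | something like this mpi4py sketch (assuming MPICH and mpi4py
            | are installed on every node and a hostfile lists them):
            | 
            |     # hello.py
            |     # run with: mpiexec -f hostfile -n 8 python3 hello.py
            |     from mpi4py import MPI
            | 
            |     comm = MPI.COMM_WORLD
            |     rank = comm.Get_rank()           # this process's id
            |     size = comm.Get_size()           # total processes
            |     host = MPI.Get_processor_name()  # node running this rank
            | 
            |     print(f"rank {rank} of {size} on {host}")
            | 
            |     # trivial collective: sum the ranks across the cluster
            |     total = comm.allreduce(rank, op=MPI.SUM)
            |     if rank == 0:
            |         print(f"sum of ranks: {total}")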
        
           | lemper wrote:
            | yeah, wanted to replicate something like that by proposing it
            | to a hardware vendor who visited my uni decades ago. it
            | didn't go anywhere because i was intimidated by the red tape.
        
           | pbronez wrote:
           | New to me - found the source article in the Wayback Machine:
           | 
           | https://web.archive.org/web/20010715201416/http://www.scient.
           | ..
        
             | pbronez wrote:
             | Found the definition:
             | 
             | Sterling and his Goddard colleague Donald J. Becker
             | connected 16 PCs, each containing an Intel 486
             | microprocessor, using Linux and a standard Ethernet
             | network. For scientific applications, the PC cluster
             | delivered sustained performance of 70 megaflops--that is,
             | 70 million floating-point operations per second. Though
             | modest by today's standards, this speed was not much lower
             | than that of some smaller commercial supercomputers
             | available at the time. And the cluster was built for only
             | $40,000, or about one tenth the price of a comparable
             | commercial machine in 1994.
             | 
             | NASA researchers named their cluster Beowulf, after the
             | lean, mean hero of medieval legend who defeated the giant
             | monster Grendel by ripping off one of the creature's arms.
             | Since then, the name has been widely adopted to refer to
             | any low-cost cluster constructed from commercially
             | available PCs.
        
           | faichai wrote:
           | Natalie Portman says yes, and instructs you to put some hot
           | grits down your pants.
        
           | wrldos wrote:
            | I remember building a 4-node Beowulf cluster out of discarded
            | Compaq desktops and then having no idea what to do with it.
        
             | red-iron-pine wrote:
             | Did kinda the same thing but with Raspberry Pis. Neat, a
             | cluster of r-pi's... now what?
        
               | ipsin wrote:
               | If you want to continue the chain of specific goals in
               | service of no specific purpose: run Kubernetes on it.
        
               | Sebb767 wrote:
                | Then add ArgoCD for deployment and Istio for a service
                | mesh!
                | 
                | While you are at it, also set up Longhorn for storage.
                | With that solved, you might as well start hosting Gitea
                | and DroneCI on the cluster, plus an extra Helm and
                | Docker repo for good measure. And in no time you will
               | have a full modern CI/CD setup to do nothing but updates
               | on! :-)
               | 
               | Seriously, though, you will learn a lot of things in the
               | process and get a bottom up view of current stacks, which
               | is definitely helpful.
        
               | wrldos wrote:
               | I did this. I am still dead inside. Thank goodness all my
               | production shit has a managed control plane and network.
        
             | mkj wrote:
              | 75 MHz, yeah! Stacked on top of each other! With 10 Mbit
              | Ethernet! I think we even got OpenMosix going.
              | 
              | Five years later I was working on them for a living in
              | HPC, but they were no longer called Beowulf clusters by
              | then.
        
         | goodpoint wrote:
          | > Yay that's like... almost as much as a normal 1U server can do
         | 
         | ...but the normal server is much cheaper.
        
         | imtringued wrote:
          | Didn't AMD announce a 96-core processor with dual-socket
          | support?
         | 
         | As usual this is either done for entertainment value or to
         | simulate physical networks (not clusters).
        
           | adrian_b wrote:
           | Intel also has now up to 480 cores in an 8-socket server with
           | 60 cores per socket, though Sapphire Rapids is handicapped in
           | comparison with AMD Genoa by much lower clock frequencies and
           | cache memory sizes.
           | 
           | However, while the high-core-count CPUs have excellent
            | performance per occupied volume and per watt, they all have
            | extremely low performance per dollar, unless you are able to
            | negotiate huge discounts when buying them by the thousands.
           | 
           | Using multiple servers with Ryzen 9 7950X can provide a
           | performance per dollar many times higher than that of any
           | current server CPU, i.e. six 16-core 7950X with a total of
           | 384 GB of unbuffered ECC DDR5-4800 will be both much faster
           | and much cheaper than one 96-core Genoa with 384 GB of
           | buffered ECC DDR5-4800.
           | 
           | Nevertheless, the variant with multiple 7950X is limited for
           | many applications by either the relatively low amount of
           | memory per node or by the higher communication latency
           | between nodes.
           | 
           | Still, for a small business it can provide much more bang for
           | the buck, when the applications are suitable for being
           | distributed over multiple nodes (e.g. code compilation).
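            | 
            | Putting rough numbers on that (a sketch; both prices are
            | assumptions and it counts CPUs only, ignoring boards, RAM,
            | and the 7950X's higher clocks):
            | 
            |     # cores per dollar, consumer vs. server CPUs
            |     ryzen_price, ryzen_cores = 700, 16  # assumed 7950X price
            |     genoa_price, genoa_cores = 11800, 96  # assumed list price
            | 
            |     nodes = 6
            |     diy_cores = nodes * ryzen_cores  # 96 cores total
            |     diy_cost = nodes * ryzen_price
            | 
            |     print(f"6x 7950X: {diy_cores} cores, ~${diy_cost:,}, "
            |           f"{1000 * diy_cores / diy_cost:.0f} cores/k$")
            |     print(f"1x Genoa: {genoa_cores} cores, ~${genoa_price:,}, "
            |           f"{1000 * genoa_cores / genoa_price:.0f} cores/k$")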
        
             | cjbgkagh wrote:
              | This is the exact space I'm in: high CPU, low network. By my
             | estimates it's about 1/4 the cost per CPU operation to use
             | consumer hardware instead of enterprise. The extra
             | computers allow for application level redundancy so the
             | other components can be cheaper as well.
        
             | bee_rider wrote:
              | One problem with 480 cores in a single node: 480 cores is
              | a shitload of cores, so who needs more than a single node
              | at this point? The MPI programmer inside me is having an
              | existential breakdown.
        
         | zaarn wrote:
          | The 1U server is, however, likely to use more than the ~200
          | watts of power that the 40-blade 2U setup would use.
        
           | logifail wrote:
            | > The 1U server is, however, likely to use more than the
            | > ~200 watts of power
           | 
           | Q: Why would a 1U server need more than 200W if you're doing
           | nothing more than basic network services?
           | 
           | I have mini tower servers that draw a fraction of that at
           | idle.
        
             | zaarn wrote:
              | The Pis will be using those 200 watts at near full tilt.
             | The main use here would be larger computational tasks that
             | you can easily split up among the blades. Or you run a very
             | hardware-failure tolerant software service on top.
        
             | bluedino wrote:
             | I have some idle Dell R650's that draw 384W. A couple
             | drives, buncha RAM, two power supplies, 2 CPU's (Xeon 8358)
        
               | logifail wrote:
               | > Dell R650's that draw 384W
               | 
               | Umm, I'm not sure I can afford the electricity to run kit
               | like that :)
               | 
               | I'm currently awaiting delivery of an Asus PN41 (w/
               | Celeron N5100) to use as yet another home server, after a
               | recommendation from a friend. Be interesting to see how
               | much it draws at idle!
        
         | guntherhermann wrote:
          | It's ~~four~~ two spaces to get the "code block" style,
          |       like this
          | 
          | and asterisks for italics (I don't think there is a 'quote'
          | available, and I'm not sure how they play together).
          | 
          | * does this work? * Edit: No! Haha
          | 
          |       *how*
          |       *about*
          |       *this*
         | 
         | Edit: No, no joy there either.
         | 
         | I agree, it's not the most intuitive formatting syntax I've
         | come across :)
         | 
         | I guess we're stuck with BEGIN_QUOTE and END_QUOTE blocks!
        
           | teddyh wrote:
           | It's two spaces. https://news.ycombinator.com/formatdoc
        
         | mkl wrote:
         | From the FAQ: https://news.ycombinator.com/formatdoc
        
         | guntherhermann wrote:
          | _> Yay that's like... almost as much as a normal 1U server can
          | do_
         | 
         | What about cost, and other metrics around cost (power usage,
         | reliability)? If space is the only factor we care about then it
         | seems like a loss.
        
           | betaby wrote:
           | What about them? 1U servers from vendors are reliable and
           | efficient - people use them in production for years. As for
            | the cost, those hobby-style boards are very expensive in
            | dollars per unit of performance. Indeed, I don't get why one
            | would want a cluster of expensive, low-spec nodes.
        
         | sys42590 wrote:
         | Indeed, that box here next to my desk draws 50W of electricity
         | continuously despite being mostly idle. Why? Because it has
         | ECC.
         | 
         | Having some affordable low power device with ECC would be a
         | game changer for me.
         | 
         | I added affordable to exclude expensive (and noisy) workstation
         | class laptops with ECC RAM.
        
           | namibj wrote:
            | Most AMD desktop platforms support ECC, and if you don't use
            | overclocking facilities, they are pretty efficient (though
            | their chiplet architecture causes idle power draw to be a
            | good fraction of active power draw, it is still much less
            | than 50W).
        
           | Maakuth wrote:
           | There are Intel Atom CPUs that support ECC. I had a
           | Supermicro motherboard with a quad core part like that and I
           | used it as a NAS. It was not that fast, but the power
           | consumption was very low.
        
             | smartbit wrote:
             | Do you remember how many Watts it was using with idle
             | disks?
        
               | MrFoof wrote:
                | I personally have this at 43-45W idle:
                | 
                |     > Corsair SF450 PSU
                |     > ASRock Rack X570D4U w/BMC
                |     > AMD Ryzen 7 Pro 5750GE (8C 3.2/4.6 GHz)
                |     > 128GB DDR4-2666 ECC
                |     > Intel XL710-DA1 (40Gbps)
                |     > LSI/Broadcom 9500-8i HBA
                |     > 64GB SuperMicro SATA DOM
                |     > 2x SK Hynix Gold P31, 2TB NVMe SSD
                |     > 8x Hitachi 7200rpm, 16TB HDD
                |     > 3x 80mm fans, 2x 40mm fans, CPU cooler
               | 
               | That was an at the time modern "Zen 3" (using Zen 2
               | cores) system on an X570 chipset. The CPU mostly goes in
               | 1L ultra SFF systems. TDP is 35W, and under stress
                | testing the CPU tops out at around 38.8-39W. The
               | onboard BMC is about 3.2-3.3W of power consumption
               | itself.
               | 
                | Most data ingest and reads come from the SSD cache, with
                | that being more like 60W for high throughput. Under
               | very high loads (saturating the 40Gbps link) with all
               | disks going, only hits about 110-120W.
               | 
               | By comparison, a 6-bay Synology was over double that idle
               | power consumption, and couldn't come close to that
               | throughput.
        
               | sys42590 wrote:
               | thanks for the parts list, especially because I think
               | ASRock Rack paired with a Ryzen Pro offers better
               | performance than a Supermicro in the same price range.
        
               | MrFoof wrote:
               | There's reasons for that though.
               | 
               | I could drop a few more watts if ASRock could put
               | together a decent BIOS where disabling things actually
               | disables things.
               | 
               | SuperMicro costs what it does for a reason.
               | 
                | ---
               | 
               | If you're looking for a chassis, I'm using a SilverStone
               | RM21-308, with a Noctua NH-L9a-AM4 cooler, and cut some
               | SilverStone sound deadening foam for the top panel of the
               | 2U chassis.
               | 
               | Aside from disks clicking, it's silent, runs hilariously
               | cool _(I 3D printed chipset and HBA fan mounts at a local
               | library)_ and it's more usable storage, higher
               | performance _(saturates 40Gbps trivially)_ and lower
               | power consumption than anything any YouTuber has come
               | remotely close to. That server basically lets me have
               | everything else in my rack not care much about storage,
               | because the storage server handles it like a champ. I
               | really considered doing a video series on it, but I'm too
               | old to want to deal with the peanut gallery of YouTube
               | comments.
        
               | philsnow wrote:
               | If you don't mind me asking, how do your other workloads
               | access the storage on it, NFS? The stumbling block for
               | NFS for me is identity and access management.
        
               | Maakuth wrote:
               | It was this board: https://www.supermicro.com/en/products
               | /motherboard/a2sdi-2c-...
               | 
               | I think it was idling at something like 30-40W with four
               | HDDs and a UPS. I didn't have an especially efficient PSU
               | and the UPS must have taken some power too. The
               | motherboard alone would draw as little as 15W, I suppose.
        
           | nsteel wrote:
           | > Why? Because it has ECC
           | 
           | Sorry if I am missing the obvious here, but why would ECC
           | consume so much power?
        
             | growse wrote:
             | It's not that ECC consumes power, it's that systems that
             | support ECC tend to consume more idle power (because
             | they're larger etc.)
        
           | aidenn0 wrote:
           | Xeon-D series?
        
           | walterbell wrote:
           | Epyc Embedded and possibly some Ryzen Embedded devices.
        
           | stordoff wrote:
           | How much RAM is that with? My home server idles at ~25-27W,
            | but that's with only 16GB (ECC DDR4). However, throwing in an
           | extra 16GB as a test didn't measurably change the reading.
        
         | mayli wrote:
          | That would be 40 x (RPi 4 8GB at $75 + 8TB NVMe at $1200 + PSU
          | and others) ~ $51,000.
        
         | 2OEH8eoCRo0 wrote:
          | > Yay that's like... almost as much as a normal 1U server can do
         | 
         | Hyperscale in your _Homelab_. Something to hack on, learn, host
         | things like Jellyfin, and have fun with.
        
           | jeffbee wrote:
            | I agree, but can't you get the same effect with VMware ESXi?
           | If I just wanted to "have fun" managing scores of tiny
           | computers, and I emphasize that this sounds like the least
           | amount of fun anyone could have, I can have as many virtual
           | machines as I want.
        
             | fishtacos wrote:
             | I can understand why some people want something
             | physical/tangible while testing or playing in their hobby
              | environment. I'm still a fan of virtualization - PassMark
              | scores for an RPi4 (entire SoC, quad core) are 21 times
              | lower than a single core of a 14-core i5-13600K (as a point
              | of reference, my current system), and while I am running
              | 64GB RAM, I can easily upgrade to 128GB or more on a single
              | DDR4 node.
              | 
              | Hard to see an advantage given the obvious limitations,
              | although it may make it more fun to work within latency and
              | memory constrictions, I guess.
        
         | MuffinFlavored wrote:
         | > lack of ECC and OOB management kinda disqualifies it
         | 
         | Can you expand on this please?
        
         | metalspot wrote:
          | it's a nice hobby project, but of course a commercial blade
         | system will have far higher compute density. supermicro can do
         | 20 epyc nodes in 8u, which at 64 cores per node is 1280 cores
         | in 8u, or 160 in 1u, so double the core density, and far more
         | powerful cores, so way higher effective compute density.
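          | 
          | the density math, spelled out (a sketch using the numbers
          | above):
          | 
          |     # cores per rack unit for the two setups
          |     blade_cores, blade_ru = 40 * 4, 2   # 40 CM4s, 4 cores each
          |     epyc_cores, epyc_ru = 20 * 64, 8    # 20 EPYC nodes in 8U
          | 
          |     print(f"blades: {blade_cores / blade_ru:.0f} cores per U")
          |     print(f"epyc:   {epyc_cores / epyc_ru:.0f} cores per U")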
        
       | onphonenow wrote:
        | I've been getting good price/perf just using the top AMD consumer
        | CPUs. Wish someone would make an AM5 platform motherboard with
        | out-of-band / remote console management. That really is a must if
        | you have a bunch of boxes and have them somewhere else. The
        | per-core speeds are high on these. 16 cores / 32 threads per box
        | gets you enough for a fair bit.
        
         | trevorstarick wrote:
          | Have you taken a look at any of ASRock Rack's offerings?
          | They've got some preliminary B650 mATX boards:
          | https://www.asrockrack.com/general/productdetail.asp?Model=B...
        
       | 1MachineElf wrote:
        | Love it; however, I'm skeptical of the Raspberry Pi Foundation's
        | claims that the CM4 supply will improve during 2023. It might
        | improve for some, but as more novel solutions like these come up,
        | the supply will never be enough.
        
       | Havoc wrote:
        | I've built a small Raspberry Pi k3s cluster with Pi 4s and SSDs.
        | It works fine, but you can ultimately still feel that they are
        | quite weak. Put differently, deploying something on k3s still
        | ends up deploying to a single node in most cases, so you get
        | single-node performance under most circumstances.
        
         | nyadesu wrote:
          | I've been running a cluster like that for some years now and
          | definitely felt that, but it was easy to fix by adding AMD64
          | nodes to it.
         | 
         | Modifying the services I'm working on to build multi-arch
         | container images was not as straightforward as I imagined, but
         | now I can take advantage of both ARM and AMD64 nodes on my
         | cluster (plus I learned to do that, which is priceless)
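          | 
          | In case it helps anyone doing the same, a small sketch
          | (assuming the official kubernetes Python client and a working
          | kubeconfig) that shows which nodes are ARM and which are AMD64:
          | 
          |     # print the CPU architecture of every node in the cluster
          |     from kubernetes import client, config
          | 
          |     config.load_kube_config()   # uses your local kubeconfig
          |     v1 = client.CoreV1Api()
          | 
          |     for node in v1.list_node().items:
          |         info = node.status.node_info
          |         print(f"{node.metadata.name}: {info.architecture} "
          |               f"(kernel {info.kernel_version})")
          | 
          | Once you know the mix, multi-arch images are roughly a matter
          | of building with buildx for both linux/arm64 and linux/amd64
          | and pushing a manifest list.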
        
       | Annatar wrote:
       | [dead]
        
       | aseipp wrote:
       | I love the form factor. But please. For the love of god. We need
       | something with wide availability that supports at least ARMv8.2.
       | 
       | At this rate I have so little hope in other vendors that we'll
       | probably just have to wait for the RPi5.
        
       | jdoss wrote:
       | I think these are fantastic, but I really wish it had a BMC so
       | one could do remote management. I'd love for version 2 to have it
       | so I could buy a bunch for my datacenter.
        
       | spiritplumber wrote:
       | I helped write parallelknoppix when I was an undergrad - our
       | university's 2nd cluster ended up being a bunch of laptops with
       | broken displays running it. Took me a whole summer.
       | 
        | Then the next semester I was denied the ability to take a
        | parallel computing class because it was for graduate students
        | only and the prof. would not accept a waiver, even though the
        | class was being taught on the cluster a buddy and I built.
       | 
       | That I still had root on.
       | 
       | So I added a script that would renice the prof.'s jobs to be as
       | slow as possible.
       | 
       | BOFH moment :)
        
         | sidewndr46 wrote:
         | The school I went to had similar but more insane policies
         | 
         | * I frequently took computer science graduate courses and
         | received only undergrad. credit because they could not offer
         | the undergrad course
         | 
         | * Other majors were default prohibited from taking computer
         | science courses under the guise of a shortage of places in
         | classes. Even when those majors required a computer science
         | course to graduate
         | 
         | I would like to point out that 300 and 400 level courses in the
         | CS program usually had no more than 8 students. I distinctly
         | remember meeting in a closet for one of my classes, because we
         | had so few students they couldn't justify giving us a
         | classroom.
         | 
         | Contrast that with the math department where I wanted to take
         | some courses in parallel rather than serial. After a short
         | conversation with the professor he said "ok sure, seems alright
         | to me".
        
           | bionsystem wrote:
            | Why do you guys think such things happen?
        
             | KRAKRISMOTT wrote:
             | Academia being inelastic and refusing to adapt in the face
             | of market forces.
        
           | jollyllama wrote:
          | > Other majors were default prohibited from taking computer
          | > science courses under the guise of a shortage of places in
          | > classes. Even when those majors required a computer science
          | > course to graduate
           | 
           | I went to an institution that did the opposite; seats were
          | reserved for non-CS majors despite a shortage of sections.
           | This resulted in CS undergrads waiting for courses just so
           | they could graduate. It was frustrating because it felt like
           | the department was taking care of outsiders over its own.
        
         | havnagiggle wrote:
         | That was nice of you!
        
           | yjftsjthsd-h wrote:
           | One might even say that it had a very high nice value;)
        
         | lemper wrote:
          | i think i've read this kind of reply before, and i was not
          | wrong [1]. nice story to tell. too bad my uni didn't have those
          | kinds of facilities and opportunities.
         | 
         | [1] https://news.ycombinator.com/item?id=34197024
        
         | throwaway1777 wrote:
         | Prof might've done you a favor. Seems like you didn't need that
         | class anyway.
        
           | spiritplumber wrote:
           | I wanted the credit hours though :)
        
         | anonymousDan wrote:
          | Parallel Knoppix sounds cool. Did the OSes on each machine
         | coordinate in any way at the kernel level? Or was it all user
         | level libs/apps/services?
        
       | walrus01 wrote:
        | If you want "hyperscale" in your homelab, the bare-metal
        | hypervisor needs to be x86-64, because unless you literally work
        | for Amazon or a few others, you are unlikely to be able to
        | purchase competitively priced and speedy ARM-based servers.
        | 
        | There is still near-zero mass-market availability of CPUs you can
        | stick into motherboards from one of the top ten Taiwanese vendors
        | of serious server-class motherboards.
        | 
        | And don't even get me started on the inability to actually buy a
        | Raspberry Pi in your desired configuration at a reasonable price,
        | in stock, ready to hit _add to cart_.
        
         | vegardx wrote:
         | Supermicro launched a whole lineup of ARM-based servers last
          | fall. They seem to mostly offer complete systems for now, but
          | as far as I understand that's mostly because there are still
          | some minor issues to iron out in terms of broader support.
        
       | ultra_nick wrote:
       | I'd like to buy a laptop that's also a fault tolerant cluster.
        
       | Saris wrote:
        | It's too bad ARM boards are so expensive; it makes them nearly
        | pointless for projects unless you need the GPIO.
        
       | Aissen wrote:
       | Multiple server vendors now have Ampere offerings. In 2U, you can
       | have:
       | 
        | * 4 Ampere Altra Max processors (in 2 or 4 servers), so about 512
        | cores, and much faster than anything those Raspberry Pis have.
        | 
        | * lots of RAM, probably about 4TB?
        | 
        | * ~92TB of flash storage (or more?)
        | 
        | _Edit_: I didn't want to disparage the compute blade, it looks
        | like a very fun project. It's not even the same use case as the
        | server hardware (and it's probably the best solution if you need
        | _actual_ Raspberry Pis); the only common thread is the 2U and
        | rack use.
        
         | dijit wrote:
          | Those things are insanely expensive though; I priced a 2-core
          | machine at 20,000 EUR without much RAM or SSDs.
         | 
         | I'm keeping my eyes open though.
        
           | aeyes wrote:
           | Try the HPE RL300, should be more reasonably priced but I
           | couldn't get a quote because availability seems to be
           | problematic at the moment.
        
           | Aissen wrote:
           | An open secret of the server hardware market: public prices
           | mean nothing and you can get big discounts, even at low
           | volume.
           | 
           | But of course the config I talked about is maxed-out and
           | would probably be more expensive than 20k. It would be
           | interesting to compare the TCO with an equivalent config, and
           | I wouldn't be surprised to see the server hardware still win.
        
       | bogwog wrote:
       | $60 per unit sounds pretty good. Does anyone have experience
       | cross compiling to x86 from a cluster of Pis and can say how well
       | it performs? A cheap and lower-power build farm sounds like an
       | awesome thing to have in my house.
        
       | thejosh wrote:
       | These would be awesome for build servers, and testing.
       | 
       | I really like Graviton from AWS, and Apple Silicon is great, I
       | really hope we move towards ARM64 more. ArchLinux has
       | https://archlinuxarm.org , I would love to use these to build and
       | test arm64 packages (without needing to use qemu hackery, awesome
       | though that it is).
        
       | bullen wrote:
        | This is a lot cheaper, quieter, and smaller:
       | 
       | http://move.rupy.se/file/cluster_client.png
        
       | pnathan wrote:
       | This looks cool!
       | 
       | I would, however, say that while I'm in the general target
       | audience, I won't do crowdfunded hardware. If it isn't actually
       | being produced, I won't buy it. The road between prototype and
       | production is a long one for hardware.
       | 
       | (Still waiting for a _very_ cool bit of hardware, 3+ years later
       | - suspecting that project is just *dead*)
        
       | throwaway67743 wrote:
       | Yes, this, more of this!
        
       | bashinator wrote:
       | There's no backplane - all power and communication goes through a
       | front-facing ethernet port. Kind of defeats the purpose of a
       | blade form factor IMO.
        
       | alex_suzuki wrote:
       | Ah, do you feel it too? That need to own some of these, even
       | though you have zero actual use for them.
        
         | petesergeant wrote:
         | Nothing generates that feeling for me like seeing these things:
         | 
         | https://store.planetcom.co.uk/products/gemini-pda-1
         | 
         | I absolutely can't imagine what I'd use it for, and yet, my
         | finger has hovered over "buy" many many times over the last few
         | years
        
           | alex_suzuki wrote:
           | Reminds me of the Psion Series 5 which I owned more than
           | twenty years ago... and even then, had little use for. :^)
           | https://en.wikipedia.org/wiki/Psion_Series_5
        
             | petesergeant wrote:
             | Exactly that. I used to thumb through the back pages of
             | Personal Computer World[0] under the covers as a kid
             | looking at the palmtops. I think it's mostly nostalgia
             | 
             | 0: https://en.wikipedia.org/wiki/Personal_Computer_World
        
               | alex_suzuki wrote:
               | Good times, good times.
        
           | LanternLight83 wrote:
            | Doesn't look too far off from the PinePhone with its
            | keyboard.
        
           | staindk wrote:
           | I feel that way about the ClockworkPi consoles [1]
           | 
           | There's a 5% chance that I fall madly in love with this thing
           | and go tinker on some project in a coffee shop every
           | weekend... but it's much more likely that I end up almost
           | never using it :|
           | 
           | [1] https://www.clockworkpi.com/shop
        
         | fy20 wrote:
         | I think I could justify the world's most secure and reliable
         | Home Assistant cluster with automatic failover...
        
           | pbronez wrote:
           | Yeah that's my thought. The main benefit to this is High
           | Availability. You're not going to get compelling scale-out
           | performance, but you can protect yourself from local hardware
           | failures.
           | 
            | Of course, then you have to ask if you need the density.
            | There are lots of ways to put RPis in a rack... and this
            | approach gives up HAT compatibility for density.
           | 
            | For example, I'm considering a rack of RPis with HiFiBerry
            | DACs for a multi-zone audio system. This wouldn't help me
           | there.
        
           | criddell wrote:
           | Frankly, the bar for that is pretty low...
        
         | Hamuko wrote:
         | I don't feel like I have zero actual use for them. The amount
         | of Docker containers I have running on my NAS is only ever
         | going up. These could make for a nice, expandable Kubernetes
         | cluster.
         | 
          | Whether that's a good use case is a whole other thing.
        
       | lars-b2018 wrote:
       | It's not clear to me how to build a business based on RPi
       | availability. And the clones don't seem to be really in the game.
       | Are Raspberry Pis becoming more readily available? I don't see
       | that.
        
         | nsteel wrote:
         | Businesses and consumers don't see the same availability,
         | apparently. And yes, they are very slowly becoming more
         | available. But still no Pi 4 about.
        
         | goodpoint wrote:
         | Correct. These are for hobbyists and there is no market.
        
       | amelius wrote:
       | How do we measure the performance of these kinds of systems?
        
         | geerlingguy wrote:
         | Not super fast but efficiency is okay:
         | https://github.com/geerlingguy/top500-benchmark#results
        
         | robbiet480 wrote:
         | The Blade is just a carrier for a Raspberry Pi CM4, so the
         | performance will be that of a normal CM4.
        
           | amelius wrote:
           | Ok, still it would be nice to have a line that says this
           | system can do X1 threads of X2 GFLOP/s and has a memory
           | bandwidth of X3 MB/s, or something like that.
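            | 
            | A rough sketch of how you could estimate that line for the
            | 40-blade box (the clock and FLOPs-per-cycle figures are
            | assumptions; measured HPL numbers come in well below peak):
            | 
            |     # theoretical-peak estimate for 40 CM4 blades
            |     blades = 40
            |     cores_per_blade = 4
            |     clock_ghz = 1.5        # stock CM4 clock
            |     flops_per_cycle = 4    # assumed FP64 FLOPs/cycle/core
            | 
            |     threads = blades * cores_per_blade
            |     peak_gflops = threads * clock_ghz * flops_per_cycle
            |     print(f"{threads} threads, ~{peak_gflops:.0f} GFLOP/s")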
        
             | ZiiS wrote:
              | Unfortunately, if you are asking that question, the
              | answer for all the Pis and clones is "Not enough by more
              | than an order of magnitude".
        
               | geerlingguy wrote:
                | The clones based on the RK3588 are approaching last-gen
                | Qualcomm speeds, so they're not as much of a letdown as
                | the 2016-era chips the Pi is based on.
               | 
               | And efficiency is much better than the Intel or AMD chips
               | you could get in a used system around the same price.
        
             | znpy wrote:
              | You can look at benchmarks for the RPi CM4 for that.
        
       | robbiet480 wrote:
       | Been waiting for this for over a year, was the first person to
       | buy a pre-purchase sample. Planning to set up a PXE k3s cluster.
        
       | atlgator wrote:
       | Why only 1 Gbps ethernet?
        
         | geerlingguy wrote:
         | That's the speed of the NIC built into the CM4. If you want 2.5
         | or 5 Gbps, you'd have to add a PCIe switch, adding a lot more
         | cost and complexity--and that would also remove the ability to
         | boot off NVMe drives :(
         | 
         | Hopefully the next generation Pi has more PCIe lanes or at
         | least a faster lane.
        
       | davgoldin wrote:
       | This looks very promising. I basically could print an enclosure
       | to specifically fit my home space. And easily print a new one
       | when I move.
       | 
       | More efficient use of space compared to my current silent mini-
       | home lab -- also about 2U worth of space, but stacked semi-
       | vertically [1].
       | 
       | That's 4 servers each with AMD 5950x, 128GB ECC, 2TB NVMe, 2x8TB
       | SSD (64c/512GB/72TB total).
       | 
       | [1] https://ibb.co/Jm1SX7d
        
         | LolWolf wrote:
         | Wait this is pretty sick! What's the full build on that? How do
         | you even get started on finding good cases that aren't just
         | massive racks for a home build?
        
       | wildekek wrote:
       | I have this cycle every 10 years where my home infra gets to
       | enterprise level complexity (virtualisation/redundancy/HA) until
       | the maintenance is more work than the joy it brings. Then, after
       | some outage that took me way too long to fix, I decide it is over
       | and I reduce everything down to a single modem/router and WiFi
       | AP. I feel the pull to buy this and create a glorious heap of
        | complexity to run my doorbell on and be disappointed, can't wait.
        
       | lacrosse_tannin wrote:
        | These are Pis, right? No hardware AES :/
        
       | blitzar wrote:
        | The blade has arrived, but can you get a compute unit to go in
        | it? The non-availability of the whole Pi ecosystem has done a lot
        | of damage.
        
         | preisschild wrote:
         | There are other CM-compatible SoMs.
         | 
         | Like the Pine64 SOQUARTZ
        
           | russelg wrote:
           | Geerling covers this in the accompanying video for this post.
           | He couldn't get it running due to no working OS images being
           | obtainable.
        
             | fivesixzero wrote:
             | I spent some time last week tinkering with a SOQuartz board
             | and ended up getting it working with a Pine-focused distro
             | called Plebian[1].
             | 
              | Took a while to land on it, though. Before that I tried all
             | of the other distros on Pine64's "SOQuartz Software
             | Releases"[2] page without any luck. The only one on that
             | page that booted was the linked "Armbian Ubuntu Jammy with
             | kernel 5.19.7" but it failed to boot again after an apt
             | upgrade.
             | 
             | So there's at least one working OS, as of last week. But
              | it's definitely quite finicky and would probably need some
             | work to build a proper device tree for any carrier board
             | that's not the RPi CM4 Carrier Board.
             | 
             | [1] https://github.com/Plebian-Linux/quartz64-images
             | 
             | [2] https://wiki.pine64.org/wiki/SOQuartz_Software_Releases
        
             | fellowmartian wrote:
             | I don't think these boards are meant for the way people are
             | trying to use them. Mainline Linux support is actually
             | great on RK3566 chips, but you have to build your own
             | images with buildroot or something like that.
        
             | blitzar wrote:
                | So those compute units are obtainable, but a functioning
                | image remains unobtainium. What a mess.
        
               | geerlingguy wrote:
               | You can usually get an image that functions at least
               | partially, but it's up to you to determine whether the
               | amount it functions is enough for your use case. A K3s
               | setup is usually good to go without some features like
               | display output.
        
               | blitzar wrote:
               | I like to tinker, but there is a limit.
               | 
                | The killer feature for their crowdfunding campaign would
                | be if they sourced a batch of Pi compute modules...
        
         | bombcar wrote:
         | The Rock5B is whipping the Pi on compute power and
         | availability. Only use a Pi if you absolutely have to.
        
           | blitzar wrote:
            | At $150+ I would just buy an old small-form-factor Dell / HP
            | from eBay and have a whole machine.
        
             | bogwog wrote:
              | I bought a retired dual-socket Xeon HP 1U server with 128GB
              | of ECC RAM for like $50 on eBay a while back. It only had
              | one CPU, but upgrading it to two would be very cheap.
              | 
              | Sure, it's a hulking, obsolete, and very loud beast, but
              | it's hard to beat the price-to-performance ratio there...
              | just make sure you don't put anything super valuable on it
             | because HP's old proliant firmware likely has a ton of
             | unpatched critical vulnerabilities (and you'd need an HP
             | support plan to download patches even if they exist)
        
             | celestialcheese wrote:
             | 100% this.
             | 
             | I picked up a HP 705 G4 mini on backmarket for $80 shipped
             | the other day to run Home Assistant and some other small
             | local containers. 500gb ram, Ryzen 5 2400GE, 8gb ddr4 w/ a
             | valid windows license.
             | 
             | Sure it's not as small or silent, but there's no way to
             | beat the prices of these few-years old enterprise mini-pc's
        
       | ChuckMcM wrote:
       | That is a neat setup. I wish someone would do this but just run
       | RMII out to an edge connector on the back. Connect them to a
       | jelly bean switch chip (8 port GbE are like $8 in qty) Signal
       | integrity on, at most 4" of PCB trace should not be a problem.
       | You could bring the network "port status" lines to the front if
       | you're interested in seeing the blinky lights of network traffic.
       | 
       | The big win here would be that all of the network wiring is
       | "built in" and compact. Blade replacement it trivial.
       | 
        | Have your fans blow up from the bottom and stagger "slots" on
        | each row, and if you do 32 slots per row, you could probably
        | build a kilocore cluster in a 6U box.
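        | 
        | A quick sketch of how the kilocore figure works out (the number
        | of rows that fit in 6U is a guess):
        | 
        |     # back-of-envelope: cores in a 6U box of CM4-class modules
        |     slots_per_row = 32
        |     rows = 8               # assumed rows in a 6U chassis
        |     cores_per_module = 4   # CM4-class module
        | 
        |     print(slots_per_row * rows * cores_per_module)  # 1024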
       | 
        | Ah, the fun I would have with a lab with a nice budget.
        
         | themoonisachees wrote:
          | Couldn't you do 1Ki cores / 4U with just Epyc CPUs in normal
         | servers? At that point surely for cheaper, also significantly
         | easier to build, and faster since the cores don't talk over
         | Ethernet?
        
         | zokier wrote:
         | > That is a neat setup. I wish someone would do this but just
         | run RMII out to an edge connector on the back
         | 
         | That stuck out to me too, they are making custom boards and
         | custom chassis, surely it would be cleaner to route the
          | networking and power through a backplane instead of having a
          | gazillion tiny patch cables and a random switch just hanging
          | in there. Could also avoid the need for PoE by just having power
         | buses in the backplane.
         | 
         | Overall imho the point of blades is that some stuff gets
         | offloaded to the chassis, but here the chassis doesn't seem to
         | be doing much at all.
        
         | nine_k wrote:
         | What kind of fun might that be?
        
           | ChuckMcM wrote:
            | Well, for one, I'd build a system architecture I first
            | imagined back at Sun in the early 90s: a NUMA-fabric-attached
            | compute/storage/IO/memory scalable compute node.
           | 
           | Then I'd take a shared nothing cluster (typical network
           | attached Linux cluster) and refactor a couple of algorithms
           | that can "only" run on super computers and have them run
           | faster on a complex that costs 1/10th as much. That would be
           | based on an idea that was generated by listening to IBM and
           | Google talk about their quantum computers and explaining how
           | they were going to be so great. Imagine replacing every
            | branch in a program with an assert that aborts the program on
            | failure. You send 10,000 copies to 10,000 cores
           | with the asserts set uniquely on each copy. The core that
           | completes kicks off the next round.
        
       | sroussey wrote:
       | Apple should go with a blade design for the Mac Pro. Just stick
       | in as many M2 Ultra blades as you need to up the compute and
       | memory.
       | 
       | Will need to deal with NUMA issues on the software side.
        
         | geerlingguy wrote:
          | I would be all over any server-like form factor for M-series
         | chips. The efficiency numbers for the CPU are great.
        
       | eismcc wrote:
        | It's amazing to see how far these systems have come since The
        | Verge covered me in 2014, when I built a multi-node Parallella
        | cluster. The main problem I had then was that there was no
        | off-the-shelf GPU-friendly library to run on it, so I ended up
        | working with the Cray Chapel project to get some distributed
        | vectorization support. Of course, that's all changed now.
       | 
       | https://www.theverge.com/2014/6/4/5779468/twitter-engineer-b...
        
       ___________________________________________________________________
       (page generated 2023-01-24 23:00 UTC)