[HN Gopher] M1 Icestorm cores can still perform well
___________________________________________________________________
M1 Icestorm cores can still perform well

Author : ingve
Score  : 51 points
Date   : 2021-09-01 08:02 UTC (1 hours ago)

(HTM) web link (eclecticlight.co)
(TXT) w3m dump (eclecticlight.co)

| simondotau wrote:
| TLDR: Based on a single simple synthetic benchmark, the
| low-performance "Icestorm" cores were shown to be as much as
| 52%--or as little as 18%--of the performance of the primary
| "Firestorm" cores. Highly efficient assembly showed the least
| performance drop whereas complex "idiomatic" Swift code showed the
| greatest performance drop.
|
| However the Icestorm cores also use substantially less energy so
| they are an efficiency win regardless. Plus they take up
| significantly less physical space, which is a large cost saving
| for the SoC part.
  | Filligree wrote:
  | How significantly less, I wonder?
  |
  | For my workloads it'd be an overall win to have more cores at
  | that speed. The more the better; I'd cap out at maybe a
  | hundred or so.
  |
  | Obviously Firestorm is better, but a hundred-core desktop CPU
  | at present seems... unlikely.
    | maccard wrote:
    | AMD [0] would like a word. It's 64 cores, but 128 with
    | hyperthreading.
    |
    | [0] https://www.amd.com/en/products/cpu/amd-ryzen-threadripper-3...
      | Filligree wrote:
      | So, not a hundred-core processor yet.
      |
      | I can't use hyperthreading. It does give a 60% speed boost,
      | but it's also disabled in production so...
  | OskarS wrote:
  | > Highly efficient assembly showed the least performance drop
  | whereas complex "idiomatic" Swift code showed the greatest
  | performance drop.
  |
  | I wonder what this means. The efficient assembly probably has
  | fewer instructions that lean more heavily on vector
  | instructions and floating-point calculations, while the
  | "idiomatic" Swift probably just has a larger number of
  | instructions that aren't doing heavy calculation.
  | Does that imply then that the high-performance cores do much
  | deeper pipelining, but that the number of floating-point units
  | or whatever is probably pretty similar across both types?
    | simondotau wrote:
    | My initial guess is that it's because Icestorm cores have
    | less L1 and L2 cache, resulting in more frequent cache misses
    | in complex loops. I'm by no means an expert in any of this,
    | so I really have no place hypothesising.
    |
    | Firestorm has 128KB L1 per core and 12MB shared L2.
    |
    | Icestorm has 64KB L1 per core and 4MB shared L2.

| webmobdev wrote:
| big.LITTLE Processing: Defining the Future of SoC Architecture -
| https://www.samsung.com/semiconductor/minisite/exynos/newsro...
|
| With this CPU design some cores are optimised for performance (at
| the expense of using more power) while some cores are optimised
| for efficiency (using the least power at the expense of computing
| performance). This makes sense for laptops and smartphones, as it
| can save power and thus let them run longer on battery. But (in
| my opinion) not for desktop PCs, where most people care more
| about computing performance than saving a few watts.
  | Synaesthesia wrote:
  | Most of the time your PC isn't working hard, and it makes sense
  | to use lower-power cores to perform basic tasks.
  | simondotau wrote:
  | I'm not sure that you could make a case for this not making
  | sense in a desktop computer, as everything is ultimately a
  | trade-off.
  |
  | It's fairly clear that the Icestorm cores represent a gain in
  | performance per watt, and also per unit of die area. The four
  | Icestorm cores and their support infrastructure take up about
  | the same physical space as one Firestorm core with its support
  | infrastructure.
  |
  | I doubt that an M1 with five Firestorm cores would perform as
  | well as the eight cores we did get.
  | m_eiman wrote:
  | Saving watts means lowering fan RPM, meaning less noise. And
  | that's a big priority for many.
    | n1000 wrote:
    | Also, aren't desktop CPUs constrained by thermal load at some
    | point, or can we use ever bigger coolers? Personally, I find
    | it almost obscene that my desktop PC consumes roughly as much
    | as a good old incandescent lightbulb (60+ W) _while idling_.
    | My laptop uses as much under full load.
      | Synaesthesia wrote:
      | Your PC uses 60 W idling? Is that with the screen? It's not
      | too much in that case. CPUs and GPUs have gotten a lot
      | better at idle power consumption, and PSUs are also quite
      | efficient these days.
    | Roritharr wrote:
    | Not only that, I'd prefer to have a small amount of RAM and
    | CPU running 24/7 for always-on features that I'd love to have
    | my PC doing.
    |
    | I don't like having to run a Pi for some stuff just because I
    | don't want my huge tower running all the time. It would be
    | really neat if it could run at anything between 5 and 600 W;
    | not sure though if PSUs would be able to offer that range.

| shantara wrote:
| I didn't realize map-reduce was so much slower than a regular
| looped multiplication, regardless of the hardware the code was
| running on.

| codetrotter wrote:
| If the author is here and able to do so, I'd much appreciate it
| if they would share the complete code for the benchmarking as a
| whole, so that others may use it for benchmarking other code in
| the same way :)
___________________________________________________________________
(page generated 2021-09-01 10:00 UTC)
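
The trade-off simondotau raises in the thread (five Firestorm cores versus the four Firestorm plus four Icestorm cores that shipped) can be roughed out from the figures quoted in the comments alone. This is only a back-of-the-envelope sketch, not anything from the article: it assumes perfectly parallel work, linear scaling across cores, and that four Icestorm cores cost exactly the die area of one Firestorm core. The function names are made up for illustration.

```python
# Rough throughput comparison in "Firestorm-core equivalents".
# Assumptions (not from the article): perfectly parallel work, linear
# scaling across cores, and that four Icestorm cores cost exactly the
# die area of one Firestorm core. All names here are hypothetical.

FIRESTORM_CORES = 4  # performance cores in the shipped M1
ICESTORM_CORES = 4   # efficiency cores in the shipped M1


def mixed_throughput(icestorm_ratio: float) -> float:
    """Throughput of 4 Firestorm + 4 Icestorm cores, where
    icestorm_ratio is one Icestorm core's speed relative to Firestorm."""
    return FIRESTORM_CORES + ICESTORM_CORES * icestorm_ratio


def all_firestorm_throughput() -> float:
    """Throughput of the hypothetical same-area chip: 5 Firestorm cores."""
    return float(FIRESTORM_CORES + 1)


def break_even_ratio() -> float:
    """Icestorm/Firestorm speed ratio at which both layouts tie."""
    return (all_firestorm_throughput() - FIRESTORM_CORES) / ICESTORM_CORES


if __name__ == "__main__":
    for ratio in (0.18, 0.52):  # range quoted in the thread
        print(f"ratio {ratio:.2f}: mixed = {mixed_throughput(ratio):.2f}, "
              f"five Firestorm = {all_firestorm_throughput():.2f}")
    print(f"break-even ratio = {break_even_ratio():.2f}")
```

Under these assumptions the shipped layout comes out at about 4.72 Firestorm-equivalents at the 18% end of the quoted range and about 6.08 at the 52% end, against 5.00 for five Firestorm cores, with the tie at a ratio of 0.25. So on raw throughput the answer depends on the workload, and the estimate leaves out the energy savings the thread also mentions.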