[HN Gopher] ZFS on a single core RISC-V hardware with 512MB
___________________________________________________________________
 
ZFS on a single core RISC-V hardware with 512MB
 
Author : magicalhippo
Score  : 77 points
Date   : 2022-03-13 17:16 UTC (5 hours ago)
 
(HTM) web link (andreas.welcomes-you.com)
(TXT) w3m dump (andreas.welcomes-you.com)
 
  | 2Gkashmiri wrote:
  | Hope this, coupled with more tech on RISC-V hardware, can bring
  | it to the level of the Raspberry Pi, with all the community,
  | hardware devices, accessories and all that.
  |
  | Will it take a decade? Less?
 
  | FullyFunctional wrote:
  | I hope it won't be a decade, but remember the (original)
  | Raspberry Pi launched on a very mature part, with a _very_
  | mature (ancient) ISA.
  |
  | Outside the discount pricing, Intel has promised to tape out
  | SiFive's P650. Rivos, Tenstorrent, and others are also working
  | on fast cores, but it'll be at least 2-3 years before they hit
  | the market, if at all.
  |
  | So far SiFive's dual-issue in-order core (~ 40 on Geekbench
  | 5.4.1), like on the now-cancelled BeagleV, is the fastest chip
  | you can buy as a lay person. The D1 (~ 32 on Geekbench 5.4.1)
  | is cheaper but less powerful.
 
  | michaelmrose wrote:
  | There has never been a reason for memory to be correlated with
  | storage capacity, nor any reason to believe that such a
  | correlation ought to exist.
  |
  | Nobody ever said, "Well, I plugged in a 20TB external hard
  | drive, so I'd better plug in a few more sticks of RAM so that
  | works."
  |
  | Dedup needs RAM in proportion to storage because it maintains
  | an entry in an in-memory table for each unique block.
 
  | lazide wrote:
  | You, uh, just kind of contradicted yourself?
  |
  | All file systems have metadata which is good to keep in
  | memory. Having built several 50+TB NAS boxes recently, I can
  | say it isn't just ZFS either. And the penalty for not having
  | enough RAM isn't always some linear performance hit. It can be
  | kernel panics, exponential decay in performance, etc.
 
  | michaelmrose wrote:
  | I didn't contradict myself at all: virtually nobody needs
  | dedup; it's not remotely worth the RAM cost for 99.9% of
  | users.
  |
  | Can you quantify what you are saying? What OS/filesystem?
  | What minimum RAM requirements for what amount of storage?
 
  | lazide wrote:
  | That's a good point - I've seen free memory drop every time
  | I've built the larger file systems (and not just from the
  | cache), but I never tried to quantify it. And I don't see any
  | good stats or notes on it.
  |
  | Seems like no one is building these larger systems on boxes
  | small enough for it to matter, or at least Google isn't
  | finding it.
 
  | michaelmrose wrote:
  | Another way of saying this: RAM usage doesn't meaningfully
  | scale with storage size for the storage systems actual,
  | non-theoretical people encounter, because the minimum RAM
  | available on any system one encounters is sufficient to
  | service the amount of storage it is possible to attach to
  | said system.
 
  | magicalhippo wrote:
  | > There has never been a reason for memory to be correlated
  | with storage capacity, nor any reason to believe that such a
  | correlation ought to exist.
  |
  | However, specific implementations can indeed have memory
  | requirements that scale with storage capacity. For example, if
  | the implementation keeps the bitmap of free space in memory,
  | then more storage = larger bitmap = more memory required.
  |
  | There have been several attempts in ZFS to reduce memory
  | overhead. I'm pretty sure that if you took a decade-old
  | version of ZFS, you'd struggle to run it on a system with
  | 512MB of RAM.
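(To put rough numbers on the free-space-bitmap example above: the
sketch below is a back-of-the-envelope illustration only, not how
ZFS actually tracks free space; ZFS uses space maps, and the 4 KiB
block size here is an assumption.)

    # Rough sketch: RAM needed to keep a flat free-space bitmap
    # entirely in memory, at one bit per block. The 4 KiB block
    # size is assumed; ZFS itself uses space maps, not a bitmap.
    def bitmap_bytes(storage_bytes, block_size=4096):
        blocks = storage_bytes // block_size
        return blocks // 8  # one bit per block

    TiB = 2**40
    for tib in (2, 20, 200):
        mib = bitmap_bytes(tib * TiB) / 2**20
        print(f"{tib:>3} TiB of storage -> {mib:,.0f} MiB of bitmap")

    # 2 TiB -> 64 MiB, 20 TiB -> 640 MiB, 200 TiB -> 6,400 MiB.
    # A structure like this would swamp a 512MB machine long
    # before the pool got large, which is why implementations
    # avoid keeping it fully resident.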
  | michaelmrose wrote:
  | At present, 512MB of RAM is notable for how ridiculously tiny
  | it is, and 2TB is still an acceptable amount of storage.
  | Without resorting to decades-obsolete software, can you put a
  | pin on exactly how much storage it would take to render that
  | tiny amount of RAM unusable, and then how much storage it
  | would take to render a machine with 4GB of RAM likewise
  | unusable, so that we may demonstrate memory usage scaling
  | with storage?
 
  | FullyFunctional wrote:
  | Ha, this is awesome, thanks for checking that out. One point
  | of note though: I'm pretty sure it would have been faster to
  | build the kernel + OpenZFS on Debian/RISC-V in QEMU. QEMU on
  | decent hardware runs very fast, much faster than the D1.
  |
  | ADD: Geekbench 5.4.1 on RISC-V
  |
  | - under QEMU / Ryzen 9 3900XT: 82
  |
  | - under QEMU / M1: 76
  |
  | - native D1: 32 (https://browser.geekbench.com/v5/cpu/13259016)
  |
  | The M1 result is skewed because for some reason AES emulation
  | is much faster on the Ryzen. The rest of the integer stuff is
  | faster on the M1, up to 30% faster.
 
  | bombcar wrote:
  | Anyone have low-power RISC-V or ARM hardware that supports
  | many SATA ports?
 
  | mustache_kimono wrote:
  | Don't know why no one has made such a board a priority. Seems
  | like a sweet spot.
 
  | vorpalhex wrote:
  | Would also love to see this, even if it's experimental or
  | beta.
  |
  | PiBox is the only contender I am aware of.
 
  | dark-star wrote:
  | But can it do dedupe on such a box? I think the recommendation
  | is still "1GB of RAM for each TB of storage" if you're using
  | dedupe...
  |
  | I still have some boards with ~512MB RAM lying around (an
  | UltraSPARC, for example) that I'd love to repurpose as a cheap
  | NAS, just for the heck of doing it on a non-x86 platform...
 
  | Wowfunhappy wrote:
  | Yeah, I think the author may be mixing up recommendations for
  | dedup vs. non-dedup. The solution is always to not enable
  | dedup; it's a niche feature that's not worthwhile outside of
  | _very_ specific scenarios.
 
  | R0b0t1 wrote:
  | The speed you want them to run at is also a factor. The rule
  | of thumb hasn't applied for a while; he's right to note that
  | in the post.
 
  | Wowfunhappy wrote:
  | I thought it does generally apply for dedup, though, because
  | ZFS is then required to keep the dedup tables in memory?
 
  | lazide wrote:
  | I've tried dedup out, and even on a large, powerful box with a
  | LOT of duplicate files (multi-TB repositories of media files
  | which get duplicated several times due to coarse snapshotting
  | from other, less fancy systems), I got near-zero
  | deduplication. I think it was literally low single-digit
  | percentages.
  |
  | ZFS dedup is block-based, and the actual block size varies
  | with the data feed rate for most workloads (ZFS queues up
  | async writes and merges them), so in practice, once a file
  | picks up a non-zero block offset somewhere (which happens all
  | the time), even identical files don't dedup.
 
  | Wowfunhappy wrote:
  | Wow, that's worse than I realized! Honestly, this makes me
  | wonder whether the feature should even exist in ZFS. Given the
  | enormous hardware requirements and minimal savings... well,
  | I'd be curious to hear if anyone has ever found a real use
  | case.
 
  | FullyFunctional wrote:
  | Does anyone actually use dedup? I think even the OpenZFS
  | documentation says compression is more useful in practice. If
  | anything, dedup should be an offline feature, to be run as
  | scheduled by the operator.
  |
  | My setup tries to get the absolute highest bandwidth and uses
  | NVMe sticks in a stripe (I get my redundancy elsewhere), no
  | compression, no dedup, and yet it can only hit ~3.5 GB/s reads
  | (TrueNAS Core, EPYC 7443P, Samsung 980PRO, 256 GiB). I hope
  | TrueNAS SCALE will perform better.
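(A rough sense of where the "1GB of RAM per TB" rule of thumb
discussed above comes from: ZFS keeps a dedup table (DDT) entry per
unique block, and tuning guides commonly cite roughly 320 bytes of
RAM per in-core entry. The sketch below is an approximation; the
320-byte figure and the block sizes are assumptions, and the real
footprint depends on the pool and the ZFS version.)

    # Rough DDT sizing sketch: RAM needed if every unique block
    # of a pool has an in-core dedup-table entry. The 320-byte
    # entry size is an assumed, commonly cited figure.
    DDT_ENTRY_BYTES = 320

    def ddt_ram_gib(unique_tib, avg_block_kib):
        blocks = unique_tib * 2**40 // (avg_block_kib * 2**10)
        return blocks * DDT_ENTRY_BYTES / 2**30

    for blk_kib in (8, 64, 128):
        print(f"1 TiB unique data @ {blk_kib:>3} KiB blocks -> "
              f"{ddt_ram_gib(1, blk_kib):5.2f} GiB of DDT")

    # @ 8 KiB -> 40.00 GiB, @ 64 KiB -> 5.00 GiB,
    # @ 128 KiB -> 2.50 GiB. So "1GB per TB" is on the optimistic
    # side unless blocks are large or only part of the table stays
    # resident, which fits the thread's point that the old rule of
    # thumb is shaky.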
  | watersb wrote:
  | My first-ever large (>4TB) ZFS pool is still stuck with dedup.
  | It's a backup server, and it gets about 2x with deduplication.
  |
  | At the time, it was the difference between slow and
  | impossible: I couldn't afford another 2x of disks.
  |
  | These days, the pool could fit on a portable SSD that would
  | fit in my pocket.
  |
  | Careful, file-based dedup on top of ZFS might be more
  | effective.
  |
  | Small changes to single, large files see some advantage with
  | block-based deduplication. You see this in collections of disk
  | images for virtual machines.
  |
  | You might see that in database applications, depending on log
  | structure. I don't know; I don't have that experience.
  |
  | For most of us, file-based deduplication might work out
  | better, and it is almost certainly easier to understand. You
  | can come up with a mental model of what you're working with,
  | dealing with successive collections of files.
  |
  | Even though files are just another abstraction over blocks,
  | it's an abstraction that leaks less without the deduplication.
  |
  | I haven't used a combination of encryption and deduplication.
  | That was Really Hard for ZFS to implement, and I'm not sure
  | how meaningful such a combination is in practice.
 
  | bombcar wrote:
  | It would be nice if ZFS were able to combine dedup and
  | compression: basically, be able to notice that a
  | block/file/datastream was similar/identical to another one,
  | and do compression along with a pointer...
 
  | lazide wrote:
  | Practically speaking, the tradeoffs to make that work are
  | unlikely to make you or anyone else happy except in some VERY
  | specific workloads.
 
  | willis936 wrote:
  | ZFS can have both features enabled at once.
  |
  | Though there is no clean way to disable either. Compression
  | can be removed from files by rewriting them, but removing
  | deduplication requires copying all data over to a fresh pool.
 
  | mlok wrote:
  | ZFS dedup has been wonderful for me: dedupratio = 7.05x
  | (144 GB stored on a 25 GB volume, with 1.3 GB still free). I
  | use it for backups of versions of the same folders and files,
  | slowly evolving over a long period of time (>15 years); that
  | gives a lot of duplication, of course. (I could also use
  | compression on top of it.)
 
  | magicalhippo wrote:
  | While regular dedup is only a win for _highly_ specific
  | workloads, the file-based deduplication[1][2] which is in the
  | works seems like it has some potential.
  |
  | They discussed it, along with some options for a
  | background-scanning dedup service (trying to find potential
  | files to dedup), in the February leadership meeting[3].
  |
  | [1]: https://openzfs.org/wiki/OpenZFS_Developer_Summit_2020_talks...
  |
  | [2]: https://youtu.be/hYBgoaQC-vo
  |
  | [3]: https://www.youtube.com/watch?v=hij7PGGjevc
 
  | spullara wrote:
  | Wow, that is a very naive dedup algorithm.
 
  | lazide wrote:
  | Without restricting pretty heavily how you can interact with
  | files, or causing severe bottlenecks, that's probably the best
  | that can be done, since the FS API doesn't provide any real
  | guarantees about what data WILL be written later, how much of
  | it, etc. So it has to figure things out as it goes, with
  | minimal performance impact.
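(A small illustration of the alignment problem lazide described
earlier in the thread: fixed-size block dedup finds nothing to
share once identical data is shifted by even one byte. This is a
sketch, not ZFS's actual logic; the 128 KiB block size is an
assumption.)

    # Two byte-identical payloads, one shifted by a single byte,
    # chunked into fixed-size blocks and hashed. A block-level
    # dedup table matches on block hashes, so zero shared hashes
    # means zero deduplication.
    import hashlib
    import os

    BLOCK = 128 * 1024  # assumed 128 KiB recordsize

    def block_hashes(data):
        return {hashlib.sha256(data[i:i + BLOCK]).hexdigest()
                for i in range(0, len(data), BLOCK)}

    payload = os.urandom(1024 * 1024)          # 1 MiB of data
    aligned = block_hashes(payload)
    shifted = block_hashes(b"\x00" + payload)  # 1-byte offset

    print("shared blocks:", len(aligned & shifted))  # prints 0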
___________________________________________________________________ (page generated 2022-03-13 23:00 UTC)