[HN Gopher] We saved millions in SSD costs by upgrading our file... ___________________________________________________________________ We saved millions in SSD costs by upgrading our filesystem Author : kmdupree Score : 334 points Date : 2021-11-09 17:41 UTC (5 hours ago) (HTM) web link (heap.io) (TXT) w3m dump (heap.io) | londons_explore wrote: | Postgres is copy on write. (A modified record is copied to a new | page when written, and the old record left in place till a | vacuum). | | ZFS is Copy-on-write (One byte written in a block requires the | whole block be rewritten, with the old one scheduled for | reclamation). | | The underlying SSD wear levelling algorithm is Copy on Write | (Writing a single byte involves writing the data from the page to | a new block, and then erasing the old one sometime later). | | That means a tiny 1 byte modification to a postgres record | involves creating many new unnecessary copies of a lot of data... | | I imagine that if the three layers could be combined, dramatic | performance benefits could happen, since written data might go | down by an order of magnitude at least.... | ComodoHacker wrote: | > A modified record is copied to a new page when written | | AFAIK a modified record is copied to the same page if there is | enough space (which you can tune). | storyinmemo wrote: | This is where RocksDB or other SSTable storage systems really | line up well with SSDs. The write amplification reduction can | improve throughput and disk life by multiples. | | https://engineering.fb.com/2016/08/31/core-data/myrocks-a-sp... | touisteur wrote: | Oh, a postgres disk controller / fs / storage engine would be an | interesting project. | jbverschoor wrote: | That's what you get when everybody is hiding abstractions in 10 | layers. | | The same thing happens with all the layers of virtual-machines | josteink wrote: | To be fair you have some virtualisation stacks (like | QEMU/KVM) which go in the opposite direction: | | It provides VMs with drivers for purely virtual "virtio" | devices (storage, network, etc) with no effort or overhead | put into mimicking the mechanics of a real sas/sata/scsi | device or whatever. | | The result is less complexity and a much more direct data- | path so things should hopefully not only be more performant, | but also more stable. | strainer wrote: | Not sure about SSD but filesystem COW doesn't only entail | writing and freeing whole blocks, it entails "don't make an | _actual_ copy _until_ write" | | A COW filesystem means you can make (virtual) copies of | files/blocks without writing (duplicating) them on the media. They | only write metadata for bare copies and delay duplicating data | until a virtual duplicate is altered. | | Actual writes are written into a free block, then the old block | is marked clear. The old block is not copied and then written | over. In my understanding that's not what COW characterizes - | it's referring to how copying data is almost free (only costs | metadata changes in COW filesystems) until copies are altered | (written to). | londons_explore wrote: | These semantics are true at all three layers. | | In Postgres, transactions work on a "snapshot" of the data | that existed at one point in time. That snapshot is logically | a copy of the data, but in reality uses copy-on-write of | records to avoid having to make a copy of the entire database | at the start of any transaction. | | In ZFS, it works as described. | | In SSDs, operating system 'write' commands are treated as | transactions - ie.
certain ordering semantics must be | preserved in case of a power failure. Since performance is | improved by having extra parallelism and not doing the actual | operations in the order they are presented by the OS, a copy- | on-write model is used to ensure that an incomplete | transaction can be rolled back. This isn't supposed to be | user-visible, but occasionally in a badly broken SSD, you | hear users complaining of 'it works fine, but then when I | reboot my computer everything I did is undone'! Well that's | because no transactions are committing... | tmikaeld wrote: | If anything, it would drastically increase the lifetime of the | SSDs. | jandrese wrote: | SSDs already do copy-on-write internally, so this doesn't | change much. The files are slightly smaller so it may save a | few writes here and there, but I wouldn't expect a drastic | change one way or another. | wtallis wrote: | I'm not sure it's an order of magnitude reduction in real flash | writes, but there have been some pretty promising results from | experiments with collapsing those three layers. Moving it all | to the SSD gets you a Key-Value SSD, and moving it all to the | host system gets you a Zoned SSD. Both ideas have seen enough | interest from storage vendors and hyperscale cloud to end up | standardized. So at this point it's mostly a matter of getting | support added to common software like Postgres so that it's | easy for smaller operations to adopt. | blibble wrote: | wear levelling isn't really copy on write, is it? | | the original block will be removed from the mapping entirely | (not referred to by a copy) | londons_explore wrote: | Typically flash devices can't overwrite data directly, and | can only erase very large blocks (ie. 1MB or more). | | That means any write of data must be made to a new, freshly | erased, location. | | The old version of the data, sitting in the middle of a large | eraseblock, is no longer used, but cannot be reused until | everything else in the eraseblock is either unused or copied | elsewhere. | | In the worst case (of a nearly full SSD), it means that for | every 1 byte write into a 4k page, a full 1Mbyte of data | needs to be copied. Typical cases are better (when the drive | has plenty of spare space, so can delay the reclamation of | the eraseblock for as long as possible, hoping that other | things in the block are invalidated, reducing the amount that | needs to be copied). | | TL;DR: It's complex, but a lot of copying is involved with | most writes... | rsync wrote: | I am genuinely curious how a ZFS "special" device[1] which | absorbs all metadata for a pool would help this kind of a | workload. | | The "special" device is a full-blown vdev that you add to a pool | (it is _not_ a cache) - typically a 3 or 4-way mirror of SSDs. | | Now all metadata reads and writes happen at SSD speeds. | | We wrote about this in the _rsync.net Technical Notes_ for Q3 of | this year[2]. | | I know what this kind of SSD based vdev does to typical mixed | file performance but I'm not sure how metadata-heavy a postgres | implementation is ... | | [1] Yes, they really are called that. | | [2] | https://www.rsync.net/resources/notes/2021-q3-rsync.net_tech... | aftbit wrote: | Thanks for posting about this. TIL about special devices. | latk wrote: | Would separate SSD metadata devices help if the pool, as in | Heap.io's case, already consists entirely of SSDs? It's | obviously a win for a use case like Rsync.net's where the data | is less "hot" and therefore uses much more cost-effective HDDs.
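A rough back-of-the-envelope sketch of the worst-case write amplification londons_explore describes above; the 4k page and 1MB erase block are the illustrative figures from the comment, not measurements of any particular drive:

    # Worst-case write amplification for a 1-byte logical write,
    # assuming illustrative geometry: 4 KiB flash pages, 1 MiB
    # erase blocks, and an erase block otherwise full of live data.
    PAGE = 4 * 1024            # smallest programmable unit (bytes)
    ERASE_BLOCK = 1024 * 1024  # smallest erasable unit (bytes)

    logical_write = 1          # one byte changed by the application

    # The FTL must program a fresh copy of the whole 4 KiB page...
    page_write = PAGE
    # ...and relocate the rest of the erase block's live data
    # before the block can be erased and reused.
    relocation = ERASE_BLOCK - PAGE

    amplification = (page_write + relocation) / logical_write
    print(f"worst case: {page_write + relocation:,} bytes written "
          f"for {logical_write} byte changed ({amplification:,.0f}x)")

As the comment notes, real drives with spare area can defer reclamation and do far better than this worst case; the sketch only shows why a nearly full SSD degrades so sharply.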
| franga2000 wrote: | Would be interesting to see if Optane or even just some | faster SSDs for the metadata would give any noticeable | improvement. I imagine latency would be more important for | metadata than throughput, so perhaps SSDs would be more or | less equivalent, but I'd be really interested in seeing the | numbers for Optane. | buildbot wrote: | This is actually one of the golden use cases for Optane! | Back in the pre-Optane day, the server company I worked for | would back SSD/HDD zpools with a device called a ZeusRAM | drive: https://www.ebay.com/itm/234256677232 | | It's all about the latency. | withinboredom wrote: | The benefits would be that metadata type FS queries are OOB | from the actual data, theoretically allowing more IOPS on | your data disks spent on actual data. | smallnamespace wrote: | If you just add metadata SSDs you're also adding IOPS to | the pool. The question then becomes whether that improves | performance more than if you added those same SSDs without | the split (my guess would be not splitting is better, since | the IO load will be better balanced across drives). | kevvok wrote: | I'm not sure how much I buy their claim that ZFS being COW is | worse for SSDs, given SSDs themselves are COW to allow for wear- | leveling because they have to erase whole blocks and rewrite them | if even one byte is changed. | sipos wrote: | I thought the same thing as I was reading it, but I think they | are probably using larger block sizes than the SSD's blocks for | better compression. I'm not certain though. | toast0 wrote: | Given that they mention snapshots, that's probably the bigger | issue. Almost any sort of storage works better when you have | more free space, and having a snapshot means you need to keep | all that data as well as any data that changed since then, so | you have less free space. | | Using a COW filesystem adds at least some amount of usage, | since instead of modifying in place, you'd write a new block | and only trim the old block sometime after the new block is | committed; but if you don't have snapshots and you have zfs | autotrim (and it trims all your old blocks), the commit | interval is short (5 seconds by default?), so I wouldn't expect | a big difference in effective free space here. | bbarnett wrote: | They say their fs block size is 64kb. How large are SSD blocks? | jeffbee wrote: | SSD erase blocks vary, could be anywhere from 64KiB to | several MiB. Their logical blocks (visible to the host | operating system) are usually either 512 or 4096 bytes. | wtallis wrote: | NAND flash erase blocks passed 1MB a long time ago. For | mainstream TLC NAND, 16 to 24 MB is currently typical, and | QLC NAND has gone as high as 48 and 96 MB. NAND page sizes | are usually 16kB, but often with support for faster partial | page programming for the sake of 4kB operations. Logical | block sizes of 512 bytes are purely a fiction for the sake | of compatibility and SSDs don't actually track allocations | at that granularity. They do track things at 4kB | granularity even though that's not quite large enough to be | a good fit for today's flash. | bob1029 wrote: | My experience with SSDs tells me the only way to beat the | system is to employ some append-only log storage structure. | Potentially with segmentation done at the device level, so that | you can have a large fleet of drives in "append" mode while | others are reconciling or cleaning all their blocks in | anticipation of taking another full sequential fill-up.
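A minimal sketch of the append-only, segmented log structure bob1029 describes just above (his comment continues below); the segment size and file naming here are invented for illustration:

    import os

    # Minimal append-only, segmented log: records only ever append to
    # the active segment; when a segment fills it is sealed (never
    # rewritten) and a fresh one is started. Whole sealed segments can
    # later be reclaimed at once - the access pattern flash prefers.
    SEGMENT_BYTES = 64 * 1024 * 1024   # illustrative segment size

    class SegmentedLog:
        def __init__(self, directory):
            self.dir = directory
            self.seq = 0
            self.size = 0
            self.f = open(self._path(), "ab")

        def _path(self):
            return os.path.join(self.dir, f"seg-{self.seq:08d}.log")

        def append(self, record: bytes):
            if self.size + len(record) > SEGMENT_BYTES:
                self.f.close()          # seal the full segment
                self.seq += 1
                self.size = 0
                self.f = open(self._path(), "ab")
            self.f.write(record)        # strictly sequential writes
            self.f.flush()
            self.size += len(record)

This is the same shape RocksDB's SSTables and other log-structured stores use, which is why they "line up well with SSDs" as noted earlier in the thread.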
| Throwing mixed workloads at individual devices is just asking | for trouble if you are trying to maintain some razor-thin SLA. | Dirty pages and all the weird tricks employed to hide this | concern result in side effects that break more complex schemes | on top. | | Taking it to the next level - Batching your I/O in software is how | you can start saying things like "Transactions per disk I/O", | not just "Fewer I/O for those transactions which now fit in | fewer blocks due to compression". Batching doesn't have to mean | "nightly processing". It can mean "all requests which occurred | over the last 100uS". From a user's perspective, this can | effectively still be a real-time RPC experience. For systems | with very heavy load, this sort of micro-batching can add many | orders of magnitude improvement in throughput. Also bear in | mind that the more transactions you have available to compress | each time, the better your odds when dealing with entropy. | | I have personally developed software that can insert 2-5x the | stated write IOPS figure with these sorts of tactics. On modern | NVMe devices this can mean you start tickling 8 figure | transactions per second if the size of each request is very | modest. | notacoward wrote: | The fact that there are two layers both doing this kind of thing | (especially oblivious to one another) is a known and studied | issue. Most of that study has been in the context of virtual- | machine filesystems atop host filesystems but it's also | applicable here. I'm pretty sure there was a paper at FAST | about this several years ago, but can't find it right now. It's | entirely likely that COW-over-COW is a bad choice just like | TCP-over-TCP is. | klodolph wrote: | Are you thinking of the classic, "Don't Stack Your Log On My | Log"? | | https://www.usenix.org/conference/inflow14/workshop- | program/... | notacoward wrote: | Yeah, seems about right. Thanks! | the8472 wrote: | But the upper-level CoW can coordinate with the lower level | via TRIM, that's not quite the same as TCP-in-TCP where the | congestion algorithms don't talk to each other. | notacoward wrote: | Yes, TRIM exists, but it's a very limited form of | coordination at best and implementations don't even do all | they can with that. Every analogy is imperfect. | paladin314159 wrote: | We switched from LZ4 to Zstd in almost all of our compression use | cases to great effect. Reducing the data on disk or over the | network is a huge win with only a minor loss in decompression | speed (using the appropriate level of Zstd). E.g. data in Kafka: | https://amplitude.engineering/reducing-kafka-costs-with-z-ta... | djanogo wrote: | Can somebody enlighten me on why you would use ZFS for a DB? It | seems like there would be overlap/conflict of features. The only | benefit that I know of would be quick restore to previous state, | but how often would you need to restore? | jdhawk wrote: | They state it. Compression. | Aachen wrote: | Even NTFS has compression, so it doesn't seem to be that | simple. | toast0 wrote: | Data integrity is nice to have. | jhk727 wrote: | Author here - It's explained in the post but the primary driver | is cost. 80%+ reduction in storage is massive when you're | storing petabytes of data on SSDs. | deeblering4 wrote: | It's wild to me that sites managing storage clusters at petabyte | scale are doing it on AWS. I would think that by then you could | save millions more by migrating to your own colocated hardware.
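For readers who want to reproduce the LZ4-vs-Zstd comparison paladin314159 describes above on their own data, a minimal sketch assuming the third-party lz4 and zstandard Python packages; the sample file name and compression level are placeholders:

    import lz4.frame              # pip install lz4       (assumed)
    import zstandard as zstd      # pip install zstandard (assumed)

    # Compare compressed sizes on a sample of your own data. Zstd
    # levels trade speed for ratio; level 3 is a common default.
    data = open("sample.page", "rb").read()   # hypothetical sample

    lz4_size = len(lz4.frame.compress(data))
    zstd_size = len(zstd.ZstdCompressor(level=3).compress(data))

    print(f"original: {len(data):>10,} bytes")
    print(f"lz4:      {lz4_size:>10,} bytes")
    print(f"zstd-3:   {zstd_size:>10,} bytes")

Results depend heavily on the data; the comment above and the linked Kafka write-up both report zstd compressing meaningfully better than lz4 for a modest CPU cost.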
| thehappypm wrote: | It's not always the case that dollars you spend are evil. Sometimes the | more expensive or slower solution is better because it means | you can have fewer people working on it, or have lower-skilled | people managing it. And new people are easier to hire and make | productive because it's a shared skill. And you get updates for | free. And outages are likely to be shorter. And... the list goes | on. | teknopurge wrote: | Or save even more using decentralized options like Storj.io or | Filecoin. | | IMO, the market pendulum is swinging (in motion) right now: | | cloud ---(cost drivers)---> dedicated HW/colo with hybrid or | custom cloud---(operational drivers)---> web3 decentralization | [^--- we are here ---^] | tmikaeld wrote: | You must be kidding? Those solutions are not even close to | running any database of this scale, did you even read the | article? | teknopurge wrote: | yep. Those solutions aren't there yet, and arguably too | early, but the market direction is clear. In five years I | would not be surprised to see some form of transactional | data store (high TPS, blob store) on a decentralized layer. | | Crazy to talk about different FS compression schemes when | trying to optimize business/app logic higher up the stack. | It should be abstracted away by now. (yea, I know it's not, | but should be) | jhk727 wrote: | Author here - as others have noted, there are a lot of benefits | to a company of our size operating infrastructure on AWS vs. | managing physical hardware. A couple of the highlights for | managing our primary database cluster include: | | - Automation - this was noted by another commenter, but with | AWS we can fully automate the instance replacement procedure | using autoscaling groups. On hardware failure, the relevant | database is removed from its autoscaling group and we | automatically start restoring a fresh instance from the latest | backup. This would be much more difficult if we were to manage | our own hardware. | | - Flexibility - we have the ability to easily change instance | classes via selling/buying reservations. Some of our biggest | wins historically have come from AWS releasing new instance | families - we've been able to swap out the hardware for our | entire cluster over a week or so, for negligible cost (often | saving money in the process due to the cost per unit of | hardware decreasing on new instance classes). While we could | leverage the same developments in a self-managed environment, | it would be more difficult, and likely more expensive due to | how capital-intensive self-hosted is. | | Additionally - there's a ton of value in the integration of the | AWS ecosystem. We use many AWS managed services, including | heavy use of RDS, Kinesis, S3, and others. For a company with a | relatively small engineering team managing a large | infrastructure footprint, it hasn't made financial sense yet to | invest in moving to self-hosted infrastructure. | tinco wrote: | The things you mention are ostensibly true, yet still they | don't make sense to me. It might make sense when you're a | startup that's growing, but when your SSD costs are so large | you can save _millions_ on them, then the numbers just don't | add up. | | In my experience, doing things in the cloud is about as | expensive per 12-18 months as buying the hardware up front | is. That's super interesting for a fast growing startup that | could go bust any minute and wants to spend every second of | their time on growing, expanding and marketing.
| | But when you're spending so much on AWS that you can save millions | just by reducing filesystem overhead by 20%, it should have | stopped making sense a while ago. $2 million should get you a | team of 10 sysadmins and devops engineers. Sure automation | would be more difficult, but you'd have the manpower to | achieve it. Isn't that what running a business is about? | | Flexibility: when you're growing quickly it's nice that you | can provision new hardware instantly, but AWS is so expensive | you could continuously over-provision your hardware by 50% | and still always be ahead of the AWS price curve. And as I | said, you could fully swap out your hardware every 18 months | and be at the same price basically. You could even hire a | merchant to offload your old hardware and recuperate 50% of | those costs. | | And I'm not saying to throw AWS overboard altogether; that | you have your core business outside of AWS's datacenter | doesn't preclude you from buying into RDS, Kinesis, S3. | | Is AWS just cutting you more financial slack than we're | getting as a tiny company? Or am I underestimating the costs | of getting that sysadmin team on board? | stef25 wrote: | > $2 million should get you a team of 10 sysadmins and | devops engineers | | Managing staff vs managing AWS ... I know what I'd choose | (without really knowing the numbers) | gwright wrote: | > In my experience, doing things in the cloud is about as | expensive per 12-18 months as buying the hardware up front | is. | | It seems like you are glossing over the other costs: staff | to implement and manage, development time and maintenance | for automation to re-implement everything that AWS | includes, data center costs (not clear if you were thinking | of hardware ownership only or data-center also). | | I'm not saying you didn't think about those things, just | saying that they can't be ignored in these types of | comparisons. | tinco wrote: | Where did I gloss over them? I literally suggest spending | 2 million a year on staff. | gwright wrote: | Without specific numbers it is a bit difficult to be as | clear as I would like, but I read your comment as | suggesting that the savings from owning/hosting your own | equipment could pay for the team needed to operate that | solution -- but then what was the point of switching? | | The devil is in the details, and I wouldn't say that it | never makes sense to bring operations in-house, but your | post didn't make a clear case from my point of view. | aflag wrote: | You don't need to reimplement everything AWS provides. | AWS provides services on demand for a huge customer base. | The sysadmin/devops team you set up needs to solve only | your particular problem. You can often get a better, | easier to use and maintain system this way. The downside | is that you have to pay for that extra team, but if only | 20% of your data already costs millions, your scale is | big enough that you'll likely save money by hiring a | sysadmin team. | lazide wrote: | Sometimes, sometimes not. A lot of common 'prod bricks' | (S3, managed Kubernetes, etc) get used because they are | convenient, and while they could be implemented in some | other, more bespoke way, it's rarer and rarer that it | actually pencils out as a net win. You also have to deal | with the complexity of managing your own version of it, | which is non-trivial over the full lifecycle of | something. | | If it is your core business to provide that thing? | Sometimes or even often worth it. Otherwise, often not.
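A toy break-even comparison in the spirit of tinco's argument above. Every figure here is a made-up placeholder (apart from the $2M/year team cost tinco suggests) - plug in your own quotes, salaries, and depreciation horizon:

    # Hypothetical monthly cost comparison: cloud vs. colo. All
    # numbers are illustrative assumptions, not anyone's real bill.
    aws_monthly = 500_000          # hypothetical AWS spend ($/month)
    hardware_upfront = 7_500_000   # hypothetical colo hardware outlay
    hardware_lifetime_months = 36  # depreciation horizon
    colo_team_annual = 2_000_000   # tinco's 10-person team figure
    colo_space_monthly = 50_000    # hypothetical space, power, transit

    colo_monthly = (hardware_upfront / hardware_lifetime_months
                    + colo_team_annual / 12
                    + colo_space_monthly)
    print(f"AWS:  ${aws_monthly:,.0f}/month")
    print(f"Colo: ${colo_monthly:,.0f}/month amortized")

The rest of the thread is essentially a debate over which hidden terms belong in this equation (on-call staffing, migration cost, egress, hiring risk) and how large they really are.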
| manquer wrote: | Unless you are literally Amazon or big Co, spending tens of | millions (if 20% is millions, then at least 5-10 million) on | just SSDs _alone_ should make it pretty much a core business | problem to solve? | | My sense is that in the last decade startups have lost the | skills to do co-location setups that they had in the 2000s, and | think it is more complex than it actually is. Co-lo hardware | management is hard, yes, but if it were not worth doing even at | 10+ million/year budgets, we would never have had SaaS | companies pre-cloud at all. | merb wrote: | a lot of people underestimate how expensive it is just to | have a 10g/25g/100g network and maintain it. that alone is | extremely expensive. especially when you want it over | multiple locations. if you want to connect two | datacenters with a low latency high throughput network you | would probably go to aws since that is cheaper. and that | is just the network, you also need to maintain other | stuff like storage. maintaining a storage network is | extremely hard. like s3/block storage, etc. | PragmaticPulp wrote: | > In my experience, doing things in the cloud is about as | expensive per 12-18 months as buying the hardware up front | is. | | The fallacy is comparing hardware costs to services cost. | The hardware is the cheap part. | | When you run your own system, you have to develop the | entire system up front and maintain it on the backend. The | hardware is cheap by comparison to the salaries and | development costs you pay. | | > $2 million should get you a team of 10 sysadmins and | devops engineers. | | Probably double that once you add in fully-loaded costs as | well as the compensation for ~2 managers to manage them. | mbreese wrote: | But if you're paying X in OpEx to AWS, at some point Y in | CapEx (hardware) and Z for OpEx (your people) becomes | more compelling. What I believe the comment above is | arguing is that if you are saving millions for 20% | savings on SSD costs, X >> Y + Z. | | Yes, you have to manage the hardware, and Y doesn't | automatically go to zero for year two, but the | convenience of the cloud isn't always cost effective. | Don't get caught up with the details. The 2 million | figure doesn't matter as much. It's finding that | inflection point and making the better business decision. | starfallg wrote: | >When you run your own system, you have to develop the | entire system up front and maintain it on the backend. | The hardware is cheap by comparison to the salaries and | development costs you pay | | The development costs are a one-time charge (OTC) that is | amortised during the life of the solution. Whereas in xAAS, | they are a monthly recurring charge (MRC). The longer your | tech refresh cycle, the cheaper it is. It's inherent to the | pricing model. | | Also when you get down to the crux of it, these solutions | (like OpenStack or vSphere) are software platforms that | provide similar features. There's not much development | cost, it's just software licensing and PS. | | In terms of operations, it's not like you can get rid of | sysadmins, they just morphed into DevOps. | | >Probably double that once you add in fully-loaded costs | as well as the compensation for ~2 managers to manage | them. | | You might as well add all sorts of additional costs such | as egress charges on exit and cloud consultancy.
| manquer wrote: | There are plenty of financing options (with pretty | competitive interest rates) that will help you amortize your | upfront costs over the project lifetime if your credit is | good (at this kind of range, any startup gets access to this | fairly easily), so both options can be monthly recurring | if need be. | aflag wrote: | I think he compared hardware + team salary vs AWS. After | a certain size that starts tipping over to the side of | having things on prem. Running things on prem is nothing | scary. You just need people with the skill set needed to | do it. But when you're spending millions in | infrastructure, that's hardly a problem. | [deleted] | jes wrote: | Thank you for writing this article. | | I'm curious: The engineers that brought these significant | cost savings to your company, did they receive a share of the | money saved? | AstroDogCatcher wrote: | Thank you - haven't laughed like that in a while. | Aachen wrote: | I'm not sure how appropriate it is to take a serious | comment where the author has a genuinely unpopular | opinion and say you laughed really hard at it. | Johnny555 wrote: | how would you even allocate the money fairly besides | rolling it into the company to keep it alive and successful | (and maybe the added profit increases the bonus pool if | that exists)? | | The engineers didn't do it in isolation - how much of a | share goes to the office receptionist that answered the | phones and kept visitors out of the way of the engineers? | How much goes to the Finance department that kept the | engineering paychecks coming while they did their work? How | much goes to the salespeople who kept the deals flowing and | money coming in that filled the disks in the first place... | and so on and so on. | | Once a company exits the "a few devs in a garage" stage, | many people contribute to the company's success. | franga2000 wrote: | Let's face it, increasing your employer's profit margin | will only benefit the employees if the company is | struggling and they were about to be laid off because | costs were starting to eat into the margins. The only | case where it does is with a bonus or with workers owning | shares with dividends. "Keeping it alive and successful" | only matters if that wouldn't have happened otherwise, and | even then not much if you have good job mobility. | | No need to allocate. Something like "our clever engineers | managed to save us $2M per year, so we're giving everyone | a $500 bonus this month" seems entirely reasonable. | shrubble wrote: | It seems shocking to me that you haven't yet migrated; | however, you know your cost/benefit ratio better than I do. | Have you ever examined a split model, where some parts of the | load are run on your own or rented dedicated servers, and | some run on AWS? | | Separately to your comment about LVM... the LVM snapshot | requires that a separate part of the volumes be set aside to | hold the snapshot data. | | If the snapshot volume fills up with changes being made to | the volume that holds your data before the snapshot completes, | then the snapshot will fail. | | This does not occur with ZFS, as you have noticed. | newsclues wrote: | Are you in the gap Oxide Computer is trying to fill? | _nickwhite wrote: | AWS is more often than not a better solution than colo when | factoring in the on-site engineers, techs, and operational | complexity costs a company will pay to monitor and respond to | hardware-related events.
One could build out a datacenter | management team with on-call engineers, or they could pay AWS | to handle all that, and focus on innovation and products that | make their company unique and (hopefully) profitable. AWS makes | a lot of sense for companies that wish to insulate themselves | from the hardware layer, and it would probably take a company | many magnitudes larger than Heap to realize any real benefits | from self-hosting at a colo. This isn't even considering the | fact that uptime matters, and you'll need more than 1 colo to | really do it right. | | I say this as someone who built, manages and operates | datacenters and colo spaces. | runlevel1 wrote: | I'm not saying it never happens, but I've never seen moving | to AWS (or Azure, GCP, etc.) save on people costs at any tech | company with a large resource footprint. It just shifted | where the time is spent and who had to spend it. | | The public cloud and managed services work great for the most | common use cases, but go outside those and you start having | to engineer around limitations. | | If you have a sizable footprint in any given dimension you're | trading one complexity for another. | bombcar wrote: | The "who had to spend it" is huge - companies _love_ paying | providers and _hate_ paying people/depreciating costs. | markus_zhang wrote: | Curiously our company moved from AWS to on-premise a | couple of years ago. Something about CAPEX -> OPEX was | mentioned back then. | bombcar wrote: | That's the second part of the loop, when the cost of AWS | is high enough that you can show immediate dollar savings | by bringing it in-house. | markus_zhang wrote: | Yeah, that was a bit more than one year before we filed | for IPO :) | walrus01 wrote: | > hate paying depreciating costs. | | One possible solution for this, if you want to do it as | bare metal you control, is leased equipment (even with $1 | buyout at end of term), which can be accounted for | differently than purchasing it up front. | [deleted] | gsliepen wrote: | For a while I did maintain a storage cluster that had close | to 0.5 PB of data, and had a capacity of up to 3 PB (if you | filled all the slots with the largest disks you can buy). You | want to ensure you have a lot of redundancy and spare | capacity if you are managing it yourself. Luckily, the | hardware is relatively cheap. It's the manpower that costs a | lot. Still, I think it was only 0.1 FTE to manage this | storage cluster, including the network, file systems, user | access, swapping out bad hard disks and storage pods (but | granted, its I/O load was very light). Also, while AWS takes | away the burden of handling the physical drives and the | filesystem for you, now you have to handle interfacing with | AWS. That means you need an engineer that knows how to | integrate your application with AWS. If you can leverage more | AWS services, maybe even avoid needing your own server rooms | because everything is running there, it might pay off. | European vs. American salaries might also change the | equation. But if you just use it for storage, I don't think | it's worth it at any scale. | nickstinemates wrote: | Our company is not many magnitudes larger than any company | and it is not remotely cost competitive for us to run any of | our stack on AWS (heavy data ingest with hundreds of billions of | inserts a day, many disks/'big data' backend, hundreds of | customers accessing the data). Even one-time deals to get us | in the door are not cost competitive, let alone the long tail | economics.
Cloud for fast storage and high bandwidth use cases | is extremely expensive. | _3u10 wrote: | It isn't. I have the same overhead running my server as I do | a VM. What I don't get is a $3000 bill for $100 worth of | server. | | You can generally buy whatever you are renting from AWS for 1 | to 3 months of an AWS bill. | | The only thing I don't get from colo is a bunch of other | customers thrashing the cache on my CPUs. | | Databases are not web servers - there's no way to run a | database on smaller / fewer instances when running at | non-peak times. Instant scaling is the only possible | advantage AWS could bring. However with the prices they | charge it's simpler and cheaper to just buy/rent your own | hardware. Especially if you have to pay egress fees. | (bandwidth is really the biggest ripoff) | midasuni wrote: | The argument isn't about using AWS to run a VM (which can | be cheaper than colo'ing your own kit, depending how many | you want and for how long), it's all the extra stuff. Start | an AWS load balancer rather than run and maintain your own, | for example. | | I don't like lock-in, but the prevailing view has always | been in favour of lock-in, be it IBM mainframes, oracle | databases, windows servers etc, and if you swing that way | AWS has tempting offers. | | Oh and databases do scale. Say you want to run end of | quarter financials that require a lot of processing for a | day, you bring up tons of read replicas and away you go. | _3u10 wrote: | If you bought hardware with what you pay for AWS RDS you | could run your entire DB in RAM. Hell, you could probably | put the data in memory on a GPU. | | Also, this is generally why you run financials overnight. | If your hardware is serving transactions during the day | it can easily run your quarterlies at night. | | nginx is far easier to maintain than AWS load balancers - | which is what their load balancers are anyway. | The best part about nginx configs? They are cloud | agnostic and will work on everything from a Raspberry Pi | to a 128 core EPYC server. | | I'll tell you something about RDS reliability: your | monthly maintenance window brings your DB down far more | often than a single unreplicated server ever fails. EBS | (like the entire thing) has failed more times in the last | year on us-east-1 than my colo RAID. | | The selling point of AWS is that if you pick AWS and it | fails you can say, well, the richest guy in the world | can't figure this stuff out so it must be impossible, | when in reality high school kids could make a more | reliable system. If you pick AWS you have the | unreliability of the base software / hardware of their | systems plus whatever the AWS engineers fuck up. At this | point it's pretty clear that they can't even keep a SAN | working. | spookthesunset wrote: | What about all the config management infrastructure to | manage those nginx instances? | | What is the amount of work required every time some team | wants to spin up new stuff? What is the turnaround time | between when they file the ticket and the work being | completed? | saiya-jin wrote: | That's a nice statement that sounds like it came straight from | Amazon sales reps, and it can actually work for some companies, | maybe. But for our bank (top 10 globally), the only way to | even get financially equal to our own farms is to have | aggressive downtime every night, which negatively affects the | productivity of our global teams. Pricing is really not that | great if you deal in scale.
| | You wanna push 1 evening a bit late to deliver something | valuable for the project? Sorry, no can do. | | I don't even factor in horribly expensive migration projects | that actually brought 0 added business value for the type of | apps we use. We still have to keep our Network, Windows and | Unix admins, various App support personnel etc., there is | plenty of work for them with AWS. Not 1 single IT guy was | made redundant. | | No cost savings; on the contrary. | IgorPartola wrote: | That argument doesn't hold up. How could AWS be cheaper | than doing it yourself at that scale? Like if you are a tiny | company that can't afford your own DC, your own engineers, | etc. then yes AWS is cheaper in absolute costs but not in | per-byte costs. But at scale you should be able to hire | engineers and build out a DC at which point you aren't paying | the AWS margin, which is how you save money. | | In other words your assessment would only be true if AWS had | a 0% or a negative margin. | kevincox wrote: | There is still an economy of scale. You are right, as you | use more resources your economy of scale will increase, but | it will never match AWS's (unless you are huge). So the | math is: if the AWS margin exceeds the difference | between the two economies of scale, then it makes | sense to run your own datacenters (ignoring the opportunity | cost of the transition). Of course at some point the | margin _will_ exceed that difference, but depending on what | type of infrastructure you need it can be at a very high | point. | IgorPartola wrote: | Your last point is the important part: depending on the | type of infrastructure you need you might be able to save | money. If you want a cheap place to dump your files, B2 | is cheaper than S3 and raw storage hardware pays for | itself in about a year. If you need a sophisticated CDN | then yeah you'll need to be huge before it pays for | itself. I would consider ditching S3 at the point where I | can hire two full time engineers to worry about my | storage layer. | walrus01 wrote: | If you have petabytes of data in AWS you already have a | number of on-staff engineers with significant six figure | salaries. | | If the problem is that your group of six-figure salary people | only know how to put data into AWS, or other cloud services, | and not design/engineer/maintain your own bare metal | infrastructure as well, then that would definitely be a | limitation. | | For reference, a few petabytes of data is not actually that | many systems these days, if you have something like a bunch | of 72-drive supermicros or equivalent with 14-16TB drives in | them. Set up properly this can be administered by one FTE (of | course with additional staffing/tech resources for when that | FTE is on vacation/unavailable, and appropriate training for | other persons who might have admin on the setup). | | my very rough calculation here says that a 36-drive ZFS | RAIDZ2 composed of 16TB drives is something like 492TB | (447TiB) usable storage capacity. | | so five such arrays would be 2460TB. | | compared to the monthly AWS bill for 2.0 to 2.5PB of data you | could probably afford to entirely duplicate the whole setup | in a twin identical set of hardware at a geographically | diverse off-site location. | chucky_z wrote: | Engineers who can put together and maintain this hardware | don't exist on the job market. | walrus01 wrote: | Engineers who can put together and maintain 2PB of | storage don't exist on the job market?
I'd say you don't | know the right people or aren't looking in the right | places. | | 2PB is really not that much stuff these days. It's less | than two cabinets of equipment and that amount of space | (80RU or so) includes routers, switches, AC power | distribution, OOB, etc. | spookthesunset wrote: | They exist, but finding and recruiting those people costs | a non-trivial amount of time and money. Both of which | could be spent on whatever secret sauce your company | does. | __turbobrew__ wrote: | I work in a very data heavy HPC space and you are glossing | over many things. | | * Getting performant access to a storage cluster is non- | trivial; there are many variables which must be correctly | tuned to get good performance: network | topology, high quality NICs and switches, a tuned Linux | kernel, client side caching settings, network packet sizes, | file system block sizes, erasure coding settings, etc. | | * Your solution mentions nothing of backups, offsite | failovers, or disaster recovery plans. | | * Your solution mentions nothing of a physical datacenter: | fire suppression, battery backups, HVAC, power supplies, | backup generators, server racks, sound isolation, | workspaces for hardware maintenance, network cable routing. | | * If you have multiple geolocations you need to have dark | fiber or IP transit between locations with multiple ISPs to | have high speed connections between sites without downtime. | | In addition to the raw costs you have to factor in the lead | time of building a qualified infrastructure team, building | out the requirements, provisioning hardware and datacenter | space, setting everything up, and then tuning everything. | With infinite money this is probably still a 2 year lead | time at minimum. | | I do agree that running multi petabyte workloads in AWS is | probably not optimal, but when you are a startup in the | growth stage it is probably a better use of your time to throw | VC money at AWS and build out your product. Eventually, | the business should probably migrate to self managed | infrastructure once the right product fit has been found | and the business is looking to streamline. | PragmaticPulp wrote: | > If the problem is that your group of six-figure salary | people only know how to put data into AWS, or other cloud | services, and not design/engineer/maintain your own bare | metal infrastructure as well, then that would definitely be | a limitation. | | It's not about whether or not the engineers can make the | colocated setup work. | | It's that you're going to pay _a lot_ of hidden costs with | a colocated setup. Engineers can't set up, maintain, and | do on-call for the colocated setup without subtracting from | their primary working hours. | | Each additional engineer you have to hire to help with the | colocated setup is $200-400K fully loaded out of your | company's budget. If you have to hire 3 additional | engineers to fill out your colocated on-call schedule and | help set up and maintain the system, that's easily an extra | $1 million per year on your budget. Cloud is expensive, but | $1 million goes a long way. | | It's easy to look at a potential AWS bill and a potential | colocation and hardware bill and declare colocation the | winner, but then you still have to set up and maintain it | all as well as constantly train everyone on it. | | With AWS, you can hire engineers with AWS experience and | they'll understand the big picture of how to work with | things on day 1.
With a custom setup, you're at the whims | of whichever employees set up the system because they know | it best. | | Colocated systems tend to work very well _at first_ when | the original engineers who set it up are all still at the | company and it hasn't run long enough to start | encountering rare failure modes. They quickly become a | nightmare when your engineering staff turns over multiple | times and nobody can remember who knows how to do what on | the colocated system or if the documentation is up to date | or not. | walrus01 wrote: | > Colocated systems tend to work very well at first when | the original engineers who set it up are all still at the | company and it hasn't run long enough to start | encountering rare failure modes. They quickly become a | nightmare when your engineering staff turns over multiple | times and nobody can remember who knows how to do what on | the colocated system or if the documentation is up to | date or not. | | Everything above really sounds like it's just | regurgitating AWS sales person talking points. | | Sounds like a systemic management / CTO-level problem to | me if a company isn't willing to put in place the hiring | practices and compensation, documentation systems and | operational procedures to deal with that sort of concern. | | If your core engineering staff is turning over multiple | times for arbitrary reasons you have other problems to | deal with. | | > Engineers can't set up, maintain, and do on-call for | the colocated setup without subtracting from their | primary working hours. | | If a company can't hire datacenter techs to install | hardware, cables, and swap hardware as smart remote | hands, maintaining as little as a couple of 45RU cabinets | of gear, you also have other management/systemic problems | to deal with. I'm looking at this from the point of view | of a facilities based bare metal ISP that owns/runs all | of its own hardware, and can tell you it's not rocket | science. | hagy wrote: | I worked at a company that migrated a 100 PB Hadoop | cluster to GCP for assorted reasons despite many years of | success with colocation. I wasn't involved in any of | this, but the team's decision process makes sense. You | can read through their decision making in these blog | posts: | | * https://liveramp.com/developers/blog/google-cloud- | platform-g... * | https://liveramp.com/developers/blog/migrating-a-big- | data-en... | | One big point was the challenge of maintaining multiple | colocation sites, with cross replication, for disaster | recovery. Since Hadoop triple replicates all data within | one DC, this requires 6 times the disk storage capacity | of the data size for dual DCs. In contrast, cloud object | storage pricing includes replication within a region with | very high availability, such that storing once in cloud | storage may be acceptable. Further, you also need double | the compute, with one of the DCs always standing by | should the other fail. | loriverkutya wrote: | If I have to pay the same amount of money and I can | choose to deal with people or deal with a decent sized | company (and leave them to deal with their people), I | always choose the latter. | PragmaticPulp wrote: | > If your core engineering staff is turning over multiple | times for arbitrary reasons you have other problems to | deal with.
| | People leave for all sorts of reasons: Moving for family | reasons, becoming stay-at-home parents, moving for a | spouse's job, retiring, starting their own companies, or | even just getting bored and wanting to do something | different. Or it could be as simple as getting promoted | to a different role. | | It's unrealistic to make engineering decisions with the | assumption that the same engineers will be around and | stuck on the same project forever. | | Like the OP said: Every hour they have to spend working | on the colocation setup is an hour they aren't spending | on your company's competitive advantage, so you have to | hire more engineers (and more managers) to compensate. | | > If a company can't hire datacenter techs... | | How many techs do you think you need for reasonable on- | call coverage? 3? 6? Add a manager in the mix because you | need someone to manage them. | | The costs add up quickly. | | It's weird to see people championing colocation as a cost | saver and then pivoting to arguments that you just need | to hire more engineers and techs and manage them. | | Employees are expensive. One of the primary benefits of | cloud is that you don't have to hire and manage all of | these employees to do all of these things at the colo. | midasuni wrote: | My team of three looks after hundreds of bits of diverse | kit in dozens of locations across the world. | | I can't remember the last time I took a call outside of | office hours, and even in hours it's very rare. There's | enough resilience built in that any issues can wait until | morning. | | The last major outage was in 2017, before we had a third | member of the team. I was on the other side of the world | installing a new system, the other was on leave. We had a | network issue, OSPF melted and knocked out some services, | we were down for about half an hour as I rebooted the | core switch pair remotely. | | (We've since redesigned so that doesn't happen) | | We get paid nowhere near six figures either. | | Sure you can be ridiculous, I remember one team I worked | on that employed a full time unix contractor (on 3 times | the staff wage) to look after 6 servers and deploy a | tarball every few months. I replaced him with a small shell | script. Another was a DBA looking after a small oracle | database (oracle - which of course is that generation's | "just use amazon") | walrus01 wrote: | I think a number of people who are taking the position of | "but it's so HARD, and so EXPENSIVE!" to own and run bare | metal network infrastructure may not have ever seen a | proper OOB management setup, with dedicated OOB network, | serial consoles on stuff, management routers and switch | at site, things like cradlepoint LTE radios stuck to the | top of colocation cages, etc. | | And then basic other things like having remote smart | hands ready to go, and common failure items like fans, | power supplies, fan trays, hard drives pre-positioned and | ready to swap in. With MOPs for swapping them. Stocks of | basic things like fiber patch cables, commonly used | transceivers, copper patch cables, stored in every cage. | PragmaticPulp wrote: | > I think a number of people who are taking the position | of "but it's so HARD, and so EXPENSIVE!" to own and run | bare metal network infrastructure may not have ever seen | a proper OOB management setup, with dedicated OOB | network, serial consoles on stuff, management routers and | switch at site, things like cradlepoint LTE radios stuck | to the top of colocation cages, etc.
| | Or we have seen all of this and that's exactly why we | don't want it. | | Building a company is hard enough. Adding the overhead of | developing, maintaining, recruiting for, and staffing our | own datacenter is madness when I can click a few buttons | and get the same thing from a cloud provider _without | hiring anyone extra_ to manage the datacenter. | | No one is denying that a proper data center management | system can exist. We all know it can exist. | | The issue is that it's a huge distraction with a lot of | potential pitfalls. Your network infrastructure with | Cradlepoint LTE radios in the colocation cages sounds | great after it works, gets set up, stays documented, and | all the bugs have been ironed out. But that's a lot of | hidden work that could have been allocated to launching | the product faster. | walrus01 wrote: | I think the difference in point of view here is that | from my own perspective, owning and running the bare | metal things is a basic core competency of being an ISP. | Which is what I do for a living. The infrastructure I've | described up thread _is_ the product. | | If the use case is somebody developing a software product, | that is a totally different scenario. | spookthesunset wrote: | For an ISP, you are probably correct. | spookthesunset wrote: | Been there, done that. No thanks. Every place that has | their own hardware also has a huge bureaucratic process | to get more hardware for your project. Not to mention | almost always the software stack is old as dirt. For | example mongo might be two or three major versions behind | current... and IT wants nothing to do with supporting the | new version. | | People move to the cloud to escape their company's IT | process... there might be some unicorn company out there | that does infrastructure "right" but I've yet to work | there. | _s wrote: | ^^ This 1000x. | | Humans are incredibly expensive and notoriously | unreliable when compared against "machines"; or in this | case an API. | | It's usually worth paying 2-3x the cost to have someone | else manage something for you with a given SLA, because | that's what it will end up costing when you decide to bring | it in house, once you take into account the time and | effort needed as well. | | A "good" & "reliable" Systems Engineering team that can | offer 24/7 support will take around a year to hire and | set up, and they need roughly the same amount of time to | transition you off AWS onto your system. They probably | need closer to 3-5yrs to give the same level of | documentation, APIs, tooling, processes, UIs and | training that AWS already provides. | | Let's call it 5 years to get to the level of AWS when you | started the transition. | | A decent team of 5-7, including engineers + PO/PM + UX | and so forth, is at least $1.5M/yr. That's $7.5M over 5 | years, not including your new hardware and networking | costs. Let's call it $10M. You're also 5 years behind AWS | now, and over that transition you're still paying AWS, | and your development speed has halved as you wait for | your new team to build or transition infrastructure. | | You can trade cost and quality for speed and have | everything ready in 2-3 years by setting up a few teams. | Add HR support, more contractors etc. etc., and you're looking | at a $10M+ outlay again, regardless. | | Or you can keep paying AWS $5M/yr, renegotiate fees often | and literally not worry about that headache and focus on | your product.
| Dylan16807 wrote: | You act like AWS doesn't require you to have a team, or | carefully transition infrastructure. Because of that your | cost estimate is _much_ higher than the couple additional | people you should actually need. | | > They probably need closer to 3-5yrs to give the same | level of documentation, APIs, tooling, processes, UIs | and training that AWS already provides. | | You don't need to build an internal AWS to manage your | own servers. | bserge wrote: | This all sounds like excuses. Insert that two dog meme: | one builds a freaking datacenter using commodity hardware | in a barn and the other uses AWS and complains about | lock-in and expenses. | | As an example, imagine if the founders and engineers of | Backblaze thought like that. | Spooky23 wrote: | Sorry, that's bullshit. | | It all depends on the size of the investment and how you | need to run it. I built a "new" environment on company | premises due to some compliance requirements that would | be cost-prohibitive in AWS or GCP. The gear was procured | through a leasing vehicle, and the hardware vendor had an | SLA for delivering compute and storage. HPE happened to | win the bid. | | There is very little difference operationally. From a | costing perspective, it's about 40% less than an AWS | solution. But in fairness, the customer had an existing | investment in a facility - you'd reduce the savings if | you had to lease appropriate space in a colo. There are | some differences in terms of headcount, but those staff | aren't in NYC/SFO/BOS, so they are very cheap -- senior | level engineers for $80-120k, fully loaded. | | Startups do stupid shit like buying supermicro computers and | cobbling together hardware that gets them into trouble | when the mad scientist moves on to a new gig. Makes sense | when you're drowning in VC money and need to hire people, | but doesn't make sense in most other ways. You avoid that | by doing competitive procurements and paying marginally | more for HPE/Dell/Lenovo/etc. | walrus01 wrote: | > cobbling together hardware that gets them into trouble | when the mad scientist moves on to a new gig. | | If you think that standards based x86-64 hardware running | Linux and ZFS, or FreeBSD and ZFS, is something that is | super unreliable and requires a "mad scientist", then | yes, you are definitely in HPE and Dell's target market. | spookthesunset wrote: | I laughed at the "mad scientist" comment because every | startup I've worked for has had a "mad scientist". They | are always very opinionated and have a lot of political | capital because of seniority. The weird concoctions they | create... the minute they leave, all the remaining | engineers immediately replace most of it with off the | shelf stuff. | | Home built web frameworks (which apparently aren't | "bloated" and "slow"), piles of bash scripts because they | never heard of Salt (or whatever is the latest config | management tool)... | | Almost always they think they are "saving money" by doing | what they do; rarely do they ever consider the | opportunity costs of rolling the entire software stack, | from the hardware to the web stack, on their own. | | Good times. | mercurialuser wrote: | the op was talking about hardware, I think. but you are | spot-on on the mad scientist. we are just in the process | of de-cluttering all the non-standard, hand-made, highly | customized, never documented stuff produced by a | coworker who, unfortunately, passed away. | | btw, we also wrote HA cluster software for Sun Solaris | in the year 2000...
| [deleted] | mercurialuser wrote: | I run 10-year-old HP servers, something like DL580G5, | with care-packs (extended, paid warranty). We needed to | flash firmware, 2 motherboards broke and they sent spare | parts to replace. Probably with less enterprise-y server | firms it may be difficult to find spare parts after 10 | years... | markus_zhang wrote: | We are probably looking at the new future in which Cloud | computing == Mainframes of the 50s~80s and fewer and fewer | people even know how to run the whole scene on-premise. | People who got into cloud computing early (mostly by luck) | get to win big bucks and better lifestyles while others | try to dispel the magic from left and right. | walrus01 wrote: | ultimately, though, somebody has to own, house and run | those mainframes, so it's just abstracting the work away | to some other group of people. lots of people made | careers out of running mainframes and minicomputers in | the 1955-1985 time frame. in the case of things like aws, | azure, etc, it's just a lot more centralized in a smaller | number of gargantuan companies. | markus_zhang wrote: | Since the future is pretty much set, I think it's more | relevant to try to obtain the skills (albeit more difficult | to obtain because fewer companies have them) and jump | on the wagon. | amluto wrote: | You have to factor in the cost of egress from AWS to your | nice colocated drive. | walrus01 wrote: | welcome to the hotel california... | | Last thing I remember, I was | | Running for the door | | I had to find the passage back | | To the place I was before | | "Relax," said the night man, | | "We are programmed to receive. | | You can check-out any time you like, | | But you can never leave!" | amluto wrote: | And this is why I don't think AWS will lower egress fees | in response to R2. AWS may be more interested in | discouraging people from using egress than in capturing | the revenue from egress. I predict that, at most, we'll | see a narrowly tailored reduction in egress fees that is | designed to be entirely useless for communication between | server applications. | sorenjan wrote: | "Nobody Ever Got Fired for Buying IBM" | | Maybe it's worth a couple of million to not have to deal with | the risk, and just keep the status quo. | dilyevsky wrote: | Exactly, most management would prefer to just set investors' | money on fire and keep the risk profile low if their business | model can support it | mwcampbell wrote: | I would hope that for a high-throughput DB cluster like this, | they're using instance-local storage rather than EBS. If that's | the case, then they're probably already taking advantage of EC2 | reserved instances to save a lot compared to the on-demand | prices that we usually see. | jhk727 wrote: | We are, though we started out using EBS. As you mentioned, | NVMe instance storage performs much better for our workload. | We work around the lack of durability through strong | automation of point in time restore/swapping in of new nodes | in case of hardware failures. | | And yes, reservations make a massive difference economically. | enginaar wrote: | Bank of America saves $2 billion per year | https://www.google.com/search?client=safari&rls=en&q=bank+of... | dayjah wrote: | I feel it's fair that they're on AWS right now. Generally the | arc of MVP->IPO involves using the cloud to find product market | fit, and as that fit improves your revenues should also.
Moving | from the cloud to a colo would then be driven by capital | investment to bring down COGS, to either improve PPS or get to | cash-flow positive. | | Heap using AWS just means they've not yet reached a point on | that trajectory where the capital investment moves the needle | enough to warrant it. That could be for any number of reasons. | Stevvo wrote: | Far lower risk and capital investment than colocation. I've | never had to store petabytes of data, but I would imagine the | considerations are not too different from smaller scales. | [deleted] | jasode wrote: | _> petabyte scale are doing it on AWS. I would think that by | then you could save millions more by migrating to your own | colocated hardware._ | | Usually, those types of judgements are based on thinking of AWS | as a "dumb datacenter" such as a bunch of harddrives or just | bare cpu. | | AWS is more cost-effective _if you use high-level AWS services_ | instead of just storing files in the cloud. In this case, it | looks like Heap is also using _AWS Redshift_ and probably a | bunch of other services in the AWS portfolio. A similar comment | I made previously: | https://news.ycombinator.com/item?id=28288352 | | So for self-hosting hardware, Heap would not only have to build | up the petabytes of diskspace, they would also have to replicate | Redshift functionality and the entire AWS _services portfolio_ | they're using. If you use enough AWS _services_, it _becomes | cheaper_ than self-hosting because you don't have to reinvent | the wheel. | jjav wrote: | > AWS is more cost-effective if you use high-level AWS | services instead of just storing files in the cloud. | | Mostly this only works when your utilization is low(ish). | Once you have high load 24x7, the AWS profit margin will | quickly overtake the self-hosted solution. | cestith wrote: | With spiky utilization, you're buying and powering a lot of | hardware to sit idle a good portion of the time. | deeblering4 wrote: | My main takeaways from this are that cloud vendor lock-ins | are real, and they can be hard to break free from. | | Perhaps that's more of a cautionary tale for new projects | than a justification for the expense though. | jasode wrote: | _> Perhaps that's more of a cautionary tale for new | projects than a justification for the expense though._ | | You can find case studies for both positions: | | - migrate to AWS to save money: Netflix, Guardian newspaper | [1] | | - migrate away from AWS to save money: E.g. Dropbox [2] | | A lot of companies (especially non-tech businesses) don't | have the technical skills to run internal datacenters at | the same competency as AWS. Thus, they don't want to be | "locked in" to their own IT department that's slow and | handicaps their business. | | Dropbox, Facebook, and Walmart would be among the very few | that can competently run their own datacenters with | advanced services like AWS. | | [1] https://web.archive.org/web/20160319022029/https://www.compu... | | [2] https://www.google.com/search?q=dropbox+migrates+off+aws+sav... | Tehnix wrote: | And then Dropbox shifted kinda back again, at least | partially [0]; it's interesting to see the ebb and flow | :) | | [0]: https://aws.amazon.com/solutions/case-studies/dropbox-s3/ | ignoramous wrote: | Wait. Why? How? Their in-house system (Magic Pocket / | Diskotech) _seemed_ so promising. | | https://dropbox.tech/tag-results.magic-pocket | nosefrog wrote: | They're for different use cases.
Magic Pocket is for | storing file block data, and according to the AWS | article, they just moved their analytics data to AWS. | jasode wrote: | _> Their in-house system (Magic Pocket / Diskotech) | seemed so promising._ | | The story described Dropbox moving _"34 PB of analytics | data (Hadoop)"_ to AWS. | | My reading is that Dropbox's Magic Pocket / Diskotech is | storage for _customer raw data_ -- similar to Backblaze-type | raw storage. | | It's 2 different use cases, so it's not surprising Dropbox | found AWS to be effective for analytics workloads. AWS | has an _extensive portfolio of software services to | analyze data_, so Dropbox may have concluded paying AWS | would _cost less_ than reinventing the analytics pipeline | in-house. | r3trohack3r wrote: | What you're calling "vendor lock-ins" I'm calling | "providing sufficient value to justify cost." | | It's not that migrating out isn't possible, it's that | Amazon is providing "Engineering/SiteOps Departments as a | Service" at a price that's hard to compete with in-house. | whydoyoucare wrote: | I am not sure of the size of your company or its budget | for in-house infrastructure, but we realized AWS is not | just a lock-in, but also a permanent money drain. | hackerfromthefu wrote: | What's the newspeak for the high egress fees? | aaronblohowiak wrote: | The amount of staff you have to have on hand and the amount of | pre-planning (and up-front capital commitment) can all make that | very unattractive long after the basic per-GB price would seem | to make it attractive. | bbarnett wrote: | No. You already have 24x7 staff at this scale. Hardware | requires thought and skill, but then so does software. It | isn't voodoo. | outworlder wrote: | > No. You already have 24x7 staff at this scale. Hardware | requires thought and skill, but then so does software. It | isn't voodoo. | | Not necessarily. Hardware requires people to physically | replace failed drives and otherwise do on-site maintenance. | | In the _unlikely_ event that an AWS volume fails, I can | (and have) automation to fix that. While everyone sleeps. | Robotbeat wrote: | Okay, but it's not hard to set up redundancy and warm | spares as well to make it automatic. You don't need | someone physically there. | deeblering4 wrote: | > Hardware requires people to physically replace failed | drives and otherwise do on-site maintenance. | | This is the premise of colocation (as opposed to building | your own server room). A colo is a secure building with | round-the-clock staff. Hardware vendors offer rapid on- | site parts replacement and can gain access via the on- | site staff, and the colo has services to perform on-site | work like "remote hands" as well. | | > In the unlikely event that an AWS volume fails, I can | (and have) automation to fix that. While everyone sleeps. | | Fault-tolerant architectures can be deployed on | colocated hardware too. | outworlder wrote: | The point is - I can do any changes I need to the | underlying resources programmatically and near instantly, | without ever having to talk to anyone. Including cloud | provider staff. Or rather, automation can. | | There may exist some colo where I can get a server (or | storage, or network cards, or anything else) added in | minutes over an API call, but I haven't heard of any. | That's usually found on the VPS side. | | > Fault-tolerant architectures can be deployed on | colocated hardware too. | | They can, usually requiring that you specifically set up | redundancies and the like.
Which is something that you | _already_ have for many cloud offerings. Your automation | and redundancies sit on top of the vendor's existing | redundancies. | | For instance, the EBS volume I mention. It is not a disk. | It's not even just an array. It's a far more sophisticated | abstraction. If there are issues, it can automatically | fetch blocks from your snapshots (if the blocks are | unmodified, something they also keep track of). Not happy | with spinning disks and want an SSD? No need to place a | service order with your colo provider; just send an API | call and this will be automatically migrated to SSDs | without your applications ever noticing the difference | (other than the response time) and with zero downtime. | Your software could even do this if it notices that the | workloads require it. | | If an AWS datacenter goes up in flames, the systems I | manage will still function (and will self-heal, assuming | they even get affected, which for big zones they might | not be). I don't have to talk to anyone. I can be | sleeping and this will still happen. | | It's a completely different level of abstraction. | | If you want to compare a big cloud provider with either | your own datacenter or a colocation facility, there's a big | disparity in scale. At a minimum, you would have to | compare with several interconnected datacenters or colos. | You still don't get the abstraction layer. | | It's all missing the point though - I was pointing out | that software doesn't necessarily need to have 24x7 | staff, as the parent poster was suggesting, even for | exceptional (but predictable) issues. Sure, you need | someone on-call to handle completely unexpected events, | but I don't think that was the point being made. | jjav wrote: | > There may exist some colo where I can get a server (or | storage, or network cards, or anything else) added in | minutes over an API call, but I haven't heard of any. | That's usually found on the VPS side. | | To be fair, you can't get hardware added in AWS via API | call either. What you can do is spin up | instances/storage/etc via API call, as long as that spare | capacity hardware is already set up, available and ready | to be allocated to you. Which you can also do on on-prem | hardware. | | If you're saying that your utilization is so peaky or | unpredictable that you end up needing an order of | magnitude more, or fewer, resources available day to day, | then you are absolutely correct that provisioning so much | spare capacity on-prem would be prohibitive. This is a | use case where AWS excels. | | But if your utilization doesn't have dramatic peaks and | growth is mostly predictable, then it becomes practical | to provision for it on-prem and it'll be a lot cheaper. | koolba wrote: | It is a different skill though. Going from zero to one for | physical infrastructure is a significant leap in both cost | and operational process. You need to manage inventory, | provide 24/7 physical access, and set up supply chains to | ensure you have ongoing availability. | z3t4 wrote: | You don't have to do everything in house; you can, for | example, buy servers with an on-site support agreement. | Then you just have to buy new servers at regular | intervals; you don't need a guy that can fix a | server with a soldering pen. | | Same for the internet connection: you can buy transit, no | need to become your own ISP. So you don't need people who | deal with peering agreements, etc. | | For electricity you can make a support deal with a local | electrician company.
You don't need a guy who can build | and maintain a custom power supply unit. | | It does help, however, to have someone with basic sysadmin | and network skills. But if you don't have that, you will | sooner or later screw up your AWS infrastructure too. | tomnipotent wrote: | > support deal with a local electrician company | | Maybe if you're wiring a closet in your office, but no | colo facility is letting you within ten feet of their | power infrastructure. The best you're getting is a | rack-mounted UPS. | manquer wrote: | I think OP means setting up your own DC; for colo, most of | this is offered by the DC partner anyway. You would only | go with something else if there is a very good reason not | to use what they offer. | Spivak wrote: | Y'all are making "buy an asset tag printer", "have a rep | from Dell/HP" and "use the data center's remote hands if | you need it" sound crazy complicated. | fwip wrote: | Not always true. Some data is intrinsically bigger than | others. | | If you have a petabyte of chatlogs, sure, you have 24x7 | obligations to millions of people. If you have a petabyte | of astronomy data, you have like 3 research scientists | using it. | Robotbeat wrote: | The research scientists DEFINITELY can't afford to run | petabytes of astronomy data on AWS. Source: am a research | scientist. | fwip wrote: | Oh, for sure. Just that you don't usually have the amount | of dedicated staff that was implied. | namdnay wrote: | Keep in mind that nobody at large scale is paying the sticker | price for AWS (or Google or Azure) | ksec wrote: | > storage clusters at petabyte scale are doing it on AWS. | | I had to double check just in case, but a petabyte is only | 1000 terabytes. It may be big in terms of databases, but | rather small in absolute terms. You could fit a single | petabyte in a 1U server. | | I doubt they pay the listed price. And AWS is now mostly an | enterprise and sales game. So once you add in the other costs | involved in managing it, I would think you need to be at | multiple-rack scale before the costs break down better for | your own hardware. | | And that is excluding the other benefits of sitting inside the | AWS ecosystem. The only thing I think AWS isn't so good at is | the low-cost, sub-$1000-per-month spending scenario, where you | are paying a lot more just for staying inside the ecosystem | for things you may not be using. Those tend to favour Linode | or DO. | toast0 wrote: | > You could fit a single petabyte in a 1U server. | | That seems a bit over the top. I see 18 TB drives available, | but let's posit 20 TB drives, so you need 50 of them. I don't | think you can fit 50 3.5" drives in a 1U space, even if | there's no motherboard or power supply. 50+ drive storage | chassis are generally 4U. I did see some 16-drive 1U servers | though, so I'm pretty sure you could fit that much storage | into 3U even though I also didn't see any 3U storage chassis. | WJW wrote: | The discounts you can get from doing anything at big enough | scale will push your costs back to colocation prices. Don't | assume that anyone with a cloud bill over 200k is paying | anywhere near the price you read on the pricing page. | Robotbeat wrote: | "Will"? I doubt it. Definitely not with that level of | certainty. | tomnipotent wrote: | > certainty | | Considering the number of people here commenting about | costs who have never managed a P&L, I don't think certainty | is high on the list. | FpUser wrote: | Nope. I had a chance to compare what one org had for 600k of | real money after all the discounts.
Not even remotely close | to what one can get for rented dedicated servers. | magicalhippo wrote: | > For these reasons, it's generally recommended not to let ZFS go | past 80% disk utilization. | | There's another reason why you don't want to go beyond 80% | utilization, and that's because the block allocator will switch | behavior to a more involved search, which can take a lot more | time. | | Thus allocating new blocks can get really slow once you get past | 80%. | jhk727 wrote: | Thank you for the clarification - I had heard from a few | sources that the block allocator algorithm actually changes at | higher utilization, but was previously unable to find anything | concrete in the documentation. This helped clear up a | longstanding curiosity for me. | k8sToGo wrote: | Does the problem go away again if you go back to below 80%? | magicalhippo wrote: | After a bit of digging, yes but no. | | So it turns out it's a bit more involved than the commonly | told straight-up "80% == bad" scenario. ZFS by default | divides[1] each vdev (RAIDZ or mirror set) into ~200 | allocation regions called metaslabs[2]. | | When allocating from a metaslab[3] it will check if the free | space _in that metaslab_ is below the threshold defined by | metaslab_df_free_pct. It seems the threshold was changed to | 4% free space at some point[4]. | | If the free space is above the limit it will use the fast | first-fit search; if not, it will use the expensive best-fit | search. | | However, as noted, that threshold is per metaslab. So if the | pool is fragmented, even though the overall free space in the | pool is above the 4% threshold, there might be metaslabs with | less than that free, which will lead to the expensive best- | fit search. | | So it's not a hard limit, but it should start to be | noticeable above 80%. | | [1]: https://www.delphix.com/blog/delphix-engineering/openzfs-cod... | | [2]: http://dtrace.org/blogs/ahl/2012/11/08/zfs-trivia-metaslabs/ | | [3]: https://github.com/openzfs/zfs/blob/master/module/zfs/metasl... | (note metaslab_df_free_pct) | | [4]: https://www.truenas.com/community/threads/zfs-tweak-for-firs... | GhettoComputers wrote: | I've read that any TRIM-supported SSD prevents this, and all | SSDs have extra blocks (some more than others) that aren't | being utilized by default and are designed to replace any bad | blocks. https://www.truenas.com/community/resources/some-differences... | It seems like ZFS might be special, and if you have | larger SSDs they will have different cell blocks and be | faster, because it's 2x256 boards that can be used concurrently | versus 1x256, which will have half the write speed. SSDs also | complicate it further with RAM caches and mixed storage (like | SLC+TLC/QLC), so the SSDs will be affected more if it's a | cheaper drive with no RAM or SLC cache, and smaller sizes with | fewer memory chips. I remember getting the Evo 850 because it | had great firmware with RAM and SLC cache; it was 3D TLC but | its speed was still excellent. | bbarnett wrote: | Good grief. | | Talk of rsync backups on live DB systems. ZFS. On-the-fly disk | encryption. | | All I can think of is, clearly these guys never worked with | spinning disks and large datasets. | | So much headroom to waste with SSDs; people are spoiled today. | laumars wrote: | Nothing you've described there wasn't possible with spinning | disks storing large data sets. In fact, if anything, ZFS is | ideally suited to exactly that scenario.
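To make the metaslab discussion above concrete, here is a toy model in Python - not ZFS code - of the per-metaslab allocator switch. The ~200-metaslabs-per-vdev layout and the 4% metaslab_df_free_pct threshold come from the comment above; the per-metaslab free-space numbers are invented to show how a fragmented pool can hit the slow path even when pool-wide free space looks healthy:

    # Toy model of the per-metaslab allocator switch described above.
    # Illustrative only: real ZFS tracks far more state per metaslab.
    METASLAB_DF_FREE_PCT = 4  # threshold cited in the thread

    def strategy(free_pct):
        """Which search a metaslab would use in this toy model."""
        return ("first-fit (fast)" if free_pct >= METASLAB_DF_FREE_PCT
                else "best-fit (slow)")

    # A fragmented pool: 10% free overall, but unevenly spread.
    metaslabs = [28, 1, 2, 25, 1, 3, 2, 16, 1, 21]
    print(f"pool-wide free: {sum(metaslabs) / len(metaslabs):.0f}%")
    for free in metaslabs:
        print(f"  metaslab {free:>2}% free -> {strategy(free)}")

Here six of the ten toy metaslabs fall below the threshold and would take the expensive best-fit path, even though the pool as a whole still has 10% free.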
| bbarnett wrote: | I assure you, rsync can tank a highly tuned db environment. | 'updatedb', which updates the db 'locate' uses under Linux, | running in its cron, can cause issues. | | It all depends upon how much headroom you have, the type of | io activity, etc. I've operated systems under consistent 80% | io load with massive datasets, on spinning disks. | | Running rsync would be madness on such a system. I know. I | only did it once. | ComodoHacker wrote: | Let me guess, your system didn't have ionice back then? | whitepoplar wrote: | They never said they were using rsync on their datasets. | They're likely using ZFS snapshots. | netizen-936824 wrote: | I'm not sure I understand your comment. Is this extra headroom | a bad thing? | bbarnett wrote: | More of a jealous thing, and astonishment at how much extra | hardware is used to give that headroom. | | Over the years, I've run comparatively larger datasets on | significantly less hardware. | | edit: | | When I switched to SSDs for the first time, to give you a | performance example, on some read queries I saw a 1000x to | 10000x speed improvement. | | This of course was on a read-only secondary running long | reporting queries; no one runs queries of that nature on a | primary. | | SSDs were an insane game changer. | terr-dav wrote: | I think you're feeling envy, not jealousy. ;] | nine_k wrote: | A semi-related story from the ancient past. | | Back in the university days we built an information retrieval | system. It ran on an IBM PC XT, with a 20MB HDD, which was pretty | slow. | | The heaviest queries to the information system involved full | scans. They were too slow, slower than had been agreed with the | customer. | | So we installed a disk compression program, maybe Stacker or | something similar. It ate some of the already slow 8088 CPU, and | some of the scarce RAM. But crucially it compressed the data to | about 50% of the original size. | | This halved the number of blocks to read and, most importantly, | to seek. The query speed increased twofold. We | successfully completed the (tiny) software development contract. | [deleted] | bombcar wrote: | Whenever CPU speeds surpass disk speeds, compression becomes | king; when the opposite happens it dies away. I don't know if | we'll ever see disk speeds compete with CPU speeds again, | however. | peremasip wrote: | This is pretty interesting, because the effects of migrating from | lz4 to zstd were: | | - Total storage usage reduced by ~21% (for our dataset, this is | on the order of petabytes) | | - Average write operation duration decreased by 50% on our | fullest machines | | - No observable query performance effects | | It seems like the better compression ratio and resulting reduced | IO more than makes up for the increased CPU compared to lz4. I wish | they had mentioned the actual effect on CPU. | | Compare to the recent thread "The LZ4 introduced in PostgreSQL 14 | provides faster compression" [0] where the loudest voices were | saying that zstd would not work due to increased CPU. This is a | different layer (filesystem compression vs db compression), but | this article represents an interesting data point in the | conversation. | | [0]: https://news.ycombinator.com/item?id=29147656 | pirata99 wrote: | hey you copied this comment!! | | that's mean >:( | peremasip wrote: | hey you copied this comment!!
that's mean >:( | whitepoplar wrote: | Kinda unrelated, but I have a question for anyone who is | knowledgeable about running Postgres on ZFS... does setting a | large-ish ZFS block size (e.g. 64kB) for use with Postgres | (default 8kB blocks) cause a great deal of write amplification | even with Postgres's `full_page_writes = off`? | infogulch wrote: | This is pretty interesting, because the effects of migrating from | lz4 to zstd were: | | - Total storage usage reduced by ~21% (for our dataset, this is | on the order of petabytes) | | - Average write operation duration decreased by 50% on our | fullest machines | | - No observable query performance effects | | It seems like the better compression ratio and resulting reduced | IO more than makes up for the increased CPU compared to lz4. I wish | they had mentioned the actual effect on CPU. | | Compare to the recent thread "The LZ4 introduced in PostgreSQL 14 | provides faster compression" [0] where the loudest voices were | saying that zstd would not work due to increased CPU. This is a | different layer (filesystem compression vs db compression), but | this article represents an interesting data point in the | conversation. | | [0]: https://news.ycombinator.com/item?id=29147656 | walrus01 wrote: | I wonder what, if any, further improvement would be had by | comparing xz vs zstd. | | Obviously you need a LOT of CPU to throw at xz if you want to | use it. | | zstd is much more optimized for compression at speeds | comparable to traditional gzip. | | I use xz primarily for things that will get compressed for | long-term storage, where the time to create the archive isn't a | really important factor. | | In this test: https://sysdfree.wordpress.com/2020/01/04/293/ | | zstd level 19 wins on time vs. xz levels 5 through 9, but the | ultimate xz compressed file size is definitely smaller. | ncmncm wrote: | If your system experiences periods of greater and lesser | load, then spending whatever load capacity is left over | during periods of lesser load on further compressing its | contents might be worth the bother. | | Perhaps better than stepping to a different compression | algorithm, zstd has multiple levels of compression that might | be used at different times. The advantage there is that the | same decompression algorithm works for all. | | One might reasonably hope that decompression tables may be | shared among multiple of the 64k raw blocks, to further | squeeze usage. | jhk727 wrote: | Author here - it's difficult to provide a single number to | summarize what we've observed re: CPU, but one data point is | that average CPU utilization across our cluster increased from | ~40% to ~50%. This effect is more pronounced during NA daylight | hours. | | Worth noting that part of the reason this is relatively low | impact for our read queries is that the hot portion of our | dataset is usually in the Postgres page cache, where the data is | already decompressed (we see a 95-98% cache hit rate under | normal conditions). We've noticed the impact more for | operations that involve large scans - in particular, backups | and index builds have become more expensive. | TedDoesntTalk wrote: | How/why did you choose Postgres over MariaDB? I am facing | such a decision now. | infogulch wrote: | Hey, thanks for the clarification. That seems like a | worthwhile tradeoff in your case. | | For backups in particular, are ZFS snapshots alone not | suitable to serve as a backup?
Is there something else that | the pg backup process does that is not covered by a "dumb" | snapshot? | jhk727 wrote: | We use wal-g and extensively leverage its archive/point-in- | time restore capabilities. I think it would be tricky to | manage similar functionality with snapshots (and possibly | more expensive if archival involved syncing to a remote | pool). | | That being said, wal-g has worked well enough for us that | we haven't put a ton of time into investigating | alternatives yet, so I can't say for sure whether snapshots | would be a better option. | matsur wrote: | https://blog.cloudflare.com/squeezing-the-firehose/ is our | story of how we moved from lz4 to zstd (with a stop at snappy | in between) in our kafka deployments. Results are/were similar | to what Heap is reporting here. | willis936 wrote: | For anyone like me: home usage workloads are read-heavy, with | files that are predominantly already compressed. Moving to | Zstandard might be interesting to toy with if you have more | compute and disk I/O than network throughput, but the benefits | would likely be smaller. | wanderer2323 wrote: | ... from ZFS (lz4) to ZFS 2.x (Zstandard). | chungy wrote: | It is an upgrade, but don't mistake ZFS 2.x as making Zstandard | mandatory. The default for compression=on is still lz4. | [deleted] | B1FF_PSUVM wrote: | I don't have a dog in that race, but I've seen it said that the | DB architecture itself should be reviewed, because SSDs make it | possible to use databases in higher "normal form", with more | tables that require more lookups, but less data volume. | | E.g. https://drcoddwasright.blogspot.com: _"In a time of SSD, | multi-core/processor, two terabyte memory and Optane App Direct | Mode machines, there is no reason not to build from BCNF data. | Time to do what Dr. Codd demonstrated. Technology has finally | caught up with the maths."_ | otterley wrote: | [deleted] | whitepoplar wrote: | The post is literally about how ZFS compression saves them | millions of dollars. | pengaru wrote: | > The post was literally about how ZFS compression saves them | millions of dollars. | | ... relative to their previous ZFS configuration. | | They didn't evaluate alternatives to ZFS, did they? They're | still incurring copy-on-write FS overhead, and the | compression is just helping reduce the pain there, no? | drob wrote: | Zstandard gets us 5.5x compression. The previous ZFS config | got us 4.4x compression. | | XFS, which we ran on for years before rolling out ZFS, does | not compress. | pengaru wrote: | Thanks for the clarification | nwmcsween wrote: | OK, one thing that stands out here, and please correct me if I'm | wrong: | | > ... multi-petabyte cluster of Postgres instances... blocksize | relatively high at 64 kb ... | | The recordsize should match the PostgreSQL page size, which IIRC | is 8KB; the reasoning for this is that RMW cycles will read 64KB, | modify 8KB, and write out the full 64KB, amplifying writes | 8-fold. | | Also, IIRC PostgreSQL will automatically use TOAST when needed? | jhk727 wrote: | Good callout - we use a higher blocksize than the Postgres page | size because it gives us a much higher compression ratio, at | the cost of some read/write amplification. | | And yes - Postgres will automatically TOAST oversized tuples | and compress the relevant data (if you configure it to do so). | This is much lower impact for us than filesystem-level | compression, as it doesn't affect the main relation heap space | (or any indexes).
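For anyone who wants a feel for the lz4-vs-zstd tradeoff debated throughout this thread, here is a minimal sketch using the third-party "lz4" and "zstandard" Python packages (pip install lz4 zstandard). The sample data is fabricated, repetitive JSON standing in for analytics rows; the ratios it prints say nothing about Heap's measured 4.4x/5.5x numbers, which come from their own Postgres heap files:

    # Compare lz4 and zstd (level 3, the OpenZFS default) on sample data.
    import json
    import lz4.frame
    import zstandard

    # Fabricated, repetitive JSON records as a stand-in for event rows.
    rows = [{"user_id": i % 500, "event": "click", "path": f"/page/{i % 40}"}
            for i in range(20_000)]
    data = json.dumps(rows).encode()

    lz4_out = lz4.frame.compress(data)
    zstd_out = zstandard.ZstdCompressor(level=3).compress(data)

    print(f"raw : {len(data):>9,} bytes")
    print(f"lz4 : {len(lz4_out):>9,} bytes ({len(data) / len(lz4_out):.1f}x)")
    print(f"zstd: {len(zstd_out):>9,} bytes ({len(data) / len(zstd_out):.1f}x)")

On compressible data like this, zstd typically achieves a noticeably better ratio at a modest CPU cost; whether that trade pays off depends on the workload, which is exactly what the CPU-utilization data point above is getting at.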
| nwmcsween wrote: | What about: https://people.freebsd.org/~seanc/postgresql/scale15x-2017-p... | | 16k record size, 2x amplification, and still (?) allows | compression w/ lz4 | jhk727 wrote: | We tested this extensively a few years back. We saw a | compression ratio of ~1.9 with 8k recordsize/lz4, ~2.7 with | 16k/lz4, and now ~5.5 with 64k/zstd. | nwmcsween wrote: | There has to be something better than a potential 8-fold | write performance reduction wrt compression | mikewarot wrote: | My understanding of SSD architecture is that you can't flip bits | in a page; you have to write a page at a time. Thus all SSD | systems stall if they get stuck waiting for erased pages | (erasing takes longer than writing). Thus, a full SSD (which | internally has a few % of spare blocks the customer isn't | supposed to be able to access) is a _slow_ SSD. | | It would seem to me that if you can keep the utilization of the | disk under 80% and support TRIM (which lets the SSD know which | pages can be erased), you should be able to get really high | performance out of them with a copy-on-write filesystem. | Neil44 wrote: | I was about to post the same: you will quickly run out of | ready-TRIM'd blocks at high utilisation, which the article | doesn't mention. | ddlutz wrote: | I wonder how well Postgres holds up for analytical queries. | Most people use Postgres for OLTP; maybe they are running some | version of it that uses a column store? | jhk727 wrote: | Postgres is not designed for OLAP, but you can push it a lot | farther than one would expect with the correct schema and | indexing strategy. See https://heap.io/blog/running-10-million-postgresql-indexes-i... | for a little more detail about how we | schematize for distributed OLAP queries on Postgres at scale. | __s wrote: | They use Citus: https://www.citusdata.com/customers/heap | | Who recently iterated on cstore_fdw to create columnar: | https://www.citusdata.com/blog/2021/03/06/citus-10-columnar-... | | But I don't think Heap's using columnar | jhk727 wrote: | We aren't using cstore_fdw, though we've looked into it in | the past. cstore tables don't support deletes or updates, and | we still rely on updates for some key parts of our write | pipeline. Additionally, we rely heavily on btree partial | indexes, while cstore tables only support skip indexes. | KennyBlanken wrote: | Since the title is clickbaity: they had issues with ZFS due to | too high a blocksize and too high a filesystem utilization, so | they upgraded to ZFS 2 for the Zstandard compression and saw an | improvement. | ziddoap wrote: | Is it clickbaity if they actually _did_ save millions in SSD | costs? What would you suggest as a non-clickbaity title? Just | "Upgrading Our Filesystem" leaves out the important parts (the | why and the results). | | A lot of the time I agree that titles can be quite clickbaity. | But this one doesn't really seem to be... At least to me. The | company upgraded their filesystem and it saved them a bunch of | money. The title feels appropriate. | | If the title were "Top ways to save millions on SSDs!" or | similar, I'd wholeheartedly agree. | jaclaz wrote: | But: | | > Total storage usage reduced by ~21% (for our dataset, this | is on the order of petabytes) | | If a 21% reduction is "millions", they should be spending in | excess of 10 million (per what? week/month/year?), provided | that there is a linear correlation between storage usage and | (failed and needing to be replaced?) SSD costs.
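The recordsize tradeoff in the exchange above (ratio improving from ~1.9x at 8k to ~5.5x at 64k, against up to 8x read-modify-write amplification for 8 KiB Postgres pages) is easy to reproduce in miniature. Here is a sketch using the "zstandard" Python package; the data is fabricated, so the printed ratios only illustrate the trend, not Heap's measurements:

    # Compress the same data in 8k/16k/64k chunks: bigger chunks give the
    # compressor more context (better ratio), but updating one 8 KiB page
    # then forces rewriting a bigger record (worse write amplification).
    import json
    import zstandard

    rows = [{"user_id": i % 500, "event": "click", "path": f"/page/{i % 40}"}
            for i in range(50_000)]
    data = json.dumps(rows).encode()

    PG_PAGE = 8 * 1024  # Postgres page size
    cctx = zstandard.ZstdCompressor(level=3)

    for record in (8 * 1024, 16 * 1024, 64 * 1024):
        compressed = sum(
            len(cctx.compress(data[off:off + record]))
            for off in range(0, len(data), record)
        )
        print(f"recordsize {record // 1024:>2}k: "
              f"ratio {len(data) / compressed:.2f}x, "
              f"worst-case write amplification {record // PG_PAGE}x")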
| slownews45 wrote: | There's a reason I click through to the comments first most of | the time. | chrisaycock wrote: | TL;DR They switched compression from lz4 to Zstandard. | | The latter compresses more (and therefore requires less | storage and less IO), but is slower at decompression. Results | show that query (read) performance did not actually change, | whereas write operations needed only half as much time. Storage | also saved ~20% space. So it was a win-win all around. | LolWolf wrote: | Thanks! | | I love some of the articles, but in this case I definitely went | the "I'm happy for you or sad it happened, but I ain't about to | read all of that" route. | jandrese wrote: | Most of the savings seemed to come from freeing up enough | headroom in each drive to prevent block-collating slowdowns on | write. This smells like a temporary workaround to me, as data | tends to grow over time. | | It probably saved them from having to buy more storage this | quarter, but it is a one-time savings. | turbocon wrote: | To the contrary! This decreases their storage need by ~20%, | which will increase their cost savings over time. Yes, they | pushed off increasing their storage footprint in the short | term, but in the long term they decreased their rate of total | storage growth. ___________________________________________________________________ (page generated 2021-11-09 23:00 UTC)