[HN Gopher] The cost of cloud, a trillion dollar paradox ___________________________________________________________________ The cost of cloud, a trillion dollar paradox Author : StratusBen Score : 118 points Date : 2021-05-27 18:58 UTC (4 hours ago) (HTM) web link (a16z.com) (TXT) w3m dump (a16z.com) | maerF0x0 wrote: | As I read it I couldn't help but think the following: | | Yes, but the benefit of cloud is we only work to optimize those | which gain market adoption. For every Twilio there may be | 100-1000 startups that did not make it; it's a good thing if | those were constructed rapidly, tested for product market fit and | then turned off without optimizations applied. | | Also if cloud adoption actually accelerates building, then it may | also aid its adopters if there is a race to product market fit. | thinkingkong wrote: | It isn't really a paradox at all. It's specifically the opportunity | that companies like Oxide will go after. Basically the stage of | your business determines the best course of action. You'd be | insane to start with large datacenters and high capex for an | undefined ROI for most projects. Similarly you don't understand | your requirements or workloads yet, which would also affect the | efficiency of your architecture. | | Whenever a rewrite happens in software there are usually massive | performance gains. The same thing happens with infrastructure. | sanj wrote: | This may be a naive question, but... | | Given that interactions with the cloud are through a relatively | small set of known APIs, I'm surprised that there isn't a | service which replicates the APIs but with the endpoints being on | "repatriated" hardware. | kenniskrag wrote: | Terraform does that. https://www.terraform.io/ | sanj wrote: | That's clever, but not what I'm suggesting. | | I'm looking for a mechanism which apes existing cloud APIs | directly. | fnord77 wrote: | technologies like Kubernetes make repatriation even simpler in | some cases. 
| deanCommie wrote: | Except you miss out on all the opportunity cost of cloud-native | solutions that get you moving fast for the sake of a complex | convoluted mess that you need to hire subject matter experts | for to wrangle, instead of focusing on your business logic. | | Kubernetes only makes sense from Google's perspective to sell | you the managed version "from the creators of Kubernetes!" once | you get tired of trying to wrestle it into submission. | newintellectual wrote: | "We show (using relatively conservative assumptions!) that across | 50 of the top public software companies currently utilizing cloud | infrastructure, an estimated $100B of market value is being lost | among them due to cloud impact on margins -- relative to running | the infrastructure themselves." | | That completely overlooks the actual benefit and the reasons | driving cloud adoption, which he states early on and then fails | to integrate here. The real cost/benefit analysis would take into | account the amount of time, money, and opportunity cost saved by | using a cloud in development, a critical time that determines | whether there'll actually be a profitable company eventually. | Optimizing the cloud costs is certainly important, but being able | to spin up highly integrated systems on demand offsets a huge | amount of time and capital during development. | | A takeaway might be that, e.g., AWS, should offer even larger | discounts for large scale operations in order to retain mature | customers. | [deleted] | tibiahurried wrote: | Many underestimate the complexity and the headache that come | with managing your own data center. It is easy to get a couple of | racks up and running. Then you have to deal with: | | - HW that fails, go get a new one, cost + time | | - backup and geographical redundancy | | - security | | - certifications | | - tooling to manage and forecast | | - black friday! scale for a few days ... more HW? | | - etc ... 
| | It is way more convenient for the average enterprise to pay a | premium and be done with it. So they can focus on the actual | business and not building infrastructure and tooling to manage | it. | peterbonney wrote: | Am I alone in not seeing a paradox here? It's no different from a | variety of things that make sense at some scales and not at | others. | | Seems similar to office space - a 5 person company will find the | economics of coworking space compelling, a 500 person company | will do better with a long-term lease with a commercial landlord, | and a 50,000 person company might have their own property | management team in-house. That doesn't create a paradox in the | commercial real estate market, just different solutions for | different needs. | dinoreic wrote: | I think the point the author tried to make is that a ton of big companies | that would/could benefit from an in-house solution still go for a | full cloud-based one without thinking about alternatives. | nine_k wrote: | The paradox is that if the coworking space were the cloud, it | would just grow with you, and easily house a 50k-strong | company, while you'd be paying the same (high) rate per | employee. | | Why would this happen? Because what the company builds is a | large and growing money-making contraption right inside the | coworking space, also using parts of the building for critical | functions, and the door is intentionally kept small. The only | way to move that contraption is to dismantle it and reassemble it | in a much cheaper, purpose-built hangar, rebuilding some of the | critical parts along the way. During all that time, the | contraption would stop making money. | | That's the beauty of AWS business model: it's a no-brainer for | a startup to use it, but as the startup grows, more financially | efficient infrastructure options become unattainable because of | the very high cost of the move. This is the best-executed | vendor lock-in I know. 
| JumpCrisscross wrote: | The difference is the bottleneck in skilled cloud | practitioners. There aren't that many people who can deploy | these systems at scale, and they are best rewarded at the | centre of the system. So the system centralizes. Enterprises | would love to duplicate AWS, but they don't have the ability | to each hire similar calibre talent and keep them engaged on | their pet clouds. | 908B64B197 wrote: | DropBox did it [0]. | | [0] https://techcrunch.com/2017/09/15/why-dropbox-decided-to- | dro... | gonzo41 wrote: | Twitter was on a laptop, then it was in ruby, then it was in | java. Companies grow and change, and if you're seriously | scaling, then your behavior in different environments needs | to adjust. It's simply a strategic planning issue for the | CIO. | throwawayboise wrote: | If moving to another cloud provider, or on-prem, has a "very | high cost" then you're doing it wrong. | abraxas wrote: | I don't think you ever worked on a large enough application | to appreciate this complexity. | scarface74 wrote: | Have you been part of any large migration? Even if you try | to stay "cloud agnostic" (no, using Terraform doesn't count. | Each provisioner is cloud specific), everything takes time | at scale. Not to mention data migration costs, project | management, regression testing, a company of any scale is | probably still partially on prem and has all sorts of | interconnecting network points with the cloud provider, | IAM, etc. | throwawayboise wrote: | Of course any migration has costs. But if they are higher | because of deep entanglements in specific cloud vendor | services, that is a vulnerability. It's already | demonstrated that the big cloud providers will cut you | off with little warning if you run afoul of their | political sensibilities. So the only way to use those | services is in the most generic way you can, so that | migrations don't cost more because of it. 
| okareaman wrote: | This seems like a well known trap that people would know how | to avoid with a little planning, but I guess they have no one | thinking that far ahead | taneq wrote: | The ones thinking that far ahead aren't using AWS in the | first place. | pie420 wrote: | A startup worrying about long term cloud implications is | like a hot dog vendor in New York worrying about how his | branding will be received in India when his hot dog empire | expands there. 99.99% of startups will never have to worry | about migrating off cloud, and if they do, they will have | succeeded beyond their wildest dreams, and will be once | sequential. Nobody is losing sleep over the prospect that | their startup will only be worth 22.4 billion in 17 years, | but if they eschewed the cloud and built in efficiencies | from the get go the company would be worth 23.7 billion. | Not to mention the fact that worrying about long term | things like that would take focus off the present and | increase chances of failure. | sneak wrote: | There's also the argument that some of the cloud advantages | (such as AWS proprietary services like SQS) simply aren't | directly available as selfhosted replacements. You end up | needing a whole new team to do HA/scaling for a core | service. | | There are off the shelf things for, say, S3 and ALB which | are entirely workable, but once you start getting into more | complicated stuff (like S3's new consistency semantics, or | SQS) then you're looking at a whole small company (at a | minimum) worth of additional work. It's a non-trivial | expansion, even for a large org with lots of money. | | You can avoid using these sorts of services to maintain | flexibility/independence, but you lose out on their unique | benefits. This isn't an accident. It's not like you're | going to be able to selfhost an SES clone and get the same | kind of deliverability percentages as AWS netblocks, no | matter how many engineers you throw at the problem. 
| zozbot234 wrote: | > such as AWS proprietary services like SQS | | SQS is a message queue service. It's a bit weird to claim | that as "simply aren't directly available as selfhosted | replacements", a message queue is pretty basic stuff. | sneak wrote: | SQS is "just a queue service" the way that S3 is "just an | object store" and the iPhone is "just a smartphone". | | I'm super into self-hosted shared-nothing queue services, | but what AWS is doing with SQS is anything but basic | stuff. | rualca wrote: | > There's also the argument that some of the cloud | advantages (such as AWS proprietary services like SQS) | simply aren't directly available as selfhosted | replacements. You end up needing a whole new team to do | HA/scaling for a core service. | | It's not limited to some. You absolutely need a whole new | team (or teams) to handle your infrastructure, your high | availability needs, and also your security. | | The nifty serverless offerings and other features are | just nice-to-haves in comparison to the core | infrastructure work that cloud providers put into your | system to keep it running. | | Just because cloud providers like GCP, AWS, Azure, | etc. have everything put together to let you set up | your whole infra by running a small script, that does not | mean nothing needs to be done in order for that to work. | toomuchtodo wrote: | There are ways to engineer around this, but it's going to | depend on balancing risk management with business | agility. Maybe you use Backblaze for object storage | instead of S3. Maybe you use Cloudflare for CDN and | Serverless at the edge. Maybe you use Mailgun or another | SES competitor. Containers let you use managed k8s off | the shelf anywhere. Lots of options for technologists, | PMs, or CTOs to pick from. | sneak wrote: | Graviton2? Aurora? S3 global read-after-write | consistency? Glacier? SQS's ability to scale? | | Some of the goods you just can't get anywhere else, and | they know it. 
| rad_gruchalski wrote: | It can be done. Putting the right abstractions in front | of these services makes it possible to migrate and | replacements exist. Okay, granted, there will be | complexity in self hosted infra but it is possible. | sneak wrote: | I don't really think anyone with anything less than an 8 | figure budget can get the perf/watt of Graviton2. They're | literally not available for sale - AWS had them fabbed | and then kept all of them to rent out. | wmf wrote: | It's called Ampere Altra. Maybe you should have picked a | different example like Aurora... | [deleted] | tw04 wrote: | >This seems like a well known trap that people would know | how to avoid with a little planning, but I guess they have | no one thinking that far ahead | | But you see part of the magic trick was Amazon spent years | having companies like Gartner proclaim that cloud was | cheaper, and nobody could possibly do on-premises IT | cheaper because of economies of scale. As a result you've | got CIOs everywhere trying to make a name for themselves by | driving their company to the cloud to show immense cost | savings. They don't have time to be bothered by the actual | financial models or real costs. By the time finance | realizes what happened they'll be on to the next gig. | | It was honestly brilliant, the number of "cloud first" | strategies that originated in board rooms filled with | people that don't know the first thing about IT is kind of | disgusting. | toyg wrote: | _> I guess they have no one thinking that far ahead_ | | Small businesses don't care because they are too busy | surviving. | | Medium businesses don't care because why break what works? | | Large businesses don't care because their managerial | careers promote short-term priorities and marquee | "transformation" projects which are buzzword-driven, and | cloud is the current one. | emteycz wrote: | So what is the problem exactly? | void_mint wrote: | > Am I alone in not seeing a paradox here? 
It's no different | from a variety of things that make sense at some scales and not | at others. | | Yes, it's not a paradox at all. Use this thing until it doesn't | make sense, and then don't use it anymore (or be comfortable | with the lack of optimization). This is not a paradox. | lotsofpulp wrote: | I assume the article gets more attention if it uses the word | paradox. | Agingcoder wrote: | The paradox is that in spite of the huge costs, very large | companies are still moving to (or using) the cloud. | | I've had this discussion many times where I work, and we've | refrained from using cloud services since they are | extraordinarily expensive compared to what we can do internally. | | Yet, people seem to be convinced that prices are ok, and that | going outside will let them _reduce_ their infrastructure | footprint. | | For some reason I don't quite understand, people tend to be | attracted by the cloud, even when it makes no sense | economically whatsoever, and talking sense into them is | surprisingly difficult. | | So yes, it seems obvious that people will choose the most cost | effective solution. However, it seems like they don't! | tibiahurried wrote: | The same reason big companies pay a premium for external | consultants instead of directly hiring. You don't want to | deal with employees, just as you don't want to deal with | HW, data center, and all that comes with it. You know you are | paying a premium, but you are also delegating tons of headaches | and responsibilities. | taneq wrote: | The mining companies I work with often hire equipment semi- | permanently instead of buying it. It's an insane cash crop | for the hire companies because the payback time on the | equipment is a few months and then it's pure profit, but | apparently a $700/week car rental payment is easier to get | through accounts than a one off $50k purchase. 
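For the car figures quoted above, a rough break-even point can be worked out directly. This is only a sketch: it ignores maintenance, financing, depreciation, and resale value, any of which would shift the result, and the function name `breakeven_weeks` is illustrative rather than taken from the thread.

```python
import math


def breakeven_weeks(purchase_price: float, weekly_rental: float) -> int:
    """Whole weeks after which cumulative rental reaches or exceeds the purchase price."""
    return math.ceil(purchase_price / weekly_rental)


# Figures from the comment: a one-off $50k purchase vs. a $700/week rental.
weeks = breakeven_weeks(50_000, 700)
print(f"Renting overtakes buying after about {weeks} weeks ({weeks / 52:.1f} years)")
```

On these numbers alone, renting costs more than buying once the vehicle is kept for longer than roughly a year and a half, which is the opex-over-capex trade the comment describes.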
| tibiahurried wrote: | Well I guess by renting they can get the latest and | greatest machine available instead of dealing with an old | machine bought say 5 years prior. If the machine breaks, | well you just get another one. No maintenance cost for | the company. I mean it depends. Buying is not always the | best thing to do. Renting may be more convenient | sometimes. | nemothekid wrote: | > _The paradox is that in spite of the huge costs, very large | companies are still moving to (or using) the cloud._ | | In my experience when a large corporation moves to the cloud | it has less to do with pricing and more to do with flexibility. | It's easier to get a budget from IT and do whatever you want | in the cloud than to have to wait weeks/months putting in a PO | for hardware, getting access to those machines and installing | what you want. | throwawayboise wrote: | This is true. Where I work a PO was issued several weeks | ago for a rack of servers, and delivery will be in about 3 | months. If cost were no object, those resources could be | provisioned with a cloud provider today. And getting from | requirements to RFQ to PO took another 3-4 months. | sanderjd wrote: | People are attracted to it because owning stuff sucks. | Renting stuff is more expensive but sucks less. This is why | businesses often prefer opex to capex even if it is pretty | significantly more expensive. It isn't just an IT phenomenon. | mint2 wrote: | Some companies have terrible IT outsourcing and procurement | practices such that getting something comparable to aws | infrastructure is effectively not possible, as it would | require massive internal change whereas moving to the cloud | is easy to get buy in. | oblio wrote: | Also regular devs don't realize but frequently internal IT | departments charge obscene amounts of money for managed | services. 
| | Devs look at Digital Ocean or Hetzner VM costing $10 for a | ton of RAM, storage and bandwidth when the same thing | internally in a bank or other big enterprise can cost | $100-150. AND be delivered in 3 months, if it's ever | delivered. | rualca wrote: | > Yet, people seem to be convinced that prices are ok, and | that going outside will let them reduce their infrastructure | footprint. | | That's where you get it all wrong. It is not about the price, | at all. It is all about avoiding large upfront capex with a | small periodical opex, and in the process have virtually | boundless growth potential. | | Think about it: when you happen to need a bit more | computational resources to run an app, is it easier to | convince your boss of the need to pay, say, 100EUR/month for | extra VMs, or is it easier to convince your boss to shell out | $2k to buy an extra rack? And how many times are you willing | to have that same conversation with your boss whenever you | run out of computational resources? | taneq wrote: | Aren't the real figures more like $200 vs. $700? Cloud | stuff only really makes sense when your resource | requirements are super spiky. | | 100% agree on shifting capex to opex though, capital works | teams are super in favour of that, oddly enough. | karmakaze wrote: | A paradox is only at the most coarse-grained level. Cloud makes | sense because of its cost-efficiency in scaling from nothing | upwards. Then fails because of cost involved in keeping it | running at scale. The thing to recognize is that what cloud is | good for is rapid change. Once you have something stable, even | at not-huge scales, bare-metal hosting may still make more | sense. | | All this happens willingly unless growth happened with your | eyes closed, or really had no choice keeping up with it. Sure, | use the cloud, but don't use every cloud-proprietary | convenience. Use opensource software in a near-standard way. 
| Even if using the cloud-vendor's offering, refrain from using | proprietary extensions. If your plan is to grow and get out | before things stabilize, then that's on who's come onboard, not | resisted the conveniences, and remaining. | devops000 wrote: | I don't think it is a paradox; it is a standard "make or buy" decision | which is different based on your company's size and growth phase. | sbazerque wrote: | It's not a purely technical matter, there are different market | dynamics at play as well. | | If you go cloud, you get a handful of very large suppliers, that | provide a lot of non-standardized services that are probably | going to lock you in like hell (as the article well says). | | If you build infra, you're using /mostly/ commoditized supplies | and skills. | | If the cloud offerings were interchangeable and the industry | reasonably fragmented, the margins of cloud providers would be | slimmer and the paradox would probably go away (in favor of: | always cloud!). | | See for example Porter's analysis framework [1] and how your | positioning changes in the two cases. | | https://en.wikipedia.org/wiki/Porter%27s_five_forces_analysi... | SirOibaf wrote: | A bit of a weird article coming from a VC heavily invested in | SaaS providers | wmf wrote: | They're basically talking about helping SaaS providers reduce | their IaaS spending. They're not talking about migrating off | SaaS. | spoonjim wrote: | I couldn't tell, which onprem portfolio company are they pushing | here? | trjordan wrote: | The opportunity cost is staggering. Not just of the people that | do the work, but the cost of focus and top-level priorities over | the years it takes to do the work. | | Everybody loves to cite DropBox here. The greater arc of DropBox | is their product stalled, they lost the enterprise of their | market to Box, and they found themselves a commodity in a | commodity market. 
Heck, maybe they'll go back to Amazon if they | keep doing things like this: | https://aws.amazon.com/solutions/case-studies/dropbox-s3/ | | But that's not all. It takes a top-level directive to repatriate | an entire SaaS. That's at the cost of other top-level projects. | It's wild to me that any company that has significant fuel left | in the tank would buy back single-digit COGS percentages instead | of investing in product that could add double-digit growth for | several years at scale. | kelp wrote: | What I know from talking to former Dropbox insiders is Drew | really wanted to be running a consumer company. He really | resisted a push to SMB and Enterprise, even though, at least to | me, that seems like the obviously better business. | | So not sure if it's so much that the product stalled, but that | leadership actively didn't want to move in that direction for | too long, then lost the advantage. | | Instead they spent a bunch of energy on Carousel and Paper. (I | don't have any idea how much of their eng team was working on | those things) | mxschumacher wrote: | slightly tangential, but Box, a storage company that did take | the business route, is about 10% more valuable than it was at | its IPO in early 2015. That is pretty bad in comparison to | most other public software firms that grow like weeds. | | Dropbox is worth about 3x Box as of today | [deleted] | redis_mlc wrote: | > things like this: https://aws.amazon.com/solutions/case- | studies/dropbox-s3/ | | Yeah but why do they have 34 PB of analytics data? | praseodym wrote: | This makes me wonder whether 34PB (+1PB per month) of analytics | data creates more value than it costs to store and process all | of it. I'd think that storing only aggregated information would | provide nearly as much business insight to make strategic | decisions at a fraction of the cost. 
| [deleted] | liminal wrote: | I think they understated the importance and difficulty of | retaining a sufficiently redundant skilled workforce to manage | the equivalent cloud infrastructure. | StratusBen wrote: | Disclaimer: I'm a Co-Founder at Vantage, a cloud cost platform. I | also worked in public cloud for ~6 years at AWS, DigitalOcean, | etc. | | While repatriation can make sense at a larger scale company, | startups and SMBs can yield the same benefits discussed in this | post by simply tracking and optimizing cloud spend. | | We try to make this as easy as possible for people with | https://www.vantage.sh/ - where we're already helping thousands | of individuals, startups, SMBs and enterprises as it relates to | AWS. | lowbloodsugar wrote: | "But that's just Dropbox." | | The implication here being, "Well, if Dropbox can save this much, | then think about how much everyone else can save!" But in fact | the opposite is true. Dropbox sells disk storage on the cloud. | For them to do so by effectively _reselling_ someone else's | disk-storage-on-the-cloud platform is obviously not high margin, | and they'd be better off building it themselves. So, sure. Anyone | else also offering disk storage on the cloud, or compute | clusters on the cloud, or otherwise just reselling someone else's | product on the cloud, will probably have higher margins doing it | themselves. But that is certainly not most companies. So, yeah, | it's "just Dropbox". | | Disclaimer: I work for a cloud provider, but these opinions are | my own. | throwawaaarrgh wrote: | These people don't see the real value proposition of the cloud. | The value is not in "scaling when you need to". Sure, that can be | very handy. But that is not what 99% of companies are getting out | of the cloud 99% of the time. | | If you self-manage, your capital investment is initially higher, | and lower over time. 
At the same time, the effort it takes to | reach the same results is _always_ higher, and the quality of the | end product _may_ be lower, depending on how much service quality | affects your product. | | If you pay for managed services, your initial investment is | lower, and higher over time. But at the same time, you require | less effort, and you get higher quality outcomes. | | This is obvious to anyone who has worked in the industry and done | both. First, host your own service: JFrog Artifactory, Atlassian | Confluence, GitLab, whatever. Now rapidly increase the demands on | this service. As demand rises, quality will decrease, because it | takes a lot of time, effort and expertise to build a very | reliable hosted service. Now switch to a managed cloud instance. | Suddenly, the service's average quality increases. Performance is | steady regardless of increase in use. There are virtually no | interruptions to your product or development. | | The impact of a service's quality and reliability has ripple | effects. If poor service quality slows down development, that | means development quality will go down as people cut corners to | try and meet deadlines. If the service is used for production, it | means your product's quality will suffer, and that affects your | bottom line. So a huge amount of the actual cost is not just | paying for a service, but also how much business value is | generated or lost due to service quality. | | There is simply no way to replicate a managed service without | becoming a managed service provider yourself. You have to become | a whole new business within a business. It's like a yogurt | company also becoming a dairy farm. Running a farm is not easy, | and you will screw it up for several years. Seems obvious for | farming, but for some reason people always underestimate this | when it comes to technology. | | On paper, the Cloud's value proposition is scalability. 
But in | practice, the true value is actually as a force-multiplier for | your product's quality, reliability, and time to market. (Time to | market not just being "I launched my startup" but also "I | released this new feature before my competitor") | lifeisstillgood wrote: | One benefit of "the cloud" not mentioned here is _elapsed time to | acquire a new instance_ | | If you have moved to AWS / other then that time is around 5 | minutes. | | If you are in a major fortune 500 and need a new server, quite | often that time will measure in months (yes really). | | This simple equation just blows every other cost/benefit | calculation out of the water. | | I may have missed it in the discussion. | toomuchtodo wrote: | If you need that server today, absolutely, build an AMI and | fire it up, and then place an order for a server to migrate to. | It's not binary, and it is possible between engineering and | finance for balancing velocity and spend. | | Not everyone needs a server immediately. And not every use case | will generate business value by having that server immediately | available. | mtnygard wrote: | Not trying to peddle cheap skepticism here. | | This article only looks at "seen" costs, and assumes that there | are no "unseen" costs to running on-prem. Many companies do not | have the operational maturity to run on-prem well. The result: | high cost of operations, low availability, and large increase in | time-to-value. | | Second unseen cost: everybody becomes their own SI. So far nobody | really sells the "whole stack" for running on prem. I mean | hardware, network, virtualization, application, traffic mgmt, | etc. I have to buy stuff from two dozen different vendors and | cobble it together into a high-labor, rickety Jenga tower of | stuff. | siddarthd2919 wrote: | You nailed it. Unseen costs are huge and a lot of people ignore | it. | [deleted] | jandrewrogers wrote: | These costs are not unseen. 
They are straightforward ordinary | costs and fully accounted for. It only seems expensive and | complicated from the perspective of someone that does not have | experience with it. To someone that has done it many times | before it is relatively simple and almost completely mechanical | to get an excellent result. It isn't rocket science, just | domain expertise around things like physical infrastructure | planning and supply chains that a software developer may not | have encountered before but could easily learn. | | If a company was going to run their own data centers, they | would presumably hire someone that knows what they are doing to | lead the effort instead of trying to do it by trial and error. | rualca wrote: | > It only seems expensive and complicated from the | perspective of someone that does not have experience with it. | | Isn't that the target audience for this sort of change? | | I mean, do you see teams of networking, siteops, high | availability, and security engineers hanging around doing | nothing and just waiting for a company to decide to go in- | house? | | No, because those teams do not exist in free-range. That's | something a company needs to build and train and experiment | from the start until they are able to learn all the lessons. | oblio wrote: | > Many companies do not have the operational maturity to run | on-prem well. | kelp wrote: | Also in my experience, on-prem infra, on relatively cheap | gear like Dell servers is highly reliable. | | Yes, you have to plan, yes you have long lead times to get | new gear, DC space, etc. But once it's running, failure rates | are pretty low. At least in the single digit thousands of | servers. | | I've had to deal with much more frequent and odd types of | failures on cloud infra than with on-prem. | gopalv wrote: | In 2012, I was working on a migration from EC2 to on-prem & was | working on infrastructure building for zCloud. 
The cost factor | was huge, the switch out of EC2 literally paid for itself in a | lot of ways. | | The work was very interesting, because a lot of it was actually | building a private cloud for internal customers & the work | primarily centered around a virtualized data-center aimed at | boom-bust cycles of games (15+ million users for 6 weeks, drop to | 2 million for a month and down to a million in another week). | | The issue is that the infrastructure cost is somewhat constant | when dealing with that sort of fluctuation in revenues, so the | cost to revenue ratio was unpredictable (while the cost was). | | So what happened in the end looked a lot like a fire-sale of | hardware when the cost was unbearable, while if it was an end- | user cloud, that low-demand phase would be able to cut losses as | a spot instance or something. | | Anyway, a few years after I left, back to EC2 it is[1]. | | [1] - https://aws.amazon.com/solutions/case-studies/zynga/ | jsnell wrote: | Why is the infrastructure cost constant? I would have thought | that for gaming, at least compute and transit would scale | linearly with traffic. If games have a boom bust cycle, aren't | they the perfect use case for public clouds? | gopalv wrote: | The infra was on-prem, so it was planned in advance and ops | folks who had on-call rotations every day etc & that was | rented out by the hour to the game teams (a chargeback | model). | | The salaries and hardware costs were paid for even if the games | had a bust. | | The period where it worked well, the games had boom-bust in | somewhat controlled fashion where farmville -> cityville -> | frontierville -> fishville etc, the traffic would move around | rather than die down entirely. | | The world turned mobile-heavy and that whole pipeline fell | apart while they were restructuring into mobile games (words | with friends etc), when the hardware had to be sold or the | payments would start to hurt. 
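The effect described above, a flat on-prem bill against revenue that follows the audience, can be sketched numerically. Only the user counts (15M, 2M, 1M) come from the comment; the revenue per user, the fixed on-prem cost, and the per-user cloud cost are hypothetical values picked purely for illustration.

```python
# Hypothetical figures for illustration only; just the user counts
# (15M, 2M, 1M) are taken from the comment above.
REVENUE_PER_USER = 0.50        # assumed revenue per active user per period
FIXED_ONPREM_COST = 1_500_000  # assumed flat cost of owned hardware plus ops staff
CLOUD_COST_PER_USER = 0.20     # assumed usage-based cloud cost per active user


def cost_to_revenue(users: int) -> tuple[float, float]:
    """Return (on-prem ratio, cloud ratio) of infrastructure cost to revenue."""
    revenue = users * REVENUE_PER_USER
    onprem_ratio = FIXED_ONPREM_COST / revenue          # cost does not shrink with users
    cloud_ratio = (users * CLOUD_COST_PER_USER) / revenue  # cost scales with users
    return onprem_ratio, cloud_ratio


for users in (15_000_000, 2_000_000, 1_000_000):
    onprem, cloud = cost_to_revenue(users)
    print(f"{users:>10,} users: on-prem {onprem:.2f}, cloud {cloud:.2f}")
```

Under these assumed numbers the owned hardware is the cheaper option at peak, but its cost-to-revenue ratio climbs past 1.0 once the audience collapses, while the usage-priced ratio stays flat.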
| jsnell wrote: | Ah, I think I misread your original post. I thought you | were saying that the infra costs were constant on EC2 too. | Jedd wrote: | At around the same time I was working at a startup in London. | We had two sizeable clusters - Hadoop and Cassandra - neither | of which was conveniently elastic, but were located on EC2. We | had sundry other instances, but those were the major culprits | for our ~ £25,000 / month. | | It took me about a day to scope out a BOM that would well | surpass that (maybe a year's expected growth) of whitebox | server gear, and get some quotes from nearby co-los. | | IIRC hardware capex was about £15,000, and monthly rack + | network something around £3,000. We relocated within a few | weeks. | | One small bonus was predictable & consistent performance -- | back then, the EU-west EC2 offerings were _extremely_ sensitive | to noisy neighbours, and our benchmarks never gave anywhere | near the same results twice. | | I think for genuinely elastic loads, or if you're really | addicted to some vendor-only services, it certainly makes | sense. I suspect most customers are overly optimistic about how | elastic their requirements are, and their stack can be. | api wrote: | This isn't surprising at all. This industry is incredibly fad | driven. Cloud became the fad, and the buzz was that cloud saves | money, so everyone goes cloud, and then cloud eventually costs | more than the original stuff did. | | The same is happening with SaaS. SaaS eliminates the need for in- | house IT! Except it doesn't. It just means you now have a bunch | of recurring SaaS costs that you are locked into forever because | they have your data and you still need IT people to babysit your | massive cloud/SaaS sprawl. | | Not following fads and buzz is a _huge_ competitive advantage in | this industry. Founders and chief engineers / CTOs / CIOs take | note. Just make sure you can explain why you are not using | (insert latest buzzword here).
| | The bottom line is that you should analyze the situation using | _your_ work load, _your_ numbers, _your_ culture, etc., and | decide what works the best. Sometimes that's managed cloud. | Sometimes it's unmanaged cloud. Sometimes it's bare metal. | Sometimes it's on-prem. Your mileage will vary. | bsder wrote: | > Founders and chief engineers / CTOs / CIOs take note. Just | make sure you can explain why you are not using (insert latest | buzzword here). | | 1) It takes amazing intestinal fortitude to fight back against | the fad tide. And, when things go wrong, _your decisions_ get | the blame even if they aren't at fault. | | 2) As the article points out, cloud is _FINE_ for startups and | small companies. If my startup company reaches gigabucks in | revenue and I now have to worry about $75 million in cloud | spend, I've _done my job_. And then some. | | 3) Cloud is often about _blame and liability transfer_. I don't | want the company website getting hacked to be my problem--I | want it to be _somebody else's_ problem. I'm willing to pay | for that. | dageshi wrote: | Cloud is just a big toolbox of premade tools and resources that | you can play with to your heart's content with much of the | friction taken out, both technical and bureaucratic. | | You pay a higher price for what you use than if you maintain | those tools yourself... but you don't have to maintain those | tools yourself. It's hardly a buzzword, it's almost de facto | standard nowadays. | calvinmorrison wrote: | This. On prem required me to do so much ITIL work to get even | a switch replaced, now I have a budget on azure I can do | whatever the fuck I want | betaby wrote: | and then this happens https://twitter.com/pragmaticandy/stat | us/1168916144121634818... | dageshi wrote: | sure, can happen to anyone. Still easier to shunt massive | amounts of data between aws regions via Amazon's pipes | than it is to do it yourself. | api wrote: | I think this is an important point.
In many cases the | benefit of cloud and SaaS is working around your IT | department. It's a technical way of going around human | management problems. | void_mint wrote: | I feel like this assessment lacks a lot of nuance. I wouldn't | really call "cloud" a "fad" as much as an overused option. | | > Cloud became the fad, and the buzz was that cloud saves | money, so everyone goes cloud | | Cloud services _do_ save money for businesses of the | appropriate size (meaning, those that can't or shouldn't be | focusing on physical servers and networked hardware). | | > The same is happening with SaaS. SaaS eliminates the need for | in-house IT! | | Again, SaaS _does_ eliminate the need for in-house IT _for | certain classes of businesses_. | | > It just means you now have a bunch of recurring SaaS costs | that you are locked into forever because they have your data | and you still need IT people to babysit your massive cloud/SaaS | sprawl. | | The cost of employing someone to babysit a SaaS product is much | lower than the cost to employ someone to build and maintain an | equivalent in-house SaaS product. If you're fine with vendor | lock-in, you're saving money. | | > Not following fads and buzz is a huge competitive advantage | in this industry. | | Ignoring all nuance and claiming things that are popular are | "just fads" is, to me, so much of a competitive disadvantage | that it almost certainly outweighs any perceived gain. | jandrewrogers wrote: | This has been widely known for many years. The crossover point | happens much sooner than I think people intuit but most companies | never really measure or model it. My experience at a few | different companies is that DIY infrastructure pencils out at | 30-40% of the cost of the cloud. | | There is a learned helplessness when it comes to companies | running their own data centers that has become widespread over | the last decade.
What used to be a fairly mechanical process has | almost been mythologized as some kind of arcane art beyond the | technical ability of any company that isn't Google or Amazon. | Designing data center builds isn't difficult, it is a pretty | straightforward albeit detail-oriented blue-collar engineering | skill, but it seems few people learn it anymore. | dilyevsky wrote: | Yea pretty soon we'll have infra engs that had gone through | their whole career without ever working a real server rack. I | personally find that thought discomforting | scarface74 wrote: | I think people who started out as assembly language | programmers thought the same thing. Everything gets | abstracted at some point. | dilyevsky wrote: | The problem with that comparison is I don't have to pay big | co to be able to write code in C++ | oblio wrote: | No punch cards?!? | paxys wrote: | This analysis completely skips over the "elastic" aspect of cloud | infrastructure, which was a key motivator for using these | providers since the very beginning and is as relevant today | regardless of company size. | | My company (which is in fact part of the charts in the article) | had every single metric across the board spike 15-20x basically | overnight when the pandemic started last year in March. Our | entire infrastructure burden was clicking a few buttons on the | AWS console and making sure everything was provisioning and | scaling as needed. If we had to send out people to buy hard | drives and server racks at that time, there is no chance we would | have been able to meet the extra demand. | | Plus, if you give me a few dozen capable engineers today I'm not | going to waste their efforts on rebuilding AWS to get a best-case | few percentage point return on our cloud spend. I'll launch a new | product for our customers instead. | LimaBearz wrote: | That's a valid point. 
I don't want to argue pedantry but I'll | say I worked at some media companies that experience 5x normal | volumes a few times a year during events before returning to | "normal", making AWS an obvious choice. | | I've also worked for a place so large in AWS we hit walls where | we hit Amazon's literal physical limit and had to wait for them to | go out and buy the hardware for us to provision. | | For Dropbox specifically I don't see them being any form of | special case, if it saves money and they obviously had | operational experience so it makes sense | matchagaucho wrote: | From a Moore's Law perspective I'd like to see a true cost of | ownership over time, as most infra goes obsolete in 18-24 months. | | Public clouds, like AWS, have cut their storage costs by more | than 50% since Dropbox built their own infra in 2015. | kelp wrote: | In my experience it's pretty common to stretch your infra out | much longer than 18-24 months. Often finance has a 3 year | depreciation schedule, and depending on your needs, you can | keep existing workloads on existing hardware for many more | years. | | The last time I ran physical infra, we had server racks that I'd | originally bought, running in production 5+ years later. | | Once you're past the depreciation schedule, they are basically | free except for power and space. Yeah, at some point and scale, | it can make sense to get new gear that is more power and space | efficient to pack more into the same power and cooling budget. | | AWS still lets me launch c1-c6 instance types. So they still | have the older generations sticking around. Yeah, the newer | ones are usually more cost effective, but you do have to do | work to migrate to them. | pintxo wrote: | > [...] paradox: You're crazy if you don't start in the cloud; | you're crazy if you stay on it. | | > So what can companies do to free themselves from this paradox?
| As mentioned, we're not making a case for repatriation one way or | the other; rather, we're pointing out that infrastructure spend | should be a first-class metric. What do we mean by this? That | companies need to optimize early, often, and, sometimes, also | outside the cloud. When you're building a company at scale, | there's little room for religious dogma. | bcantrill wrote: | If it needs to be said, this is more or less exactly the thesis | behind Oxide[0], and matches what we are seeing in the market. | It's certainly validating to see a VC firm echo our pitch deck | back to us, even if one that (in)famously doesn't believe in | hardware! ;) | | [0] https://news.ycombinator.com/item?id=27294471 | SiVal wrote: | Yes, but what I'm not seeing in their analysis of variable-cost | cloud vs cheaper fixed-cost colo is the huge fixed cost of the | personnel to manage it. Yes, I agree that switching my food- | delivery business from calling for an Uber whenever I get an | order to buying my own car can be much cheaper, but if I leave | out the cost of hiring a full-time driver for my new car, my | analysis is...incomplete. | | Your pitch emphasizes features that make servers easier to | manage, presumably lowering (but not eliminating) the cost of | personnel, so you're obviously aware of this, but I'm not | seeing any "net of additional, fully-loaded personnel costs" in | their analysis. | | It should still be cheaper above a certain scale, but the | breakeven where it will make sense to "repatriate | infrastructure" is much higher if you include hiring people, | and will go even higher if the cloud providers run the numbers | and match most of the savings with scale-based price breaks. | kelp wrote: | I don't know how every company does it. But I led the | datacenter and networking teams at Square from 2011 to 2017. | In that time we went from 1 DC cage with 4 server racks to 4 | US DCs, one in Japan and a couple of network pops.
A bit | more than 100 server racks total. | | My team had 2 guys doing SiteOps. They would travel to the | various DCs in the Bay Area and Virginia and do all the | maintenance, new installs, etc. And sometimes we'd lean on | the colocation remote hands to do a few things. | | We had about 5 network engineers, that also handled the | corporate network. (12 offices and a network backbone that | connected east and west coast DCs, offices, etc). | | And maybe 2 SWEs who handled things like our host OS install | system, etc. Basically the next layer above the hardware. | | So 9 people that were required to run all of that stuff. But | really, NetEng ended up spending like 70% of their time on | corporate network things because we'd add offices faster than | datacenters. | | So if we focused on production only, we really needed about 6 | or 7 people total. | | I did the math a few times (every single year) and compared | our costs, including people, to the costs of moving to AWS 3 | year reserve instances. | | Doing it ourselves was always half the price. | | Of course the difference here was we built on-prem from the | start. So there was no repatriation that had to happen. | | Since then, I've been responsible for large cloud infra on | all 3 major providers and learned a lot about what kind of | discounts you can get when you're in the double digit | millions in annual spend. | | I still think, at a scale of single digit thousands of | servers, you'd be cheaper on-prem, fully loaded. But | admittedly, I haven't run the numbers since 2017. | mtnygard wrote: | Congrats on the launch of your product! It's damn impressive. | akh wrote: | > By tracking cloud spend, the company enables engineers, and not | just finance teams, to take ownership of cloud spend. 
| | > tie the pain directly to the folks who can fix the problem | | That's why we're building https://github.com/infracost/infracost | for engineering teams (free open source) | kderbyma wrote: | this isn't a paradox to me, this has always been the case.... big | companies can optimize more. They have more non-optimal points | due to sheer size alone ___________________________________________________________________ (page generated 2021-05-27 23:01 UTC)