[HN Gopher] The many lies about reducing complexity part 2: Cloud ___________________________________________________________________ The many lies about reducing complexity part 2: Cloud Author : rapnie Score : 171 points Date : 2021-01-10 14:20 UTC (8 hours ago) (HTM) web link (ea.rna.nl) (TXT) w3m dump (ea.rna.nl) | ehnto wrote: | I don't understand what people are building in order to need half | of this decoupled and managed elsewhere anyway. It wasn't all | that challenging to self manage it five years ago, what's | changed? | | My guess is that the average small to medium project has drunk | the enterprise coolaid, and they are suffering the configuration | and complexity nightmares that surround managing cloud | infrastructure before they really needed to. | | As the article is pointing out, you don't forgo managing these | things by doing it in the cloud, you just manage it inside a | constantly changing Web UI instead of something likely familiar | to your developers. | dkarl wrote: | I guess it's Kool-Aid? I don't know; I don't remember being | lied to when I started using cloud services. I think of cloud | resources as being amazing and basically magical, but I know | there's a limit to the magic, and the rest is work. People | using (for example) AWS S3 should not be surprised that they | still have to work to manage the naming, organization, access | control, encryption, retention, etc. of their data, and they | might encounter problems if they try to load a 100GB S3 object | into a byte array in a container provisioned with 1GB of RAM. | But they are. I don't know if that's human nature or if they're | being lied to by consultants and marketers. | ratww wrote: | There are products (Terraform, CloudFormation) that help | you manage without a UI, but they also add complexity, so the | point definitely still stands.
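ratww's point above is that tools like Terraform trade the constantly changing web UI for declarative code. A minimal sketch of what that looks like for a single EC2 instance; the AMI ID, region, and names are placeholders, not a recommendation:

```hcl
# Minimal Terraform sketch: one EC2 instance declared in code rather
# than clicked together in the console. All values are illustrative.
provider "aws" {
  region = "us-east-1"
}

resource "aws_instance" "app" {
  ami           = "ami-0123456789abcdef0"  # placeholder AMI ID
  instance_type = "t3.micro"

  tags = {
    Name = "app-server"
  }
}
```

The complexity doesn't disappear, as ratww notes; it just moves into files you can at least diff and review.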
| bob1029 wrote: | This shared responsibility principle that underlies cloud | marketing speak sounds a lot like the self-driving mess we find | ourselves in today - I.e. the responsibility boundary between | parties exists in a fog of war and results in more exceptions | than if one or the other were totally responsible. | | We have been a customer of Amazon AWS for ~6 years now, and we | still really only use ~3 of their products: EC2, Route53 and S3. | I.e. the actual compute/memory/storage/network capacity, and the | mapping of the outside world to it. Because we are a software | company, we write most of our own software. There is no value to | our customers in us stringing together a pile of someone else's | products, especially in a way that we cannot guarantee will be | sustainable for >5 years. We cannot afford to constantly rework | completed product installations. | | We strongly feel that any deeper buy-in with 3rd party technology | vendors would compromise our agility and put us at their total | mercy. Where we are currently positioned in the lock-in game, we | could pull the ripcord and be sitting in a private datacenter | within a week. All we need to do is move VMs, domain | registrations and DNS nameservers if we want to kill the AWS | bill. | | I feel for those who are up to their eyeballs in cloud | infrastructure. Perhaps you made your own bed, but you shouldn't | have to suffer in it. These are very complex decisions. We didn't | get it right at first either. Maybe consider pleading with your | executive management for mercy now. Perhaps you get a shot at a | complete redo before it all comes crashing down. We certainly | did. It's amazing what can happen if you have the guts to own up | to bad choices and start an honest conversation. | | I would also be interested to hear the other side of the coin. | Who out there is using 20+ AWS/Azure/GCP products to back a | single business app and is having a fantastic time of it? 
| mumblemumble wrote: | I recently inherited a product that was developed from the | ground up on AWS. It's been a real eye opener. | | Yes, it absolutely is locked in, and will never run on anything | but AWS. That doesn't surprise me. What surprises me is all of | the unnecessary complexity. It's one big Rube Goldberg | contraption, stringing together different AWS products, with a | great deal of "tool in search of problem" syndrome for good | measure. I am pretty sure that, in at least a few spots, the | glue code used to plug into Amazon XYZ amounted to a greater | development and maintenance burden than a homegrown module for | solving the same problem would have been. | | NIH syndrome is certainly not any fun. But IH syndrome seems to | be no better. | [deleted] | maria_weber23 wrote: | I second that. It's not only that you make yourself completely | intertwined with a Cloud by using more than the fundamental | services. | | The costs of lambda or even DDB are IMMENSE. These only pay off | for services that have a high return per request. I.e. if you | get a lot of value out of lambda calls, sure, use them. But for | anything high-frequency that earns you little to nothing on its | own, forget about it. | | Generally all your critical infrastructure should be Cloud | independent. That narrows your choices largely to EC2, SQS, | perhaps Kinesis, Route53, and the like. And even there you | should implement all your features with two clouds, i.e. Azure | and AWS, just to be sure. | | The good news is also the bad news. There are effectively only | two options: Azure or AWS. Google Cloud is a joke. They | arbitrarily change their prices, terminate existing products, | offer zero support. It's just like we have come to love Google. | They just don't give a shit about customers. Google only cares | about "architecture", i.e. how cool do I feel as an engineer | having built that service. Customer service is something that | Google doesn't seem to understand.
So think carefully if you | want to buy into their "product". Google, literally, only | develops products for their own benefit. | 0xEFF wrote: | Google Cloud has quite good support and professional | services. | | I've worked with them for 3 years and can't think of any | services that have been killed. | | They are very customer focused. From my perspective as a | partner, cloud services are more built for customer use cases | than Google internal use cases. GKE and Anthos for example. | jbmsf wrote: | I can't agree, at least not in general. | | The optionality of being cloud agnostic comes with a huge | cost, both because of all the pieces you have to | build+operate and because of the functionality you have to | exclude from your systems. | | I am sure there are scales where you either have such a large | engineering budget that you can ignore these costs or where | decreasing your cloud spend is the only way to scale your | business. But for the average company, I can't see how | spending so much on infrastructure (and future optionality) | pays off, especially when you could spend on product or | marketing or anything else that has a more direct impact on | your success. | jsiepkes wrote: | > But for the average company, I can't see how spending so | much on infrastructure (and future optionality) pays off, | especially when you could spend on product or marketing or | anything else that has a more direct impact on your | success. | | If you change "average company" to "average startup" then | your point makes sense. But for a normal company not | everything needs to make a direct impact on your success. | For example, guaranteeing long term business continuity is | an important factor too. | jbmsf wrote: | I take your point, but I still don't quite agree. | | There are obviously plenty of companies that are willing | to couple themselves to a single cloud vendor (e.g.
| Netflix with AWS) and plenty of business continuity risks | that companies don't find cost effective to pursue. Has | anyone been as vocal about decoupling from CRM or ERP | systems as they are with cloud? | | My own view is that these kinds of infrastructure | projects create as many risks as they solve and happen at | least as much because engineers like to solve these kinds | of problems as for any other reason. | nucleardog wrote: | Unless you're planning for the possibility of AWS | dropping offline permanently with little to no notice, it | really feels like you're just paying a huge insurance | premium. Like any insurance, it's down to whether you | need insurance or could cover the loss. Whether you'd | rather incur a smaller ongoing cost to avoid the | possibility of a large one time loss. | | If AWS suddenly raised their prices 10x overnight, it | would hurt but not be an existential threat for most | companies. At that point they could invest six months or | a year into migrating off of AWS. | | Rough numbers: that would end up costing us like $4m in | cloud spend and staff if we retasked the entire org to | accomplishing that for a year. | | There's certainly an opportunity cost as well, but I'd | argue it's not dissimilar to the opportunity cost we'd | have been paying all along to maintain compatibility with | multiple clouds. | | Obviously it's just conjecture, but my gut says the | increased velocity of working on a single cloud and using | existing Amazon services and tools where appropriate has | made us significantly more than the cost of hedging against | something that may never happen. | jbmsf wrote: | Strong agree. | | Plus I've seen more than a few efforts at multi-cloud | that resulted in a strong dependency on all clouds vs the | ability to switch between them. So not only do you not | get to use cloud-specific services, you don't really get | any benefit in terms of decoupling.
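nucleardog's insurance framing lends itself to a simple expected-value sketch. The $4m migration figure comes from the comment; the annual event probability and the yearly multi-cloud premium below are invented purely for illustration:

```python
# Illustrative only: compare the ongoing cost of staying cloud-agnostic
# (the "insurance premium") against the expected yearly loss from a
# forced migration. The $4m figure is from the comment above; the 2%
# probability and $500k premium are made-up assumptions.

def expected_migration_cost(one_time_cost: float, annual_probability: float) -> float:
    """Expected yearly loss from a forced-migration event."""
    return one_time_cost * annual_probability

single_cloud_risk = expected_migration_cost(4_000_000, 0.02)  # assumed 2%/year
multi_cloud_premium = 500_000  # assumed yearly cost of multi-cloud engineering

print(single_cloud_risk)                         # 80000.0
print(multi_cloud_premium > single_cloud_risk)   # True
```

Under these (invented) numbers the premium far exceeds the expected loss, which is the comment's point; flip the assumptions and the conclusion flips with them.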
| zmmmmm wrote: | > The optionality of being cloud agnostic comes with a huge | cost, both because of all the pieces you have to | build+operate | | This sounds like cloud vendor kool aid to me. Nearly every | cloud vendor product above the infrastructure layer is a | version of something that exists already in the world. When | you outsource management of that to your cloud vendor you | might lose 50% of the need to operationally manage that | product but about 50% of it is irreducible. You still need | internal competence in understanding that infrastructure | and one way or another you're going to develop it over | time. But if it's your cloud vendor's proprietary stack then | you are investing all your internal learning into | non-transferable skills instead of ones that can be | generalised. | singron wrote: | Do you have examples of Google Cloud arbitrarily changing | prices and terminating products? | | Sure they terminate consumer products, and there was a Maps | price hike, but I'm not aware of anything that's part of | Cloud. | ma2rten wrote: | A very long time ago App engine went out of beta and there | was a price hike leaving many scrambling. App engine was in | beta so long that many people didn't think that label meant | anything. | yls wrote: | IIRC they introduced a cluster management fee in GKE. | miscaccount wrote: | not much, 10 cents per hour https://www.reddit.com/r/kubernetes/comments/fdgblk/google_g... | jbmsf wrote: | It's always a trade-off though. You say you write most of your | own software, but that's probably not true for, say, your OS or | programming language, or editors, or a million other things. Cloud software is the same; you might not be producing the most | value if you spend your engineering hours (re)creating | something you could buy. | | In my own experience: | | - AWS SNS and SQS are rock solid and provide excellent | foundations for distributed systems.
I know I would struggle to | create the same level of reliability if I wrote my own | publish-subscribe primitives and I've played enough with some of the | open source alternatives to know they require operational costs | that I don't want to pay. | | - I use EC2 some of the time (e.g. when I need GPUs), but I | prefer to use containers because they offer a superior solution | for reproducible installation. I tend to use ECS because I | don't want to take on the complexity of K8S and it offers me | enough to have reliable, load-balanced services. ECS with | Fargate is a great building block for many run-of-the-mill | services (e.g. no GPU, no crazy resource usage). | | - Lambda is incredibly useful as glue between systems. I use | Lambda to connect S3, SES, CloudWatch, and SQS to application | code. I've also gone without Lambda on the SQS side and written | my framework layers to dispatch messages to application code. | This has advantages (e.g. finer-grained backoff control) but | isn't worth it for smaller projects. | | - Secrets Manager is a nice foundational component. There are | alternatives out there, but it integrates so well with ECS that | I rarely consider them. | | - RDS is terrific. In a past life, I spent time writing | database failover logic and it was way too hard to get right | consistently. I love not having to think about it. Plus | encryption, backup, and monitoring are all batteries included. | | - VPC networking is essential. I've seen too many setups that | just use the default VPC and run an EC2 instance on a public | IP. The horror. | | - I've recently started to appreciate the value of Step | Functions. When I write distributed systems, I tend to end up | with a number of discrete components that each handle one part | of a problem domain. This works, but creates understandability | problems.
I don't love writing Step Functions using a JSON | grammar that isn't easy to test locally, but I find that the | visibility they offer in terms of tracing a workflow is very | nice. | | - CloudFront isn't the best CDN, but it is often good enough. I | tend to use it for frontend application hosting (along with S3, | Route53, and ACM). | | - CloudWatch is hard to avoid, though I rather dislike it. | CloudWatch rules are useful for implementing cron-like triggers | and detecting events in AWS systems, for example knowing | whether EC2 failed to provision spot capacity. | | - I have mixed feelings about DynamoDB as well. It offers a nice | set of primitives and is often easier to start using for small | projects than something like RDS, but I rarely operate at the | scales where it's a better solution than something like RDS | PostgreSQL with all the terrific libraries and frameworks that | work with it. | | - At some scale, you want to segregate AWS resources across | different accounts, usually with SSO and some level of | automated provisioning. You can't escape IAM here and Control | Tower is a pretty nice solution element as well. | | I'm not sure if I'm up to 20 services yet, but it's probably | close enough to answer your question. There are better and | worse services out there, but you can get a lot of business | value by making the right trade-offs, both because you get | something that would be hard to build with the same level of | reliability and security and because you can spend your time | writing software that speaks more directly to product needs. | | As for "having a fantastic time", YMMV. I am a huge fan of | Terraform and tend to enjoy developing at that level. The | solutions I've built provide building blocks for development | teams who mostly don't have to think about the services. | mycall wrote: | Did you look into multi-cloud solutions like Pulumi or | Terraform to abstract your cloud vendor?
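For readers who haven't seen it, the Step Functions "JSON grammar" jbmsf mentions above is the Amazon States Language. A minimal two-state machine looks roughly like this; the Lambda ARN is a placeholder:

```json
{
  "Comment": "Minimal sketch of an Amazon States Language workflow",
  "StartAt": "ProcessItem",
  "States": {
    "ProcessItem": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123456789012:function:process-item",
      "Next": "Done"
    },
    "Done": {
      "Type": "Succeed"
    }
  }
}
```

It is terse but, as the comment notes, awkward to exercise locally; the payoff is the execution-trace visibility in the console.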
| jasode wrote: | _> I would also be interested to hear the other side of the | coin. Who out there is using 20+ AWS/Azure/GCP products to back | a single business app and is having a fantastic time of it?_ | | Netflix uses a lot of AWS higher-level services beyond the | basics of EC2 + S3. Netflix definitely doesn't restrict its use | of AWS to only be a "dumb data center". Across various tech | presentations by Netflix engineers, I count at least 17 AWS | services they use. | | + EC2, S3, RDS, DynamoDB, EMR, ELB, Redshift, Lambda, Kinesis, | VPC, Route 53, CloudTrail, CloudWatch, SQS, SES, ECS, SimpleDB, | <probably many more>. | | I think we can assume they use 20+ AWS services. | p_l wrote: | Certain services IMHO have to be discounted from this list: | | - VPC - basic building block for any AWS-based infra that | isn't ancient | | - CloudTrail - only way to get audit logs out of AWS, no | matter what you feed them into | | - CloudWatch - similar with CloudTrail, many things (but not | all) will log to CloudWatch, and if you use your own log | infra you'll have to pull from it. Also necessary for | metrics. | | - ELB/ELBv2/NLB/ALB - for many reasons they are often the | only ways to pull traffic to your services deployed on AWS. | Yes, you can sometimes do it another way around, but you have | high chances of feeling the pain. | | My personal typical set for AWS is EC2, RDS, all the | VPC/ELB/NLB/ALB stack, Route53, CloudTrail + CloudWatch. S3 | and RDS as needed, as both are easily moved elsewhere. | tidepod12 wrote: | I don't think you can discount them like that. Maybe they | aren't as front of mind as services like S3, EC2, etc, but | if you were to try to rebuild your setup in a personal data | center, replacing the capabilities of VPC, IAM, CloudTrail, | NAT gateways, ELBs, KMS etc would be a huge effort on your | part. The fact that they are "basic building blocks" makes | them more important, not less. 
In a discussion about the | complexity of cloud providers versus other setups, that | seems especially relevant. | p_l wrote: | Oh, I meant it more in terms of "can you count on them as | _optional_ services". | | Because they aren't optional, and yes, it takes a non-trivial | amount of work to replicate them... but funnily enough, | several of them have to be replicated elsewhere too. | | NAT gateways usually aren't an issue, KMS for many places | can be done relatively quickly with HashiCorp Vault. | | IAM is a weird case, because unless you're building a | cloud for others to use it's not necessarily that | important, meanwhile your own authorization framework is | necessary even on AWS because you can't just piggyback | on IAM (I wish I could). | fiddlerwoaroof wrote: | I mostly agree, although ECS with Fargate is often nicer to | use than EC2 | [deleted] | notretarded wrote: | I'm in two minds about this (deeper integration with a | particular vendor - i.e. "serverless") | | Reduced time to market is incredibly valuable. Current client | base is well in its millions. Ability to test to few and roll | out to many instantly is invaluable. You no longer have to hire | competent software developers who understand all patterns and | practices to make scalable code and infrastructure. Just need | them to work on a particular unit or function. | | The thing which scares me is, some of these companies are | decades old, some hundreds of years. How long have AWS/GCP/Azure | abstractions been around? How quick are we to graveyard | some of these platforms? Quite. A lot quicker than you can | lift, shift and rewrite your solution elsewhere. | rvanmil wrote: | We carefully select and use PaaS and managed cloud services to | construct our infrastructure with. This allows us to maximize | our focus on what our customers are paying for: creating | software for them which will typically be in use for 5+ years.
| We spend close to zero time on infrastructure maintenance and | management, we pay others to do this for us, cheaper and more | reliably. Having to swap out one service for another hasn't | given us any trouble or unreasonable costs yet in the past 5 | years. Contrary to what the article is trying to convince us of, it has | _massively_ reduced complexity for us. | [deleted] | bird_monster wrote: | > There is no value to our customers in us stringing together a | pile of someone else's products | | Maybe not your business, but there are many businesses in which | this is exactly what happens. Any managed service is just | combining other people's work into a "product" that gets sold | to customers. And that's great! AWS has a staggering number of | products, and lots of businesses don't even want to have to care | about AWS. | | > Who out there is using 20+ AWS/Azure/GCP products to back a | single business app and is having a fantastic time of it? | | Several times. I think cloud products are just tools to get you | further along in your business. Most of the tools I use are | distributed systems tools, because I don't want to have to own | them, and container runtimes/datastores. Every single thing | I've ever deployed across AWS/Azure is used as a generic | interface that could be replaced relatively easily if | necessary, and I've used Terraform to manage my infrastructure | creation/deployment process, so that I can swap resources in | and out without having to change tech. | | If, for some reason, Azure Event Hub stopped providing what we | needed it for, we could certainly deploy a customized Kafka | implementation and have the rest of our code not really know or | care, but from when we set out to build our products, that has | always been an "if we need to" problem, and we've never needed | to. | g9yuayon wrote: | So your company cautiously chooses which services in AWS to | use, and sticks to infrastructure offerings for now.
Netflix | called it "paved path", and it worked really well for | Netflix, too. Over the years, though, the "paved path" expanded and | extended to more services. It's worth noting that EC2 alone is | a huge productivity booster, bar none. Nothing beats setting up | a cluster of machines, with a few clicks, that auto scales per | dynamic scaling policies. In contrast, Uber couldn't do this | for at least 5 years, and their docker-based cluster system is | crippled by not supporting the damn persistent volumes. God | knows how much productivity was lost because of the bogus | reasons Uber had for not going to cloud. | bane wrote: | I've worked with a number of teams over the last few years who | use AWS and I'd say from top to bottom they all build their | strategy more or less the same way: | | 0. Whatever is the minimum needed to get a VPC stood up. | | 1. EC2 as 90%+ of whatever they're doing | | 2. S3 for storing lots of stuff and/or crossing VPC boundaries | for data ingress/egress (like seriously, S3 seems to be used | more as an alternative to SFTP than for anything else). This | usually makes up the rest of the thinking. | | 3. _Maybe_ one other technology that's usually from the set of | {Lambda, Batch, Redshift, SQS} but _rarely_ any combination of | two or more of those. | | And that's it. I know there are teams that go all in. But for | the dozen or so teams I've personally interacted with this is it. | The rest of the stack is usually something stuffed into an EC2 | instance instead of using an AWS version and it comes down to | one thing: the difficulties in estimating pricing for those | pieces. EC2 instances are drop-dead simple to price estimate | forward 6 months, 12 months or longer. | | Amazon is probably leaving billions on the table every year | because nobody can figure out how to price things so their | department can make their yearly budget requests.
The one time | somebody tries to use some managed service, it goes over budget | by 3000%, and the after-action review figures out that it would have | been within the budget by using <open source technology> in | EC2, so they just do that instead -- even though it increases the | staff cost and maintenance complexity. | | In fact just this past week a team was looking at using | SageMaker in an effort to go all "cloud native", took one look | at the pricing sheet and noped right back to Jupyter and | scikit-learn in a few EC2 instances. | | An entirely different group I'm working with is evaluating cloud | management tools and most of them just simplify provisioning | EC2 instances and tracking instance costs. They really don't do | much for tracking costs from almost any of the other services. | hyperdimension wrote: | I'm not very familiar with AWS or The Cloud, but I'm having | trouble understanding what you said about Amazon leaving | money on the table by not directing customers toward | specific-purpose services as opposed to EC2? | | Wouldn't (for AWS to make a profit anyway) whatever managed | service _have_ to be cheaper than some equivalent service | running on an EC2 VM? | | I get the concerns re: pricing and predictability, but it | still seems like more $$$ for AWS. | bane wrote: | Yeah good question. Sibling comments to this one explain it | well, but basically AWS managed services come at a premium | price over some equivalent running in just EC2. (Some | services in fact _do_ charge you for EC2 time + the service | + storage etc.) | | "Managed" usually means "pay us more in exchange for less | work on your part". This is usually pitched as a way to | reduce admin/infrastructure/devops type staff and the | overhead that goes along with having those people on the | payroll. | nucleardog wrote: | No, usually the managed services are a premium over the | bare hardware.
When you use RDS for example, you're paying | for the compute resources but also paying for the extra | functionality they provide and the management and | maintenance they're doing for you. You can run your own | Postgres database, or you can pay the premium for Aurora on | RDS and get a multi-region setup with point in time restore | and one-click scaling and automatically managed storage | size and automatic patching and monitoring integrated into | AWS monitoring tools and... | | They're leaving money on the table because instead of using | "Amazon Managed $X" potentially at a premium or paying a | similar amount but in a way where AWS can provide the | service with fewer compute resources than you or I would | need because of their scale and thus more profitably, | people look and see they'll be paying $0.10/1000 requests | and $0.05/1gb of data processed in a query and $0.10/gb for | bandwidth for any transfer that leaves the region and... | people just give up and go "I have no idea what that will | cost or whether I can afford it, but this EC2 instance is | $150/mo, I can afford that." | glogla wrote: | For example: | | Managed Airflow Scheduler on AWS with "large" size costs | $0.99/hour, or $8,672/year per instance. That's ~ $17,500 | considering Airflow for at least non-prod and prod | instances. | | Building it on your own on a same-size EC2 instance would | cost $3,363/year for the EC2. Times two for two | environments, let's say $6,700. $4,000 if you prepay the | instance. | | That looks way cheaper, but then you have to do the | engineering and the operational support yourself. | | If you consider just the engineering and assume an engineer | costs $50/hour and estimate this at an initial three weeks of | work and then 2.5 days / month for support (upgrades, | tuning, ...) that's an extra $4,000 upfront and $1,000/month. | | So on AWS you're at $17,500/year and on-prem you're at best | $20,000 first year and $16,000 next years.
| | So AWS only comes out a bit more expensive - but the math | is tricky on several parts: | | - maybe you need 4 environments deployed instead of 2, | which is more for AWS but not much more for engineering? | | - maybe there's less sustaining cost because you're ok with | upgrading Airflow only once a quarter? | | - you probably already pay the engineers, so it's not an | extra _money_ cost, it's the extra cost of them not working on | other stuff - different boxes and budgets | | - maybe you're in a part of the world where a good devops | engineer doesn't cost $50/hour but $15/hour | | - I'm ignoring the cost of operational support, which can be a | lot for on-prem if you need 24/7 | | - maybe you need 12+ Airflow instances thanks to your | fragmented / federated IT and can share the engineering | cost | | - etc, etc. | | So I think what OP was saying is that if AWS priced Managed | Airflow at $0.5 per hour, it would be a no-brainer to use | instead of building your own. The way it is, some customers | will surely opt for their own Airflow instead, because the math | favors it. | | Does that make sense? | bird_monster wrote: | > That looks way cheaper, but then you have to do the | engineering and the operational support yourself. | | In my experience, this is the piece that engineers rarely | realize and that is actually one of the biggest factors | in evaluating cloud providers vs. home-rolled. Especially | if you're a small company, engineering time (really any | employee time) is _insanely valuable_. So valuable that | even if Airflow is cash-expensive, if using it allows | your engineers to focus on building whatever makes _your | business successful_, it is usually a much better idea to | just use Airflow and keep moving. Clients usually will | not care about whether you implemented your own version | of an AWS product (unless that's your company's specific | business). Clients will care about the features you ship.
| If you spend a ton of time re-inventing Airflow to save | some cost, but then go bankrupt before you ever ship, | rolling your own Airflow implementation clearly didn't | save you anything. | glogla wrote: | I agree. | | The only caveat is that this goes for founders or | engineers who are financially tied to the company's | success. If the engineer just collects a paycheck, they | might prioritize fun - and I feel that might be behind a | lot of the "reinventing the wheel" efforts you see in the | industry. | | Or maybe I'm just cynical. | tpxl wrote: | We used to have on-prem redis and a devops engineer to | manage it, then we moved to redis in the cloud and had a | devops engineer to manage it. | | Saying that in the cloud you don't need engineers to | manage "operational support" is the biggest lie the cloud | managed to sell. | mumblemumble wrote: | It's not just about a straight cost comparison. It's about | how organizational decision-making works. | | The people shopping for products are not spending their own | money, but they are spending their own time and energy. The | people approving budgets are not considering all possible | alternatives, they are only considering the ones that have | been presented to them by the people doing the shopping. | | If the shoppers decide that an option will cost them too | much in time and irritation, then it may be that the people | holding the purse-strings are never even made aware that it | exists. Even if it is the cheapest option. | gregmac wrote: | This is a really good summary of the situation, and I'd | add a bit about risk: | | It's relatively easy to estimate EC2 costs for running | some random service, because it's literally just a | per-hour fee times number of instances. If you're wrong, the | bigger instance size or more instances isn't that much | more expensive.
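glogla's break-even arithmetic above can be reproduced as a quick back-of-envelope script. It is a sketch only, using the figures from the comment (prepaid reserved EC2, $50/hour engineering); real AWS pricing varies by region and changes over time:

```python
# Back-of-envelope comparison of AWS Managed Airflow vs self-hosting
# on EC2, using the figures quoted in the comment above. Illustrative
# only; not current pricing.

HOURS_PER_YEAR = 24 * 365

managed_per_instance = 0.99 * HOURS_PER_YEAR  # $/year at $0.99/hour, "large"
managed_total = 2 * managed_per_instance      # prod + non-prod environments

ec2_prepaid = 4_000   # two environments on prepaid instances (comment's figure)
eng_upfront = 4_000   # initial setup work (comment's estimate)
eng_monthly = 1_000   # ~2.5 days/month of upgrades and tuning at $50/hour

self_hosted_year1 = ec2_prepaid + eng_upfront + 12 * eng_monthly
self_hosted_later = ec2_prepaid + 12 * eng_monthly

print(round(managed_total))  # 17345 -- roughly the ~$17,500 in the comment
print(self_hosted_year1)     # 20000
print(self_hosted_later)     # 16000
```

The point survives the rounding: managed sits between the first-year and steady-state self-hosted cost, so small changes in the engineering assumptions flip the answer either way.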
| | For almost every other service, you have to estimate some | other much more detailed metric: number of http requests, | bytes per message, etc. When you haven't yet written the | software, those details can be very fuzzy, making the | whole estimation process extremely risky - it could be | cheaper than EC2, it could be 10x more, and we won't | really know until we've spent at least a couple months | writing code. And let's hope we don't pivot or have our | customers do anything in a way we're not expecting... | harikb wrote: | +1 | | I bet cloud providers are incentivized not to provide | detailed billing/usage stats. I remember having to use a 3rd | party service to analyze our S3 usage. | | Infinite scalability is also a curse - we had a case where | pruning history from an S3 bucket was failing for months and | we didn't know until the storage bill became significant | enough to notice. I guess in some ways it is better than | being woken up in the middle of the night, but we wasted | _millions_ storing useless data. | | Azure also has similar issues - deleting a VM sometimes | doesn't clean up dependent resources and it is a mess to find | and delete later - only because the dependent resources are | deliberately not named with a matching tag. | zmmmmm wrote: | > Infinite scalability is also a curse | | People don't like to admit it, but in many circumstances, | having a service that is escalating to 10x or 100x its | normal demand go offline is probably the _desirable_ | thing. | threentaway wrote: | This seems like you didn't have proper monitoring and | alerting set up for your job, not sure how that is a | downside of AWS. | jjoonathan wrote: | AWS monitoring (and billing) is garbage because they make | an extraordinary amount of money on unintentional spend. | | "But look at how many monitoring solutions they have in | the dashboard! Why, just last re:invent they announced 20 | new monitoring features!"
| | They make a big fuss and show about improving monitoring, | but it's always crippled in some way that makes it easy | to get wrong and time-consuming or expensive to get | right. | milesvp wrote: | > Infinite scalability is also a curse | | This was the key sentence, I think. This type of problem | actually shows up in other domains as well; queueing | theory comes immediately to mind. Even the halting | problem is only a problem with infinite tape, and becomes | easier with (known?) limited resources. | | When you have some parameter that is unbounded, you need | to add extra checks to bound it yourself to some sane | value. You are right, in that the parent failed to | monitor some infrastructure, but if they were in their | own datacenter, once they filled their NAS, I'm positive | someone would have noticed, if only because other checks, | like disk space, are less likely to be forgotten. | | Also, getting a huge surprise bill is a downside of any | option, and the risk needs to be factored into the cost. | I'm constantly paranoid when working in a cloud | environment; even doing something as trivial as a | directory listing from the command line on S3 costs | money. I had back-and-forths with AWS support just to | be clear what the order of magnitude of the bill would be | for a simple cleanup action, since there were 2 documented | ways to do what I needed, and one appeared to be easier, | yet significantly more expensive. | bird_monster wrote: | +1, but with a container tool (Fargate/ECS, Azure Container | Instances) instead of EC2. | nelsonenzo wrote: | This person has never worked in a data center. He thinks he's | managing a network because he sets a few VPC IPs; that's a | tiny fragment of networking, and the cloud has indeed removed a | great deal you previously had to manage on-prem.
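harikb's silent-pruning failure above is the kind of thing worth putting numbers on before it shows up on a bill. A back-of-envelope sketch in Python (the per-GB price and data volume are illustrative assumptions, not current AWS rates):

```python
# Back-of-envelope: what a silently failing S3 pruning job wastes over time.
# STORAGE_PRICE_PER_GB_MONTH is an assumed rate, not a quoted AWS price.
STORAGE_PRICE_PER_GB_MONTH = 0.023

def wasted_spend(gb_per_month: float, months: int) -> float:
    """Cumulative cost of data that should have been pruned.

    Month 1 carries one month of backlog, month 2 carries two, and
    so on, so the wasted spend grows quadratically, not linearly --
    which is why the bill stays invisible early and balloons late.
    """
    total = 0.0
    for month in range(1, months + 1):
        total += month * gb_per_month * STORAGE_PRICE_PER_GB_MONTH
    return total

# e.g. 50 TB/month of un-pruned history, unnoticed for a year:
print(f"${wasted_spend(50_000, 12):,.0f}")
```

The quadratic growth is the point: for the first few months the waste hides inside normal billing noise, which is exactly why a storage-growth check catches this class of failure much earlier than a total-bill alarm does.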
| zoomablemind wrote: | Subjectively, it increasingly feels that while the complexity has | been increasing, the notion of longevity of the underlying | products and services has been degrading. | | While updates to software were expected, the general outlook was | that they would not break core features. The emphasis | on backwards compatibility was in a way an assurance to | businesses that building their operations on a vendor's products is | not risky. Even then, some mission-critical elements would be | defensively abstracted to avoid the dependency risks (at least | theoretically...) | | Now, we all witness the "eternal-beta" paradigm across most | of the major software products. Frequent builds with automatic | updates, where new features can be suddenly pushed and old | features removed. | | Sure, it's still possible to spec out a rock-steady | platform, postpone updates, abstract dependencies, and just focus | on business. But... such an approach won't be approved, as it's | widely acknowledged that the presence of critical bugs is rather | a "feature" of all software. Postponing updates is not | prudent; it's a liability. | | So the rock-solid expectations are just an illusion, or perhaps a | fantasy promoted widely just to get a foot in the door. | | Ironically, the most stable elements are the much-dreaded | "legacy", too often in charge of the business-critical logic. | s3tz wrote: | Anyone manage to find part 1? It's not on their site; can't seem | to find it. | jvanderbot wrote: | It's linked in the article https://ea.rna.nl/2016/01/10/a-tale- | of-application-rationali... | | "This was actually part 1 of this story: A tale of application | rationalisation (not)." | [deleted] | CalChris wrote: | Isn't this just _Jevons' Paradox_ applied to software? | when technological progress ...
increases the efficiency with | which a resource is used..., but the rate of consumption of that | resource rises due to increasing demand [1] | | [1] https://en.wikipedia.org/wiki/Jevons_paradox | ChicagoDave wrote: | Reducing complexity should never be about platform (on-prem vs | cloud). | | It should be about constructing software in partnership with the | business and reducing complexity with modeled boundaries. | | You can leverage the cloud to do some interesting things, but the | true benefit is in _what_ you construct, not _how_. | sanp wrote: | There is an element of _how_ as well. You could create simple | monoliths or overengineered microservices. Or complex | monoliths with heavy coupling vs cleanly designed microservices | with clear separation of concerns. | the-smug-one wrote: | Are microservices meant to separate data too? As in, each | service has its own database. | | Wouldn't that lead to non-normalisation of the data, or a lot | of expensive network lookups to get what I want/need? | | What is the point of microservices anyway :-)? | kqr wrote: | > Are microservices meant to separate data too? As in, each | service has its own database. | | Yes. | | > Wouldn't that lead to non-normalisation of the data | | Yes. But it's not as bad as it sounds. That is how data on | paper used to work, after all. | | Business rules (at least ones that have been around for | more than 5--10 years) are written with intensely | non-normalised data in mind. | | Business people tend to be fine with eventual consistency | on the scale of hours or even days. | | Non-normalised data also makes total data corruption | harder, and forensics in the case of bugs easier, in some | ways: you find an unexpected value somewhere? Check the | other versions that ought to exist and you can probably | retrace at what point it got weird.
| | The whole idea of consistent and fully normalised data is, | historically speaking, a very recent innovation, and I'm | not convinced it will last long in the real world. I think | this is a brief moment in history when our software is | primitive enough, yet optimistic enough, to even consider | that type of data storage. | | And come on, it's not like the complete consistency of the | data is worth _that_ many dollars in most cases, if we | actually bother to compute the cost. | ratww wrote: | _> Are microservices meant to separate data too? As in, | each service has its own database._ | | Ideally yes, to scale. | | Sometimes you have a service with obvious and easy-to-split | boundaries, and microservices are a breeze. | | Some things that are easy to turn into microservices: an "API | wrapper" for a complex and messy third-party API. Logging | and data collection. Sending emails/messages. User | authentication. Search. Anything in your app that could | become another app. | | However, when your data model is tightly coupled, you | need to choose between tradeoffs: data duplication, | bigger services, or even keeping it as a monolith. | | Btw, if you don't care about scalability, sharing a | database is still not the best idea. But you can have | a microservice that wraps the database in a service, for | example. Tools like Hasura can be used for that. | NicoJuicy wrote: | Microservices are a solution for an organisational problem | (multiple employees in one project), not a technical one. | | If you're flying solo, just use DDD, for example. It will | give you the same patterns without the devops complexity. | mumblemumble wrote: | I honestly believe that hiding complexity behind a closed door | does not eliminate it. However, a lot of software and service | vendors have a vested interest in convincing people otherwise. | And, historically, they've had all sorts of great platforms for | doing so.
Who doesn't enjoy a free day out of the office, with | lunch provided? | | It's also much easier to hide complexity than it is to remove | it. One can be accomplished with (relatively) turnkey solutions, | generally without ever having to leave the comfort of your | computer. Whereas the other generally requires long hours | standing in front of a chalkboard and scratching your head. | SpicyLemonZest wrote: | On the other hand, hiding complexity behind closed doors can | be a very valuable thing, if it lets you keep track of who | knows about the complexity behind each. I can't count the | number of issues I've encountered that would have taken | minutes instead of hours if only I'd known which specific | experts I needed to talk to. | mumblemumble wrote: | Agreed. Though that too comes at a cost, so I don't want to | do it except when it's worth it. | | http://yosefk.com/blog/redundancy-vs-dependencies-which- | is-w... | candiddevmike wrote: | I see a lot of mentions in the comments about just using the | basic storage/networking/compute from AWS/AZ/GCP--if that's all | you're using, you should really consider other providers. Linode, | DigitalOcean, and Vultr will be far more competitive and offer | faster machines, cheaper, and with better bandwidth pricing. | | The point of using AWS/AZ/GCP is to leverage their managed | service portfolio and be locked in. If you aren't doing that, | there are better companies that want your business and will treat | you much better. | dilyevsky wrote: | There's also Packet (now Equinix Metal), which gives control over | L2 and has nice things such as iBGP. I think Vultr may too, | but their docs are poor and support was uncooperative. | ithrow wrote: | IME, the network of AWS is much better than that of DO or Linode. | dilyevsky wrote: | How so? | ithrow wrote: | Fewer hiccups and less downtime. It's faster and with better | latency to other third-party services. Superior internal | control.
Ex: In Linode, a private IP address gives EVERYONE | in the same data center access to your Linode server. Also, | last time I used them they didn't have a firewall. | dilyevsky wrote: | Right, Linode is basically old-school dedicated servers | AFAIK, but DO should be in a different class. | dijit wrote: | AWS networking isn't _great_ but it's decidedly better | than DO (which is actually the worst of those listed, | based on my own TCP connection tests). | | Linode is pretty stable if not very exciting; Vultr is | "better than DO", but their networks are almost always in | maintenance. | | For a little context: I maintain IRC servers, and those | are currently hosted on Vultr (with linked nodes in 5 | regions). I notice ping spikes between those nodes often, | and sometimes congestion which drops users. (IRC relies on | highly stateful TCP sessions.) | | I've only known two truly good networking suppliers: GCP | (and their magic BGP<->PoP<->Dark Fibre networks) and.. | Tilaa.. (which is only hosted in the Netherlands.. which is | why I can't use them for my global network) | dilyevsky wrote: | Awesome, thanks for the info. For GCP I notice occasional | unavailability on the order of 10s of mins every quarter | or so. That's VM networking. Their load balancers are a | different story, as they are complete crap. | dijit wrote: | This is very much not my experience; do you have any more | information? | | Any particular regions? Are you certain it's not a local | ISP? | | (I used to run an always-online video game and we had a | LOOOOOT of connection issues from "Spectrum internet" on | all of our servers, including GCP ones.) | dilyevsky wrote: | Answering here bc the bottom post is locked for some reason - | east1 occasionally disconnects from other regions. That | is definitely within Google's backbone. Central-1 seems | worse tho. If it's less than an hour they don't bother | with the status page.
| | For the load balancers it's very much by design, as they randomly | send you an RST when Google rolls them for an upgrade, and in | some other cases (I'm working on a blog post on this). | Google support's recommendation is to retry (foreals). | mrkramer wrote: | Microsoft summarized it nicely [1]: | | Advantages of public clouds: | | Lower costs | | No maintenance | | Near-unlimited scalability | | High reliability | | Advantages of a private cloud: | | More flexibility | | More control | | More scalability (compared to a pure on-prem solution) | | [1] https://azure.microsoft.com/en-us/overview/what-are- | private-... | indymike wrote: | Hmm... So Azure for unlimited scalability... But private clouds | have more scalability? | beaconstudios wrote: | Presumably both options are written relative to non-cloud | setups. | 8K832d7tNmiQ wrote: | They probably meant that you can just request hundreds of servers | in another part of the world in one single setup, compared to | manually building your servers there. | mrkramer wrote: | "More scalability -- private clouds often offer more | scalability compared to on-premises infrastructure." | | I think they meant private cloud (renting 3rd-party servers | and using/maintaining your private cloud) vs on-prem | (buying servers and building your own data centers). | jvanderbot wrote: | Part 1 is linked in the article | | https://ea.rna.nl/2016/01/10/a-tale-of-application-rationali... | | "This was actually part 1 of this story: A tale of application | rationalisation (not)." | [deleted] | skohan wrote: | I have to say, at my current company we are using Serverless, and | it really does feel like it reduces complexity. No | runtime/framework to set up, no uptime monitoring or management | required on the application layer, and scaling is essentially | solved for us. I mean, you do pay for what you get, but it does | feel like one of those technologies which really lowers the | barrier to entry in terms of being able to release a production | web application.
In terms of building an MVP, a single developer | really can deploy an application without any devops support, and | it will serve 10 million users if you manage to get that much | traffic. | | I'm sure it's not optimal for every case, but for an awful lot of | cases it seems pretty darned good, and you can save on devops | hiring. | ratww wrote: | I used to be very excited about serverless, and I still have | high hopes for it. | | But for me it ended up replacing the complexity of runtime and | frameworks with the complexity of configuring auxiliary | services like API Gateway, Amazon VPC, etc. We needed to move | the complexity to some tool that configured the services around | Lambda, like Terraform or CloudFormation, or at best to a | framework like Claudia or Serverless.com. Configuring it by | hand looks fine in tutorials, but is madness: it's still | complex, and makes it all way too fragile. | | There are, however, some products that make the experience better | by simplifying configuration, like Vercel and Netlify. | skohan wrote: | Yeah, I certainly agree that the complexity doesn't _really_ | go away completely, and sometimes it's much more frustrating | to have to configure poorly documented services rather than | just having access to the host OS. | | I guess my overall point would be that two of the _hardest_ | things to do in terms of making a production-ready | application are scaling and security, and Serverless pretty | much obviates them. So it's not a magic wand, but it does | take away some of the significant barriers to entry. | ratww wrote: | Yes, I agree with that point. I think my point was more | that Serverless is a good idea, but the current | implementations are still not good at removing complexity. | But I can see this easily changing, with open standards and | such. | orlovs wrote: | Well, we just need to admit that running applications while | acknowledging all known risks is complex.
If we blissfully ignore | risks, as with a LAMP or LEMP stack, it's much easier. The main | question is whether we need to account for most risks | when running at small scale. | 8note wrote: | I was expecting writing serverless to be a mess of writing | configuration, but I've really enjoyed writing CDK for | CloudFormation. It's super unclear how you're supposed to | write good CDK code, but I feel like I'm a lot clearer on | what infrastructure I'm actually using than before, where I | was relying on stuff set up by someone else ages ago with | minimal to no documentation. | arendtio wrote: | I wonder who tells the story that cloud computing has something | to do with reducing complexity. In my world, cloud computing is | about scalability and making things as complex as they need to be | to be scalable. This rarely means that complexity is being | reduced. | didericis wrote: | The simplicity of having one cloud-based product rather than | several native products built for different systems is an | argument I've heard a lot. | ratww wrote: | This is an advantage of the web platform, not exactly related | to the cloud. You can get this advantage with an on-premises web | product, or with old-school hosting. | didericis wrote: | Very true, just trying to explain the source of the "cloud | reduces complexity" argument. There are a number of small | operations that don't want to manage all their own | hardware, so cloud and web are conflated, and you get the | web-platform simplicity argument being used to justify a | cloud platform. ___________________________________________________________________ (page generated 2021-01-10 23:01 UTC)