[HN Gopher] The many lies about reducing complexity part 2: Cloud
       ___________________________________________________________________
        
       The many lies about reducing complexity part 2: Cloud
        
       Author : rapnie
       Score  : 171 points
       Date   : 2021-01-10 14:20 UTC (8 hours ago)
        
 (HTM) web link (ea.rna.nl)
 (TXT) w3m dump (ea.rna.nl)
        
       | ehnto wrote:
       | I don't understand what people are building in order to need half
       | of this decoupled and managed elsewhere anyway. It wasn't all
        | that challenging to self-manage it five years ago, so what's
        | changed?
       | 
        | My guess is that the average small to medium project has drunk
        | the enterprise coolaid, and they are suffering the configuration
       | and complexity nightmares that surround managing cloud
       | infrastructure before they really needed to.
       | 
       | As the article is pointing out, you don't forgo managing these
       | things by doing it in the cloud, you just manage it inside a
       | constantly changing Web UI instead of something likely familiar
       | to your developers.
        
         | dkarl wrote:
         | I guess it's Kool-Aid? I don't know; I don't remember being
         | lied to when I started using cloud services. I think of cloud
         | resources as being amazing and basically magical, but I know
         | there's a limit to the magic, and the rest is work. People
         | using (for example) AWS S3 should not be surprised that they
         | still have to work to manage the naming, organization, access
         | control, encryption, retention, etc. of their data, and they
         | might encounter problems if they try to load a 100GB S3 object
         | into a byte array in a container provisioned with 1GB of RAM.
         | But they are. I don't know if that's human nature or if they're
         | being lied to by consultants and marketers.
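The 100GB-object-into-1GB-of-RAM example above is a real failure mode, and the fix is to stream rather than buffer. A minimal, hypothetical sketch (the function and names here are illustrative; with boto3 the stream would come from `s3.get_object(...)["Body"]`, which supports `.read(n)`):

```python
import hashlib
from typing import BinaryIO

def hash_object_streaming(body: BinaryIO, chunk_size: int = 8 * 1024 * 1024) -> str:
    """Consume a large object in fixed-size chunks instead of one big read.

    `body` is any file-like stream; with boto3 you would pass
    s3.get_object(Bucket=bucket, Key=key)["Body"].
    Peak memory stays near `chunk_size` no matter how large the object is.
    """
    digest = hashlib.sha256()
    while True:
        chunk = body.read(chunk_size)
        if not chunk:  # empty read signals end of stream
            break
        digest.update(chunk)
    return digest.hexdigest()
```

The same pattern works for copying to disk or re-uploading: a 1GB container can process a 100GB object if it never holds more than one chunk at a time.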
        
         | ratww wrote:
          | There are products (Terraform, CloudFormation) that help with
          | managing without a UI, but they also add complexity, so the
          | point definitely still stands.
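To make that trade concrete: the UI clicks become declarations in version-controlled files. A minimal hypothetical sketch of a CloudFormation template built as a plain Python dict (the bucket name is made up); the complexity doesn't vanish, it just moves into reviewable code:

```python
import json

# Declare an S3 bucket once in code instead of clicking through the console;
# the JSON this prints can be handed to CloudFormation
# (e.g. via `aws cloudformation deploy`).
template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Resources": {
        "AppBucket": {
            "Type": "AWS::S3::Bucket",
            "Properties": {"BucketName": "example-app-artifacts"},
        }
    },
}

template_json = json.dumps(template, indent=2)
print(template_json)
```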
        
       | bob1029 wrote:
       | This shared responsibility principle that underlies cloud
       | marketing speak sounds a lot like the self-driving mess we find
       | ourselves in today - I.e. the responsibility boundary between
       | parties exists in a fog of war and results in more exceptions
       | than if one or the other were totally responsible.
       | 
       | We have been a customer of Amazon AWS for ~6 years now, and we
       | still really only use ~3 of their products: EC2, Route53 and S3.
       | I.e. the actual compute/memory/storage/network capacity, and the
       | mapping of the outside world to it. Because we are a software
       | company, we write most of our own software. There is no value to
       | our customers in us stringing together a pile of someone else's
       | products, especially in a way that we cannot guarantee will be
       | sustainable for >5 years. We cannot afford to constantly rework
       | completed product installations.
       | 
       | We strongly feel that any deeper buy-in with 3rd party technology
       | vendors would compromise our agility and put us at their total
       | mercy. Where we are currently positioned in the lock-in game, we
       | could pull the ripcord and be sitting in a private datacenter
       | within a week. All we need to do is move VMs, domain
       | registrations and DNS nameservers if we want to kill the AWS
       | bill.
       | 
       | I feel for those who are up to their eyeballs in cloud
       | infrastructure. Perhaps you made your own bed, but you shouldn't
       | have to suffer in it. These are very complex decisions. We didn't
       | get it right at first either. Maybe consider pleading with your
       | executive management for mercy now. Perhaps you get a shot at a
       | complete redo before it all comes crashing down. We certainly
       | did. It's amazing what can happen if you have the guts to own up
       | to bad choices and start an honest conversation.
       | 
       | I would also be interested to hear the other side of the coin.
       | Who out there is using 20+ AWS/Azure/GCP products to back a
       | single business app and is having a fantastic time of it?
        
         | mumblemumble wrote:
         | I recently inherited a product that was developed from the
         | ground up on AWS. It's been a real eye opener.
         | 
         | Yes, it absolutely is locked in, and will never run on anything
         | but AWS. That doesn't surprise me. What surprises me is all of
         | the unnecessary complexity. It's one big Rube Goldberg
         | contraption, stringing together different AWS products, with a
         | great deal of "tool in search of problem" syndrome for good
         | measure. I am pretty sure that, in at least a few spots, the
         | glue code used to plug into Amazon XYZ amounted to a greater
         | development and maintenance burden than a homegrown module for
         | solving the same problem would have been.
         | 
         | NIH syndrome is certainly not any fun. But IH syndrome seems to
         | be no better.
        
           | [deleted]
        
         | maria_weber23 wrote:
         | I second that. It's not only that you make yourself completely
         | intertwined with a Cloud by using more than fundamental
         | services.
         | 
         | The costs of lambda or even DDB are IMMENSE. These only pay off
         | for services that have a high return per request. I.e. if you
         | get a lot of value out of lambda calls, sure, use them. But for
         | anything high-frequency that earns you little to nothing on its
         | own, forget about it.
         | 
          | Generally all your critical infrastructure should be Cloud
          | independent. That narrows your choices largely to EC2, SQS,
          | perhaps Kinesis, Route53, and the like. And even there you
         | should implement all your features with two clouds, i.e. Azure
         | and AWS, just to be sure.
         | 
         | The good news is also the bad news. There are effectively only
          | two options: Azure or AWS. Google Cloud is a joke. They
          | arbitrarily change their prices, terminate existing products,
         | offer zero support. It's just like we have come to love Google.
         | They just don't give a shit about customers. Google only cares
         | about "architecture", i.e. how cool do I feel as engineer
         | having built that service. Customer service is something that
         | Google doesn't seem to understand. So think carefully if you
         | want to buy into their "product". Google, literally, only
         | develops products for their own benefit.
        
           | 0xEFF wrote:
           | Google Cloud has quite good support and professional
           | services.
           | 
           | I've worked with them for 3 years and can't think of any
           | services that have been killed.
           | 
            | They are very customer focused. From my perspective as a
            | partner, cloud services are more built for customer use cases
           | than Google internal use cases. GKE and Anthos for example.
        
           | jbmsf wrote:
           | I can't agree, at least not in general.
           | 
           | The optionality of being cloud agnostic comes with a huge
           | cost, both because of all the pieces you have to
           | build+operate and because of the functionality you have to
           | exclude from your systems.
           | 
           | I am sure there are scales where you either have such a large
           | engineering budget that you can ignore these costs or where
           | decreasing your cloud spend is the only way to scale your
           | business. But for the average company, I can't see how
           | spending so much on infrastructure (and future optionality)
           | pays off, especially when you could spend on product or
           | marketing or anything else that has a more direct impact on
           | your success.
        
             | jsiepkes wrote:
             | > But for the average company, I can't see how spending so
             | much on infrastructure (and future optionality) pays off,
             | especially when you could spend on product or marketing or
             | anything else that has a more direct impact on your
             | success.
             | 
              | If you change "average company" to "average startup" then
              | your point makes sense. But for a normal company not
             | everything needs to make a direct impact on your success.
             | For example guaranteeing long term business continuity is
             | an important factor too.
        
               | jbmsf wrote:
               | I take your point, but I still don't quite agree.
               | 
               | There are obviously plenty of companies that are willing
               | to couple themselves to a single cloud vendor (e.g.
               | Netflix with AWS) and plenty of business continuity risks
               | that companies don't find cost effective to pursue. Has
                | anyone been as vocal about decoupling from CRM or ERP
                | systems as they are with cloud?
               | 
                | My own view is that these kinds of infrastructure
                | projects create as many risks as they solve, and happen
                | at least as much because engineers like to solve these
                | kinds of problems as for any other reason.
        
               | nucleardog wrote:
               | Unless you're planning for the possibility of AWS
               | dropping offline permanently with little to no notice, it
               | really feels like you're just paying a huge insurance
               | premium. Like any insurance, it's down to whether you
               | need insurance or could cover the loss. Whether you'd
               | rather incur a smaller ongoing cost to avoid the
               | possibility of a large one time loss.
               | 
               | If AWS suddenly raised their prices 10x overnight, it
               | would hurt but not be an existential threat for most
               | companies. At that point they could invest six months or
               | a year into migrating off of AWS.
               | 
                | Rough numbers: that would end up costing us like $4m in
                | cloud spend and staff if we retasked the entire org to
                | accomplish that for a year.
               | 
               | There's certainly an opportunity cost as well, but I'd
               | argue it's not dissimilar to the opportunity cost we'd
               | have been paying all along to maintain compatibility with
               | multiple clouds.
               | 
               | Obviously it's just conjecture, but my gut says the
               | increased velocity of working on a single cloud and using
               | existing Amazon services and tools where appropriate has
               | made us significantly more than the costs of something
               | that may never happen.
        
               | jbmsf wrote:
               | Strong agree.
               | 
               | Plus I've seen more than a few efforts at multi-cloud
               | that resulted in a strong dependency on all clouds vs the
               | ability to switch between them. So not only do you not
               | get to use cloud-specific services, you don't really get
               | any benefit in terms of decoupling.
        
             | zmmmmm wrote:
             | > The optionality of being cloud agnostic comes with a huge
             | cost, both because of all the pieces you have to
             | build+operate
             | 
             | This sounds like cloud vendor kool aid to me. Nearly every
             | cloud vendor product above the infrastructure layer is a
             | version of something that exists already in the world. When
             | you outsource management of that to your cloud vendor you
             | might lose 50% of the need to operationally manage that
             | product but about 50% of it is irreducible. You still need
             | internal competence in understanding that infrastructure
             | and one way or another you're going to develop it over
              | time. But if it's your cloud vendor's proprietary stack then
             | you are investing all your internal learning into non-
             | transferrable skills instead of ones that can be
             | generalised.
        
           | singron wrote:
           | Do you have examples of Google Cloud arbitrarily changing
           | prices and terminating products?
           | 
           | Sure they terminate consumer products, and there was a Maps
           | price hike, but I'm not aware of anything that's part of
           | Cloud.
        
             | ma2rten wrote:
              | A very long time ago App Engine went out of beta and there
              | was a price hike, leaving many scrambling. App Engine was
              | in beta so long that many people didn't think that label
              | meant anything.
        
             | yls wrote:
             | IIRC they introduced a cluster management fee in GKE.
        
               | miscaccount wrote:
                | not much, 10 cents per hour:
                | https://www.reddit.com/r/kubernetes/comments/fdgblk/google_g...
        
         | jbmsf wrote:
          | It's always a trade-off though. You say you write most of your
          | own software, but that's probably not true for, say, your OS
          | or programming language, or editors, or a million other things.
         | Cloud software is the same; you might not be producing the most
         | value if you spend your engineering hours (re)creating
         | something you could buy.
         | 
         | In my own experience:
         | 
         | - AWS SNS and SQS are rock solid and provide excellent
         | foundations for distributed systems. I know I would struggle to
         | create the same level of reliability if I wrote my own publish-
         | subscribe primitives and I've played enough with some of the
         | open source alternatives to know they require operational costs
         | that I don't want to pay.
         | 
         | - I use EC2 some of the time (e.g. when I need GPUs), but I
         | prefer to use containers because they offer a superior solution
         | for reproducible installation. I tend to use ECS because I
         | don't want to take on the complexity of K8S and it offers me
         | enough to have reliable, load-balanced services. ECS with
         | Fargate is a great building block for many, run-of-the-mill
         | services (e.g. no GPU, not crazy resource usages).
         | 
         | - Lambda is incredibly useful as glue between systems. I use
         | Lambda to connect S3, SES, CloudWatch, and SQS to application
         | code. I've also gone without Lambda on the SQS side and written
         | my framework layers to dispatch messages to application code.
         | This has advantages (e.g. finer-grain backoff control) but
         | isn't worth it for smaller projects.
         | 
         | - Secrets manager is a nice foundational component. There are
         | alternatives out there, but it integrates so well with ECS that
         | I rarely consider them.
         | 
         | - RDS is terrific. In a past life, I spent time writing
         | database failover logic and it was way too hard to get right
         | consistently. I love not having to think about it. Plus
         | encryption, backup, and monitoring are all batteries included.
         | 
         | - VPC networking is essential. I've seen too many setups that
         | just use the default VPC and run an EC2 instance on a public
         | IP. The horror.
         | 
         | - I've recently started to appreciate the value of Step
         | Functions. When I write distributed systems, I tend to end up
         | with a number of discrete components that each handle one part
         | of a problem domain. This works, but creates understandability
         | problems. I don't love writing Step Functions using a JSON
         | grammar that isn't easy to test locally, but I find that the
         | visibility they offer in terms of tracing a workflow is very
         | nice.
         | 
         | - CloudFront isn't the best CDN, but it is often good enough. I
         | tend to use it for frontend application hosting (along with S3,
         | Route53, and ACM).
         | 
         | - CloudWatch is hard to avoid, though I rather dislike it.
         | CloudWatch rules are useful for implementing cron-like triggers
         | and detecting events in AWS systems, for example knowing
         | whether EC2 failed to provision spot capacity.
         | 
          | - I have mixed feelings about DynamoDB as well. It offers a
          | nice set of primitives and is often easier to start using for
          | small projects than something like RDS, but I rarely operate at the
         | scales where it's a better solution than something like RDS
         | PostgreSQL with all the terrific libraries and frameworks that
         | work with it.
         | 
         | - At some scale, you want to segregate AWS resources across
         | different accounts, usually with SSO and some level of
         | automated provisioning. You can't escape IAM here and Control
         | Tower is a pretty nice solution element as well.
         | 
         | I'm not sure if I'm up to 20 services yet, but it's probably
         | close enough to answer your question. There are better and
         | worse services out there, but you can get a lot of business
         | value by making the right trade-offs, both because you get
         | something that would be hard to build with the same level of
         | reliability and security and because you can spend your time
         | writing software that speaks more directly to product needs.
         | 
         | As for "having a fantastic time", YMMV. I am a huge fan of
         | Terraform and tend to enjoy developing at that level. The
         | solutions I've built provide building blocks for development
         | teams who mostly don't have to think about the services.
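The "framework layers to dispatch messages to application code" mentioned above can be sketched transport-agnostically. This is a hypothetical illustration (the class and field names are mine, not jbmsf's); the SQS receive loop would sit outside it:

```python
import json
from typing import Callable, Dict

class MessageDispatcher:
    """Route a JSON message to a handler based on its "type" field.

    With SQS this would sit behind sqs.receive_message(...): delete the
    message from the queue only if dispatch() returns True, so failures
    and unknown types can fall through to a dead-letter queue.
    """

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[[dict], None]] = {}

    def register(self, msg_type: str, handler: Callable[[dict], None]) -> None:
        self._handlers[msg_type] = handler

    def dispatch(self, raw_body: str) -> bool:
        msg = json.loads(raw_body)
        handler = self._handlers.get(msg.get("type", ""))
        if handler is None:
            return False  # unknown message type: don't acknowledge it
        handler(msg)
        return True
```

Keeping the dispatch layer decoupled from the queue API is what makes the finer-grained backoff control mentioned above possible without rewriting application handlers.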
        
         | mycall wrote:
         | Did you look into multi-cloud solutions like Pulumi or
         | Terraform to abstract your cloud vendor?
        
         | jasode wrote:
         | _> I would also be interested to hear the other side of the
         | coin. Who out there is using 20+ AWS/Azure/GCP products to back
         | a single business app and is having a fantastic time of it?_
         | 
         | Netflix uses a lot of AWS higher-level services beyond the
         | basics of EC2 + S3. Netflix definitely doesn't restrict its use
         | of AWS to only be a "dumb data center". Across various tech
         | presentations by Netflix engineers, I count at least 17 AWS
         | services they use.
         | 
         | + EC2, S3, RDS, DynamoDB, EMR, ELB, Redshift, Lambda, Kinesis,
         | VPC, Route 53, CloudTrail, CloudWatch, SQS, SES, ECS, SimpleDB,
         | <probably many more>.
         | 
         | I think we can assume they use 20+ AWS services.
        
           | p_l wrote:
           | Certain services IMHO have to be discounted from this list:
           | 
           | - VPC - basic building block for any AWS-based infra that
           | isn't ancient
           | 
           | - CloudTrail - only way to get audit logs out of AWS, no
           | matter what you feed them into
           | 
            | - CloudWatch - similar to CloudTrail, many things (but not
           | all) will log to CloudWatch, and if you use your own log
           | infra you'll have to pull from it. Also necessary for
           | metrics.
           | 
           | - ELB/ELBv2/NLB/ALB - for many reasons they are often the
           | only ways to pull traffic to your services deployed on AWS.
            | Yes, you can sometimes do it some other way, but you have a
            | high chance of feeling the pain.
           | 
           | My personal typical set for AWS is EC2, RDS, all the
           | VPC/ELB/NLB/ALB stack, Route53, CloudTrail + CloudWatch. S3
           | and RDS as needed, as both are easily moved elsewhere.
        
             | tidepod12 wrote:
             | I don't think you can discount them like that. Maybe they
             | aren't as front of mind as services like S3, EC2, etc, but
             | if you were to try to rebuild your setup in a personal data
             | center, replacing the capabilities of VPC, IAM, CloudTrail,
             | NAT gateways, ELBs, KMS etc would be a huge effort on your
             | part. The fact that they are "basic building blocks" makes
             | them more important, not less. In a discussion about the
             | complexity of cloud providers versus other setups, that
             | seems especially relevant.
        
               | p_l wrote:
                | Oh, I meant it more in terms of "can you count on them
                | as _optional_ services".
                | 
                | Because they aren't optional, and yes, it takes a non-
                | trivial amount of effort to replicate them... but funnily
                | enough, several of them have to be replicated elsewhere
                | too.
               | 
               | NAT gateways usually aren't an issue, KMS for many places
               | can be done relatively quickly with Hashicorp Vault.
               | 
               | IAM is a weird case, because unless you're building a
               | cloud for others to use it's not necessarily that
               | important, meanwhile your own authorization framework is
                | necessary even on AWS because you can't just piggyback
                | on IAM (I wish I could).
        
             | fiddlerwoaroof wrote:
             | I mostly agree, although ECS with Fargate is often nicer to
             | use than EC2
        
         | [deleted]
        
         | notretarded wrote:
         | I'm in two minds about this (deeper integration with a
         | particular vendor - i.e. "serverless")
         | 
         | Reduced time to market is incredibly valuable. Current client
         | base is well in its millions. Ability to test to few and roll
         | out to many instantly is invaluable. You no longer have to hire
         | competent software developers who understand all patterns and
         | practices to make scalable code and infrastructure. Just need
         | them to work on a particular unit or function.
         | 
          | The thing which scares me is that some of these companies are
          | decades old, some even centuries. How long have the
          | AWS/GCP/Azure abstractions been around for? How quick are we
          | to graveyard some of these platforms? Quite. A lot quicker
          | than you can lift, shift and rewrite your solution to run
          | elsewhere.
        
         | rvanmil wrote:
          | We carefully select and use PaaS and managed cloud services to
          | construct our infrastructure. This allows us to maximize
         | our focus on what our customers are paying for: creating
         | software for them which will typically be in use for 5+ years.
         | We spend close to zero time on infrastructure maintenance and
         | management, we pay others to do this for us, cheaper and more
          | reliable. Having to swap out one service for another hasn't
          | given us any trouble or unreasonable costs in the past 5
          | years. Contrary to what the article tries to convince us of,
          | it has _massively_ reduced complexity for us.
        
         | [deleted]
        
         | bird_monster wrote:
         | > There is no value to our customers in us stringing together a
         | pile of someone else's products
         | 
         | Maybe not your business, but there are many businesses in which
         | this is exactly what happens. Any managed-service is just
         | combining other people's work into a "product" that gets sold
          | to customers. And that's great! AWS has a staggering number of
          | products, and lots of businesses don't even want to have to
          | care about AWS.
         | 
         | > Who out there is using 20+ AWS/Azure/GCP products to back a
         | single business app and is having a fantastic time of it?
         | 
         | Several times. I think cloud products are just tools to get you
         | further along in your business. Most of the tools I use are
         | distributed systems tools, because I don't want to have to own
         | them, and container runtimes/datastores. Every single thing
         | I've ever deployed across AWS/Azure is used as a generic
         | interface that could be replaced relatively easily if
         | necessary, and I've used Terraform to manage my infrastructure
         | creation/deployment process, so that I can swap resources in
         | and out without having to change tech.
         | 
         | If, for some reason, Azure Event Hub stopped providing what we
         | needed it for, we could certainly deploy a customized Kafka
         | implementation and have the rest of our code not really know or
         | care, but from when we set out to build our products, that has
         | always been a "If we need to" problem, and we've never needed
         | to.
        
         | g9yuayon wrote:
         | So your company cautiously chooses which services in AWS to
         | use, and sticks to infrastructure offerings for now. Netflix
         | called it "paved path", and it worked really well too for
         | Netflix. Over the years, though, the "paved path" expanded and
         | extended to more services. It's worth noting that EC2 alone is
          | a huge productivity booster, bar none. Nothing beats setting
          | up a cluster of machines, with a few clicks, that auto-scales
          | per dynamic scaling policies. In contrast, Uber couldn't do
          | this for at least 5 years, and their docker-based cluster
          | system is crippled by not supporting the damn persistent
          | volumes. God
         | knows how much productivity was lost because of the bogus
         | reasons Uber had for not going to cloud.
        
         | bane wrote:
         | I've worked with a number of teams over the last few years who
         | use AWS and I'd say from top to bottom they all build their
         | strategy more or less the same way:
         | 
         | 0. Whatever is the minimum needed to get a VPC stood up.
         | 
         | 1. EC2 as 90%+ of whatever they're doing
         | 
         | 2. S3 for storing lots of stuff and/or crossing VPC boundaries
         | for data ingress/egress (like seriously, S3 seems to be used
         | more as an alternative to SFTP than for anything else). This
         | makes up usually the rest of the thinking.
         | 
         | 3. _Maybe_ one other technology that 's usually from the set of
         | {Lambda, Batch, Redshift, SQS} but _rarely_ any combination of
         | two or more of those.
         | 
          | And that's it. I know there are teams that go all in. But for
          | the dozen or so teams I've personally interacted with, this is
          | it.
         | The rest of the stack is usually something stuffed into an EC2
         | instance instead of using an AWS version and it comes down to
         | one thing: the difficulties in estimating pricing for those
         | pieces. EC2 instances are drop-dead simple to price estimate
         | forward 6 months, 12 months or longer.
         | 
         | Amazon is probably leaving billions on the table every year
         | because nobody can figure out how to price things so their
         | department can make their yearly budget requests. The one time
          | somebody tries some managed service and it goes over budget
          | by 3000%, and the after-action review figures out that it
          | would have been within the budget using <open source
          | technology> in EC2, they just do that instead -- even though
          | it increases the
         | staff cost and maintenance complexity.
         | 
         | In fact just this past week a team was looking at using
         | SageMaker in an effort to go all "cloud native", took one look
         | at the pricing sheet and noped right back to Jupyter and
         | scikit_learn in a few EC2 instances.
         | 
         | An entire different group I'm working with is evaluating cloud
         | management tools and most of them just simplify provisioning
         | EC2 instances and tracking instance costs. They really don't do
         | much for tracking costs from almost any of the other services.
        
           | hyperdimension wrote:
            | I'm not very familiar with AWS or The Cloud, but I'm having
            | trouble understanding what you said about Amazon leaving
            | money on the table by not directing customers toward
            | specific-purpose services as opposed to EC2.
           | 
           | Wouldn't (for AWS to make a profit anyway) whatever managed
           | service _have_ to be cheaper than some equivalent service
           | running on an EC2 VM?
           | 
           | I get the concerns re: pricing and predictability, but it
           | still seems like more $$$ for AWS.
        
             | bane wrote:
             | Yeah good question. Sibling comments to this one explain it
             | well, but basically AWS managed services come at a premium
             | price over some equivalent running in just EC2. (Some
             | services in fact _do_ charge you for EC2 time + the service
             | + storage etc.)
             | 
             | "Managed" usually means "pay us more in exchange for less
             | work on your part". This is usually pitched as a way to
             | reduce admin/infrastructure/devops type staff and the
             | overhead that goes along with having those people on the
             | payroll.
        
             | nucleardog wrote:
             | No, usually the managed services are a premium over the
             | bare hardware. When you use RDS for example, you're paying
             | for the compute resources but also paying for the extra
             | functionality they provide and their management and
             | maintenance they're doing for you. You can run your own
             | Postgres database, or you can pay the premium for Aurora on
             | RDS and get a multi-region setup with point in time restore
             | and one-click scaling and automatically managed storage
             | size and automatic patching and monitoring integrated into
             | AWS monitoring tools and...
             | 
              | They're leaving money on the table because instead of using
             | "Amazon Managed $X" potentially at a premium or paying a
             | similar amount but in a way where AWS can provide the
             | service with fewer compute resources than you or I would
             | need because of their scale and thus more profitably,
             | people look and see they'll be paying $0.10/1000 requests
             | and $0.05/1gb of data processed in a query and $0.10/gb for
             | bandwidth for any transfer that leaves the region and...
             | people just give up and go "I have no idea what that will
             | cost or whether I can afford it, but this EC2 instance is
             | $150/mo, I can afford that."
        
             | glogla wrote:
             | For example:
             | 
              | Managed Airflow Scheduler on AWS at the "large" size
              | costs $0.99/hour, or ~$8,672/year per instance. That's
              | ~$17,500 once you run Airflow for at least non-prod
              | and prod instances.
             | 
              | Building it on your own on the same-size EC2 instance
              | would cost $3,363/year for the EC2. Times two for two
              | environments, let's say $6,700 - or $4,000 if you
              | prepay the instances.
             | 
             | That looks way cheaper, but then you have to do the
             | engineering and the operational support yourself.
             | 
              | If you consider just the engineering, assume an
              | engineer costs $50/hour, and estimate this at three
              | weeks of initial work plus 2.5 days/month for support
              | (upgrades, tuning, ...), that's an extra $4,000
              | upfront and $1,000/month.
             | 
              | So on AWS you're at $17,500/year, and self-managed
              | you're at best $20,000 the first year and $16,000 in
              | subsequent years.
              | 
              | So AWS comes out only a bit more expensive - but the
              | math is tricky in several ways:
             | 
             | - maybe you need 4 environments deployed instead of 2,
             | which is more for AWS but not much more for engineering?
             | 
             | - maybe there's less sustaining cost because you're ok with
             | upgrading Airflow only once a quarter?
             | 
              | - you probably already pay the engineers, so it's not an
              | extra _money_ cost, it's the cost of them not working on
              | other stuff - different boxes and budgets
             | 
              | - maybe you're in a part of the world where a good
              | devops engineer doesn't cost $50/hour but $15/hour
             | 
             | - I'm ignoring cost of operational support, which can be a
             | lot for on-prem if you need 24/7
             | 
             | - maybe you need 12+ Airflow instances thanks to your
             | fragmented / federated IT and can share the engineering
             | cost
             | 
             | - etc, etc.
             | 
              | So I think what OP was saying is that if AWS priced
              | managed Airflow at $0.50 per hour, it would be a no-
              | brainer to use instead of building your own. As it is,
              | some customers will surely run their own Airflow
              | instead, because the math favors it.
             | 
             | Does that make sense?
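              | 
              | A rough sketch of this math as code (all figures are
              | this comment's estimates, not AWS list prices; the
              | 80-hour setup figure is an assumption matching the
              | ~$4,000 upfront number):

```python
# Managed-vs-DIY Airflow math from the comment above. All dollar
# figures are the commenter's estimates, not AWS list prices.

HOURS_PER_YEAR = 24 * 365  # 8,760

def managed_cost(rate_per_hour, environments):
    """Yearly cost of the managed service across all environments."""
    return rate_per_hour * HOURS_PER_YEAR * environments

def self_hosted_cost(ec2_per_year, environments, eng_rate,
                     setup_hours, support_hours_per_month):
    """Year-one cost of running it yourself: EC2 plus engineering
    time. Steady-state years drop the setup term."""
    compute = ec2_per_year * environments
    setup = eng_rate * setup_hours
    support = eng_rate * support_hours_per_month * 12
    return compute + setup + support

managed = managed_cost(0.99, environments=2)   # ~$17,345/year
diy = self_hosted_cost(3363, 2, eng_rate=50,
                       setup_hours=80,              # ~2 weeks (assumed)
                       support_hours_per_month=20)  # ~2.5 days
```

              | Either way, the gap is small enough that the soft
              | factors in the bullet points tend to dominate.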
        
               | bird_monster wrote:
               | > That looks way cheaper, but then you have to do the
               | engineering and the operational support yourself.
               | 
               | In my experience, this is the piece that engineers rarely
               | realize and that is actually one of the biggest factors
               | in evaluating cloud providers vs. home-rolled. Especially
                | if you're a small company, engineering time (really
                | any employee time) is _insanely valuable_ - so much
                | so that even if managed Airflow is cash-expensive, if
                | using it allows your engineers to focus on building
                | whatever makes _your business successful_, it is
                | usually a much better idea to just use it and keep
                | moving. Clients usually will
               | not care about whether you implemented your own version
               | of an AWS product (unless that's your company's specific
               | business). Clients will care about the features you ship.
               | If you spent a ton of time re-inventing Airflow to save
               | some cost, but then go bankrupt before you ever ship,
               | rolling your own Airflow implementation clearly didn't
               | save you anything.
        
               | glogla wrote:
               | I agree.
               | 
                | The only caveat is that this goes for founders or
                | engineers who are financially tied to the company's
                | success. If an engineer just collects a paycheck, they
               | might prioritize fun - and I feel that might be behind a
               | lot of the "reinventing the wheel" efforts you see in the
               | industry.
               | 
               | Or maybe I'm just cynical.
        
               | tpxl wrote:
               | We used to have on-prem redis and a devops engineer to
               | manage it, then we moved to redis in the cloud and had a
               | devops engineer to manage it.
               | 
               | Saying that in the cloud you don't need engineers to
               | manage "operational support" is the biggest lie the cloud
               | managed to sell.
        
             | mumblemumble wrote:
             | It's not just about a straight cost comparison. It's about
             | how organizational decision-making works.
             | 
             | The people shopping for products are not spending their own
             | money, but they are spending their own time and energy. The
             | people approving budgets are not considering all possible
             | alternatives, they are only considering the ones that have
             | been presented to them by the people doing the shopping.
             | 
             | If the shoppers decide that an option will cost them too
             | much in time and irritation, then it may be that the people
             | holding the purse-strings are never even made aware that it
             | exists. Even if it is the cheapest option.
        
               | gregmac wrote:
               | This is a really good summary of the situation, and I'd
               | add a bit about risk:
               | 
                | It's relatively easy to estimate EC2 costs for running
               | some random service, because it's literally just a per-
               | hour fee times number of instances. If you're wrong, the
               | bigger instance size or more instances isn't that much
               | more expensive.
               | 
               | For almost every other service, you have to estimate some
               | other much more detailed metric: number of http requests,
               | bytes per message, etc. When you haven't yet written the
               | software, those details can be very fuzzy, making the
                | whole estimation process extremely risky - it could be
                | cheaper than EC2, it could be 10x more, and we won't
                | really know until we've spent at least a couple of
                | months writing code. And let's hope we don't pivot or
                | have our customers do anything in a way we're not
                | expecting...
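                | 
                | To make the risk concrete, here's a toy estimate
                | plugging invented usage numbers into the per-unit
                | prices quoted above - the fact that the numbers are
                | invented is exactly the problem:

```python
# Why usage-priced services are hard to budget: a fuzzy usage
# estimate fed into per-unit prices (illustrative, not current AWS
# rates) spans two orders of magnitude of monthly cost.

def usage_cost(requests, gb_queried, gb_egress):
    """Monthly cost at $0.10/1000 requests, $0.05/GB queried,
    and $0.10/GB of cross-region egress."""
    return (requests / 1000 * 0.10
            + gb_queried * 0.05
            + gb_egress * 0.10)

# Pre-launch, monthly requests might plausibly sit anywhere in a
# 100x window:
low = usage_cost(requests=1_000_000, gb_queried=100, gb_egress=50)
high = usage_cost(requests=100_000_000, gb_queried=10_000,
                  gb_egress=5_000)
# low is ~$110/mo, high is ~$11,000/mo - versus a flat $150/mo EC2
# instance whose worst case is "resize the instance".
```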
        
           | harikb wrote:
           | +1
           | 
           | I bet cloud providers are incentivized not to provide
           | detailed billing/usage stats. I remember having to use a 3rd
           | party service to analyze our S3 usage.
           | 
           | Infinite scalability is also a curse - we had a case where
           | pruning history from an S3 bucket was failing for months and
           | we didn't know until the storage bill became significant
           | enough to notice. I guess in some ways it is better than
            | being woken up in the middle of the night, but we wasted
            | _millions_ storing useless data.
           | 
            | Azure also has similar issues - deleting a VM sometimes
            | doesn't clean up dependent resources, and it is a mess to
            | find and delete them later - only because the dependent
            | resources are deliberately not named with a matching tag.
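            | 
            | For the S3 pruning case specifically, one fail-safe
            | option is to let S3 expire the objects itself with a
            | lifecycle rule instead of running a job you have to
            | monitor. A sketch (the prefix and retention period are
            | placeholders):

```json
{
  "Rules": [
    {
      "ID": "expire-old-history",
      "Filter": { "Prefix": "history/" },
      "Status": "Enabled",
      "Expiration": { "Days": 90 },
      "AbortIncompleteMultipartUpload": { "DaysAfterInitiation": 7 }
    }
  ]
}
```

            | Applied with `aws s3api put-bucket-lifecycle-
            | configuration`; it won't page anyone, but it also can't
            | silently stop running for months.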
        
             | zmmmmm wrote:
             | > Infinite scalability is also a curse
             | 
             | People don't like to admit it, but in many circumstances,
             | having a service that is escalating to 10x or 100x its
              | normal demand go offline is probably the _desirable_
             | thing.
        
             | threentaway wrote:
             | This seems like you didn't have proper monitoring and
             | alerting set up for your job, not sure how that is a
             | downside of AWS.
        
               | jjoonathan wrote:
               | AWS monitoring (and billing) is garbage because they make
               | an extraordinary amount of money on unintentional spend.
               | 
               | "But look at how many monitoring solutions they have in
               | the dashboard! Why, just last re:invent they announced 20
               | new monitoring features!"
               | 
               | They make a big fuss and show about improving monitoring
               | but it's always crippled in some way that makes it easy
               | to get wrong and time-consuming or expensive to get
               | right.
        
               | milesvp wrote:
               | > Infinite scalability is also a curse
               | 
               | This was the key sentence, I think. This type of problem
               | actually shows up in other domains as well, queueing
               | theory comes immediately to mind. Even the halting
               | problem is only a problem with infinite tape, and becomes
               | easier with (known?) limited resources.
               | 
               | When you have some parameter that is unbounded you need
               | to add extra checks to bound them yourself to some sane
               | value. You are right, in that the parent failed to
               | monitor some infrastructure, but if they were in their
                | own datacenter, once they filled their NAS, I'm
                | positive someone would have noticed, if only because
                | other checks, like disk space, are less likely to be
                | forgotten.
               | 
               | Also, getting a huge surprise bill is a downside of any
               | option, and the risk needs to be factored into the cost.
               | I'm constantly paranoid when working in a cloud
               | environment, even doing something as trivial as a
               | directory listing from the command line on S3 costs
                | money. I had a few back-and-forths with AWS support
                | just to be clear what the order of magnitude of the
                | bill would be for a simple cleanup action, since there
                | were 2 documented ways to do what I needed, and the
                | one that appeared easier was significantly more
                | expensive.
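                | 
                | The "bound it yourself" idea can be as simple as a
                | counter that refuses to exceed a hard cap - a sketch
                | (the names and the cap are made up for illustration):

```python
# Wrap an unbounded, pay-per-use operation in an explicit budget so
# a runaway job fails loudly instead of silently accruing cost.

class BudgetExceeded(RuntimeError):
    pass

class RequestBudget:
    """Counts billable operations and refuses to pass a hard cap."""

    def __init__(self, max_requests):
        self.max_requests = max_requests
        self.used = 0

    def spend(self, n=1):
        self.used += n
        if self.used > self.max_requests:
            raise BudgetExceeded(
                f"{self.used} requests exceeds cap of "
                f"{self.max_requests}")

# e.g. charge one unit per page of a bucket listing:
budget = RequestBudget(max_requests=10_000)
for page in range(5):
    budget.spend()
```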
        
           | bird_monster wrote:
           | +1, but with a container tool (Fargate/ECS, Azure Container
           | Instances) instead of EC2.
        
       | nelsonenzo wrote:
        | This person has never worked in a data center. He thinks he's
        | managing a network because he sets a few VPC IPs; that's a
        | tiny fragment of networking, and the cloud has indeed removed
        | a great deal you previously had to manage on-prem.
        
       | zoomablemind wrote:
       | Subjectively, it increasingly feels that while the complexity has
       | been increasing, the notion of longevity of the underlying
       | products and services has been degrading.
       | 
       | While updates to software were expected, general outlook would be
       | that they would not be breaking the core features. The emphasis
       | on backwards compatibility was in a way an assurance to
       | businesses that building their operations on vendor's products is
       | not risky. Even then, some mission-critical elements would be
       | defensively abstracted to avoid the dependency risks (at least
       | theoretically...)
       | 
        | Now, we all witness the "eternal beta" paradigm across most
        | of the major software products: frequent builds with
        | automatic updates, where new features can suddenly be pushed
        | and old features removed.
       | 
        | Sure, it's still possible to spec out a rock-solid platform,
        | postpone updates, abstract dependencies, and just focus on
        | business. But... such an approach won't be approved, as it's
        | widely acknowledged that the presence of critical bugs is
        | effectively a "feature" of all software. Postponing updates
        | is not prudent; it's a liability.
       | 
       | So the rock-solid expectations are just an illusion or perhaps a
       | fantasy promoted widely, just to get the foot in the door.
       | 
       | Ironically, the most stable elements are the so much dreaded
       | "legacy", too often in charge of the business-critical logic.
        
       | s3tz wrote:
       | Anyone manage to find part 1? It's not on their site, can't seem
       | to find it.
        
         | jvanderbot wrote:
         | It's linked in the article https://ea.rna.nl/2016/01/10/a-tale-
         | of-application-rationali...
         | 
         | "This was actually part 1 of this story: A tale of application
         | rationalisation (not)."
        
       | [deleted]
        
       | CalChris wrote:
        | Isn't this just _Jevons' paradox_ applied to software?
        | 
        | _when technological progress ... increases the efficiency
        | with which a resource is used ... but the rate of consumption
        | of that resource rises due to increasing demand_ [1]
       | 
       | [1] https://en.wikipedia.org/wiki/Jevons_paradox
        
       | ChicagoDave wrote:
       | Reducing complexity should never be about platform (on-prem vs
       | cloud).
       | 
       | It should be about constructing software in partnership with the
       | business and reducing complexity with modeled boundaries.
       | 
        | You can leverage the cloud to do some interesting things, but
        | the true benefit is in _what_ you construct, not _how_.
        
         | sanp wrote:
         | There is an element of _how_ as well. You could create simple
         | monoliths or overengineered microservices. Or, complex
         | monoliths with heavy coupling vs cleanly designed microservices
         | with clear separations of concern.
        
           | the-smug-one wrote:
           | Are microservices meant to separate data too? As in, each
           | service has its own database.
           | 
           | Wouldn't that lead to non-normalisation of the data or a lot
           | of expensive network lookups to get what I want/need?
           | 
           | What is the point of micro services anyway :-)?
        
             | kqr wrote:
             | > Are microservices meant to separate data too? As in, each
             | service has its own database.
             | 
             | Yes.
             | 
             | > Wouldn't that lead to non-normalisation of the data
             | 
             | Yes. But it's not as bad as it sounds. That is how data on
             | paper used to work, after all.
             | 
             | Business rules (at least ones that have been around for
             | more than 5--10 years) are written with intensely non-
             | normalised data in mind.
             | 
             | Business people tend to be fine with eventual consistency
             | on the scale of hours or even days.
             | 
             | Non-normalised data also makes total data corruption
             | harder, and forensics in the case of bugs easier, in some
             | ways: you find an unexpected value somewhere? Check the
             | other versions that ought to exist and you can probably
             | retrace at what point it got weird.
             | 
              | The whole idea of consistent and fully normalised data
              | is, historically speaking, a very recent innovation,
              | and I'm
             | not convinced it will last long in the real world. I think
             | this is a brief moment in history when our software is
             | primitive enough, yet optimistic enough, to even consider
             | that type of data storage.
             | 
             | And come on, it's not like the complete consistency of the
             | data is worth _that_ many dollars in most cases, if we
             | actually bother to compute the cost.
        
             | ratww wrote:
             | _> Are microservices meant to separate data too? As in,
             | each service has its own database._
             | 
             | Ideally yes, to scale.
             | 
             | Sometimes you have a service with obvious and easy-to-split
             | boundaries, and microservices are a breeze.
             | 
             | Some things that are easy to turn into microservices: "API
             | Wrapper" to a complex and messy third-party API. Logging
             | and data collection. Sending emails/messages. User
             | authentication. Search. Anything in your app that could
             | become another app.
             | 
              | However, when your data model is tightly coupled, you
              | have to choose between tradeoffs: data duplication,
              | bigger services, or even keeping it as a monolith.
             | 
              | Btw, if you don't care about scalability, sharing a
              | database is still not the best idea. But you can have a
              | microservice that wraps the database in a service, for
              | example. Tools like Hasura can be used for that.
        
             | NicoJuicy wrote:
              | Microservices are a solution for an organisational
              | problem (multiple employees in one project), not a
              | technical one.
             | 
             | If you're flying solo, just use DDD for example. It will
             | give you the same patterns without the devops complexity
        
         | mumblemumble wrote:
         | I honestly believe that hiding complexity behind a closed door
         | does not eliminate it. However, a lot of software and service
         | vendors have a vested interest in convincing people otherwise.
         | And, historically, they've had all sorts of great platforms for
         | doing so. Who doesn't enjoy a free day out of the office, with
         | lunch provided?
         | 
         | It's also much easier to hide complexity than it is to remove
         | it. One can be accomplished with (relative) turnkey solutions,
         | generally without ever having to leave the comfort of your
         | computer. Whereas the other generally requires long hours
         | standing in front of a chalkboard and scratching your head.
        
           | SpicyLemonZest wrote:
            | On the other hand, hiding complexity behind closed doors
            | can be a very valuable thing, if it lets you keep track of
            | who knows about the complexity behind each door. I can't
           | number of issues I've encountered that would have taken
           | minutes instead of hours if only I'd known which specific
           | experts I needed to talk to.
        
             | mumblemumble wrote:
              | Agreed. Though that too comes at a cost, so I don't want
              | to do it except when it's worth it.
             | 
             | http://yosefk.com/blog/redundancy-vs-dependencies-which-
             | is-w...
        
       | candiddevmike wrote:
       | I see a lot of mentions in the comments about just using the
       | basic storage/networking/compute from AWS/AZ/GCP--if that's all
       | you're using, you should really consider other providers. Linode,
       | Digital Ocean, and Vultr will be far more competitive and offer
       | faster machines, cheaper, and with better bandwidth pricing.
       | 
       | The point of using AWS/AZ/GCP is to leverage their managed
       | service portfolio and be locked in. If you aren't doing that,
       | there are better companies that want your business and will treat
       | you much better.
        
         | dilyevsky wrote:
          | There's also Packet (now Equinix Metal), which gives you
          | control over L2 and has nice things such as iBGP. I think
          | Vultr may too, but their docs are poor and their support
          | was uncooperative.
        
         | ithrow wrote:
          | IME, the network of AWS is much better than that of DO or
          | Linode.
        
           | dilyevsky wrote:
           | How so?
        
             | ithrow wrote:
              | Fewer hiccups and less downtime. It's faster, with
              | better latency to other third-party services, and has
              | superior internal control. E.g. in Linode, a private IP
              | address gives EVERYONE in the same data center access to
              | your Linode server. Also, last time I used them they
              | didn't have a firewall.
        
               | dilyevsky wrote:
                | Right, Linode is basically old-school dedicated
                | servers afaik, but DO should be in a different class.
        
               | dijit wrote:
               | AWS networking isn't _great_ but it's decidedly better
               | than DO (which is actually the worst of those listed
               | based on my own TCP connection tests).
               | 
               | Linode is pretty stable if not very exciting, Vultr is
               | "better than DO", but their networks are almost always in
               | maintenance.
               | 
               | For a little context; I maintain IRC servers and those
               | are currently hosted in vultr (with linked nodes in 5
               | regions), I notice ping spikes between those nodes often
               | and sometimes congestion which drops users. (IRC is
               | highly stateful TCP sessions).
               | 
               | I've only known two truly good networking suppliers, GCP
               | (and their magic BGP<->PoP<->Dark Fibre networks) and..
               | Tilaa.. (which is only hosted in Netherlands.. which is
               | why I can't use them for my global network)
        
               | dilyevsky wrote:
                | Awesome, thanks for the info. For GCP I notice
                | occasional unavailability on the order of tens of
                | minutes every quarter or so. That's VM networking.
                | Their load balancers are a different story, as they
                | are complete crap.
        
               | dijit wrote:
               | This is very much not my experience, do you have any more
               | information?
               | 
               | Any particular regions? Are you certain it's not a local
               | ISP?
               | 
               | (I used to run an always online video game and we had a
               | LOOOOOT of connection issues from "Spectrum internet" on
               | all of our servers including GCP ones.)
        
               | dilyevsky wrote:
               | Answering here bc bottom post is locked for some reason -
               | east1 occasionally disconnects from other regions. That
               | is definitely within google backbone. Central-1 seems
               | worse tho. If it's less than an hour they dont bother
               | with the status page.
               | 
               | For loadbalancer its very much by design as they randomly
               | send you rst when google rolls them for upgrade and in
               | some other cases (I'm working on a blog post on this).
               | Google support recommendation is to retry (foreals)
        
       | mrkramer wrote:
        | Microsoft summarized it nicely [1]:
       | 
       | Advantages of public clouds:
       | 
       | Lower costs
       | 
       | No maintenance
       | 
       | Near-unlimited scalability
       | 
       | High reliability
       | 
       | Advantages of a private cloud:
       | 
       | More flexibility
       | 
       | More control
       | 
       | More scalability (compared to pure on-prem solution)
       | 
       | [1] https://azure.microsoft.com/en-us/overview/what-are-
       | private-...
        
         | indymike wrote:
         | Hmm... So Azure for unlimited scalability... But private clouds
         | have more scalability?
        
           | beaconstudios wrote:
           | Presumably both options are written relative to non cloud
           | setups
        
           | 8K832d7tNmiQ wrote:
              | They probably meant that you can request hundreds of
              | servers in another part of the world with a single
              | setup, compared to manually building your servers there.
        
             | mrkramer wrote:
             | "More scalability -- private clouds often offer more
             | scalability compared to on-premises infrastructure."
             | 
             | I think they meant private cloud (renting 3rd party servers
             | and using/maintaining your private cloud) vs on-prem
             | (buying servers and building your own data centers).
        
       | jvanderbot wrote:
       | Part 1 is linked in article
       | 
       | https://ea.rna.nl/2016/01/10/a-tale-of-application-rationali...
       | 
       | "This was actually part 1 of this story: A tale of application
       | rationalisation (not)."
        
       | [deleted]
        
       | skohan wrote:
       | I have to say, at my current company we are using Serverless, and
       | it really does feel like it reduces complexity. No
       | runtime/framework to set up, no uptime monitoring or management
       | required on the application layer, and scaling is essentially
       | solved for us. I mean you do pay for what you get, but it does
       | feel like one of those technologies which really lowers the
       | barrier to entry in terms of being able to release a production
       | web application. In terms of building an MVP, a single developer
       | really can deploy an application without any dev-ops support, and
       | it will serve 10 million users if you manage to get that much
       | traffic.
       | 
       | I'm sure it's not optimal for every case, but for an awful lot of
       | cases it seems pretty darned good, and you can save on dev ops
       | hiring.
        
         | ratww wrote:
         | I used to be very excited about serverless, and I still have
         | high hopes for it.
         | 
          | But for me it ended up replacing the complexity of runtimes
          | and frameworks with the complexity of configuring auxiliary
          | services like API Gateway, Amazon VPC, etc. We needed to
          | move the complexity into some tool that configured the
          | services around Lambda, like Terraform or CloudFormation,
          | or at best into a framework like Claudia or Serverless.com.
          | Configuring it by hand looks fine in tutorials, but is
          | madness: it's still complex, and makes it all way too
          | fragile.
         | 
         | There are however some products that make the experience better
         | by simplifying configuration, like Vercel and Netlify.
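          | 
          | For a sense of what those simplifying layers look like,
          | here's a minimal Serverless Framework config (the service
          | and handler names are placeholders); the framework
          | generates the API Gateway, IAM, and log-group wiring that
          | is painful to do by hand:

```yaml
# serverless.yml - one Lambda function behind an HTTP route
service: example-api        # placeholder name

provider:
  name: aws
  runtime: nodejs14.x
  region: eu-west-1

functions:
  hello:
    handler: handler.hello  # exported function in handler.js
    events:
      - http:               # provisions an API Gateway endpoint
          path: /hello
          method: get
```

          | `serverless deploy` turns this into a CloudFormation
          | stack; the complexity is still there, but it lives in
          | generated templates instead of hand-edited ones.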
        
           | skohan wrote:
            | Yeah, I certainly agree that the complexity doesn't
            | _really_ go away completely, and sometimes it's much more
            | frustrating to have to configure poorly documented
            | services rather than just having access to the host OS.
            | 
            | I guess my overall point would be that two of the
            | _hardest_ things in making a production-ready application
            | are scaling and security, and serverless pretty much
            | obviates them. So it's not a magic wand, but it does take
            | away some of the significant barriers to entry.
        
             | ratww wrote:
             | Yes, I agree with that point. I think my point was more
             | that Serverless is a good idea, but the current
             | implementations are still not good at removing complexity.
                | But I can see this easily changing, with open
                | standards and the like.
        
               | orlovs wrote:
                | Well, we just need to admit that running applications
                | while acknowledging all known risks is complex. If we
                | blissfully ignore the risks, like with a LAMP or LEMP
                | stack, it's much easier. The main question is whether
                | we need to account for most of those risks when
                | running at small scale.
        
           | 8note wrote:
           | I was expecting writing serverless to be a mess of writing
           | configuration, but I've really enjoyed writing CDK for
           | cloudformation. It's super unclear how you're supposed to
           | write good cdk code, but I feel like I'm a lot clearer on
           | what infrastructure I'm actually using than before, where I
           | was relying on stuff set up by someone else ages ago with
           | minimal to no documentation
        
       | arendtio wrote:
        | I wonder who tells the story that cloud computing has
        | something to do with reducing complexity. In my world, cloud
        | computing is about scalability and about making things as
        | complex as they need to be in order to be scalable. That
        | rarely means that complexity is being reduced.
        
         | didericis wrote:
         | The simplicity of having one cloud based product rather than
         | several native products built for different systems is an
         | argument I've heard a lot.
        
           | ratww wrote:
           | This is an advantage of the web platform, not exactly related
           | to cloud. You can get this advantage with an on-premises web
           | product, or with old school hosting.
        
             | didericis wrote:
             | Very true, just trying to explain the source of the "cloud
             | reduces complexity" argument. There are a number of small
             | operations that don't want to manage all their own
             | hardware, so cloud and web are conflated, and you get the
             | web platform simplicity argument being used to justify a
             | cloud platform.
        
       ___________________________________________________________________
       (page generated 2021-01-10 23:01 UTC)