[HN Gopher] Terraform vs. AWS CloudFormation
___________________________________________________________________
Author : historynops
Score  : 80 points
Date   : 2021-10-06 20:25 UTC (2 hours ago)

(HTM) web link (gswallow.medium.com)

| johnl1479 wrote:
| I can appreciate the author's criticisms of the shortcomings of
| Cloudformation, but this is really just a "Why you should use
| Terraform" post.
| mylons wrote:
| "But CDK transpiles into CloudFormation templates. For that
| reason alone I can't recommend it."
|
| CDK is superior to terraform for a glaring reason: it's a
| first-class citizen in AWS' eyes and terraform is not.
| thecopy wrote:
| > With Terraform, your local executable makes rest calls to each
| service's REST API for you, meaning no intermediary sits between
| you and the service you're controlling. Want an RDS instance?
| Terraform will make calls directly to the RDS API.
|
| How is this different than CloudFormation making the same calls?
| lykr0n wrote:
| You give CloudFormation a list of instructions. It accepts them
| and gives you an ID to watch for updates, then it goes off and
| executes them.
|
| Terraform executes a list of instructions. It executes them in
| front of you while you wait.
|
| Both are fine until you run into something like this:
|
| I'm pushing an Elastic Container Service Task Definition change
| via CDK. A CloudFormation change is submitted, and I wait for
| it to finish. In the background, it's trying to do the update
| but the update fails due to some misconfiguration with the new
| container.
|
| CloudFormation doesn't fail or return an error. It times out
| after an hour and reverts the change. I have to know to dig
| into the AWS console to find my failed tasks to view the error.
|
| If I did this update via Terraform, I would get the error back
| in my console quickly, as Terraform is directly telling ECS to
| make the change.
| With CDK, the CloudFormation changeset is
| generated, it is submitted to CloudFormation, then the tool
| polls the AWS API for progress updates. Sometimes you get
| specific messages back, sometimes it fails and you need to go
| in and see what it failed on.
| kennu wrote:
| That's right - use AWS CDK instead. You don't have to worry about
| the low-level CloudFormation syntax and details. I switched a few
| years ago and haven't looked back. CDK keeps getting better and
| better, also handling things like asset deployments (Docker
| images, S3 content, bundling), quick Lambda updates with
| --hotswap, quick stack debugging with --no-rollback, etc.
| fdgsdfogijq wrote:
| I'm always surprised that more people aren't aware of CDK. It's
| an extremely powerful way to write software. Especially once
| you get good at it. CFN pales in comparison; CDK to me feels
| like the future of software development.
| k__ wrote:
| Pulumi is also nice for non-AWS related stuff.
| nagyf wrote:
| I agree, and have the same experience. CDK is so much easier,
| much less verbose, and unit testable (at least to some degree).
|
| Since resource importing is possible in CDK (not nice, but
| possible) you can even start using it if you already have
| resources that you do not want to recreate.
| zenux wrote:
| Fun fact: in the leak of the Twitch (Amazon) repositories this
| morning, I saw that the developers use Terraform!
| cube2222 wrote:
| You can use more than one tool.
|
| CloudFormation is great because of its transactionality, so it
| lends itself nicely to deploying multiple services which are
| versioned together. You either succeed fully, or all services
| will be rolled back.
|
| This way you can deploy your whole infra with Terraform, and then
| deploy to e.g. your ECS cluster using CloudFormation. Works great
| in practice.
| zapt02 wrote:
| The rollback functionality of CF is a blessing.
| We use both CF and Terraform at my company and I vividly recall
| multiple times where my connection had cut out during
| "terraform apply" and left the Terraform infrastructure in a
| half-finished state.
| acdha wrote:
| > The rollback functionality of CF is a blessing
|
| When it works, which is a big caveat: we had far more cases
| where it failed in a way which required manual remediation
| and the gaps in validation meant that you'd be in an "apply /
| error / rollback" loop requiring 20+ minutes before you could
| try again. Terraform was always considerably faster but it
| was especially the orders-of-magnitude improvement in retry
| time which convinced most of us to switch.
|
| The CloudFormation team has been working on this so it's
| possible that experience has improved but the scar tissue
| will take time to fade.
| nickjj wrote:
| Rollback doesn't always work with CF. I've noticed so many
| times that it would mostly delete everything but not certain
| things once in a while. Then you're left having to play
| detective to manually figure out what you need to delete
| while having to delete dependencies by hand in a specific
| order.
|
| I've spent hours just waiting for CF to fail deleting EKS or
| RDS-related resources, then I end up getting billed for $30+ a
| month sometimes because I forgot to manually delete a NAT
| gateway.
| vageli wrote:
| > I vividly recall multiple times where my connection had cut
| out during "terraform apply"
|
| The issue could be at least partially resolved by using
| automation (like Atlantis for example) to apply your plans.
| l0b0 wrote:
| Unless things have changed in the meantime, the killer feature of
| CloudFormation for me is that I don't have to keep track of the
| state locally. Having to set up tracking of the infra state in
| Terraform is a huge pain, since it should be stored independently
| of both the infra code (to allow deploying anything but HEAD) and
| the infra itself (duh).
| As long as Terraform doesn't query the
| existing infra to work out what needs doing I don't want to go
| back to it.
| Pensacola wrote:
| While the read was interesting and informative, something about
| the tone made me search for a disclaimer/disclosure of interest.
| Are you an "influencer?"
| draklor40 wrote:
| CloudFormation, with its HORRIBLE YAML templating (whatever
| DSL/language) and arcane error messages, is a horror story. I
| hate it so much that I'd rather quit my job than debug why
| CloudFormation decided for no reason to update my RDS instance
| for a PR that was just a README file update.
| emmanueloga_ wrote:
| How about Pulumi? [1] Seems compatible with CF and supports
| TypeScript as a configuration language. Any fans?
|
| 1: https://www.pulumi.com/docs/guides/adopting/from_aws/
| orf wrote:
| I spent a bit of time trying to deploy a lambda app with
| Cloudformation. I wanted to use a relational database, so I
| needed to handle migrations.
|
| Ok, so apparently I need to write a custom Cloudformation
| resource to execute a lambda function that will run the
| migrations prior to deploying the new version of the lambda. Kind
| of neat that you can do that.
|
| Except I messed up the output of the custom resource lambda and
| Cloudformation completely locked my deployment up for _3 hours_.
| 3 hours. I couldn't do _anything_ - rollback, update, whatever.
|
| Cloudformation via CDK is interesting, and I don't hate it, but
| oh boy if it gets into a weird state it can completely kill your
| iteration loop. And the docs say something along the lines of "if
| it's stuck for too long contact support". No thanks.
| zapt02 wrote:
| CF does have a lot of quirks (especially stacks locking up for
| various reasons, or rollbacks taking hours).
|
| I find it easiest to run migrations when an application is
| first starting up (with an appropriate transaction lock so
| other instances won't cause the migration to run more than
| once). This way you don't have to do a lot of devops magic for
| it to work.
| singlewind wrote:
| To be honest, I don't agree with this. Managing infrastructure
| needs evidence and a trace of how it was created. I've been in
| this situation a few times: I have been through projects where
| the Terraform code doesn't match the AWS infrastructure, and we
| don't know when and how the drift happened. At least
| CloudFormation has some features to detect the difference and
| help me trace back which commit has actually been deployed. CDK
| makes the job easier for developers because it delivers some
| convenience and offers more patterns for writing code. I like
| both.
| Hikikomori wrote:
| Vanilla cloudformation is bad, but so is terraform (for my use
| case anyway). We wrap our cloudformation with python; you need
| something similar for terraform to make it less terrible (cdktf,
| terragrunt, terrascript).
| flurie wrote:
| One of the most amazing things I saw at AWS re:Invent was an
| advanced talk on IaC that provided the code of a lambda function
| inline in a CloudFormation template. I realize that this is just
| one talk, and there are plenty of ways to structure things well,
| but this practice is directly encouraged by the design of
| CloudFormation[1]. AWS has attempted redefining the lambda
| deployment story multiple times, there are multiple companies
| whose primary offering is providing a better way to deploy code
| to serverless offerings, but this still stands out to me as one
| of the most terrible ways to do things, and I blame the design of
| CloudFormation.
|
| [1]
| https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGui...
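For readers who have not seen the practice flurie describes, here is a minimal sketch of it, not code from the talk. CloudFormation's `AWS::Lambda::Function` resource accepts source text directly through the `ZipFile` key of its `Code` property, and places that text in a file named `index`. The snippet builds such a template as a plain Python dict and prints it as JSON. The logical ID `InlineFunction`, the `ExecutionRoleArn` parameter, the handler body, and the default runtime string are all assumptions made up for illustration.

```python
import json

# Hypothetical handler source. CloudFormation writes ZipFile content to a
# file named "index", which is why the entry point below is "index.handler".
HANDLER_SOURCE = """\
def handler(event, context):
    return {"statusCode": 200, "body": "hello from an inline function"}
"""


def build_template(runtime: str = "python3.9") -> dict:
    """Return a CloudFormation template dict whose Lambda code is inline."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Parameters": {
            # Assumed parameter: an existing IAM role the function can use.
            "ExecutionRoleArn": {"Type": "String"},
        },
        "Resources": {
            "InlineFunction": {
                "Type": "AWS::Lambda::Function",
                "Properties": {
                    "Runtime": runtime,
                    "Handler": "index.handler",
                    "Role": {"Ref": "ExecutionRoleArn"},
                    # The application code lives inside the template itself.
                    "Code": {"ZipFile": HANDLER_SOURCE},
                },
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_template(), indent=2))
```

The point of the sketch is the shape of the problem: the function body is an opaque string inside the infrastructure definition, so it gets no linting, no tests, and no packaging of its own unless you build that around the template.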
| xyzzy123 wrote:
| I'm going off track here but Pulumi have a totally mind-bending
| feature where you can write the code of a lambda function not
| only inline, but such that it captures the value of variables
| from the surrounding infra code at the time the function is
| serialized.
|
| See:
| https://www.pulumi.com/docs/intro/concepts/function-serializ...
|
| Seeing the specific examples they use it for (AWS infra glue)
| makes me think that there is room for infrastructure related
| lambdas to be defined right in cfn or infra code, with very low
| ceremony, even if you wouldn't want to deploy "applications"
| like that.
| nzoschke wrote:
| Counterpoint... Use CloudFormation!
|
| Managed services offer big benefits over software. With CF, new
| stacks, change sets, updates, rollbacks and drift detection are
| an API call away.
|
| Managed service providers offer big benefits over software. With
| CF and AWS support, help with problems are a support ticket away.
|
| Using a single cloud provider has a big benefit over multi-cloud
| tooling. I only run workloads on AWS, so the CF syntax, specs
| and docs unlocks endless first party features. A portable
| Terraform + Kubernetes contraption is a lowest common denominator
| approach.
|
| Of course everything depends.
|
| I've configured literally 1000s of systems with CloudFormation
| with very few problems.
|
| I have seen Terraform turn into a tire-fire of migrations from
| state files to Terraform Enterprise to Atlantis that took an
| entire DevOps team to care for.
| acdha wrote:
| > Managed services offer big benefits over software. With CF,
| new stacks, change sets, updates, rollbacks and drift detection
| are an API call away.
| >
| > Managed service providers offer big benefits over software.
| With CF and AWS support, help with problems are a support
| ticket away.
|
| The problem is when those help tickets get responses like "try
| deleting everything by hand and see if it recreates without an
| error next time". They've worked on CloudFormation over the
| last year or so, but everyone I've known who's switched to tools
| like Terraform did so after getting tired of unpredictable
| deployment times or hitting the many cases where CloudFormation
| gets itself into an irrecoverable state. I can count on no
| fingers the number of development teams who used CF and didn't
| ask for help recovering from an error state in CF which
| required out-of-band remediation.
|
| I believe they've also gotten better at tracking new AWS
| features but there were multiple cases where using Terraform
| got you the ability to use a feature 6+ months ahead of CF.
|
| > A portable Terraform + Kubernetes contraption is a lowest
| common denominator approach.
|
| Terraform is much, much richer than CloudFormation so I'd
| compare it to CDK (with the usual aesthetic debate over
| declarative vs. procedural models) and it doesn't really make
| sense to call it LCD in the same way that you might use that to
| describe Kubernetes because it's not trying to build an
| abstraction which covers up the underlying platform details.
| Most of the Terraform I've written controls AWS but there's a
| significant value to also being able to use the same tool to
| control GCP, GitLab, Cloudflare, Docker, various enterprise
| tools, etc. with full access to native functionality.
| dolni wrote:
| > I've configured literally 1000s of systems with
| CloudFormation with very few problems.
|
| This is a great way of saying "I've never used CloudFormation"
| without stating it directly.
| void_mint wrote:
| > Managed services offer big benefits over software.
|
| TF can be used as a managed service.
|
| > Managed service providers offer big benefits over software.
| With CF and AWS support, help with problems are a support
| ticket away.
|
| The same is true with TF, except 100000% better unless you're
| paying boatloads of money for higher tiered support.
|
| > I only run workloads on AWS, so the CF syntax, specs and docs
| unlocks endless first party features.
|
| CF syntax is an abomination. Lots of the bounds of CF are
| dogmatic and unhelpful.
|
| > I have seen Terraform turn into a tire-fire of migrations
| from state files to Terraform enterprise to Atlantis that took
| an entire DevOps team to care for.
|
| CF generally takes an entire DevOps team to care for, for any
| substantial project.
| ldoughty wrote:
| Agree. CF is not a magic bullet, but neither is ansible or
| terraform.
|
| We used ansible heavily with AWS for 2 years. Then we decided
| to gut it out and do CF directly. Why? If we want to switch
| clouds, it's not like the ansible or terraform modules are
| transferable ... So might as well go the native supported
| route.
|
| I agree with the article, messages can be cryptic, but at the
| end of the day, I have a CF stack that represents an entity. I
| can blow away the stack, and if there's any failure or issue, I
| can escalate my permissions and kill it again. Still a problem?
| Then it's AWS's fault and a ticket away (though I've only had
| to do this once in 5 years and > 150,000 CF stacks).
|
| I also would argue, if a stack deletion stalls development, you
| are probably using hard-coded stack names, which isn't wise.
| Throw in a "random" value like a commit or pipeline identifier.
|
| I've had far fewer issues with CF than terraform or ansible.
| I have yet to see CF break backward compatibility, while I had a
| nightmare day when I couldn't run any playbooks in ansible
| because the module had a new required parameter on a minor or
| patch version bump (which was when I called it quits on
| ansible; I then relooked at terraform and decided to go
| native).
|
| I will caveat that our use case for AWS involves LOTS of
| creation and deletion, so I find it super helpful to manage my
| infrastructure in "stacks" that are created and deleted as a
| unit.. I don't need to worry about partial creations or
| deletions.. like ever... It basically never fails redoing
| known-working stuff... Only "first time" and usually because we
| follow least-privilege heavily.
| HatchedLake721 wrote:
| I'm confused. Aren't Ansible and CloudFormation like apples
| and oranges, with completely different use cases and purposes?
|
| One is a configuration management and deployment tool.
|
| The other one is a cloud resource provisioning service.
|
| They're meant to work in tandem, not one to replace another.
| mooreds wrote:
| I think Ansible has extensions which allow for managing
| infra such as AWS. See
| https://docs.ansible.com/ansible/latest/collections/amazon/a...
| for example.
| booleanbetrayal wrote:
| Yeah, importing existing resources into Cloudformation is a
| nightmare of "Am I going to break everything? _Fingers
| Crossed_".
|
| It is also very possible to get into very bad situations if your
| settings drift and you attempt to reconcile those changes.
| easton wrote:
| Something funny (well, kind of sad) about CloudFormation I
| noticed this summer was that if you deploy a CloudFormation stack
| which updates an ECS service and deploys tasks which then fail
| health checks, CloudFormation will do nothing about this and just
| let ECS keep killing and restarting tasks for.. well, at least
| several hours.
| You have to know to go into ECS and drain the
| tasks manually and then initiate a rollback from CF to get your
| service back into a good state. The bug reports about this I
| found were going back years.
|
| The upside is that I got really well acquainted with how ECS
| worked.
| fictionfuture wrote:
| I had this same bug!! Cost us like $1000 before we fixed it.
| tkahnoski wrote:
| 100x this. Prior company committed to doing Infrastructure as
| Code and CloudFormation worked well except for this hiccup. We
| didn't even have that many services on ECS but we probably had
| 1 ticket a week asking support to help us with a 'stuck' stack.
|
| We doubled down on our commitment to CloudFormation because we
| could do containers, Lambda, and 95% of any other AWS
| services....
|
| However, in hindsight using SAM and the ECS CLI probably would
| have resulted in a more predictable CI/CD process as we weren't
| fighting deploy semantics through CloudFormation abstraction.
| cmaggiulli wrote:
| Writing Terraform scripts for AWS is 70% of my job. I do have
| some issues with the AWS provider in Terraform. Firstly, there
| are bugs. I ran into a bug a few days ago where the ARN attribute
| on a Lambda alias was resolving to the ARN of the Lambda, not its
| alias. I only figured it out because I found a GitHub Issue.
| Additionally, Hashicorp is often playing catch-up with Amazon. A
| few days ago AWS released a new instruction set architecture for
| Lambdas that would save my org a lot of money. However, after I
| saw the announcement from AWS I see tons of different GitHub
| issues created to add this functionality. So I start editing my
| files based off the documentation only for that issue to be
| closed and pointed to a new one with different syntax. So I start
| working off the new syntax only for that issue to close and be
| pointed to a different one.
| robohoe wrote:
| That's right, don't use CloudFormation.
| Use CDK which will
| generate and obfuscate CF for you and you won't have to worry
| about it.
| Arelius wrote:
| I'm not sure I understand... Is obfuscating the CF a good
| thing?
| yjftsjthsd-h wrote:
| I'm pretty sure that was sarcasm. I disagree with said
| sarcasm, because CDK takes you one layer away from the actual
| thing that gets run but gives you a much nicer thing to work
| with so it can still be a good trade off; writing rust (or
| whatever) "obfuscates" the underlying CPU instructions but it
| still turns out to be a good idea.
| robohoe wrote:
| I will admit that troubleshooting permissions-related deployment
| issues in StackSets is a super nightmare-inducing event.
___________________________________________________________________
(page generated 2021-10-06 23:00 UTC)