[HN Gopher] AWS Lambda Behind the Scenes
___________________________________________________________________
AWS Lambda Behind the Scenes
Author : garblegarble
Score  : 289 points
Date   : 2021-07-10 12:33 UTC (10 hours ago)
(HTM) web link (www.bschaatsbergen.com)
(TXT) w3m dump (www.bschaatsbergen.com)
| abarrak wrote:
| A couple of days ago, I tried to find out how AWS operates RDS
| behind the scenes. Since it is a managed stateful service, I was
| wondering whether it runs in a traditional VM-based way or in a
| fully containerized environment. Unfortunately, a simple
| search only leads you to the consumer/customer resources out
| there.
| rorykoehler wrote:
| Based on how they bill it, it looks like it's running on VMs
| StratusBen wrote:
| Agreed. AWS RDS instance types are just EC2 instance types
| prefixed with "db." and you're choosing either single-AZ or
| multi-AZ deployments so presumably AWS is just spinning up 1
| to 3 EC2 instances with some preconfigured software on them.
| navaati wrote:
| From what I know there _is_ a secret sauce beyond a mere
| AMI and a control plane, based on some EBS volumes magic. I
| may be mixing things up with Aurora though.
| fierro wrote:
| this is very, very safe to assume. There is probably a
| hundred engineers' worth of "secret sauce" for an entire
| managed DB product line.
| aeyes wrote:
| There were some comments in the early days that the
| Multi-AZ magic for classic RDS was just drbd on top of
| EBS.
|
| Aurora is a completely different approach where the RDBMS
| code is modified to directly interface with EBS instead
| of going through a traditional OS filesystem layer.
| rrdharan wrote:
| Amazon published a paper describing how Aurora works:
|
| https://www.allthingsdistributed.com/files/p1041-verbitski.p...
| ec109685 wrote:
| This is a good paper that talks about Aurora and provides some
| insight into how RDS operates:
| https://www.allthingsdistributed.com/files/p1041-verbitski.p...
|
| It's nice that AWS builds their own higher level abstractions
| on the same primitives outside developers use. Feels like they
| eat their own dogfood much more than Google where they bypass
| GCP and instead utilize underlying Borg primitives for many
| services.
| dilyevsky wrote:
| Which gcp services run directly on borg? My understanding is
| at least bigtable, cloud sql and other dbs are within
| "hidden" VMs. I think loadbalancers and storage are
| exceptions but same is true for aws (except the classic elb
| probably)
| rrdharan wrote:
| Bigtable, Firestore and Spanner run directly on Borg.
|
| Cloud SQL V2 runs in hidden VMs.
| orf wrote:
| Are those not internal services that pre-date GCP that
| are exposed externally _through_ GCP?
| abarrak wrote:
| Nice. I wonder if the stateful merits provided and marketed
| by container orchestrators (e.g. K8S) are something they will
| consider in the future?
| rejectedandsad wrote:
| To build a new service at Amazon, the general path of least
| resistance these days is to use Lambda. If not Lambda, then
| ECS. If not ECS (or if it requires bare metal) then EC2.
| rejectedandsad wrote:
| This is a new thing really - it used to be that you'd use a
| different system that's in many ways better integrated with
| how the rest of development works but far worse in terms of
| UX and capacity planning, etc. Now many of the tools are
| basically a Frankenstein transformation from the old way into
| the Amazon-specific way AWS is used via the multi-account
| pattern.
| tnolet wrote:
| Great write up. Besides the technical parts, AWS Lambda probably
| created a ton of new businesses/startups that otherwise would
| have been hard or at least expensive to get going.
| personlurking wrote:
| Aside: Lambda, out of Peru, is apparently the next virus variant
| we need to worry about.
|
| Edit: I had never seen the word before then I saw it used twice
| in two days, in different contexts. Just thought it was odd.
| adflux wrote:
| Can we agree to leave COVID out of this discussion?
| [deleted]
| minitoar wrote:
| It's not that odd if you consider it is the 11th letter of the
| Greek alphabet.
| Exmoor wrote:
| https://en.wikipedia.org/wiki/Frequency_illusion
| daxfohl wrote:
| One other thing I learned here is that lambda@edge is not
| actually run on the edge at all. It is forwarded to the nearest
| datacenter to execute. Not enough capacity in edges to spin up
| entire VMs for everything, even with Firecracker.
| chews wrote:
| nice writeup for how the magic really works. lambdas rock!
| Something1234 wrote:
| Fantastic paper. So I've been playing with the java and python
| runtimes and it's absolutely stunning how much better python is
| on execution and start up time.
|
| Also how does an event actually get to the lambda handler?
| Because they can come from all kinds of sources.
| mdaniel wrote:
| I believe they fire up an http server, based on how their local
| executor behaves, and then do "servlet-y" (or WSGI-y) dispatch
| into the entry point method
| mlerner wrote:
| If you're interested in Firecracker, I wrote a summary of the
| original paper here:
| https://www.micahlerner.com/2021/06/17/firecracker-lightweig...
| daxfohl wrote:
| Any idea how much it has diverged from crosvm?
| mjb wrote:
| Quite a lot. Initially, a lot of the changes were removing
| things from crosvm, but adding features like snapshots, and
| factoring things out into RustVMM, has made them diverge a
| lot more.
|
| There's some data in the paper about how similar they were
| then, too.
| bschaatsbergen wrote:
| Great article @mlerner
| dr_kretyn wrote:
| Is this write up correct? How do they know that? I don't see any
| references on info source except a talk at re:invent.
| bschaatsbergen wrote:
| Both the re:Invent talks by Marc Brooker (lead developer on the
| AWS Lambda team), which I mentioned in the footnotes, and the
| official documentation that's out there will provide you with
| a lot of information.
| garblegarble wrote:
| There's a decent references section at the bottom, and having
| watched the talks and briefly scanned the Firecracker paper
| referenced, they do back up the writer.
| dr_kretyn wrote:
| That's a footnote section and to me the listing is only
| partially related to the text, i.e. the write up contains a
| lot more detail on multiple components.
|
| Thanks though for backing up this write up. That's +1 for
| confidence.
| dmarinus wrote:
| When I was at re:invent 2019 I joined some chalk talks which
| weren't recorded (or not published). Some of the hosts shared a
| lot of details about their internal infrastructure.
| chrisweekly wrote:
| This is great! Awesome writeup w the kind of details that are
| sometimes opaque and hard to find documentation for. I recently
| deployed a NextJS app using Serverless framework (and
| serverless-nextjs), so Lambda@Edge... looking fwd to playing
| more with compute at CDN edge in general (eg fly.io). Amazing
| how easy it is, esp. as someone who came into webdev in 1998.
| emteycz wrote:
| Considering your long experience, didn't you feel like we lost
| a lot post-PHP? I also stepped out of the PHP world into JS,
| and never understood why there isn't any apache2-modnodejs...
| And to me, the serverless JS movement seems to be just that,
| but with a lot of unnecessary baggage.
| ec109685 wrote:
| Lambda gets you back to the one request per process model
| that made php so easy to reason about and performance flat.
| With normally deployed JavaScript and single process
| concurrency, callbacks could all complete at the same time
| and all block waiting to get cpu time to complete the request.
| nuclearnice3 wrote:
| We surely picked up some baggage.
| Much of it vendor specific.
| But also we jettison some? You stick a lambda into API
| gateway and you're on the internet. No servers. No linux
| setup. No apache conf.
|
| I'd encourage you to dive in for 20 hours in pure curiosity
| mode and see what you find.
| throwaway3699 wrote:
| Lambda is a proprietary solution that only works for people
| on AWS. Linux is open, and I just need to put it on a box.
| How do I install the AWS Lambda stack on a standard Linux
| box?
| nuclearnice3 wrote:
| You can't.
|
| Are you actually asking this question?
|
| Or are you pretending to ask a question because you think
| the fact that AWS Lambda runs on AWS is some huge gotcha
| that I never imagined and no one would ever tolerate?
|
| I explicitly note vendor-specific baggage. AWS revenue is
| over 45 billion annually and half of customers use
| lambda.
| throwaway3699 wrote:
| My point is that the industry really hasn't moved on from
| the old LAMP stack if it's been replaced by a single
| company. When it truly comes down to it, the day-to-day
| tools are not ours if they aren't open.
|
| And if deploying a lambda function on my own hardware is
| vastly more complex, then the tools haven't really
| changed, they just got outsourced.
|
| There are a bunch of semi-standards like Serverless
| Framework and Knative, but nothing concrete.
| nuclearnice3 wrote:
| I agree the tools got outsourced.
|
| I also agree there are big chunks of LAMP under the
| covers of running an AWS Lambda. So, in that sense, we
| haven't "moved on" from the old LAMP stack.
|
| I also agree the tools are "not ours" if they aren't
| open. They do useful things. It's a tradeoff.
| beckingz wrote:
| Openstack?
|
| https://www.openstack.org/
| js4ever wrote:
| It's great for a lot of use cases ... But unfortunately
| there are several important points that prevent using this
| combo as a silver bullet.
|
| API gateway is limited to 29 sec of execution. If you need
| anything longer you will need an EC2 instance (or ECS or
| fargate) to act as a webserver and call the lambda (up to
| 15 min). Cloudfront is also not an option for this common
| use case because it's limited to 180 sec.
| bschaatsbergen wrote:
| Good to know you enjoyed the read!
| carlosf wrote:
| Really cool post!
|
| From the architecture, it's not really clear to me why Lambdas
| have the 15 min limitation. It seems to me AWS could use the same
| infrastructure to make a product that competes with Google Cloud
| Run. Maybe it's a business thing?
| cloakandswagger wrote:
| I can't think of any reason outside of product positioning.
|
| A lot of the novelty of Lambda is its identity as a function:
| small units of execution run on-demand. A Lambda that can run
| perpetually is made redundant by EC2, and the opinionated time
| limit informs a lot of design.
| ignoramous wrote:
| It may be product positioning, but Lambda really stems from
| AWS's desire to do something about the dismal utilisation
| ratio of their most expensive bill item: Servers [0].
|
| I speculate that 1min or 15min workloads are optimal for
| scheduling and running uncorrelated workloads. Any more, and
| it may diminish returns?
|
| [0] https://youtu.be/dInADzgCI-s?t=524 (James Hamilton, 2013)
| mdaniel wrote:
| > A Lambda that can run perpetually is made redundant by EC2
|
| Is only conceptually true outside of "EC2 Classic", because
| (to the best of my knowledge) every other EC2 launches into a
| VPC, even if it's the default one for the account per region,
| and even then into the default security group (and one must
| specify the IDs).
| That may sound like "yeah, yeah" but is a
| level of moving parts that Lambda doesn't require a consumer
| to dive into unless they want to control its networking
| settings
|
| I would think removing the time limit on Lambda would be like
| printing money since I bet the per-second price for Lambda is
| greater than EC2's
| kolanos wrote:
| This service exists, it's called AWS Fargate [0].
|
| [0]: https://read.iopipe.com/how-far-out-is-aws-fargate-a2409d2f9...
| slumdev wrote:
| This isn't true.
|
| Fargate scales in minutes, not seconds. And it never scales
| to zero.
| simonw wrote:
| Fargate isn't a competitor to Cloud Run (I wish it was)
| because it doesn't scale to zero in between requests and
| scale back up again when new traffic arrives.
| carlosf wrote:
| Oof
|
| Makes sense!
|
| I wish Fargate was easier to use and had a scale to 0
| feature.
|
| If App Runner ends up supporting private deployments then we
| can have a true Cloud Run competitor.
| kolanos wrote:
| > I wish Fargate was easier to use and had a scale to 0
| feature.
|
| Fargate can be scaled to zero. Also, have you tried the
| CLI? [0]
|
| [0]: https://github.com/aws/copilot-cli
| simonw wrote:
| When I say "scale to zero" I mean like Cloud Run or AWS
| Lambda: I define it as the service automatically scaling
| to zero (and hence costing nothing to run) in between
| requests, but automatically starting up again when a new
| request comes in - so the request still gets served, it
| just suffers from a few seconds of cold-start time.
|
| I'm pretty sure Fargate doesn't offer this. It sounds
| like you're talking about the ability to manually (or
| automatically through scripting) turn off your Fargate
| containers, then manually turn them back on again - but
| not in a way that an incoming request still gets served
| even though the container wasn't running when the request
| first arrived.
| simonw wrote:
| This is a great article - I really appreciate when people take
| the time to assemble details from a bunch of different sources
| (Firecracker paper, re:Invent talks) and turn them into a useful
| overview like this.
|
| Clearly Bruno got a lot of the details right, Jeff Barr tweeted a
| link to this a few weeks ago:
| https://twitter.com/jeffbarr/status/1404512248152825857
___________________________________________________________________
(page generated 2021-07-10 23:00 UTC)