[HN Gopher] AWS Lambda Behind the Scenes
       ___________________________________________________________________
        
       AWS Lambda Behind the Scenes
        
       Author : garblegarble
       Score  : 289 points
       Date   : 2021-07-10 12:33 UTC (10 hours ago)
        
 (HTM) web link (www.bschaatsbergen.com)
 (TXT) w3m dump (www.bschaatsbergen.com)
        
       | abarrak wrote:
       | A couple of days ago, I tried to search for how AWS operates RDS
       | behind the scenes. Since it is a managed stateful service, I was
       | wondering whether it runs in a traditional VM-based way or in a
       | fully containerized environment. Unfortunately, a simple search
       | only leads you to the customer-facing resources out there.
        
         | rorykoehler wrote:
         | Based on how they bill it, it looks like it's running on VMs
        
           | StratusBen wrote:
           | Agreed. AWS RDS instance types are just EC2 instance types
           | prefixed with "db." and you're choosing either single-AZ or
           | multi-AZ deployments so presumably AWS is just spinning up 1
           | to 3 EC2 instances with some preconfigured software on them.
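
The naming observation above can be sketched as a simple prefix strip. This is only an illustration of the comment's point: the function name `ec2_type_for` is hypothetical, and it is an assumption that a given `db.` class has an identically named EC2 instance type.

```python
def ec2_type_for(db_instance_class: str) -> str:
    """Strip the 'db.' prefix from an RDS instance class name.

    Assumption: the remainder names an EC2 instance type, as the
    comment suggests. This does not query AWS to confirm it.
    """
    prefix = "db."
    if not db_instance_class.startswith(prefix):
        raise ValueError(f"not an RDS instance class: {db_instance_class!r}")
    return db_instance_class[len(prefix):]


print(ec2_type_for("db.m5.large"))
```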
        
             | navaati wrote:
             | From what I know there _is_ a secret sauce beyond a mere
             | AMI and a control plane, based on some EBS volumes magic. I
             | may be mixing things up with Aurora though.
        
               | fierro wrote:
               | this is very, very safe to assume. There is probably a
               | hundred engineers' worth of "secret sauce" for an entire
               | managed DB product line.
        
               | aeyes wrote:
               | There were some comments in the early days that the
               | Multi-AZ magic for classic RDS was just drbd on top of
               | EBS.
               | 
               | Aurora is a completely different approach where the RDBMS
               | code is modified to directly interface with EBS instead
               | of going through a traditional OS filesystem layer.
        
               | rrdharan wrote:
               | Amazon published a paper describing how Aurora works:
               | 
               | https://www.allthingsdistributed.com/files/p1041-verbitski.p...
        
         | ec109685 wrote:
         | This is a good paper that talks about Aurora and provides some
         | insight into how RDS operates:
         | https://www.allthingsdistributed.com/files/p1041-verbitski.p...
         | 
         | It's nice that AWS builds their own higher level abstractions
         | on the same primitives outside developers use. Feels like they
         | eat their own dogfood much more than Google where they bypass
         | GCP and instead utilize underlying Borg primitives for many
         | services.
        
           | dilyevsky wrote:
           | Which gcp services run directly on borg? My understanding is
           | at least bigtable, cloud sql and other dbs are within
           | "hidden" VMs. I think loadbalancers and storage are
           | exceptions but same is true for aws (except the classic elb
           | probably)
        
             | rrdharan wrote:
             | Bigtable, Firestore and Spanner run directly on Borg.
             | 
             | Cloud SQL V2 runs in hidden VMs.
        
               | orf wrote:
               | Are those not internal services that pre-date GCP that
               | are exposed externally _through_ GCP?
        
           | abarrak wrote:
           | Nice. I wonder if the stateful merits provided and marketed
           | by container orchestrators (e.g. K8s) are something they will
           | consider in the future?
        
             | rejectedandsad wrote:
             | To build a new service at Amazon, the general path of least
             | resistance these days is to use Lambda. If not Lambda, then
             | ECS. If not ECS (or if it requires bare metal) then EC2.
        
           | rejectedandsad wrote:
           | This is a new thing really - it used to be that you'd use a
           | different system that's in many ways better integrated with
           | how the rest of development works but far worse in terms of
           | UX and capacity planning, etc. Now many of the tools are
           | basically a Frankenstein transformation from the old way into
           | the Amazon-specific way AWS is used via the multi-account
           | pattern.
        
       | tnolet wrote:
       | Great write up. Besides the technical parts, AWS Lambda probably
       | created a ton of new businesses/ startups that otherwise would
       | have been hard or at least expensive to get going.
        
       | personlurking wrote:
       | Aside: Lambda, out of Peru, is apparently the next virus variant
       | we need to worry about.
       | 
       | Edit: I had never seen the word before then I saw it used twice
       | in two days, in different contexts. Just thought it was odd.
        
         | adflux wrote:
         | Can we agree to leave COVID out of this discussion?
        
           | [deleted]
        
         | minitoar wrote:
         | It's not that odd if you consider it is the 11th letter of the
         | Greek alphabet.
        
         | Exmoor wrote:
         | https://en.wikipedia.org/wiki/Frequency_illusion
        
       | daxfohl wrote:
       | One other thing I learned here is that lambda@edge is not
       | actually run on the edge at all. It is forwarded to the nearest
       | datacenter to execute. Not enough capacity in edges to spin up
       | entire VMs for everything, even with Firecracker.
        
       | chews wrote:
       | nice writeup for how the magic really works. lambdas rock!
        
       | Something1234 wrote:
       | Fantastic paper. So I've been playing with the java and python
       | runtimes and it's absolutely stunning how much better python is
       | on execution and start up time.
       | 
       | Also, how does an event actually get to the lambda handler?
       | Because they can come from all kinds of sources.
        
         | mdaniel wrote:
         | I believe they fire up an HTTP server, based on how their local
         | executor behaves, and then do "servlet-y" (or WSGI-y) dispatch
         | into the entry point method.
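
A rough sketch of the dispatch described above. In the publicly documented Runtime API for custom runtimes, the direction is the reverse of a server: the runtime polls `GET /2018-06-01/runtime/invocation/next` on the host named in `AWS_LAMBDA_RUNTIME_API`, calls the handler, and POSTs the result to `/2018-06-01/runtime/invocation/{request_id}/response`. Whether the managed Java and Python runtimes work exactly this way internally is an assumption. The names `run_once`, `pending`, and `responses` are hypothetical, and the two HTTP calls are replaced by in-memory stand-ins so the example runs anywhere.

```python
import json
from collections import deque
from typing import Any, Callable, Deque, Dict, List, Tuple

Event = Dict[str, Any]


def handler(event: Event, context: Dict[str, str]) -> Event:
    """Example entry point: echoes the event back with its request id."""
    return {"received": event, "request_id": context["aws_request_id"]}


def run_once(
    next_invocation: Callable[[], Tuple[str, Event]],
    post_response: Callable[[str, str], None],
    entry_point: Callable[[Event, Dict[str, str]], Any],
) -> None:
    """Fetch one event, dispatch it to the entry point, report the result."""
    request_id, event = next_invocation()
    result = entry_point(event, {"aws_request_id": request_id})
    post_response(request_id, json.dumps(result))


# In-memory stand-ins (hypothetical) for the two HTTP calls a real runtime makes.
pending: Deque[Tuple[str, Event]] = deque([("req-1", {"source": "s3"})])
responses: List[Tuple[str, str]] = []

run_once(pending.popleft, lambda rid, body: responses.append((rid, body)), handler)
```

Because the event arrives as a JSON document regardless of which service produced it, the same loop covers every event source; only the payload shape differs.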
        
       | mlerner wrote:
       | If you're interested in Firecracker, I wrote a summary of the
       | original paper here:
       | https://www.micahlerner.com/2021/06/17/firecracker-lightweig...
        
         | daxfohl wrote:
         | Any idea how much it has diverged from crosvm?
        
           | mjb wrote:
           | Quite a lot. Initially, a lot of the changes were removing
           | things from crosvm, but adding features like snapshots, and
           | factoring things out into RustVMM, has made them diverge a
           | lot more.
           | 
           | There's some data in the paper about how similar they were
           | then, too.
        
         | bschaatsbergen wrote:
         | Great article @mlerner
        
       | dr_kretyn wrote:
       | Is this write up correct? How do they know that? I don't see any
       | references on info source except a talk at re:invent.
        
         | bschaatsbergen wrote:
         | Both the talks Marc Brooker (lead developer on the AWS Lambda
         | team) gave at re:Invent, which I mention in the footnotes, and
         | the official documentation that's out there will provide you
         | with a lot of information.
        
         | garblegarble wrote:
         | There's a decent references section at the bottom, and having
         | watched the talks and briefly scanning the Firecracker paper
         | referenced, they do back up the writer.
        
           | dr_kretyn wrote:
           | That's a footnote section and to me the listing is only
           | partially related to the text, i.e. the write up contains a
           | lot more detail on multiple components.
           | 
           | Thanks though for backing up this write up. That's +1 for
           | confidence.
        
         | dmarinus wrote:
         | When I was at re:invent 2019 I joined some chalk talks which
         | weren't recorded (or not published). Some of the hosts shared a
         | lot of details about their internal infrastructure.
        
       | chrisweekly wrote:
       | This is great! Awesome writeup with the kind of details that are
       | sometimes opaque and hard to find documentation for. I recently
       | deployed a NextJS app using Serverless framework (and
       | serverless-nextjs), so Lambda@Edge... looking fwd to playing more
       | with compute at the CDN edge in general (e.g. fly.io). Amazing how
       | easy it is, esp. as someone who came into webdev in 1998.
        
         | emteycz wrote:
         | Considering your long experience, didn't you feel like we lost
         | a lot post-PHP? I also stepped out of the PHP world into JS,
         | and never understood why there isn't any apache2-modnodejs...
         | And to me, the serverless JS movement seems to be just that,
         | but with a lot of unnecessary baggage.
        
           | ec109685 wrote:
           | Lambda gets you back to the one request per process model
           | that made php so easy to reason about and performance flat.
           | With normally deployed JavaScript and single process
           | concurrency, callbacks could all complete at the same time and
           | all block waiting to get CPU time to complete the request.
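
The contrast above can be illustrated with a toy timing model. It is a sketch under simplifying assumptions, not a measurement: every request needs the same amount of CPU work, all callbacks become ready at t=0, and in the isolated case each process gets its own CPU share. Both function names are hypothetical.

```python
from typing import List


def shared_loop_finish_times(n_requests: int, cpu_ms: int) -> List[int]:
    """One process, one thread: ready callbacks run one after another,
    so the i-th request only finishes after all earlier ones are done."""
    return [cpu_ms * (i + 1) for i in range(n_requests)]


def isolated_finish_times(n_requests: int, cpu_ms: int) -> List[int]:
    """One request per process: by assumption nothing queues behind
    another request, so every request finishes after its own CPU time."""
    return [cpu_ms for _ in range(n_requests)]


print(shared_loop_finish_times(3, 10))
print(isolated_finish_times(3, 10))
```

In the shared model latency grows with the number of simultaneous requests, while in the isolated model it stays flat, which is the "performance flat" property the comment attributes to PHP and Lambda.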
        
           | nuclearnice3 wrote:
           | We surely picked up some baggage. Much of it vendor specific.
           | But we also jettisoned some? You stick a lambda into API
           | gateway and you're on the internet. No servers. No linux
           | setup. No apache conf.
           | 
           | I'd encourage you to dive in for 20 hours in pure curiosity
           | mode and see what you find.
        
             | throwaway3699 wrote:
             | Lambda is a proprietary solution that only works for people
             | on AWS. Linux is open, and I just need to put it on a box.
             | How do I install the AWS Lambda stack on a standard Linux
             | box?
        
               | nuclearnice3 wrote:
               | You can't.
               | 
               | Are you actually asking this question?
               | 
               | Or are you pretending to ask a question because you think
               | the fact that AWS Lambda runs on AWS is some huge gotcha
               | that I never imagined and no one would ever tolerate?
               | 
               | I explicitly note vendor-specific baggage. AWS revenue is
               | over 45 billion annually and half of customers use
               | lambda.
        
               | throwaway3699 wrote:
               | My point is that the industry really hasn't moved on from
               | the old LAMP stack if it's been replaced by a single
               | company. When it truly comes down to it, the day-to-day
               | tools are not ours if they aren't open.
               | 
               | And if deploying a lambda function on my own hardware is
               | vastly more complex, then the tools haven't really
               | changed, they just got outsourced.
               | 
               | There are a bunch of semi-standards like Serverless
               | Framework and Knative, but nothing concrete.
        
               | nuclearnice3 wrote:
               | I agree the tools got outsourced.
               | 
               | I also agree there are big chunks of LAMP under the
               | covers of running an AWS Lambda. So, in that sense, we
               | haven't "moved on" from the old LAMP stack.
               | 
               | I also agree the tools are "not ours" if they aren't
               | open. They do useful things. It's a tradeoff.
        
               | beckingz wrote:
               | Openstack?
               | 
               | https://www.openstack.org/
        
             | js4ever wrote:
             | It's great for a lot of use cases ... But unfortunately
             | there are several important limits that prevent using this
             | combo as a silver bullet.
             | 
             | API Gateway is limited to 29 sec of execution. If you need
             | anything longer, you will need an EC2 instance (or ECS or
             | Fargate) to act as a webserver and call the Lambda (up to
             | 15 min). CloudFront is also not an option for this common
             | use case because it's limited to 180 sec.
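
The limits listed above can be turned into a small lookup. This is a sketch only: the three numbers are taken from the comment as stated and may not match current AWS quotas, and the function name and the front-end labels are hypothetical.

```python
from typing import List

# Limits as quoted in the comment above (assumptions, not verified quotas).
API_GATEWAY_LIMIT_S = 29
CLOUDFRONT_LIMIT_S = 180
LAMBDA_LIMIT_S = 15 * 60


def front_ends_for(duration_s: int) -> List[str]:
    """Return the front ends that could wait for a Lambda call of this length."""
    if duration_s > LAMBDA_LIMIT_S:
        return []  # Lambda itself would time out first
    options = []
    if duration_s <= API_GATEWAY_LIMIT_S:
        options.append("api-gateway")
    if duration_s <= CLOUDFRONT_LIMIT_S:
        options.append("cloudfront")
    # A self-managed webserver (EC2, ECS or Fargate) has no front-end cap here.
    options.append("own-webserver")
    return options


print(front_ends_for(60))
```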
        
       | bschaatsbergen wrote:
       | Good to know you enjoyed the read!
        
       | carlosf wrote:
       | Really cool post!
       | 
       | From the architecture, it's not really clear to me why Lambdas
       | have the 15 min limitation. It seems to me AWS could use the same
       | infrastructure to make a product that competes with Google Cloud
       | Run. Maybe it's a business thing?
        
         | cloakandswagger wrote:
         | I can't think of any reason outside of product positioning.
         | 
         | A lot of the novelty of Lambda is its identity as a function:
         | small units of execution run on-demand. A Lambda that can run
         | perpetually is made redundant by EC2, and the opinionated time
         | limit informs a lot of design.
        
           | ignoramous wrote:
           | It may be product positioning, but Lambda really stems from
           | AWS's desire to do something about the dismal utilisation
           | ratio of their most expensive bill item: Servers [0].
           | 
           | I speculate that 1 min or 15 min limits are optimal for
           | scheduling and running uncorrelated workloads. Any longer,
           | and returns may diminish?
           | 
           | [0] https://youtu.be/dInADzgCI-s?t=524 (James Hamilton, 2013)
        
           | mdaniel wrote:
           | > A Lambda that can run perpetually is made redundant by EC2
           | 
           | Is only conceptually true outside of "EC2 Classic", because
           | (to the best of my knowledge) every other EC2 launches into a
           | VPC, even if it's the default one for the account per region,
           | and even then into the default security group (and one must
           | specify the IDs). That may sound like "yeah, yeah" but is a
           | level of moving parts that Lambda doesn't require a consumer
           | to dive into unless they want to control its networking
           | settings
           | 
           | I would think removing the time limit on Lambda would be like
           | printing money, since I bet the per-second price for Lambda is
           | greater than EC2's.
        
         | kolanos wrote:
         | This service exists, it's called AWS Fargate [0].
         | 
         | [0]: https://read.iopipe.com/how-far-out-is-aws-fargate-a2409d2f9...
        
           | slumdev wrote:
           | This isn't true.
           | 
           | Fargate scales in minutes, not seconds. And it never scales
           | to zero.
        
           | simonw wrote:
           | Fargate isn't a competitor to Cloud Run (I wish it was)
           | because it doesn't scale to zero in between requests and
           | scale back up again when new traffic arrives.
        
           | carlosf wrote:
           | Oof
           | 
           | Makes sense!
           | 
           | I wish Fargate was easier to use and had a scale to 0
           | feature.
           | 
           | If App Runner ends up supporting private deployments then we
           | can have a true Cloud Run competitor.
        
             | kolanos wrote:
             | > I wish Fargate was easier to use and had a scale to 0
             | feature.
             | 
             | Fargate can be scaled to zero. Also, have you tried the
             | CLI? [0]
             | 
             | [0]: https://github.com/aws/copilot-cli
        
               | simonw wrote:
               | When I say "scale to zero" I mean like Cloud Run or AWS
               | Lambda: I define it as the service automatically scaling
               | to zero (and hence costing nothing to run) in between
               | requests, but automatically starting up again when a new
               | request comes in - so the request still gets served, it
               | just suffers from a few seconds of cold-start time.
               | 
               | I'm pretty sure Fargate doesn't offer this. It sounds
               | like you're talking about the ability to manually (or
               | automatically through scripting) turn off your Fargate
               | containers, then manually turn them back on again - but
               | not in a way that an incoming request still gets served
               | even though the container wasn't running when the request
               | first arrived.
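
The definition above can be sketched as a tiny simulation. It models the behaviour described in the comment, not how Cloud Run or Lambda are implemented: the class name, the idle timeout, and the cold-start figure are all hypothetical, and time is passed in explicitly so the example is deterministic.

```python
from typing import Optional


class ScaleToZeroService:
    """Toy model: the instance stops after an idle period and is started
    again automatically by the next request, which pays a cold start."""

    def __init__(self, idle_timeout_s: float, cold_start_s: float) -> None:
        self.idle_timeout_s = idle_timeout_s
        self.cold_start_s = cold_start_s
        # None means no instance has been started yet (scaled to zero).
        self.last_request_at: Optional[float] = None

    def is_running(self, now: float) -> bool:
        return (
            self.last_request_at is not None
            and now - self.last_request_at < self.idle_timeout_s
        )

    def handle(self, now: float) -> float:
        """Serve a request arriving at `now`; return the added latency."""
        latency = 0.0 if self.is_running(now) else self.cold_start_s
        self.last_request_at = now + latency
        return latency


svc = ScaleToZeroService(idle_timeout_s=60.0, cold_start_s=2.0)
latencies = [svc.handle(t) for t in (0.0, 10.0, 100.0)]
print(latencies)
```

The key property is that `handle` never rejects a request: a request that arrives while nothing is running is still served, just later. Manually stopping and restarting containers does not provide that on its own.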
        
       | simonw wrote:
       | This is a great article - I really appreciate when people take
       | the time to assemble details from a bunch of different sources
       | (Firecracker paper, re:Invent talks) and turn them into a useful
       | overview like this.
       | 
       | Clearly Bruno got a lot of the details right, Jeff Barr tweeted a
       | link to this a few weeks ago:
       | https://twitter.com/jeffbarr/status/1404512248152825857
        
       ___________________________________________________________________
       (page generated 2021-07-10 23:00 UTC)