[HN Gopher] Ask HN: Who operates at scale without containers?
       ___________________________________________________________________
        
       Ask HN: Who operates at scale without containers?
        
       In other words, who runs operations at a scale where distributed
       systems are absolutely necessary, without using any sort of
       container runtime or container orchestration tool?  If so, what
       does their technology stack look like? Are you aware of any good
       blog posts?  edit : While I do appreciate all the replies, I'd like
       to know if there are any organizations out there who operate at web
       scale without relying on the specific practice of shipping software
       with heaps of dependencies. Whether that be in a container or in a
       single-use VM. Thank you in advance and sorry for the confusion.
        
       Author : disintegore
       Score  : 350 points
       Date   : 2022-03-22 15:37 UTC (7 hours ago)
        
       | mumblemumble wrote:
       | I don't know that I'd say "web scale", in part because I still
       | don't think I know exactly what that means, but I used to work at
       | a place that handled a lot of data, in a distributed manner, in
       | an environment where reliability was critical, without
       | containers.
       | 
       | The gist of their approach was radical uniformity. For the most
       | part, all VMs ran identical images. Developers didn't get to pick
       | dependencies willy-nilly; we had to coordinate closely with ops.
       | (Tangentially, at subsequent employers I've been amazed to see
       | how just a few hours of developers handling things for themselves
       | can save many minutes of talking to ops.) All services and
       | applications were developed and packaged to be xcopy deployable,
       | and they all had to obey some company standards on how their CLI
       | worked, what signals they would respond to and how, stuff like
       | that. That standard interface allowed it all to be orchestrated
       | with a surprisingly small volume - in terms of SLOC, not
       | capability - of homegrown devops infrastructure.
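       | 
       | (A minimal sketch of the kind of signal/CLI contract I mean, in
       | Python; the specifics here are illustrative, not our actual
       | standard:)
       | 
       |   # Hypothetical service contract: exit cleanly on
       |   # SIGTERM, reload config on SIGHUP.
       |   import signal, sys, time
       | 
       |   RUNNING = True
       | 
       |   def handle_term(signum, frame):
       |       global RUNNING
       |       RUNNING = False  # finish in-flight work, then exit
       | 
       |   def handle_hup(signum, frame):
       |       print("reloading config")  # re-read config here
       | 
       |   signal.signal(signal.SIGTERM, handle_term)
       |   signal.signal(signal.SIGHUP, handle_hup)
       | 
       |   while RUNNING:
       |       time.sleep(1)  # real work goes here
       |   sys.exit(0)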
        
       | tombert wrote:
       | I don't think this will get me in trouble; the iTunes side of
       | Apple, for a long time (they might have changed semi-recently),
       | was using a home-spun, container-free distribution platform
       | called "Carnival". It was running JVMs straight on Linux boxes,
       | and as far as I know, was not using any kind of containers.
       | 
       | There was talk of moving to k8s before I left, so it's possible
       | this is no longer true.
        
       | seti0Cha wrote:
       | Place I worked until recently had (and probably still has) the
       | majority of the site running on bare metal. Java stack, home
       | grown RPC framework, home grown release system that boiled down
       | to a whole lot of rsync and ssh commands by one controller script
       | which knew how to do things in parallel. Configuration was
       | through files which lived in source control. Our servers were
       | hand packed, which would have sucked except we had a fairly
       | limited number of (gigantic) services. We handled load on the
       | order of millions of requests per minute. It actually worked
       | surprisingly well. Our biggest pain point was slow startup times
       | from giant services and sometimes needing to move groups of
       | services around when load got too heavy.
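       | 
       | (Roughly - this is not our actual tooling, just a Python sketch
       | of the idea; hosts, paths and the restart command are made up -
       | the controller amounted to something like this, fanned out in
       | parallel:)
       | 
       |   # Parallel rsync/ssh deploy controller sketch.
       |   import subprocess
       |   from concurrent.futures import ThreadPoolExecutor
       | 
       |   HOSTS = ["app01", "app02", "app03"]
       |   SRC, DST = "build/myservice/", "/opt/myservice/"
       | 
       |   def deploy(host):
       |       # push the build, then restart the service
       |       subprocess.run(
       |           ["rsync", "-az", "--delete", SRC,
       |            f"{host}:{DST}"], check=True)
       |       subprocess.run(
       |           ["ssh", host, "systemctl restart myservice"],
       |           check=True)
       |       return host
       | 
       |   with ThreadPoolExecutor(max_workers=10) as pool:
       |       for done in pool.map(deploy, HOSTS):
       |           print("deployed", done)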
        
         | mst wrote:
         | Moving stuff around is always aggravating but I do suspect that
         | for a lot of organisations the time spent swearing at that ends
         | up being lower over time than the time spent swearing at k8s
         | until it reliably does it for you.
         | 
         | (this is an observation, not a recommendation, different teams
         | and workloads will result in different preferences for how to
         | pick your trade-offs)
        
       | throwaway-9824 wrote:
       | My company uses Apache Storm + Clojure. Millions of daily users,
       | never felt the need for containers because submitting a .jar to a
       | Storm topology is so simple.
        
       | xeus2001 wrote:
       | We currently run a Java monolith that is built on every push to
       | master by a GitLab pipeline. When the build succeeds, the fat
       | JAR is copied to S3 (including all resources) and then a config
       | file is pushed to S3 to ask the DEV servers to run it. The
       | machines are plain EC2 instances with the service registered in
       | systemd with auto-restart after 5s. A simple shell script
       | downloads the config file and the fat JAR and runs it. We
       | detect the environment and machine we're on using the EC2 meta-
       | and user-data, which is set when the EC2 instance is launched.
       | All of this is basically a plain bash script using jq and other
       | standard tools.
       | 
       | It is a little more complicated than that, because we have
       | graceful restarts: we first ask a specific instance in each
       | cluster to update itself, and when it reports that it is
       | updated, the next instance is asked to update. All the EC2
       | instances in a cluster sit behind a simple Global Accelerator
       | and are spread across multiple regions. The service has an API
       | that GitLab invokes with a token to ask for a graceful restart;
       | the instance then reports itself unhealthy to GA for some time
       | so it can finish pending requests (close sockets), and then it
       | simply terminates. The rest is back up to the bash script and
       | systemd.
       | 
       | Deploying or redeploying to an environment is likewise a simple
       | click in the GitLab UI, which is especially important for
       | on-call, so you can see which version was deployed to which
       | environment, when, and by whom. Additionally, some JIRA tickets
       | are created automatically. In the end this only needs GitLab,
       | EC2 instances, Global Accelerator and bash scripts, while
       | allowing us to be multi-region and have stateful connections.
       | We can even ask clients to connect directly to specific
       | instances, or ask GA to redirect certain ports to specific
       | instances. Basically GA is our load balancer, router and edge
       | location. It is stable, fast and easy, with the smallest number
       | of pieces involved. We can remove individual instances, update
       | one instance in a specific region to a new version for testing,
       | and so on.
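       | 
       | (To give an idea, the rolling-restart part boils down to
       | something like this Python sketch; the endpoints, port, token
       | and version string are all made up:)
       | 
       |   # Rolling graceful-restart loop (illustrative).
       |   import time, urllib.request
       | 
       |   TOKEN = "deploy-token"
       |   WANTED = "build-42"
       |   CLUSTERS = {
       |       "eu": ["10.0.1.10", "10.0.1.11"],
       |       "us": ["10.0.2.10", "10.0.2.11"],
       |   }
       | 
       |   def call(ip, path):
       |       req = urllib.request.Request(
       |           f"http://{ip}:8080{path}",
       |           headers={"Authorization": TOKEN})
       |       with urllib.request.urlopen(req, timeout=5) as r:
       |           return r.read().decode().strip()
       | 
       |   for name, ips in CLUSTERS.items():
       |       for ip in ips:
       |           call(ip, "/admin/restart")  # drain, then exit
       |           while True:
       |               try:
       |                   if call(ip, "/version") == WANTED:
       |                       break
       |               except OSError:
       |                   pass  # still restarting
       |               time.sleep(5)
       |           print(name, ip, "updated")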
        
       | alex_duf wrote:
       | Hey former Guardian employee here.
       | 
       | The Guardian has hundreds of servers running, pretty much all EC2
       | instances. EC2 images are baked and derived from official images,
       | similarly to the way you bake a docker image.
       | 
       | We built tools before docker became the de facto standard, so we
       | could easily keep the EC2 images up to date. We integrated pretty
       | well with AWS so that the basic constructs of autoscaling and
       | load balancer were well understood by everyone.
       | 
       | The stack is mostly JVM based so the benefits of running docker
       | locally weren't really significant. We've evaluated moving to a
       | docker solution a few times and always reached the conclusion
       | that the cost of doing so wouldn't be worth the benefits.
       | 
       | Now for a company that starts today I don't think I'd recommend
       | that; it just so happens that The Guardian invested early in
       | the right tooling, so that's pretty much an exception.
        
         | rr808 wrote:
         | > Now for a company that starts today I don't think I'd
         | recommend that
         | 
         | Any reason why? It sounds pretty good.
        
         | sparsely wrote:
         | > We integrated pretty well with AWS so that the basic
         | constructs of autoscaling and load balancer were well
         | understood by everyone.
         | 
         | This is an underappreciated point I think sometimes. Once you
         | have a team which is familiar with your current, working setup,
         | the benefits of moving away have to be pretty huge for it to be
         | worthwhile.
        
         | weego wrote:
         | Also an ex-employee. Riff Raff is absolutely still an
         | excellent tool for build and deploy. At the time I was there,
         | it was the initial stack build via handwritten CloudFormation
         | scripts that was the friction and pain point.
        
       | Negitivefrags wrote:
       | We don't use containers with a pretty large deployment. (1k or so
       | bare metal servers)
       | 
       | If you statically link all your binaries then your deployment
       | system can be rsync a directory.
       | 
       | The only dependency is that the Linux kernel is new enough. Other
       | than that the packages we deploy will run on literally any setup.
        
         | claytonjy wrote:
         | what languages do you use in this way? this sounds a lot easier
         | and more natural for C than python
        
           | Negitivefrags wrote:
           | C++ for the most part.
           | 
           | I don't use python much but I don't know why it wouldn't be
           | possible to build it statically and package your entire
           | dependency set into a single directory.
        
             | claytonjy wrote:
             | It's certainly possible in python, and there's a handful
             | of tools/approaches, but the last one I used, PEX, left me
             | scarred. It's certainly a much less common or simple
             | approach than copying a venv into a docker container.
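             | 
             | (For anyone curious, the stdlib-only route looks roughly
             | like this; fine for pure-Python deps, less so once native
             | extensions show up. Paths and module names are made up:)
             | 
             |   # Bundle an app plus its pure-Python deps into a
             |   # single self-contained .pyz using stdlib zipapp.
             |   import subprocess, zipapp
             | 
             |   subprocess.run(
             |       ["pip", "install", "-r", "requirements.txt",
             |        "--target", "build/myapp"], check=True)
             |   # app source is assumed to already be in build/myapp
             |   zipapp.create_archive(
             |       "build/myapp", target="myapp.pyz",
             |       interpreter="/usr/bin/env python3",
             |       main="myapp.cli:main", compressed=True)
             |   # deploying is then just: rsync myapp.pyz host:/opt/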
        
       | abadger9 wrote:
       | I have a private consulting company which has delivered some
       | pretty sizable footprints (touching most Fortune 500 companies
       | via integration with a service), and I prefer deploying without
       | containers. In fact I'll say I hate deploying with containers,
       | which is what I do at my 9-5. I've lost job opportunities at
       | growth startups because someone was a devout follower of
       | containers and I would rather be honest than use a technology I
       | didn't care for.
        
       | firebaze wrote:
       | We do. Our containers are AWS EC2 instances. We briefly used
       | Kubernetes successfully, but decided that it's not worth the
       | maintenance burden.
       | 
       | (small SaaS, max out at ~100k concurrent users right now, but
       | growing fast)
        
       | dangus wrote:
       | > without relying on the specific practice of shipping software
       | with heaps of dependencies
       | 
       | Many companies just rely on the practice of shipping software
       | with heaps of dependencies.
       | 
       | I worked at a place that simply spun up blank AWS images in
       | autoscaling groups and allowed configuration management to
       | install literally everything: security/infra/observability
       | agents, the code and dependencies (via AWS CodeDeploy), and any
       | other needed instance configuration.
       | 
       | The downside of this practice was slow startup times. The upside
       | was...I don't know, I think this pattern happened by accident.
       | Packaging these AWS instances into images beforehand would be
       | smarter. Newly created services were generally moved over to k8s.
       | 
       | These were stateless web services for the most part.
       | 
       | I think the lesson I learned from this was "nearly any operating
       | paradigm can be made reliable enough to tolerate in production."
        
       | 0xbadcafebee wrote:
       | I mean, sure, any statically compiled application can be deployed
       | at scale without any dependencies at all. If you don't consider
       | the application's 1GB worth of compiled-in SDKs and libraries to
       | be bundled dependencies. :)
       | 
       | Back in the day we used to continuously deploy to thousands of
       | servers without VMs or containers. Probably the largest-traffic
       | sports site of the 2000s. But the application was mostly
       | mod_perl, so dependencies still had to be managed, and it was
       | intermittently a tire fire. There was no abstraction to run the
       | applications immutably or idempotently. We actually acquired a
       | company that had built a whole immutable build/deploy system
       | around RPMs, so things became more predictable, but still host OS
       | limitations would cause random bugs and the management interface
       | was buggy. Re-bootstrapping hosts to shift load in the middle of
       | peak traffic is a huge pain.
       | 
       | Containers would have been great. We could've finally ditched
       | Cfengine2 and most of the weird custom host-level configuration
       | magic and just ship applications without fear that something that
       | worked in dev would break in prod due to a configuration or build
       | issue. We also could have changed load patterns nearly instantly,
       | which you can't do with VMs as you have to spin up new VMs to
       | have a new host OS (not that we had VMs, what a luxury!)
        
       | KaiserPro wrote:
       | Large VFX places _used_ to not use containers for large scale
       | stuff.
       | 
       | Where I worked we had something like 40k servers running about 80
       | different types of software. It was coordinated by pixar's
       | alfred, a single threaded app that uses something that looks
       | suspiciously like the athena widget set. (
       | https://hradec.com/ebooks/CGI/RMS_1.0/alfred/scheduling.html )
       | 
       | It was wrapped in a cgroup wrapper to avoid memory contention
        
       | paxys wrote:
       | I work for one of the largest enterprise productivity companies
       | out there. We have tens of thousands of self-managed EC2 VMs, no
       | containers.
       | 
       | We use Chef and Terraform for the most part.
        
         | markstos wrote:
         | Are you maintaining AMIs or using a tool like Ansible? For
         | instance, how do you apply security patches to 10,000
         | instances?
        
           | paxys wrote:
           | I'm not too involved with the ops side, but I believe Chef
           | handles it.
        
       | brunojppb wrote:
       | StackOverflow is known to operate their own servers in their
       | own datacenters and, as far as I recently read, they are still
       | running with no container tech.
       | https://nickcraver.com/blog/2016/05/03/stack-overflow-how-we...
        
       | efficax wrote:
       | Back in 2016 at least, Stack overflow was container free
       | https://nickcraver.com/blog/2016/02/17/stack-overflow-the-ar...
       | 
       | No idea how much has changed since then
        
         | 16bytes wrote:
         | That's the first thing that came to mind as well, and Stack
         | Overflow definitely operates at scale, but OP asked for people
         | doing distributed systems. SO is famously monolithic, famously
         | operating from just a handful of machines.
        
           | NicoJuicy wrote:
            | I think this has changed a bit since moving to .NET Core
            | (not sure).
        
           | bluefirebrand wrote:
           | If anything, Stack Overflow should be taken as an example
           | that large scale operations can be run a lot more simply than
           | many of us are making them.
        
       | nunez wrote:
       | Stack Overflow/Stack Exchange maybe?
        
       | [deleted]
        
       | popotamonga wrote:
       | Containers yes but nothing else.
       | 
       | Running at a $1B valuation with manually going to instances and
       | docker pull xxx && docker-compose down && docker-compose up -d.
       | EC2 instances created by hand. No issues.
        
         | raffraffraff wrote:
         | I worked at a shop that got to $8.5b acquisition running on
         | centos, puppet, docker, consul and vault. Some bash, spit and
         | glue tied it together pretty nicely.
        
       | softwarebeware wrote:
       | Is this part of some new anti-container sentiment?
        
       | Nextgrid wrote:
       | I've been at a company where they weren't (yet) using containers
       | nor K8S.
       | 
       | The build process would just create VM images with the required
       | binaries in there and then deploy that to an autoscaling group.
       | 
       | It worked well, and if you only ever intend to run a single
       | service per machine then it is the right solution.
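       | 
       | (In case it's useful, a rough sketch of the bake-and-roll step
       | with boto3; the instance ID, template and ASG names are
       | placeholders, and it assumes the ASG tracks the launch
       | template's $Latest version:)
       | 
       |   # Bake an AMI from a builder instance, point the
       |   # launch template at it, then roll the ASG.
       |   import boto3
       | 
       |   ec2 = boto3.client("ec2")
       |   asg = boto3.client("autoscaling")
       | 
       |   ami = ec2.create_image(
       |       InstanceId="i-0123456789abcdef0",
       |       Name="myservice-build-42")["ImageId"]
       |   ec2.get_waiter("image_available").wait(ImageIds=[ami])
       | 
       |   ec2.create_launch_template_version(
       |       LaunchTemplateName="myservice",
       |       SourceVersion="$Latest",
       |       LaunchTemplateData={"ImageId": ami})
       | 
       |   asg.start_instance_refresh(
       |       AutoScalingGroupName="myservice-asg")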
        
         | tailspin2019 wrote:
         | I'm in the process of moving to exactly this approach.
         | 
         | I've been trying to pick the right Linux distro to base my
         | images on. Ubuntu Server is the low effort route but a bit big
         | to redeploy constantly.
         | 
         | I've also been looking at the possibility of using Alpine Linux
         | which feels like a better fit but a bit more tweaking needed
         | for compatibility across cloud providers.
         | 
         | Unikernels are also interesting but I think that might be going
         | too far in the other direction for my use case.
         | 
         | For others doing this, I'd be interested in what distros you're
         | using?
        
           | maxk42 wrote:
           | Ubuntu is too heavy. Try CentOS.
        
             | goodpoint wrote:
             | Urgh. If anything, use Debian.
        
             | tailspin2019 wrote:
             | Thanks, CentOS was on my radar too. I'll take a closer
             | look!
        
               | Melatonic wrote:
                | Isn't CentOS basically discontinued at this point?
        
               | paulryanrogers wrote:
                | It's changed from a downstream distro to an upstream
                | one. Rocky Linux has since appeared to take its place
                | in the community, created by one of the original
                | CentOS founders.
        
             | twistedcheeslet wrote:
             | How so? Doesn't CentOS share a lot of the same issues as
             | Ubuntu? Legitimate question.
        
           | gen220 wrote:
           | Google has used debian as their base. Netflix uses a BSD
           | flavor (I forget which) as their CDN cache. FB used CentOS,
           | not sure what they use today since CentOS is EOL'd.
           | 
           | Debian (and, formerly, CentOS) is a good standard: it
           | occupies a sweet spot between ubuntu server and alpine, in
           | the sense that it's batteries-included and very well-
           | supported (apt/yum), but not particularly bloated.
           | 
           | I use debian for all my personal stuff. The BSD flavors (http
           | s://en.wikipedia.org/wiki/Comparison_of_BSD_operating_sy...)
           | are tempting, though. Perhaps one weekend!
           | 
           | For other distros, if you're working on non-x86, or have
           | uncommon dependencies, you should be prepared to pull in and
           | build your dependencies from scratch, which is not much fun
           | to maintain.
        
             | mst wrote:
             | Google's debian-based distroless is fascinating:
             | https://github.com/GoogleContainerTools/distroless
        
               | gen220 wrote:
               | See also https://marc.merlins.org/linux/talks/ProdNG-
               | LinuxCon2013/Pro... which has some interesting history
               | and context from that time.
               | 
               | In paper form (more gory details): https://www.usenix.org
               | /system/files/conference/lisa13/lisa13...
        
             | tailspin2019 wrote:
             | Useful insight, thanks. What attracts you to BSD over
             | Debian if you were to go that route at some point?
        
               | gen220 wrote:
               | No problem! If you search around for "Debian vs BSD",
               | you'll find more exhaustive explanations [1], but it
               | mostly reduces to the fact that BSD is more coherent &
               | organized than GNU/Linux; it's more feature-rich in the
               | domains that sysadmins and hackers appreciate, but
               | feature-sparse in the domains "normal users" appreciate.
               | Depending on your needs these facts can be advantages or
               | disadvantages.
               | 
               | I tend to gravitate towards projects like BSD, since they
               | align with my principles, but you do pay a cost in terms
               | of compatibility with common software.
               | 
               | For example, Firefox does not treat BSD as a first class
               | build target, so it's up to community members to build
               | and deploy Firefox binaries, and report feedback on bugs
               | that break BSD installations.
               | 
               | If access to the most up-to-date compiled versions of
               | popular software packages is a big sticking point, BSD-
               | land may not be the right choice. But if you're living
               | mostly in `vim`, `bash` and `man`, and are willing to
               | roll up the sleeves, BSD feels cozy.
               | 
               | It's maybe a 20% correct analogy (don't read too deeply
               | into it, or you'll draw incorrect conclusions), but C++
               | is to GNU/Linux, as D/Rust/Zig is to BSD.
               | 
               | [1]: https://unixsheikh.com/articles/technical-reasons-
               | to-choose-...
        
               | tailspin2019 wrote:
               | Very interesting. I knew very little about BSD but your
               | and mst's comments have convinced me that I need to take
               | a closer look.
               | 
               | Thanks for the link - I'm taking a look at that now!
        
               | mst wrote:
               | Having an operating system where the kernel and core
               | userland are developed together is a qualitatively
               | different experience to one that's been stitched together
               | from external projects. I'm not really sure how to
               | describe the difference in experience but the "this is an
               | integrated whole" feel is really quite pleasant as a
               | sysadmin and as a user.
               | 
               | Note that my personal infrastructure is a mixture of
               | Debian and FreeBSD and I dearly love both - if you forced
               | me to pick one of the two to keep I'd be horribly torn.
        
             | _-david-_ wrote:
             | >FB used CentOS, not sure what they use today since CentOS
             | is EOL'd.
             | 
             | They are using CentOS Stream now.
        
         | strikelaserclaw wrote:
          | This is what my company does as well.
        
         | dolni wrote:
         | At my workplace we use Docker to run services, but there is no
         | container orchestration like Kubernetes. An AMI bakes in some
         | provisioning logic and the container image. Autoscaling does
         | the rest.
         | 
         | Even without orchestration, I argue containers are useful. They
         | abstract the operating system from the application and allow
         | you to manage each independently. Much more easily than you'd
         | be able to otherwise, anyway.
         | 
         | Plus you can run that image locally using something like
         | docker-compose for an improved developer experience.
        
           | zokier wrote:
           | > An AMI bakes in some provisioning logic and the container
           | image
           | 
            | That does sound like a bit of a weird setup to me; with
            | something like Packer, building an AMI is almost as easy
            | as building a Docker image with Dockerfiles, so the
            | benefits of using Docker seem quite slim?
        
             | dolni wrote:
             | Nah, Docker still has lots of benefits. You can run a
             | Docker image on a developer system, using docker-compose,
             | and it could even be the exact same one running in prod.
             | 
             | That takes a TON of variables out of the equation.
        
           | iangregson wrote:
            | I'm in this spot too. It looks increasingly likely that an
            | orchestration layer is in our near future. But we've
            | gotten pretty far with this approach.
        
             | dolni wrote:
              | What factors are pushing you toward container
              | orchestration as opposed to continuing with your current
              | strategy?
        
           | brimble wrote:
           | I tend to use Docker heavily as basically just a cross-
           | distro, isolated package and process manager that also
           | presents config in a reasonably consistent manner. It's
           | wildly better than dealing with "native" distro packages,
           | unless you are _all in_ on doing _everything_ The Correct Way
           | on a particular distro and have tooling in place to make that
           | not suck. That it happens to use containers barely matters to
           | me.
        
         | fragmede wrote:
          | Container images are a decent way to get close to a root
          | filesystem, but with something approaching a REPL loop for
          | debugging.
        
         | css wrote:
         | Same here, the workflow is great.
        
         | secondcoming wrote:
         | My company works this way.
         | 
         | Our Jenkins creates a debian package, and deploying involves
         | installing this package on a machine and creating an image.
         | This image is then deployed to the ASGs. We operate in 5
         | regions in GCP.
        
       | tyingq wrote:
       | There's certainly a lot of large scale VMware out there, though
       | that's somewhat the same idea.
        
       | frellus wrote:
       | Great talk here from HCA Healthcare about their architecture and
       | tech stack using Elixir OTP, Riak:
       | 
       | https://www.youtube.com/watch?v=cVQUPvmmaxQ
       | 
       | TL;DR - Elixir/OTP for fault tolerance and process supervision,
       | Riak as a KV store (no other database) and an interesting
       | process-actor model for patients I found delightful. Bonus: hot
       | code patching, zero downtime
        
       | blacklight wrote:
       | Booking.com - at least until I left in 2018, not sure about now.
       | 
       | Their code base is (still) mostly in Perl (5), running on uWSGI
       | bare instances, installed on KVMs deployed on a self-hosted
       | infrastructure.
        
         | Thaxll wrote:
          | Working in Perl in 2022, the horror, especially because it's
          | most likely a 10-year-old Perl version.
        
           | mst wrote:
           | Booking has competent perl ops people - including people who
           | contribute to the language core and toolchain - and keeps
           | very much up to date.
           | 
           | Sane application scale perl is pretty much a different
           | language to old school scripting perl - the two just happen
           | to share a parser and a VM.
           | 
           | (not everything about Booking's code is necessarily sane,
           | when you've got a codebase that's been running that long
           | making heaps of money there's inevitably stuff people wish
           | they could replace, but your 'most likely' simply isn't
           | descriptive of most companies running OO perl application
           | stacks in 2022 - the ones that wrote crap perl mostly already
           | blamed perl and are now writing hopefully slightly less crap
           | something else)
        
         | juliogreff wrote:
         | I left B.com in early 2020, and they had been using Kubernetes
         | for newer services (I actually worked on the deploy tooling),
         | but until I left nothing of consequence had been moved to
         | containers yet. The overwhelming majority of their workloads
         | still ran on bare metal indeed, and to be really honest I still
         | kinda miss it sometimes.
        
       | tiffanyh wrote:
       | I'd imagine the answer largely depends on whether or not your
       | company either builds or buys software.
       | 
       | If your company builds - there's no guarantee containers are
       | used (but it's a choice).
       | 
       | If your company buys software - I highly doubt containers are
       | used at all.
        
         | topspin wrote:
         | The commercial, licensed software I've dealt with in the past
         | year has all been containerized.
        
           | wsostt wrote:
           | I agree. I've seen containerization mentioned by many vendors
           | in the last year. It adds another layer of questions to vet
           | like "do they possibly know what they're doing?"
        
             | JonChesterfield wrote:
             | Do you see shipping in containers as a sign that they do
             | know what they're doing, or as a sign that they don't?
        
               | mst wrote:
               | The key thing is that it's a sign that you're effectively
               | going to be outsourcing a complete userland to them,
               | which means you'll be much more dependent on the vendor
               | for security updates to anything their code depends on as
               | well as their code itself.
               | 
               | Whether or not this is a good idea depends on your
               | situation and their level of competence.
        
               | wsostt wrote:
               | Neither by default. I look to be convinced that they've
               | implemented containers because it made sense for their
               | technical architecture or strategy.
               | 
               | I have a vendor who pitched their new "cloud-native" re-
                | platforming project and it really spooked me. It's a
                | data management-type tool that was migrating from a
               | traditional on-prem client/server architecture to an AWS-
               | hosted Angular interface with a MongoDB backend. I got
               | the same pitch a year later and the entire stack had
               | changed. I was spooked the first time; now I'm really
               | spooked and thinking about migrating off the platform.
        
               | Melatonic wrote:
               | To be fair they may have just realized that MongoDB is
               | sort of meh and decided to change to something better
        
           | benatkin wrote:
           | OCI images don't necessarily mean lxc/docker style
           | containers. Many are run directly on MicroVMs.
           | https://fly.io/blog/docker-without-docker/ Ditto Google Cloud
           | Run.
        
       | robot wrote:
       | We use Elastic Beanstalk with NodeJS. It works seamlessly; it
       | does use containers as the underlying infrastructure, but we
       | don't see or manage them. Deploying takes no more than 10-15
       | seconds. It doesn't break. DevOps maintenance work is minimal.
        
       | toast0 wrote:
       | I worked at WhatsApp, prior to moving to Facebook infra, we had
       | some jails for specific things, but mostly ran without
       | containers.
       | 
       | Stack looked like:
       | 
       | FreeBSD on bare metal servers (host service provided a base
       | image, our shell script would fetch source, apply patches,
       | install a _small_ handful of dependencies, make world, manage
       | system users, etc)
       | 
       | OTP/BEAM (Erlang) installed via rsync etc from build machine
       | 
       | Application code rsynced and started via Makefile scripts
       | 
       | Not a whole lot else. Lighttpd and php for www. Jails for stud (a
       | tls terminator, popular fork is called hitch) and ffmpeg (until
       | end to end encrypted media made server transcoding unpossible).
       | 
       | No virtualized servers (I ran a freebsd vm on my laptop for dev
       | work, though).
       | 
       | When WA moved to Facebook infra, it made sense to use their
       | deployment methodology for the base system (Linux containers),
       | for organizational reasons. There was no consideration for which
       | methodology was technically superior; both are sufficient, but
       | running a very different methodology inside a system that was
       | designed for everyone to use one methodology is a recipe for
       | operational headaches and difficulty getting things diagnosed
       | and fixed, as it's so tempting to jump to the conclusion that any
       | problem found on a different setup is because of the difference
       | and not a latent problem. We had enough differences without
       | requiring a different OS.
        
         | freeqaz wrote:
         | That's a very cool deployment method to just use rsync like
         | that! Simple and very composable.
         | 
         | And now I feel self-conscious about my pile of AWS and Docker!
        
           | motoboi wrote:
            | Containers are just tar.gz files, you know? The whole
            | layers thing is just an optimization.
            | 
            | You can actually very simply run those tar.gz files without
            | docker involved, just cgroups. But then you'll have to
            | write some daemon scripts to start, stop, restart, etc.
            | 
            | Follow this path and soon you'll have a (worse) custom
            | docker. Try to create a network out of those containers
            | and soon a (worse) SDN network appears.
            | 
            | Try to expand that to optimal node usage and soon a (worse)
            | Kubernetes appears.
            | 
            | My point here is: it's just software packaged with its
            | dependencies. The rest seems inconsequential, but it's
            | actually the hard part.
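            | 
            | (To make that concrete: you really can run a rootfs tarball
            | with little more than util-linux and cgroup v2. A Python
            | sketch, needs root, and all the paths and names are made
            | up:)
            | 
            |   # Unpack a rootfs tarball and run it in fresh
            |   # namespaces with a cgroup v2 memory limit.
            |   import os, subprocess, tarfile
            | 
            |   ROOTFS = "/srv/app-rootfs"
            |   os.makedirs(ROOTFS, exist_ok=True)
            |   with tarfile.open("app-rootfs.tar.gz") as t:
            |       t.extractall(ROOTFS)
            | 
            |   # cgroup v2: cap memory, then join the cgroup
            |   # (assumes the memory controller is enabled)
            |   cg = "/sys/fs/cgroup/app"
            |   os.makedirs(cg, exist_ok=True)
            |   with open(cg + "/memory.max", "w") as f:
            |       f.write("512M")
            |   with open(cg + "/cgroup.procs", "w") as f:
            |       f.write(str(os.getpid()))
            | 
            |   # new mount/uts/ipc/pid namespaces, then chroot in
            |   subprocess.run(
            |       ["unshare", "--mount", "--uts", "--ipc",
            |        "--pid", "--fork", "chroot", ROOTFS,
            |        "/bin/sh", "-c", "/usr/bin/myapp"],
            |       check=True)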
        
             | treesknees wrote:
             | This is essentially what fly.io does, they unpack the
             | container image into a micro-VM and execute it directly.
             | Calling it worse is subjective, you certainly need to
             | account for things you would never worry about by using
             | traditional Docker. But for some companies it ends up
             | working much better than Docker.
             | 
             | https://fly.io/blog/docker-without-docker/
        
             | z3t4 wrote:
              | You can do a lot with just systemd today. I would not be
              | surprised if systemd ate up Docker too.
        
               | Art9681 wrote:
               | I'm personally interested in Unikernels. That tech seems
               | to be a good contender for replacing K8s. Every app an
               | appliance running in a hypervisor.
        
             | lamontcg wrote:
             | "worst" might not really be all that bad.
             | 
             | K8s may be the "worst" orchestrator for you, and you may
             | not actually need all or any of that
             | complexity/functionality.
             | 
             | But starting with containers viewed as an RPM/tarball with
             | some extra sauce (union filesystem and cgroups/jail) is a
             | way better mental model than the whole "immutable
             | infrastructure" meme to me.
        
           | toast0 wrote:
           | It's 'easy' to do it with rsync and Make as long as you don't
           | have a lot of dependencies. Once you start pulling in
           | dependencies, you have to figure out how to get them
           | installed, hopefully before you roll code that uses them. If
           | the dependencies are small, you can pull them into your
           | source (more or less) and deploy them that way, but that may
           | make tracking upstream harder. (otoh, a lot of things I
           | personally used like that were more of pull this probably
           | abandoned thing that's close to what we need, and modify it
           | quite a bit to fit our needs; there wasn't a need or desire
           | to follow upstream after that)
        
             | tiffanyh wrote:
             | Just curious, would your deployment method be any different
             | today? If so, how?
        
               | toast0 wrote:
               | If I was going to do it again, I would probably work on a
               | user management script a lot sooner (you don't really
               | _need_ it, but it would have been nice, and the earlier
               | you do it, the less crufty differences need to be
               | tolerated) and maybe make build on one host, deploy on
               | many or at least push from workstation to colo once and
               | fan out; might have been nice to have that around 100
               | hosts instead of much later.
               | 
               | Of course, today, my fleet is down to less than ten
               | hosts, depending on exactly how you count, and I'm
               | usually the only user, so I can do whatever.
        
       | kristjansson wrote:
       | > ... without relying on the specific practice of shipping
       | software with heaps of dependencies. Whether that be in a
       | container or in a single-use VM.
       | 
       | Either you have to package your dependencies with your software,
       | rely on your deployment environment to have all of the
       | dependencies available, and correctly versioned, or write
       | software without dependencies.
       | 
       | Since you seem to have a pretty specific pattern in mind, I'd be
       | curious to know more about what you've envisioned or are dealing
       | with.
        
       | oceanplexian wrote:
       | I worked at (massive live-streaming website) and my ops team
       | operated tens of thousands of bare-metal machines. Not to say we
       | didn't have an enormous amount of containerized infrastructure in
       | AWS, but we had both.
       | 
       | When the company was younger containerized networking had latency
       | and throughput issues, especially when you were trying to squeeze
       | every bit of traffic you can from a white-box bare-metal server,
       | i.e. bonding together 10Gb or 40Gb network interfaces. The other
       | thing is that the orchestration engines like K8s simply had
       | maturity issues when not using Cloud Load Balancers.
       | 
       | As for the implementation details, I've worked at lots of
       | companies doing metal and they look a lot alike. PXE and TFTP,
       | something like Chef, Puppet, Ansible (But at a certain scale you
       | have to transcend those tools and come up with better patterns),
       | you need services to manage IPMI or console servers, power
       | strips, etc., you need a team of folks to rack and stack things,
       | you need inventory, you need network engineers, and so on. At a
       | certain scale you can simply push code around with SSH and a
       | build system, at a scale beyond that you need to come up with
       | some special sauce like P2P asset distribution or an internal
       | CDN. At the pinnacle of bare-metal, you'd ideally have a very
       | evolved control-plane, a slimmed down OS that runs a single
       | static binary, and a stateless application. It takes a lot of
       | work to get there.
       | 
       | Of course, getting servers to run some code is scratching the
       | surface. Service discovery, network architecture, security, etc.,
       | are all things that require specialized skill sets and
       | infrastructure. You also need to build and maintain every "glue"
       | service you get from a cloud provider, you need to run file
       | servers, you need to run repositories, you need to run and manage
       | your own databases, and so on. Sometimes you can hybridize those
       | with cloud services but that opens up yet another can of worms
       | and teams of people who need to answer questions like.. what if
       | The Cloud(tm) goes down? What if there's some kind of split brain
       | scenario? What if there's a networking issue? How does service
       | discovery work if half of the services disappear? etc. etc.
        
       | onebot wrote:
       | We use freebsd jails and a lightweight in house orchestration
       | tool written in Rust. We are running hundreds of Ryzen machines
       | with 64 cores. Our costs compared to running equivalent on Amazon
       | is so much less. We estimate our costs are about 6x lower than
       | AWS and we have far better performance in terms of networking,
       | CPU, and disk write speed.
       | 
       | Jails has been a pleasure to work with! We even dynamically scale
       | up and down resources as needed.
       | 
       | We use bare metal machines on Interserver. But there are quite
       | a few good data centers worth considering.
        
         | eatonphil wrote:
         | What company is this? And are you hiring? Asking for a friend.
        
         | wswope wrote:
         | As a tangent to this, I highly recommend https://bloom.host for
         | anyone looking for just a few cores worth of Ryzen performance.
         | They're primarily a gaming server host, but their general-
         | purpose VPS are absolutely stellar, and the support is top-
         | notch. I tested them out after they were recommended on another
         | HN thread, and got a 3x speedup based on a few heavy postgres
         | queries I benchmarked on a $20/mo Digital Ocean instance vs. a
         | $20/mo Bloom instance.
         | 
         | Data center Ryzen really knocks it out of the park.
        
         | x0x0 wrote:
         | I stay quasi-anonymous on here, but I worked at a company that
          | burst -- at the time -- to approx 1.5m requests per second.
         | Average traffic was approx 400k rps.
         | 
         | It was run on naked boxes in many pops worldwide. We heavily
         | used the jvm, which dramatically simplifies application
         | distribution. Boxes were all netboot; they came up, grabbed the
         | OS, and based on some pretty simple config figured out what
         | jars to download and run.
         | 
         | We also costed out a move to aws, and it was somewhere between
         | 5 and 6x as expensive as buying our own hardware.
         | 
         | Thousands of boxes worldwide managed with a team of two
         | fulltime sysadmins plus two more primarily engaged in dev with
         | sysadmin duties as-needed, plus remote hands services in the
         | various datacenters.
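          | 
          | (Something in the spirit of the "figure out what jars to
          | download and run" step, heavily simplified; the config
          | format, URLs and paths are invented for the example:)
          | 
          |   # Netboot-style app launcher sketch: read a small
          |   # host config, fetch the jars, start the JVM.
          |   import json, os, subprocess, urllib.request
          | 
          |   with open("/etc/host-role.json") as f:
          |       cfg = json.load(f)  # {"jars": [...], "main": "..."}
          | 
          |   os.makedirs("/opt/app", exist_ok=True)
          |   for jar in cfg["jars"]:
          |       urllib.request.urlretrieve(
          |           f"http://artifacts.internal/{jar}",
          |           f"/opt/app/{jar}")
          | 
          |   # java expands the /opt/app/* classpath wildcard itself
          |   subprocess.run(
          |       ["java", "-cp", "/opt/app/*", cfg["main"]],
          |       check=True)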
        
       | smilliken wrote:
       | My company runs without containers. We process petabytes of data
       | monthly, thousands of CPU cores, hundreds of different types of
       | data pipelines running continuously, etc. Definitely a
       | distributed system with lots of applications and databases.
       | 
       | We use Nix for reproducible builds and deployments. Containers
       | only give reproducible deployments, not builds, so they would be
       | a step down. The reason that's important is that it frees us from
       | troubleshooting "works on my machine" issues, or from someone
       | pushing an update somewhere and breaking our build. That's not
       | important to everyone if they have few dependencies that don't
       | change often, but for an internet company, the trend is
       | accelerating towards bigger and more complex dependency graphs.
       | 
       | Kubernetes has mostly focused on stateless applications so far.
       | That's the easy part! The hard part is managing databases. We
       | don't use Kubernetes, but there's little attraction because it
       | would be addressing something that's already effortless for us to
       | manage.
       | 
       | What works for us is to do the simplest thing that works, then
       | iterate. I remember being really intimidated about all the big
       | data technologies coming out a decade ago, thinking they are so
       | complex that they must know what they're doing! But I'd so often
       | dive in to understand the details and be disillusioned about how
       | much complexity there is for relatively little benefit. I was in
       | a sort of paralysis of what we'd do after we outgrew postgresql,
       | and never found a good answer. Here we are years later, with a
       | dozen+ postgresql databases, some measuring up to 30 terabytes
       | each, and it's still the best solution for us.
       | 
       | Perhaps I've read too far into the intent of the question, but
       | maybe you can afford to drop the research project into containers
       | and kubernetes, and do something simple that works for now, and
       | get back to focusing on product?
        
         | ixxie wrote:
         | I would love to hear more about your architecture and
         | deployment... do your services run as NixOS modules? Are you
         | using NixOps or Morph or something else? How does your world
         | look like without K8s and Containers?
        
           | smilliken wrote:
            | For the first 6 years of using Nix, it was deployed on
            | Ubuntu.
           | We recently migrated to NixOS. NixOS is fantastic for other
           | reasons that are similar but separate from Nix as a package
           | manager/build system.
           | 
           | It's easier to incrementally switch to Nix first then NixOS
           | later.
           | 
           | We don't use systemd services for our code. We only use NixOS
           | as an operating system, not for application layer. Our code
           | works just the same on any linux distribution, or even macos
           | minus linux-only stuff. That way we don't need to insist that
           | everyone uses the same OS for development, and no one is
           | forced into VMs. Everything can be run and tested locally.
        
             | smilliken wrote:
             | > Are you using NixOps or Morph or something else? How does
             | your world look like without K8s and Containers?
             | 
             | Not using NixOps or Morph. I recall considering a few
             | others as well. In each case, I wasn't able to understand
             | what they were doing that I couldn't do with a simple
             | script. Instead, there's a python program that does an
             | rsync, installs some configuration files, and runs the nix
             | build. It deploys to any linux OS, but it does some stuff
             | differently for NixOS, and the differences account for
             | about 100 lines while increasing scope (managing
             | filesystems, kernel versions, kernel modules, os users,
             | etc). A deploy to all hosts takes about a minute because it
             | runs in parallel. Deploys are zero downtime, including
             | restarting a bunch of web apps.
             | 
             | The NixOS configuration itself is another ~200 lines, plus
             | a bit per host for constants like hostname and ip
             | addresses. It's really neat being able to define an entire
             | operating system install in a single configuration file
             | that gets type checked.
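              | 
              | (In spirit it's not much more than this; the real script
              | obviously does more, and the hosts, paths and restart
              | command here are placeholders:)
              | 
              |   # Skeleton of that deploy script: rsync the tree,
              |   # run the nix build, restart apps, in parallel.
              |   import subprocess
              |   from concurrent.futures import ThreadPoolExecutor
              | 
              |   HOSTS = ["db1", "web1", "web2"]
              | 
              |   def deploy(host):
              |       subprocess.run(
              |           ["rsync", "-az", "--delete", "deploy/",
              |            f"{host}:/srv/deploy/"], check=True)
              |       subprocess.run(
              |           ["ssh", host,
              |            "nix-build /srv/deploy/default.nix && "
              |            "/srv/deploy/restart-apps"], check=True)
              | 
              |   with ThreadPoolExecutor() as pool:
              |       list(pool.map(deploy, HOSTS))
              |   print("deployed to", len(HOSTS), "hosts")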
        
         | claytonjy wrote:
         | is Nix a hurdle when onboarding engineers? do only a few people
         | need to know the language? I've wanted to use Nix and home-
         | manager for personal stuff but the learning curve seems big.
        
           | smilliken wrote:
           | In my case, adopting Nix was a response to having a poor
           | onboarding process for new engineers. It was always fully
           | automated, but it wasn't reliable before nix. So somebody
           | would join the team, and it was embarrassing because the
           | first day would be them troubleshooting a complex build
           | process to get it to work on their machine. Not a great first
           | impression.
           | 
           | So I adopted Nix "in anger" and now new machines always build
           | successfully on the first try.
           | 
           | For me it was easy to set up. 80% done on the first day. It
           | helped that I understood the conceptual model and benefits
           | first. There's a lot of directions you can take it, I'd
           | recommend getting something really simple working then
           | iterating. Don't try to build the Sistine Chapel with all the
           | different things integrated on day one.
           | 
           | It's hard to overstate how well that decision has worked out
           | for me. Computers used to stress me out a lot more because of
           | dependencies and broken builds. Now I have a much healthier
           | relationship with computers.
        
             | john-shaffer wrote:
             | Have you had problems with Nix on macOS? Nix works great
             | for me, but I can't use it much to deal with installs or
             | synchronize dependencies because the devs on macOS can't
             | get Nix running.
        
           | silviogutierrez wrote:
           | It's a hurdle to be able to write Nix stuff, but to consume
           | it is pretty easy.
           | 
           | Shameless plug here:
           | https://www.reactivated.io/documentation/why-nix/
        
             | [deleted]
        
             | olifante wrote:
             | I just tried Nix yesterday for the first time in order to
             | play with reactivated. Was surprised by how simple it was
             | to get it running. Made me think that perhaps containers
             | are not as indispensable as I thought.
             | 
             | And reactivated is awesome, by the way. I just worry about
             | how brittle it will be as both Django and React evolve.
        
               | silviogutierrez wrote:
                | Thanks! Hopefully not brittle at all. Especially since
                | both React and Django are very stable projects, the
               | latter even more than the former.
        
         | rapsey wrote:
          | I've seen comments on here from people with large Nix
          | deployments, with hundreds of thousands of lines of Nix
          | code, saying it is a complete nightmare to manage.
        
         | Thaxll wrote:
          | The thing is, you probably spent a lot of time on things
          | that come for free elsewhere, and while the rest of the
          | world keeps improving that tech, you keep a home-grown
          | solution that gets harder and harder to maintain. Plus, the
          | knowledge is not transferable.
         | 
         | > Containers only give reproducible deployments, not builds, so
         | they would be a step down.
         | 
          | This is not true: if you build with a docker image A pinned
          | to a specific version, then the result is reproducible.
          | 
          | That's what most people should be doing: you pin versions in
          | your build image, and that's it.
         | 
         | > Kubernetes has mostly focused on stateless applications so
         | far. That's the easy part
         | 
          | Kubernetes can run databases, and even stateless
          | applications are not a solved problem. I mean, how do your
          | services get restarted on your system? Is that something
          | you had to re-create as well?
        
           | lnxg33k1 wrote:
            | I also don't agree with that, and I've had problems with
            | Docker using a specific OS version but pulling in some
            | minor software version which broke things. So even if it's
            | slightly true, that statement represents something that
            | might be solved by specifying minor versions for the
            | packages you depend on, which is a level of effort
            | comparable to the one you need in Nix to specify package
            | hashes. So you can have both with Docker, a strict and a
            | non-strict approach, while I guess Nix only supports
            | strict.
        
           | jandrewrogers wrote:
           | Sure, Kubernetes can run a database, but not efficiently.
           | Companies with intensive data infrastructure frequently
           | operate at scale without containers, either VMs or bare
           | metal. The larger the data volume, the less likely they are
           | to use containers because efficiency is more important. It is
           | also simpler to manage this kind of thing outside containers,
           | frankly, since you are running a single process per server.
           | 
           | People have been building this infrastructure since long
           | before containers, thousands of servers in single operational
           | clusters. Highly automated data infrastructure without
           | containers is pretty simple and straightforward if this is
           | your business.
        
             | remram wrote:
             | Running on Kubernetes doesn't mean all data has to be on
             | network disks.
        
             | anonymousDan wrote:
             | Can you point to any resources on the perf implications of
             | docker for databases?
        
               | Art9681 wrote:
                | OP is likely making assumptions based on obsolete
                | knowledge. Tech moves really fast, which means the
                | knowledge we gained last week might no longer be
                | relevant. Here is an article showing it is perfectly
                | acceptable to deploy Postgres in Kubernetes:
               | 
               | https://www.redhat.com/en/resources/crunchy-data-and-
               | openshi...
               | 
               | The great majority of Kubernetes experts I have met use a
               | managed service and have not been exposed to the
               | internals or the control plane back end. An abstracted
               | version of Kubernetes that works well for most use cases
               | but makes it more difficult to solve some of the problems
               | listed in this thread.
               | 
                | Most folks don't really know Kubernetes. They know how
                | to deploy X application using X abstraction (Helm,
                | Operators, CD tooling) to some worker node in a public
                | cloud using public sources, and that's basically it.
                | It's no wonder they cannot solve some of the common
                | problems, because they didn't really deploy the apps.
                | They filled out a form and pushed a button.
        
           | zomglings wrote:
           | The OP is talking about reproducible _builds_. A docker image
           | is a build _artifact_.
           | 
           | If I gave you a Dockerfile for a node.js service that
           | installed its dependencies using npm, and the service used
           | non-trivial dependencies, and I gave you this file 10 years
           | after it was first written, chances are you would not be able
           | to build a new image using that Dockerfile.
           | 
           | This would be a problem for you if you had to make some small
           | change to that node.js service (without changing any
           | dependencies).
        
             | Thaxll wrote:
              | Docker is also used to build things, not just run them.
              | With a multistage build and dependency vendoring you have
              | a 100% reproducible build.
              | 
              | It's very easy to do in Go.
              | 
              | You pin the version used to build your app (the runtime
              | version), and in your git repo you vendor the
              | dependencies.
              | 
              | That's it.
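              | 
              | A minimal sketch of what that can look like (the module
              | layout, binary name, and Go version here are made up, not
              | taken from any particular project):
              | 
              |     FROM golang:1.17 AS build
              |     WORKDIR /src
              |     # deps are vendored in the repo, so nothing
              |     # is fetched from the network at build time
              |     COPY . .
              |     RUN CGO_ENABLED=0 go build -mod=vendor \
              |         -o /app ./cmd/app
              | 
              |     FROM scratch
              |     COPY --from=build /app /app
              |     ENTRYPOINT ["/app"]
              | 
              | Pin the golang tag (or use a digest) and the whole build
              | is determined by what's committed in the repo.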
        
             | Art9681 wrote:
             | There are other container build tools such as Kaniko that
             | have solved the reproducible builds issue right? If we're
             | operating at scale, it probably means we are using some
             | flavor of Kubernetes and the Docker runtime is no longer
             | relevant. Would Redhat OpenShift's source-to-image (S2I)
             | build process solve the reproducible builds requirement?
             | This space moves quickly and the assumptions we had last
             | week may no longer be relevant.
             | 
             | For example Crunchy Postgres offers an enterprise supported
             | Kubernetes Operator that leverages stateful sets. Yet I am
             | reading in these comments that it is an unsolved issue.
        
           | smilliken wrote:
           | Dan Luu wrote an essay on this topic recently:
           | https://danluu.com/nothing-works/
           | 
           | He ponders why it is that big websites inevitably have kernel
           | developers. Way out of their domain of expertise, right? If
           | you adopt a technology, you're responsible for it.
           | 
           | When Kubernetes inevitably has an issue that is a blocker for
           | us, I don't have confidence in my ability to fix it. When an
           | internal python or shell program has an issue that is a
           | blocker for us, I change it.
           | 
           | PostgreSQL is used by probably millions of people, but we've
           | had to patch it on multiple occasions and run our fork in
           | production. Nobody wants to do that, but sometimes you have
           | to.
           | 
           | The point is, you can't just say "oh, we use kubernetes so we
           | don't have to think about it". No. You added it to your
           | stack, and now you're responsible for it. Pick technologies
           | that you're able to support if they're abandoned,
           | unresponsive to your feature requests and bug reports, or not
           | interested in your use case. Pick open source, obviously.
           | 
           | This is another reason I like Nix. It's a one-line change to
           | add a patch file to a build to fix an issue. So I can
           | contribute a fix upstream to some project, and then I don't
           | have to wait for their release process, I can use the patch
           | immediately and let go of it whenever the upstream eventually
           | integrates it. It lowers the cost of being responsible for
           | third party software that we depend on.
        
             | Thaxll wrote:
             | > When Kubernetes inevitably has an issue that is a blocker
             | for us, I don't have confidence in my ability to fix it.
             | When an internal python or shell program has an issue that
             | is a blocker for us, I change it.
             | 
              | I doubt that this happens in reality, because Kubernetes
              | covers use cases for pretty much everyone. I would also
              | doubt that a regular dev has the knowledge to mimic a
              | solution that k8s already provides.
        
           | cj wrote:
           | > the rest of the world is improving on those tech
           | 
           | Does that matter if the current stack works just fine?
           | 
           | Imagine someone who built a calculator app with jQuery
           | javascript 10 years ago, and all it does is add and subtract
           | numbers. You could spend time porting it to Ember, and then
           | migrating to Angular, and then porting it to React, and then
           | porting it to React with SSR and hooks.
           | 
           | If the calculator app worked with 15 year old code, you can
           | either leave it be, or try to keep up with latest tech trends
           | and continuously refactor and port the code.
           | 
           | I think there are a lot of "calculator app" type components
           | of many systems that, at a certain point, can be considered
           | "done, complete" without ever needing a rebuild or rewrite.
           | Even if they were built on technologies that might now be
           | considered antiquated.
        
             | Thaxll wrote:
              | It works fine until the core people who know that stack
              | leave your company and no one wants to touch it. Then the
              | new people come in and pick a solution that is widely
              | used.
        
             | Thaxll wrote:
             | At some point you just don't want to re-invent the wheel,
             | reminds me a lot of:
             | https://en.wikipedia.org/wiki/Not_invented_here
        
             | hinkley wrote:
             | The people who are invested in their home-grown solutions
             | are usually pretty bad at doing tech support for them. They
             | enjoy writing code, and it doesn't take long before you've
             | created enough surface area that you couldn't possibly keep
             | up with requests even if you wanted to.
             | 
             | Which they don't, because nothing has convinced me of the
             | crappiness of some of my code more thoroughly than watching
             | other people try and fail to use it. They are a mirror and
             | people don't always like what they see in it. That guy who
             | keeps talking trash about how stupid everyone else is for
             | not understanding his beautiful code is deflecting. If
             | they're right then my code does not have value/isn't smart
              | which means _I_ don't have value/am not smart, so clearly
             | the problem is that I'm surrounded by assholes.
             | 
             | I don't think we would be using as much 3rd party software
             | without StackOverflow, and almost nobody creates a
             | StackOverflow for internal tools. Nobody writes books for
             | internal tools. Nobody writes competitors for internal
             | tools which either prove or disprove the utility of the
             | original tool. All of those options are how some people
             | learn, and in some cases how they debug. You're cutting off
             | your own nose when you reinvent something and don't invest
             | the time to do all of these other things. If I lose someone
             | important or get a big contract I didn't think I could win,
             | I can't go out and hire anyone with 3 years experience in
             | your internal tool, trading money for time. I have to hire
             | noobs and sweat it out while we find out if they will ever
             | learn the internal tools or not.
             | 
             | The older I get the more I see NIH people as working a
             | protection racket, because it amplifies the 'value' of
             | tenure at the company. The externalities created really
             | only affect new team members, delaying the date when you
             | can no longer dismiss their ideas out of hand. You have in
             | effect created an oligarchy, whereas most of the rest of us
             | prefer a representative republic (actual democracy is too
             | many meetings).
        
               | p10_user wrote:
               | Nice post. I wish I was reading this 6 years ago, but at
               | least I quickly learned to be humble about the quality of
               | my code.
        
         | deathanatos wrote:
         | > _Kubernetes has mostly focused on stateless applications so
          | far. That's the easy part! The hard part is managing
         | databases._
         | 
         | Kubernetes can absolutely host stateful applications. A
         | StatefulSet is the app-level construct pretty much intended for
         | exactly that use case. My company runs a distributed database
         | on top of Kubernetes.
         | 
         | (We have another app that uses StatefulSets to maintain a cache
         | across restarts, so that it can come up quickly, as it
         | otherwise needs to sync a lot of data. But, it is technically
         | stateless: we could just sync the entire dataset, which is only
         | ~20 GiB, each time it starts, but that is wasteful, and it
         | makes startup, and thus deployments of new code, quite slow. It
         | would also push the limits of what an emptyDir can accommodate
         | in our clusters' setup.)
         | 
         | (The biggest issue we've had, actually, with that, is that
         | Azure supports NVMe, and Azure has AKS, but Azure -- for God
         | only knows what reason -- apparently doesn't support combining
         | the two. We'd love to do that, & have managed k8s & nice disks,
         | but Azure forces us to choose. This is one of my chief gripes
         | about Azure's services in general: they do not compose, and
         | trying to compose them is just fraught with problems. But,
         | that's an _Azure_ issue, not a Kubernetes one.)
        
         | tharne wrote:
         | > What works for us is to do the simplest thing that works,
         | then iterate.
         | 
         | The older I get, the more often I'm reminded that this un-sexy
         | approach is really the best way to go.
         | 
         | When I was younger, I always thought the old guys pushing
         | boring solutions just didn't want to learn new things. Now I'm
          | starting to realize that after several decades of experience,
          | they had simply been burned enough times to learn a thing or
          | two, and had developed much better BS-detectors than
          | 20-something me.
        
           | tra3 wrote:
           | Choose boring technology [0].
           | 
           | For me, the choice is a trade off between the journey and the
           | destination.
           | 
           | Destination is the final objective of your project, business
           | value or whatever.
           | 
           | The journey is tinkering with "stuff".
           | 
           | Depending on the project there's value and enjoyment in both.
           | 
           | [0]: https://mcfunley.com/choose-boring-technology
        
             | mateuszf wrote:
              | > Choose boring technology
             | 
             | Doesn't fit in this case, I'd say Nix is still cutting
             | edge.
             | 
             | It may also be simple, but it's not easy (in the Rich
             | Hickey sense).
        
               | spicybright wrote:
               | Can you elaborate more? From what I've generally heard,
               | Nix has been a good foundation for a lot of companies.
        
               | mateuszf wrote:
               | As far as I know the final effect is great - getting
               | reproducible, statically linked programs.
               | 
               | However, learning Nix / NixOS is quite difficult - it
               | requires a specific set of skills (functional
               | programming, shell, Nix building blocks which you can
               | learn from existing Nix code), the documentation is
               | lacking, error messages are cryptic, debugging support is
               | bad.
        
               | Ericson2314 wrote:
               | Yeah Nix is why "choose boring" is a bit of a misnomer.
               | 
               | The truth is tautologically unhelpful: choose _good_
               | technologies.
               | 
               | Most people have trouble separating snake oil and fads
               | from good things, so old/boring is a safe heuristic.
               | 
               | Nix is not boring, but it is honest. It's hard because it
                | doesn't take half measures. It's not an easy learning
                | curve gaslighting you into a dead end.
               | 
               | But taste in choosing good technologies is hard to teach,
               | especially when people's minds are warped by a shitty
               | baseline.
        
               | mountainriver wrote:
               | Thank you, I've always hated the Boring Technology idiom.
               | The right tool doesn't have to be boring, but it
               | certainly can be
        
               | charcircuit wrote:
               | Nix isn't a good option for developing rails
               | applications. The nix ecosystem doesn't have good support
               | for ruby meaning you will waste a lot of time and run
               | into a lot of issues getting stuff to work. That's not
                | boring. An example of being boring would be using Nix for
               | something like C where the ecosystem around it is already
               | built up.
               | 
               | I was really pro Nix at the time I took on a rails
               | project and it caused many problems for my team due to
               | Nix just not having good support for my use case. If I
               | could go back in time I would definitely choose something
               | else.
        
               | dqv wrote:
               | What was the problem you were having with Ruby (on Rails)
               | and Nix? It wasn't ergonomic for me, but I was able to
               | get things to work.
        
               | tharne wrote:
                | > It's not an easy learning curve gaslighting you into a
                | dead end.
               | 
               | This is a really good way of describing a certain type of
               | technology. I'm stealing this line.
        
           | 0x20cowboy wrote:
            | Jonathan Blow had a great take on this that really spoke to
           | me. I can't do it justice, but paraphrasing as best I can:
           | 
           | Get the simple to understand, basic thing working and push
           | the harder refactor / abstraction until later. Leave the
           | fancy stuff for a future engineer who better understands the
           | problem.
           | 
           | That future engineer is you with more experience with the
           | actual problem.
        
             | xmprt wrote:
             | The way I see it, I can either prematurely use the complex
             | solution where I don't fully understand the problem or the
             | solution OR I use the simple solution and learn why the
             | complex solution is needed and more importantly what
             | parameters and specific ways I can best leverage the
             | complex solution for this specific problem.
        
               | chousuke wrote:
               | Simple solutions are also easier to refactor into more
               | complex solutions when the complexity becomes necessary.
               | Going the other way is _much_ harder.
               | 
               | In simple systems, you often have the option of simply
               | throwing away entire components and remaking them
               | _because_ the system is simple and the dependencies
               | between components are still clear.
        
               | toast0 wrote:
               | Sometimes the simple solution works well enough for as
               | long as you need a solution, too. Often times, it just
               | works, or the problem changes, or the company changes.
        
             | mikepurvis wrote:
             | This works well if you know you're building it all yourself
             | regardless. I think the calculus does change a bit though
             | if you have frameworks available in your problem domain.
             | 
             | Like, there's no point in scratching away at your own web
             | CMS just to prove to yourself that you really do need Rails
             | or Django. Especially when part of the purpose of a
             | framework is to guide your design toward patterns which
             | will fit with the common, preexisting building blocks that
             | you are likely to end up needing. If you do the design
             | separately, you risk ending up in a place where what you've
             | built isn't really suited to being adapted to other stuff
             | later.
             | 
             | For a humble example of this from myself, I built an
             | installer system for Linux computers-- initially very
             | simple, just a USB boot, perform partitioning, unpack a
             | rootfs tarball, and set up the bootloader. Then we added
             | the ability to do the install remotely using kexec. Then we
             | added an A/B partition scheme so there would always be
             | fallback. It got more and more complicated and fragile, and
             | the market around us matured so that various commercial and
             | OSS solutions existed that hadn't when we first embarked on
             | this effort. But the bigger and less-maintainable the
             | homegrown system became, the more features and capabilities
              | it had, which made it difficult to imagine ditching it in
             | favour of something off the shelf. There isn't really a
             | moral to this story other than it would be even worse if
             | we'd built the in-house thing without doing a thorough
             | evaluation of what was indeed already out there at the
             | time.
        
           | caffeine wrote:
           | Being really good at using boring tech is better than being a
           | noob with cool tech.
           | 
           | So as you get older and you get more competent, the relative
           | benefits of boring tech grow.
           | 
           | Whereas if you're a noob at everything, being a noob with
           | cool tech is better.
           | 
           | If the problems are hard enough, and you are competent, you
           | iterate on boring tech for a while to solve the hard
           | problems, and then it becomes cool tech.
        
           | usaphp wrote:
           | Or it could be that you just became old and don't want to
           | learn new things anymore ))
        
             | basisword wrote:
             | Or their experience helps them have the wisdom to determine
             | which new things are worth learning and which are a waste
             | of time.
        
             | CuriouslyC wrote:
             | Some old things really are better, or at least not worse
             | and already familiar.
        
             | tomc1985 wrote:
             | Such ageism.
        
             | rattlesnakedave wrote:
             | This certainly seems more common. Every time I've had to
             | refactor 20 year old cruft it's been because of engineers
             | with the launch-and-iterate mentality 20 years ago, that
             | stopped caring about the "iterate" part once they gained
             | enough job security.
        
             | goodpoint wrote:
             | Such "new things" are just a big bunch of unnecessary
             | complexity.
        
               | rattlesnakedave wrote:
                | Or is it cope for a loss of neuroplasticity? Has to be
               | evaluated case by case. "Experience" doesn't count for
               | much when evaluating new technical tooling. Landscape
               | shifts far too much.
        
               | tomc1985 wrote:
               | The landscape doesn't need to shift nearly as often as it
               | does. But we are an industry obsessed with hiring magpie
               | developers as cheaply as we can, and those magpie
               | developers demand cool merit badges to put on their
               | resumes, and here we are.
               | 
               | "Everything old is new again" does not BEGIN to do tech
               | justice. Compared to nearly every other profession on the
               | planet, tech knowledge churn is like 10x as fast.
        
               | goodpoint wrote:
               | > tech knowledge churn is like 10x as fast.
               | 
               | Spot on! Moreover, every time the wheel is unnecessarily
               | reinvented previous lessons are lost.
               | 
               | What is even worse is that people really want to reinvent
               | the wheel and get defensive if you point that out.
               | 
               | Comments like "Or is cope for a loss of neuroplasticity"
               | are a good example.
        
               | tomc1985 wrote:
               | It's outright naked ageism.
               | 
               | I have been studying a different trade, completely
               | unrelated to software engineering or tech, and by far the
               | weirdest thing is reading books or watching seminars from
               | 20 or 30 years ago that _are still relevant_. How much of
               | technology writing has that honor? Very little.
               | 
               | Like I have been obsessed with tech and computers my
               | whole life, I studied computer stuff when all the other
               | kids were outside playing football or chasing girls or
               | whatever, I self-taught to a very high level of expertise
               | and that knowledge has just been getting systematically
               | ripped away. Which is fine, to an extent, gotta embrace
               | change and yada yada yada. But I no longer have the
               | patience to grind to expertise-level on something over a
               | period of years only to find out the industry has moved
               | on to something completely different to do the exact same
               | thing! All because some starry-eyed shit somewhere in
               | FAANG doesn't like something!
        
               | repomies69 wrote:
               | Sometimes yes. But it is impossible to know for sure. New
               | things still get adopted, some of those new things
               | actually stick and provide value
        
               | spicybright wrote:
               | And at the very least, being old lets you be able to
               | evaluate these new things more accurately.
        
               | rattlesnakedave wrote:
               | How?
        
               | goodpoint wrote:
               | It obviously does, but experience is not the only way:
               | you can always read what people wrote and did in the
               | past. It only takes some humility.
        
             | jenkstom wrote:
             | This is the antivax sentiment of the IT world. It's new!
             | It's shiny! Give me some Ivermectin because I know things
             | other people don't! (Yes, I realize you're probably joking.
             | I'm not, particularly.)
        
               | ThePowerOfFuet wrote:
               | What does ivermectin have to do with anything? Are you
               | suffering from chronic parasitic infection?
        
               | rattlesnakedave wrote:
               | You have it reversed. "I don't want the new vaccine! Who
               | knows what the long term side effects are!" is equivalent
               | to the technical curmudgeon. The voluntary early trial
               | test group are the bleeding edge tech people. The
               | Ivermectin crowd is the guy that suggests rewriting the
               | backend in Haskell for no good reason.
        
         | mamcx wrote:
          | Any hints on how to use nix with DBs?
          | 
          | I have a semi-docker setup where I stay with the base OS +
          | PostgreSQL and the rest is docker. But that means that
          | upgrading the OS/DB is not that simple.
          | 
          | I tried dockerizing PG before, but the OS still needs
          | management anyway, and it was harder to use
          | diagnostics/psql/backups with docker....
        
         | barbazoo wrote:
         | > Containers only give reproducible deployments, not builds
         | 
         | Could someone elaborate on this please? Doesn't it depend
         | entirely on your stack how reproducible your build is? Say I
         | have a Python app with its OS level packages installed into a
         | (base) image and its Python dependencies specified in a
         | Pipfile, doesn't that make it pretty reproducible?
         | 
         | Is the weak spot here any OS dependencies being installed as
         | part of the continuous build?
        
           | jffry wrote:
           | > Containers only give reproducible deployments, not builds
           | 
           | I think that means containers alone are insufficient for
           | creating reproducible builds, not that containers make
           | reproducible builds impossible.
        
             | lazide wrote:
             | I believe that is correct. It's also hard to make container
             | builds really reproducible (even with other parts of the
             | build being so). Some sibling comments have talked about
             | why.
             | 
             | Containers are to ops like C is to programming languages.
             | Man is it useful, but boy are there footguns everywhere.
             | 
             | Usually still better than the ops equivalent of assembly,
             | which is what came before though - all the custom manual
             | processes.
             | 
             | Kubernetes is maybe like early C++. A bit weird, still has
              | footguns, and is it really helping? Mostly it is, but we're
              | also discovering many more new failure modes and having to
              | learn some new complex things.
             | 
             | As it matures, new classes of problems will be way easier
             | to solve with less work, and most of the early footguns
             | will be found and fixed.
             | 
             | There will be new variants that hide the details and de-
             | footgun things too, and that will have various trade offs.
             | 
             | No idea what will be the equivalent of Rust or Java yet.
        
               | thinkharderdev wrote:
               | I love this analogy
        
           | smilliken wrote:
           | You already got a few good answers, but I'll echo them: you
           | can do reproducible builds in containers, and nothing's
           | stopping you from using nix inside containers. But you're at
           | the mercy of all the different package managers that people
           | will end up using (apt, npm, pip, make, curl, etc). So your
           | system is only as good as the worst one.
           | 
           | I inherited a dozen or so docker containers a while back that
           | I tried to maintain. Literally none of them would build--
           | they all required going down the rabbit hole of
           | troubleshooting some build error of some transitive
           | dependency. So most of them never got updated and the problem
           | got worse over time until they were abandoned.
           | 
           | The reason Nix is different is because it was a radically
           | ambitious idea to penetrate deep into how all software is
           | built, and fix issues however many layers down the stack it
           | needed to. They set out to boil the ocean, and somehow
           | succeeded. Containers give up, and paper over the problems by
           | adding another layer of complexity on top. Nix is also
           | complex, but it solves a much larger problem, and it goes
           | much deeper to address root causes of issues.
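            | 
            | For the "nix inside containers" route, a rough sketch (the
            | binary name is made up, and the repo's default.nix is
            | assumed to pin its own nixpkgs):
            | 
            |     # pin this base to a tag/digest in practice
            |     FROM nixos/nix
            |     WORKDIR /src
            |     COPY . .
            |     # everything default.nix pulls in is pinned by
            |     # hash, so the result doesn't drift over time
            |     RUN nix-build default.nix -o result
            |     CMD ["/src/result/bin/app"]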
        
             | nunez wrote:
             | I don't know Nix and can't comment on that, but in my
             | experience, when I've inherited containers that couldn't
              | build, this was usually due to the image being orphaned
              | from its parent Dockerfile (i.e. someone wrote a
              | Dockerfile, pushed an image from said Dockerfile, but never
              | committed the Dockerfile anywhere, so now the image is
              | orphaned and unreproducible) or due to the container being
              | mutated after being brought up with `docker exec` or
              | similar.
             | 
             | Assuming that the container's Dockerfile is persisted
             | somewhere in source control, the base image used by that
             | Dockerfile is tagged with a version whose upstream hasn't
             | changed, and that the container isn't modified from the
             | image that Dockerfile produced, you get extremely
             | reproducable builds with extremely explicit dependencies
             | therein.
             | 
             | That said, I definitely see the faults in all of this (the
             | base image version is mutable, and the Dockerfile schema
             | doesn't allow you to verify that an image is what you'd
             | expect with a checksum or something like that, containers
             | can be mutated after startup, containers running as root is
             | still a huge problem, etc), but this is definitely a step
             | up from running apps in VMs. Now that I'm typing this out,
             | I'm surprised that buildpacks or Chef's Habitat didn't take
             | off; they solve a lot of these problems while providing
              | similar reproducibility and isolation guarantees.
        
               | tonyarkles wrote:
               | So as a quick example from my past experiences, using an
               | Ubuntu base image is fraught. If you don't pin to a
               | specific version (e.g. pinning to ubuntu:20.04 instead of
               | ubuntu:focal-20220316), then you're already playing with
               | a build that isn't reproducible (since the image you get
               | from the ubuntu:20.04 tag is going to change). If you
               | _do_ pin, you have a different problem: your apt
               | database, 6 months from now, will be out of date and lots
               | of the packages in it will no longer exist as that
                | specific version. The solution is "easy": run an "apt
               | update" early on in your Dockerfile... except that goes
               | out onto the Internet and again becomes non-
               | deterministic.
               | 
               | To make it much more reproducible, you need to do it in
               | two stages: first, pinning to a specific upstream
               | version, installing all of the packages you want, and
                | then tagging the output image. Then you pin to that
               | tag and use that to prep your app. That's... probably...
               | going to be pretty repeatable. Only downside is that if
               | there is, say, a security update released by Ubuntu
               | that's relevant, you've now got to rebuild both your
               | custom base and your app, and hope that everything still
               | works.
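                | 
                | A rough sketch of that two-stage approach (the
                | registry name, tags, and package list are made up):
                | 
                |     # base/Dockerfile -- rebuilt occasionally,
                |     # pushed, then treated as frozen
                |     FROM ubuntu:focal-20220316
                |     RUN apt-get update && \
                |         apt-get install -y --no-install-recommends \
                |             ca-certificates curl && \
                |         rm -rf /var/lib/apt/lists/*
                | 
                |     # app/Dockerfile -- pins to that frozen base
                |     # image, wherever it was pushed
                |     FROM registry.example.com/base:2022-03
                |     COPY myapp /usr/local/bin/myapp
                |     CMD ["myapp"]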
        
             | jka wrote:
             | Do your developers run the same Nix packages that you
             | deploy to production?
             | 
              | (that sounds like an energy-level transition in developer
              | productivity and debugging capability, if so)
        
               | smilliken wrote:
               | Yeah, the environment is bit-for-bit identical in dev and
               | prod. Any difference is an opportunity for bugs.
               | 
                | OK, there's one concession: there's an env var that
                | indicates whether it's a dev or prod environment. We try
                | to use it sparingly. Useful for stuff like not reporting
               | exceptions that originate in a dev environment.
               | 
               | Basically, there's a default.nix file in the repo, and
               | you run nix-shell and it builds and launches you into the
               | environment. We don't depend on anything outside of the
               | environment. There's also a dev.nix and a prod.nix, with
               | that single env var different. There's nothing you can't
               | run and test natively, including databases.
               | 
               | Oh, it also works on MacOS, but that's a different
               | environment because some dependencies don't make sense on
               | MacOS, so some stuff is missing.
        
               | Gwypaas wrote:
               | How do you manage quick iteration loops?
        
           | thinkingkong wrote:
           | The dependencies can change out from underneath you
           | transparently unless you pin everything all the way down the
           | stack. Upstream docker images for example are an easy to
           | understand vector of change. The deb packages can all change
           | minor versions between runs, the npm packages (for example)
           | can change their contents without making a version bump.
           | Theres tons of implicit trust with all these tools, build
           | wise.
        
             | Osiris wrote:
             | npm packages can change without a version change?
             | 
             | Can you explain this?
             | 
             | npm doesn't allow you to delete any published versions (you
             | can only deprecate them). You aren't allowed to publish a
             | version that's already been published.
             | 
             | Even when there have been malicious packages published the
             | solution has been to publish newer versions of the package
             | with the old code. There's no way to delete the malicious
             | package (maybe npm internally can do it?).
        
               | thinkingkong wrote:
               | Sorry, without a minor version change. You can easily
               | publish a patch version and most people don't pin that
               | part of their dependency.
        
               | LunaSea wrote:
               | They don't need to pin it directly.
               | 
               | They only need to "npm ci" (based on package-lock.json)
               | instead of "npm install" (based on package.json) within
               | the Docker container to get a fully reproducible build.
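                | 
                | Roughly like this (image tag and entrypoint are just
                | placeholders):
                | 
                |     FROM node:16
                |     WORKDIR /app
                |     COPY package.json package-lock.json ./
                |     # installs exactly what package-lock.json
                |     # pins; fails if it disagrees with package.json
                |     RUN npm ci
                |     COPY . .
                |     CMD ["node", "server.js"]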
        
           | monocasa wrote:
           | I think that's what they're saying, that containers are
           | orthogonal to reproducible builds.
        
           | q_eng_anon wrote:
           | https://github.com/GoogleContainerTools/distroless
           | 
           | This is the container community's response to this ambiguity
           | I think.
        
           | hinkley wrote:
           | This is something that even the Docker core team is not
           | entirely clear on (either they understand it and can't
           | explain it, or they don't understand it).
           | 
           | The RUN command is Turing complete, therefore cannot be
           | idempotent. Nearly every usable docker image and base image
            | includes the RUN command, often explicitly to do things that
           | are not repeatable, like "fetch the latest packages from the
           | repository".
           | 
           | This is all before you get to application code which may be
           | doing the same non-repeatable actions with gems or crates or
           | maven or node modules, or calling a service and storing the
           | result in your build. Getting repeatable docker images
           | usually is a matter of first getting your build system to be
           | repeatable, and then expanding it to include the docker
           | images. And often times standing up a private repository so
           | you have something like a reliable snapshot of the public
           | repository, one you can go back to if someone plays games
           | with the published versions of things.
        
             | treis wrote:
             | >Getting repeatable docker images
             | 
             | But there's no real need for repeatable build docker
             | images. You copy the image and run it where ever you need
             | to. The entire point of Docker is to not have to repeat the
             | build.
        
               | hinkley wrote:
               | If you could explain that to the Docker team you would be
               | performing a great humanitarian service.
               | 
                | For many years they have closed issues and rejected PRs
                | based on the assertion that Dockerfile output needs to be
                | repeatable (specifically, that the requested feature or
                | PR makes it not repeatable). It's not, it can't be
                | without a time machine, and if it were, I believe you'd
                | find that achieving traction would have been more
                | difficult, even impossible.
        
             | meatmanek wrote:
             | pedantic nit-pick that doesn't detract from your main
             | point: Turing completeness doesn't imply that it can never
             | be idempotent. In fact we'd expect that a particular Turing
             | machine given a particular tape should reliably produce the
             | same output, and similarly any Turing-complete program
             | given the exact same input should produce the exact same
             | output. Turing machines are single-threaded, non-networked
             | machines with no concept of time.
             | 
             | The main reason the RUN statement isn't idempotent, in my
             | opinion, is that it can reach out to the network and e.g.
             | fetch the latest packages from a repository, like you
             | mention. (Other things like the clock, race conditions in
             | multi-threaded programs, etc. can also cause RUN statements
             | to do different things on multiple runs, but I'd argue
             | those are relatively uncommon and unimportant.)
        
               | dilyevsky wrote:
               | Yup, the problem is not that RUN is Turing Complete the
               | problem is that it's non-hermetic.
        
         | rootusrootus wrote:
         | > But I'd so often dive in to understand the details and be
         | disillusioned about how much complexity there is for relatively
         | little benefit.
         | 
         | I feel similarly about many cloud services in e.g. AWS. I get
         | where it's sometimes handy, but the management overhead can get
         | pretty insane. I used to hear people say "just do X in AWS and
         | ta-da you're done!" only to find out that in many cases it
         | isn't actually a net improvement until you are doing a _lot_ of
         | the same thing. And then it costs $$$$$$.
        
           | jmt_ wrote:
           | Totally agree. Something I find myself thinking/saying more
           | and more is "don't underestimate what you can accomplish with
           | a $5 VPS". Maybe a little hyperbolic, but computers are
           | really fast these days.
           | 
           | Elastic resources are pretty magical when you need them, but
           | the majority of software projects just don't end up requiring
           | as many resources as devs/management often seem to assume.
           | Granted, migrating to something like AWS, if you do end up
           | needing it, is a pain, but so is adding unnecessary layers of
           | AWS complexity for your CRUD app/API backend.
           | 
           | Also wonder how many devs came up on AWS and prefer it for
           | familiarity and not having to know and worry too much about
           | aspects of deployment outside of the code itself (i.e not
           | having to know too much about managing postgres, nginx, etc).
        
         | higeorge13 wrote:
         | It sounds interesting. You mentioned big data; what is your
         | tech stack other than postgres?
        
           | smilliken wrote:
           | Python, PostgreSQL, Nix, Linux. Javascript and Typescript for
           | frontend. Everything else is effectively libraries. After
           | 300k lines of Python, with the occasional C extension, we
           | haven't found any limit.
        
         | sideway wrote:
         | Out of curiosity, what industry is your company in?
        
           | smilliken wrote:
           | Details in bio.
        
         | disintegore wrote:
         | I asked the question without context because I didn't want to
         | fire the thread off in the wrong direction but I suppose in a
         | comment chain it's fine. For what it's worth we do use
         | containers and K8s heavily at my current job.
         | 
         | I know that there are quite a few people opposed to the state
         | of containers and the technologies revolving around them. I
         | don't think the arguments they present are bad. [Attacking the
         | premise](https://drewdevault.com/2017/09/08/Complicated.html)
         | isn't particularly hard. What I don't see a lot of, however, is
         | _alternatives_.  "Learn how to do ops" is not exactly pertinent
         | when most documentation on the subject will point you towards
         | containers.
         | 
         | In addition the whole principle, while _clearly_ proven, does
         | strike me as a patch on existing platforms and executable
          | formats that weren't designed to solve the sorts of problems
         | that we have today. While efforts could have been made to make
         | software universally more portable it seems we opted for
         | finding the smallest feasible packaging method with already
         | existing technology and have been rolling with it for a decade.
         | 
          | So essentially I'm interested in knowing how people find ways
         | to reproduce the value these technologies offer, what they
         | instead rely on, which things they leave on the table (eg "we
         | work with petabytes of data just fine but deploying updates is
         | a nightmare"), how much manual effort they put into it, etc.
         | New greenfield projects attempting to replace the fundamentals
         | rather than re-orchestrate them are also very pertinent here.
        
           | SQueeeeeL wrote:
            | >So essentially I'm interested in knowing how people find
           | ways to reproduce the value these technologies offer, what
           | they instead rely on, which things they leave on the table
           | 
           | Decades of work typically. That or just standard parallel
           | deployments, most things Docker is good at would take the
           | average developer many unnecessary hours to reproduce.
        
           | anamax wrote:
            | > So essentially I'm interested in knowing how people find
           | ways to reproduce the value these technologies offer, what
           | they instead rely on, which things they leave on the table
           | 
           | It's interesting that you're not interested in what they get
           | by not using containers...
        
             | disintegore wrote:
             | What? That's the entire point.
        
           | javajosh wrote:
           | What you're asking for is an essay on comparative devops
           | architectures, with a focus on k8s alternatives. I think what
           | you'll find is a lot of ad hoc persistent systems that tend
           | to drift over time in unpredictable ways, and take on the
           | feel of a public lobby if you're being generous, a public
           | restroom if you're not. So what you're asking is really a
           | sample of these ad hoc approaches. What I think you'll find
           | are a few inspired gems, and about 1000 bad implementations
           | of K8s, Docker, Salt, Terraform and all the rest.
           | 
           | The case that always interested me is the service that can
           | really fit on one machine. Machines are truly gigantic in
           | terms of memory and disk. Heck, I worked at BrickLink which
           | ran on two beefy machines that serviced 300k active users
           | (SQL Server is actually quite good, it turns out). (A single
           | beefy server is a really great place to start iterating on
           | application design! PostgreSQL + Spring Boot (+ React) is a
           | pretty sweet stack, for example. By keeping it simple, there
           | are so many tools you just don't need anymore, because their
           | purpose is to recombine things you never chose to split. I
           | can't imagine why you'd need more than 16 cores, 128G of RAM
           | and 10T of storage to prove out an ordinary SaaS. That is a
           | ferocious amount of resources.)
        
             | anonymousDan wrote:
             | Availability?
        
               | javajosh wrote:
               | Why isn't "we're going to risk downtime to speed up our
               | BTD loop for cheap" a good answer?
        
               | NomDePlum wrote:
               | I've made that very decision.
               | 
                | Partly forced due to internal resource constraints.
                | However, choosing to get a working system out and tested
                | with real users, instead of waiting an undefined time for
                | high availability, didn't lose me any sleep.
               | 
               | It's also the case that some systems can withstand a
               | degree of downtime that others can't, or it's not worth
               | paying the cost for the perceived benefit gained.
        
         | david38 wrote:
         | Ehhh, I run a high performance database system in Kubernetes
         | and it works great. It's distributed, uses EBS volumes. That's
         | about as opposite of stateless as it gets.
        
           | tetha wrote:
           | Do note that you have offloaded a good chunk of state
           | management into EBS volumes via CSI. Attaching the CSI
           | volumes is one thing, running the disks hosting the volumes
           | is another thing.
        
         | brimble wrote:
         | > That's the easy part! The hard part is managing databases.
         | 
         | Ding ding ding ding ding.
         | 
         | "But what about disk?" (so, relatedly, databases) is the hard
         | part. Balancing performance (network disks suuuuuck) and
         | flexibility ("just copy the disk image to another machine" is
         | fine when it's a few GB--less useful when it's _lots_ of GB and
          | you might need to migrate it _to another city_ and also you'd
         | rather not have much downtime) is what's tricky. Spinning up
         | new workers, reliable- and repeatable-enough deployment with
         | rollbacks and such, that's easy, _until_ it touches the disk
          | (or the DB schema, that's its own whole messy dance to get
         | right)
        
           | fer wrote:
           | Sometimes I think K8s is largely pushed for
           | Amazon/Google/Microsoft to sell the disk that goes along with
           | it.
        
             | zomglings wrote:
             | Don't forget network costs (especially for "high
             | availability" clusters spanning multiple regions).
        
           | zthrowaway wrote:
            | People still think stateful things are impossible on k8s, but
            | StatefulSets and persistent volumes solve a lot of this.
           | You should be relying on out of the box DB replication to
           | make sure data is available in multiple areas. This is no
           | different on other platforms.
        
             | hawk_ wrote:
             | Can you elaborate on "multiple areas"? Do you mean each
             | node inside a db "cluster" should run in a different area?
             | And how does one achieve that?
        
             | mountainriver wrote:
              | Yes, you can run DBs on kube now. Much of the belief that
              | this isn't a good idea comes from years back when it
              | wasn't.
        
       | freedomben wrote:
        | At a previous company we built a new AMI for each prod release
        | and used EC2 auto scaling groups. I much prefer k8s, but
       | that method worked fine since we were already vendor-locked to
       | AWS for other reasons.
       | 
       | I'm not sure what you mean by:
       | 
       | > _without relying on the specific practice of shipping software
       | with heaps of dependencies._
       | 
       | Do you mean like Heroku? That usually gets expensive really
       | quickly as you scale.
        
       | chewmieser wrote:
       | AWS-specific but prior to our migration to ECS containers we used
       | their OpsWorks service. This worked reasonably well - we would
        | set up clusters of servers with specific jobs and autoscaling
       | groups would spin up servers to meet demand using Chef cookbooks
       | to set them up.
       | 
       | We used a bash script to handle what we now use GitLab's CI
       | system for. Deployments were handled through CodeDeploy and
       | infrastructure would be replaced in a blue/green fashion.
        
       | andrewfromx wrote:
        | Container technologies at Coinbase: Why Kubernetes is not part
        | of our stack
       | 
       | By Drew Rothstein, Director of Engineering
       | 
       | https://blog.coinbase.com/container-technologies-at-coinbase...
        
       | kuon wrote:
       | We use ansible on bare metal (no VM) to manage about 200 servers
       | in our basement. We use PXE booting to manage the images. We use
       | a customized arch linux image and we have a few scripts to select
       | what feature we'd like. It's "old school" but it's been working
       | fine for nearly 20 years (we used plain scripts before ansible,
       | so we always used the "agentless" approach). Our networking stack
       | uses OpenBSD.
        
         | yabones wrote:
         | That sounds really interesting. If you have a write-up about
         | how it's built, the decisions that went into it, and the
         | problems you've had to solve, I would absolutely read it!
        
           | kuon wrote:
           | We have no public writing, but that could be interesting.
        
       | lowbloodsugar wrote:
       | Is your question "How do I write software without tons of
       | dependencies?" or "How do I ship software that has tons of
       | dependencies but without shipping the dependencies with them?" or
       | "How do I ship software with tons of dependencies without using
       | containers or VMs?"
        
       | wizwit999 wrote:
       | Most AWS dataplane services are on ALB + EC2, some of the newer
       | higher level ones do use containers.
        
       | donatj wrote:
       | We're still nginx + PHP + Aurora something like 30 million users
       | in and ten years later. Horizontally scales beautifully. Couple
       | small microservices but no containers outside of CI.
        
       | nailer wrote:
       | Probably most people? Newer apps often use MicroVMs, older apps
       | often use Xen VMs. Containers aren't the only containment
       | mechanism and some implementations are known as complex time
       | sinkholes.
        
         | Melatonic wrote:
          | A surprising number of places are still just running VMs -
          | especially hybrid cloud or "on prem" type setups. If what you
         | have already works and your staff is trained for that type of
         | environment then I personally do not think it makes sense to
         | switch unless there is a big advantage.
        
       | cespare wrote:
       | We run on thousands of EC2 instances and our biggest systems
       | operate at millions of requests/sec. No containers*. We use EC2,
       | Route53, S3, and some other AWS stuff, plus custom tooling built
       | on their APIs. Most of our code is Go or Clojure so deployments
       | generally consist of self-contained artifacts (binary or jar)
       | plus some config files; there's little to no customization of the
       | instance for the application.
       | 
       | *Well we do have an in-house job queue system that runs jobs in
       | Linux namespaces for isolation. But it doesn't use Docker or
       | whole-OS images at all.
        
         | eckesicle wrote:
         | Would you mind naming the company you work for? It sounds like
         | a really nice place to work.
        
           | cespare wrote:
           | I work at Liftoff (https://liftoff.io/). It is indeed a great
           | place to work! Join me :) https://liftoff.io/company/careers/
        
       | zemo wrote:
       | when I was at Jackbox we ran the multiplayer servers without
       | containers and handled hundreds of thousands of simultaneous
       | websocket connections (a few thousand per node). The servers were
       | statically compiled Go binaries that took care of their own
       | isolation at the process level and didn't write to disk, they
       | were just running as systemd services. Game servers are
       | inherently stateful, they're more like databases than web
       | application layer servers. For large audience games I wrote a
       | peering protocol and implemented a handful of CRDT types to
       | replicate the state, so it was a hand-rolled distributed system.
       | Most things were handled with chef, systemd, terraform, and aws
       | autoscaling groups.
        
         | pipe_connector wrote:
         | Interesting -- how did you handle redeploys? Given that game
         | servers are stateful (so you'd want to drain servers at their
         | own pace instead of force them down at a specific time), it
         | seems like redeploying a server without machinery to do things
         | like dynamically allocate ports/service discovery for an
         | upstream load balancer would be tricky.
        
           | zemo wrote:
           | > it seems like redeploying a server without machinery to do
           | things like dynamically allocate ports/service discovery for
           | an upstream load balancer would be tricky
           | 
           | like most things with running servers, it's not that hard,
           | there's just an industry dedicated to making people think
           | it's hard (the tech industry).
           | 
           | Every game has a room code to identify the game, and a
           | websocket open handshake has a URL in it. Every room listens
           | on a unix domain socket and nginx just routes the websocket
           | to the correct domain socket based on the URL.
        
             | pipe_connector wrote:
             | How do you handle deploying a new version of nginx without
             | forcibly closing connections?
        
               | zemo wrote:
               | you don't. That requires new nodes, so you have to drain
               | nodes and replace them with new nodes that have the
               | upgrade installed before adding them to the pool that
               | hosts games. That process sucked but how often are you
               | upgrading nginx?
        
               | pipe_connector wrote:
               | Very often if your load balancer is custom. For example,
               | we have an edge service that fulfils the role you have
               | nginx for, but it handles both websockets and raw tcp
               | traffic. Our edge service is the gateway for
               | authentication and authorization -- from that service we
               | can connect users to chat rooms, matchmaking, or actual
               | game instances. We could get away with just nginx + room
               | ids and manual upgrades to the pool of
               | game/matchmaking/chat services for internal traffic
               | possibly, but we ship updates to our edge service all the
               | time and that process needs to be painless, so I was
               | curious how others have done this.
               | 
               | We're currently using systemd sockets with Accept=no,
               | then multiple edge services can accept() from the same
               | socket that's always open and bound to a known port. Once
               | a new service has started, we can signal the old service
                | to shut down for however long it needs by no longer
               | listening on the socket and letting connections drain
               | naturally. We're thinking about changing to dynamically
               | allocated ports/sockets which is pretty natural in the
               | container orchestration world.
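                | 
                | For what it's worth, the accept-from-an-inherited-socket
                | part of that is small in Go; a rough sketch, using the
                | real github.com/coreos/go-systemd activation package
                | (names and the drain timeout are invented):
                | 
                |     package main
                | 
                |     import (
                |         "context"
                |         "log"
                |         "net/http"
                |         "os"
                |         "os/signal"
                |         "syscall"
                |         "time"
                | 
                |         "github.com/coreos/go-systemd/v22/activation"
                |     )
                | 
                |     func main() {
                |         // systemd owns the bound socket (Accept=no); the
                |         // service only inherits the file descriptor, so
                |         // restarts never close the port.
                |         listeners, err := activation.Listeners()
                |         if err != nil || len(listeners) == 0 {
                |             log.Fatal("no socket from systemd: ", err)
                |         }
                | 
                |         srv := &http.Server{Handler: http.DefaultServeMux}
                |         go srv.Serve(listeners[0])
                | 
                |         // On SIGTERM stop accepting; hijacked (websocket)
                |         // connections still need their own drain logic.
                |         sig := make(chan os.Signal, 1)
                |         signal.Notify(sig, syscall.SIGTERM)
                |         <-sig
                | 
                |         ctx, cancel := context.WithTimeout(
                |             context.Background(), 10*time.Minute)
                |         defer cancel()
                |         srv.Shutdown(ctx)
                |     }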
        
       | ryanjkirk wrote:
       | Up until recently, I was the steward of a large, distributed, and
       | profitable app that used no containers. The infrastructure was
       | managed with packages and puppet, and it worked well.
        
       | freemint wrote:
        | Almost every HPC center. Tech stack: Linux (RHEL-like); MPI as
        | middleware for distributed communication over vendor-specific
        | communication hardware (also called the interconnect); a shared
        | high-performance network filesystem, usually set up on the login
        | node; and a scheduler like SLURM, IBM Spectrum LSF Suites, or
        | others to launch jobs from the login node, which is accessed via
        | SSH. This setup scales to tens of thousands of machines.
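        | 
        | For illustration, a job submitted from the login node typically
        | looks like this (node counts, time limit, and the binary name
        | are placeholders); it gets queued with "sbatch job.sh" and
        | watched with "squeue":
        | 
        |     #!/bin/bash
        |     #SBATCH --job-name=my_sim
        |     #SBATCH --nodes=64
        |     #SBATCH --ntasks-per-node=128
        |     #SBATCH --time=04:00:00
        | 
        |     # srun launches the MPI ranks across the allocated nodes
        |     srun ./my_mpi_app input.dat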
        
       | q3k wrote:
       | Depends what you mean by 'container runtime' or 'container
       | orchestration tool'...
       | 
       | For example, Google's Borg absolutely uses Linux namespacing for
       | its workloads, and these workloads get scheduled automatically on
       | arbitrary nodes, but this doesn't feel at all like Docker/OCI
        | containers (i.e., no whole-filesystem image, no private IP address
       | to bind to, no UID 0, no control over passwd...). Instead, it
       | feels much closer to just getting your binary/package installed
       | and started on a traditional Linux server.
        
         | isseu wrote:
          | Yeah was gonna post this. Not sure if it's up to date enough,
          | but it's worth reading the Borg paper
         | https://research.google/pubs/pub43438/
        
         | sitkack wrote:
          | Install debs into a dynamically provisioned container and it
          | will feel similar.
        
         | menage wrote:
         | > no whole-filesystem image
         | 
         | At least in the past, almost all jobs ran in their own private
         | filesystem - it was stitched together in userspace via bind
         | mounts rather than having the kernel do it with an overlayfs
         | extracted from layer tar files (since overlayfs didn't exist
         | back then), but the result was fairly similar.
         | 
          | Most jobs didn't actually request any customization, so they
          | ended up with a filesystem that looked a lot like the node's
          | filesystem but with most of it mounted read-only. But e.g. for
          | a while anything running Java needed to include in its job
          | definition an overlay that updated glibc to an appropriate
          | version, since the stock Google Red Hat image was really old.
        
         | [deleted]
        
       | songeater wrote:
       | As a longtime pseudo-lurker on these boards, and def not a
        | software dev - this seems to be the platonic ideal of an Ask HN
        | question. I'll Ask HN if that is the case.
        
       | jdavis703 wrote:
       | I worked one place that did so. Our traffic was sharded over a
       | couple dozen sites, but combined we were at US top-100 scale.
       | 
       | Every site was on a very large bare metal box (sites were grouped
        | together when possible; IIRC only one required its own dedicated
       | machine). Each box was a special snowflake.
       | 
       | The DBs were on separate hardware.
       | 
       | When I left they were starting to embrace containerized
       | microservices.
        
       | ex_amazon_sde wrote:
       | Amazon.
       | 
       | It uses an internal tool but the implementation is not important.
        | Applications and libraries are packaged independently, just like
        | in any Linux distribution.
        
       | zn44 wrote:
        | King games (Candy Crush) ran on prem without containers while I
        | was there (until 2016)
        
       | pjmlp wrote:
       | We do in plenty of our workloads, using classical Windows VMs or
       | AppServices.
        
       | deckard1 wrote:
       | Many years ago I worked at a place that deployed thousands of
       | bare metal servers. We were effectively running our own cloud
       | before the cloud became a thing.
       | 
       | The way it worked was simple. We created our own Red Hat variant
       | using a custom Anaconda script. We then used PXE boot. During OS
       | install this script would call back to our central server to
       | provision itself. You can do that a few ways. If I recall, we
        | baked a minimal set of packages into the ISO to get us up and
       | then downloaded a provision script that was more frequently
       | updated to finish the job.
       | 
       | This is still a fine way of handling horizontal SaaS type
       | scaling, where you do a sort of white labeling of your service
       | with one customer per VM. Swap Postgres/MySQL for SQLite on each
       | node and everything is just stupidly simple.
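        | 
        | For illustration, the kickstart side of that pattern might look
        | roughly like this (the URL, paths, and package list are
        | invented):
        | 
        |     # minimal package set baked into the image
        |     %packages
        |     @core
        |     curl
        |     %end
        | 
        |     # call home after install and fetch the frequently-updated
        |     # provisioning script
        |     %post
        |     curl -o /root/provision.sh http://provision.internal/boot.sh
        |     bash /root/provision.sh
        |     %end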
        
         | jeffrallen wrote:
         | Did you work for Tellme too?
        
       | claytongulick wrote:
       | I've done relatively large scale projects without containers.
       | 
       | In one case, we were running something like 80% of all auto
       | dealer websites on two bare metal web servers and one bare metal
       | SQL Server db, with a fail-over hot replica. Quite a bit of
       | traffic, especially on holiday weekends, and we never came close
       | to maxing out the machines. This was in 2007, on fairly modest
       | hardware.
       | 
       | I used to write fulfillment systems for Verizon, we handled about
       | 30,000 orders and 10,000ish returns per day, with pretty complex
       | RMA and advance-replacement logic, ILEC integration and billing
       | in Business Basic on NCR Unix, with complex pick/pack/ship rules
       | and validation. Again, that was a single bare metal db server,
       | SQL Server and a web server with SOAP/XML/WSDL services (this was
       | in early 2000's, on laughable hardware by today's standard).
       | 
       | I was part of writing a healthcare claims processing system that
       | did about 1TB per day of data processing and storage, on a single
       | bare metal SQL Server instance and OLAP cubes for analytics.
       | 
       | I've also been involved in projects that took the opposite
       | approach, Kubernetes, Kafka, CQRS, etc... in order to do "massive
       | scale" and the result was that they struggled to process a few
       | thousand health care messages per day. Obviously the devil is in
       | the details of implementation, but I wasn't particularly
       | impressed with the "modern" tech stack. So many layers of
       | abstraction, each has a performance and operational cost.
       | 
       | These days I mostly use Node and Postgres, so I haven't had a lot
       | of need for containers. npm install is a pretty simple mechanism
       | for dependencies, I try to keep the stack minimal and lean. With
       | the current cloud offerings of hundreds of VCPUs, hundreds of
       | gigs of memory and petabytes of storage, it's difficult for me to
       | envision a scenario where vertical scale wouldn't meet the needs
       | of any conceivable use case.
       | 
       | This works for me, partly because I'm a fair hand at sysadmin
       | stuff on linux and prefer maintaining a well-tuned "pet" over a
       | bunch of ephemeral and difficult to debug "cattle".
        
       | dijit wrote:
       | depends on what kind of scale.
       | 
        | I used to make online games, and our gameservers at launch were
        | on the order of 100,000 physical CPU cores and about 640TiB of RAM
       | spread across the world.
       | 
        | But we did this: on Windows, with a homegrown stack, before
       | kubernetes was a thing (or when it was becoming a thing).
       | 
       | With the advent of cloud we wrote a predictive autoscaler too.
       | That was fun.
       | 
       | I don't work there anymore, and they moved to Linux, but they're
       | hiring: https://www.massive.se/
       | 
       | You can learn a lot from a technical interview ;)
        
       | jedberg wrote:
        | Netflix was container-free or nearly so when I left in 2015, but
       | they were starting to transition then and I think they are now
       | container based.
       | 
       | At the time they would bake full machine images, which is really
       | just a heavyweight way of making a container.
        
       | camtarn wrote:
       | Don't know if they still use it (I suspect so!) but at least as
       | of 2015 Amazon was using a homebrewed deployment service called
       | Apollo, which could spin up a VM from an internally developed
        | Linux image and then populate it with all the software and
        | dependencies needed for a single service. It later inspired AWS
        | CodeDeploy, which does the same thing.
       | 
       | I remember it being pretty irritating to use, though, since it
       | wasn't particularly easy to get Apollo to deploy to a desktop
       | machine in the same way it would in production, and of course you
       | couldn't isolate yourself from the desktop's installed
       | dependencies in the same way. I'm using Docker nowadays and it
       | definitely feels a lot smoother.
       | 
       | This is a nice writeup:
       | https://www.allthingsdistributed.com/2014/11/apollo-amazon-d...
        
         | mijoharas wrote:
         | Not too much of an update, but they were still using it in
         | 2017.
        
         | mitchs wrote:
         | I've always thought of Apollo environments as containers before
         | kernel features for containers existed. With enough environment
         | variables and wrapper scripts taking the name of real binaries
         | to populate stuff like LD_LIBRARY_PATH, Apollo makes a private
         | environment that is only _slightly_ contaminated by the host.
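          | 
          | For illustration, that style of wrapper is roughly this (the
          | paths are invented; the real Apollo layout differed):
          | 
          |     #!/bin/sh
          |     # "myservice" wrapper: point the process at the
          |     # environment's private libraries, then exec the real one
          |     ENV_ROOT=/apollo/env/MyService
          |     export LD_LIBRARY_PATH="$ENV_ROOT/lib:$LD_LIBRARY_PATH"
          |     exec "$ENV_ROOT/bin/myservice.real" "$@"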
        
           | camtarn wrote:
           | Ooh, I'd forgotten about the wrapper scripts.
           | 
           | And yeah, the other thing that made it work, I guess, was
           | having the machine image be very minimal, very tightly
           | controlled, and very infrequently changed - so you didn't
           | have to worry about things changing all the time due to the
           | upstream distro.
        
           | ShroudedNight wrote:
           | Apollo environments try to solve the same problem, but
           | they're _definitely_ not containers - one cannot depend on
            | any fixed paths. I still bear scars from wrestling with
            | various packages' autotools / glib / Python path
            | dependencies, hoping desperately to find poorly documented
            | or undocumented environment variable overrides to get them
            | to let go of a static, build-time-specified path and play
            | nicely with Apollo's environment path shell-game.
        
           | whateveracct wrote:
           | Apollo reminded me more of Nix than containers. The wrapper
           | scripts are super Nix-y :)
        
             | cheeze wrote:
             | That's what it was. VM with a barebones deployment system
             | that had a ton of hooks in it.
             | 
             | Really really smart idea that IMO helped Amazon in the
             | 2010s immensely. While everyone else was figuring out k8s
             | and whatnot, Amazon had a good system with CI in place for
             | years.
             | 
             | I wonder how it's fared over time. Amazon was never known
             | for internal tooling in many other places. I hope Apollo is
             | still running strong today.
        
               | lamontcg wrote:
               | Anyone remember disco and third-party packages, guam and
               | cmf?
        
       | tikkabhuna wrote:
       | The finance company I work at has historically been copying jars
        | and booting them up with some scripts. Servers are all bare
        | metal, and some are tuned for performance. We have a lot of pet
        | servers, as different users needed different tools and asked a
        | sysadmin to install them.
       | 
       | We're now heavily moving towards containers and the primary
        | motivator is choice of languages and standardisation. Being bare
        | metal is fine when you use a single language, but you'll find
        | yourself shoehorning other languages into the same process.
        | Interpreted languages (Node/Python) are a nightmare, and you'll
        | have to find a pattern for running multiple versions on the same
        | host.
        | 
        | Containers are just a really nice deployment unit and are well
        | supported by other tools.
       | 
       | If you are really keen on this path, do consider up front how you
       | will handle version upgrades of the runtime or dependencies.
        
       | tptacek wrote:
       | Ironically, here at Fly.io, we run containers (in single-use VMs)
       | for our customers, but none of our own infrastructure is
       | containerized --- though some of our customer-facing stuff, like
       | the API server, is.
       | 
       | We have a big fleet of machines, mostly in two roles (smaller
       | traffic-routing "edge" hosts that don't run customer VMs, and
       | chonky "worker" hosts that do). All these hosts run `fly-proxy`,
       | a Rust CDN-style proxy server we wrote, and `attache`, a Consul-
       | to-sqlite mirroring server we built in Go. The workers also run
       | our orchestration code, all in Go, and Firecracker (which is
       | Rust). Workers and WireGuard gateways run a Go DNS server we
       | wrote that syncs with Consul. All these machines are linked
       | together in a WireGuard mesh managed in part by Consul.
       | 
       | The servers all link to our logging and metrics stack with Vector
       | and Telegraf; our core metrics stack is another role of chonky
       | machines running VictoriaMetrics.
       | 
       | We build our code with a Buildkite-based CI system and deploy
       | with a mixture of per-project `ctl` scripts and `fcm`, our in-
       | house Ansible-like. Built software generally gets staged on S3
       | and pulled by those tools.
       | 
       | Happy to answer any questions you have. I think we fit the bill
       | of what you're asking about, even though if you read the label on
       | our offering you'd get the opposite impression.
        
         | cpach wrote:
         | Intriguing!
         | 
         | This makes me curious: How does one learn to design and build
         | systems like this...?
         | 
         | Also: How do you folks at Fly decide what parts to use "as is"
         | and what parts to build from scratch? Do you have any specific
         | process for making those choices?
        
           | tptacek wrote:
           | We had to build the orchestration stuff (it was originally a
           | Nomad driver, but has outgrown that) because the tooling to
           | run OCI containers as Firecracker VMs didn't exist in a
           | deployable form when we started doing this stuff.
           | 
           | Most of the big CDNs seem to start with an existing traffic
           | server like Nginx, Varnish, or ATS. One way to look at what
           | we did with our "CDN" layer is that rather than building on
           | top of something like Nginx, we built on top of Tokio and
           | Hyper and its whole ecosystem. We have more control this way,
           | and our routing needs are fussy.
           | 
           | By comparison, we use VictoriaMetrics and ElasticSearch (I
           | don't know about "as-is" --- lots of tooling! --- but we
           | don't muck with the cores of these packages), because our
           | needs are straightforwardly addressed by what's already
           | there.
           | 
           | Lots of companies doing stuff similar to what we're doing
           | have elaborate SDN and "service mesh" layers that they built.
           | We get away with the Linux kernel networking stack and a
           | couple hundred lines of eBPF.
           | 
           | We definitely don't have a specific process for this stuff;
           | it's much more an intuition, and is more about our
           | constraints as a startup than about a coherent worldview.
        
       | asciimov wrote:
       | I know a few places that do. Their systems were already super-
       | reliable and the decision makers don't feel the need to change
       | things just because new tools are available.
        
       | jcadam wrote:
       | Is using CUDA inside a container still a massive PITA?
        
         | bogomipz wrote:
         | Can you elaborate on what makes containerized CUDA so
         | difficult?
        
           | Melatonic wrote:
            | I would guess the fact that it's a highly proprietary
            | design, built specifically to scale to massive numbers of
            | parallel threads on specific physical hardware devices
        
           | nancarrow wrote:
            | the CUDA library versions are tightly coupled to the host's
            | CUDA driver version, and the container system itself (Docker)
            | needs special code to link the two
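            | 
            | (These days that special code is packaged as the NVIDIA
            | Container Toolkit; once it's installed, something like
            | 
            |     docker run --rm --gpus all \
            |         nvidia/cuda:11.6.2-base-ubuntu20.04 nvidia-smi
            | 
            | is usually enough to check that the container can see the
            | host's driver. The image tag here is just an example.)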
        
       | cyberge99 wrote:
       | I believe tools like nomad and consul shine here.
       | 
       | Using nomad as a job scheduler and deployer allows you to use
       | various modules for jobs: java, shell, ec2, apps (and
       | containers).
       | 
       | I use it in my homelab and it's great. That said, I don't use it
       | professionally.
       | 
       | I think Cloudflare is running this stack alongside firecracker
       | for some amazing edge stuff.
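        | 
        | For illustration, a non-container Nomad job using the exec
        | driver looks roughly like this (names, paths, and sizes are
        | invented):
        | 
        |     job "api" {
        |       datacenters = ["dc1"]
        | 
        |       group "api" {
        |         count = 3
        | 
        |         task "server" {
        |           driver = "exec"
        | 
        |           config {
        |             command = "/usr/local/bin/api-server"
        |             args    = ["-port", "8080"]
        |           }
        | 
        |           resources {
        |             cpu    = 500
        |             memory = 256
        |           }
        |         }
        |       }
        |     }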
        
         | toomuchtodo wrote:
          | Nomad is used by large financial institutions, Cloudflare,
          | eBay, Target, Walmart, etc. It's a solid tool for orchestration
          | and scheduling. I use it personally at home for ArchiveTeam-
          | esque archival operations.
         | 
         | https://www.nomadproject.io/docs/who-uses-nomad
         | 
         | (have advocated for its use in multiple financial orgs as part
         | of my day gig, no affiliation with Hashicorp)
        
         | nhoughto wrote:
          | In a recent convo with Cloudflare they said they are mostly
          | container-less: Debian packages installed and running on the
          | host, which is impressive at their scale and complexity. When I
          | spoke to people there recently (end of 2021), they did seem to
          | think this approach was hitting its limits and were wondering
          | what the future might look like, the assumption being that the
          | future could be containers.
        
       | [deleted]
        
       | armcat wrote:
       | Not sure if this counts, but for more than a decade I was at a
       | telecom vendor, working with radio base stations (3G, 4G and 5G).
        | That (to me) is probably one of the most distributed systems on
       | the planet - we worked across several million nodes around the
       | globe. I've been out of the loop for a bit, but I know they now
       | have vRAN, Cloud RAN, etc (basically certain soft-real time
       | functions pulled out of base stations and deployed as VMs or
       | containers). But back then, there was no virtualization being
       | used.
       | 
       | The tech stack was as follows: hardware was either PowerPC or ARM
       | based System-on-Chip variants; we initially used our own in-house
       | real-time OS, but later switched to a just-enough Linux distro;
       | management functions were implemented either in IBM's "real-time"
       | JVM (J9), or in Erlang; radio control plane (basically messages
       | used to authenticate you, setup the connection and establish
       | radio bearers, i.e. "tunnels" for payload) was written in C++.
       | Hard real-time functions (actual scheduling of radio channel
       | elements, digital signal processing, etc) were written in C and
       | assembly.
       | 
        | Really cool thing - we even deployed an xgboost ML model on these
       | (used for fast frequency reselection - reduced your time in low
       | coverage) - the model was written in C++ (no Python runtime was
        | allowed), and it was completely self-supervised and closed-loop (it
       | would update/finetune its parameters during off-peak periods,
       | typically at night).
       | 
        | Back then, we were always critical of ourselves, but looking
       | back at it, it was an incredibly performant and robust system. We
       | accounted for every CPU cycle and byte - at one point I was able
       | to do a walkthrough (from-memory) of every single memory
       | allocation during a particular procedure (e.g. a call setup). We
       | could upgrade thousands of these nodes in one maintenance window,
       | with a few secs of downtime. The build system we always
       | complained about, but looking back at it, you could compile and
       | package everything in a matter of minutes.
       | 
       | Anyway, I think it was a good example of what you can accomplish
       | with good engineering.
        
       | boredtofears wrote:
        | Hashicorp Packer + AWS CDK (or Terraform) can get you a lot of
        | the characteristics of containerized deployments without actual
        | containers.
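        | 
        | For illustration, the Packer half of that can be as small as
        | this (the region, AMI id, and package name are placeholders):
        | 
        |     source "amazon-ebs" "app" {
        |       region        = "us-east-1"
        |       source_ami    = "ami-0123456789abcdef0"
        |       instance_type = "t3.small"
        |       ssh_username  = "admin"
        |       ami_name      = "app-golden-image"
        |     }
        | 
        |     build {
        |       sources = ["source.amazon-ebs.app"]
        | 
        |       provisioner "shell" {
        |         inline = ["sudo apt-get update",
        |                   "sudo apt-get install -y my-app"]
        |       }
        |     }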
        
       | maxk42 wrote:
        | Back in 2010 I built and operated MySpace's analytics system on 14
       | EC2 instances. Handled 30 billion writes per day. Later I was
       | involved in ESPN's streaming service which handled several
       | million concurrent connections with VMs but no containers. More
       | recently I ran an Alexa top 2k website (45 million visitors per
        | month) off of a single container-free EC2 instance. Then I spent
       | two years working for a streaming company that used k8s +
        | containers and would fall over if it had more than about 60
       | concurrent connections per EC2 instance. K8s + docker is much
       | heavier than advertised.
        
         | danielrhodes wrote:
          | Docker is far heavier - the overhead is the price of the
          | flexibility and process isolation you get. I imagine that's
          | really useful for certain types of workloads (e.g. an ETL
          | pipeline), but is crazy inefficient for something single-
          | purpose like a web app.
        
           | ryanjkirk wrote:
           | Docker is heavier (and more dangerous) because of dockerd,
           | the management and api daemon that runs as root. Actual
           | process isolation is handled by cgroup controls which are
           | already built into the kernel and have been for years. You
           | can apply them to any process, not just docker ones.
           | 
           | However, Docker is essentially dead; the future is CRI-O or
           | something similar which has no daemon and runs as an
           | unprivileged user. And you still get the flexibility and
           | process isolation, but with more security.
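            | 
            | For example, resource limits can be attached to any ordinary
            | process with nothing but systemd and cgroups (the values
            | here are arbitrary):
            | 
            |     systemd-run --scope -p MemoryMax=512M \
            |         -p CPUQuota=50% ./my-server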
        
             | freebuju wrote:
             | All the so-called "docker killers" are essentially
             | unfinished products. They don't compare 1:1 to docker in
             | feature set and even if they run as rootless, they still
             | are vulnerable to namespace exploits in the Linux kernel.
             | Though docker runs as root, it's still well protected out-
             | of-the-box for the average user and is a very mature
             | technology.
        
               | ryanjkirk wrote:
               | Are you from 2018? Everyone running OpenShift is using
               | CRI-O and that footprint is not small. We made the switch
               | in our EKS and vanilla k8s clusters in 2021. Docker has
               | now even made their API OCI-compliant in order to not be
               | left behind. And the point is that most people don't want
                | a feature-for-feature docker running in prod. The attack
               | surface is simply too large. I don't need an API server
               | running as root on all my container hosts.
               | 
               | Use docker on your laptop, sure. Its time in prod is
               | over.
        
               | Art9681 wrote:
               | Agreed. Tons of obsolete assumptions in this thread. We
               | have been using Podman / OpenShift in production and
               | never ran into a use case where Docker was needed.
        
               | richardwhiuk wrote:
               | Kubernetes has removed docker, so I think that's
               | basically it from a large scale perspective.
        
               | p_l wrote:
               | One of the biggest benefits of k8s for me, back in 2016
               | when I first used it in prod, was that it threw away all
               | the extra features of Docker and implemented them
                | directly by itself - better. The writing was already on the
                | wall that docker would face stern competition that doesn't
               | have all of its accidental complexity (rktnetes and
               | hypernetes were a thing already)
        
               | richardwhiuk wrote:
               | Kubernetes used to (tediously) pass everything through to
               | Docker, but since 1.20, that's resolved, and it now uses
               | containerd.
        
               | p_l wrote:
               | Not everything - for a bunch of things, the actual setup
                | increasingly happened _outside_ docker, and docker was
                | just informed how to access it, bypassing all the
                | higher-level logic in Docker.
               | 
               | 1.20 is when docker mode got deprecated, IIRC, but many
               | of us were already happily running in containerd for some
               | time.
        
       | jokethrowaway wrote:
        | Can't say I'm working at scale, but one of my products is latency
       | sensitive and we stripped docker because it was slowing down each
       | request.
       | 
       | I never really got to the bottom of it, someone linked me a bug
       | in the interaction of docker / linux kernel (now fixed) which
       | could have caused it, but I don't have time to waste chasing
       | docker performance.
       | 
       | Ours is a fairly simple setup: one postgres db per machine, one
       | python app per machine on $cheapVPSProvider; number of instances
       | goes up and down based on traffic (basically cloning one of the
       | machines); a load balancer in front; data gets updated once per
       | day and replicated; auth / subscription status data is stored in
       | redis
        
       | wanderr wrote:
       | Grooveshark didn't use any of that. We were very careful about
       | avoiding dependencies where possible and keeping our backend code
       | clean and performant. We supported about 45M MAU at our biggest,
       | with only a handful of physical servers. I'm not aware of any
       | blog posts we made detailing any of this, though. And if you're
       | not familiar with the saga, Grooveshark went under for legal, not
       | technical reasons. The backend API was powered by nginx, PHP,
       | MySQL, memcache, with a realtime messaging server built in Go. We
        | used Redis and MongoDB for some niche things and had serious
        | issues with both, which is understandable because they were both
        | immature at the time, but MongoDB's data loss problems were bad
        | enough that I would still not use it today.
       | 
       | That said, I'm using Docker for my current side project. Even if
       | it never runs at scale, I just don't want to have to muck around
       | with system administration, not to mention how nice it is to have
       | dev and prod be identical.
        
         | jimbob45 wrote:
         | I miss Grooveshark to this day. Thanks for building such an
         | excellent product!
        
         | hungryforcodes wrote:
         | Groove shark was the best.
        
         | brimble wrote:
         | > That said, I'm using Docker for my current side project. Even
         | if it never runs at scale, I just don't want to have to muck
         | around with system administration, not to mention how nice it
         | is to have dev and prod be identical.
         | 
         | This is why I use docker, at work and for my own stuff. No
         | longer having to give a shit whether the hosting server is LTS
         | or latest-release is _wonderful_. I barely even have to care
         | which distro it is. Much faster and easier than doing something
         | similar with scripted-configuration VMs, plus the hit to
         | performance is much lower.
        
         | mhitza wrote:
         | What a great service. I'd be curious if you could go into
          | detail about how the radio feature worked back then, because I
          | found myself receiving worse suggestions when I used similar
          | features in Spotify/Google Play Music.
        
         | pineconewarrior wrote:
         | I loved Grooveshark! thanks for your work
        
         | hexfish wrote:
         | Thanks for giving some insight into this. Grooveshark was
         | absolutely great!
        
         | turkeywelder wrote:
         | I miss Grooveshark so much - Licensing issues aside it was one
         | of the best UIs for music ever. I'd love to hear more stories
         | about the backend
        
         | Aachen wrote:
         | Man I miss Grooveshark still today. Spotify is okay but still a
         | step down. Needing billion-dollar licensing schemes to even get
         | started makes this such a hard market to actually get into and
         | provide a competitively superior experience.
        
       ___________________________________________________________________
       (page generated 2022-03-22 23:00 UTC)