[HN Gopher] Accidental complexity, essential complexity, and Kub...
       ___________________________________________________________________
        
       Accidental complexity, essential complexity, and Kubernetes
        
       Author : paulgb
       Score  : 83 points
       Date   : 2022-09-05 19:47 UTC (3 hours ago)
        
 (HTM) web link (driftingin.space)
 (TXT) w3m dump (driftingin.space)
        
       | theteapot wrote:
       | > Through Brooks' accidental-vs-essential lens, a lot of
       | discussion around when to use Kubernetes boils down to the idea
       | that essential complexity can become accidental complexity when
       | you use the wrong tool for the job. The number of states my
       | microwave can enter is essential complexity if I want to heat
       | food, but accidental complexity if I just want a timer. With that
       | in mind ...
       | 
        | I'm twisting my mind trying to grasp this interpretation within
        | Brooks' complexity paradigm. I'm sure Brooks would be interested
        | to learn that there exist so-called wrong tools that can reduce
        | essential complexity to accidental :). I think Brooks would put
        | it that the essential complexity is the same and irreducible, but
        | that the accidental complexity increases when the wrong tool is
        | used.
        
       | [deleted]
        
       | threeseed wrote:
       | > The fact that you need a special distribution like minikube,
       | Kind, k3s, microk8s, or k0s to run on a single-node instance.
       | (The fact that I named five such distributions is a canary of its
       | own.)
       | 
       | They serve completely different purposes though.
       | 
        | Some are Docker-based and designed to be lightweight for testing,
        | e.g. Kind; others are designed to scale out to full clusters,
        | e.g. k3s.
       | 
       | And I think having the choice is a good thing as it proves that
       | Kubernetes is vendor-agnostic.
        
       | bob1029 wrote:
        | Also see this closely related paper:
       | 
       | http://curtclifton.net/papers/MoseleyMarks06a.pdf
       | 
       | This one inspired our most recent system architecture.
        
         | candiddevmike wrote:
         | Thank you for sharing this, much more insightful than the
         | article
        
       | orf wrote:
       | > One way to think of Kubernetes is as a distributed framework
       | for control loops. Broadly speaking, control loops create a
       | declarative layer on top of an imperative system.
       | 
       | Finally: a post about Kubernetes that actually understands what
       | it fundamentally is.
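        | 
        | A minimal sketch of that declarative-over-imperative split, with
        | made-up names: the Deployment only declares the desired state
        | (three replicas), and the controller's reconcile loop
        | imperatively creates or deletes pods until reality matches it.
        | 
        |     apiVersion: apps/v1
        |     kind: Deployment
        |     metadata:
        |       name: example-api            # hypothetical name
        |     spec:
        |       replicas: 3                  # declared desired state
        |       selector:
        |         matchLabels:
        |           app: example-api
        |       template:
        |         metadata:
        |           labels:
        |             app: example-api
        |         spec:
        |           containers:
        |             - name: example-api
        |               image: example.com/api:1.0.0   # hypothetical image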
       | 
       | > Kubernetes abstracts away the decision of which computer a pod
       | runs on, but reality has a way of poking through. For example, if
       | you want multiple pods to access the same persistent storage
       | volume, whether or not they are running on the same node suddenly
       | becomes your concern again.
       | 
        | I'm not sure I agree that this itself constitutes a leaky
        | abstraction. Kubernetes still abstracts away the decision of
        | which computer a pod runs on; the persistent storage volume is
        | just another declarative _constraint_ on where a pod can be
        | scheduled. It's no different from specifying that you need 40 CPU
        | cores, and thus only being schedulable on nodes with more than 40
        | cores.
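        | 
        | As a rough sketch (volume and image names are hypothetical), a
        | ReadWriteOnce claim is just one more constraint the scheduler has
        | to satisfy, sitting right next to the CPU request:
        | 
        |     apiVersion: v1
        |     kind: Pod
        |     metadata:
        |       name: example-worker
        |     spec:
        |       containers:
        |         - name: worker
        |           image: example.com/worker:1.0.0
        |           resources:
        |             requests:
        |               cpu: "40"     # only fits nodes with 40 free cores
        |           volumeMounts:
        |             - name: data
        |               mountPath: /data
        |       volumes:
        |         - name: data
        |           persistentVolumeClaim:
        |             claimName: example-data   # a ReadWriteOnce claim also
        |                                       # constrains which node(s)
        |                                       # the pod can land on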
       | 
       | > There's no fundamental reason that I should need anything
       | except my compiler to produce a universally deployable unit of
       | code.
       | 
       | I agree - but you can do that with 'containers'. You just create
       | an empty "FROM scratch" container then copy your binary in. The
       | end result is that your OCI image is just a single .tar.gz file
       | containing your binary alongside a JSON document specifying
       | things like the entrypoint and any labels. Just like a shipping
       | container, this plugs into anything that ships things shaped like
       | that.
       | 
        | You get a _bunch_ of nice stuff for free with this (tagged,
        | content-addressable remote storage with garbage collection!
        | inter-image layer caching!). Even if you're slinging around WASM
        | binaries, I'd still package them as OCI images.
       | 
       | > The use of YAML as the primary interface, which is notoriously
       | full of foot-guns.
       | 
        | There's a lot more to say here, and it's more of a legitimate
        | criticism of Kubernetes than most of the "hurr dur k8s complex"
        | criticisms you commonly see. The ecosystem has kind of centered
        | around Helm as a way of templating resources, but it's...
        | horrible. It's all so horrible. A huge step up from ktml or other
        | rubbish from the past, but Go's template language isn't fun.
       | 
        | But I'm not sure how it could have gone any differently. JSON is
        | as standard and simple as it gets and is easy to auto-generate,
        | but it's not user-friendly. So either the Kubernetes getting-
        | started guide opens with "first go auto-generate your JSON using
        | some tool we don't distribute", or you provide a more user-
        | friendly way for people to onboard.
       | 
        | YAML is a good middle ground here between giving users the
        | ability to auto-generate resources from other systems (i.e.
        | spitting out JSON and shovelling it into the k8s API) and giving
        | them something user-friendly to interact with and view using
        | kubectl/k9s/whatever.
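        | 
        | As a tiny illustration (a hypothetical ConfigMap): a tool can
        | emit the JSON form and a human can maintain the YAML form, and
        | kubectl apply -f accepts either, since YAML is (for these
        | purposes) a superset of JSON.
        | 
        |     # the shape a generator might spit out
        |     {"apiVersion": "v1", "kind": "ConfigMap",
        |      "metadata": {"name": "example-config"},
        |      "data": {"LOG_LEVEL": "debug"}}
        | 
        |     # the same object, as a human would rather read and edit it
        |     apiVersion: v1
        |     kind: ConfigMap
        |     metadata:
        |       name: example-config
        |     data:
        |       LOG_LEVEL: debug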
        
         | soco wrote:
          | I'm confused; I thought JSON was created because XML wasn't
          | user-friendly. I, on the other hand, see no user-friendliness
          | in YAML, JSON, or XML. It's just formatted text with or without
          | tabs. Users don't like standards, so whatever you choose,
          | somebody will complain.
        
         | dinosaurdynasty wrote:
         | I often wonder how many footguns could've been removed if
         | projects like k8s/ansible used TOML instead of YAML.
        
           | bvrmn wrote:
            | TOML would be an awful substitute in the Ansible context.
            | Let's play a game: how about I give you a real-life playbook
            | and you translate it to TOML?
        
       | [deleted]
        
       | mati365 wrote:
       | Kubernetes and its complexity again..
        
       | bvrmn wrote:
       | > The number of states my microwave can enter is essential
       | complexity
       | 
        | Two dials to set the power and a clockwork timer are enough.
        | Microwaves with a digital panel and LED display are a perfect
        | example of accidental complexity.
        
       | hbrn wrote:
       | The more I think about it, the more I realize that generic
       | declarative style, despite sounding very promising, might not be
       | a good fit for modern deployment. Mainly due to technology
       | fragmentation.
       | 
        | There's no one true way to deploy an abstract app; each tech
        | stack is fairly unique. Two apps can look exactly the same from
        | the desired-state perspective but have two extremely different
        | deployment processes. They might even have exactly the same tech
        | stack, but different reliability requirements.
       | 
        | Somehow you need to be able to accommodate those differences in
        | your declarative framework. So you'll pay the abstraction costs
        | (complexity). But you will only reap the benefits if you're
        | constantly switching out the components of your architecture. And
        | that typically doesn't happen: by the time you get to a point
        | where you need a deployment framework, your architecture is
        | fairly rigid.
       | 
       | Maybe k8s makes a lot of sense if you're Google. But 99.99% of
       | companies are not Google.
        
       | jrockway wrote:
       | I agree very much about the accidental complexity of containers.
       | Ignoring the runtime concerns (cgroups, namespaces, networking,
       | etc.), the main problem seems to have been "we can't figure out
       | how to get Python code to production", and the solution was "just
       | ship a disk image of some developer's workstation to production".
       | To do this, they created a new file format ("OCI image format",
       | though not called that at the time), a new protocol for
       | exchanging files between computers ("OCI distribution spec"), a
       | new way to run executables on Linux ("OCI runtime spec"), and
       | various proprietary versions of Make, with a caching mechanism
       | that isn't aware of the actual dependencies that go into your
       | final product. The result is a staggering level of complexity,
       | all to work around the fact that nobody even bothered trying to
       | add a build system to Python.
       | 
       | Like the author, I tend to write software in a language that
       | produces a single statically linked binary, so I don't need any
       | of this distribution stuff. I don't need layers, I don't need a
       | Make-alike, I don't need a build cache, but I still have to go
       | out of my way to wrap the generated binary and push it to a super
       | special server. Imagine a world where we just skipped all of
       | this, and your k8s manifest looks like:
        | 
        |     containers:
        |       - name: foo-app
        |         image:
        |           - architecture: amd64
        |             os: linux
        |             binary: https://github.com/whatever/download/foo-app/foo-app-1.0.0-linux-amd64
        |             checksum: sha256@1234567890abcdef
        |           - architecture: arm64
        |             os: linux
        |             binary: https://github.com/whatever/download/foo-app/foo-app-1.0.0-linux-arm64
        |             checksum: sha256@abdcef1234567890
       | 
        | Meanwhile, if you don't want to orchestrate your deployment and
        | just want to run your app:
        | 
        |     wget https://github.com/whatever/download/foo-app/foo-app-1.0.0-linux-amd64
        |     chmod a+x foo-app-1.0.0-linux-amd64
        |     ./foo-app-1.0.0-linux-amd64
       | 
       | I dunno. I feel like, as an industry, we spent billions of
       | dollars, founded brand new companies, created a new job title
       | ("Devops", don't get me started), all so that we could avoid
       | making Python output a single binary. I'm not sure that random
       | chaos did the right thing, and you're right to be bitter about
       | it.
        
       | tsimionescu wrote:
        | The article is pretty light, though I agree with most of the
        | points.
        | 
        | However, I think that, especially in the context of Kubernetes,
        | this part is completely wrong:
       | 
       | > Containers were a great advance, but much of their complexity
       | falls into the "accidental" category. There's no fundamental
       | reason that I should need anything except my compiler to produce
       | a universally deployable unit of code.
       | 
       | Containers are not used in Kubernetes or other similar
       | orchestrators because of their support for bundling dependencies
       | - that is a bonus at best.
       | 
       | Instead, they are used because they are a standard way of using
       | cgroups to tightly control what pieces of the system a process
       | has access to, so that multiple processes running on the same
       | system can't accidentally affect each other, and so that a
       | process can't accidentally depend on the system it is running on
       | (including things like open ports). These are key properties for
       | a system that seeks to efficiently distribute workloads on a
       | number of computers without having to understand the specifics of
       | what workload it's running.
       | 
        | They are also used because container registries are a ready-
        | made, secure, Linux-distribution-agnostic way of retrieving
        | software and referring to it with a unique name.
        | quay.io/python:3.7 will work on Ubuntu, SuSE, RedHat or any
        | other base system, unlike relying on apt/yum/etc.
        
         | paulgb wrote:
         | > Containers are not used in Kubernetes or other similar
         | orchestrators because of their support for bundling
         | dependencies - that is a bonus at best.
         | 
         | > Instead, they are used because they are a standard way of
         | using cgroups to tightly control what pieces of the system a
         | process has access to
         | 
          | Well, sure, you get both. But the point is that needing a post-
          | build step just to get that _is_ accidental complexity. I'm by
          | no means a Java fan, but the way the JVM gives you jars as a
          | compile target and lets you run them with some measure of
          | isolation is an example of how things could be.
        
           | threeseed wrote:
            | With Maven, Gradle, etc. you can output a Docker container
            | from a single package command.
            | 
            | And I don't understand this idea of using code bundles as the
            | deployment artefact.
            | 
            | They don't specify _how_ the code should be run, e.g. the
            | Java version, the set of SSL certificates in your keystore,
            | etc.
            | 
            | Do you really want Kubernetes to have hundreds of options for
            | each language? Or is it better to just leave that up to the
            | user?
        
             | anonymous_sorry wrote:
              | I think the suggestion is to use native Linux binaries,
              | with any libraries and resources statically linked into the
              | executable.
              | 
              | Java programs would need to be compiled as a standard ELF.
              | I think GraalVM can do this.
        
               | threeseed wrote:
               | Only some Java/Scala programs can be compiled as a single
               | binary.
               | 
                | It's a concept that has been around for years but hasn't
                | progressed to the point where it comes close to negating
                | the need for the JVM in a production environment.
                | 
                | In the meantime, containers work today and are
                | significantly more powerful and flexible.
        
           | jayd16 wrote:
            | I just don't follow the argument. You can also use Java
            | tooling to wrap up a container image. Why is it _accidental_
            | complexity that we settled on a language-agnostic target that
            | also wraps up a lot of the process-isolation metadata?
        
             | paulgb wrote:
             | Java is an outlier here. For most languages the process of
             | dockerizing a codebase involves learning a separate tool
             | (Docker). As someone who knows Docker, I get the temptation
             | to say "who cares, just write a few lines of Dockerfile",
             | but after talking to a bunch of developers about our
             | container-based tool, having to leave their toolchain to
             | build an image is a bigger sticking point than you might
             | think.
        
               | jayd16 wrote:
                | But this is begging the question. If the work were more
                | integrated with the compiler, it would still need to be
                | learned. If your compiler of choice spat out a deployable
                | unit akin to an image, you'd still need something akin to
                | the Dockerfile to specify exposed ports and volumes and
                | such, no?
        
               | paulgb wrote:
                | The Dockerfile doesn't specify volumes; those are
                | configured at runtime. In the case of Kubernetes, they're
                | configured in the pod spec.
                | 
                | As for ports, EXPOSE in a Dockerfile is basically just
                | documentation/metadata. In practice ports are exposed at
                | runtime (if using Docker) or by configuring services (if
                | using Kubernetes), and these are unaffected by whether a
                | port is exposed in the Dockerfile.
                | 
                | IMHO this is how it should be -- if I'm running a
                | containerized database or web server, I want to be able
                | to specify the port that the rest of the world sees it
                | as; I don't want the container creator to decide that.
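                | 
                | A minimal sketch of that split, with made-up names and
                | ports: the Service decides what the rest of the cluster
                | sees, regardless of anything the image declared.
                | 
                |     apiVersion: v1
                |     kind: Service
                |     metadata:
                |       name: example-db
                |     spec:
                |       selector:
                |         app: example-db
                |       ports:
                |         - port: 5432        # what the world sees
                |           targetPort: 15432 # whatever the process
                |                             # happens to listen on;
                |                             # EXPOSE plays no part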
        
         | candiddevmike wrote:
          | Containers are a temporary solution while we wait for everyone
          | to realize that self-contained, statically linked binaries are
          | the real solution.
        
           | rocmcd wrote:
           | The executable piece is nice, but not the whole picture.
            | Configuration and, more importantly, isolation of the
            | runtime are also huge benefits that come with containers.
        
           | tsimionescu wrote:
           | No, containers give you things that static binaries just
           | don't. How do you specify the maximum allowed memory for a
           | static binary? The ports it opens? The amount of CPU usage?
           | The locations it will access from the host file system?
           | 
           | Also, how will you distribute this static binary? How do you
           | check that the result of a download is the specified version,
           | and the same that others are downloading? How will you
           | specify its name and version?
           | 
           | By the time you have addressed all of these, you will have
           | re-implemented the vast majority of what standard container
           | definitions and registries do.
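            | 
            | For comparison, this is roughly where those answers live in a
            | container spec today (a sketch with made-up names and
            | numbers, shown as a fragment of a pod spec):
            | 
            |     containers:
            |       - name: example-app
            |         # unique name; can also be pinned by sha256 digest
            |         image: registry.example.com/example-app:1.2.3
            |         resources:
            |           limits:
            |             memory: 256Mi    # maximum allowed memory
            |             cpu: "1"         # CPU ceiling
            |         ports:
            |           - containerPort: 8080   # the port it opens
            |         volumeMounts:
            |           - name: scratch
            |             mountPath: /var/cache/example   # writable path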
        
             | anonymous_sorry wrote:
             | The orchestrator can still use cgroups for resource
             | constraints and isolation. Or it could use virtualization -
             | it would be an implementation detail. But devs would not
             | have to build a container.
             | 
             | Binary distribution, versioning and checksumming shouldn't
             | need to be coupled to a particular format.
             | 
              | Obviously Docker solves a bunch of disparate problems.
              | That's kind of the objection.
        
           | threeseed wrote:
            | That is only a real solution if everything is running on the
            | same operating system in the same environment.
            | 
            | What is the point of Kubernetes at all in your situation?
        
             | candiddevmike wrote:
              | Everything already is running on the same operating system
              | with Kubernetes where it matters (kernel, runc/crun, etc.).
              | Containers are a band-aid to wrap an app with additional
              | files/libraries/whatever. In Go, for instance, I can
              | include all this stuff at compile time (even
              | conditionally!).
              | 
              | I'd love to see Kubernetes be able to schedule executables
              | directly, without using containers, by way of
              | systemd-nspawn or similar. You could have the "container
              | feel" without the complexity of the toolchain required to
              | build/deploy/run/validate containers.
        
           | theteapot wrote:
           | Sounds pretty much like containers are the solution to
           | containers.
        
           | [deleted]
        
         | zamalek wrote:
         | I think they mean different things to different people. From
         | the problems I have faced with customer-controlled OS
         | installations, the biggest thing that they offer is
         | configuration isolation (or rather independence). I have seen
          | some truly crazy shit done by customers' administrators and
         | even crazier shit done by management software that they
         | install. Taking away that autonomy is huge.
        
           | tsimionescu wrote:
           | Absolutely, but the context of the article was specifically
           | Kubernetes.
        
         | jfoutz wrote:
          | After thinking about this for a few minutes, I think the
          | author might be on the wrong track connecting this with
          | containers.
         | 
         | But I think there's something really there about "environmental
         | linting". I know deep in my bones I need write access to make
         | log files, but I don't know how many times I've debugged
         | systems lacking this permission.
         | 
         | I know the log path won't be known until runtime, I know the
         | port can be specified at runtime, but I think there's a ton of
         | room for improvement around a tool that says - hey, you're
         | making this set of assumptions about your environment, and you
         | should have these health checks or tests or whatever.
         | 
          | I agree with you that this is what containers give, but I
          | think the author is really on to something about the dev
          | tooling and environment warning about what sorts of
          | permissions are needed to, like, work.
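          | 
          | In Kubernetes terms, at least some of those assumptions can be
          | declared up front today; a rough sketch (paths, ports and
          | probe details are made up):
          | 
          |     containers:
          |       - name: example-app
          |         image: example.com/example-app:1.0.0
          |         securityContext:
          |           readOnlyRootFilesystem: true   # every write-path
          |                                          # assumption must be
          |                                          # spelled out
          |         volumeMounts:
          |           - name: logs
          |             mountPath: /var/log/example  # the one writable path
          |         readinessProbe:
          |           httpGet:
          |             path: /healthz
          |             port: 8080
          |     volumes:
          |       - name: logs
          |         emptyDir: {}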
        
       ___________________________________________________________________
       (page generated 2022-09-05 23:00 UTC)