[HN Gopher] Pitfalls of Helm - Insights from 3 years with the le...
       ___________________________________________________________________
        
       Pitfalls of Helm - Insights from 3 years with the leading K8s
       package manager
        
       Author : louis_w_gk
       Score  : 66 points
       Date   : 2023-12-14 15:49 UTC (7 hours ago)
        
 (HTM) web link (glasskube.eu)
 (TXT) w3m dump (glasskube.eu)
        
       | aschleck wrote:
       | I found that using "helm template" to convert every Helm chart
       | into yaml, and then using Pulumi to track changes and update my
       | clusters (with Python transformation functions to get per-cluster
       | configuration) made my life so much better than using Helm.
       | Watching Pulumi or Terraform watch Helm watch Kubernetes update a
       | deployment felt pointlessly complicated.
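
A minimal sketch of the per-cluster transformation idea, with manifests as plain dicts (the function name and the prod rule are hypothetical; in a real setup the transformed dicts would be handed to Pulumi's Kubernetes provider rather than printed):

```python
import copy

def scale_for_cluster(manifest: dict, cluster: str) -> dict:
    """Hypothetical per-cluster transform: bump replicas in prod."""
    manifest = copy.deepcopy(manifest)
    if manifest.get("kind") == "Deployment" and cluster == "prod":
        manifest["spec"]["replicas"] = 3
    return manifest

# A manifest as it might come out of "helm template", parsed to a dict.
rendered = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "web"},
    "spec": {"replicas": 1},
}

print(scale_for_cluster(rendered, "prod")["spec"]["replicas"])     # 3
print(scale_for_cluster(rendered, "staging")["spec"]["replicas"])  # 1
```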
        
         | jpgvm wrote:
         | I do the same with Tanka + Jsonnet, definitely a million times
         | better than dealing with Helm itself or god forbid, letting it
         | apply manifests.
        
         | notnmeyer wrote:
         | i am hearing this more and more from folks.
        
       | empath-nirvana wrote:
        | I think helm is at its best when you need to _publicly
        | distribute_ a complex application to a large number of people
        | in a way that's configurable through parameters.
       | 
       | For internal applications, it's in an awkward place of being both
       | too complex and too simple, and in a lot of cases what you really
       | want to do is just write your own operator for the complex cases
       | and use kustomize for the simple cases.
       | 
       | Most of the problems with updating and installing helm charts go
       | away if you manage it with something like argocd to automatically
       | keep everything up to date.
        
         | numbsafari wrote:
         | Personally much prefer kustomize for the "ship an app"
         | business.
         | 
         | Probably even better is to ship a controller and a CRD for the
         | config.
         | 
          | Doing it that way means you ship a schema for the
          | parameters of the config, and that you have code that can
          | handle complexities of upgrades/migrations that tools like
          | kustomize and helm struggle with or fail at altogether.
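
A sketch of what shipping a schema with the config looks like (names are hypothetical): the CRD carries an OpenAPI schema that the API server then enforces on every object of that kind.

```yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: myapps.example.com
spec:
  group: example.com
  names:
    kind: MyApp
    plural: myapps
  scope: Namespaced
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                replicas:
                  type: integer
                  minimum: 1
```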
        
           | mountainriver wrote:
           | Controller + CRD is the way to go and seems more in line with
           | how k8s was intended to be used.
           | 
           | The challenge has historically been that controllers are a
           | lot harder to write, but I think that story has improved over
           | the years
        
             | arccy wrote:
             | operators are great when you control it. less so when it's
             | some third party one that doesn't support that field you
             | need on a resource it creates
             | 
             | and all the customizations just end up being yaml merges
             | from a configmap string or CRD if you're lucky
        
               | mountainriver wrote:
               | Fair enough, the UX is just so much better that I'd
               | gamble it in most use cases
        
           | cortesoft wrote:
           | We switched from kustomize to helm and I really can't
           | understand why anyone would prefer kustomize. Having the
           | weird syntax for replacing things, having to look at a bunch
           | of different files to see what is going on...
           | 
           | I love how in Helm I can just look at the templates and
           | figure out what values I need to change to get what I want,
           | and I love each environment only needing a single values file
           | to see all the customizations for it.
           | 
           | People complain about it being a template language, but that
           | is exactly what you need!
        
             | Hamuko wrote:
             | > _Having the weird syntax for replacing things_
             | 
             | Isn't the "weird syntax" just either Yaml files or just
             | JSON Patches, which is a pretty easy standard?
             | 
             | > _having to look at a bunch of different files to see what
             | is going on_
             | 
             | I consider that a feature, not a bug. prod/larger-memory-
             | request.yaml makes it much easier for me to see what goes
             | into deploying the prod environment instead of for example
             | the test environment.
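
For illustration, such an overlay file (names hypothetical) is just a partial manifest that kustomize merges over the base:

```yaml
# prod/larger-memory-request.yaml: a strategic-merge patch listed in
# prod/kustomization.yaml; only the fields to change are repeated.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  template:
    spec:
      containers:
        - name: web
          resources:
            requests:
              memory: 2Gi
```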
        
               | cortesoft wrote:
               | By "weird syntax" I mean stuff like "patchesJson6902" or
               | "configMapGenerator" or "patchesStrategicMerge" where you
               | have to know what each field means and how they work.
               | 
               | A template is much easier to read. I had zero experience
               | with go templating, but was able to figure out what it
               | all meant just by looking at the templates... they still
               | looked like kubernetes resources
               | 
               | As for looking at a bunch of different files, if you like
               | having a "larger-memory-request" file, you can still do
               | that with helm... you can use as many values files as you
               | want, just include them in precedence order. You can have
               | your "larger-memory-request" values file.
        
           | morelisp wrote:
           | > Probably even better is to ship a controller and a CRD for
           | the config.
           | 
           | Maybe it's just us, but our operations team puts pretty hard
           | restrictions on how we're allowed to talk to the K8s API
           | directly. We can turn a regular Deployment around as fast as
           | we can write it, but if we needed a controller and CRD update
           | it'd take us like three days minimum. (Which, I even sort of
           | understand because I see the absolute garbage code in some of
           | the operators the other teams are asking them to deploy...)
        
             | jen20 wrote:
             | If you run a multi-tenant Kubernetes cluster at scale,
             | operators with poor discipline spamming the API servers and
             | taking etcd down is a leading cause of sadness.
        
               | morelisp wrote:
               | This is the common view among our ops team, sure, but for
               | a vocation so prima facie obsessed with postmortems/five-
               | whys/root-causes/etc it's depressingly shallow.
        
             | cassianoleal wrote:
             | Generally speaking, operators and CRDs are more in the
             | domain of your platform rather than your products. They
             | should provide common interfaces to implement the business
             | requirements around things like uptime, HA, healthchecking,
             | observability, etc.
             | 
             | If a product team sees itself needing to deploy an
             | operator, it's likely the platform is subpar and should be
             | improved, or the product team is overengineering something
             | and could do with rethinking their approach.
             | 
             | As in most cases, a conversation with your
             | platform/ops/devops/sre/infra team should help clarify
             | things.
        
           | jpdb wrote:
           | > Probably even better is to ship a controller and a CRD for
           | the config.
           | 
           | But how do you package the controller + CRD? The two leading
           | choices are `kubectl apply -f` on a url or Helm and as soon
           | as you need any customization to the controller itself you
           | end up needing a tool like helm.
        
             | numbsafari wrote:
             | Just use kustomize. It's baked into kubectl. No need for a
             | separate tool.
        
               | cassianoleal wrote:
               | Agree. I'd recommend to start with static YAML though.
               | Use kustomize for the very few customisations required
               | for, say, different environments. Keep them to a minimum
               | - there's no reason for a controller's deployment to vary
               | too much - they're usually deployed once per cluster.
        
         | evancordell wrote:
         | This is interesting, I have the opposite opinion. I dislike
         | helm for public distribution, because everyone wants _their_
         | thing templated, so you end up making every field of your chart
         | templated and it becomes a mess to maintain.
         | 
         | Internal applications don't have this problem, so you can
         | easily keep your chart interface simple and scoped to the
         | different ways you need to deploy your own stack.
         | 
         | With Kustomize, you just publish the base manifests and users
         | can override whatever they want. Not that Kustomize doesn't
         | have its own set of problems.
        
           | moondev wrote:
            | Kustomize also supports helm charts as a "resource",
            | which makes it handy to do last-mile modifications of
            | values and "non-value-exposed" items without touching or
            | forking the upstream chart.
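
A sketch of that kustomization (chart name, repo and values are hypothetical; rendering requires `kustomize build --enable-helm`):

```yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
helmCharts:
  - name: some-chart
    repo: https://charts.example.com
    version: 1.2.3
    releaseName: my-release
    valuesInline:
      replicaCount: 2
patches:
  # last-mile edit to a field the chart's values never exposed
  - path: add-annotation.yaml
```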
        
         | jen20 wrote:
         | The need to use something like Helm to distribute a complex
         | application is a good indication you've built something which
         | is a mess, and probably should be rethought from first
         | principles.
         | 
         | Most of the problems associated with Helm go away if you stop
         | using Kubernetes.
        
           | imglorp wrote:
           | Vendors shipping things for customers to run in their clouds
           | and prems have a very limited set of common denominators.
           | When you add in requirements like workload scaling,
           | availability, and durability, that set is very small.
           | 
           | So yeah we do this. Our product runs in 3 public clouds
           | (working on 5), single VM, etc. and our customers install it
           | themselves. We're helm plus Replicated. AMA.
        
       | deathanatos wrote:
        | > _See, there is no general schema for what goes and doesn't
        | go inside a values.yaml file. Thus, your development
        | environment cannot help you beyond basic YAML syntax
        | highlighting._
        | 
        | ... this is just an odd complaint. Naturally, there isn't a
        | schema -- there inherently cannot be one. Values are the
        | options _for the app_ at hand; they're naturally dependent on
        | the app.
       | 
       | > _but without any schema validation!_
       | 
       | I have seen people supply JSON schemas for values with the chart.
       | I appreciate that.
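
Helm validates supplied values against a values.schema.json shipped next to values.yaml; a minimal sketch (the field names are hypothetical):

```json
{
  "$schema": "https://json-schema.org/draft-07/schema#",
  "type": "object",
  "required": ["image"],
  "properties": {
    "image": {
      "type": "object",
      "required": ["repository"],
      "properties": {
        "repository": { "type": "string" },
        "tag": { "type": "string" }
      }
    },
    "replicaCount": { "type": "integer", "minimum": 1 }
  }
}
```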
       | 
       | Of all the pitfalls ... the clunky stringly-typed "manipulate
       | YAML with unsafe string templating" is the biggest pitfall, to
       | me...
        
         | everforward wrote:
          | A lot of those values end up being used in places that _do_
          | have schemas. I think they're asking for what is basically
          | inferred types.
         | 
         | They want Helm to recognize that the cpuLimit value is used as
         | a CPU limit for a Pod and throw errors for any cpuLimit that
         | isn't a valid CPU limit.
         | 
         | Agreed that the user will have to write their own schema for
         | CLI arguments.
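
A sketch of the kind of inferred check being asked for, as a standalone predicate (the real Kubernetes resource.Quantity grammar is richer than this approximation):

```python
import re

# Rough check: is a string a plausible Kubernetes CPU quantity such as
# "250m", "0.5" or "2"? Real quantity parsing handles more forms.
CPU_QUANTITY = re.compile(r"^\d+(\.\d+)?m?$")

def is_cpu_limit(value: str) -> bool:
    return bool(CPU_QUANTITY.match(value))

print(is_cpu_limit("250m"), is_cpu_limit("0.5"), is_cpu_limit("lots"))
# True True False
```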
        
       | jpgvm wrote:
       | Helm makes me sad.
       | 
       | What I do to remediate this sadness is use Helm from Tanka. There
       | is still sadness but now it's wrapped in a nice Jsonnet wrapper
       | and I can easily mutate the output using Jsonnet features without
       | having to mess with nasty Go templating.
       | 
       | I've said it a million times before but it's always worth saying
       | again:
       | 
       | Don't use string templating for structured data.
        
         | ivan4th wrote:
         | Yep. Many complain that with Lisp, you need to count
         | parentheses (spoiler: you don't need to). And then proceed to
         | count spaces for indent/nindent in the charts... That's somehow
         | ok with almost everyone
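
For readers who haven't suffered it: the number passed to nindent must be hand-counted to match the surrounding YAML nesting exactly (the template names and values below are hypothetical):

```yaml
spec:
  template:
    metadata:
      labels:
        {{- include "mychart.labels" . | nindent 8 }}
    spec:
      containers:
        - name: app
          resources:
            {{- toYaml .Values.resources | nindent 12 }}
```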
        
           | morelisp wrote:
           | It's absolutely crazy to me how many tools are in common use
           | for k8s templating which would all be wiped away with any
           | decent macro system.
        
             | speedgoose wrote:
             | The template engine is not specific to Kubernetes but
             | Golang. I wish they used something more adapted.
             | https://pkg.go.dev/text/template
        
               | morelisp wrote:
               | Not just helm. There are probably a half dozen tools for
               | rendering manifests in our company, only some use
               | text/template, and they all suck. Text replacements are
               | bad. Declarative structured patches are bad. Control flow
               | in JSON is bad. We've had a language for dealing with
               | generating complex nested structured data for years!
        
               | anonacct37 wrote:
               | text/template is probably ok... For some version of text.
               | ditto with jinja and most templating languages. The
               | cardinal sin of DevOps is using text macros to produce
               | structured data. It only exists because unfortunately
               | there is no other lowest common denominator for every
               | config file syntax.
        
               | morelisp wrote:
               | Sure and that forgives its use in maybe, like, Salt and
               | Ansible. Not in Kubernetes where everything is structured
               | in the same way, even with API-available schemas, to
               | begin with.
        
               | ianburrell wrote:
              | Have you seen Jsonnet, Dhall, and Cue? They are
              | configuration languages that are more limited than
              | general-purpose languages, more powerful than static
              | files, and designed for config files, unlike templates.
        
             | twelfthnight wrote:
             | I can't actually put it into production at my company, but
             | for selfish catharsis, I ran datamodel-codegen over our
             | cluster's jsonschema and generated Python pydantic models
             | for all resources. I was able to rewrite all our helm using
             | pure Python and Pydantic models, since Pydantic serializes
             | to json and json is valid yaml. Felt pretty good
             | 
              | We don't have any CRDs, but the approach would extend
              | to those, plus you get autocomplete. The k8s jsonschema
              | isn't super easy to work with directly, though.
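
The same shape can be sketched with nothing but the stdlib (dataclasses standing in for pydantic; the models are deliberately minimal and hypothetical):

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class Container:
    name: str
    image: str

@dataclass
class Deployment:
    name: str
    replicas: int
    containers: list

    def manifest(self) -> dict:
        # Serializes to JSON, and JSON is valid YAML, so any k8s
        # tooling that eats YAML accepts the output unchanged.
        return {
            "apiVersion": "apps/v1",
            "kind": "Deployment",
            "metadata": {"name": self.name},
            "spec": {
                "replicas": self.replicas,
                "template": {
                    "spec": {"containers": [asdict(c) for c in self.containers]}
                },
            },
        }

d = Deployment("web", 2, [Container("web", "nginx:1.25")])
print(json.dumps(d.manifest(), indent=2))
```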
        
         | mountainriver wrote:
         | Jsonnet isn't great either, and has been tried a bunch in the
         | k8s community.
         | 
         | I'll never understand why we don't just use a language. I
         | started writing all my k8s config in python and it's great.
        
           | mplewis wrote:
           | I agree. I write all of my K8s and surrounding cloud infra
           | specs in Pulumi using TypeScript. Never going back to Helm.
        
       | clvx wrote:
        | Another one: when you upgrade your cluster and an API version
        | is a candidate for removal, helm doesn't have a way to update
        | the Kind reference in its release metadata, which makes the
        | release impossible to update or delete.
       | 
       | I personally like cuelang's philosophy but it could become a
       | little messy when you have to iterate and handle user inputs in
       | large codebases.
        
         | jpgvm wrote:
         | Cue/Jsonnet/friends are definitely the right tools for the job.
         | It's a shame they aren't more popular.
        
       | notnmeyer wrote:
        | i generally don't mind helm but i'm not sure i agree with
        | every point. for the really simple stateless app situation,
        | it's trivial to create a chart with all the important or
        | unique bits extracted to a values file.
       | 
       | the crd shit is borderline untenable. i learned about it during
       | an absolutely cursed calico upgrade. oops.
       | 
       | since kustomize integrates tightly with kubectl these days
       | though, i just use that for new things.
       | 
       | i want fewer, simpler tools.
        
       | markbnj wrote:
       | Over seven years of using a variety of deployment tooling
       | including helm (2 and 3), kustomize and our own scripting we
       | concluded that helm's strength is as a package manager, akin to
       | dpkg. Writing good packages is complex, but the tool is quite
       | powerful if you take the time to do that. For our own deployments
        | what we typically want to do is: build, test and push an
       | image, plug some context specific things into yaml, send the yaml
       | to the control plane and maybe monitor the result for pod
       | readiness. We have some custom tooling that does this in gitlab
       | pipelines, relying on kustomize for the yaml-spattering bits. We
       | still do use a lot of our own and third-party helm charts but for
       | us there's a clear distinction between installing packages (which
       | tend to be longer-term stable infra things) and rapidly iterating
       | on deployments of our own stuff.
        
         | degenerate wrote:
         | Any advice/ideas/articles/references on using kustomize
         | efficiently?
         | 
         | I love the idea of using a tool bundled with kubectl for zero
         | dependencies, but their examples and tutorials are horrible. I
         | can't figure out how to use it correctly to have 1 copy of YAML
         | that would deploy to 5 different environments. It seems I would
         | need multiple copies of kustomization.yaml in multiple folders,
         | if I have multiple namespaces/pods/etc...
        
           | leetrout wrote:
           | Not a kustomize expert - but yes, you likely would have a
           | folder for each thing you target.
           | 
           | It wasn't bad once I got through the docs / examples. They
           | just assume so much existing knowledge I didn't have.
        
           | Hamuko wrote:
           | We use kustomize with multiple copies of kustomization.yaml
           | and I don't know if there is a way to do it without that.
           | Basically, there's a base kustomization.yaml and then there's
           | test/kustomization.yaml, prod1/kustomization.yaml,
           | prod2/kustomization.yaml, and so on.
        
           | markbnj wrote:
           | The model is base yaml with patches applied to it results in
           | final yaml that get sent to the api, so the typical structure
           | for us is to have the base yaml live with the service source,
           | be maintained by the service owners and include all
           | environment-agnostic properties. We then have one folder per
           | targeted environment for that service which includes any
           | patches and the kustomization.yaml manifest. Basically in
           | line with what other replies have mentioned.
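
The resulting layout, sketched with hypothetical names, is one environment folder per target, each pulling in the shared base:

```yaml
# Directory layout:
#   service/base/deployment.yaml      environment-agnostic manifest
#   service/base/kustomization.yaml
#   service/prod/kustomization.yaml   one folder per environment
#   service/prod/patches.yaml
#
# service/prod/kustomization.yaml:
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../base
patches:
  - path: patches.yaml
```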
        
             | degenerate wrote:
             | Thanks everyone that replied, I thought I was doing
             | something wrong!
        
       | LittleChimera wrote:
       | It's a good list, although I think there's more to it even. I
       | wrote a bit more about helm design a while ago [0]. Nowadays, I
       | use Helm from kustomize quite a lot because some projects don't
       | provide any other way of deploying. However, you still need to
        | check what helm is actually generating, especially if there
        | are any hooks that need to be replaced with something
        | declarative.
       | 
       | [0]: https://littlechimera.com/posts/helm-design/
        
       | cortesoft wrote:
       | Isn't the last point wrong? You can query the kubernetes
       | environment in your templates to customize the output based on
       | cluster specific things
        
         | BossingAround wrote:
          | The point isn't that you can never query the API, but that
          | you can't really use a helm chart as a controller (and,
          | e.g., restart a pod under a certain condition, which is
          | trivial for an operator).
        
           | cortesoft wrote:
           | Oh... why wouldn't you just write an operator then? Seems
           | like a different requirement.
        
       | renlo wrote:
       | https://archive.is/grNy4
        
       | throwawaaarrgh wrote:
       | ...that's it? What about hooks being an anti pattern? What about
       | upgrades potentially resulting in blowing away your whole
       | deployment without warning? What about a lack of diffs or
       | planning changes? Or the complexity/kludginess of go template
       | logic? Or the lack of ability to order installation of resources
       | or processing of subcharts? Or waiting until a resource is done
       | before continuing to the next one? Or the difficulty of
       | generating and sharing dynamic values between subcharts? Or just
       | a dry run (template) that can reference the K8s api?
       | 
       | There's a ton of pitfalls and limits. It's still a genuinely
       | useful tool and provides value. But it's mostly useful if you use
       | it in the simplest possible ways with small charts.
       | 
       | I just wish the "operation engine" were decoupled from the
       | "generation engine", and pluggable. I like how it watches the
       | deployment, has atomic upgrades, can do rollbacks. But if you
       | want a complex deployment, you currently have to DIY it.
        
       | dijit wrote:
       | I got lazy and just wrote scripts that output k8s manifests.
       | 
       | The development story is much better (breakpoints! WHAT!?, loops
       | and control flow!?), you can catch common issues quicker by
       | adding tests, there's one "serialise" step so you don't have to
       | deal with YAML's quirks and you can version/diff your generated
       | manifests.
       | 
       | It's dumb, and stupid, but it works and it's far less cognitive
       | load.
       | 
        | Now: handling _mildly dynamic_ content outside of those
        | generated manifests... that's a massive pain. Releasing a new
        | version of a container while avoiding touching the generated
        | manifests: not working for me.
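
One concrete YAML quirk that a single explicit serialise step avoids (a toy sketch; "no" is the classic value that YAML 1.1 parsers reinterpret as a boolean):

```python
import json

def manifest(name: str, country: str) -> dict:
    # A hand-written YAML file containing "country: no" risks the
    # value being parsed as the boolean false; emitting JSON keeps it
    # a quoted string, and JSON is valid YAML.
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": name},
        "data": {"country": country},
    }

print(json.dumps(manifest("geo", "no")))
```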
        
         | leetrout wrote:
         | I do the same with Terraform sometimes.
         | 
         | I appreciate that TF has loops and dynamic blocks, etc etc, but
         | sometimes it's just a lot easier to look at a Jinja2 template
         | and run a script to generate the TF.
        
         | temp_praneshp wrote:
         | at my current place, we started off with kustomize. I rewrote
         | everything into helm, which was good initially (at least you
         | can force inject some common params, and others can include
         | this in their charts).
         | 
         | But people (including me) were unhappy at yaml reading; I also
         | grew to hate it with a passion because it's neither go nor
         | yaml, and super difficult to read in general. We are a
         | typescript company, and https://cdk8s.io/ has been great for
         | us. We can unit test parts of charts without rendering the
         | whole thing, distribute canonical pod/deployment/service
         | definitions, etc.
         | 
         | In all of the cases, we combined this with config outputted by
         | terraform, for env specific overrides, etc.
        
         | imglorp wrote:
         | Found the workaround confession thread.
         | 
          | Because you effectively CAN'T dynamically configure
          | subcharts with templating that's done in your main chart
          | (see e.g. https://github.com/helm/helm/pull/6876), here
          | comes the hack.
         | 
         | We run helm in helm. The top chart runs post-install and post-
         | upgrade hook job which runs helm in a pod with a lot of
         | permissions. The outer helm creates values override yaml for
         | the subchart into a ConfigMap, using liberal templating, which
         | gets mounted in the helm runner pod. Then helm runs in there
         | with the custom values and does its own thing.
         | 
         | Not proud but it lets us do a lot of dynamic things straight
         | helm can't.
        
       | francoismassot wrote:
       | While I really enjoy helm when playing with k8s or kickstarting
       | projects, I never feel "safe" when using it in the long run for
       | updates/upgrades. "values.yaml" files and templating YAML files
       | are too error-prone...
        
       | baq wrote:
       | Helm is a tool to use Jinja to write an object tree (or dag), in
       | yaml.
       | 
        | This is not an endorsement. This is to point out that it
        | makes hardly any sense! Use a proper programming language and
        | serialize the object tree/network to whatever format is
        | necessary.
        
         | BossingAround wrote:
         | I think that's where the landscape is heading--language
         | frameworks that output YAML on one end, and operators that
         | control YAMLs through the K8s control loop on the other end.
        
         | jen20 wrote:
         | > tool to use Jinja
         | 
          | It's not really that. Jinja is Python; Helm is written in
          | Go and uses one of the Go template packages, which has a
          | passing similarity to Jinja.
         | 
         | The rest of your comment is spot on, of course.
        
       | jimbobimbo wrote:
        | I'm not buying the example of using the operator to figure
        | out things dynamically, especially when detection of the
        | cloud in the example is done by looking at some random labels
        | or other attributes specific to a cloud provider.
        | 
        | This is what values and templates are for: no need to guess
        | where you are deployed, I'll tell you that via values, and
        | the template will make sense of it and adjust how the
        | resources look.
        
       | btown wrote:
       | On top of what the OP mentions, Helm still doesn't have a way to
       | forward logs from its pre/post-install hooks to the system
       | calling helm upgrade (such as a Github Action) - a feature first
       | requested in 2017 and still stuck in RFC stage.
       | 
       | https://github.com/helm/helm/issues/2298
       | 
       | https://github.com/helm/helm/pull/10309
       | 
       | https://github.com/helm/community/pull/301
       | 
        | I can understand moving cautiously, but it's at a point where
        | it almost feels like allowing users to understand what Helm
        | is doing is not a priority for Helm's developers.
        
       | galenmarchetti wrote:
       | points #3 and #4; "user-friendly helm chart creation" and
       | "values.yaml is an antipattern"...I think we're just all stuck in
       | this horrible middle ground between "need static declarative
       | configurations for simplicity of change management/fewest chances
       | to mess it up" and "need dynamic, sometimes even imperative logic
       | for flexibility, configurability, and ease of development"
       | 
       | several commenters have mentioned Cue/Jsonnet/friends as great
       | alternatives, others find them limiting / prefer pulumi with a
       | general purpose language
       | 
       | our solution at kurtosis is another, and tilt.dev took the same
       | route we did...adopt starlark as a balanced middle-ground between
       | general-purpose languages and static configs. you do get the
       | lovely experience of writing in something pythonic, but without
       | the "oops this k8s deployment is not runnable/reproducible in
       | other clusters because I had non-deterministic evaluation /
       | relied on external, non-portable devices"
        
       | ithkuil wrote:
        | A few years ago I tried out an alternative approach to
        | "templating".
       | 
       | Basically the idea starts from a world without templates where
       | you would distribute the k8s YAML in a form that is ready to be
       | directly applied, with whatever sensible defaults you want
       | directly present in the YAML
       | 
        | The user would then just change the values in their copy of
        | the file to suit their needs and apply that.
       | 
        | We all recoil in horror at such a thought, but let's stop a
        | moment to think about why we do:
        | 
        | The user effectively "forked" the YAML by placing their
        | values there, and what a nightmare that would be once the
        | user got a new version of the upstream file, potentially
        | completely overhauled.
       | 
       | If the changes are very small, a simple three way merge like
       | you'd do with git would suffice to handle that. But what about
       | larger changes?
       | 
       | Most of the conflicts in the simple cases stem from the fact that
       | text based diff/merge tools are oblivious to the structure of the
        | YAML file and can only do a so-so job with many of the changes.
       | Unfortunately most people are familiar only with text based merge
       | tools and so they have been primed the hard way to assume that
       | the merges only rarely work.
       | 
        | Structural merges otoh work much, much better. But still, if
        | the upstream refactors the application in a significant way
        | (e.g. changes a deployment into a stateful set or moves
        | pieces of config from a configmap into a secret!) not even a
        | structural merge can save you.
       | 
       | My idea was to bring the manifest author into play and make them
       | "annotate" the pieces of the manifest forest that contain
       | configuration that has a high level meaning to the application
       | and that would be moved around in the YAML forest as it gets
       | reshaped.
       | 
       | Another realization was that often such configuration snippets
       | are deeply embedded in other internal "languages" wrapped inside
       | string fields, subject to escaping and encodings (e.g. base64).
        | E.g. a JSON snippet inside a TOML string value inside a
        | base64-encoded annotation value (if you haven't seen these
        | abominations, I'm so happy for you, you innocent child)
       | 
        | So I implemented a tool that uses nested bidirectional
        | parsers ("lenses") that can perform in-place editing of
        | structured files. The edits preserve formatting, comments,
        | quoting styles, etc.
       | 
        | Even string fields that are normally thought of as just
        | strings are actually better thought of as nested "formats".
        | For example, OCI image references are composed of multiple
        | parts. If you want to just copy images to your private
        | registry and "rebase" all your image references onto the new
        | base, you can do it with an update that understands the
        | format of OCI image references instead of just doing
        | substring replacement.
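
A toy version of such a format-aware update (heuristic and simplified; knot8's lenses are far more general):

```python
def rebase_image(ref: str, new_registry: str) -> str:
    """Rebase an OCI image reference onto a new registry by parsing
    its parts instead of doing substring replacement."""
    # Split off the registry: the first path component counts as a
    # host if it contains "." or ":"; otherwise assume Docker Hub.
    head, _, rest = ref.partition("/")
    if rest and ("." in head or ":" in head):
        repo = rest
    else:
        repo = ref
    return f"{new_registry}/{repo}"

print(rebase_image("quay.io/calico/node:v3.26", "registry.corp.local"))
# registry.corp.local/calico/node:v3.26
print(rebase_image("nginx:1.25", "registry.corp.local"))
# registry.corp.local/nginx:1.25
```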
       | 
       | Knot8 is an opinionated tool meant to help manifest authors and
       | users manage setting/diffing/pulling annotated YAML k8s manifest
       | packages
       | 
       | https://github.com/mkmik/knot8
       | 
        | I didn't have the time to evangelize this approach much, so
        | it didn't get any traction (and perhaps it wouldn't have
        | anyway, because it doesn't have enough merit). But I
        | encourage you to give it a go. It might inspire you.
       | 
       | I also pulled out the "lens" mechanism in a separate binary in
       | case it could be useful to edit general purpose files:
       | 
       | https://github.com/kubecfg/lensed
        
       | nunez wrote:
       | Despite its pitfalls, I've found that Helm is still the best way
       | to distribute an app that's destined for Kubernetes. It's easy to
       | use and understand and provides just enough for me to ship an app
       | along with its dependencies.
       | 
       | I use kapp from Project Carvel and the "helm template" subcommand
       | to work around Helm's inability to control desired state. I've
       | found that kapp does a pretty good job of converging whatever
       | resources Helm installed.
        
       ___________________________________________________________________
       (page generated 2023-12-14 23:00 UTC)