[HN Gopher] Modern CI is too complex and misdirected ___________________________________________________________________ Modern CI is too complex and misdirected Author : zdw Score : 139 points Date : 2021-04-07 17:13 UTC (5 hours ago) (HTM) web link (gregoryszorc.com) (TXT) w3m dump (gregoryszorc.com) | choeger wrote: | The biggest problem with any CI system is that you need an | _execution environment_. Changing this environment should be the | same as changing the code. Docker (or rather podman) has given us | the tools to do this. | | Now if CI systems would allow me to build that container image | myself, I could pretty much guarantee that local build/tests and | CI build/tests _can_ run inside the same environment. I hacked | something like this for GitLab but it's ugly and slow. | | So in conclusion, I think that CI systems should expect container | creation, some execution _inside_ that container, and finally | some reporting or artifact generation from the results. | mlthoughts2018 wrote: | Docker / containers are necessary but not sufficient. For | example, in a machine learning CI/CD system, there could be a | fundamental difference between executing the same step, with | the same code, on CPU hardware vs GPU hardware. | hctaw wrote: | I've spent a ton of time thinking/working on this problem myself. | I came to roughly the same conclusions about a year ago except | for a few things: | | - Build system != CI. Both are in the set of task management. | Centralized task management is in the set of decentralized tasks. | Scheduling centralized tasks is easy, decentralized is very hard. | Rather than equating one to the other, consider how a specific | goal fits into a larger set relative to how tasks are run, where | they run, what information they need, their byproducts, and what | introspection your system needs into those tasks. It's quite | different between builds and CI, especially when you need it | decentralized.
| | - On the market: What's the TAM of git? That's what we're talking | about when it comes to revamping the way we build/test/release | software. | | - There's a perverse incentive in CI today, which is that making | your life easier costs CI platforms revenue. If you want to solve | the business problem of CI, solve this one. | | - There are a number of NP-hard problems in the way of a perfect | solution. Cache invalidation and max cut of a graph come to mind. | | - I don't know how you do any of this without a DAG. Yet, I don't | know how you represent a DAG in a distributed system such that | running tasks through it remains consistent and produces | deterministic results. | | - Failure is the common case. Naive implementations of task | runners assume too much success, and recovery from failure is | crucial to making something that doesn't suck. | gran_colombia wrote: | The complaint is invalid. CI pipelines are build systems. | alexcroox wrote: | That's why I love Buddy (buddy.works): you build your CI | pipelines with a UI, and all the config and logic is easily | configurable without having to know the magic config key/value | combination. Need to add a secrets file or private key? Just add | it to the "filesystem" and it'll be available during the run, no | awkward base64ing of contents into an environment string. | Unfortunately I have to use GitHub Actions/CircleCI for my iOS | deployments still, but I read macOS container support is coming | soon. | lenkite wrote: | I'm so oppressed by YAML being chosen as the configuration language | for mainstream CI systems. How do people manage to live with this? I | always make mistakes - again and again. And I can never keep | anything in my head. It's just not natural. | | Why couldn't they choose a programming language? Restrict what | can be done by all means, but something that has a proper parser, | compiler errors and IDE/editor hinting support would be great.
| | One can even choose an embedded language like Lua for restricted | execution environments. Anything but YAML! | kzrdude wrote: | I guess YAML has the best solution to nested scopes: just | indentation. | cratermoon wrote: | He doesn't mention Concourse, but let me say the complexity there | is a beast. | jacques_chester wrote: | I've heard Concourse team members describe it by analogy to | make, FWIW. | | One thing Concourse does well that I haven't seen replicated is | strictly separating what is stateful and what isn't. That makes | it possible to understand the history of a given resource | without needing a hundred different specialised instruments. | jayd16 wrote: | This makes no sense to me. Modern build systems have reproducible | results based on strict inputs. | | Modern CI/CD handles tasks that are not strictly reproducible. | The continuous aspect also implies it's integrated with source | control. | | I guess I don't understand the post if it's not just semantic word | games based on sufficient use of the word sufficient. | | Maybe the point is to talk about how great Taskcluster is, but the | only thing mentioned is security, and that is handled with user | permissions in GitLab and, I assume, GitHub. Secrets are associated | with project and branch permissions, etc. No other advantage is | mentioned in detail. | | Can someone spell out the point of this article? | mlthoughts2018 wrote: | Many CI systems try to strictly enforce hermetic build | semantics and disallow non-idempotent steps from being | possible. For example, by associating build steps with an exact | source code commit and categorically disallowing a repeat of a | successful step for that commit. | brown9-2 wrote: | Your CI pipeline builds and tests your project, which is the | same thing your build system does, except they are each using | different specifications of how to do that. The author argues | this is a waste.
| | I think by introducing continuous deployment you are changing | the topic from what the author wrote (which strictly referred | to CI). | epage wrote: | Anything in CS can be generalized to its purest, most theoretical | forms. The question is how usable it is and how much work it | takes to get anything done. | | See | | - https://danstroot.com/2018/10/03/hammer-factories/ | | - https://web.archive.org/web/20120427101911/http://jacksonfis... | | Bazel, for example, is tailored to the needs of reproducible | builds and meets its audience where it is at: on the command | line. People want fast iteration time and only occasionally need | "everything" run. | | GitHub Actions is tailored for completeness and meets its audience | where it's at: the PR workflow (generally, a web UI). The web UI | is also needed for visualizing the complexity of completeness. | | I never find myself reproducing my build in my CI, but I do find I | have a similar shape of needs in my CI, just met in a different way. | | Some things more tailored to CI that wouldn't fit within the | design of something like Bazel include: | | - Tracking differences in coverage, performance, binary bloat, | and other "quality" metrics between a target branch and HEAD | | - Posting relevant CI feedback directly on the PR | donatj wrote: | I was really disappointed when GitHub Actions didn't more closely | resemble Drone CI. Drone's configuration largely comes down to a | Docker container and a script to run in said Docker container. | mattacular wrote: | Not familiar with Drone, but just pointing out GitHub Actions | can be used for lots of stuff besides CI/CD. That's just the | most popular and obvious usage. | | I actually prefer being able to grab some existing action | plugins rather than having to write every single build step | into a shell script like with e.g. AWS CodePipeline for every | app. You don't have to use them, though. You could have every | step be just shell commands with GitHub Actions.
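The two styles contrasted above, a reusable action versus plain shell commands, might look like this in a hypothetical GitHub Actions workflow (the file path is the standard location, but the job name and make targets are illustrative):

```yaml
# .github/workflows/ci.yml (illustrative)
name: ci
on: [push]
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      # An off-the-shelf action "plugin"
      - uses: actions/checkout@v2
      # The same kind of work expressed as plain shell commands
      - name: build and test
        run: |
          make build
          make test
```

Nothing stops a workflow from being nearly all `run:` steps, which is roughly the Drone model described above.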
| throwaway823882 wrote: | Drone is the ultimate example of simplicity in CI. Automatic | VCS-provider OAuth, automatic authorization for repos/builds, | native secrets, plugins are containers, server configuration is | just environment variables, SQL for the stateful backend, S3 for | logs, and a YAML file for the pipelines. There's really nothing | else to it. And when you need a feature like dynamic variables, | they support things you're already familiar with, like Bourne | shell parameter expansion. Whoever created it deeply | understands KISS. | | I think the only reason it doesn't take over the entire world | is the licensing. (It's not expensive at all considering what | you're getting, but most companies would rather pay 10x to roll | their own than admit that maybe it's worth paying for software | sometimes.) | salawat wrote: | Ding ding ding. We have a winner. | | All a CI pipeline is is an artifact generator. | | All a CD pipeline is is an artifact shuffler that can kick off | dependent CI jobs. | | The rest is just as the author mentions: Remote Code Execution as | a service. | hinkley wrote: | But what we have is CI tools with integrations to a million | things they probably shouldn't have integrations to. | | Most of the build should be handled by your build scripts. Most | of the deploy should be handled by deploy scripts. What's left | for a CI that 'stays in its lane' is fetching, scheduling, | reporting, and auth. Most of them could stand to be doing a lot | more scheduling and reporting, but all evidence points to them | being too busy being distracted by integrating more addons. | There are plenty of addons that one could write that relate to | reporting (eg, linking commit messages to systems of record), | without trying to get into orchestration that should ultimately | be the domain of the scripts. | | Otherwise, how do you expect people to debug them?
| | I've been thinking lately I could make a lot of good things | happen with a Trac-like tool that also handled CI and stats as | first-class citizens. | jdfellow wrote: | I've been wishing for one of my small projects (3 developers) for | some kind of "proof of tests" tool that would allow a developer | to run tests locally, and add some sort of token to the commit | message assuring that they pass. I could honestly do without a | ton of the remote-execution-as-a-service in my current GitLab CI | setup, and would be happy to run my deployment automation scripts | on my own machine if I could have some assurance of code quality from | the team without CI linters and tests running in the cloud (and | failing for reasons unrelated to the tests themselves half of the | time). | chriswarbo wrote: | Git pre-commit hooks can run your tests, but that's easy to | skip. | | I don't know about a "proof of test" token. Checking such a | token would presumably require some computation involving the | repo contents; but we already _have_ such a thing, it's called | 'running the test suite'. A token could contain information | about branches taken, seeds for any random number generators, | etc., but we usually want test suites to be deterministic (hence | not requiring any token). We could use such a token in | property-based tests, as a seed for the random number | generator; but it would be easier to just use one fixed seed | (or we could use the parent's commit ID). | zomglings wrote: | You are right. Small teams absolutely do not need to execute | code remotely, especially if the cost is having an always-on | job server. | | My team writes test output to our knowledge base: | bugout trap --title "$REPO_NAME tests: $(date -u +%Y%m%d-%H%M)" | --tags $REPO_NAME,test,zomglings,$(git rev-parse HEAD) -- | ./test.sh | | This runs test.sh and reports stdout and stderr to our team | knowledge base with tags that we can use to find information | later on.
| | For example, to find all failed tests for a given repo, we | would perform a search query that looked like this: "#<repo> | #test !#exit:0". | | The knowledge base (and the link to the knowledge base entry) | serve as proof of tests. | | We also use this to keep track of production database | migrations. | neeleshs wrote: | Airflow is a pretty generic DAG execution platform. In fact, some | people have built a CI (and CD) system with Airflow. | | https://engineering.ripple.com/building-ci-cd-with-airflow-g... | | https://airflowsummit.org/sessions/airflow-cicd/ | | etc. | Smaug123 wrote: | "Build Systems a la Carte" is not so much ringing a bell as | shattering it with the force of its dong. | | https://www.microsoft.com/en-us/research/uploads/prod/2018/0... | | To expand, the OP ends with an "ideal world" that sounds to me an | awful lot like someone's put the full expressive power of Build | Systems a la Carte into a programmable platform, accessible by | API. | jacques_chester wrote: | Nitpick: I think you meant to write "gong". | rubiquity wrote: | Call me crazy, but I don't think they did... | godfryd2 wrote: | There is a new CI solution: https://kraken.ci. Workflows are | defined in Starlark/Python. They can be arranged in a DAG. Besides | providing base CI features, it also has features supporting | large-scale testing that are missing in other systems. | throwawayboise wrote: | What is the name for this phenomenon: | | Observation: X is too complex/too time-consuming/too error-prone. | | Reaction: Create X' to automate/simplify X. | | Later: X' is too complex. | temporama1 wrote: | Software Development | stepbeek wrote: | I don't know the name of the fallacy/phenomenon, but it always | reminds me of this xkcd: https://xkcd.com/927/ | blacktriangle wrote: | The alternative to this seems even worse. To deal with the | overcomplexity we get things like Boot and Webpack, which | aren't even really build tools, but tools for building build | tools.
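The "proof of tests" token discussed upthread can at least be sketched: a wrapper runs the suite locally and, on success, emits an HMAC over the tree hash using a team-shared secret, which a server-side hook verifies before accepting the push. Everything here (function names, the secret-distribution scheme) is my own illustration, and chriswarbo's objection still applies: the token proves possession of the secret, not that the tests actually ran.

```python
import hashlib
import hmac

# Assumption: the team shares this secret out of band; anyone holding it
# can mint tokens, which is exactly the trust gap noted upthread.
SECRET = b"team-shared-secret"

def mint_token(tree_hash: str) -> str:
    """Run after the local test suite passes; goes in a commit trailer."""
    return hmac.new(SECRET, tree_hash.encode(), hashlib.sha256).hexdigest()

def verify_token(tree_hash: str, token: str) -> bool:
    """Server-side hook check before accepting the push."""
    return hmac.compare_digest(token, mint_token(tree_hash))

# Example with git's well-known empty-tree hash as the input.
EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"
token = mint_token(EMPTY_TREE)
print(verify_token(EMPTY_TREE, token))
```

Hashing the tree (rather than the commit) ties the token to the exact file contents that were tested, independent of commit metadata.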
| cryptonector wrote: | "tech" | slaymaker1907 wrote: | One big difference in my opinion is that a CI system can (and | should) allow for guarantees about provenance. Ideally code | signing can only be done for your app from the CI server. This | allows users/services to guarantee that the code they are running | has gone through whatever workflows are necessary to be built on | the build server (most importantly that all code has been | reviewed). | | As a principle, I consider any codebase which can be compromised | by corrupting a single person to be inherently vulnerable. | Sometimes this is OK, but this is definitely not OK for critical | systems. Obviously it is better to have a stronger safety factor | and require more groups/individuals to be corrupted, but there | are diminishing returns considering it is assumed said | individuals are already relatively trustworthy. Additionally, it | is really surprising to me how many code platforms make providing | even basic guarantees like this one impossible. | gilbetron wrote: | It is something we're definitely concerned about: | https://www.datadoghq.com/blog/engineering/secure-publicatio... | pydry wrote: | It's weird that people keep building DSLs or YAML-based languages | for build systems. It's not a new thing, either - I remember | using whoops-we-made-it-Turing-complete Ant XML many years ago. | | Build systems inevitably evolve into something Turing complete. | It makes much more sense to implement build functionality as a | library or set of libraries and piggyback off a well-designed | scripting language. | bpodgursky wrote: | This is sorta Gradle's approach. | oblio wrote: | Gradle's approach is commendable, but it's too complicated, | and they built Gradle on top of Groovy. Groovy is not a good | language (and it's also not a good implementation of that not-good | language). | imtringued wrote: | It's good enough for me.
It's unlikely to be "replaced" by | the new hotness because it was created way before the JVM | language craze where there were dozens of competing JVM | languages. Sure, that means it has warts, but at least I | don't have to learn a dozen sets of different warts. | gilbetron wrote: | Joe Beda (k8s/Heptio) made this same point in one of his TGI | Kubernetes videos: https://youtu.be/M_rxPPLG8pU?t=2936 | | I agree 100%. Every time I see "nindent" in YAML code, a part | of my soul turns to dust. | ithkuil wrote: | I wish more people who for some reason are otherwise forced | to use a textual templating system to output YAML would remember | that every JSON object is a valid YAML value, so instead of | fiddling with indentation you just ".toJson" or "| json" or | whatever your syntax is, and you'll get something less | brittle. | | (Or use a structural templating system like jsonnet or ytt.) | theptip wrote: | > Every time I see "nindent" in YAML code, a part of my soul | turns to dust. | | Yup. For this reason it's a real shame to me that Helm won | and became the lingua franca of composable/configurable k8s | manifests. | | The one benefit of writing in static YAML instead of a dynamic | <insert-DSL / language> is that regardless of primary | programming language, everyone can contribute; more complex | systems like ksonnet start exploding in first-use complexity. | gilbetron wrote: | Can just default to something like """ | apiVersion: v1 appName: "blah" | """.FromYaml().Execute() | | or something. | qbasic_forever wrote: | I wouldn't say Helm has won, honestly. The kubectl tool | integrated Kustomize into it and it's sadly way too | underutilized. I think it's just that the first wave of k8s | tutorials that everyone has learned from were all written | when Helm was popular. But now, with some years of real use, | people are less keen on Helm. There are tons of other good | options for config management and templating--I expect to | see it keep changing and improving.
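ithkuil's suggestion in concrete terms: Helm templates ship both `nindent` and `toJson`, and because JSON is a subset of YAML, the second form sidesteps the indentation bookkeeping entirely. A minimal template sketch (the `.Values.podAnnotations` key is a common chart convention, used here for illustration):

```yaml
# Indentation-sensitive: the nindent argument must match the nesting level
metadata:
  annotations:
    {{- toYaml .Values.podAnnotations | nindent 4 }}

# Indentation-insensitive: JSON is valid YAML, so no column math is needed
metadata:
  annotations: {{ toJson .Values.podAnnotations }}
```

The first form breaks silently if the surrounding nesting depth changes and the `4` is not updated; the second form cannot.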
| jahewson wrote: | The problem with arbitrary operations is that they are not | composable. I can't depend on two different libraries if they | do conflicting things during the build process. And I can't | have tooling to identify these conflicts without solving the | halting problem. | throwaway823882 wrote: | I call this the fallacy of apparent simplicity. People think | what they need to do is simple. They start cobbling together | what they think will be a simple solution to a simple problem. | They keep realizing they need more functionality, so they keep | adding to their solution, until just "configuring" something | requires an AI. | carlhjerpe wrote: | What is a well-defined "scripting" language? Lua, Python, Ruby? | | I do agree it'd be nice to have a more general-purpose language | and a lib like you say, but should this lib be implemented in | Rust/C so that people can easily integrate it into their own | language? | | Many unknowns, but a great idea. | oblio wrote: | Tcl. We already have that language and it's been around for | decades, but it's not a cool language. Its community is also | ancient, and it feels like it. | karlicoss wrote: | Literally any real language would be better. Even if I have | to learn a bit of it to write pipelines, at least I'll end up | with some transferable knowledge as a result. | | In comparison, if I learned GitHub Actions syntax, the only | thing I know is... GitHub Actions syntax. Useless and | isolated knowledge, which doesn't even transfer to other | YAML-based systems because each has its own quirks. | orthoxerox wrote: | Kotlin. It has good support for defining DSLs and can | actually type-check your pipeline. | jayd16 wrote: | They probably mean an interpreted language or at least | something distributed as text. But honestly, you could come | up with some JIT scheme for any language, most likely. | tremon wrote: | I'd say it's not about the capabilities of the language, but | the scope of the environment.
You need a language to | orchestrate your builds and tests (which usually means | command execution, variable interpolation, conditional | statements and looping constructs), and you need a language | to interact with your build system (fetching code, storing | and fetching build artifacts, metadata administration). | | Lua would be a good candidate for the latter, but its | standard library is minimal on purpose, and that means a lot | of the functionality would have to be provided by the build | system. Interaction with the shell from Python is needlessly | cumbersome (especially capturing stdout/stderr), so of those | options my preference would be Ruby. Heck, even standard | shell with a system-specific binary to call back to the build | system would work. | oblio wrote: | People hate on it, but do you know what language would be | perfect these days? | | Easy shelling - check. | | Easily embeddable - check. | | Easily sandboxable - check. | | Reasonably rich standard library - check. | | High-level abstractions - check. | | If you're still guessing what language it is, it's Tcl. | Good old Tcl. | | It's just that its syntax is moderately weird and the | documentation available for it is so ancient and creaky | that you can sometimes see mummies through the cracks. | | Tcl would pretty much solve all these pipeline issues, but | it's not a cool language. | | I really wish someone with a ton of money and backing would | create "TypeTcl" on top of Tcl (a la TypeScript and | JavaScript) and market it to hell and back, create brand-new | documentation for it, etc. | imtringued wrote: | I'd rather see a Lua successor reach widespread | adoption. Wren pretty much solved all my gripes with Lua, | but there is no actively maintained Java implementation. | mumblemumble wrote: | How is the Windows support? One of my big needs for any | general-purpose build system is that I can get a single | build that works on both Windows and POSIX. Without using | WSL.
| | That said, you're right, at least at first blush, tcl is | an attractive, if easy to forget, option. | jcranmer wrote: | The way I would categorize build systems (and by extension, a | lot of CI systems) is semi-declarative. That is to say, we can | describe the steps needed to build as a declarative list of | source files, the binaries they end up in, along with some | special overrides (maybe this one file needs special compiler | flags) and custom actions (including the need to generate | files). To some degree, it's recursive: we need to build the | tool to build the generated files we need to compile for the | project. In essence, the build system boils down to computing | some superset of Clang's compilation database format. However, | the steps needed to produce this declarative list are | effectively a Turing-complete combination of the machine's | environment, user's requested configuration, package | maintainers' whims, current position of Jupiter and Saturn in | the sky, etc. | | Now what makes this incredibly complex is that the | configuration step itself is semi-declarative. I may be able to | reduce the configuration to "I need these dependencies", but | the list of dependencies may be platform-dependent (again with | recursion!). Given that configuration is intertwined with the | build system, it makes some amount of sense to combine the two | concepts into one system, but they are two distinct steps and | separating those steps is probably saner. | | To me, it makes the most sense to have the core of the build | system be an existing scripting language in a pure environment | that computes the build database: the only accessible input is | the result of the configuration step, no ability to run other | programs or read files during this process, but the full | control flow of the scripting language is available (Mozilla's | take uses Python, which isn't a bad choice here). 
Instead, the | arbitrary shell execution is shoved into the actual build | actions and the configuration process (but don't actually use | shell scripts here, just something equivalent in power to shell | scripts). | Also, make the computed build database accessible both for | other tools (like compilation-database.json is) and for build | actions to use in their implementations. | humanrebar wrote: | > Build systems inevitably evolve into something Turing | complete. | | CI systems are also generally distributed. You want to build | and test on all target environments before landing a change or | cutting a release! | | What Turing-complete language cleanly models some bits of code | running on one environment and then transitions to other code | running on an entirely different environment? | | Folks tend to go declarative to force environment-portable | configuration. Arguably that's impossible and/or inflexible, | but the pain that drives them there is real. | | If there is a framework or library in a popular scripting | language that does this well, I haven't seen it yet. A lot of | the hate for Jenkinsfile (allegedly a Groovy-based framework!) | is fallout from not abstracting the heterogeneous environment | problem. | pydry wrote: | >What Turing-complete language cleanly models some bits of | code running on one environment and then transitions to other | code running on an entirely different environment? | | Any language that runs in both environments with an | environment abstraction that spans both? | | >Folks tend to go declarative to force environment-portable | configuration. | | Declarative is always better if you can get away with it. | However, it inevitably hamstrings what you can do. In most | declarative build systems some dirty Turing-complete hack | will inevitably need to be shoehorned in to get the system to | do what it's supposed to.
A lot of build systems have tried | to pretend that this won't happen but it always does | eventually once a project grows complex enough. | humanrebar wrote: | > Any language that runs in both environments with an | environment abstraction that spans both? | | Do you have examples? This is harder to do than it would | seem. | | You would need an on-demand environment setup (a virtualenv | and a lockfile?) or a homogeneous environment and some sort | of RPC mechanism (transmit a jar and execute). I expect | either to be possible, though I expect the required | verbosity and rigor to impede significant adoption. | | Basically, I think folks are unrealistic about the ability | to be pithy, readable, and robust at the same time. | [deleted] | jayd16 wrote: | Scripting languages aren't used directly because people want a | declarative format with runtime expansion and pattern matching. | We still don't have a great language for that. We just end up | embedding snippets in some data format. | phtrivier wrote: | Who are the "people" who really want that, are responsible | for a CI build, and are not able to use a full programming | language? | | I used Jenkins Pipeline for a while, with Groovy scripts. I | wish it had been a type-checked language, to avoid failing a | build after 5 minutes because of a typo, but it was working. | | Then, somehow, the powers that be decided we had to rewrite | everything in a declarative pipeline. I still fail to see the | improvement; but doing "build X, build Y, then if Z build W" | is now hard to do. | ryan29 wrote: | People used to hate on Gradle a lot, but it was way better | than dealing with YAML IMO. Add in the ability to write | build scripts in Kotlin and it was looking pretty good | before I started doing less Java. | | I think a CI system using JSON configured via TypeScript | would be neat to see. Basically the same thing as Gradle | via Kotlin, but for a modern container-based (i.e. Docker) | CI system.
| | I can still go back to Gradle builds I wrote 7-8 years ago, | check them out, run them, understand them, etc. That's a | good build system IMO. The only thing it could have done | better was pull down an appropriate JDK, but I think that | was more down to licensing/legal issues than technical ones, | and I bet they could do it today since the IntelliJ IDEs do | that now. | thinkharderdev wrote: | It's funny. If you stick around this business long enough | you see the same cycles repeated over and over again. | When I started in software engineering, builds were done | with Maven and specified using an XML config. If you had | to do anything non-declarative you had to write a plugin | (or write a shell script which called separate tasks and | had some sort of procedural logic based on the outputs). | Then it was Gradle (or SBT for me when I started coding | mostly in Scala), which you could kind of use in a | declarative way for simple stuff but which also allowed you | to just write code for anything custom you needed to do. And | one level up, you went from Jenkins jobs configured | through the UI to Jenkinsfiles. Now I feel like I've come | full circle with various GitOps-based tools. The build | pipeline is now declarative again, and for any sort of | dynamic behavior you need to either hack something together or | write a plugin of some sort for the build system which | you can invoke in a declarative configuration. | gonzo41 wrote: | Is Maven old now....oh uh...gotta get with the cool kids | ryan29 wrote: | It's so true. I used Ant > Maven > Gradle. The thing that | I think is different about modern CI is there's no good, | standard way of adding functionality. So it's almost | never "write a plugin" and always "hack something together". | And none of it's portable between build systems, which are | (almost) all SaaS, so it's like getting the absolute | worst of everything. | | I'll be absolutely shocked if current CI builds still | work in 10 years.
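ryan29's "JSON configured via TypeScript" idea can be sketched in Python under an invented schema: the pipeline is ordinary typed code, so you get editor help and type errors at authoring time, while the runner still consumes a plain declarative artifact. The `Step` shape here is an assumption, not any real CI system's format.

```python
import json
from dataclasses import dataclass, field
from typing import List

# Invented schema: a step runs some commands inside a container image.
@dataclass
class Step:
    name: str
    image: str
    commands: List[str] = field(default_factory=list)

def render_pipeline(steps: List[Step]) -> str:
    """Serialize the typed pipeline into declarative JSON for a runner."""
    return json.dumps({"steps": [vars(s) for s in steps]}, indent=2)

pipeline = [
    Step("test", "python:3.9", ["pip install -e .", "pytest"]),
    Step("build", "docker:20.10", ["docker build -t app ."]),
]
print(render_pipeline(pipeline))
```

A typo like `Step("test", comands=[...])` fails immediately at authoring time instead of five minutes into a CI run, which is the phtrivier complaint above.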
| phtrivier wrote: | I was waiting for Jai to see how the build scripts are | basically written in... Jai itself. | | It seems that Zig [1] already does it. Hoping to try that | someday... | | [1] https://ziglearn.org/chapter-3/ | imtringued wrote: | You can activate type checking in Groovy with | @CompileStatic. It's an all-or-nothing thing though (for | the entire file). | neopointer wrote: | > Build systems inevitably evolve into something Turing | complete. It makes much more sense to implement build | functionality as a library or set of libraries and piggyback | off a well-designed scripting language. | | This is so true. That's why I hate and love Jenkins at the same | time. | reubenmorais wrote: | That's kind of what Bazel does. Skylark is a Python dialect. | pydry wrote: | Why is an entirely new dialect necessary? Why couldn't it | just have been a Python library? | brown9-2 wrote: | To limit what can be done, to make it easier to reason | about: https://docs.bazel.build/versions/master/skylark/language.ht... | qbasic_forever wrote: | It started as that at Google and was a nightmare in the | long run. People would sneak in dependencies on non-hermetic | or non-reproducible behavior all the time. The | classic "this just needs to work, screw it, I'm pulling in | this library to make it happen" problem. It just kept | getting more and more complex to detect and stop those | kinds of issues. Hence a new language with no ability to | skirt around its hermetic and non-Turing nature. | thrower123 wrote: | Things like Rake always made more sense to me - have your build | process defined in a real programming language that you were | actually using for your real product. | | Then again, grunt/gulp was a horrible, horrible shitshow, so | it's not a silver bullet either... | 10ko wrote: | I cannot stop thinking this guy is describing what we do at | https://garden.io.
| | He seems to go on describing the Stack Graph and the | build/test/deploy/task primitives, the unified configuration | between environments, build and test results caching, the | platform agnosticism (even though we are heavily focused on | Kubernetes) and the fact that CI can be just a feature, not a | product in itself. | | One thing I definitely don't agree with is: "The total | addressable market for this idea seems too small for me to see | any major player with the technical know-how to implement and | offer such a service in the next few years." | | We just published the results from an independent survey we | commissioned last week, and one of the things that came out is: | whatever the size of the company, the number of hours teams | spend maintaining these overly complex build systems, CI systems, | preview/dev environments, etc. is enormous, and it is often the | object of the biggest complaints across teams in tech | organizations. | | So yeah, I agree with the complexity bit, but I think the author | is overly positive about the current state of the art, at least | in the cloud-native world. | bob1029 wrote: | We got tired of using external tools that were not well-aligned | with our build/deployment use cases - non-public network | environments. GitHub Actions, et al. cannot touch the target | environments that we deploy our software to. Our customers are | also extremely wary of anything cloud-based, so we had to find an | approach that would work for everyone. | | As a result, we have incorporated build & deployment logic _into_ | our software as a first-class feature. Our applications know how | to go out to source control, grab a specified commit hash, | rebuild themselves in a temporary path, and then copy these | artifacts back to the working directory. After all of this is | completed, our application restarts itself.
Effectively, once our | application is installed to some customer environment, it is like | a self-replicating organism that never needs to be reinstalled | from external binary artifacts. This has very important security | consequences - we build on the same machine the code will execute | on, so there are far fewer middlemen who can inject malicious | code. Our clients can record all network traffic flowing to the | server our software runs on and definitively know 100% of the | information which constitutes the latest build of their | application. | | Our entire solution operates as a single binary executable, so we | can get away with some really crazy bullshit that most developers | cannot these days. Putting your entire app into a single | self-contained binary distribution that runs as a single process | on a single machine has extremely understated upsides. | lstamour wrote: | Sounds like Chrome, minus the build-on-the-customer's-machine | part. Or like Homebrew, sort of. Also sounds like a malware | dropper. That said, it makes sense. I would decouple the | build-on-the-customer's-machine part from the rest; having a CI | system that has to run the same way on every customer's machine | sounds like a bit of a nightmare for reproducibility if a | specific machine has issues. I'd imagine you'd need to ship | your own dependencies and set standards on what version of | Linux, CPU arch and so on you'd support. And even then I'd feel | safer running inside overlays like Docker allows for, or how | Bazel sandboxes on Linux. | | Also reminds me a bit of Istio or Open Policy Agent in that | both are really apps that distribute certificates or policy | data and thus auto-update themselves? | bob1029 wrote: | We use .NET Core + Self-Contained Deployments on Windows | Server 2016+ only. This vastly narrows the scope of weird | bullshit we have to worry about between environments. 
| | The CI system running the same way on everyone's computer is | analogous to MSBuild working the same way on everyone's | computer. This is typically the case due to our platform | constraints. | nohuck13 wrote: | "Bazel has remote execution and remote caching as built-in | features... If I define a build... and then define a server-side | Git push hook so the remote server triggers Bazel to build, run | tests, and post the results somewhere, is that a CI system? I | think it is! A crude one. But I think that qualifies as a CI | system." | | --- | | Absolutely. | | The advisability of rolling your own CI aside, treating CI as | "just another user" has real benefits, and this was a pleasant | surprise for me when using Bazel. When you run the same build | command (say `bazel test //...`) across development and CI, then: | | - you get to debug your build pipeline locally like code | | - the CI DSL/YAML files mostly contain publishing and other | CI-specific information (this feels right) | | - the ability of a new user to pull the repo, build, and have | everything just work is constantly being validated by the CI. | With a bespoke CI environment defined in a Docker image or YAML | file, this is harder. | | - tangentially: the remote execution API [2] is beautiful in its | simplicity; it's doing a simple, core job. | | [1] OTOH: unless you have a vendor-everything monorepo like | Google, integrating with external libraries/package managers is | unnatural; hermetic toolchains are tricky; naively written rules | end up using system-provided utilities that differ by host, | breaking reproducibility, etc. | | [2] https://github.com/bazelbuild/remote-apis/blob/master/build/... | qznc wrote: | How does Bazel deal with different platforms? For example, run | tests on Windows, BSD, Android, Raspberry Pi, RISC-V, or even | custom hardware? 
| nohuck13 wrote: | Bazel differentiates between the "host" environment (your dev | box), the "execution" environment (where the compiler runs), | and the "target" environment (e.g. RISC-V). | | Edit: there's a confusing number of ways of specifying these | things in your build, e.g. old crosstool files, | platforms/constraints, toolchains. A stylized 20,000-foot view | is: | | Each build target specifies two different kinds of inputs: | sources (code, libraries) and "tools" (compilers). A | reproducible build requires fully specifying not just the | sources but all the tools you use to build them. | | Obviously cross-compiling for RISC-V requires different | compiler flags than x86_64. So instead of depending on "gcc" | you'd depend on an abstract "toolchain" target which defines | ways to invoke different version(s) of gcc based on your | host, execution, and target platforms. | | In practice, you wouldn't write toolchains yourself; you'd | depend on existing implementations provided by library code, | e.g. the many, many third-party language rules here: | https://github.com/jin/awesome-bazel#rules | | And you _probably_ wouldn't depend on a specific toolchain in | every single rule; you'd define a global one for your | project. | | "platforms" and "constraints" together let you define more | fine-grained ways different environments differ (OS, CPU, | etc.) to avoid enumerating the combinatorial explosion of build | flavors across different dimensions. | | HTH. Caveat: I have not done cross-compilation in anger. | Someone hopefully will correct me if my understanding is | flawed. | EdSchouten wrote: | Pretty well! You can set up a build cluster that provides | workers for any of these different platforms. Each of these | platforms is identified by a different set of label values. | Then you can run Bazel on your personal system to 'access' | any of those platforms to run your build actions or tests. 
| | In other words: a 'bazel test' on a Linux box can trigger the | execution of tests on a BSD box. | | (Full transparency: I am the author of Buildbarn, one of the | major build cluster implementations for Bazel.) | joshuamorton wrote: | https://docs.bazel.build/versions/master/platforms.html is | probably what you want. | | So you could, conceivably, with Bazel running on your local x86 | machine, run the build on an ARM (rpi) build farm, | cross-compiling for RISC-V. | | I presume that this specific toolchain isn't well supported | today. | lstamour wrote: | The reason this isn't a concern is that Bazel tries very | hard not to let any system libraries or configurations | interfere with the build, at all, ever. So it should rarely | matter what platform you're running a build on; the goal | should be the same output every time from every platform. | | Linux is recommended, or a system that can run Docker and | thus Linux. From there it depends on the test or build step. | I haven't done much distributed Bazel building or test runs | yet myself. I imagine you can speak to other OSes using QEMU | or the network if speed isn't a concern. You can often build | for other operating systems without running them natively, | using a cross-compiling toolchain. | | That said, Bazel is portable - it needs Java and Bash and is | generally portable to platforms that have both, though I | haven't checked recently. There are exceptions, though, and | it will run natively on Windows, just not as easily. | https://docs.bazel.build/versions/master/windows.html | It also works on Mac, but it's missing Linux disk sandboxing | features and makes up for it using weird paths and so on. | qbasic_forever wrote: | Why even have a difference between production and CI in the first | place? I see a future where it's Kubernetes all the way down. My | CI system is just a different production-like environment. 
The | same job to deploy code to production is used to deploy code to | CI. The same deployment that gets a pod running in prod will get | a pod running in test under CI. Your prod system handles events | from end-users, your CI system handles events from your dev | environment and source repo. Everything is consistent and the | same. Why should I rely on some bespoke CI system to re-invent | everything that my production system already has to do? | sly010 wrote: | Re: correct language/abstraction | | At the highest level you want a purely functional DSL with no | side effects. Preferably one that catches dependency cycles so it | provably halts. | | On the lowest level, however, all your primitives are Unix | commands that are all about side effects. Yet, you want them to | be reproducible, or at least idempotent, so you can wrap them in | the high-level DSL. | | So you really need to separate those two worlds, and create some | sort of "runtime" for the low-level 'actions' to curb the side | effects. | | * Even in the case of Bazel, you have separate .bzl and BUILD | files. | | * In the case of Nix, you have nix files and you have the | final derivation (a giant S-expression). | | * In the case of CI systems and GitHub Actions, you have the | "actions" and the "GUI". | | Re: CI vs build system, I guess the difference is that build | systems focus on artifacts, while CI systems also focus on side | effects. That said, there are Bazel packages to push Docker | images, so it's certainly a very blurry line. | throwaway894345 wrote: | > Re: CI vs build system, I guess the difference is that build | systems focus on artifacts, while CI systems also focus on side | effects. That said, there are Bazel packages to push Docker | images, so it's certainly a very blurry line. 
| | I think the CI and build system have basically the same goals, | but they're approaching the problem from different directions, | or perhaps it's more accurate to say that "CI" is more | imperative while build systems are more declarative. I really | want a world with a better Nix or Bazel. I think purely | functional builds are always going to be more difficult than | throwing everything in a big side-effect-y container, but I | don't think they have to be Bazel/Nix-hard. | sly010 wrote: | Out of curiosity, what's hard about Bazel? | | From my experience the main issue is interoperability with | third-party build systems, i.e. using a CMake library that | was not manually bazeled by someone. | simonw wrote: | It genuinely hadn't crossed my mind that a CI system and a build | system were different things - maybe because I usually work in | dynamic rather than compiled languages? | | I've used Jenkins, Circle CI, GitLab and GitHub Actions, and I've | always considered them to be "remote code execution in response | to triggers relating to my coding workflow" systems, which I | think covers both build and CI. | jrockway wrote: | I think that modern CI is actually too simple. They all boil down | to "get me a Linux box and run a shell script". You can do | anything with that, and there are a million different ways to do | everything you could possibly want. But it's easy to implement, | and every feature request can be answered with "oh, well just | apt-get install foobarbaz3 and run quuxblob to do that." | | A "too complex" system would deeply integrate with every part of | your application: the build system, the test runner, and the | dependencies would all be aware of CI and integrate. 
That system | is what people actually want ("run these browser tests against my | Go backend and Postgres database, and if they pass, send the | exact binaries that passed the tests to production"), but they have | to cobble it together with shell scripts, third-party addons, | blood, sweat, and tears. | | I think we're still in the dark ages, which is where the pain | comes from. | stingraycharles wrote: | I agree with you in principle, but I have learned to accept | that this only works for 80% of the functionality. Maybe this | works for a simple Django or NodeJS project, but in any large | production system there is a gray area of "messy shit" you | need, and a CI system being able to cater to these problems is | a good thing. | | Dockerizing things is a step in the right direction, at least | from the perspective of reproducibility, but what if you are | targeting many different OSes / architectures? At QuasarDB we | target Windows, Linux, FreeBSD, OSX, and all that on ARM | architecture as well. Then we need to be able to set up and | tear down whole clusters of instances, reproduce certain | scenarios, and whatnot. | | You can make this stuff easier by writing a lot of supporting | code to manage this, including shell scripts, but to make it an | integrated part of CI? I think not. | cogman10 wrote: | While it comes up, I think it's more of a rare problem. So | much stuff is "x86 Linux" or in rare cases "ARM Linux" that | it doesn't often make sense to have a cross-platform CI | system. | | Obviously a DB is a counterexample. So is Node, or a | compiler. | | But at least from my experience, a huge number of apps are | simply REST/CRUD targeting a homogeneous architecture. | viraptor wrote: | Unless we're talking proprietary software deployed to only | one environment, or something really trivial, it's still | totally worth testing other environments / architectures. 
| | You'll find dependency compilation issues, path case | issues, reserved name usage, assumptions about filesystem | layout, etc. which break the code outside of Linux x86. | Already__Taken wrote: | Hard agree. I've been using GitLab CI/CD for a long time now. I | almost want to say it's been around longer than, or as long as, | Docker? | | It has a weird duality of running as Docker images, but also | really doesn't understand how to use container images IN the | process. Why don't volumes just map 1:1 to artifacts and | caching, why isn't it always caching image layers to make | things super fast, etc.? | sytse wrote: | "I almost want to say it's been around longer than, or as long | as, Docker?" | | I had to look it up, but GitLab CI has been around longer than | Docker! Docker was released as open-source in March 2013. | GitLab CI was first released in 2012. | ryan29 wrote: | I don't know if I'd say they're too simple. I think they're too | simple in some ways and too complex in others. For me, I think | a ton of unnecessary complexity comes from isolating per build | step rather than per pipeline, especially when you're trying to | build containers. | | Compare a GitLab CI build with Gradle. In Gradle, you declare | inputs and outputs for each task (step) and they chain together | seamlessly. You can write a task that has a very specific role, | and you don't find yourself fighting the build system to deal | with the inputs / outputs you need. For containers, an image is | the output of `docker build` and the input for `docker tag`, | etc. Replicating this should be the absolute minimum for a CI | system to be considered usable, IMO. | | If you want a more concrete example, look at building a Docker | container on your local machine vs a CI system. 
If you do it on | your local machine using the Docker daemon, you'll do something | like this: | | - docker build (creates image as output) | | - docker tag (uses image as input) | | - docker push (uses image/tag as input) | | What do you get when you try to put that into modern CI? | | - build-tag-push | | Everything gets dumped into a single step because the build | systems are (IMO) designed wrong, at least for anyone that | wants to build containers. They should be isolated, or at least | give you the option to be isolated, per pipeline, not per build | step. | | For building containers it's much easier, at least for me, to | work with the concept of having a dedicated Docker daemon for | an entire pipeline. Drone is flexible enough to mock something | like that out. I did it a while back [1] and really, really | liked it compared to anything else I've seen. | | The biggest appeal was that it allows much better local | iteration. I had the option of: | | - Use `docker build` like normal for quick iteration when | updating a Dockerfile. This takes advantage of all local | caching and is very simple to get started with. | | - Use `drone exec --env .drone-local.env ...` to run the whole | Drone pipeline, but bound (proxied actually) to the local | Docker daemon. This also takes advantage of local Docker caches | and is very quick while being a good approximation of the build | server. | | - Use `drone exec` to run the whole Drone pipeline locally, but | using docker-in-docker. This is slower and has no caching, but | is virtually identical to the build that will run on the CI | runner. | | That's not an officially supported method of building | containers, so don't use it, but I like it more than trying to | jam build-tag-push into a single step. Plus I don't have to | push a bunch of broken Dockerfile changes to the CI runner as | I'm developing / debugging. 
| | I guess the biggest thing that shocks me with modern CI is | people's willingness to push/pull images to/from registries | during the build process. You can literally wait 5 minutes for | a build that would take 15 seconds locally. It's crazy. | | 1. https://discourse.drone.io/t/use-buildx-for-native-docker-bu... | noir_lord wrote: | Pretty much. | | Docker in my experience is the same way: people see Docker as | the new hotness, then treat it like a Linux box with a shell | script (though at least with the benefit you can shoot it in | the head). | | One of the other teams had an issue with reproducibility on | something they were doing, so I suggested that they use a | multistage build in Docker and export the result out as an | artefact they could deploy. They looked at me like I'd grown a | second head, yet they have been using Docker twice as long as | me, though I've been using Linux for longer than all of them | combined. | | It's a strange way to solve problems all around when you think | about what it's actually doing. | | It also feels like people adopt tools and cobble shit together | from Google/SO; what happened to RTFM? | | If I'm going to use a technology I haven't used before, the | first thing I do is go read the actual documentation - I won't | understand it all on the first pass, but it gives me an | "index"/outline I can use when I do run into problems. If I'm | looking at adopting a technology I google "the problem with | foobar", not success stories; I want to know the warts, not the | gloss. | | It's the same with books. I'd say 3/4 of the devs I work | with don't buy programming books, like at all. 
| | It's all cobbled-together knowledge from blog posts. That's | fine, but a cohesive book with a good editor is nearly always | going to give you a better understanding than piecemeal bits | from around the net. That's not to say specific blog posts | aren't useful, but the return on investment on a book is higher | (for me; for the youngsters, they might do better learning from | TikTok, I don't know...). | pbrb wrote: | I personally love it. RTFM is pretty much the basis of my | career. I always at minimum skim the documentation (the | entire doc) so I have an index of where to look. It's a great | jumping-off point if you do need to google anything. | | Books are the same. When learning a new language, for example, | I get a book that covers the language itself (best practices, | common design patterns, etc.), not how to write a for loop. It | seems to be an incredibly effective way to learn. Most | importantly, it cuts down on reading the same information | regurgitated over and over across multiple blogs. | cogman10 wrote: | lol... yeah.... I've become "the expert" on so much shit | just because of RTFM :D | | It's amazing how much stuff is spelt out in manuals that | nobody bothers to read. | | The only issue is that so few people RTFM that some manuals | are pure garbage to try and glean anything useful from. In | those cases, the best route is often to just read the | implementation (though that is tedious). | [deleted] | dkarl wrote: | > treat it like a Linux box with a shell script (though at | least with the benefit you can shoot it in the head) | | To be fair, that by itself is a game-changer, even if it | doesn't take full advantage of Docker. | toomanyducks wrote: | Any CI books you can recommend? I have been completely | treating it as a Linux box with a shell script and have | cobbled together all of my shit from Google/SO. | daniellarusso wrote: | I am very similar. 
I read through the documentation the first | time to get a 'lay of the land', so I can deep-dive into the | various sections as I require. | reubenmorais wrote: | You're vehemently agreeing with the author, as far as I can | see. The example you described is exactly what you could | do/automate with the "10 years skip ahead" part at the end. You | can already do it today locally with Bazel if you're lucky to | have all your dependencies usable there. | twoslide wrote: | Am I too old-fashioned in thinking it's good to define an acronym | the first time it's used? I think many well-educated readers | wouldn't know CI. | mmphosis wrote: | Too Many Acronyms (TMA) | Continuous Integration (CI) | Continuous Delivery or Deployment (CD) | Domain-Specific Language (DSL) | Structured Query Language (SQL) | YAML Ain't Markup Language (YAML) | Windows Subsystem for Linux (WSL) | Portable Operating System Interface (POSIX) | Berkeley Software Distribution (BSD) | Reduced Instruction Set Computer (RISC) | Central Processing Unit (CPU) | Operating System (OS) | Quick EMUlator (QEMU) | Total Addressable Market (TAM) | Directed Acyclic Graph (DAG) | Computer Science (CS) | User Interface (UI) | PR? workflow (Pull Request, Purchase Request, Public Relations, | Puerto Rico) | GitHub Actions (GHA) | GitLab? or Graphics Library (GL) | Facebook Apple Amazon Netflix Google (FAANG) | | The only acronym I care about is B: alias | b='code build' | tremon wrote: | If you're in software development and haven't heard about | Continuous Integration, you're definitely old-fashioned. | Although fashion moves fast and breaks things here. | iudqnolq wrote: | I was thinking the other day that Jetpack Compose or some other | reactive framework might be the perfect build system. It has | declarativity by default, sane wrapping of IO, and excellent | composability (just define a function!), plus of course a real | language. 
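A recurring idea in this thread - build tasks as plain functions in a real language, with declared inputs and cached results - can be sketched in a few lines of Python. All names below (Graph, task, compile_src, link) are hypothetical illustrations, not the API of any real framework:

```python
# Sketch of "build system as a library in a real language":
# tasks are plain functions whose results are keyed by a digest of
# their inputs, so repeated calls with unchanged inputs skip execution.
import hashlib

class Graph:
    def __init__(self):
        self.cache = {}   # (task name, input digest) -> cached result
        self.runs = []    # tasks that actually executed

    def task(self, fn):
        """Wrap a function so identical inputs hit the cache."""
        def wrapper(*args):
            digest = hashlib.sha256(repr(args).encode()).hexdigest()
            key = (fn.__name__, digest)
            if key not in self.cache:
                self.runs.append(fn.__name__)
                self.cache[key] = fn(*args)
            return self.cache[key]
        return wrapper

g = Graph()

@g.task
def compile_src(src):
    return f"obj({src})"

@g.task
def link(*objs):
    return "bin[" + ",".join(objs) + "]"

first = link(compile_src("a.c"), compile_src("b.c"))
second = link(compile_src("a.c"), compile_src("b.c"))  # all cache hits
print(first)   # bin[obj(a.c),obj(b.c)]
print(g.runs)  # each task ran exactly once per distinct input
```

Real build systems key their caches on content hashes of files and toolchains rather than argument strings, but the shape - pure tasks plus a memoizing runtime - is roughly what Bazel's action cache and Gradle's up-to-date checks build on.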
| tauntz wrote: | I built a CI service many moons ago (it was https://greenhouseci.com | back then) on the premise that _most_ companies don't actually | require a hugely complicated setup and don't actually need all | the flexibility / complexity that most CI systems have. | | After talking to tens and tens of companies with apparently | complex CI requirements, I still stand by that assertion. When | drilling down into _why_ they say they need all that | configurability, it's sometimes as easy as "but FAANG is doing X, | so my 5-person team needs it as well". | Znafon wrote: | Is this still working? The certificate is expired. | tauntz wrote: | I parted ways with my child many years ago. The product | itself is still around and has gone through 2 rebrands and | has evolved a lot. It's now at home under https://codemagic.io | mlthoughts2018 wrote: | Isn't Drone also an example of the author's ideal solution? | chatmasta wrote: | There is a lot of valid criticism here, but the suggestion that | "modern CI" is to blame is very much throwing the baby out with | the bathwater. The GitHub Actions / GitLab CI feature list is | immense, and you can configure all sorts of wild things with it, | but you don't _have to_. At our company our `gitlab-ci.yml` is a | few lines of YAML that ultimately calls a single script for each | stage. We put all our own building/caching/incremental logic in | that script, and just let the CI system call it for us. As a nice | bonus that means we're not vendor-locked, as our "CI system" is | really just a "build system" that happens to run mostly on remote | computers in response to new commits. | | It's not _exactly_ the same as the local build system, because | development requirements and constraints are often distinct from | staging/prod build requirements, and each CI platform has subtle | differences with regard to caching, Docker registries, etc. 
(In our case, we | rely a lot on Makefiles, Docker BuildKit, and custom tar contexts | for each image). | | Regarding GitHub Actions in particular, I've always found it | annoyingly complex. I don't like having to develop new mental | models around proprietary abstractions to learn how to do | something I can do on my own machine with a few lines of bash. I | always dread setting up a new GHA workflow because it means I | need to go grok that documentation again. | | Leaning heavily on GHA / GitLab CI can be advantageous for a small | project that is using standardized, cookie-cutter approaches, | e.g. your typical JS project that doesn't do anything weird and | just uses basic npm build + test + publish. In that case, using | GHA can save you time because there is likely a preconfigured | workflow that works exactly for your use case. But as soon as | you're doing something slightly different from the standard | model, relying on the cookie-cutter workflows becomes inhibiting | and you're better off shoving everything into a few scripts. Use | the workflows where they integrate tightly with something on the | platform (e.g. uploading an artifact to GitHub, or | vendor-specific branch caching logic), but otherwise prefer your | own scripts that can run just as well on your laptop as in CI. To | be honest, I've even started avoiding the vendor caching logic in | favor of using a single-layer Docker image as an ad-hoc FS cache. | thinkharderdev wrote: | I've always had a pipe dream of building a CI system on top of | something like Airflow. A modern CD pipeline is just a | long-running workflow, so it would be great to treat it just like | any other data pipeline. ___________________________________________________________________ (page generated 2021-04-07 23:01 UTC)