[HN Gopher] Typed Config Languages
___________________________________________________________________
Typed Config Languages

Author : nalgeon
Score  : 52 points
Date   : 2022-01-20 06:36 UTC (1 days ago)

(HTM) web link (kevincox.ca)
(TXT) w3m dump (kevincox.ca)

| milliams wrote:
| Can anyone else read that white text on a light grey background?

| kevincox wrote:
| Author: Oops, I fixed the inline code snippets but accidentally
| broke the blocks. I'll push out a fix shortly.
|
| Edit: Should be fixed (might need a hard reload). It turns out
| inverting something twice doesn't really do much.

| Zababa wrote:
| The white and orange/brown are hard to read for me. Both have a
| contrast ratio below 1.5 in the Chrome dev tools. To continue with
| the nitpicks, I think there's a typo: the procedural macros have
| "drive" instead of "derive". And the code boxes taking up the full
| page width, compared to the "terminal" text, make it a bit hard to
| "get" the flow of the page.
|
| Other than that it was a nice read, and the grey background is
| nice and easy on the eyes. I really like the breadcrumbs too; they
| act as a minimalist menu and an easily clickable URL.

| kevincox wrote:
| Thanks for the feedback. I think now that the bug with the syntax
| highlighting is fixed the accessibility should be OK. I need to
| find a better way to test, though, because I am using the "invert"
| hack for the light theme and the built-in accessibility checkers
| in Chromium and Firefox don't appear to take this into
| consideration. I'll need to find a basic calculator and do the
| math manually when I have time.
|
| Typo fixed.
|
| Thanks for the flow feedback. I thought the full-width code was
| cool, and it can help with wide code without making the page text
| too wide (it allows the code to grow to the right), but maybe it
| is more confusing than it is worth. And thanks for the feedback on
| the background and breadcrumbs; it is good to hear both positive
| and negative thoughts.

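The manual contrast math mentioned above can be sketched in a few lines. This is only a minimal sketch: it assumes the WCAG 2.x definitions of relative luminance and contrast ratio, and it assumes the "invert" hack maps each sRGB channel c to 255 - c. The function names are illustrative and no colors from the article's stylesheet are used.

```python
def _linear(channel: int) -> float:
    """Convert one 8-bit sRGB channel to linear light (WCAG 2.x formula)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def _rgb(hex_color: str) -> tuple[int, int, int]:
    """Parse '#rrggbb' into three integers in 0..255."""
    h = hex_color.lstrip("#")
    return int(h[0:2], 16), int(h[2:4], 16), int(h[4:6], 16)


def relative_luminance(hex_color: str) -> float:
    """Weighted sum of the linearized red, green, and blue channels."""
    r, g, b = _rgb(hex_color)
    return 0.2126 * _linear(r) + 0.7152 * _linear(g) + 0.0722 * _linear(b)


def contrast_ratio(color_a: str, color_b: str) -> float:
    """WCAG contrast ratio: 1.0 for identical colors, about 21 for black on white."""
    la, lb = relative_luminance(color_a), relative_luminance(color_b)
    lighter, darker = max(la, lb), min(la, lb)
    return (lighter + 0.05) / (darker + 0.05)


def invert(hex_color: str) -> str:
    """Assumed model of the 'invert' hack: each channel c becomes 255 - c."""
    r, g, b = _rgb(hex_color)
    return f"#{255 - r:02x}{255 - g:02x}{255 - b:02x}"
```

Under these assumptions, checking a light-theme color pair would mean passing both colors through `invert` first and then calling `contrast_ratio` on the results.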
| Zababa wrote:
| Thanks, it's way easier to read now. The flow feedback is very
| personal, feel free to ignore it. After rereading the article, I
| like how unique it is.

| Cyberdog wrote:
| Looks like nobody has mentioned TOML yet, so I will. It's basically
| INI syntax with stricter rules and can be used as such if you wish,
| but it also supports some extra features like arrays,
| dictionaries/tables, timestamp values, nested sections, and so on.
| It's a joy to use in the rare times that I'm able to use it.
|
| https://toml.io/en/

| kstrauser wrote:
| TOML's great, although I'm not in love with the syntax for nested
| sections.

| jer0me wrote:
| Tom's Obvious, Minimal Language, as in Tom Preston-Werner of GitHub

| AaronFriel wrote:
| I've increasingly found myself drawn to writing declarations as
| programs, not as config in any plain textual format.
|
| That can mean going all the way to general-purpose languages, as
| Pulumi does, or using a niche but still Turing-complete language
| like Nix's expression language, or even Dhall, which other
| commenters have mentioned. That isn't to say there is no place for
| a simple, human-readable schema. I think these tools need a
| fallback to something simpler.
|
| Nothing these tools do couldn't be emulated by the manual process
| of hand-writing complex makefiles or YAML or whatever, but what is
| striking to me is the use of general-purpose languages, usually
| with tooling for type-checking and IDE assistance, which lowers
| the barrier to entry and empowers someone who "just knows
| (Python|JavaScript|Go|...)" to contribute in a familiar
| environment.
|
| A couple more examples:
|
| Envoy: I recently had the displeasure of writing a configuration
| for the Envoy load balancer by hand. Envoy uses a "typed config
| language", which is often represented as YAML or JSON, but it's
| very painful to write by hand.
| On the other hand, if you have the protocol buffers/gRPC schema
| available in code, it's vastly less painful to use any programming
| language to build the typed objects and then export to plain text.
| The xDS protocol is designed to be interacted with via programs,
| not plaintext.
|
| GitLab CI: GitLab supports writing a program, which runs as part
| of the CI/CD job, to generate the configuration for a subsequent
| pipeline. This makes writing complex jobs or repetitive monorepo
| tasks much simpler. A 10-line Python program that effectively does
| "for each folder in `python`, emit this block of YAML" is
| incredibly powerful.
|
| That last example is salient to me: markup languages are really
| easy to parse, but challenging to read when they're dynamic.
| Wouldn't it be nice to be able to mix and match? YAML/JSON where
| it makes sense and the intent and meaning are self-evident, and
| code where you need dynamism?
|
| Full disclosure: I work for Pulumi, opinions are my own, etc.

| jandrese wrote:
| In the past it was fairly common to write configuration files in
| TCL and just run the script to parse them.
|
| This was generally considered to be a mistake. It opens up a huge
| threat surface for your program, and few people ended up using the
| more advanced capabilities it created. If your config format
| supports simple globbing, that covers the 95% use case for
| otherwise needing a Turing-complete language for your
| configuration files.

| blacksqr wrote:
| > This was generally considered to be a mistake.
|
| Pity. Tcl allows you to create safe interpreters within which you
| can disable any commands you want in order to have a trustworthy
| environment for running configuration scripts.
|
| Tcl itself uses them to build its internal list of available
| package modules.

| AaronFriel wrote:
| I think that speaks to having that flexibility when needed. The
| other 5% of use cases are significant too.

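The GitLab CI idea described earlier in this subthread ("for each folder in `python`, emit this block of YAML") could look roughly like the sketch below. Everything specific in it is an assumption made for illustration: the job names, the `image` and `script` keys, the image tag, and the folder layout are hypothetical and not taken from any real pipeline. To stay within the standard library it emits JSON, which YAML 1.2 is designed to accept.

```python
import json
from pathlib import Path


def list_folders(root: str) -> list[str]:
    """Names of the immediate sub-folders of `root`, sorted for stable output."""
    return sorted(p.name for p in Path(root).iterdir() if p.is_dir())


def build_jobs(root: str, folders: list[str]) -> dict:
    """One hypothetical test job per folder; the keys mimic a CI job block."""
    return {
        f"test-{name}": {
            "image": "python:3.11",  # illustrative image name
            "script": [f"cd {root}/{name}", "python -m pytest"],
        }
        for name in folders
    }


def render(jobs: dict) -> str:
    """Serialize the generated jobs as indented JSON text."""
    return json.dumps(jobs, indent=2)
```

A generator step would then call something like `print(render(build_jobs("python", list_folders("python"))))` and hand the output to the next pipeline stage. The repetitive part lives in a loop, while each emitted block stays plain data.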
| jandrese wrote:
| The other 5% is generally solved by people who write scripts to
| generate the config files. It's not like they're up a creek.

| kaba0 wrote:
| That's why I think Dhall is really interesting: it is purposefully
| not Turing-complete, yet still very flexible.

| tromp wrote:
| Dhall [1] [2] looks promising to me.
|
| [1] http://dhall-lang.org/
|
| [2] https://github.com/dhall-lang/dhall-lang

| corysama wrote:
| Does anyone know if using Dhall to generate JSON that is then
| validated using Cue is a sensible idea? I don't know enough about
| them to be sure.

| kevinmgranger wrote:
| Dhall serves as the validation layer itself.
|
| Unless you're already consuming some other API that publishes Cue
| validation, in which case Dhall is just the templating language.

| vzaliva wrote:
| I am using a Cerberus schema to validate my YAML configs:
|
| https://docs.python-cerberus.org/en/stable/schemas.html

| msoad wrote:
| Like it or not, YAML configuration files are everywhere. I've had
| a lot of luck using JSON Schema with YAML config files. Luckily VS
| Code and possibly many other editors can be configured to provide
| type hints for completion. Using any available JSON Schema checker
| you can validate your config files in CI and elsewhere.
|
| Most of the time, what's missing is an accurate JSON Schema for a
| configuration. I usually encourage owners of those configurations
| to sit down and write it for everyone to benefit from.

| pietroppeter wrote:
| I like the approach of strictyaml: a parser that concentrates on a
| restricted subset of YAML and lets you use a schema to get a
| type-safe validator.
|
| https://github.com/crdoconnor/strictyaml

| nicoburns wrote:
| If we're on the topic of config languages, I'd like to plug Gura
| (https://github.com/gura-conf/gura). It's not too well-known, but
| it probably has the best design I've seen, and seems to have good
| coverage of languages with an available library.

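Several comments above describe checking an already-parsed config against a schema. The sketch below illustrates that general idea with a tiny hand-rolled validator. It is not the API of Cerberus, JSON Schema, strictyaml, or Cue; the schema notation (a dict for mappings, a one-element list for "list of", a Python type for leaves) is invented here purely for illustration.

```python
def validate(value, schema, path="$"):
    """Return a list of error strings; an empty list means the value matches."""
    if isinstance(schema, dict):
        if not isinstance(value, dict):
            return [f"{path}: expected mapping, got {type(value).__name__}"]
        errors = []
        for key, sub_schema in schema.items():
            if key not in value:
                errors.append(f"{path}.{key}: missing")
            else:
                errors.extend(validate(value[key], sub_schema, f"{path}.{key}"))
        return errors
    if isinstance(schema, list):  # [item_schema] means "list of item_schema"
        if not isinstance(value, list):
            return [f"{path}: expected list, got {type(value).__name__}"]
        errors = []
        for i, item in enumerate(value):
            errors.extend(validate(item, schema[0], f"{path}[{i}]"))
        return errors
    # Leaf: the schema is a Python type. bool is a subclass of int in Python,
    # so compare types exactly instead of using isinstance.
    if type(value) is not schema:
        return [f"{path}: expected {schema.__name__}, got {type(value).__name__}"]
    return []


SCHEMA = {"countries": [str], "port": int}

# What a YAML 1.1 style parser would hand back for an unquoted `no` entry.
parsed = {"countries": ["ca", False, "us"], "port": 8080}
errors = validate(parsed, SCHEMA)
```

Here `errors` holds a single entry pointing at `$.countries[1]`, which is the kind of early, located feedback the schema-based tools mentioned above aim to provide in CI.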
| milliams wrote:
| YAML's parsing of `no` as `False` has not been part of the spec
| for 13 years now. It was changed in YAML 1.2 in 2009 to only be
| `true` and `false` (with variations in case allowed I think).

| kbd wrote:
| As has come up in this thread already, any discussion of typed
| config languages nowadays that doesn't mention Cue
| (https://cuelang.org/) seems incomplete. They really seem to be
| tackling the problem in a thorough way. I hope it catches on.
|
| For anyone who knows more about Cue: right now you can go from
| Cue<->yaml (in fact, their docs on yaml also use the "no" case as
| an example: https://cuelang.org/docs/integrations/yaml/) to
| integrate with existing systems, but I suppose eventually the goal
| would be to have direct support in libraries like Serde?

| kevincox wrote:
| (disclaimer: author)
|
| Cue is a very cool language, but it is quite different from the
| "typed config language" that I have described here. Maybe I picked
| a poor title, but in the post I am talking about using the type
| information to "improve" parsing. IIUC Cue does not do this: it
| parses in a "dynamically typed" manner, then uses the type system
| to evaluate the Turing-complete (or close to it) expression
| language.

| kbd wrote:
| Yeah, that's why I was asking about Cue's eventual goals with
| libraries like Serde. I assume eventually they'd like to be able
| to auto-generate type definitions for a target language, but I
| don't know.
|
| > Cue does not do this: it parses in a "dynamically typed" manner,
| then uses the type system to evaluate the Turing-complete (or
| close to it) expression language.
|
| As I understand it, Cue would help in two ways currently. 1. It
| would be able to type-check existing yaml files to catch things
| like the "no" case. 2. If you write your config in Cue, it would
| output properly-typed yaml to avoid things like "no".

| kevincox wrote:
| Yes. I agree.
| It would "prevent" the "no" case by returning an error on
| parse/evaluation. However, the solution described here can do
| better: it can correctly parse the "no" case. Basically, by
| knowing it is parsing a string the grammar can be simpler; it
| doesn't have to decide whether it is an int/bool/string anymore.

| yegle wrote:
| Re the first note in the post: a good serialization format is both
| easy for machines to read and easy for humans to read and write. I
| think the text protobuf format is one such example. A (human
| read/write-able) config language needs to be consumed by a program
| anyway; in a sense a config language is a human-to-computer
| serialization format.

| [deleted]

| usrbinbash wrote:
| > Statically typed programming languages are catching on so why
| don't we extend this typing to our config files?
|
| Because I simply don't want to expend the same amount of cognitive
| load to read config files as I do for code.
|
| Yes, yaml has some minor ambiguities. These are easily solved. To
| use the example from the article:
|
|     countries:
|       - ca
|       - "no"
|       - us
|
| There, done. The problem was solved with 2 extra characters and
| remembering the fact that `no` is special in yaml. Comparing that
| to the amount of typing I have to do to define a schema, the
| syntax of which I have to learn, and which I also have to read or
| remember and keep in mind every time I read the config, I'll take
| the 2 extra double-quotes.
|
| And, speaking of statically typed languages: This problem would be
| caught immediately anyway if the config is read into static types.

| andrewzah wrote:
| "There, done."
|
| Except, we're not done.
|
| YAML has multiple footguns like this, which I have to remember,
| forever. And so does anyone who works with YAML. It's unintuitive
| and confusing, and costs space in my brain that I really should be
| using for more important things.
|
| Not to mention that if you -don't- know about these ahead of time,
| debugging them can be confusing.
|
| A type system is marginally more work in exchange for decreased
| cognitive load and eliminating stupid, idiotic bugs that nobody
| should have to waste their time tracking down.
|
| With IDE integration, the cost is pretty much negligible other
| than learning the syntax, which, c'mon, is not difficult and we're
| being paid to do it.
|
| There are even tools like Dhall [0] that auto-generate yaml for
| us.
|
| [0]: https://github.com/dhall-lang/dhall-lang

| TrainedMonkey wrote:
| Would most of the footguns be solved by quoting all of the
| strings? e.g.:
|
|     "countries":
|       - "ca"
|       - "no"
|       - "us"

| kevincox wrote:
| Yes, but now you are losing a lot of the clean syntax that causes
| most people to use YAML in the first place. There is a reason that
| most people don't write YAML like JSON with trailing commas and
| comments: it is nice to cut most of this noise.

| meowface wrote:
| You can also use a stricter subset of YAML that removes things
| like the "no" footgun. Plenty of such strict parsers exist across
| languages. Maybe it's no longer technically YAML at that point,
| but you get all the nice parts of YAML without having to revamp
| everything with static typing.

| Spivak wrote:
| So yes, it's a footgun, but it makes some sense. Most people
| wouldn't really complain about true not being equivalent to "true"
| or 100 not being kept as "100". People just aren't used to yes/no
| being reserved words. Ruby's klass is a funny workaround to this.
|
|     enable_feature: yes
|
| is totally natural. Nobody reads the spec though. If you're
| outputting YAML documents with string builders you're headed for
| ruin no matter what. You don't need Dhall, you need yaml.dump,
| which handles the types too.

| CBLT wrote:
| I agree that the problem is one of quicker feedback - the dev
| cycle should involve a program checking the yaml correctness
| (using types or otherwise) straight away and giving a useful error
| message.
| Too often I've seen incorrect yaml checked into git that fails
| with a cryptic error when deploying the application.
|
| The strength of types, in my opinion, is composability. Most
| config files I've seen have ultimately pulled in input from
| another source and used that to create their output. Types would
| allow the configuration to be checked for correctness even in the
| face of unknowns.

| kevincox wrote:
| > Comparing that to the amount of typing I have to do to define a
| schema
|
| In my use of config files, the code that is reading them knows the
| type anyway. So for a setup like Rust+serde there is no overhead
| to set this up.
|
| > This problem would be caught immediately anyway if the config is
| read into static types.
|
| That is true, but it still breaks you out of your flow. You get a
| confusing error, it probably doesn't tell you the exact line
| number, and you need to look over your changes. If you changed a
| lot of places in the file it may be easy to miss that adding `no`
| to a list was the mistake. Problems like that are easy to
| understand in retrospect, but if you keep reading "no" as "Norway"
| it is easy to look straight at this mistake and think it is fine
| before hunting elsewhere in the file.
|
| I think you are right. It is still unclear if the cognitive
| overhead when writing the file is worth it, but from my point of
| view the upsides are much more valuable than you make them appear
| to be.

| simplify wrote:
| Funny enough, I've implemented a config language that fits exactly
| this bill: https://github.com/gilbert/zaml
|
| An example (also see it in the online editor[0]):
|
|     users {
|       andy
|       beth {
|         admin true
|       }
|       carl
|     }
|
| The author is right that you gain syntax benefits when you define
| a schema. For those who say this adds cognitive overhead, it
| actually doesn't; the schema and compiler are able to _reduce_
| that overhead, because if you make a mistake, you get a nice,
| accurate error message.
|
| [0]
| https://gilbert.github.io/zaml/editor.html#s=N4IgzgxgFgpgtgQ...

| dlrush wrote:
| Just use ruby for advanced config file capabilities:
|
| https://darrenrush.medium.com/ruby-is-the-ultimate-config-fi...

| unwind wrote:
| Pretty cool, I'm thinking about config file formats at the moment
| so this was timely.
|
| A minor note since the author seems to be around (and nobody has
| mentioned it that I could see). There's a typo in the example:
|
|     allowed-countires
|
| should of course spell "countries" more like I just did. The same
| error occurs twice.

| alaties wrote:
| I'm kind of shocked no one has brought up protobufs yet. protobuf
| libraries are available in pretty much all mainstream languages
| and the textproto format is pretty mature.
|
| Admittedly it's clunkier and less freeform than YAML. And if you
| only ever plan on using Rust, the proposed solution here is
| probably cleaner.
|
| Having portability across multiple languages, maintained by large
| organizations, can be useful in some cases though.

| dqpb wrote:
| > Statically typed programming languages are catching on so why
| don't we extend this typing to our config files?
|
| What you really need is Cuelang. Cuelang does graph unification
| over a type-value lattice. This allows the user to do progressive
| type -> value refinement (e.g. type->range->value).
|
| For configuration, this is both better than regular type systems,
| and better than inheritance.

| verdverm wrote:
| Have you seen CUE? Just so happens v0.4.1 was released today

| vmchale wrote:
| Is the author aware of Dhall? He might be interested.
|
| I think it's better in general but in any case it gets away from
| the "everything has to be a keyed map" silliness and you get
| sums/products.

| brundolf wrote:
| This is why, in the statically-typed programming language I'm
| working on, the project manifest is just a file written in the
| language itself which can export a special (typed) const to
| configure things like the linter.
| It gets to piggyback off of all the existing tooling for the
| language, particularly type checks, and can even be constructed
| using functions, etc. if desired.
___________________________________________________________________
(page generated 2022-01-21 23:00 UTC)