[HN Gopher] Libxo: Easy way to generate text, XML, JSON, and HTML
___________________________________________________________________
Libxo: Easy way to generate text, XML, JSON, and HTML

Author : edward
Score  : 60 points
Date   : 2023-07-14 18:29 UTC (4 hours ago)

(HTM) web link (juniper.github.io)
(TXT) w3m dump (juniper.github.io)

| sam_bristow wrote:
| I would love having structured output from shell commands, but
| for now I'd settle for people using stdout and stderr correctly.
| moody__ wrote:
| I see people say they would like this quite often. There is a
| command line interface that does do this, powershell is very
| structured in this exact way. But no one is throwing away their
| sh/bash/tcsh/ksh for powershell. I think this is just a classic
| example of the grass being more green on the other side.
| JNRowe wrote:
| Every time I see this I'm hoping that it has taken off, because
| it feels like such an obvious improvement. Instead of that we
| have some support for it in FreeBSD, and a homegrown solution in
| some packages like util-linux. Yeah, there are some concerns, but
| the concept seems sound and the implementation can be iterated
| upon.
|
| For instance, years later and it still isn't packaged in Debian.
| If nothing out of the tens of thousands of Debian packages has a
| dependency on it there presumably must be a good reason.
|
| It strikes me as one of those libtermkey[1]/libvterm things where
| Leonerd pushed it for years before anybody really used it,
| despite it being a seemingly obvious improvement over the status
| quo.
|
| [1] https://www.leonerd.org.uk/code/libtermkey/
| ComputerGuru wrote:
| > Instead of that we have some support for it in FreeBSD
|
| My google fu is failing me right now but FreeBSD also has a
| shared library used for reading/parsing config files and
| providing either a common or universal dsl for all conf files
| using the same library.
| This is one of the benefits of using an
| OS instead of a distribution - all the tools are developed
| holistically and refactors such as providing a shared,
| universal input or output format, sandboxing everything with
| capsicum, etc across the board are much more possible.
|
| EDIT
|
| Remembered it. Surprised at how bad Google was at finding this,
| though!
|
| UCL - Universal Configuration Language [0]. Introduced in a
| paper by Allan Jude in 2015 [1]. Man page: libucl(3) [2].
|
| [0]: https://github.com/vstakhov/libucl/
|
| [1]: https://papers.freebsd.org/2015/bsdcan/allanjude-ucl/
|
| [2]:
| https://man.freebsd.org/cgi/man.cgi?query=libucl&sektion=3&f...
| kristopolous wrote:
| I bet you could do some pretty clever heuristic hacks to wrap a
| bunch of programs in this especially if you attach to the process
| and say, clobber printf. I'm thinking in the spirit of rlwrap.
|
| It's certainly more of a game genie approach but it might
| occasionally be awesome.
| zokier wrote:
| Seeing that this originates from FreeBSD ecosystem, did it get
| actually adopted widely in FreeBSD base system? At least I
| interpreted that to have been the goal:
| https://juniper.github.io/libxo/libxo-manual.html#can-you-sh...
| tedunangst wrote:
| It's used inconsistently. df uses it, but not du. ps, but not
| ls.
| Norfair wrote:
| In Haskell we use autodocodec for this.
| 38 wrote:
| you can already do this in other languages. For example, here is
| Go:
|
|     package main
|
|     import (
|        "encoding/json"
|        "encoding/xml"
|        "os"
|     )
|
|     type wc struct {
|        File []file
|     }
|
|     type file struct {
|        Lines int
|        Words int
|        Characters int
|        Filename string
|     }
|
|     func main() {
|        etc_motd := wc{
|           []file{
|              {25, 1165, 1140, "/etc/motd"},
|           },
|        }
|        json.NewEncoder(os.Stdout).Encode(etc_motd)
|        xml.NewEncoder(os.Stdout).Encode(etc_motd)
|     }
| paulddraper wrote:
| Gee thanks mister!
| 38 wrote:
| get outta here kid.
| ComputerGuru wrote:
| Serde can _kind of_ do this for rust projects, but you're
| usually constrained to outputs that are "identical but for the
| syntax/format" (i.e. same field names though perhaps with
| different naming conventions).
|
| I've used that to convert configuration files from one language
| to the other, such as this json2toml and toml2json tool [0].
|
| [0]: https://github.com/neosmart/toml2json
| mananaysiempre wrote:
| > Serde can kind of do this for rust projects, but you're
| usually constrained to outputs that are "identical but for the
| syntax/format" (i.e. same field names though perhaps with
| different naming conventions).
|
| Libxo's distinguishing feature (IMO) is that its schema
| specifications are plaintext output with markup specifying
| which parts are data to be extracted into structured formats,
| so that you can port your usual Unix tool to it and not ruin
| its original ad-hoc output. I don't know of anything positioned
| as a serialization library that can do this with comparable
| grace, Serde included.
|
| Related: the section on marking up plaintext output in the Ivo
| essay[1].
|
| [1]
| https://web.archive.org/web/20111204021526/http://lubutu.com...
| (discussed at the time at
| https://news.ycombinator.com/item?id=3300264)
| ComputerGuru wrote:
| Yup. libxo is much more free-form, while anything going
| through a serialize-deserialize process is going to
| necessarily have to be more regular.
| ary wrote:
| This, at least in concept, looks like a potential successor to
| printf() et al. The general accessibility of it is lacking given
| that it's a C-only API at the moment (there don't appear to be
| bindings for other languages), and I'm left questioning whether
| format strings are the best way. Perhaps worse is better in this
| case.
|
| When thinking about this problem I've not been able to get beyond
| the decision of "should it be done with something like a builder
| pattern and a graph of objects/structures" or "should it be done
| with a DSL" (which is what I consider the format strings approach
| to be). A DSL is more immediately convenient when creating
| output, but when you want to understand the structure you're
| emitting it seems better to have code that is explicit and
| imperative.
| loeg wrote:
| libxo is in practice a poor approach to generating structured
| output from unix utilities. There are at least a few problems.
| The format strings do not easily replace existing formatted
| prints, so it is not straightforward to adopt. For anything
| more complicated than simple row records, you have to change
| the structure of your program significantly and might as well
| just use a different path for formatting structured output. It
| is unaware of locales, and as a result, butchers text in non
| ASCII/UTF-8 encodings. Finally, a
| separate-binary-with-structured-text-output is a poor library
| interface to quite a lot of these utilities -- a callable C API
| would be more broadly useful.
| yyyk wrote:
| Libxo is used in FreeBSD. I can't say I'm a fan of the approach
| though.
|
| Typical printf usage is imperative and additive:
|
|     if (enter)
|         printf("Hello ");
|     else
|         printf("Goodbye ");
|     printf("World!\n");
|
| Using the format string forces the programmer to keep implicit
| state (the document format) all over the place or get an
| inconsistent document. For example, imagine the first printf
| calls the column 'Text' and the others call it 'Output'. We can
| easily do this for a single format, but the complexity will get
| higher the more we add.
|
| If you do this properly (emit to an object and render from that),
| the result is trivially consistent. The difficulty here is to get
| streaming, this is however not always required and can be
| achieved with a little effort.
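
The "emit to an object and render from that" approach described in the comment above can be sketched in Go roughly as follows. This is an illustration under assumptions, not libxo's API and not yyyk's code: the Greeting type and the build and render functions are hypothetical names, and the rendering relies only on Go's standard encoding/json and encoding/xml packages.

```go
package main

import (
	"encoding/json"
	"encoding/xml"
	"fmt"
	"os"
)

// Greeting is a hypothetical record. The field name "text" is declared
// once here, so every output format uses the same name for it.
type Greeting struct {
	XMLName xml.Name `json:"-" xml:"greeting"`
	Text    string   `json:"text" xml:"text"`
}

// build contains the branching that the printf version interleaves
// with output; it only fills in the object and prints nothing.
func build(enter bool) Greeting {
	word := "Goodbye"
	if enter {
		word = "Hello"
	}
	return Greeting{Text: word + " World!"}
}

// render turns the finished object into one of the supported formats.
func render(g Greeting, format string) (string, error) {
	switch format {
	case "text":
		return g.Text, nil
	case "json":
		b, err := json.Marshal(g)
		return string(b), err
	case "xml":
		b, err := xml.Marshal(g)
		return string(b), err
	default:
		return "", fmt.Errorf("unknown format %q", format)
	}
}

func main() {
	g := build(true)
	for _, f := range []string{"text", "json", "xml"} {
		out, err := render(g, f)
		if err != nil {
			fmt.Fprintln(os.Stderr, err)
			os.Exit(1)
		}
		fmt.Println(out)
	}
}
```

Because the whole object is built before anything is written, this sketch does not stream, which is the trade-off the comment mentions.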
| lelanthran wrote:
| > Using the format string forces the programmer to keep
| implicit state (the document format) all over the place or get
| an inconsistent document.
|
| I'm not understanding your objection[1]; surely you would only
| define the libxo format string _once_, and then reuse it
| everywhere? Without libxo you'd need to duplicate your code
| everywhere for every output format you want to support.
|
| IOW, you'd have to construct your libxo format string using a
| string concatenation library, something like this:
|
|     const char *s1 = "{:Text%7ju}";
|     const char *s2 = "{:Output%7ju}";
|     char *final = NULL;
|     if (enter) {
|        final = strdup (s1);
|     } else {
|        final = strdup (s2);
|     }
|     final = strconcat (final, s2);
|
| Isn't that a better mechanism than printf?
|
| [1] It's late, I've the flu and feeling a little stupid right
| now. Also, this is the first I've seen this project.
| yyyk wrote:
| >surely you would only define the libxo format string once,
| and then reuse it everywhere
|
| I would be rather scared of using a variable for a format
| string. IMHO, these types of format strings are an
| antipattern for a different reason - we don't see the format
| at point of use, if we make some too easy mistakes, we have a
| crash or CVE*. I think nowadays there are tools to do some
| verification**, and I guess we could use an IDE (but most C
| programmers don't?), but I am unfamiliar with any such tool
| which supports libxo style format strings.
|
| * https://en.wikipedia.org/wiki/Format_string_attack
|
| ** IIRC, GCC/Clang eventually added a verifier? But it
| doesn't apply to all cases or scanf? I don't recall.
| mpweiher wrote:
| > The difficulty here is to get streaming
|
| Polymorphic Write Streams do this.
|
| ACM DL: https://dl.acm.org/doi/10.1145/3359619.3359748
|
| pdf:
| http://www.hirschfeld.org/writings/media/WeiherHirschfeld_20...
|
| Code:
| https://github.com/mpw/MPWFoundation/tree/master/Streams.sub...
|
| Fast JSON parsing using this approach:
| https://blog.metaobject.com/2020/04/somewhat-less-lethargic-...
|
| Presentation (DLS '19):
| https://www.youtube.com/watch?v=DG5MtsMojgI
___________________________________________________________________
(page generated 2023-07-14 23:00 UTC)