[HN Gopher] Libxo: Easy way to generate text, XML, JSON, and HTML
       ___________________________________________________________________
        
       Libxo: Easy way to generate text, XML, JSON, and HTML
        
       Author : edward
       Score  : 60 points
       Date   : 2023-07-14 18:29 UTC (4 hours ago)
        
 (HTM) web link (juniper.github.io)
 (TXT) w3m dump (juniper.github.io)
        
       | sam_bristow wrote:
       | I would love having structured output from shell commands, but
       | for now I'd settle for people using stdout and stderr correctly.
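
A minimal sketch of the stdout/stderr split mentioned above, in Go. The `count` type, the `format` helper, and the sample paths are hypothetical names used only for illustration. Results go to standard output so a pipeline can consume them, while warnings go to standard error.

```go
package main

import (
	"fmt"
	"os"
)

// count is a hypothetical result record (one line count per file).
type count struct {
	Name  string
	Lines int
}

// format keeps results and diagnostics apart: data is what a downstream
// program should parse, diag is meant only for the person at the terminal.
func format(counts []count, missing []string) (data, diag string) {
	for _, c := range counts {
		data += fmt.Sprintf("%d\t%s\n", c.Lines, c.Name)
	}
	for _, name := range missing {
		diag += fmt.Sprintf("warning: cannot open %s\n", name)
	}
	return data, diag
}

func main() {
	data, diag := format(
		[]count{{Name: "/etc/motd", Lines: 25}},
		[]string{"/no/such/file"},
	)
	fmt.Fprint(os.Stdout, data) // results: safe to pipe into another tool
	fmt.Fprint(os.Stderr, diag) // diagnostics: never mixed into the data
}
```

Redirecting standard output to a file or pipe then captures only the tab-separated records, and the warning still reaches the terminal.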
        
         | moody__ wrote:
         | I see people say they would like this quite often. There is a
         | command-line shell that does exactly this: PowerShell is
         | structured in this way. But no one is throwing away their
         | sh/bash/tcsh/ksh for PowerShell. I think this is just a classic
         | example of the grass being greener on the other side.
        
       | JNRowe wrote:
       | Every time I see this I'm hoping that it has taken off, because
       | it feels like such an obvious improvement. Instead of that we
       | have some support for it in FreeBSD, and a homegrown solution in
       | some packages like util-linux. Yeah, there are some concerns, but
       | the concept seems sound and the implementation can be iterated
       | upon.
       | 
       | For instance, years later and it still isn't packaged in Debian.
       | If nothing out of the tens of thousands of Debian packages has a
       | dependency on it there presumably must be a good reason.
       | 
       | It strikes me as one of those libtermkey[1]/libvterm things where
       | Leonerd pushed it for years before anybody really used it,
       | despite it being a seemingly obvious improvement over the status
       | quo.
       | 
       | [1] https://www.leonerd.org.uk/code/libtermkey/
        
         | ComputerGuru wrote:
         | > Instead of that we have some support for it in FreeBSD
         | 
         | My Google-fu is failing me right now, but FreeBSD also has a
         | shared library used for reading/parsing config files and
         | providing either a common or universal DSL for all conf files
         | using the same library. This is one of the benefits of using an
         | OS instead of a distribution - all the tools are developed
         | holistically and refactors such as providing a shared,
         | universal input or output format, sandboxing everything with
         | Capsicum, etc. across the board are much more possible.
         | 
         | EDIT
         | 
         | Remembered it. Surprised at how bad Google was at finding this,
         | though!
         | 
         | UCL - Universal Configuration Language [0]. Introduced in a
         | paper by Allan Jude in 2015 [1]. Man page: libucl(3) [2].
         | 
         | [0]: https://github.com/vstakhov/libucl/
         | 
         | [1]: https://papers.freebsd.org/2015/bsdcan/allanjude-ucl/
         | 
         | [2]:
         | https://man.freebsd.org/cgi/man.cgi?query=libucl&sektion=3&f...
        
       | kristopolous wrote:
       | I bet you could do some pretty clever heuristic hacks to wrap a
       | bunch of programs in this especially if you attach to the process
       | and say, clobber printf. I'm thinking in the spirit of rlwrap.
       | 
       | It's certainly more of a game genie approach but it might
       | occasionally be awesome.
        
       | zokier wrote:
       | Seeing that this originates from FreeBSD ecosystem, did it get
       | actually adopted widely in FreeBSD base system? At least I
       | interpreted that to have been the goal:
       | https://juniper.github.io/libxo/libxo-manual.html#can-you-sh...
        
         | tedunangst wrote:
         | It's used inconsistently. df uses it, but not du. ps, but not
         | ls.
        
       | Norfair wrote:
       | In Haskell we use autodocodec for this.
        
       | 38 wrote:
       | you can already do this in other languages. For example, here is
       | Go:
       | 
       |     package main
       | 
       |     import (
       |         "encoding/json"
       |         "encoding/xml"
       |         "os"
       |     )
       | 
       |     type wc struct {
       |         File []file
       |     }
       | 
       |     type file struct {
       |         Lines      int
       |         Words      int
       |         Characters int
       |         Filename   string
       |     }
       | 
       |     func main() {
       |         etc_motd := wc{
       |             []file{
       |                 {25, 1165, 1140, "/etc/motd"},
       |             },
       |         }
       |         json.NewEncoder(os.Stdout).Encode(etc_motd)
       |         xml.NewEncoder(os.Stdout).Encode(etc_motd)
       |     }
        
         | paulddraper wrote:
         | Gee thanks mister!
        
           | 38 wrote:
           | get outta here kid.
        
       | ComputerGuru wrote:
       | Serde can _kind of_ do this for rust projects, but you're
       | usually constrained to outputs that are "identical but for the
       | syntax/format" (i.e. same field names though perhaps with
       | different naming conventions).
       | 
       | I've used that to convert configuration files from one language
       | to the other, such as this json2toml and toml2json tool [0].
       | 
       | [0]: https://github.com/neosmart/toml2json
        
         | mananaysiempre wrote:
         | > Serde can kind of do this for rust projects, but you're
         | usually constrained to outputs that are "identical but for the
         | syntax/format" (i.e. same field names though perhaps with
         | different naming conventions).
         | 
         | Libxo's distinguishing feature (IMO) is that its schema
         | specifications are plaintext output with markup specifying
         | which parts are data to be extracted into structured formats,
         | so that you can port your usual Unix tool to it and not ruin
         | its original ad-hoc output. I don't know of anything positioned
         | as a serialization library that can do this with comparable
         | grace, Serde included.
         | 
         | Related: the section on marking up plaintext output in the Ivo
         | essay[1].
         | 
         | [1]
         | https://web.archive.org/web/20111204021526/http://lubutu.com...
         | (discussed at the time at
         | https://news.ycombinator.com/item?id=3300264)
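
The idea described above, a plaintext format string whose marked-up fields double as structured data, can be sketched in a few lines of Go. This is not libxo's API: `emit` is a hypothetical helper, and the `{:name/%verb}` field syntax here is only loosely modeled on libxo's format strings, with Go `fmt` verbs substituted for C ones. The sample values are the ones from the Go example earlier in the thread.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// emit renders a libxo-like format string in one of two styles.
// Each "{:name/%verb}" field consumes one argument. Text outside the braces
// is literal and appears only in the "text" style.
func emit(style, format string, args ...any) (string, error) {
	var text strings.Builder
	fields := map[string]any{}
	next := 0
	for len(format) > 0 {
		open := strings.Index(format, "{:")
		if open < 0 {
			text.WriteString(format) // trailing literal text
			break
		}
		text.WriteString(format[:open])
		end := strings.Index(format[open:], "}")
		if end < 0 {
			return "", fmt.Errorf("unterminated field in %q", format)
		}
		name, verb, ok := strings.Cut(format[open+2:open+end], "/")
		if !ok || next >= len(args) {
			return "", fmt.Errorf("bad field or missing argument: %q", name)
		}
		fmt.Fprintf(&text, verb, args[next]) // human-readable rendering
		fields[name] = args[next]            // structured rendering
		next++
		format = format[open+end+1:]
	}
	if style == "json" {
		out, err := json.Marshal(fields)
		return string(out) + "\n", err
	}
	return text.String(), nil
}

func main() {
	const format = "{:lines/%7d} {:words/%7d} {:characters/%7d} {:filename/%s}\n"
	for _, style := range []string{"text", "json"} {
		out, err := emit(style, format, 25, 1165, 1140, "/etc/motd")
		if err != nil {
			panic(err)
		}
		fmt.Print(out)
	}
}
```

The same call site produces the familiar column layout in text style and a keyed object in JSON style, which is the property that lets an existing tool keep its ad-hoc output.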
        
           | ComputerGuru wrote:
           | Yup. libxo is much more free-form, while anything going
           | through a serialize-deserialize process is going to
           | necessarily have to be more regular.
        
       | ary wrote:
       | This, at least in concept, looks like a potential successor to
       | printf() et al. The general accessibility of it is lacking given
       | that it's a C-only API at the moment (there don't appear to be
       | bindings for other languages), and I'm left questioning whether
       | format strings are the best way. Perhaps worse is better in this
       | case.
       | 
       | When thinking about this problem I've not been able to get beyond
       | the decision of "should it be done with something like a builder
       | pattern and a graph of objects/structures" or "should it be done
       | with a DSL" (which is what I consider the format strings approach
       | to be). A DSL is more immediately convenient when creating
       | output, but when you want to understand the structure you're
       | emitting it seems better to have code that is explicit and
       | imperative.
        
         | loeg wrote:
         | libxo is in practice a poor approach to generating structured
         | output from unix utilities. There are at least a few problems.
         | The format strings do not easily replace existing formatted
         | prints, so it is not straightforward to adopt. For anything
         | more complicated than simple row records, you have to change
         | the structure of your program significantly and might as well
         | just use a different path for formatting structured output. It
         | is unaware of locales and, as a result, butchers text in
         | encodings other than ASCII/UTF-8. Finally, a
         | separate-binary-with-structured-text-output is a poor library
         | interface to quite a
         | lot of these utilities -- a callable C API would be more
         | broadly useful.
        
       | yyyk wrote:
       | Libxo is used in FreeBSD. I can't say I'm a fan of the approach
       | though.
       | 
       | Typical printf usage is imperative and additive:
       | 
       |     if (enter) printf("Hello ");
       |     else printf("Goodbye ");
       |     printf("World!\n");
       | 
       | Using the format string forces the programmer to keep implicit
       | state (the document format) all over the place or get an
       | inconsistent document. For example, imagine the first printf
       | calls the column 'Text' and the others call it 'Output'. We can
       | easily do this for a single format, but the complexity will get
       | higher the more we add.
       | 
       | If you do this properly (emit to an object and render from that),
       | the result is trivially consistent. The difficulty here is to get
       | streaming; this is, however, not always required and can be
       | achieved with a little effort.
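
A minimal Go sketch of the "emit to an object and render from that" approach, reusing the Hello/Goodbye example above. The `greeting` type and the `build` and `render` helpers are hypothetical names. The sketch buffers the whole record before rendering, so it does not address the streaming difficulty.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// greeting is the single in-memory record that every output style renders
// from, so the field name is defined in exactly one place.
type greeting struct {
	Text string `json:"text"`
}

// build makes the imperative decisions once, independent of output format.
func build(enter bool) greeting {
	word := "Goodbye"
	if enter {
		word = "Hello"
	}
	return greeting{Text: word + " World!"}
}

// render turns the finished record into the requested style.
func render(g greeting, style string) (string, error) {
	if style == "json" {
		out, err := json.Marshal(g)
		return string(out) + "\n", err
	}
	return g.Text + "\n", nil
}

func main() {
	for _, style := range []string{"text", "json"} {
		out, err := render(build(true), style)
		if err != nil {
			panic(err)
		}
		fmt.Print(out)
	}
}
```

Because both branches of the conditional feed the same `Text` field, the two styles cannot disagree about what the column is called.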
        
         | lelanthran wrote:
         | > Using the format string forces the programmer to keep
         | implicit state (the document format) all over the place or get
         | an inconsistent document.
         | 
         | I'm not understanding your objection[1]; surely you would only
         | define the libxo format string _once_, and then reuse it
         | everywhere? Without libxo you'd need to duplicate your code
         | everywhere for every output format you want to support.
         | 
         | IOW, you'd have to construct your libxo format string using a
         | string concatenation library, something like this:
         | 
         |     const char *s1 = "{:Text%7ju}";
         |     const char *s2 = "{:Output%7ju}";
         |     char *final = NULL;
         | 
         |     if (enter) {
         |         final = strdup (s1);
         |     } else {
         |         final = strdup (s2);
         |     }
         |     final = strconcat (final, s2);
         | 
         | Isn't that a better mechanism than printf?
         | 
         | [1] It's late, I've the flu and feeling a little stupid right
         | now. Also, this is the first I've seen this project.
        
           | yyyk wrote:
           | >surely you would only define the libxo format string once,
           | and then reuse it everywhere
           | 
           | I would be rather scared of using a variable for a format
           | string. IMHO, these types of format strings are an
           | antipattern for a different reason - we don't see the format
           | at point of use, if we make some too easy mistakes, we have a
           | crash or CVE*. I think nowadays there are tools to do some
           | verification**, and I guess we could use an IDE (but most C
           | programmers don't?), but I am unfamiliar with any such tool
           | which supports libxo style format strings.
           | 
           | * https://en.wikipedia.org/wiki/Format_string_attack
           | 
           | ** IIRC, GCC/Clang eventually added a verifier? But it
           | doesn't apply to all cases or scanf? I don't recall.
        
         | mpweiher wrote:
         | > The difficulty here is to get streaming
         | 
         | Polymorphic Write Streams do this.
         | 
         | ACM DL: https://dl.acm.org/doi/10.1145/3359619.3359748
         | 
         | pdf:
         | http://www.hirschfeld.org/writings/media/WeiherHirschfeld_20...
         | 
         | Code:
         | https://github.com/mpw/MPWFoundation/tree/master/Streams.sub...
         | 
         | Fast JSON parsing using this approach:
         | https://blog.metaobject.com/2020/04/somewhat-less-lethargic-...
         | 
         | Presentation (DLS '19):
         | https://www.youtube.com/watch?v=DG5MtsMojgI
        
       ___________________________________________________________________
       (page generated 2023-07-14 23:00 UTC)