[HN Gopher] Gojq: Pure Go Implementation of Jq
       ___________________________________________________________________
        
       Gojq: Pure Go Implementation of Jq
        
       Author : laqq3
       Score  : 62 points
       Date   : 2022-08-21 18:05 UTC (4 hours ago)
        
        
       | simonw wrote:
       | "gojq does not keep the order of object keys" is a bit
       | disappointing.
       | 
       | I care about key order purely for cosmetic reasons: when I'm
       | designing JSON APIs I like to put things like the "id" key first
       | in an object layout, and when I'm manipulating JSON using jq or
       | similar I like to maintain those aesthetic choices.
       | 
       | I know it's bad to write code that depends on key order, but it's
       | important to me as a way of keeping JSON as human-readable as
       | possible.
       | 
       | After all, human readability is one of the big benefits of JSON
       | over various binary formats.
        
         | haasted wrote:
         | I bet it's an artifact of Go having a randomized iteration
         | order over maps [0]. Getting a deterministic ordering requires
         | extra work.
         | 
         | [0] https://stackoverflow.com/questions/9619479/go-what-
         | determin...
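
A minimal sketch of the "extra work" mentioned above, assuming sorted order is acceptable: collect the keys, sort them, then iterate. The helper name `sortedKeys` is made up for illustration. Note that this gives a deterministic order, not the order the keys had in the input.

```go
package main

import (
	"fmt"
	"sort"
)

// sortedKeys (hypothetical helper) returns the keys of m in ascending order.
func sortedKeys(m map[string]int) []string {
	keys := make([]string, 0, len(m))
	for k := range m { // iteration order over a Go map is unspecified
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

func main() {
	m := map[string]int{"id": 1, "name": 2, "age": 3}
	// Iterating over the sorted key slice gives the same order on every run.
	for _, k := range sortedKeys(m) {
		fmt.Printf("%s=%d\n", k, m[k])
	}
}
```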
        
           | vips7L wrote:
           | Does Go not have more than one Map implementation in the
           | standard library?
        
             | [deleted]
        
             | esprehn wrote:
             | It does not. Maps are not even a real interface you can
             | implement, it's compiler magic encoded in the language
             | spec: https://dave.cheney.net/2018/05/29/how-the-go-
             | runtime-implem...
             | 
             | This is all fallout of not having generics.
        
           | [deleted]
        
           | simonw wrote:
           | I used to have the exact same problem with Python, until
           | Python 3.7 made preserving insertion order a feature of the
           | language:
           | https://softwaremaniacs.org/blog/2020/02/05/dicts-ordered/
        
             | c2h5oh wrote:
             | Go actually went in the other direction for a bunch of
             | reasons (e.g. hash-collision DoS) and made key order
             | quasi-random when iterating. Small maps used to maintain order,
             | but a change was made to randomize that so people didn't
             | rely on that and get stung when their maps got larger:
             | https://github.com/golang/go/issues/6719
        
               | tialaramex wrote:
               | Right, the startling thing about Python's previous dict
               | was that it was _so_ terrible that the ordered dict was
               | actually significantly faster.
               | 
               | It's like if you did such a bad job making a drag racer
               | that the street legal model of the same car was
               | substantially faster over a quarter mile despite also
               | having much better handling and reliability.
               | 
               | In some communities the reaction would have been to write
               | a good _unordered_ dict which would obviously be even
               | faster, but since nobody is exactly looking for the best
               | possible performance from Python, they decided that
               | ordered behaviour was worth the price, and it's not as
               | though existing Python programmers could complain since
               | it was faster than what they'd been tolerating
               | previously.
               | 
               | Randomizing is the other choice if you actually want your
               | maps to be fast and want to resist Hyrum's law, but see
               | the absl experience - they initially didn't bother to
               | randomize tiny maps but then the order of those tiny maps
               | changed for technical reasons and... stuff broke. Because
               | hey, in testing I made six of this tiny map, they always
               | had the same order therefore (ignoring the documentation
               | imploring me not to) I shall assume the order is always
               | the same...
        
               | alecthomas wrote:
               | > Right, the startling thing about Python's previous dict
               | was that it was so terrible that the ordered dict was
               | actually significantly faster.
               | 
               | I've never heard that before and it would be really
               | surprising, given that Python's builtin dict is used for
               | everything from local symbol to object field lookup. Do
               | you have more information?
        
               | aaronbee wrote:
               | Here's a description of the new map implementation and
               | why it's more efficient:
               | https://www.pypy.org/posts/2015/01/faster-more-memory-
               | effici...
        
         | cerved wrote:
         | IMO it's not about code, it's about predictable and consistent
         | layout so that you can easily diff.
        
         | zxcvbn4038 wrote:
         | Yeah, this is a deal breaker. While technically the key order
         | doesn't matter, in the real world it really does matter. People
         | have to read this stuff. People have to be able to
         | differentiate between actual changes and stuff moving around
         | just because. Luckily it's a solved problem and you can write
         | marshalers that preserve order, but it's extra work and
         | generally specific to an encoding format. It would be nice to
         | have ordered maps in the base library as an option.
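
One way such an order-preserving reader could look in Go, sketched with only the standard library: walk the object with the token API of `encoding/json` and keep the members in a slice instead of a map. The names `pair` and `orderedPairs` are hypothetical, and the sketch handles only the top-level object. Nested objects stay as raw JSON, so their original text (and key order) is kept untouched.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// pair (hypothetical type) is one object member; the value is kept as raw JSON.
type pair struct {
	Key   string
	Value json.RawMessage
}

// orderedPairs returns the members of a top-level JSON object in input order.
func orderedPairs(input string) ([]pair, error) {
	dec := json.NewDecoder(strings.NewReader(input))
	tok, err := dec.Token()
	if err != nil {
		return nil, err
	}
	if d, ok := tok.(json.Delim); !ok || d != '{' {
		return nil, fmt.Errorf("expected a JSON object")
	}
	var pairs []pair
	for dec.More() {
		tok, err := dec.Token() // the next token inside an object is a key
		if err != nil {
			return nil, err
		}
		key, ok := tok.(string)
		if !ok {
			return nil, fmt.Errorf("expected a string key, got %v", tok)
		}
		var raw json.RawMessage
		if err := dec.Decode(&raw); err != nil { // consume the value as-is
			return nil, err
		}
		pairs = append(pairs, pair{Key: key, Value: raw})
	}
	return pairs, nil
}

func main() {
	pairs, err := orderedPairs(`{"id": 7, "name": "gojq", "age": 3}`)
	if err != nil {
		panic(err)
	}
	for _, p := range pairs {
		fmt.Printf("%s: %s\n", p.Key, p.Value)
	}
}
```

Writing the pairs back out in the same order would need a matching hand-written encoder, which is the format-specific extra work the comment refers to.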
        
         | silverwind wrote:
         | Agree, this is deterring me from this tool. Many
         | languages/tools nowadays guarantee object key order which is
         | convenient in many ways.
        
           | lapser wrote:
           | For what it's worth, JSON Objects are not guaranteed to be
           | ordered. Maps in many different languages are implemented
           | without an order.
        
       | fwip wrote:
       | Not implementing key-sorting is a curious decision:
       | 
       | > gojq does not keep the order of object keys. I understand this
       | might cause problems for some scripts but basically, we should
       | not rely on the order of object keys. Due to this limitation,
       | gojq does not have keys_unsorted function and --sort-keys (-S)
       | option. I would implement when ordered map is implemented in the
       | standard library of Go but I'm less motivated.
       | 
       | I feel like --sort-keys is most useful when it is producing
       | output for tools that do not understand JSON - for example,
       | generating diffs or hashes of the JSON string. There is value in
       | the output formatting being deterministic for a given input.
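
To illustrate the hashing point, here is a rough Go sketch, not gojq's code: Go's `encoding/json` writes map keys in sorted order when marshaling, so decoding and re-encoding yields the same bytes for inputs that differ only in key order. The function name `canonicalHash` is made up, and this is not RFC 8785 canonicalization; for example, numbers pass through float64 here, so very large integers can lose precision.

```go
package main

import (
	"crypto/sha256"
	"encoding/json"
	"fmt"
)

// canonicalHash (hypothetical helper) re-encodes a JSON document with sorted
// object keys and returns the hex SHA-256 of the result.
func canonicalHash(input string) (string, error) {
	var v interface{}
	if err := json.Unmarshal([]byte(input), &v); err != nil {
		return "", err
	}
	// json.Marshal emits map keys in sorted order, at every nesting level.
	out, err := json.Marshal(v)
	if err != nil {
		return "", err
	}
	return fmt.Sprintf("%x", sha256.Sum256(out)), nil
}

func main() {
	a, errA := canonicalHash(`{"id": 1, "name": "x"}`)
	b, errB := canonicalHash(`{"name": "x", "id": 1}`)
	if errA != nil || errB != nil {
		panic("invalid JSON input")
	}
	fmt.Println(a == b) // same content, different key order
}
```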
        
         | renewiltord wrote:
         | You could pipe the output through gron and sort to re-sort it.
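
For readers who have not used gron: the idea is to flatten the JSON into one line per leaf value, which can then be sorted and diffed with ordinary text tools. The Go sketch below is only loosely modeled on that idea and does not reproduce gron's exact output format; `flatten` and `flattenSorted` are hypothetical names. The sort is a plain string sort, so `[10]` would come before `[2]`.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// flatten appends one "path = value;" line per leaf value under v.
func flatten(prefix string, v interface{}, out *[]string) {
	switch t := v.(type) {
	case map[string]interface{}:
		for k, child := range t { // map order is unspecified; sorted later
			flatten(prefix+"."+k, child, out)
		}
	case []interface{}:
		for i, child := range t {
			flatten(fmt.Sprintf("%s[%d]", prefix, i), child, out)
		}
	default:
		b, _ := json.Marshal(t) // leaf: string, number, bool, or null
		*out = append(*out, fmt.Sprintf("%s = %s;", prefix, b))
	}
}

// flattenSorted returns the flattened lines of a JSON document, sorted as text.
func flattenSorted(input string) ([]string, error) {
	var v interface{}
	if err := json.Unmarshal([]byte(input), &v); err != nil {
		return nil, err
	}
	var lines []string
	flatten("json", v, &lines)
	sort.Strings(lines)
	return lines, nil
}

func main() {
	lines, err := flattenSorted(`{"b": {"y": 2, "x": 1}, "a": [true, null]}`)
	if err != nil {
		panic(err)
	}
	for _, l := range lines {
		fmt.Println(l)
	}
}
```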
        
           | Someone wrote:
           | That helps when you want to sort by key, but not when you
           | want to keep the order of object keys as in the input file.
        
         | eropple wrote:
         | I agree with you that there's value to sorted keys from a
         | presentational standpoint (we are not beep-boop robots, humans
         | have to read this stuff too), but now there also exists a JSON
         | canonicalization RFC that tools can/should follow (with all the
         | usual caveats about canonicalization being fraught):
         | https://www.rfc-editor.org/rfc/rfc8785
        
           | mdaniel wrote:
           | I guess "Informational" is better than /dev/null, but unless
           | everyone adopts it, doesn't that run the risk of it just being
           | My Favorite Canonicalization™?
           | 
           | Either way, I'm guessing if the gojq author has _that much_
           | heartburn about implementing --sort-keys, --canonical is just
           | absolutely off the table :-(
        
           | fwip wrote:
           | Thank you for letting me know! I hadn't thought to look.
        
       | spullara wrote:
       | I neither know nor care what language the original jq was
       | implemented in.
        
         | brundolf wrote:
         | I can think of two reasons it matters here:
         | 
         | - Can be used as a library in Go projects
         | 
         | - Memory-safe (could be relevant when processing foreign data,
         | esp as a part of some automated process)
        
           | donio wrote:
           | Yep, Benthos is an example of a cool project that uses gojq
           | for its jq syntax support.
        
       | lapser wrote:
       | I have actually fully replaced my jq installation with gojq
       | (including an `ln -s gojq jq`) for a few years, and no script has
       | broken so far. I'm super impressed by the jq compatibility.
       | 
       | If you are going down this route, do be careful with performance.
       | I don't know which is more performant as I've never really had to
       | work with large data sets, but I can't help but feel jq will be
       | faster than gojq in such cases. I have no benchmarks backing this
       | up, but who knows, maybe someone will benchmark both.
       | 
       | One of my favourite features is the fact that error messages are
       | actually legible, unlike jq.
        
         | brundolf wrote:
         | It's very possible it could be faster; jq seems to actually be
         | fairly unoptimized. This implementation in OCaml was featured
         | on HN a while back and it trashes the original jq in
         | performance: https://github.com/davesnx/query-json
         | 
         | After seeing that one I did my own (less-complete) version in
         | Rust and managed to squeeze out even more performance in the
         | operations it supports: https://github.com/brundonsmith/jqr
        
       | cube2222 wrote:
       | This looks quite cool! I'm not sure though why I would use this
       | over the original jq. However, I can definitely see the value in
       | embedding this into my own applications, to provide jq scripting
       | inside of them.
       | 
       | Shameless plug: As I'm not a fan of the jq syntax, I've created
       | jql[0] as an alternative to it. It's also written in Go and
       | presents a lispy continuation-based query language (it sounds
       | much scarier than it really is!). This way it has much less
       | special syntax than jq and is - at least to me - much easier to
       | compose for common day-to-day JSON manipulation (which is the use
       | case it has been created for; there are definitely many features
       | of jq that aren't covered by it).
       | 
       | It might seem dead, as it hasn't seen any commit in ages, but to
       | me it's just finished, I still use it regularly instead of jq on
       | my local dev machine. Check it out if you're not a fan of the jq
       | syntax.
       | 
       | [0]: https://github.com/cube2222/jql
        
         | laqq3 wrote:
         | One reason to prefer gojq is that gojq's author is one of the
         | most knowledgeable people about the original jq (as seen in
         | GitHub PRs and issues), and his gojq fixes many long-standing
         | issues in jq.
         | 
         | Plus, for my use cases, gojq beats jq by a fair margin.
        
       ___________________________________________________________________
       (page generated 2022-08-21 23:00 UTC)