[HN Gopher] The shape of data
___________________________________________________________________
The shape of data

Author : luu
Score  : 50 points
Date   : 2022-03-29 20:48 UTC (1 days ago)

(HTM) web link (www.scattered-thoughts.net)
(TXT) w3m dump (www.scattered-thoughts.net)

| [deleted]
| cupofpython wrote:
| Honestly, I don't see the issue with JSON. It is capturing user
| generated content. It's not that '43' is logged as a string
| instead of an int - it is that '43' is the raw data in quotes. To
| me, that is the same spirit as using "read" instead of "eval" as
| mentioned elsewhere. Yes, the read-print loop fails for JSON - but
| JSON only has this failing when you are working with
| code-generated values. At the end of the day - a user typed the 4
| and 3 keys on their keyboard and that was captured. To say it is
| an int or a str or whatever brings back the need to understand
| memory representations.
|
| For example - when parsing JSON with Python, you can apply the
| same principles you would to Python objects. That is, assume the
| item is the format you know it should be (or test it first to be
| safe).
|
| So even though the JSON is {"43": ["bob", "alice"]} - you can do
| an int() cast if you need to do something with that data that
| requires it to have a type. Otherwise it is represented as it was
| typed.
|
| I do agree with the article overall though!
  | joshlemer wrote:
  | So, JSON has non-string values in other positions (as elements
  | of arrays, or values in an object). Wouldn't your argument also
  | lead to the conclusion that we don't need numbers at all, since
  | we could get by with
  |
  | { "foo": "42", "bar": ["1", "2", "3"] }
  |
  | There's also the issue of values with multiple equivalent
  | string representations. I want 42.1 to equal 42.10 and 42.100.
  | I also want {"foo":1,"bar":2} to equal {"bar":2,"foo":1} but
  | with just strings you don't get that:
  |
  | { "{\"foo\":1,\"bar\":2}": 1, "{\"bar\":2,\"foo\":1}": 1,
  | "42.1": 2, "42.10": 2, "42.100": 2 }
  |
  | should have 2 keys but has 5
| thedudeabides5 wrote:
| This wishlist sounds like rose.ai
|
| "Wishlist
|
| Data model:
|
| A small set of primitives
|   eg writing an inspector gui
|   eg searching for references to some id
| But still able to represent types and invariants
| Able to reify changes as data
|   eg for undo log
|   eg for real-time collaboration
|
| All data has some name/path/location by which it can be referred to
|   eg no hidden state in closures
|   eg no hidden closures in the event loop queues
| Avoid depending on pointers for identity
|
| Data notation:
|
| A textual representation which is easy to read/write
| Used consistently everywhere - one standard way of picturing data
| Self-describing - doesn't require out-of-band type/schema
|
| Uses layering to add capabilities while mimicking familiar notation
| Uses shorthands and exploits context to reduce redundant information
|   eg clojure namespace aliases
|   eg unison names
|
| Code:
|
| The notation for code is a superset of the notation for data
|   eg can print data and copy-paste into code / repl
|
| Can choose the mapping between tags in data notation and types in code
| Code can be represented as data with low mental distance
|
| The codebase is also data - can trivially analyze whole thing
| including dependencies without having to execute side effects
|
| Maybe, if possible, reify the execution of code as data
|
| Crucially, the data model and the data notation need to be
| co-designed, because it's so easy to make choices in the data model
| that prevent creating a good data notation later."
___________________________________________________________________
(page generated 2022-03-30 23:01 UTC)