[HN Gopher] Show HN: Base32H, a human-friendly duotrigesimal num...
       ___________________________________________________________________
        
       Show HN: Base32H, a human-friendly duotrigesimal number system
        
       Author : yellowapple
       Score  : 13 points
       Date   : 2020-09-06 07:06 UTC (1 days ago)
        
 (HTM) web link (base32h.github.io)
        
       | sharpercoder wrote:
       | The V v U u alias should be split into V v and U u. The l should
       | be used as alias for 1.
       | 
       | I totally see the historic and soundex analogue between V v U u,
       | but it seems to me that the visual similarity of 1 L l i has
       | precedence.
        
       | thenines wrote:
       | I like this, though I agree with others that the minimally-
       | confused U & V would be better traded for the oft-confused 1, I
       | & l.
       | 
       | One slight additional issue not mentioned so far: what about the
       | case of needing to encode one of the many numbers that are now
       | "NSFW", such as the trigger-warning (!) decimal 739787225?
        
       | keithlfrost wrote:
       | I have only one complaint about this Base32 encoding choice, and
       | it stems from the fact that I prefer to encode Base32 using lower
       | case letters, instead of the choice made here to make upper case
       | canonical. When using lower case, the main source of possible
       | confusion is that it can be difficult to tell l and 1 apart, as
       | in l1l1l1l... and this scheme uses both l (canonically "L") and
       | 1.
        
         | edoceo wrote:
         | Hmm, other base32 systems avoid that by not including I and L
         | (and O) - and some other refs I've read (ULID comes to mind)
         | say to produce UPPER output but accept either case as input.
         | 
         | And, like this spec, the values are aliases so 0/o are the
         | same, 1/I/l are the same, etc
         | 
         | https://github.com/ulid/spec
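
The "canonical uppercase output, lenient input" idea can be sketched in Python. This is a minimal sketch, not the spec's reference code: the digit order in `ALPHABET` is an assumption (digits first, then the unaliased letters), the alias table holds only the four aliases named in this thread (O, I, S, U), and `normalize` and `decode` are hypothetical helper names. Unlike the ULID-style alphabets mentioned above, L stays a digit of its own here.

```python
ALPHABET = "0123456789ABCDEFGHJKLMNPQRTVWXYZ"  # assumed digit order
ALIASES = {"O": "0", "I": "1", "S": "5", "U": "V"}  # aliases named in this thread


def normalize(text: str) -> str:
    """Map any accepted spelling onto the canonical uppercase digits."""
    out = []
    for ch in text.upper():  # accept either case on input
        ch = ALIASES.get(ch, ch)  # fold aliases onto their canonical digit
        if ch not in ALPHABET:
            raise ValueError(f"not a valid digit: {ch!r}")
        out.append(ch)
    return "".join(out)


def decode(text: str) -> int:
    """Decode a big-endian digit string into a non-negative integer."""
    value = 0
    for ch in normalize(text):
        value = value * 32 + ALPHABET.index(ch)
    return value


# "io" and "10" are two spellings of the same number.
assert decode("io") == decode("10") == 32
```

Normalizing before decoding keeps the alias handling in one place, so an encoder only ever has to emit the 32 canonical digits.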
        
           | keithlfrost wrote:
           | Yes. I'm surprised the author would be more concerned about
           | confusion between U/V (or u/v) than between 1/l ... the
           | former has always seemed relatively far-fetched to me,
           | whereas depending on the font, the latter can be a real
           | problem. Again, I attribute the issue to the choice of upper
           | case as canonical, because L is not easily confused with any
           | other letter or number.
        
       | qrian wrote:
       | Why are l and I not aliased? They are easily confused in sans-serif
       | fonts.
        
         | tzs wrote:
         | 26 letters + 10 digits - 4 letters lost to aliasing (o, i, s,
         | u) = 32 symbols. Making L an alias would leave them only
         | enough for base 31 unless they dropped one of the other
         | aliases.
         | 
         | Personally, I'd be OK with that. I think U is much less likely
         | to be confused for V, at least in anything not handwritten,
         | than lower-case L is likely to be confused for 1.
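
The symbol count above is easy to check with a few lines of Python. This is only a sketch: it assumes the alphabet is the ten digits followed by the uppercase letters with O, I, S and U removed, and the names `ALIASED` and `ALPHABET` are illustrative rather than taken from the spec.

```python
import string

# Letters that exist only as aliases, per the comment above: o, i, s, u.
ALIASED = set("OISU")

# Assumed ordering: digits first, then the remaining uppercase letters.
ALPHABET = string.digits + "".join(
    c for c in string.ascii_uppercase if c not in ALIASED
)

# 26 letters + 10 digits - 4 aliased letters = 32 symbols.
assert len(ALPHABET) == 26 + 10 - 4 == 32

# Turning L into an alias of 1 as well would leave only 31 symbols.
without_l = ALPHABET.replace("L", "")
print(len(ALPHABET), len(without_l))
```

Keeping exactly 32 symbols matters because each digit then carries a whole 5 bits; with 31 symbols that property is lost.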
        
       | jarym wrote:
       | I threw 0x2059B7DEDB800C03 (2331096449934167043) in there as a
       | Postgres int8 I had lying around and I get the message: "The
       | number you're encoding is bigger than what Javascript can
       | accurately represent, so the below value is probably incorrect."
       | 
       | However, it is (now) possible to represent this in JavaScript as
       | a BigInt[0]
       | 
       | [0] https://developer.mozilla.org/en-
       | US/docs/Web/JavaScript/Refe...
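
For comparison, here is a rough Python sketch of encoding that same value with arbitrary-precision integers, which is the role BigInt would play in JavaScript. The digit order in `ALPHABET` and the helper name `encode` are assumptions for illustration; `MAX_SAFE_INTEGER` mirrors JavaScript's `Number.MAX_SAFE_INTEGER` (2^53 - 1).

```python
ALPHABET = "0123456789ABCDEFGHJKLMNPQRTVWXYZ"  # assumed digit order
MAX_SAFE_INTEGER = 2**53 - 1  # same value as JavaScript's Number.MAX_SAFE_INTEGER


def encode(n: int) -> str:
    """Encode a non-negative integer, most significant digit first."""
    if n < 0:
        raise ValueError("only non-negative integers are handled here")
    digits = []
    while True:
        n, rem = divmod(n, 32)
        digits.append(ALPHABET[rem])
        if n == 0:
            break
    return "".join(reversed(digits))


value = 0x2059B7DEDB800C03
assert value == 2331096449934167043
assert value > MAX_SAFE_INTEGER

# A 64-bit float cannot hold this value exactly, so the low bits are lost.
assert int(float(value)) != value

encoded = encode(value)
print(encoded)
```

Because Python integers never pass through a floating-point representation here, every digit of the result comes from exact integer division.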
        
       | geoah wrote:
       | Interesting approach, thank you for sharing this. I especially
       | like the 5Ss aliasing.
       | 
       | A test vector file for implementers would be nice (something like
       | what cbor provides) so all possible edge cases can be checked
       | for.
        
         | yellowapple wrote:
         | Good idea, agreed. There's a partial attempt at that in the JS
         | implementation's test suite, albeit written into the test code
         | itself; that'd probably be a decent enough starting point for a
         | non-comprehensive approach.
         | 
         | At some point a full-blown test harness will be useful (i.e. to
         | compare implementations and make sure they have equivalent
         | behavior, for e.g. randomized or sequential tests). Haven't
         | gotten that far yet :)
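
A harness of the kind described above could look roughly like the following Python sketch. Everything in it is hypothetical: the two encoders stand in for independent implementations, the digit order in `ALPHABET` is an assumption, and `sample_inputs` and `find_mismatches` are illustrative names rather than part of any existing test suite.

```python
import random
from typing import Callable, Iterable, List

ALPHABET = "0123456789ABCDEFGHJKLMNPQRTVWXYZ"  # assumed digit order


def encode_divmod(n: int) -> str:
    """Stand-in implementation A: repeated division by 32."""
    out = ALPHABET[n % 32]
    n //= 32
    while n:
        out = ALPHABET[n % 32] + out
        n //= 32
    return out


def encode_shift(n: int) -> str:
    """Stand-in implementation B: 5-bit shifts and masks."""
    ndigits = max(1, (n.bit_length() + 4) // 5)
    return "".join(
        ALPHABET[(n >> (5 * i)) & 31] for i in reversed(range(ndigits))
    )


def sample_inputs(seed: int = 0, count: int = 1000) -> List[int]:
    """Sequential small values plus seeded random values of mixed bit lengths."""
    rng = random.Random(seed)  # fixed seed keeps failures reproducible
    sequential = list(range(1025))
    randomized = [rng.getrandbits(rng.randint(1, 128)) for _ in range(count)]
    return sequential + randomized


def find_mismatches(
    a: Callable[[int], str], b: Callable[[int], str], inputs: Iterable[int]
) -> List[int]:
    """Return every input on which the two implementations disagree."""
    return [n for n in inputs if a(n) != b(n)]


print(find_mismatches(encode_divmod, encode_shift, sample_inputs()))
```

The same input list could also be written out with one implementation's results to serve as a shared test vector file.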
        
       | fsiefken wrote:
       | I miss the comparison to the duodecimal number system. To me that
       | seems much better than both the decimal and the duotrigesimal
       | number system.
       | http://duodecimal.net/archives/duodecimal/duodecimal.html
        
       | nick_kline wrote:
       | Use the US-ASCII 0-9 and a-z (caps equivalent). O and zero (0)
       | are the same, as are numeral 1 and L _and_ I. S matches s and 5.
       | 
       | I like it.
        
       | dheera wrote:
       | Bitcoin addresses use base58 I believe, which is like base64 but
       | avoids 0, O, I, l, +, /. It arguably serves the human-friendly
       | requirement well while being more compact than Base32H. It is,
       | however, not friendly to byte-aligning use cases.
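
The compactness and byte-alignment trade-off can be put in numbers with a small Python sketch. It treats the input as one big integer and counts digits, which is a simplification; `chars_needed` is a hypothetical helper, not part of any base58 or Base32H library.

```python
def chars_needed(num_bytes: int, base: int) -> int:
    """Smallest digit count whose range covers every num_bytes-byte value."""
    limit = 256 ** num_bytes
    count, reach = 0, 1
    while reach < limit:
        reach *= base
        count += 1
    return count


# Compare digit counts for a few payload sizes (in bytes).
for size in (5, 16, 32):
    print(size, chars_needed(size, 32), chars_needed(size, 58))
```

Base 32 packs exactly 5 bits per digit, so 5 bytes always map onto 8 digits with nothing left over. Base 58 needs fewer digits (44 rather than 52 for a 32-byte value), but since 58 is not a power of two, no group of digits lines up with a whole number of bytes.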
        
       ___________________________________________________________________
       (page generated 2020-09-07 23:01 UTC)