[HN Gopher] Show HN: utt - Universal Text Transformer
       ___________________________________________________________________
        
       Show HN: utt - Universal Text Transformer
        
       Author : notamy
       Score  : 48 points
       Date   : 2022-03-07 19:34 UTC (3 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | jwilk wrote:
       | What's the purpose of the help message screenshot? Couldn't you
       | use a code block, like in all the other examples?
        
         | notamy wrote:
          | I could and probably should, I was just really lazy at the
         | time it was written >~<
        
           | chungy wrote:
           | Isn't it _more_ effort to do the screenshot than a simple
           | copy+paste?
        
       | kbd wrote:
       | FWIW nu shell can be used for a lot of the same use-cases:
       | 
       |     $ echo "[1, 2, 3]" | nu -c 'cat | from json | to yaml'
       |     ---
       |     - 1
       |     - 2
       |     - 3
       | 
       |     $ echo '{"key": [1, 2, 3]}' | nu -c 'cat | from json | to yaml'
       |     ---
       |     key:
       |       - 1
       |       - 2
       |       - 3
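
For readers without nu installed, the same JSON-to-YAML conversion can be sketched in plain Python. This is only an illustration of the idea, not how utt or nu implement it: the helper names `to_yaml_lines` and `json_to_yaml` are made up here, only a small YAML subset is emitted, and keys and strings are kept JSON-quoted (which YAML accepts) rather than unquoted as in the nu output.

```python
import json


def to_yaml_lines(value, indent=0):
    """Render a JSON-compatible value as YAML lines (small subset only)."""
    pad = " " * indent
    if isinstance(value, dict) and value:
        lines = []
        for key, item in value.items():
            if isinstance(item, (dict, list)) and item:
                # Non-empty containers go on the following lines, indented.
                lines.append(f"{pad}{json.dumps(key)}:")
                lines.extend(to_yaml_lines(item, indent + 2))
            else:
                # Scalars and empty containers use their JSON spelling.
                lines.append(f"{pad}{json.dumps(key)}: {json.dumps(item)}")
        return lines
    if isinstance(value, list) and value:
        lines = []
        for item in value:
            if isinstance(item, (dict, list)) and item:
                lines.append(f"{pad}-")
                lines.extend(to_yaml_lines(item, indent + 2))
            else:
                lines.append(f"{pad}- {json.dumps(item)}")
        return lines
    return [f"{pad}{json.dumps(value)}"]


def json_to_yaml(text):
    """Parse a JSON document and return it as a YAML document string."""
    return "\n".join(["---"] + to_yaml_lines(json.loads(text))) + "\n"


if __name__ == "__main__":
    print(json_to_yaml("[1, 2, 3]"), end="")
    print(json_to_yaml('{"key": [1, 2, 3]}'), end="")
```

A real converter would hand this off to a YAML library; the sketch only shows that the transformation is a parse step followed by a re-serialization step.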
        
         | notamy wrote:
         | Cool! Glad I'm not the only one who saw a need for this stuff
         | :D
        
       | TheMagicHorsey wrote:
       | Reminds me of the Haskell project Pandoc ... which is also aimed
       | at this use case.
        
       | sanity31415 wrote:
       | Good idea, but this is a painful limitation:
       | 
       | > For example, utt does not process data in a streaming manner,
       | but rather loads the entire dataset into memory before processing
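
The difference between the two approaches can be sketched in Python. This is a generic illustration that assumes newline-delimited JSON input; it says nothing about utt's actual internals, and both function names are hypothetical.

```python
import io
import json


def process_in_memory(stream):
    """Read the whole input first, then transform it.

    Peak memory grows with the size of the input.
    """
    lines = stream.read().splitlines()
    records = [json.loads(line) for line in lines if line.strip()]
    return [json.dumps(record, sort_keys=True) for record in records]


def process_streaming(stream):
    """Transform one record at a time.

    Only the current line is held in memory, and output can start
    before the input has been fully read.
    """
    for line in stream:
        if line.strip():
            yield json.dumps(json.loads(line), sort_keys=True)


if __name__ == "__main__":
    sample = '{"b": 1, "a": 2}\n{"d": 3, "c": 4}\n'
    for out in process_streaming(io.StringIO(sample)):
        print(out)
```

Streaming like this is straightforward for line-oriented formats. A single large JSON or YAML document would need an incremental parser instead, which may be part of why a format-agnostic tool would choose to load everything up front.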
        
       | lgessler wrote:
       | Cool project! Something to consider: "Transformer"[1] is already
       | used to refer to a popular element of state of the art deep
       | neural networks, especially ones that are used on natural
       | language, i.e. text. That might make this name a little confusing
       | for people who have a foot in that world, especially since
       | "Universal" and "Text" are also words thrown around in similar
       | contexts.
       | 
       | [1]:
       | https://en.wikipedia.org/wiki/Transformer_(machine_learning_...
        
         | laumars wrote:
         | Transform is also used in data mangling to mean exactly what
         | this tool does.
         | 
         | Transformation is also used in enterprise IT to mean
         | modernising.
         | 
         | Transformer is also a mathematics term, a kids' toy and many
         | other things.
         | 
         | In short, ML doesn't have a monopoly on it.
        
         | ketralnis wrote:
         | It's also a concept in Haskell but I don't think anybody would
         | claim a monopoly on the idea of transforming things
        
         | BaseballPhysics wrote:
         | The term "transformer" is used throughout mathematics and
         | computing. Machine learning hardly has the monopoly on it. Or
         | should, for example, Haskell rename their monad transformers
         | library?
        
         | amelius wrote:
         | Deep learning also messed up the definition of tensor, and so
         | _they_ are the ones who should be careful with picking names.
         | 
         | (That you can represent a tensor by a multidimensional array
         | does not mean that a tensor _is_ a multidimensional array).
         | 
         | https://en.wikipedia.org/wiki/Tensor
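
One way to make that distinction concrete (standard textbook material, not taken from the thread): the components of a type-(1,1) tensor must obey a fixed transformation rule under a change of coordinates from x to x', with summation over the repeated indices k and l. An arbitrary multidimensional array of numbers carries no such rule.

```latex
T'^{i}{}_{j}
  = \frac{\partial x'^{i}}{\partial x^{k}}
    \frac{\partial x^{l}}{\partial x'^{j}}
    \, T^{k}{}_{l}
```

The array of components is therefore only a representation of the tensor in one particular coordinate system.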
        
         | notamy wrote:
         | Good to know, thanks! I was originally considering a name more
         | like "universal text finagler" but then I realised that there's
         | already a well-known "UTF" (:
        
           | throwawybllion wrote:
           | uft: Universal Finagler of Text, Universal Format Translator?
           | unt: unt is not a transformer?
        
           | chungy wrote:
           | UTF itself is already short for "Unicode Transformation
           | Format" and certainly predates any usage of the middle word
           | by ML projects.
        
       ___________________________________________________________________
       (page generated 2022-03-07 23:00 UTC)