[HN Gopher] Let's try to understand AI monosemanticity
___________________________________________________________________

  Let's try to understand AI monosemanticity

  Author : bananaflag
  Score  : 48 points
  Date   : 2023-11-27 21:04 UTC (1 hour ago)

  (HTM) web link (www.astralcodexten.com)
  (TXT) w3m dump (www.astralcodexten.com)

  | turtleyacht wrote:
  | By the same token, thinking in memes all the time may be a form
  | of impoverished cognition.
  |
  | Or is it enhanced cognition, on the part of the interpreter
  | having to unpack much from little?

  | aatd86 wrote:
  | Some kind of single-context abstract interpretation, maybe.

  | throwanem wrote:
  | Darmok and Jalad at Mar-a-Lago.

  | erikerikson wrote:
  | Before finishing my read, I need to register an objection to
  | the opening, which reads to me as implying that this is the
  | only approach:
  |
  | > Researchers simulate a weird type of pseudo-neural-tissue,
  | "reward" it a little every time it becomes a little more like
  | the AI they want, and eventually it becomes the AI they want.
  |
  | This isn't the only way. Backpropagation is a hack around the
  | oversimplification of neural models. By adding a sense of
  | location into the network, you get linearly non-separable
  | functions learned just fine.
  |
  | Hopfield networks with Hebbian learning are sufficient, and we
  | have existing proofs of concept that implement them.

  | s1gnp0st wrote:
  | > Shouldn't the AI be keeping the concept of God, Almighty
  | Creator and Lord of the Universe, separate from God-zilla?
  |
  | This seems wrong. God-zilla is using the concept of God as a
  | superlative modifier. I would expect a neuron involved in the
  | concept of godhood to activate whenever any metaphorical
  | "god-of-X" concept is being used.

  | Sniffnoy wrote:
  | I mean, it's not, actually. It's just a somewhat unusual
  | transcription (well, originally somewhat unusual; now it's
  | obviously the official English name) of what might more usually
  | be transcribed as "Gojira".

  | s1gnp0st wrote:
  | Ah, I thought the Japanese word was just "jira". My mistake.

  | postmodest wrote:
  | That's an entirely different monster.

  | eichin wrote:
  | Indeed, though not an entirely unrelated one - per
  | https://en.wikipedia.org/wiki/Jira_(software)#Naming the
  | inspiration path was Bugzilla -> Godzilla -> Gojira -> Jira
  | (which is why Confluence keeps correcting me when I try to
  | spell it JIRA).

  | VinLucero wrote:
  | I see what you did there.

  | lukev wrote:
  | There's actually a somewhat reasonable analogy to human
  | cognitive processes here, I think, in the sense that humans
  | tend to form concepts defined by their connectivity to other
  | concepts (cf. Ferdinand de Saussure & structuralism).
  |
  | Human brains are also a "black box" in the sense that you can't
  | scan or dissect one to build a concept graph.
  |
  | Neural nets do seem to have some sort of emergent structural
  | concept graph; in the case of LLMs it's largely informed by
  | human language (because that's what they're trained on). To an
  | extent, we can observe this empirically through their output
  | even if the first principles are opaque.

  | _as_text wrote:
  | I've only skimmed it for now, but for a few months it has
  | seemed kinda natural to me that there would be a deep
  | connection between neural networks and differential or
  | algebraic geometry.
  |
  | Each ReLU layer is piecewise linear: within a fixed activation
  | pattern it acts as a linear transformation, and a pass through
  | two layers is then basically also a linear transformation. If
  | you want some piece of information to stay (numerically) intact
  | as it passes through the network, you're saying you want that
  | piece of information to be processed in the same way in each
  | layer. The groups of linear transformations that "all process
  | information in the same way, and whose compositions do as well"
  | are basically the Lie groups. Anyone else ever had this
  | thought?
  |
  | I imagine if nothing catastrophic happens we'll have a really
  | beautiful theory of all this someday, which I won't create, but
  | maybe I'll be able to understand it after a lot of hard work.
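  A quick numerical check of the piecewise-linearity point above (a
  minimal sketch in numpy; the layer shapes and random weights are
  arbitrary illustrations): for a given input, the set of active
  ReLUs defines a mask, and within that activation pattern the
  two-layer network is exactly one linear map.

      import numpy as np

      rng = np.random.default_rng(0)
      W1 = rng.standard_normal((4, 3))  # first layer: R^3 -> R^4
      W2 = rng.standard_normal((2, 4))  # second layer: R^4 -> R^2

      def relu_net(x):
          return W2 @ np.maximum(W1 @ x, 0.0)

      x = rng.standard_normal(3)
      mask = (W1 @ x > 0).astype(float)  # which ReLUs are active at x
      W_eff = W2 @ (mask[:, None] * W1)  # the one linear map on x's region

      assert np.allclose(relu_net(x), W_eff @ x)

  Any input with the same sign pattern under W1 is sent through this
  same W_eff; the network only behaves nonlinearly across region
  boundaries, where the mask changes.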
  | shermantanktop wrote:
  | As described in the post, this seems quite analogous to the
  | operation of a Bloom filter, except that each "bit" holds more
  | than a single bit's worth of information, and the match
  | detection has to do some thresholding/ranking to select a
  | winner.
  |
  | That said, the post is itself clearly summarizing much more
  | technical work, so my analogy is resting on shaky ground.

  | gmuslera wrote:
  | At least the first part reminded me of Hyperion and how the AIs
  | evolved there (I think the actual explanation is in The Fall of
  | Hyperion): smaller but more interconnected "code".
  |
  | Not sure about the actual implementation, but at least for us,
  | concepts or words are not pure or isolated; they have multiple
  | meanings that collapse into specific ones as you put several
  | together.

  | daveguy wrote:
  | All this anthropomorphizing of activation networks strikes me
  | as very odd. None of these neurons "want" to do anything. They
  | respond to specific input. Maybe humans are the same, but in
  | the case of artificial neural networks we at least know it's a
  | simple mathematical function. Also, an artificial neuron is
  | nothing like a biological neuron. At the most basic level,
  | artificial neurons don't "fire" except in direct response to
  | inputs. Biological neurons fire _because of their internal
  | state_, state which is modified by biological signaling
  | chemicals. It's like comparing apples to gorillas.
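  To illustrate the stateless/stateful distinction drawn in the
  last comment, a minimal sketch: a ReLU unit as the "artificial"
  side, and the textbook leaky integrate-and-fire model standing in
  (very loosely) for the "biological" side. The first is a pure
  function of its current input; the second carries internal state,
  so the same input can produce different outputs over time.

      import numpy as np

      def artificial_neuron(w, x):
          # Stateless: output is a pure function of the current input.
          return max(0.0, float(np.dot(w, x)))  # ReLU activation

      class LeakyIntegrateAndFireNeuron:
          # Stateful: membrane potential persists between inputs.
          def __init__(self, threshold=1.0, leak=0.9):
              self.potential = 0.0
              self.threshold = threshold
              self.leak = leak

          def step(self, input_current):
              self.potential = self.leak * self.potential + input_current
              if self.potential >= self.threshold:
                  self.potential = 0.0  # reset after the spike
                  return True           # fire
              return False

      neuron = LeakyIntegrateAndFireNeuron()
      # The same 0.4 input fires only once state has accumulated:
      print([neuron.step(0.4) for _ in range(5)])
      # -> [False, False, True, False, False]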