[HN Gopher] Multimodal Neurons in Artificial Neural Networks
       ___________________________________________________________________
        
       Multimodal Neurons in Artificial Neural Networks
        
       Author : todsacerdoti
       Score  : 61 points
       Date   : 2021-03-04 20:13 UTC (2 hours ago)
        
 (HTM) web link (openai.com)
 (TXT) w3m dump (openai.com)
        
       | gallerdude wrote:
       | I've always thought it's wild how we can apply one concept to so
       | many different types of things. For example, if I say something
       | is "soft," you probably think of the opposite of firmness. But at
       | the same time, I can describe a person as "soft," and the same
       | descriptor can say something meaningful about their character.
       | 
        | Seeing the Spider-Man neuron work across multiple modalities
        | (photos, drawings, text) makes it seem like we can teach AI to
        | learn these same kinds of connections.
       | 
       | And if we scale up the network size enough, what if we could see
        | these types through the equivalent of a being with a 1,000 IQ? What
       | connection types are the most effective for a being like that?
       | Can we even understand them? Maybe they would be deep, and
       | archetypical in the way that Odysseus and Harry Potter are the
       | same, despite the fact that one is an ancient Greek king, and the
       | other is a modern British wizard. Even more interestingly, maybe
       | the connections would be completely inexplicable to us, with no
       | apparent rhyme or reason perceptible to humans.
        
         | colah3 wrote:
         | I'm really excited about the dream that we'll be able to learn
          | from neural networks. Shan Carter and Michael Nielsen wrote a
         | really inspiring article on this
         | (https://distill.pub/2017/aia/). I also wrote something about
         | this a while back
         | (http://colah.github.io/posts/2015-01-Visualizing-
         | Representat...).
         | 
         | One of the amazing things about this project exploring CLIP was
         | seeing some hints of this. For example, one day I was studying
         | one of the Africa neurons and it generated the text "IMBEWU" --
         | it turns out this is a popular TV show in South Africa
         | (https://en.wikipedia.org/wiki/Imbewu:_The_Seed). That's a
         | trivial example, but it begins to hint at something
         | interesting.
         | 
         | I'd really love to see what a domain expert analyzing CLIP
         | would make of things. For example, I'd love to hear what
         | ethnographers think of the region neurons, or what historians
         | think of the time period neurons. Especially for future, larger
         | models.
        
       | biasdose wrote:
       | I'm impressed with OpenAI confronting this head on.
       | 
       | "Our model, despite being trained on a curated subset of the
       | internet, still inherits its many unchecked biases and
       | associations."
       | 
        | If these models find their way into production environments - if
        | they are good enough and profitable enough - they will eventually
        | become legacy systems quietly perpetuating the biases of past
        | times.
        
       | kowlo wrote:
       | The typographic attacks are great fun. Labelling an apple as a
       | toaster is all it takes!
        
         | HPsquared wrote:
         | It needs a 'shenanigans' neuron.
        
       | iujjkfjdkkdkf wrote:
       | Can someone give more technical detail on what they are showing
       | with the "neurons"?
       | 
       | They say "Each neuron is represented by a feature visualization
        | with a human-chosen concept label to help quickly provide a
       | sense of each neuron", and these neurons are selected from the
       | final layer. I don't think I understand this.
        
         | the8472 wrote:
          | Start with a random input, then incrementally optimize that
          | input to maximize the activation of one of the nodes in the
          | network graph - the neuron. The visualization shown is one
          | such input that reaches a maximum.
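
The gradient-ascent procedure described above can be sketched in miniature. This is a toy illustration under simplifying assumptions, not CLIP's actual code: a single hand-rolled linear "neuron" stands in for a unit deep in a real network, and every name below is invented for the example. A real feature visualization uses automatic differentiation to climb the activation of a unit in a deep model the same way.

```python
import random

random.seed(0)

# Toy stand-in for one neuron: a fixed linear unit, activation = w . x.
DIM = 16
w = [random.gauss(0, 1) for _ in range(DIM)]

def activation(x):
    """Activation of the toy neuron for input x."""
    return sum(wi * xi for wi, xi in zip(w, x))

def visualize(steps=100, lr=0.1):
    """Start from a random input, then repeatedly step along the
    gradient of the activation (for a linear unit, the gradient with
    respect to the input is just w)."""
    x = [random.gauss(0, 1) for _ in range(DIM)]
    start = activation(x)
    for _ in range(steps):
        x = [xi + lr * wi for xi, wi in zip(x, w)]
    return x, start, activation(x)

x_opt, before, after = visualize()
print(after > before)  # True: the optimized input activates the unit more
```

In a real network the loop is the same shape, but the gradient comes from backpropagation and the result is the kind of dream-like image shown in the article.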
        
       ___________________________________________________________________
       (page generated 2021-03-04 23:00 UTC)