[HN Gopher] Let's try to understand AI monosemanticity
___________________________________________________________________

  Let's try to understand AI monosemanticity

  Author : bananaflag
  Score  : 48 points
  Date   : 2023-11-27 21:04 UTC (1 hour ago)

  (HTM) web link (www.astralcodexten.com)
  (TXT) w3m dump (www.astralcodexten.com)

  | turtleyacht wrote:
  | By the same token, thinking in memes all the time may be a form
  | of impoverished cognition.
  |
  | Or is it enhanced cognition, on the part of the interpreter
  | having to unpack much from little?

  | aatd86 wrote:
  | Some kind of single-context abstract interpretation, maybe.

  | throwanem wrote:
  | Darmok and Jalad at Mar-a-Lago.

  | erikerikson wrote:
  | Before finishing my read, I need to register an objection to
  | the opening, which reads to me as implying that this is the
  | only approach:
  |
  | > Researchers simulate a weird type of pseudo-neural-tissue,
  | "reward" it a little every time it becomes a little more like
  | the AI they want, and eventually it becomes the AI they want.
  |
  | This isn't the only way. Backpropagation is a hack around the
  | oversimplification of neural models. By adding a sense of
  | location into the network, you get linearly non-separable
  | functions learned just fine.
  |
  | Hopfield networks with Hebbian learning are sufficient, and we
  | have existing proofs of concept that implement them.

  | s1gnp0st wrote:
  | > Shouldn't the AI be keeping the concept of God, Almighty
  | Creator and Lord of the Universe, separate from God-zilla?
  |
  | This seems wrong. God-zilla is using the concept of God as a
  | superlative modifier. I would expect a neuron involved in the
  | concept of godhood to activate whenever any metaphorical
  | "god-of-X" concept is being used.

  | Sniffnoy wrote:
  | I mean, it's not, actually. It's just a somewhat unusual
  | transcription (well, originally somewhat unusual; now it's
  | obviously the official English name) of what might more usually
  | be transcribed as "Gojira".

  | s1gnp0st wrote:
  | Ah, I thought the Japanese word was just "jira". My mistake.

  | postmodest wrote:
  | That's an entirely different monster.

  | eichin wrote:
  | Indeed, though not an entirely unrelated one - per
  | https://en.wikipedia.org/wiki/Jira_(software)#Naming the
  | inspiration path was Bugzilla -> Godzilla -> Gojira -> Jira
  | (which is why Confluence keeps correcting me when I try to
  | spell it JIRA).

  | VinLucero wrote:
  | I see what you did there.

  | lukev wrote:
  | There's actually a somewhat reasonable analogy to human
  | cognitive processes here, I think, in the sense that humans
  | tend to form concepts defined by their connectivity to other
  | concepts (cf. Ferdinand de Saussure & structuralism).
  |
  | Human brains are also a "black box" in the sense that you can't
  | scan or dissect one to build a concept graph.
  |
  | Neural nets do seem to have some sort of emergent structural
  | concept graph; in the case of LLMs it's largely informed by
  | human language (because that's what they're trained on). To an
  | extent, we can observe this empirically through their output
  | even if the first principles are opaque.

  | _as_text wrote:
  | I've only skimmed it for now, but for a few months it has
  | seemed kinda natural to me that there would be a deep
  | connection between neural networks and differential or
  | algebraic geometry.
  |
  | Each ReLU layer is piecewise linear: within a fixed activation
  | pattern it acts as a linear transformation, and a pass through
  | two layers is then basically also a linear transformation. If
  | you want some piece of information to stay (numerically) intact
  | as it passes through the network, you're saying you want that
  | piece of information to be processed in the same way in each
  | layer. The groups of linear transformations that "all process
  | information in the same way, and whose compositions do as well"
  | are basically the Lie groups. Anyone else ever had this
  | thought?
  |
  | I imagine if nothing catastrophic happens we'll have a really
  | beautiful theory of all this someday, which I won't create, but
  | maybe I'll be able to understand it after a lot of hard work.
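  A quick numerical check of the piecewise-linearity point above (a
  minimal sketch in numpy; the layer shapes and random weights are
  arbitrary illustrations): for a given input, the set of active
  ReLUs defines a mask, and within that activation pattern the
  two-layer network is exactly one linear map.

      import numpy as np

      rng = np.random.default_rng(0)
      W1 = rng.standard_normal((4, 3))  # first layer: R^3 -> R^4
      W2 = rng.standard_normal((2, 4))  # second layer: R^4 -> R^2

      def relu_net(x):
          return W2 @ np.maximum(W1 @ x, 0.0)

      x = rng.standard_normal(3)
      mask = (W1 @ x > 0).astype(float)  # which ReLUs are active at x
      W_eff = W2 @ (mask[:, None] * W1)  # the one linear map on x's region

      assert np.allclose(relu_net(x), W_eff @ x)

  Any input with the same sign pattern under W1 is sent through this
  same W_eff; the network only behaves nonlinearly across region
  boundaries, where the mask changes.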
  | shermantanktop wrote:
  | As described in the post, this seems quite analogous to the
  | operation of a Bloom filter, except that each "bit" holds more
  | than a single bit's worth of information, and the match
  | detection has to do some thresholding/ranking to select a
  | winner.
  |
  | That said, the post is itself clearly summarizing much more
  | technical work, so my analogy is resting on shaky ground.

  | gmuslera wrote:
  | At least the first part reminded me of Hyperion and how the AIs
  | evolved there (I think the actual explanation is in The Fall of
  | Hyperion): smaller but more interconnected "code".
  |
  | Not sure about the actual implementation, but at least for us,
  | concepts or words are not pure or isolated; they have multiple
  | meanings that collapse into specific ones as you put several
  | together.

  | daveguy wrote:
  | All this anthropomorphizing of activation networks strikes me
  | as very odd. None of these neurons "want" to do anything. They
  | respond to specific input. Maybe humans are the same, but in
  | the case of artificial neural networks we at least know it's a
  | simple mathematical function. Also, an artificial neuron is
  | nothing like a biological neuron. At the most basic level,
  | artificial neurons don't "fire" except in direct response to
  | inputs. Biological neurons fire _because of their internal
  | state_, state which is modified by biological signaling
  | chemicals. It's like comparing apples to gorillas.
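  To illustrate the stateless/stateful distinction drawn in the
  last comment, a minimal sketch: a ReLU unit as the "artificial"
  side, and the textbook leaky integrate-and-fire model standing in
  (very loosely) for the "biological" side. The first is a pure
  function of its current input; the second carries internal state,
  so the same input can produce different outputs over time.

      import numpy as np

      def artificial_neuron(w, x):
          # Stateless: output is a pure function of the current input.
          return max(0.0, float(np.dot(w, x)))  # ReLU activation

      class LeakyIntegrateAndFireNeuron:
          # Stateful: membrane potential persists between inputs.
          def __init__(self, threshold=1.0, leak=0.9):
              self.potential = 0.0
              self.threshold = threshold
              self.leak = leak

          def step(self, input_current):
              self.potential = self.leak * self.potential + input_current
              if self.potential >= self.threshold:
                  self.potential = 0.0  # reset after the spike
                  return True           # fire
              return False

      neuron = LeakyIntegrateAndFireNeuron()
      # The same 0.4 input fires only once state has accumulated:
      print([neuron.step(0.4) for _ in range(5)])
      # -> [False, False, True, False, False]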