[HN Gopher] "Attention", "Transformers", in Neural Network "Larg...
___________________________________________________________________
 
 "Attention", "Transformers", in Neural Network "Large Language
 Models"
 
 Author : macleginn
 Score  : 21 points
 Date   : 2023-12-24 21:10 UTC (1 hour ago)
 
 (HTM) web link (bactra.org)
 (TXT) w3m dump (bactra.org)
 
 | low_tech_love wrote:
 | Really interesting. I like the "stream of consciousness"
 | approach to the content; it's refreshing. What's also
 | interesting is that the author felt the need to apologize and
 | preface it with some forced deference, evidently because of
 | internet bashing he received. I hope this doesn't discourage
 | him from continuing to publish his notes (although I suspect
 | it will). Why are we getting so human-phobic?
 |
 | defrost wrote:
 | It's an understandable deference when stumbling through a
 | huge new field and its freshly minted jargon, tidying up and
 | tying the new jargon to long-standing terms in older fields.
 |
 | "As near as I can tell, when the new guard says X they're
 | pretty much talking about what we called Y."
 |
 | Does 'attention' in the AI bleeding edge really correspond to
 | kernel smoothing | mapping attenuation | damping?
 |
 | This is (one of) the elephants in a darkened room that Cosma
 | is groping around, showing his thoughts as he goes.
 |
 | > I hope this doesn't discourage him from continuing to
 | > publish his notes
 |
 | Doubtful. Aside from the inevitable attenuation with age,
 | he's been airing his thoughts for at least two decades, e.g.
 | his wonderful little:
 |
 | _A Rare Blend of Monster Raving Egomania and Utter Batshit
 | Insanity_ (2002)
 |
 | http://bactra.org/reviews/wolfram/
 |
 | panarchy wrote:
 | It is nice, and it's interesting how, if you go read
 | something like Einstein's general relativity paper, you (or
 | at least I did) find that it's actually quite similar and
 | not so dense.
 |
 | brcmthrowaway wrote:
 | Biggest takeaway: extraction of prompts seems to be complete
 | bullshit.
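defrost's question about whether 'attention' corresponds to kernel smoothing can be made concrete. The claim (which is the point Shalizi's notes develop) is that single-query softmax attention is exactly Nadaraya-Watson kernel smoothing with an exponential dot-product kernel. A minimal NumPy sketch, with illustrative names and dimensions that are not from the thread:

```python
import numpy as np

def kernel_smooth(query, keys, values, kernel):
    """Nadaraya-Watson smoothing: a weighted average of values,
    with weights proportional to kernel(query, key)."""
    w = np.array([kernel(query, k) for k in keys])
    w = w / w.sum()
    return w @ values

def attention(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    scores = keys @ query / np.sqrt(len(query))
    w = np.exp(scores - scores.max())  # numerically stable softmax
    w = w / w.sum()
    return w @ values

rng = np.random.default_rng(0)
q = rng.normal(size=4)        # one query vector
K = rng.normal(size=(6, 4))   # six keys
V = rng.normal(size=(6, 3))   # six values

# With the exponential dot-product kernel, kernel smoothing
# reproduces attention exactly (the softmax max-shift cancels
# in the normalization).
exp_kernel = lambda x, y: np.exp(x @ y / np.sqrt(len(x)))
assert np.allclose(kernel_smooth(q, K, V, exp_kernel), attention(q, K, V))
```

The correspondence is purely algebraic: softmax weights exp(s_i)/sum_j exp(s_j) are kernel weights K(q, k_i)/sum_j K(q, k_j) for K(q, k) = exp(q.k/sqrt(d)).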
 | haltist wrote:
 | This person doesn't understand that large neural networks are
 | somewhat conscious and a stepping stone to AGI. Why else would
 | OpenAI be worth so much money if it wasn't a stepping stone to
 | AGI? No one can answer this without making it obvious they do
 | not understand that large numbers can be conscious and
 | sentient. Checkmate atheists.
 |
 | ChainOfFools wrote:
 | I'd probably agree with the unsnarkified version of what
 | you're saying to some extent, but it's worth mentioning that
 | the argument you seem to be dismissing can take a much
 | stronger form: it questions latent premises about free will
 | by proposing that _neither_ computers nor humans are
 | sentient, and that both are entirely deterministic,
 | ultimately amounting to interference patterns of ancient
 | thermodynamic gradients created in the formation of the
 | universe.
 |
 | seydor wrote:
 | And what do the different heads represent? Why are the query,
 | key, and value vectors simply linear transforms of the input?
___________________________________________________________________
 (page generated 2023-12-24 23:00 UTC)