[HN Gopher] The potential of transformers in reinforcement learning
___________________________________________________________________

The potential of transformers in reinforcement learning

Author : beefman
Score  : 54 points
Date   : 2021-12-19 18:59 UTC (1 day ago)

(HTM) web link (lorenzopieri.com)
(TXT) w3m dump (lorenzopieri.com)

| Buttons840 wrote:
| What's a good introduction to transformers?
| tchalla wrote:
| I have seen a lot of introductions that explain the mechanics.
| However, I haven't seen one that explains the intuition for why
| they work.
| lalaithion wrote:
| Not sure if this is a good introduction, but a good second paper
| to read is https://arxiv.org/abs/2106.06981
| 
| You can think of finite state machines as being two functions:
| f(input, state) = output, and g(input, state) = next_state.
| (Traditional FSMs have 3 'output' states, basically 'terminated -
| success', 'terminated - failure', and 'still working', but in
| theory it makes sense to fully generalize it.)
| 
| If you think about plain neural networks as approximating
| arbitrary functions f(input) = output, then recurrent neural
| networks are "continuous state machines": you have the same two
| functions f(input, state) = output and g(input, state) =
| next_state, except that instead of finite symbols, the states are
| continuous points in N-dimensional space. This, at least to me,
| clarifies why recurrent neural networks work on simple, short
| time-based problems but can't efficiently generalize to complex
| problems -- they're just FSMs!
| 
| The paper I linked above provides a similar high-level
| computational analogy for how transformers work.
| atty wrote:
| If you mean the technical details of attention models, the
| original paper "Attention Is All You Need" is not too difficult
| to read. If you're more interested in applications, Hugging Face
| has a "course" on their website that walks through the high-level
| topics of applying transformers to natural language processing
| (I can't remember if they cover transformers for other topics).
| criticaltinker wrote:
| The original paper that introduced the Transformer architecture
| is quite accessible and outlines a lot of the history and
| rationale for the design [1].
| 
| [1] https://arxiv.org/pdf/1706.03762.pdf
| mrfusion wrote:
| I tried that, but it seems to gloss over what an encoder etc.
| actually are.
| 
| I think I'd do better with pseudocode or a toy example.
| saynay wrote:
| The encoder is the neural net that converts the input to the
| embedding vector. The decoder is the neural net that converts
| that vector into output. What that embedding vector "means" is
| whatever the entire algorithm has learned it means.
| 
| For a more simplified look at embeddings, I would look at
| Word2Vec (although it doesn't involve transformers). It encodes
| single words instead of entire phrases, and does so by looking at
| their positions relative to other words during training.
| 
| Embeddings are just vectors, so you can do math on them or
| compare them to other embeddings. The famous example is
| E(king) - E(man) + E(woman) ~= E(queen).
| mrfusion wrote:
| So you're saying the encoding could be a neural net OR something
| like word2vec?
| criticaltinker wrote:
| Check out The Annotated Transformer, it's one of my favorite
| references! It contains straightforward Python code side by side
| with excerpts from the original paper.
| 
| http://nlp.seas.harvard.edu/2018/04/03/attention.html
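
(A toy illustration of the "continuous state machine" analogy in
lalaithion's comment above: a single hand-rolled recurrent cell
with random, untrained weights. All dimensions here are arbitrary
assumptions made for the sketch.)

    # An RNN cell viewed as a state machine over continuous
    # vectors: g(input, state) = next_state, and the output is
    # read off the updated state.
    import numpy as np

    rng = np.random.default_rng(0)
    W_in = rng.normal(size=(4, 3))     # input (3-d) -> state (4-d)
    W_state = rng.normal(size=(4, 4))  # state -> state
    W_out = rng.normal(size=(2, 4))    # state -> output (2-d)

    def g(x, s):  # next_state = g(input, state)
        return np.tanh(W_in @ x + W_state @ s)

    state = np.zeros(4)
    for x in rng.normal(size=(5, 3)):  # a length-5 input sequence
        state = g(x, state)            # continuous state update
        y = W_out @ state              # the output f, via g
        print(y)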
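
(A minimal sketch of the embedding arithmetic in saynay's comment,
assuming gensim and its downloadable "glove-wiki-gigaword-50"
vectors; any pretrained word embedding would do, and the match is
approximate rather than exact.)

    # E(king) - E(man) + E(woman) ~= E(queen), with real vectors.
    import gensim.downloader as api

    vectors = api.load("glove-wiki-gigaword-50")  # word -> 50-d vector

    # most_similar performs the vector arithmetic and returns the
    # nearest neighbours of the resulting point.
    print(vectors.most_similar(positive=["king", "woman"],
                               negative=["man"], topn=1))
    # prints roughly [('queen', 0.85)] -- close to, but not
    # exactly, the vector for 'queen'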
| beefman wrote:
| Transformers from Scratch
| 
| link: https://e2eml.school/transformers.html
| 
| discussion here: https://news.ycombinator.com/item?id=29315107
| dpflan wrote:
| Another resource: "The Illustrated Transformer"
| 
| - https://jalammar.github.io/illustrated-transformer/
| 
| - HN post for the article:
|   https://news.ycombinator.com/item?id=18351674
| visarga wrote:
| For accessibility I recommend Yannic Kilcher's video review of
| "Attention Is All You Need":
| 
| https://www.youtube.com/watch?v=iDulhoQ2pro
| 
| Yannic has made about 62 other transformer paper reviews since
| then; you can find all the usual suspects:
| 
| https://www.youtube.com/watch?v=u1_qMdb0kYU&list=PL1v8zpldgH...
| moffkalast wrote:
| Transformers (2007)
| timy2shoes wrote:
| I prefer to go to the original source, specifically The
| Transformers (1984-87).
| visarga wrote:
| So transformers have done it again: another sub-field of ML has
| all its past approaches surpassed by a simple language model, at
| least when there is enough data.
| 
| So far they can handle text, images, video, code, proteins, and
| now planning and behavior. It's like a universal learning
| algorithm, and it reminds me of the uniformity of the brain. I
| hope we're going to see much more efficient hardware
| implementations in the future.
| blovescoffee wrote:
| I wouldn't say they've "done it" quite yet. There's definitely an
| application for imitation learning, but that might be it.
| Translating the work on sequence-to-sequence models to sequence-
| to-action is something I've also considered researching. A few
| challenges exist, which the author touches on in just one
| sentence. First, we need data about previous sequences of
| actions, and collecting it is necessarily a challenge in many
| fields of robotics/learning. A related problem is that of
| exploration: how exactly should we inform the exploration of new
| sequences? Also, if our policy is based on the prediction of a
| Transformer, does it have the traditional desirable properties of
| a policy in an RL environment? Off the top of my head it seems
| like a Transformer fed into an MLP would probably fit, but I'm
| not sure. Transformers do seem promising, but it's a bit early to
| say they've "done it" :)
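
(A minimal PyTorch sketch of the "Transformer fed into an MLP"
policy blovescoffee mentions: a causal Transformer encoder over a
history of states, with an MLP head producing action logits. The
class name, dimensions, and discrete action head are illustrative
assumptions, not the article's architecture; positional encodings
are omitted for brevity.)

    import torch
    import torch.nn as nn

    class TransformerPolicy(nn.Module):
        def __init__(self, state_dim, n_actions, d_model=64,
                     n_heads=4, n_layers=2):
            super().__init__()
            self.embed = nn.Linear(state_dim, d_model)
            layer = nn.TransformerEncoderLayer(
                d_model, n_heads, batch_first=True)
            self.encoder = nn.TransformerEncoder(layer, n_layers)
            self.head = nn.Sequential(              # the MLP on top
                nn.Linear(d_model, d_model), nn.ReLU(),
                nn.Linear(d_model, n_actions))

        def forward(self, states):  # (batch, time, state_dim)
            t = states.size(1)
            # Causal mask: each step attends only to itself and
            # earlier steps, never the future.
            mask = torch.triu(torch.full((t, t), float("-inf")),
                              diagonal=1)
            h = self.encoder(self.embed(states), mask=mask)
            return self.head(h[:, -1])  # logits for the next action

    policy = TransformerPolicy(state_dim=8, n_actions=4)
    logits = policy(torch.randn(1, 10, 8))   # a 10-step history
    action = torch.distributions.Categorical(logits=logits).sample()
    print(action.item())

Sampling from the logits gives a stochastic policy; whether such a
policy retains the usual desirable RL properties is exactly the
open question raised in the comment.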