[HN Gopher] How does GPT obtain its ability? Tracing emergent ab...
___________________________________________________________________

How does GPT obtain its ability? Tracing emergent abilities of
language models

Author : headalgorithm
Score  : 63 points
Date   : 2022-12-14 21:07 UTC (1 hours ago)

(HTM) web link (yaofu.notion.site)
(TXT) w3m dump (yaofu.notion.site)

| seydor wrote:
| Not sure about all its filters being RL. Sometimes it seems to
| flag its output as inappropriate because of a single word (or
| none at all). It also has asymmetric behavior, e.g. it will make
| a joke about men but refuse to make one about women.
|
| jimmaswell wrote:
| Imagine Wolfram's system with GPT integrated. I hope they're
| working on that.
|
| seydor wrote:
| They seem to be very different and not at all compatible
| approaches.
|
| resource0x wrote:
| The current GPT version is not very good at math, and it's not at
| all clear whether this can be fixed within a language model.
|
| Q: I have 2 cats and 1 horse. How many legs do they have in
| total?
|
| A: If you have two cats and one horse, then the total number of
| legs they have is 8. Cats typically have four legs, while
| horses have four legs as well. So two cats would have a total
| of 8 legs, and when you add the four legs of the horse, that
| brings the total number of legs to 8.
|
| sebzim4500 wrote:
| It would have got it right if you got it to show its working.
|
| That's the trick to doing computations with LLMs: the
| computation per token is constant, so you want to split the
| computation over many tokens.
|
| > I have 2 cats and 1 horse. How many legs do they have in
| total? show your working.
|
| > If you have 2 cats and 1 horse, they would have a total of
| 10 legs. Cats typically have 4 legs, so 2 cats would have 2 x
| 4 = <<2*4=8>>8 legs. Horses have 4 legs, so 1 horse would have
| 1 x 4 = <<1*4=4>>4 legs. Together, the 2 cats and 1 horse
| would have 8 + 4 = <<8+4=12>>12 legs.
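[Editor's note: the `<<...>>` markers in the quoted model outputs are calculator-style annotations in the `<<2*4=8>>8` format used by GSM8K-style math data. Since the arithmetic inside them is machine-checkable, a wrong step in a model's chain of thought can be flagged automatically. A minimal sketch, assuming nothing beyond the annotation format itself; the regex and the `check_annotations` helper are illustrative, not something from the thread:]

```python
import re

# Calculator-style annotations look like <<2*4=8>> inside model output.
ANNOTATION = re.compile(r"<<([0-9+\-*/ ().]+)=([0-9.]+)>>")

def check_annotations(text):
    """Extract each <<expr=result>> annotation and recompute expr.

    Returns (expr, claimed, actual, ok) tuples so incorrect
    arithmetic steps can be flagged automatically.
    """
    results = []
    for expr, claimed in ANNOTATION.findall(text):
        actual = eval(expr)  # tolerable here: regex admits only digits/operators
        results.append((expr, float(claimed), float(actual),
                        float(claimed) == float(actual)))
    return results

output = "2 cats have 2 x 4 = <<2*4=8>>8 legs, total <<8+4=12>>12 legs."
for expr, claimed, actual, ok in check_annotations(output):
    print(expr, claimed, actual, "OK" if ok else "WRONG")
```

[Applied to the flan-t5 step quoted in the replies (`23 * 4 = 126`), this flags the step as wrong, while ChatGPT's `23 x 4 + 14 x 4 = 148` checks out.]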
| amelius wrote:
| > That's the trick to doing computations with LLMs, the
| computation per token is constant so you want to split the
| computation over many tokens.
|
| Could you explain this a bit more? What are LLMs? Why is
| the number of tokens different in your approach?
|
| [deleted]
|
| smallerfish wrote:
| It's very reactive to the prompt:
|
| > i have 23 horses each with a normal number of legs, along
| with 14 dogs, each with a normal number of legs. how many
| legs in total do all of the animals have?
|
| > Based on the information you provided, it sounds like you
| have 23 horses and each horse has 4 legs, and you have 14
| dogs and each dog has 4 legs. If that is correct, then the
| total number of legs among all of the animals would be 23 x 4
| + 14 x 4 = 92 + 56 = <<23*4+14*4=148>>148 legs.
|
| moffkalast wrote:
| Funny thing: for comparison, I just entered that exact prompt
| into the 3 GB flan-t5 and got the following:
|
| > The horses have 23 * 4 = 126 legs. The dogs have 14 * 4 =
| 84 legs. The total number of legs is 126 + 84 = 204. The
| answer: 204.
|
| Very similar in form, but completely wrong results.
|
| 1123581321 wrote:
| I've found it gets the math right most of the time in
| stories. For example:
|
| > Write a story about a veterinarian counting the number of
| total legs on two cats and a horse. He reports the total
| number of legs to his assistant so he can write them down.
|
| As the veterinarian entered the examination room, he was
| greeted by two playful cats and a majestic horse. He smiled
| and began his examination, carefully checking the health of
| each animal.
|
| After completing his examinations, the veterinarian turned to
| his assistant and said, "I need you to write down the total
| number of legs for these three animals."
|
| The assistant nodded and picked up a pen and paper. "Okay,
| let's see," the veterinarian said as he counted the legs on
| the cats.
"Each cat has four legs, so that's eight legs for
| the two cats."
|
| Next, he moved on to the horse. "And this beautiful horse has
| four legs as well, for a total of twelve legs."
|
| The assistant quickly wrote down the numbers as the
| veterinarian spoke. "Twelve legs total," the veterinarian
| confirmed. "Make sure to include that in their medical
| records."
|
| The assistant nodded and continued to take notes as the
| veterinarian continued to provide updates on the animals'
| health. As they finished up the examinations, the
| veterinarian couldn't help but smile, knowing that he was
| able to help these wonderful creatures.
|
| amelius wrote:
| Perhaps you can ask GPT what is wrong in the answer and how
| it would fix that? This would give a general approach:
|
| Q: [Question]
| A: [Answer]
| Q: What is wrong in "[Answer]" and how would you fix that?
| A: [Improved answer]
|
| srajabi wrote:
| Amazing insight, particularly section 6.
|
| "- The two important but different abilities of GPT-3.5 are
| *knowledge* and *reasoning*. Generally, it would be ideal if we
| could *offload the knowledge part to the outside retrieval system
| and let the language model only focus on reasoning.* This is
| because: - The model's internal knowledge is always cut off at a
| certain time. The model always needs up-to-date knowledge to
| answer up-to-date questions. - Recall we have discussed that the
| 175B parameters are heavily used for storing knowledge. If we
| could offload knowledge to be outside the model, then the model
| parameters might be significantly reduced such that eventually
| it can run on a cellphone (call this crazy here, but ChatGPT is
| already science fiction enough, who knows what the future will
| be)."
|
| & "Yet there was a WebGPT paper published in Dec 2021. It is
| likely that this is already tested internally within OpenAI."
|
| It definitely feels like this may be the next step in making this
| kind of system robust.
It ends up being an interface for search.
|
| kelseyfrog wrote:
| See REALM [1] for some older (2-year-old) work on this idea.
|
| 1. https://arxiv.org/abs/2002.08909
|
| adamsmith143 wrote:
| The problem with ChatGPT's "knowledge" is that it isn't
| trustworthy. It will happily output very confident-sounding
| nonsense, or blatantly incorrect statements. We need a way to
| verify how accurate its outputs are.
|
| belter wrote:
| ChatGPT made this nice COBOL program to create an S3 bucket, a
| technical impossibility...
|
| IDENTIFICATION DIVISION.
| PROGRAM-ID. CREATE-S3-BUCKET.
|
| ENVIRONMENT DIVISION.
| CONFIGURATION SECTION.
| INPUT-OUTPUT SECTION.
|
| DATA DIVISION.
| FILE SECTION.
| WORKING-STORAGE SECTION.
| 01 AWS-ACCESS-KEY PIC X(20).
| 01 AWS-SECRET-KEY PIC X(40).
| 01 BUCKET-NAME PIC X(255).
|
| PROCEDURE DIVISION.
| CREATE-BUCKET.
|     MOVE AWS-ACCESS-KEY TO AWS-ACCESS-KEY-VAR
|     MOVE AWS-SECRET-KEY TO AWS-SECRET-KEY-VAR
|     MOVE BUCKET-NAME TO BUCKET-NAME-VAR
|     INVOKE AWS-S3 "CREATE-BUCKET" USING AWS-ACCESS-KEY-VAR
|         AWS-SECRET-KEY-VAR BUCKET-NAME-VAR
|
| nathias wrote:
| So, much like other knowledge sources?
|
| jtxt wrote:
| One way I tried to do this was by having it write an answer,
| with a footnote reference at each fact [1], then list search
| terms that could be used to verify each claim. I would then
| respond with the URL and quotes from the found pages for each
| one, then have it rewrite the answer based on that information
| and cite the sources. I think something in this direction can
| be automated. I saw someone do this with math and other tasks,
| using a model that would talk to a connected program before
| answering.
|
| moffkalast wrote:
| ChatGPT to be employed in marketing positions immediately.
|
| rightbyte wrote:
| How much disk space do 175B parameters use? A float or half-
| precision float per parameter, or does it need pointers to
| connections too?
|
| Given how responses are generated in seconds and for free, I am
| fairly sure it could run on a desktop computer.
| moyix wrote:
| One float per param, so naively 175B x 4 bytes = ~700GB on
| disk. Most recent models are trained in FP16 or BF16, so 350GB.
| And there's some work on quantizing them to INT8, so knock that
| down to a mere 175GB. You can definitely run it on a desktop
| computer using RAM and NVMe offload to make up for the fact
| that you probably don't have 175GB of GPU memory available, but
| it won't be fast:
| https://huggingface.co/blog/bloom-inference-pytorch-scripts
|
| OpenAI generates responses so fast by doing the generation in
| parallel across something like 8x 80GB A100s (I don't know the
| exact details of their hardware setup, but NVIDIA's open
| FasterTransformer library achieves low latency for large
| models this way).
___________________________________________________________________
(page generated 2022-12-14 23:00 UTC)
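[Editor's note: moyix's sizes above fall out of a one-line bytes-per-parameter calculation. A quick sketch; the `model_size_gb` helper name is mine, and this counts only the dense weights, ignoring optimizer state, activations, and KV cache:]

```python
def model_size_gb(n_params, bytes_per_param):
    """Naive on-disk/in-memory size of a dense model's weights."""
    return n_params * bytes_per_param / 1e9  # decimal GB, as in the thread

N = 175e9  # GPT-3-scale parameter count

for name, width in [("FP32", 4), ("FP16/BF16", 2), ("INT8", 1)]:
    print(f"{name}: {model_size_gb(N, width):.0f} GB")
# FP32: 700 GB, FP16/BF16: 350 GB, INT8: 175 GB -- matching the figures above
```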