[HN Gopher] How does GPT obtain its ability? Tracing emergent abilities of language models
       ___________________________________________________________________
        
       How does GPT obtain its ability? Tracing emergent abilities of
       language models
        
       Author : headalgorithm
       Score  : 63 points
       Date   : 2022-12-14 21:07 UTC (1 hour ago)
        
 (HTM) web link (yaofu.notion.site)
 (TXT) w3m dump (yaofu.notion.site)
        
       | seydor wrote:
       | Not sure about all its filters being RL. Sometimes it seems to
       | flag its output as inappropriate because of a single word (or
       | none at all). Also it has asymmetric behavior, e.g. it will make
       | a joke about men but refuse to make one about women.
        
       | jimmaswell wrote:
       | Imagine Wolfram's system with GPT integrated. I hope they're
       | working on that.
        
         | seydor wrote:
         | they seem to be very different and not at all compatible
         | approaches.
        
         | resource0x wrote:
         | Current GPT version is not very good at math, and it's not at
         | all clear whether this can be fixed within a language model.
         | 
         | Q: I have 2 cats and 1 horse. How many legs do they have in
         | total?
         | 
         | A: If you have two cats and one horse, then the total number of
         | legs they have is 8. Cats typically have four legs, while
         | horses have four legs as well. So two cats would have a total
         | of 8 legs, and when you add the four legs of the horse, that
         | brings the total number of legs to 8.
        
           | sebzim4500 wrote:
           | It would have got it right if you got it to show its working.
           | 
            | That's the trick to doing computations with LLMs: the
            | computation per token is constant, so you want to split the
            | computation over many tokens.
           | 
           | > I have 2 cats and 1 horse. How many legs do they have in
           | total? show your working.
           | 
            | > If you have 2 cats and 1 horse, they would have a total of
            | 10 legs. Cats typically have 4 legs, so 2 cats would have 2 x
            | 4 = <<2*4=8>>8 legs. Horses have 4 legs, so 1 horse would have
            | 1 x 4 = <<1*4=4>>4 legs. Together, the 2 cats and 1 horse
            | would have 8 + 4 = <<8+4=12>>12 legs.
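            | 
            | The decomposed working is easy to check mechanically; here is
            | a minimal Python sketch of the reference arithmetic (the dict
            | and function names are illustrative, not from any library):

```python
# Ground-truth leg arithmetic to check the model's answers against.
# LEG_COUNT and total_legs are illustrative names, not from any library.
LEG_COUNT = {"cat": 4, "horse": 4, "dog": 4}

def total_legs(animals):
    """Sum legs over a {species: count} mapping, mirroring the step-by-step working."""
    return sum(LEG_COUNT[species] * n for species, n in animals.items())

# 2 cats and 1 horse -> 2*4 + 1*4 = 12, matching the chain-of-thought answer
print(total_legs({"cat": 2, "horse": 1}))  # 12
```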
        
             | amelius wrote:
             | > That's the trick to doing computations with LLMs, the
             | computation per token is constant so you want to split the
             | computation over many tokens.
             | 
             | Could you explain this a bit more? What are LLMs? Why is
             | the number of tokens different in your approach?
        
             | [deleted]
        
           | smallerfish wrote:
           | It's very reactive to the prompt:
           | 
           | > i have 23 horses each with a normal number of legs, along
           | with 14 dogs, each with a normal number of legs. how many
           | legs in total do all of the animals have?
           | 
           | > Based on the information you provided, it sounds like you
           | have 23 horses and each horse has 4 legs, and you have 14
           | dogs and each dog has 4 legs. If that is correct, then the
            | total number of legs among all of the animals would be 23 x 4
            | + 14 x 4 = 92 + 56 = <<23*4+14*4=148>>148 legs.
        
             | moffkalast wrote:
              | Funny thing: for comparison, I just entered that exact
              | prompt into the 3 GB flan-t5 and got the following:
             | 
             | > The horses have 23 * 4 = 126 legs. The dogs have 14 * 4 =
             | 84 legs. The total number of legs is 126 + 84 = 204. The
             | answer: 204.
             | 
              | Very, very similar in form, but completely wrong results.
        
           | 1123581321 wrote:
           | I've found it gets the math right most of the time in
           | stories. For example:
           | 
           | > Write a story about a veterinarian counting the number of
           | total legs on two cats and a horse. He reports the total
           | number of legs to his assistant so he can write them down.
           | 
           | As the veterinarian entered the examination room, he was
           | greeted by two playful cats and a majestic horse. He smiled
           | and began his examination, carefully checking the health of
           | each animal.
           | 
           | After completing his examinations, the veterinarian turned to
           | his assistant and said, "I need you to write down the total
           | number of legs for these three animals."
           | 
           | The assistant nodded and picked up a pen and paper. "Okay,
           | let's see," the veterinarian said as he counted the legs on
           | the cats. "Each cat has four legs, so that's eight legs for
           | the two cats."
           | 
           | Next, he moved on to the horse. "And this beautiful horse has
           | four legs as well, for a total of twelve legs."
           | 
           | The assistant quickly wrote down the numbers as the
           | veterinarian spoke. "Twelve legs total," the veterinarian
           | confirmed. "Make sure to include that in their medical
           | records."
           | 
           | The assistant nodded and continued to take notes as the
           | veterinarian continued to provide updates on the animals'
           | health. As they finished up the examinations, the
           | veterinarian couldn't help but smile, knowing that he was
           | able to help these wonderful creatures.
        
           | amelius wrote:
            | Perhaps you can ask GPT what is wrong in the answer and how
            | it would fix it? That would give a general approach:
           | 
           | Q: [Question]
           | 
           | A: [Answer]
           | 
           | Q: What is wrong in "[Answer]" and how would you fix that?
           | 
           | A: [Improved answer]
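            | 
            | Sketched as a loop (the `ask` function is a stand-in for a
            | real model call, stubbed here so the control flow is
            | runnable; all names are hypothetical):

```python
# Critique-and-revise loop following the Q/A template above.
# `ask` is a stub standing in for a real model API call.
def ask(prompt):
    return f"<model reply to: {prompt!r}>"

def critique_and_revise(question):
    answer = ask(question)                  # A: [Answer]
    critique = f'What is wrong in "{answer}" and how would you fix that?'
    return answer, ask(critique)            # A: [Improved answer]

first, improved = critique_and_revise(
    "I have 2 cats and 1 horse. How many legs do they have in total?")
print(improved)
```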
        
       | srajabi wrote:
       | Amazing insight, particularly section 6.
       | 
       | "- The two important but different abilities of GPT-3.5 are
       | *knowledge* and *reasoning*. Generally, it would be ideal if we
       | could *offload the knowledge part to the outside retrieval system
       | and let the language model only focus on reasoning.* This is
       | because: - The model's internal knowledge is always cut off at a
       | certain time. The model always needs up-to-date knowledge to
        | answer up-to-date questions. - Recall we have discussed that the
        | 175B parameters are heavily used for storing knowledge. If we could
       | offload knowledge to be outside the model, then the model
       | parameter might be significantly reduced such that eventually, it
       | can run on a cellphone (call this crazy here, but ChatGPT is
       | already science fiction enough, who knows what the future will
       | be)."
       | 
       | & "Yet there was a WebGPT paper published in Dec 2021. It is
       | likely that this is already tested internally within OpenAI."
       | 
       | It definitely feels like this may be the next step in making this
       | kind of system robust. It ends up being an interface for search.
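        | 
        | The knowledge/reasoning split can be sketched with a toy
        | retriever feeding facts into the prompt (everything here is made
        | up for illustration; a real system would use a search index or a
        | dense retriever):

```python
import re

# Toy "offload knowledge, keep reasoning" sketch: a keyword-overlap
# retriever supplies facts; the prompt asks the model only to reason.
DOCS = [
    "Cats have four legs.",
    "Horses have four legs.",
    "The Eiffel Tower is in Paris.",
]

def tokens(text):
    return set(re.findall(r"[a-z]+", text.lower()))

def retrieve(query, k=2):
    # Rank documents by word overlap with the query (stand-in for a real retriever).
    return sorted(DOCS, key=lambda d: -len(tokens(query) & tokens(d)))[:k]

def build_prompt(question):
    context = "\n".join(retrieve(question))
    return f"Context:\n{context}\n\nUsing only the context, answer: {question}"

print(build_prompt("How many legs do cats have?"))
```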
        
         | kelseyfrog wrote:
          | See REALM[1] for some older (two-year-old) work on this idea.
         | 
         | 1. https://arxiv.org/abs/2002.08909
        
         | adamsmith143 wrote:
          | The problem with ChatGPT's "knowledge" is that it isn't
          | trustworthy. It will happily output very confident-sounding
          | nonsense, or blatantly incorrect statements. We need a way to
          | verify how accurate its outputs are.
        
           | belter wrote:
            | ChatGPT made this nice COBOL program to create an S3 Bucket,
            | a technical impossibility...
            | 
            |     IDENTIFICATION DIVISION.
            |     PROGRAM-ID. CREATE-S3-BUCKET.
            | 
            |     ENVIRONMENT DIVISION.
            |     CONFIGURATION SECTION.
            |     INPUT-OUTPUT SECTION.
            | 
            |     DATA DIVISION.
            |     FILE SECTION.
            |     WORKING-STORAGE SECTION.
            |     01 AWS-ACCESS-KEY PIC X(20).
            |     01 AWS-SECRET-KEY PIC X(40).
            |     01 BUCKET-NAME PIC X(255).
            | 
            |     PROCEDURE DIVISION.
            |     CREATE-BUCKET.
            |         MOVE AWS-ACCESS-KEY TO AWS-ACCESS-KEY-VAR
            |         MOVE AWS-SECRET-KEY TO AWS-SECRET-KEY-VAR
            |         MOVE BUCKET-NAME TO BUCKET-NAME-VAR
            |         INVOKE AWS-S3 "CREATE-BUCKET"
            |             USING AWS-ACCESS-KEY-VAR AWS-SECRET-KEY-VAR
            |             BUCKET-NAME-VAR
        
           | nathias wrote:
           | so, much like other knowledge sources?
        
           | jtxt wrote:
            | One way I tried to do this: have it write an answer with a
            | footnote reference at each fact [1], then list search terms
            | that could be used to verify each claim. I would then respond
            | with the URL and quotes from the pages found for each one,
            | and have it rewrite the answer based on that information,
            | citing the sources. I think something in this direction could
            | be automated. I saw someone do this with math and other
            | tasks, having it talk to a connected program before
            | answering.
        
           | moffkalast wrote:
           | ChatGPT to be employed in marketing positions immediately.
        
         | rightbyte wrote:
          | How much disk space do 175B parameters use? A float or half-
          | precision float per parameter, or does it need pointers to
          | connections too?
          | 
          | Given how responses are generated in seconds and for free, I am
          | fairly sure it could run on a desktop computer.
        
           | moyix wrote:
            | One float per param, so naively 175B x 4 bytes = ~700GB on
            | disk. Most recent models are trained in FP16 or BF16, so
            | 350GB. And there's some work on quantizing them to INT8, so
            | knock that down to a mere 175GB. You can definitely run it on
            | a desktop computer using RAM and NVMe offload to make up for
            | the fact that you probably don't have 175GB of GPU memory
            | available, but it won't be fast:
            | https://huggingface.co/blog/bloom-inference-pytorch-scripts
           | 
           | OpenAI generates responses so fast by doing the generation in
           | parallel across something like 8x80GB A100s (I don't know the
           | exact details of their hardware setup, but NVIDIA's open
           | FasterTransformer library achieves low latency for large
           | models this way).
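            | 
            | The arithmetic above, spelled out (weights only; activations,
            | KV cache, and optimizer state are ignored):

```python
# Naive weight-memory arithmetic for a 175B-parameter model.
PARAMS = 175e9  # parameter count

def weight_bytes(bytes_per_param):
    return PARAMS * bytes_per_param

for name, width in [("fp32", 4), ("fp16/bf16", 2), ("int8", 1)]:
    print(f"{name}: {weight_bytes(width) / 1e9:.0f} GB")  # 700, 350, 175 GB
```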
        
       ___________________________________________________________________
       (page generated 2022-12-14 23:00 UTC)