[HN Gopher] Training and aligning LLMs with RLHF and RLHF altern...
       ___________________________________________________________________
        
       Training and aligning LLMs with RLHF and RLHF alternatives
        
       Author : rasbt
       Score  : 65 points
       Date   : 2023-09-10 14:04 UTC (8 hours ago)
        
 (HTM) web link (magazine.sebastianraschka.com)
 (TXT) w3m dump (magazine.sebastianraschka.com)
        
       | scoresmoke wrote:
       | Discussions about LLM alignment often miss topics of data quality
       | and quantity. It turns out that current models like Llama 2 use
       | 10K+ prompts and responses for supervised fine-tuning (SFT) and
        | 100K+ human preference pairs. While preference pairs are
        | fairly easy to annotate, producing a good SFT dataset is hard.
       | 
       | https://evalovernite.substack.com/p/rlhf-math-aint-enough
       | 
       | https://doi.org/10.5281/zenodo.8186168
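        | 
        | For reference, a rough sketch of the two record types (field
        | names are illustrative, not taken from any published dataset):
        | 
        |   # SFT: one prompt paired with one high-quality, human-written
        |   # response.
        |   sft_example = {
        |       "prompt": "Explain what RLHF is in one paragraph.",
        |       "response": "RLHF (RL from human feedback) first ...",
        |   }
        | 
        |   # Preference data for the reward model: one prompt with two
        |   # candidate responses, labeled by which one annotators chose.
        |   preference_example = {
        |       "prompt": "Explain what RLHF is in one paragraph.",
        |       "chosen": "RLHF trains a reward model on human rankings ...",
        |       "rejected": "RLHF is when the model rewards itself ...",
        |   }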
        
       | jamesblonde wrote:
       | I read here that Yann LeCun claimed that even with RLHF, LLMs
       | will still hallucinate - that it's an unavoidable consequence of
        | their autoregressive nature.
       | 
       | https://www.hopsworks.ai/dictionary/rlhf-reinforcement-learn...
        
         | ShamelessC wrote:
         | That goes without saying.
         | 
          | edit: I don't like your linked article at all. It's subtly
          | misleading and/or misinformed - like Yahoo News, but for ML.
         | 
         | to clarify: No one (certainly not OpenAI) suggested that RLHF
         | was useful for reducing hallucinations. It's not for that. The
         | insinuation that it was designed for that purpose (at least
         | partially) and yet "failed" is a faulty one. It was not
         | designed for that purpose. Hallucinations are a known issue
          | with large language models, and while I appreciate LeCun
          | reiterating that, researchers far less prominent than LeCun
          | are also aware of it.
        
         | og_kalu wrote:
          | Likely yes. But "solving" hallucinations is not really
          | important as long as they can be mitigated to a sufficiently
          | low level.
        
           | phillipcarter wrote:
           | Moreover, it's all about use case. If you need a high degree
           | of reliability and reproducibility, don't use LLMs! Not yet,
           | at least. That's fine though, because there's a ton of value
           | they offer in solving problems where that isn't needed.
        
             | 3abiton wrote:
              | I wonder if a new metric will become standard for
              | evaluating LLMs: a hallucination score.
        
             | bugglebeetle wrote:
             | > If you need a high degree of reliability and
             | reproducibility, don't use LLMs!
             | 
             | This is true of pretty much all of machine learning. LLMs
             | are just getting singled out because their outputs are not
              | getting the same level of validation that typically occurs
             | with older approaches. BERT models will also spit out
             | whacky stuff, depending on how they're trained/fine-
             | tuned/used/etc
        
           | bugglebeetle wrote:
           | For many NLP tasks (which is what I mostly use LLMs for),
           | hallucinations can be prevented with simple, procedural
            | checks against the input or a controlled vocabulary. For
            | example, for NER tasks, you can just check whether each
            | extracted entity actually appears in the input text or in
            | the vocabulary.
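            | 
            | Roughly along these lines (just a sketch; the names are
            | illustrative, not from any particular library):
            | 
            |   # Post-hoc check: keep only entities that literally appear
            |   # in the input text or in a known controlled vocabulary.
            |   CONTROLLED_VOCAB = {"Acme Corp", "OpenAI", "Llama 2"}
            | 
            |   def drop_hallucinated(source_text, extracted_entities):
            |       return [e for e in extracted_entities
            |               if e in source_text or e in CONTROLLED_VOCAB]
            | 
            |   # drop_hallucinated(doc, llm_entities) then discards
            |   # anything the model invented out of thin air.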
        
       | Geee wrote:
        | What datasets does OpenAI use for RLHF? Is the assumption
        | correct that it's "time & labor intensive"? Couldn't you take
        | responses from HN / Reddit / Stack Exchange / Quora etc., where
        | answers are already ranked, and train the reward model on that?
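        | 
        | Something like the usual pairwise objective is what I have in
        | mind - just a sketch with made-up names, not OpenAI's actual
        | training code:
        | 
        |   import torch
        |   import torch.nn.functional as F
        | 
        |   # Bradley-Terry style loss: push the reward model's score for
        |   # the higher-ranked answer above that of the lower-ranked one.
        |   def pairwise_reward_loss(r_chosen, r_rejected):
        |       return -F.logsigmoid(r_chosen - r_rejected).mean()
        | 
        |   # r_chosen / r_rejected are scalar scores the reward model
        |   # assigns to the higher- and lower-voted answer to a question.
        |   loss = pairwise_reward_loss(torch.tensor([1.2]), torch.tensor([0.3]))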
        
       ___________________________________________________________________
       (page generated 2023-09-10 23:00 UTC)