[HN Gopher] PMET: Precise Model Editing in a Transformer
       ___________________________________________________________________
        
       PMET: Precise Model Editing in a Transformer
        
       Author : PaulHoule
       Score  : 74 points
       Date   : 2023-08-27 18:35 UTC (4 hours ago)
        
 (HTM) web link (arxiv.org)
 (TXT) w3m dump (arxiv.org)
        
       | ttul wrote:
       | The PRC would doubtless have an interest in precisely removing
       | all knowledge of certain historical facts from LLMs within China.
        
         | quantum_state wrote:
         | they could just use it without publishing the paper ... wonder
         | what the reason could be ...
        
         | PaulHoule wrote:
         | That's just one application.
         | 
         | One of the worst problems of LLMs at this point in time is
         | keeping them updated.
         | 
          | For instance, ChatGPT should be able to talk about Super Bowl
          | XX in January 1986, when the Chicago Bears trounced the New
          | England Patriots (I remember it well because I grew up in New
          | England!), but I couldn't expect it to have anything to say
          | about the (other kind of football) game I saw yesterday where
          | West Ham beat Brighton, because nothing about the latter game
          | is in the training set.
         | 
          | This problem just gets worse as time passes and the world
          | continues to change. Bing's chatbot works around this for my
          | soccer example by running a conventional search query and
          | having the LLM summarize the results, which gave a pretty good
          | summary of the game. But when I asked it pointed questions
          | about this particular game, such as "Who had the most
          | possession?" (relevant because possession was heavily lopsided
          | toward the losing team), it fell down: it seemed to be working
          | off structured statistics that didn't include that data, rather
          | than media reports of the game, which surely would have
          | mentioned it.
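          | 
          | (Roughly, that retrieve-then-summarize pattern looks like the
          | sketch below. It's a minimal illustration, not Bing's actual
          | pipeline: fake_search and fake_llm are made-up stand-ins for a
          | real search API and a real LLM call.)
          | 
          |   # Retrieve-then-summarize sketch with hypothetical stubs.
          |   def fake_search(query: str) -> list[str]:
          |       # Pretend these are snippets from a conventional search.
          |       return [
          |           "West Ham beat Brighton on Saturday.",
          |           "Brighton had most of the possession but lost.",
          |       ]
          | 
          |   def fake_llm(prompt: str) -> str:
          |       # Placeholder: a real system would call a hosted LLM.
          |       return "(answer grounded in the retrieved snippets)"
          | 
          |   def answer(question: str) -> str:
          |       snippets = fake_search(question)
          |       prompt = ("Answer using only the sources below.\n"
          |                 "Question: " + question + "\n"
          |                 + "\n".join(snippets))
          |       return fake_llm(prompt)
          | 
          |   print(answer("Who had the most possession?"))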
         | 
          | With current technology they will need to rebuild the whole
          | thing one day, which will (1) be crazy expensive and (2) break
          | all the document vectors that people have saved from the model,
          | which will be a big problem for anybody using systems like
          | LangChain or doing embedding-based similarity search.
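          | 
          | (A toy sketch of why the saved vectors break: embedding-based
          | search only works when the query and the stored documents are
          | embedded by the same model, so a rebuilt model means re-
          | embedding the whole corpus. embed_v1 below is a made-up stand-
          | in for the real embedding model.)
          | 
          |   import numpy as np
          | 
          |   def embed_v1(text: str) -> np.ndarray:
          |       # Hypothetical stand-in: hash each word into a bucket.
          |       vec = np.zeros(64)
          |       for word in text.lower().split():
          |           vec[hash(word) % 64] += 1.0
          |       return vec / (np.linalg.norm(vec) + 1e-9)
          | 
          |   docs = ["West Ham beat Brighton",
          |           "PMET edits transformer weights"]
          |   index = np.stack([embed_v1(d) for d in docs])  # built once
          | 
          |   def search(query: str) -> str:
          |       # Cosine similarity against the stored vectors; these
          |       # vectors are only meaningful for embed_v1, so swapping
          |       # in a new model invalidates the whole index.
          |       scores = index @ embed_v1(query)
          |       return docs[int(np.argmax(scores))]
          | 
          |   print(search("who won the Brighton game?"))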
         | 
          | There's a real need for some way to update an LLM incrementally
          | without wrecking its performance, and this kind of research
          | points to one path toward that.
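          | 
          | (For a sense of what "editing" means mechanically: the locate-
          | and-edit line of work that PMET builds on (ROME/MEMIT, Meng et
          | al. 2022) treats a fact as a key->value association stored in
          | an MLP weight matrix and rewrites it with a low-rank update.
          | The sketch below is a heavily simplified rank-one version that
          | assumes an identity key covariance; it is an illustration, not
          | the paper's actual algorithm.)
          | 
          |   import numpy as np
          | 
          |   rng = np.random.default_rng(0)
          |   d = 256
          |   W = rng.standard_normal((d, d))   # one MLP projection
          | 
          |   k = rng.standard_normal(d)        # key for the fact to edit
          |   v_new = rng.standard_normal(d)    # desired new value
          | 
          |   # Minimal-norm rank-one change so that W_new @ k == v_new.
          |   W_new = W + np.outer(v_new - W @ k, k) / (k @ k)
          |   assert np.allclose(W_new @ k, v_new)
          | 
          |   # Unrelated (nearly orthogonal) keys are changed far less
          |   # than the edited key, which is the point of "precise" edits.
          |   k_other = rng.standard_normal(d)
          |   print(np.linalg.norm((W_new - W) @ k_other),
          |         np.linalg.norm(W_new @ k - W @ k))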
        
       | KhoomeiK wrote:
        | FYI, Meng et al. (2022) [1] is pretty much required reading to
        | understand this paper.
       | 
       | [1] https://arxiv.org/abs/2202.05262
        
         | lucidrains wrote:
         | Yannic did a great interview with the authors some time ago
         | https://youtu.be/_NMQyOu2HTo
        
       | gmerc wrote:
       | This may drop the cost and significantly increase the feasibility
       | for government / court mandated changes / censoring / edits to
       | models.
        
       ___________________________________________________________________
       (page generated 2023-08-27 23:00 UTC)