[HN Gopher] State-of-the-art open-source chatbot, Vicuna-13B, ju...
       ___________________________________________________________________
        
       State-of-the-art open-source chatbot, Vicuna-13B, just released
       model weights
        
       Author : weichiang
       Score  : 95 points
       Date   : 2023-04-03 20:02 UTC (2 hours ago)
        
 (HTM) web link (twitter.com)
 (TXT) w3m dump (twitter.com)
        
       | weichiang wrote:
       | See the original Vicuna post:
       | https://news.ycombinator.com/item?id=35378683
        
       | a5huynh wrote:
        | Note that what they released are the _delta_ weights from the
        | og LLaMA model. To play around with it, you'll need to grab
        | the original LLaMA 13B model and apply the changes.
        | 
        | > We release Vicuna weights as delta weights to comply with
        | > the LLaMA model license. You can add our delta to the
        | > original LLaMA weights to obtain the Vicuna weights.
       | 
       | Edit: took me a while to find it, here's a direct link to the
       | delta weights: https://huggingface.co/lmsys/vicuna-13b-delta-v0
        
         | swyx wrote:
          | so an extra licensing issue to get around the original
          | non-commercial license... this is just a research
          | curiosity, is it not?
        
         | 0cf8612b2e1e wrote:
          | Not a lawyer, but that still feels like dubious territory.
          | I would still be on the hook for acquiring the original
          | download, which is exactly what Facebook has been launching
          | DMCA takedown requests against the llama-dl project for.
        
           | sebzim4500 wrote:
           | The llama-dl project actually helped you download the
           | weights, whereas this just assumes you already have them.
           | That feels like a pretty massive difference to me.
        
           | zhisbug wrote:
           | https://github.com/facebookresearch/llama/pull/184
        
             | 0cf8612b2e1e wrote:
              | Nobody at Facebook approved it? Given the attention it
              | has received, it's hard to imagine it slipped through
              | the cracks; more likely a deliberate decision not to
              | address it.
        
           | gaogao wrote:
           | It's fairly similar to a ROM patch in the video game space,
           | which has mostly stood the test of time.
        
           | sillysaurusx wrote:
           | (I work on llama-dl.)
           | 
           | We're fighting back against the DMCA requests on the basis
           | that NN weights aren't copyrightable. This thread has
           | details: https://news.ycombinator.com/item?id=35393782
           | 
           | I don't think you have to worry about Facebook going after
           | you. The worst that will happen is that they issue a DMCA, in
           | which case your project gets knocked offline. I don't think
           | they'll be going the RIAA route of suing individual hackers.
           | 
           | The DMCAs were also launched by a third party law firm, not
           | Meta themselves, so there's a bit of "left hand doesn't know
           | what the right hand is doing" in all of this.
           | 
           | I'll keep everyone updated. For now, hack freely.
        
             | meghan_rain wrote:
             | keep up god's work!
        
           | capableweb wrote:
            | Very unlikely you'd face any legal action for merely
            | using any of this. If you share it, it becomes less
            | unlikely.
            | 
            | Edit: Also, judging by a comment from the team in the
            | GitHub repository (https://github.com/lm-
            | sys/FastChat/issues/86#issuecomment-14...), they seem to
            | at least hint at being in contact with the LLaMA team.
        
         | superkuh wrote:
          | That's what they say, but I just spent 10 minutes searching
          | the git repo, reading the relevant .py files and looking at
          | their homepage, and the vicuna-7b-delta and
          | vicuna-13b-delta-v0 files are nowhere to be found. Am I
          | blind or did they announce a release without actually
          | releasing?
        
           | zhwu wrote:
           | If you follow this command in their instruction, the delta
           | will be automatically downloaded and applied to the base
           | model. https://github.com/lm-sys/FastChat#vicuna-13b:
           | `python3 -m fastchat.model.apply_delta --base
           | /path/to/llama-13b --target /output/path/to/vicuna-13b
           | --delta lmsys/vicuna-13b-delta-v0`
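            | 
            | Once that finishes, the same README suggests you can chat
            | with the merged model via something like `python3 -m
            | fastchat.serve.cli --model-name
            | /output/path/to/vicuna-13b` (flag names may have changed
            | since).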
        
             | [deleted]
        
             | viraptor wrote:
              | This can then be quantized to the llama.cpp/gpt4all
              | format, right? Specifically, this only tweaks the
              | existing weights slightly, without changing the
              | structure?
        
             | hotpathdev wrote:
              | I may have missed the detail, but it also expects the
              | PyTorch (Hugging Face) conversion rather than the
              | original LLaMA model.
        
               | zhwu wrote:
               | Yes, you need to convert the original LLaMA model to the
               | huggingface format, according to https://github.com/lm-
               | sys/FastChat#vicuna-weights and https://huggingface.co/do
               | cs/transformers/main/model_doc/llam...
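                | 
                | For what it's worth, the conversion script ships with
                | transformers itself; if the flags haven't changed,
                | the invocation looks roughly like `python3
                | convert_llama_weights_to_hf.py --input_dir
                | /path/to/llama --model_size 13B --output_dir
                | /path/to/llama-13b-hf` (paths are placeholders; the
                | script lives under transformers/models/llama/).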
        
           | MMMercy2 wrote:
           | You can use this command to apply the delta weights.
           | (https://github.com/lm-sys/FastChat#vicuna-13b) The delta
           | weights are hosted on huggingface and will be automatically
           | downloaded.
        
             | superkuh wrote:
             | Thanks! https://huggingface.co/lmsys/vicuna-13b-delta-v0
             | 
             | Edit, later: I found some instructive pages on how to use
             | the vicuna weights with llama.cpp (https://lmsysvicuna.mira
             | heze.org/wiki/How_to_use_Vicuna#Use_...) and pre-made ggml
             | format compatible 4-bit quantized vicuna weights,
             | https://huggingface.co/eachadea/ggml-
             | vicuna-13b-4bit/tree/ma... (8GB ready to go, no 60+GB RAM
             | steps needed)
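              | 
              | If you'd rather quantize it yourself, the llama.cpp
              | flow is roughly this (untested by me, and the tool
              | names drift between releases): `python3 convert-pth-to-
              | ggml.py models/vicuna-13b/ 1` to produce an f16 ggml
              | file, then `./quantize models/vicuna-13b/ggml-
              | model-f16.bin models/vicuna-13b/ggml-model-q4_0.bin 2`
              | for the 4-bit (q4_0) version.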
        
             | eurekin wrote:
             | I did try, but got:
             | 
              | ```
              | ValueError: Tokenizer class LLaMATokenizer does not
              | exist or is not currently imported.
              | ```
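              | 
              | Edit: this appears to be the casing mismatch between
              | older conversions and newer transformers releases
              | ("LLaMATokenizer" vs "LlamaTokenizer"). If that's the
              | cause, patching tokenizer_config.json in the converted
              | model directory should fix it: `sed -i
              | 's/LLaMATokenizer/LlamaTokenizer/'
              | /path/to/llama-13b/tokenizer_config.json`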
        
       | simse wrote:
       | It's actually very impressive. I gave it the task of converting a
       | query and an OpenAPI spec into an API call, and it worked! I've
        | not been successful in getting GPT-3.5 to do this without
        | rambling on about the reasoning for its decision.
        
         | zhwu wrote:
         | Wow, that is very interesting. Would you mind sharing the
         | prompt you used to query the model?
        
         | capableweb wrote:
          | Usually if I want code from the GPT family I always add
          | "Just show me the code, no extra words or explanation" at
          | the end of the prompt, and it works 99% of the time.
          | 
          | Edit: just finished the conversion of Vicuna myself and
          | have been doing some light testing; it seems to work in
          | ~80% of the cases, not as high a success rate as with GPT
          | for sure. There is probably a better way of structuring the
          | prompt for Vicuna.
        
       | ode wrote:
        | Is there some single page that keeps a running status of the
        | various LLMs and the software to make them runnable on
        | consumer hardware?
        
         | takantri wrote:
          | Hi! Funnily enough I couldn't find much on it either, so
          | that's exactly what I've been working on for the past few
          | months, just in case this kind of question got asked.
          | 
          | I've recently opened a GitHub repository which includes
          | information on both AI model series[0] and frontends you
          | can use to run them[1]. I wrote a Reddit post beforehand
          | that's messier, but a lot more technical[2].
         | 
         | I try to keep them as up-to-date as possible, but I might've
         | missed something or my info may not be completely accurate.
         | It's mostly to help get people's feet wet.
         | 
         | [0] - https://github.com/Crataco/ai-
         | guide/blob/main/guide/models.m...
         | 
         | [1] - https://github.com/Crataco/ai-
         | guide/blob/main/guide/frontend...
         | 
         | [2] -
         | https://old.reddit.com/user/Crataco/comments/zuowi9/opensour...
        
       | nerdchum wrote:
       | what are model weights?
        
         | ozmodiar wrote:
         | They basically encapsulate what a model has "learned." ML
         | models without their weights are useless because the output is
         | essentially random noise. You then train the model on data, and
         | it changes the weights into numbers that cause the whole thing
         | to work. Training data and processing power are usually very
         | expensive so the resulting weights are valuable.
        
         | MMMercy2 wrote:
          | They are the parameters of this large language model. There
          | are 13 billion of them, stored as fp16 numbers.
        
         | superkuh wrote:
          | Essentially a computer neural network is just a lot of
          | addition (and matrix multiplication) of floating point
          | numbers. The parameters are the "strength" or "weights" of
          | the connections between neurons on different layers and the
          | "bias" of each neuron. If neuron Alice is connected to
          | neuron Bob, Alice has a value of 0.7, and the weight of
          | Alice's connection to Bob is 0.5, then the value sent from
          | Alice to Bob is 0.35. This value (and the values from all
          | the other incoming connections) is summed and added to the
          | neuron's bias.
         | 
          | I highly recommend checking out 3blue1brown's series on how
          | neural nets, gradient descent, and the dot product
          | (implemented as a matrix multiplication) all tie together:
         | https://www.youtube.com/watch?v=aircAruvnKk
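          | 
          | In code, a single neuron's forward pass is tiny. A toy
          | sketch with made-up numbers (numpy assumed):
          | 
          | ```
          | import numpy as np
          | 
          | # incoming activations from the previous layer (Alice & co)
          | inputs = np.array([0.7, 0.2, 0.9])
          | # learned connection strengths -- the "weights"
          | weights = np.array([0.5, -1.3, 0.8])
          | bias = -0.4  # learned per-neuron offset
          | 
          | # weighted sum plus bias, then a nonlinearity (ReLU here);
          | # Alice's contribution is 0.7 * 0.5 = 0.35
          | bob = max(0.0, np.dot(inputs, weights) + bias)
          | ```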
        
         | detrites wrote:
         | A large array of uniquely-set floating point values. (AKA
         | "parameters".)
         | 
          | In a language model, a word is put in one end (as a
          | numerical index into a wordlist), then it and the weights
          | are multiplied together, and a new word comes out (again as
          | an index).
         | 
         | Numbers in, numbers out, and a small bit of logic that maps
         | words to numbers and back at either end. ("Encodings".)
         | 
         | "Training" is the typically expensive process of feeding huge
         | amounts of data into the model, to get it to choose the magic
         | values for its weights that allow it to do useful stuff that
         | looks and feels like that training data.
         | 
          | Something else that can be done with weights is that they
          | can be "fine-tuned", or "tweaked" slightly, to give
          | different overall results out of the model, tailoring it to
          | some new use-case. Often the model gets a new name
          | afterwards.
         | 
         | In this case, what's been released is not actually the weights.
         | It's a set of these tweaks ("deltas"), which are intended to be
         | added to Meta's LLaMA model weights to end up with the final
         | intended LLaMA-based model, called "Vicuna".
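          | 
          | Mechanically, applying a delta is just elementwise addition
          | over matching tensors. A minimal sketch (not the actual
          | FastChat code, which also deals with sharded checkpoints
          | and the like):
          | 
          | ```
          | import torch
          | 
          | def apply_delta(base, delta):
          |     # vicuna = llama + delta, tensor by tensor
          |     return {k: base[k] + delta[k] for k in base}
          | 
          | # stand-ins for the real state dicts
          | llama = {"w": torch.tensor([0.10, 0.20])}
          | delta = {"w": torch.tensor([0.05, -0.10])}
          | vicuna = apply_delta(llama, delta)  # {"w": [0.15, 0.10]}
          | ```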
        
           | MuffinFlavored wrote:
           | > A large array of uniquely-set floating point values.
           | 
           | How large? How many elements?
        
             | skrblr wrote:
             | It's in the name of the model - "Vicuna-13B" implies there
             | are 13 billion parameters.
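              | 
              | At fp16 (2 bytes per parameter) that's roughly 13e9 * 2
              | = 26 GB of weights, which is also why applying the
              | delta, with base and delta in memory at once, wants on
              | the order of 60 GB of RAM.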
        
         | tomp wrote:
         | the secret sauce of AI
        
           | zhisbug wrote:
           | lol weights are all you need
        
       | phoenixreader wrote:
       | Could someone explain how to test this? Applying the delta
       | conversion requires 60GB of CPU RAM. Do you just have 60GB RAM on
       | your machine?
        
         | taf2 wrote:
          | I got the 64GB MacBook Pro but I'm already realizing the
          | 96GB laptop would have made more sense. I got it in
          | January, right before all the AI craze really lit up; I
          | distinctly remember thinking, who would ever need more than
          | 64GB of RAM...
        
       | vlugorilla wrote:
       | From the git repo:
       | 
       | > This conversion command needs around 60 GB of CPU RAM.
       | 
       | Ok. I don't have that. Has/will someone release the full weights
       | with the deltas applied?
        
       | [deleted]
        
       | mlboss wrote:
       | Not bad.
       | 
       | https://pastebin.com/urDUsEew
        
       | mesmertech wrote:
        | Amazing model, close to and probably better than Bard. The
        | journey to getting the weights was a fun one : )
        
       ___________________________________________________________________
       (page generated 2023-04-03 23:01 UTC)