[HN Gopher] State-of-the-art open-source chatbot, Vicuna-13B, ju...
___________________________________________________________________
State-of-the-art open-source chatbot, Vicuna-13B, just released
model weights
Author : weichiang
Score  : 95 points
Date   : 2023-04-03 20:02 UTC (2 hours ago)
(HTM) web link (twitter.com)
(TXT) w3m dump (twitter.com)
| weichiang wrote:
| See the original Vicuna post:
| https://news.ycombinator.com/item?id=35378683
| a5huynh wrote:
| Note that what they released are the _delta_ weights from the
| original LLaMA model. To play around with it, you'll need to
| grab the original LLaMA 13B model and apply the changes.
| 
| > We release Vicuna weights as delta weights to comply with
| > the LLaMA model license. You can add our delta to the
| > original LLaMA weights to obtain the Vicuna weights.
| 
| Edit: took me a while to find it, here's a direct link to the
| delta weights: https://huggingface.co/lmsys/vicuna-13b-delta-v0
| swyx wrote:
| so an extra licensing issue to get around the original non-
| commercial license... this is just a research curiosity, is it
| not?
| 0cf8612b2e1e wrote:
| Not a lawyer, but that still feels like dubious territory. I
| would still be on the hook for acquiring the original download,
| something Facebook has been launching DMCA takedown requests
| over in the case of the llama-dl project.
| sebzim4500 wrote:
| The llama-dl project actually helped you download the weights,
| whereas this just assumes you already have them. That feels
| like a pretty massive difference to me.
| zhisbug wrote:
| https://github.com/facebookresearch/llama/pull/184
| 0cf8612b2e1e wrote:
| Nobody at Facebook approved it? Given the attention it has
| received, it's hard to imagine it has slipped through the
| cracks; more likely it's a deliberate decision not to address
| it.
| gaogao wrote:
| It's fairly similar to a ROM patch in the video game space,
| which has mostly stood the test of time.
| sillysaurusx wrote:
| (I work on llama-dl.)
| 
| We're fighting back against the DMCA requests on the basis that
| NN weights aren't copyrightable. This thread has details:
| https://news.ycombinator.com/item?id=35393782
| 
| I don't think you have to worry about Facebook going after you.
| The worst that will happen is that they issue a DMCA, in which
| case your project gets knocked offline. I don't think they'll
| go the RIAA route of suing individual hackers.
| 
| The DMCAs were also launched by a third-party law firm, not
| Meta themselves, so there's a bit of "the left hand doesn't
| know what the right hand is doing" in all of this.
| 
| I'll keep everyone updated. For now, hack freely.
| meghan_rain wrote:
| keep up god's work!
| capableweb wrote:
| Very unlikely you'd face any legal action for usage of
| anything. If you share it, then it becomes less unlikely.
| 
| Edit: Also, judging by a comment from the team in the GitHub
| repository
| (https://github.com/lm-sys/FastChat/issues/86#issuecomment-14...),
| they seem to at least hint at being in contact with the llama
| team.
| superkuh wrote:
| That's what they say, but I just spent 10 minutes searching the
| git repo, reading the relevant .py files, and looking at their
| homepage, and the vicuna-7b-delta and vicuna-13b-delta-v0 files
| are nowhere to be found. Am I blind, or did they announce a
| release without actually releasing?
| zhwu wrote:
| If you follow this command in their instructions, the delta
| will be automatically downloaded and applied to the base
| model (https://github.com/lm-sys/FastChat#vicuna-13b):
| `python3 -m fastchat.model.apply_delta --base
| /path/to/llama-13b --target /output/path/to/vicuna-13b
| --delta lmsys/vicuna-13b-delta-v0`
| [deleted]
| viraptor wrote:
| This can then be quantized to the llama.cpp/gpt4all format,
| right? Specifically, this only tweaks the existing weights
| slightly, without changing the structure?
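[Ed.: the delta-application step viraptor is asking about can be pictured as a plain elementwise add over matching tensors: the merged checkpoint keeps exactly the same tensor names and shapes as the base model. A minimal sketch with toy tensors, not FastChat's actual implementation:]

```python
import numpy as np

def apply_delta(base: dict, delta: dict) -> dict:
    """Elementwise-add delta weights onto base weights.

    Both checkpoints must share tensor names and shapes, which is
    why the merged model keeps the original LLaMA architecture.
    """
    assert base.keys() == delta.keys(), "checkpoint structure must match"
    merged = {}
    for name, tensor in base.items():
        assert tensor.shape == delta[name].shape
        merged[name] = tensor + delta[name]
    return merged

# Toy one-tensor "checkpoint" standing in for the real 13B weights.
base = {"layer0.weight": np.array([[0.1, 0.2], [0.3, 0.4]])}
delta = {"layer0.weight": np.array([[0.01, -0.02], [0.0, 0.05]])}
vicuna = apply_delta(base, delta)
print(vicuna["layer0.weight"])  # same shape as base, values shifted
```

[Ed.: since shapes are unchanged, the merged model can then go through the usual llama.cpp conversion and quantization steps like any other LLaMA-family checkpoint.]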
| hotpathdev wrote:
| I may have missed the detail, but it also expects the PyTorch
| conversion rather than the original LLaMA model.
| zhwu wrote:
| Yes, you need to convert the original LLaMA model to the
| Hugging Face format, according to
| https://github.com/lm-sys/FastChat#vicuna-weights and
| https://huggingface.co/docs/transformers/main/model_doc/llam...
| MMMercy2 wrote:
| You can use this command to apply the delta weights
| (https://github.com/lm-sys/FastChat#vicuna-13b). The delta
| weights are hosted on Hugging Face and will be automatically
| downloaded.
| superkuh wrote:
| Thanks! https://huggingface.co/lmsys/vicuna-13b-delta-v0
| 
| Edit, later: I found some instructive pages on how to use the
| Vicuna weights with llama.cpp
| (https://lmsysvicuna.miraheze.org/wiki/How_to_use_Vicuna#Use_...)
| and pre-made ggml-format-compatible 4-bit quantized Vicuna
| weights:
| https://huggingface.co/eachadea/ggml-vicuna-13b-4bit/tree/ma...
| (8GB, ready to go, no 60+GB RAM steps needed)
| eurekin wrote:
| I did try, but got:
| 
| ``` ValueError: Tokenizer class LLaMATokenizer does not exist
| or is not currently imported. ```
| simse wrote:
| It's actually very impressive. I gave it the task of converting
| a query and an OpenAPI spec into an API call, and it worked!
| I've not been successful in getting GPT-3.5 to do this without
| rambling on about the reasoning for its decisions.
| zhwu wrote:
| Wow, that is very interesting. Would you mind sharing the
| prompt you used to query the model?
| capableweb wrote:
| Usually if I want code from the GPT family, I always add "Just
| show me the code, no extra words or explanation" at the end of
| the prompt, and it works 99% of the time.
| 
| Edit: just finished the conversion of Vicuna myself and have
| been doing some light testing; it seems to work in ~80% of
| cases, not as high a success rate as with GPT for sure.
| Probably there is a better way of structuring the prompt for
| Vicuna.
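[Ed.: eurekin's `ValueError` was a common stumbling block at the time: older LLaMA conversions wrote the class name `"LLaMATokenizer"` into `tokenizer_config.json`, while newer transformers releases renamed the class to `LlamaTokenizer`. One widely shared workaround was to patch the config file by hand. A sketch, assuming that casing mismatch really is the cause; the demo path is a throwaway, and a real run would point at the converted llama-13b directory:]

```python
import json
import tempfile
from pathlib import Path

def patch_tokenizer_class(config_path: Path) -> bool:
    """Rewrite the old 'LLaMATokenizer' casing to 'LlamaTokenizer'.

    Returns True if the file was changed, False if no patch was needed.
    """
    cfg = json.loads(config_path.read_text())
    if cfg.get("tokenizer_class") == "LLaMATokenizer":
        cfg["tokenizer_class"] = "LlamaTokenizer"
        config_path.write_text(json.dumps(cfg, indent=2))
        return True
    return False

# Demo on a temporary file standing in for the real
# tokenizer_config.json inside a converted llama-13b checkout.
with tempfile.TemporaryDirectory() as d:
    p = Path(d) / "tokenizer_config.json"
    p.write_text(json.dumps({"tokenizer_class": "LLaMATokenizer"}))
    patch_tokenizer_class(p)
    print(json.loads(p.read_text())["tokenizer_class"])  # LlamaTokenizer
```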
| ode wrote:
| Is there a single page that keeps a running status of the
| various LLMs and the software to make them runnable on
| consumer hardware?
| takantri wrote:
| Hi! Funnily enough, I couldn't find much on it either, so
| that's exactly what I've been working on for the past few
| months, just in case this kind of question got asked.
| 
| I've recently opened a GitHub repository which includes
| information on both AI model series[0] and frontends you can
| use to run them[1]. I've written a Reddit post beforehand
| that's messier, but a lot more technical[2].
| 
| I try to keep them as up-to-date as possible, but I might've
| missed something or my info may not be completely accurate.
| It's mostly to help get people's feet wet.
| 
| [0] - https://github.com/Crataco/ai-guide/blob/main/guide/models.m...
| 
| [1] - https://github.com/Crataco/ai-guide/blob/main/guide/frontend...
| 
| [2] - https://old.reddit.com/user/Crataco/comments/zuowi9/opensour...
| nerdchum wrote:
| what are model weights?
| ozmodiar wrote:
| They basically encapsulate what a model has "learned." ML
| models without their weights are useless, because the output
| is essentially random noise. You then train the model on data,
| and it changes the weights into numbers that cause the whole
| thing to work. Training data and processing power are usually
| very expensive, so the resulting weights are valuable.
| MMMercy2 wrote:
| They are the parameters of this large language model. There
| are 13B fp16 numbers.
| superkuh wrote:
| Essentially, a computer neural network is just a lot of
| addition (and matrix multiplication) of floating-point
| numbers. The parameters are the "strength" or "weights" of the
| connections between neurons on different layers and the "bias"
| of each neuron. If neuron Alice is connected to neuron Bob,
| Alice has a value of 0.7, and the weight of Alice's connection
| to Bob is 0.5, then the value sent from Alice to Bob is 0.35.
| This value (and the values from all the other incoming
| connections) are summed and added to the neuron's negative
| bias.
| 
| I highly recommend checking out 3blue1brown's series on how
| neural nets, gradient descent, and the dot product
| (implemented as a matrix multiplication) all tie together:
| https://www.youtube.com/watch?v=aircAruvnKk
| detrites wrote:
| A large array of uniquely-set floating-point values. (AKA
| "parameters".)
| 
| In a language model, a word is put in one end (as a numerical
| index into a wordlist), then it and the weights are multiplied
| together, and a new word comes out (again as an index).
| 
| Numbers in, numbers out, and a small bit of logic that maps
| words to numbers and back at either end. ("Encodings".)
| 
| "Training" is the typically expensive process of feeding huge
| amounts of data into the model, to get it to choose the magic
| values for its weights that allow it to do useful stuff that
| looks and feels like that training data.
| 
| Something else that can be done with weights is that they can
| be "fine-tuned", or "tweaked" slightly, to give different
| overall results out of the model, tailored to some new
| use-case. Often the model gets a new name afterwards.
| 
| In this case, what's been released is not actually the
| weights. It's a set of these tweaks ("deltas"), which are
| intended to be added to Meta's LLaMA model weights to end up
| with the final intended LLaMA-based model, called "Vicuna".
| MuffinFlavored wrote:
| > A large array of uniquely-set floating point values.
| 
| How large? How many elements?
| skrblr wrote:
| It's in the name of the model - "Vicuna-13B" implies there are
| 13 billion parameters.
| tomp wrote:
| the secret sauce of AI
| zhisbug wrote:
| lol weights are all you need
| phoenixreader wrote:
| Could someone explain how to test this? Applying the delta
| conversion requires 60GB of CPU RAM. Do you just have 60GB of
| RAM on your machine?
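[Ed.: the ~60 GB figure follows from the "13B fp16 numbers" mentioned a few comments up: each fp16 parameter is two bytes, and the merge step holds more than one full copy of the weights at once. A back-of-the-envelope check; rough arithmetic, not FastChat's actual memory accounting:]

```python
params = 13e9        # Vicuna-13B: ~13 billion parameters
bytes_per_param = 2  # fp16 = 16 bits = 2 bytes

one_copy_gb = params * bytes_per_param / 1e9
print(f"one fp16 copy: ~{one_copy_gb:.0f} GB")     # ~26 GB

# Holding the base weights plus the delta weights simultaneously is
# already two copies, which with overhead lands near the quoted ~60 GB.
print(f"base + delta: ~{2 * one_copy_gb:.0f} GB")  # ~52 GB
```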
| taf2 wrote:
| I got the 64GB MacBook Pro, but I'm already realizing the 96GB
| laptop would have made sense. I got it in January, right
| before all the AI craze really lit up - I distinctly remember
| thinking, who would ever need more than 64GB of RAM...
| vlugorilla wrote:
| From the git repo:
| 
| > This conversion command needs around 60 GB of CPU RAM.
| 
| Ok, I don't have that. Has/will someone release the full
| weights with the deltas applied?
| [deleted]
| mlboss wrote:
| Not bad.
| 
| https://pastebin.com/urDUsEew
| mesmertech wrote:
| Amazing model, close to and probably better than Bard. The
| journey to getting the weights was a fun one : )
___________________________________________________________________
(page generated 2023-04-03 23:01 UTC)