[HN Gopher] A simple guide to fine-tuning Llama 2
       ___________________________________________________________________
        
       A simple guide to fine-tuning Llama 2
        
       Author : samlhuillier
       Score  : 113 points
       Date   : 2023-07-24 19:18 UTC (3 hours ago)
        
 (HTM) web link (brev.dev)
 (TXT) w3m dump (brev.dev)
        
       | treprinum wrote:
        | Is there any tutorial on how to use HuggingFace LLaMA 2-derived
        | models? They don't have the original LLaMA checkpoint files and
        | can't be used with Meta's provided inference code; instead they
        | use .bin files. I am only interested in Python code, so no
        | llama.cpp.
        
         | ramesh31 wrote:
         | >I am only interested in Python code so no llama.cpp.
         | 
          | llama.cpp has Python bindings:
          | https://pypi.org/project/llama-cpp-python/
         | 
         | Here's using it with langchain:
         | https://python.langchain.com/docs/integrations/llms/llamacpp
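          | 
          | A minimal sketch of the bindings (the model filename is just
          | an example; point it at whatever GGML file you have locally):
          |     from llama_cpp import Llama
          |     
          |     # Load a local GGML model file
          |     llm = Llama(model_path="./llama-2-7b-chat.ggmlv3.q4_0.bin")
          |     
          |     # Completion call; returns an OpenAI-style dict
          |     out = llm("Q: Name the planets. A:", max_tokens=64,
          |               stop=["Q:"])
          |     print(out["choices"][0]["text"])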
        
         | lolinder wrote:
         | I'd reconsider your rejection of llama.cpp if I were you. You
         | can always call out to it from Python, but llama.cpp is by far
         | the most active project in this space, and they've gotten the
         | UX to the point where it's extremely simple to use.
         | 
         | This user on HuggingFace has all the models ready to go in GGML
         | format and quantized at various sizes, which saves a lot of
         | bandwidth:
         | 
         | https://huggingface.co/TheBloke
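          | 
          | You can even pull a single quantized file straight from one
          | of those repos (repo id and filename below are examples;
          | check the model card for the exact names):
          |     from huggingface_hub import hf_hub_download
          |     from llama_cpp import Llama
          |     
          |     # Download one quantization variant, not the whole repo
          |     path = hf_hub_download(
          |         repo_id="TheBloke/Llama-2-7B-Chat-GGML",
          |         filename="llama-2-7b-chat.ggmlv3.q4_K_M.bin",
          |     )
          |     llm = Llama(model_path=path)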
        
           | treprinum wrote:
            | I understand. I use llama.cpp for my own personal stuff,
            | but I can't override the policy on the project I want to
            | plug it into, which is Python-only.
        
       | syntaxing wrote:
        | Can someone share a good tutorial on how to prepare the data?
        | And does a 3090 have enough VRAM for fine-tuning? I want to do
        | what the author mentioned and fine-tune the model on my personal
        | data, but I'm not sure how to prepare it. I tried vector search
        | + LLM, but I find the results very subpar when using a local
        | LLM.
        
         | notpublic wrote:
          | As mentioned in the OP's blog post, check out
          | https://github.com/facebookresearch/llama-recipes.git,
          | specifically the files in the ft_datasets directory.
         | 
          | I am able to fine-tune meta-llama/Llama-2-13b-chat-hf on a
          | 3090 using the instructions from quickstart.ipynb.
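          | 
          | For preparing your own data, the simplest route there is the
          | Alpaca-style format: a JSON list of instruction/input/output
          | records (the record below is a made-up example):
          |     import json
          |     
          |     # Alpaca-style records; "input" may be an empty string
          |     examples = [{
          |         "instruction": "Summarize the note.",
          |         "input": "Met Alice to discuss the Q3 roadmap.",
          |         "output": "Meeting with Alice about the Q3 roadmap.",
          |     }]
          |     
          |     with open("alpaca_data.json", "w") as f:
          |         json.dump(examples, f)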
        
         | samlhuillier wrote:
         | Working on this now!
        
           | syntaxing wrote:
            | I'm looking forward to this! Are you using an adapter (I
            | don't see one mentioned in your article)? I was under the
            | impression a full fine-tune of 7B won't fit, since it'd
            | take 25GB of VRAM or so.
        
             | samlhuillier wrote:
              | Yes, using the QLoRA adapter that Hugging Face provides
              | with PEFT.
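              | 
              | Roughly like this (hyperparameters here are illustrative,
              | not the exact ones from the post):
              |     import torch
              |     from transformers import (AutoModelForCausalLM,
              |                               BitsAndBytesConfig)
              |     from peft import (LoraConfig, get_peft_model,
              |                       prepare_model_for_kbit_training)
              |     
              |     # Load the base model quantized to 4-bit
              |     bnb = BitsAndBytesConfig(
              |         load_in_4bit=True,
              |         bnb_4bit_quant_type="nf4",
              |         bnb_4bit_compute_dtype=torch.bfloat16,
              |     )
              |     model = AutoModelForCausalLM.from_pretrained(
              |         "meta-llama/Llama-2-7b-hf",
              |         quantization_config=bnb, device_map="auto")
              |     
              |     # Attach small trainable LoRA matrices on top
              |     model = prepare_model_for_kbit_training(model)
              |     lora = LoraConfig(r=8, lora_alpha=16,
              |                       target_modules=["q_proj", "v_proj"],
              |                       lora_dropout=0.05,
              |                       task_type="CAUSAL_LM")
              |     model = get_peft_model(model, lora)
              |     model.print_trainable_parameters()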
        
         | jawerty wrote:
         | I just streamed this last night
         | https://m.youtube.com/watch?v=TYgtG2Th6fI&t=3998s
         | 
          | I've been live-streaming myself fine-tuning Llama on my GitHub
          | data (to code like me).
        
           | jeremycarter wrote:
           | Fantastic job! Very easy to follow
        
             | jawerty wrote:
              | Thank you! I have some other streams where I do little
              | projects like these; check them out!
        
       | eachro wrote:
        | I've been a bit out of the loop on this area but would like to
        | get back into it, given how much has changed in the LLM
        | landscape in the last 1-2 years. What models are small enough to
        | play with on Colab? Or am I going to have to spin up my own GPU
        | box on AWS to be able to mess around with these models?
        
         | naderkhalil wrote:
          | Hey, you could use a template on brev.dev to spin up a GPU box
          | with the model and a Jupyter notebook. Alternatively, the
          | Falcon 7B model should be small enough for Colab.
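          | 
          | Something like this should fit in a free Colab T4's ~15GB,
          | assuming bitsandbytes and accelerate are installed:
          |     from transformers import (AutoModelForCausalLM,
          |                               AutoTokenizer)
          |     
          |     # 4-bit quantization shrinks a 7B model to ~4GB of VRAM
          |     tok = AutoTokenizer.from_pretrained("tiiuae/falcon-7b")
          |     model = AutoModelForCausalLM.from_pretrained(
          |         "tiiuae/falcon-7b", load_in_4bit=True,
          |         device_map="auto", trust_remote_code=True)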
        
         | [deleted]
        
       | nmitchko wrote:
        | This is a pretty useless post. You could also follow any of the
        | 1000 identical tutorials about Llama and use the Hugging
        | Face-format weights that are already uploaded to Hugging Face...
       | 
       | Here are some actually useful links
       | 
       | https://blog.ovhcloud.com/fine-tuning-llama-2-models-using-a...
       | 
       | https://huggingface.co/meta-llama/Llama-2-70b-hf
       | 
       | https://huggingface.co/meta-llama/Llama-2-7b-hf
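        | 
        | Those HF-format weights load directly with transformers (you
        | need to accept Meta's license on the model page and log in
        | with huggingface-cli first):
        |     import torch
        |     from transformers import (AutoModelForCausalLM,
        |                               AutoTokenizer)
        |     
        |     name = "meta-llama/Llama-2-7b-hf"
        |     tok = AutoTokenizer.from_pretrained(name)
        |     model = AutoModelForCausalLM.from_pretrained(
        |         name, torch_dtype=torch.float16, device_map="auto")
        |     
        |     ids = tok("The capital of France is",
        |               return_tensors="pt").to(model.device)
        |     out = model.generate(**ids, max_new_tokens=20)
        |     print(tok.decode(out[0], skip_special_tokens=True))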
        
         | onlypositive wrote:
          | Is it really "useless" if I didn't even know about Llama? And
          | look, now I have 3 more links to dive into.
         | 
         | This is the opposite of useless.
        
           | mciancia wrote:
            | Well, it's quite possible that it is useless for you, since
            | you hadn't heard about Llama until now ;)
        
         | jeremycarter wrote:
         | Thanks!
        
       | m00dy wrote:
        | Which dataset would be good for fine-tuning a sales-assistant-
        | style chatbot?
        
         | ShamelessC wrote:
         | You could try using a transcript of The Wolf of Wall Street,
         | maybe throw in Glengarry Glen Ross for good measure?
         | 
         | /s
        
       ___________________________________________________________________
       (page generated 2023-07-24 23:00 UTC)