[HN Gopher] Ask HN: Open-source ChatGPT alternatives?
       ___________________________________________________________________
        
       Ask HN: Open-source ChatGPT alternatives?
        
        What's the state of the art in open-source GPT models right
        now, in practical terms? If your typical use case is taking a
        pretrained model and fine-tuning it to a specific task, which
        LLM would yield the best results while running on consumer
        hardware? Note that I'm specifically asking for software that I
        can run on my own hardware; I'm not interested in paying OpenAI
        $0.02 per API request.  I'll start the recommendations with
        Karpathy's nanoGPT: https://github.com/karpathy/nanoGPT  What
        else do we have?
        
       Author : baobabKoodaa
       Score  : 49 points
       Date   : 2023-02-14 20:58 UTC (2 hours ago)
        
       | speedgoose wrote:
        | You also have GPT-J 6B and BLOOM, but to be honest they are
        | not like ChatGPT.
       | 
       | https://huggingface.co/EleutherAI/gpt-j-6B
       | 
       | https://huggingface.co/bigscience/bloom
        
         | baobabKoodaa wrote:
          | Can you elaborate on how they are not like ChatGPT? I was
          | looking into GPT-JT (built on top of the GPT-J you
          | mentioned). If I spend the time to actually fine-tune and
          | run inference with one of these models, am I likely to be
          | disappointed with the results?
        
           | speedgoose wrote:
            | It depends on what you use it for. If it's to classify
            | text and you can fine-tune it, it's probably good enough.
           | 
            | For following instructions, ChatGPT is a lot better, but
            | GPT-J did relatively well if given enough examples on
            | simple tasks.
            | 
            | For a chatbot, it's not really usable.
        
           | smoldesu wrote:
           | Maybe? GPT-J is closer to the AI-Dungeon model of
           | intelligence. It's able to fill in the blank after what you
           | type, but it's hysterically bad at answering precise
           | questions (to the point that I had to nerf it for fun to see
           | how stupid the output could get).
           | 
           | It will handle basic natural language and context clues just
           | fine. It's just not very fast, and the generations probably
           | won't be as thorough as ChatGPT.
        
             | baobabKoodaa wrote:
             | Am I going to be able to fine tune GPT-J or GPT-JT on
             | consumer hardware?
        
               | [deleted]
        
               | simonw wrote:
               | When you say "fine tune" here what are you looking to do?
               | 
               | The impression I've got is that fine tuning large
               | language models is mostly useful for very simple tasks,
               | such as training a spam or categorization filter.
               | 
               | If you're looking to take a model and then e.g. train it
               | on a few thousand additional pages of documentation in
               | order to get it to answer project-specific questions,
               | I've got the impression that fine tuning isn't actually a
               | useful way to achieve that (I'd love to be proven wrong
               | about this).
               | 
               | Instead, people are getting good results with "retrieval
               | augmented generation" - where you first run a search (or
               | an embeddings-based semantic search) against your docs to
               | find relevant snippets, then feed them to the large
               | language model as part of a glued together prompt.
               | 
               | I wrote about my explorations of this technique here -
               | plenty of other people have written about this too:
               | https://simonwillison.net/2023/Jan/13/semantic-search-
               | answer...
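                | 
                | Roughly, that flow looks like this (a rough sketch;
                | I'm assuming sentence-transformers for the embedding
                | step, and query_llm() is a made-up placeholder for
                | whatever model you end up calling):
                | 
                |   # Rough sketch of retrieval augmented generation.
                |   from sentence_transformers import SentenceTransformer
                |   from sentence_transformers import util
                | 
                |   docs = ["...documentation snippet one...",
                |           "...documentation snippet two..."]
                |   embedder = SentenceTransformer("all-MiniLM-L6-v2")
                |   doc_emb = embedder.encode(docs, convert_to_tensor=True)
                | 
                |   def answer(question):
                |       q = embedder.encode(question,
                |                           convert_to_tensor=True)
                |       # find the snippets closest to the question
                |       hits = util.semantic_search(q, doc_emb, top_k=3)[0]
                |       snips = [docs[h["corpus_id"]] for h in hits]
                |       prompt = ("Context:\n" + "\n\n".join(snips) +
                |                 f"\n\nQuestion: {question}\nAnswer:")
                |       return query_llm(prompt)  # placeholder LLM call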
        
               | baobabKoodaa wrote:
               | > When you say "fine tune" here what are you looking to
               | do?
               | 
               | As an example of fine tuning, I might take a pretrained
               | model and then continue training it with a custom dataset
               | that is tailored to a specific text generation task (not
               | classification). Here is an example of a custom dataset
               | that I might fine tune on:
               | 
               | https://github.com/baobabKoodaa/future/blob/8d2ae91e6a6f0
               | 0c7...
               | 
                | I would like the LLM to generate fictional text in the
                | same style as the fine-tuning dataset.
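                | 
                | Concretely, I'm picturing the standard Hugging Face
                | causal-LM training loop, continued on my own text
                | file. Roughly this sketch (the model name and file
                | name are just placeholders):
                | 
                |   # Rough sketch: continue training a pretrained
                |   # causal LM on a custom text file.
                |   from datasets import load_dataset
                |   from transformers import (
                |       AutoModelForCausalLM, AutoTokenizer,
                |       DataCollatorForLanguageModeling,
                |       Trainer, TrainingArguments)
                | 
                |   name = "EleutherAI/gpt-neo-125M"  # placeholder
                |   tok = AutoTokenizer.from_pretrained(name)
                |   tok.pad_token = tok.eos_token
                |   model = AutoModelForCausalLM.from_pretrained(name)
                | 
                |   ds = load_dataset("text",
                |                     data_files={"train": "train.txt"})
                |   ds = ds.map(lambda b: tok(b["text"], truncation=True,
                |                             max_length=512),
                |               batched=True)
                | 
                |   trainer = Trainer(
                |       model=model,
                |       args=TrainingArguments(
                |           "out", num_train_epochs=3,
                |           per_device_train_batch_size=1),
                |       train_dataset=ds["train"],
                |       data_collator=DataCollatorForLanguageModeling(
                |           tok, mlm=False))
                |   trainer.train()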
        
               | simonw wrote:
               | I've not yet managed to convince myself if fine tuning
               | LLMs works for that kind of example.
               | 
                | Have you tried fine-tuning GPT-3 via the OpenAI APIs
                | for this? It should only cost a few dollars for that
                | smaller set of examples, and it would at least help
                | demonstrate if it's possible to get the results you
                | want with the current best-in-class language model
                | before you try to run that against a smaller model
                | that you can fit on your own hardware.
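                | 
                | The rough shape of it with the current openai Python
                | package is something like this (a sketch from memory,
                | so treat the details as approximate; data.jsonl is a
                | file of prompt/completion pairs):
                | 
                |   # Rough sketch: fine-tune via the OpenAI API.
                |   # data.jsonl holds {"prompt": ..., "completion": ...}
                |   # lines, one JSON object per line.
                |   import openai
                | 
                |   f = openai.File.create(file=open("data.jsonl", "rb"),
                |                          purpose="fine-tune")
                |   job = openai.FineTune.create(training_file=f.id,
                |                                model="davinci")
                |   print(job.id)  # poll with openai.FineTune.retrieve()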
        
               | baobabKoodaa wrote:
                | > Have you tried fine-tuning GPT-3 via the OpenAI APIs
                | > for this
               | 
               | I haven't. That's not a bad idea.
               | 
                | > it would at least help demonstrate if it's possible to
               | get the results you want with the current best-in-class
               | language model before you try to run that against a
               | smaller model that you can fit on your own hardware
               | 
               | The dataset you saw was (mostly) generated with ChatGPT
               | and davinci-002, by using prompt engineering instead of
               | fine tuning. So it's definitely possible to produce good
               | results like this (though no judgment here on the
               | question of prompt engineering vs fine tuning).
        
       | [deleted]
        
       | mindcrime wrote:
       | Previous related discussions:
       | 
       | https://news.ycombinator.com/item?id=34115698
       | 
       | https://news.ycombinator.com/item?id=33955125
       | 
       | https://news.ycombinator.com/item?id=34163413
       | 
       | https://news.ycombinator.com/item?id=34628256
       | 
       | https://news.ycombinator.com/item?id=34147281
       | 
       | https://news.ycombinator.com/item?id=34445873
        
         | baobabKoodaa wrote:
          | I went through all of these, and the only one I found that
          | _might_ be fine-tunable on consumer hardware seems to be
          | KoboldAI. Not sure yet.
        
       | gigel82 wrote:
        | GPT Neo 1.3B (https://huggingface.co/EleutherAI/gpt-neo-1.3B)
        | is the largest I can run on my 12GB VRAM GPU, and I'm sorry to
        | say its output is a joke (nowhere near GPT-3, more like GPT-2
        | level of BS).
       | 
        | However, you can fine-tune it, and I'm sure that with lots of
        | fine-tuning and some jiggling of the parameters you can get a
        | half-decent custom-purpose solution.
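        | 
        | If you want to try it on a similar card, loading it in half
        | precision should help it fit (a rough sketch, assuming a CUDA
        | GPU and the transformers library):
        | 
        |   # Rough sketch: load GPT-Neo 1.3B in fp16 on a 12GB GPU.
        |   import torch
        |   from transformers import AutoModelForCausalLM, AutoTokenizer
        | 
        |   tok = AutoTokenizer.from_pretrained("EleutherAI/gpt-neo-1.3B")
        |   model = AutoModelForCausalLM.from_pretrained(
        |       "EleutherAI/gpt-neo-1.3B",
        |       torch_dtype=torch.float16).to("cuda")
        | 
        |   ids = tok("The meaning of life is",
        |             return_tensors="pt").to("cuda")
        |   out = model.generate(**ids, max_new_tokens=50,
        |                        do_sample=True)
        |   print(tok.decode(out[0], skip_special_tokens=True))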
        
       | dieselgate wrote:
       | I'm not very familiar with this space but would have thought
       | "OpenAI" would be at least somewhat open-source. Is this just
       | naming and not relevant to the product at all?
        
         | baobabKoodaa wrote:
         | > Is this just naming and not relevant to the product at all?
         | 
          | They took funding as an open-source non-profit. Once they
          | got the money, they turned into a closed-source, for-profit
          | censorship machine.
        
           | titaniczero wrote:
            | Yeah, but to be fair, GPT-2 is open source and Whisper (a
            | high-quality speech recognition and multilingual
            | translation model) is also open source. A few years ago I
            | needed a good model for transcription for a project and
            | couldn't find anything decent. They really have
            | contributed to the open source community.
            | 
            | If they keep releasing older models and keep their
            | cutting-edge technology for profit, I'm fine with it.
        
             | baobabKoodaa wrote:
             | Fair enough.
        
         | gerash wrote:
          | OpenAI stopped being a non-profit and hasn't published
          | anything on ChatGPT yet.
        
       | PaulHoule wrote:
       | Here is a Python package that can download transformer embeddings
       | automatically
       | 
       | https://www.trychroma.com/
       | 
        | In general a lot of people download models from huggingface;
        | I think that package automates that task.
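        | 
        | If I remember the API right, basic usage is something like
        | this (a sketch; Chroma pulls down a default
        | sentence-transformers model to embed the documents for you):
        | 
        |   # Rough sketch of Chroma usage.
        |   import chromadb
        | 
        |   client = chromadb.Client()
        |   coll = client.create_collection("docs")
        |   coll.add(documents=["GPT-J is a 6B parameter model.",
        |                       "BLOOM was trained by BigScience."],
        |            ids=["1", "2"])
        |   print(coll.query(query_texts=["who made BLOOM?"],
        |                    n_results=1))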
        
         | baobabKoodaa wrote:
         | I don't know if there is an implication here that I don't get,
         | but I don't see the connection between this answer and the
         | question I asked.
        
           | PaulHoule wrote:
           | You want a large language model. This gives you a large
           | language model.
        
             | baobabKoodaa wrote:
             | I asked for recommendations on which LLM would run on
             | consumer hardware for the purposes of fine tuning and
             | inference, with good results. You linked a package that can
             | be used to download models? I don't see how these things
             | are related.
        
               | PaulHoule wrote:
                | Why don't you go ask ChatGPT, then?
               | 
                | But seriously, I am asking a very similar question
                | with a focus on LLMs for classification (e.g. "Is this
                | an article about a sports game?"), information
                | extraction, clustering and such. I am not so
                | interested in generation (which I am assuming you
                | are); however, GPT-style embeddings are useful for the
                | kind of work I do and are interchangeable with
                | BERT-like and other embeddings.
               | 
                | "Good" or "best" is something you have to define for
                | yourself, and the one thing every successful A.I.
                | developer has done is develop a facility for testing
                | whether the solution performs acceptably. With that
                | library you can download a model and start working
                | with it; again, the successful people all tested at
                | least one model. In the time since your post, a
                | run-of-the-mill Python developer could have made some
                | progress. Learn Python or get a technical co-founder.
               | 
                | For my kind of tasks I want something that handles
                | bigger documents than ChatGPT does, and when I go
                | shopping for models I cannot find a high-quality, very
                | large transformer that has been assembled with a
                | tokenizer and trained hard on language tasks. When I
                | look at the literature, it seems the very-long-context
                | transformers like Reformer wouldn't perform so well if
                | somebody did try to build an LLM with them, so I wait.
                | I am certain that somebody will upload a better model
                | to huggingface someday -- that's the thought process
                | it takes to get an answer for questions like yours.
               | 
                | If you look, though, at the process used to make
                | ChatGPT able to converse, there is the GPT-style
                | embedding and then a process of dialog generation
                | trained on totally different principles, which is
                | "Reinforcement Learning from Human Feedback":
               | 
               | https://www.assemblyai.com/blog/how-chatgpt-actually-
               | works/
               | 
                | and I think you are not going to get _that_ kind of
                | capability open source, in that the training data
                | doesn't exist for it. There are many things you have
                | to do once you have that training data, but I think
                | there are many people able to follow that path now
                | that it has been blazed.
        
       | lopuhin wrote:
       | In terms of models which are reasonably fast to run and easy to
       | install, I think Flan-T5 is one of the best:
       | https://huggingface.co/google/flan-t5-xxl - although out of the
       | box it's more focused on giving short answers and it's very far
       | from ChatGPT.
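        | 
        | Trying it out takes a few lines with transformers (a rough
        | sketch; flan-t5-base is the small checkpoint that should run
        | almost anywhere):
        | 
        |   # Rough sketch: run Flan-T5 via the transformers pipeline.
        |   from transformers import pipeline
        | 
        |   pipe = pipeline("text2text-generation",
        |                   model="google/flan-t5-base")
        |   out = pipe("Answer the question: who wrote Hamlet?",
        |              max_new_tokens=40)
        |   print(out[0]["generated_text"])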
        
         | baobabKoodaa wrote:
          | It's not clear from the link whether it's possible to
          | fine-tune Flan-T5 on consumer hardware.
        
           | lopuhin wrote:
           | They released different sizes from "small" to "xxl", and at
           | least "base" should be small enough to fine-tune virtually
           | anywhere.
        
       | smoldesu wrote:
        | This runs fine in RAM-constrained (<2 GB) situations:
       | https://huggingface.co/EleutherAI/gpt-neo-125M
       | 
        | Its bigger brother, 1.3B, uses ~5.5 GB of memory but yields
        | slightly more GPT-like answers. Both take ~5-20 seconds to
        | generate a response though, so take that into account when
        | building with them.
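        | 
        | If you want to kick the tires and time it yourself, this is
        | about all it takes on CPU (a rough sketch):
        | 
        |   # Rough sketch: time gpt-neo-125M generation on CPU.
        |   import time
        |   from transformers import pipeline
        | 
        |   gen = pipeline("text-generation",
        |                  model="EleutherAI/gpt-neo-125M")
        |   t0 = time.time()
        |   out = gen("Open source ChatGPT alternatives include",
        |             max_new_tokens=60, do_sample=True)
        |   print(out[0]["generated_text"])
        |   print(f"took {time.time() - t0:.1f}s")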
        
       | eric_hui wrote:
       | [dead]
        
       | init0 wrote:
       | [flagged]
        
         | flangola7 wrote:
         | Facebook also has OPT, which is one of the largest public pre-
         | trained models
        
         | baobabKoodaa wrote:
         | Did you generate this with Bing? I don't believe this answer is
         | human-written. It describes 2 separate projects with the words
         | "It is available on GitHub and can be used in various research
         | projects".
        
           | franze wrote:
            | According to GPTZero AI detection: ".. text may include
            | parts written by AI"
        
           | speedgoose wrote:
            | It's also not very up to date, which is suspicious for
            | such a precise answer.
        
       | valgaze wrote:
       | Clue on conversation "history"-- "While ChatGPT is able to
       | remember what the user has said earlier in the conversation,
       | there is a limit to how much information it can retain. The model
       | is able to reference up to approximately 3000 words (or 4000
       | tokens) from the current conversation - any information beyond
       | that is not stored.
       | 
       | Please note that ChatGPT is not able to access past conversations
       | to inform its responses."
       | 
       | https://help.openai.com/en/articles/6787051-does-chatgpt-rem...
       | 
        | Some interesting techniques I've seen involve essentially a
        | ring buffer: after each turn, a call is made to _summarize_
        | the conversation up to that point and use that summary as
        | context for the subsequent prompt.
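        | 
        | In rough Python the loop looks something like this (a sketch;
        | call_llm() is a placeholder for whatever completion API or
        | local model you're using):
        | 
        |   # Rough sketch of the summarize-as-you-go technique.
        |   summary = ""
        | 
        |   def chat_turn(user_message):
        |       global summary
        |       prompt = (f"Summary of conversation so far: {summary}\n"
        |                 f"User: {user_message}\nAssistant:")
        |       reply = call_llm(prompt)  # placeholder completion call
        |       # replace the running summary with a fresh one that
        |       # includes this latest exchange
        |       summary = call_llm(
        |           "Summarize this exchange in a few sentences:\n"
        |           f"{summary}\nUser: {user_message}\n"
        |           f"Assistant: {reply}")
        |       return reply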
        
         | roxgib wrote:
         | Presumably one of the benefits of running your own model is
         | that you can feed extra data into it via training rather than
         | purely through inference? I.e. if you're a software company you
         | could fine-tune it on your codebase, improving its answers
         | without increasing inference time?
        
       | SparkyMcUnicorn wrote:
       | Open Assistant (started by some of the people that started Stable
       | Diffusion I think?) is very early, but looks very promising.
       | 
       | https://open-assistant.io/
       | 
       | https://github.com/LAION-AI/Open-Assistant
        
         | baobabKoodaa wrote:
         | I briefly looked at this and it doesn't seem like they provide
         | a model that I can fine tune on consumer hardware?
        
           | Mizza wrote:
           | Hold your damn horses, this technology is brand new, requires
           | a tremendous amount of data gathering and computation,
           | there's a massive volunteer effort already under way, and
           | you're begging for a free home version so you can save $0.02
           | cents. You want a hole, pick up a shovel.
        
             | baobabKoodaa wrote:
             | > Hold your damn horses, this technology is brand new,
             | requires a tremendous amount of data gathering and
             | computation, there's a massive volunteer effort already
             | under way, and you're begging for a free home version so
             | you can save $0.02 cents. You want a hole, pick up a
             | shovel.
             | 
              | $0.02 cents per API request. What I would like to do is
              | provide people with a free service on the internet that
              | uses an LLM under the hood. If you are so rich that you
              | can burn $10k on an internet hobby project, good for
              | you, but just know that not everybody else is as rich as
              | you.
             | 
             | Also I wasn't "begging for a free home version". In my
             | original question I already provided one option. Sorry, I
             | mean "one free home version". So it's not like I was
             | starved out of options. There are options. I was asking for
             | recommendations.
        
       ___________________________________________________________________
       (page generated 2023-02-14 23:00 UTC)