[HN Gopher] Ask HN: Open-source ChatGPT alternatives?
___________________________________________________________________

Ask HN: Open-source ChatGPT alternatives?

What's the state of the art in open source GPT models right now, in practical terms? If your typical use case is taking a pretrained model and fine tuning it to a specific task, which LLM would yield the best results while running on consumer hardware? Note that I'm specifically asking for software that I can run on my own hardware; I'm not interested in paying OpenAI $0.02 per API request. I'll start the recommendations with Karpathy's nanoGPT: https://github.com/karpathy/nanoGPT

What else do we have?

Author : baobabKoodaa
Score  : 49 points
Date   : 2023-02-14 20:58 UTC (2 hours ago)

| speedgoose wrote:
| You also have GPT-J 6B and BLOOM, but to be honest they are not like ChatGPT.
|
| https://huggingface.co/EleutherAI/gpt-j-6B
|
| https://huggingface.co/bigscience/bloom

| baobabKoodaa wrote:
| Can you elaborate on how they are not like ChatGPT? I was looking into GPT-JT (built on top of the GPT-J you mentioned). If I spend the time to actually finetune and run inference with one of these models, am I likely to be disappointed with the results?

| speedgoose wrote:
| It depends what you use it for. If it's to classify text and you can fine tune it, it's probably good enough.
|
| For following instructions, ChatGPT is a lot better, but GPT-J did relatively well if given enough examples on simple tasks.
|
| For a chatbot, it's not really useable.

| smoldesu wrote:
| Maybe? GPT-J is closer to the AI-Dungeon model of intelligence. It's able to fill in the blank after what you type, but it's hysterically bad at answering precise questions (to the point that I had to nerf it for fun to see how stupid the output could get).
|
| It will handle basic natural language and context clues just fine. It's just not very fast, and the generations probably won't be as thorough as ChatGPT.

| baobabKoodaa wrote:
| Am I going to be able to fine tune GPT-J or GPT-JT on consumer hardware?

| [deleted]

| simonw wrote:
| When you say "fine tune" here, what are you looking to do?
|
| The impression I've got is that fine tuning large language models is mostly useful for very simple tasks, such as training a spam or categorization filter.
|
| If you're looking to take a model and then e.g. train it on a few thousand additional pages of documentation in order to get it to answer project-specific questions, I've got the impression that fine tuning isn't actually a useful way to achieve that (I'd love to be proven wrong about this).
|
| Instead, people are getting good results with "retrieval augmented generation" - where you first run a search (or an embeddings-based semantic search) against your docs to find relevant snippets, then feed them to the large language model as part of a glued-together prompt.
|
| I wrote about my explorations of this technique here - plenty of other people have written about this too: https://simonwillison.net/2023/Jan/13/semantic-search-answer...
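A minimal sketch of the retrieval-augmented pattern described above: embed the documentation, pull the snippets closest to the question, and glue them into the prompt. The choice of sentence-transformers, the model name, the toy snippets, and the prompt template are illustrative assumptions, not something taken from the thread.

    # Hypothetical sketch: embed the docs, retrieve the closest snippets,
    # and assemble the prompt that would be sent to the language model.
    from sentence_transformers import SentenceTransformer, util

    docs = ["snippet one ...", "snippet two ...", "snippet three ..."]
    embedder = SentenceTransformer("all-MiniLM-L6-v2")
    doc_emb = embedder.encode(docs, convert_to_tensor=True)

    def build_prompt(question, top_k=2):
        # Embed the question and find the top_k most similar doc snippets.
        q_emb = embedder.encode(question, convert_to_tensor=True)
        hits = util.semantic_search(q_emb, doc_emb, top_k=top_k)[0]
        context = "\n\n".join(docs[h["corpus_id"]] for h in hits)
        return ("Answer the question using only the context below.\n\n"
                f"Context:\n{context}\n\nQuestion: {question}\nAnswer:")

    print(build_prompt("How do I configure the project?"))

The assembled prompt then goes to whichever model is being evaluated, local or hosted; the retrieval step is what keeps the prompt within the model's context window.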
| baobabKoodaa wrote:
| > When you say "fine tune" here, what are you looking to do?
|
| As an example of fine tuning, I might take a pretrained model and then continue training it with a custom dataset that is tailored to a specific text generation task (not classification). Here is an example of a custom dataset that I might fine tune on:
|
| https://github.com/baobabKoodaa/future/blob/8d2ae91e6a6f00c7...
|
| I would like the LLM to generate fictional text in the same style as the fine-tuned dataset.

| simonw wrote:
| I've not yet managed to convince myself whether fine tuning LLMs works for that kind of example.
|
| Have you tried fine tuning GPT-3 via the OpenAI APIs for this? It should only cost a few dollars for that smaller set of examples, and it would at least help demonstrate if it's possible to get the results you want with the current best-in-class language model before you try to run that against a smaller model that you can fit on your own hardware.

| baobabKoodaa wrote:
| > Have you tried fine tuning GPT-3 via the OpenAI APIs for this
|
| I haven't. That's not a bad idea.
|
| > it would at least help demonstrate if it's possible to get the results you want with the current best-in-class language model before you try to run that against a smaller model that you can fit on your own hardware
|
| The dataset you saw was (mostly) generated with ChatGPT and davinci-002, by using prompt engineering instead of fine tuning. So it's definitely possible to produce good results like this (though no judgment here on the question of prompt engineering vs fine tuning).

| [deleted]

| mindcrime wrote:
| Previous related discussions:
|
| https://news.ycombinator.com/item?id=34115698
|
| https://news.ycombinator.com/item?id=33955125
|
| https://news.ycombinator.com/item?id=34163413
|
| https://news.ycombinator.com/item?id=34628256
|
| https://news.ycombinator.com/item?id=34147281
|
| https://news.ycombinator.com/item?id=34445873

| baobabKoodaa wrote:
| I went through all of these, and the only one I found that _might_ be fine-tuneable on consumer hardware seems to be KoboldAI. Not sure yet.

| gigel82 wrote:
| GPT Neo 1.3B (https://huggingface.co/EleutherAI/gpt-neo-1.3B) is the largest I can run on my 12GB VRAM GPU, and I'm sorry to say its output is a joke (nowhere near GPT-3, more like GPT-2 level of BS).
|
| However, you can fine tune it; and I'm sure with lots of fine tuning and some jiggling of the parameters you can get a half-decent custom-purpose solution.

| dieselgate wrote:
| I'm not very familiar with this space but would have thought "OpenAI" would be at least somewhat open-source. Is this just naming and not relevant to the product at all?

| baobabKoodaa wrote:
| > Is this just naming and not relevant to the product at all?
|
| They took funding as an open-source non-profit. Once they got the money they turned into a closed-source for-profit censorship machine.

| titaniczero wrote:
| Yeah, but to be fair GPT-2 is open source and Whisper (a high-quality speech recognition and multilingual translation model) is also open source. A few years ago I needed a good model for transcription for a project and I couldn't find anything decent. They really have contributed to the open source community.
|
| If they keep releasing older models and keep their cutting-edge technology for profit, I'm fine with it.

| baobabKoodaa wrote:
| Fair enough.

| gerash wrote:
| OpenAI stopped being a non-profit and hasn't published anything on ChatGPT yet.

| PaulHoule wrote:
| Here is a Python package that can download transformer embeddings automatically:
|
| https://www.trychroma.com/
|
| In general a lot of people download models from huggingface; I think that package automates that task.
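For anyone following along, the mechanical step of pulling one of the checkpoints named in this thread down from the Hugging Face hub and generating text from it is only a few lines with the transformers library (used directly here, rather than through the package linked above). A rough sketch; the checkpoint, prompt, and sampling settings are arbitrary illustrations, not recommendations from the thread.

    # Hypothetical sketch: download a checkpoint from the Hugging Face hub
    # (fetched and cached locally on first run) and sample a continuation.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    name = "EleutherAI/gpt-neo-1.3B"
    tok = AutoTokenizer.from_pretrained(name)
    model = AutoModelForCausalLM.from_pretrained(name)

    prompt = "Open-source alternatives to ChatGPT include"
    inputs = tok(prompt, return_tensors="pt")
    out = model.generate(**inputs, max_new_tokens=60,
                         do_sample=True, top_p=0.9)
    print(tok.decode(out[0], skip_special_tokens=True))

On first run the weights (roughly 5 GB for the 1.3B model in fp32) are downloaded and cached, so subsequent loads are local.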
| baobabKoodaa wrote:
| I don't know if there is an implication here that I don't get, but I don't see the connection between this answer and the question I asked.

| PaulHoule wrote:
| You want a large language model. This gives you a large language model.

| baobabKoodaa wrote:
| I asked for recommendations on which LLM would run on consumer hardware for the purposes of fine tuning and inference, with good results. You linked a package that can be used to download models? I don't see how these things are related.

| PaulHoule wrote:
| Why don't you go ask ChatGPT then?
|
| But seriously, I am asking a very similar question with a focus on LLMs for classification (e.g. "Is this an article about a sports game?"), information extraction, clustering and such. I am not so interested in generation (which I am assuming you are), however the GPT-style embeddings are useful for the kind of work I do and are interchangeable with BERT-like and other embeddings.
|
| "Good" or "best" is something you have to define for yourself, and the one thing every successful A.I. developer has done is develop a facility for testing whether the solution performs acceptably. With that library you can download a model and start working with it; again, the successful people all tested at least one model. In the time since your post, a run-of-the-mill Python developer could have made some progress. Learn Python or get a non-technical co-founder.
|
| For my kind of tasks I want something that handles bigger documents than ChatGPT, and when I go shopping for models I cannot find a high-quality very large transformer that has been assembled with a tokenizer and trained hard on language tasks. When I look at the literature, it seems the very long transformers like Reformer wouldn't perform so well if somebody did try to build an LLM with them, so I wait. I am certain that somebody will upload a better model to huggingface someday -- that's the thought process it takes to get an answer for questions like yours.
|
| If you look, though, at the process used to make ChatGPT able to converse, there is the GPT-style embedding and then a process of dialog generation trained on totally different principles, which is "Reinforcement Learning from Human Feedback":
|
| https://www.assemblyai.com/blog/how-chatgpt-actually-works/
|
| and I think you are not going to get _that_ kind of capability open source, in that the training data doesn't exist for it. There are many things you have to do once you have that training data, but I think there are many people able to follow that path now that it has been blazed.

| lopuhin wrote:
| In terms of models which are reasonably fast to run and easy to install, I think Flan-T5 is one of the best: https://huggingface.co/google/flan-t5-xxl - although out of the box it's more focused on giving short answers and it's very far from ChatGPT.

| baobabKoodaa wrote:
| It's not clear from the link whether it's possible to fine tune Flan-T5 on consumer hardware?

| lopuhin wrote:
| They released different sizes from "small" to "xxl", and at least "base" should be small enough to fine-tune virtually anywhere.
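As a concrete illustration of that last point, here is a rough sketch of fine-tuning google/flan-t5-base on prompt/completion pairs with the Hugging Face Seq2SeqTrainer (recent transformers and datasets assumed). The toy training pair, output directory, and hyperparameters are placeholders; a real run would want far more examples.

    # Hypothetical sketch: fine-tune flan-t5-base on (prompt, completion) pairs.
    from datasets import Dataset
    from transformers import (AutoModelForSeq2SeqLM, AutoTokenizer,
                              DataCollatorForSeq2Seq, Seq2SeqTrainer,
                              Seq2SeqTrainingArguments)

    name = "google/flan-t5-base"
    tok = AutoTokenizer.from_pretrained(name)
    model = AutoModelForSeq2SeqLM.from_pretrained(name)

    # Toy training data; replace with your own task-specific pairs.
    pairs = Dataset.from_dict({
        "prompt": ["Write a short story about a lighthouse."],
        "completion": ["The lamp had not been lit in forty years..."],
    })

    def tokenize(example):
        enc = tok(example["prompt"], truncation=True, max_length=512)
        enc["labels"] = tok(text_target=example["completion"],
                            truncation=True, max_length=512)["input_ids"]
        return enc

    train = pairs.map(tokenize, remove_columns=pairs.column_names)

    trainer = Seq2SeqTrainer(
        model=model,
        args=Seq2SeqTrainingArguments("flan-t5-out",
                                      per_device_train_batch_size=2,
                                      num_train_epochs=3),
        train_dataset=train,
        data_collator=DataCollatorForSeq2Seq(tok, model=model),
    )
    trainer.train()

The "base" checkpoint is only around 250M parameters, which is why it fits in ordinary consumer VRAM; the same script pointed at "xl" or "xxl" would need far more memory.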
| smoldesu wrote:
| This runs fine in RAM-constrained (<2GB) situations: https://huggingface.co/EleutherAI/gpt-neo-125M
|
| Its bigger brother, 1.3B, uses ~5.5GB of memory but yields slightly more GPT-like answers. Both take ~5-20 seconds to generate a response though, so take that into account when building with it.

| eric_hui wrote:
| [dead]

| init0 wrote:
| [flagged]

| flangola7 wrote:
| Facebook also has OPT, which is one of the largest public pre-trained models.

| baobabKoodaa wrote:
| Did you generate this with Bing? I don't believe this answer is human-written. It describes 2 separate projects with the words "It is available on GitHub and can be used in various research projects".

| franze wrote:
| According to GPTZero AI detection: "...text may include parts written by AI"

| speedgoose wrote:
| It's also not very up to date, which is suspicious for such a precise answer.

| valgaze wrote:
| Clue on conversation "history" -- "While ChatGPT is able to remember what the user has said earlier in the conversation, there is a limit to how much information it can retain. The model is able to reference up to approximately 3000 words (or 4000 tokens) from the current conversation - any information beyond that is not stored.
|
| Please note that ChatGPT is not able to access past conversations to inform its responses."
|
| https://help.openai.com/en/articles/6787051-does-chatgpt-rem...
|
| Some interesting techniques I've seen involve essentially a ring buffer: after each turn a call is made to _summarize_ the conversation up to that point, and that summary is used as context for subsequent prompts.

| roxgib wrote:
| Presumably one of the benefits of running your own model is that you can feed extra data into it via training rather than purely through inference? I.e. if you're a software company you could fine-tune it on your codebase, improving its answers without increasing inference time?

| SparkyMcUnicorn wrote:
| Open Assistant (started by some of the people that started Stable Diffusion, I think?) is very early, but looks very promising.
|
| https://open-assistant.io/
|
| https://github.com/LAION-AI/Open-Assistant

| baobabKoodaa wrote:
| I briefly looked at this and it doesn't seem like they provide a model that I can fine tune on consumer hardware?

| Mizza wrote:
| Hold your damn horses. This technology is brand new, requires a tremendous amount of data gathering and computation, there's a massive volunteer effort already under way, and you're begging for a free home version so you can save $0.02 cents. You want a hole, pick up a shovel.

| baobabKoodaa wrote:
| > Hold your damn horses. This technology is brand new, requires a tremendous amount of data gathering and computation, there's a massive volunteer effort already under way, and you're begging for a free home version so you can save $0.02 cents. You want a hole, pick up a shovel.
|
| $0.02 per API request. What I would like to do is provide people a free service on the internet that uses an LLM under the hood. If you are so rich that you can burn $10k on an internet hobby project, good for you, but just know that everybody else is not as rich as you.
|
| Also, I wasn't "begging for a free home version". In my original question I already provided one option. Sorry, I mean "one free home version". So it's not like I was starved out of options. There are options. I was asking for recommendations.
___________________________________________________________________
(page generated 2023-02-14 23:00 UTC)