[HN Gopher] Mistral 7B
       ___________________________________________________________________
        
       Mistral 7B
        
       Author : jasondavies
       Score  : 673 points
       Date   : 2023-09-27 14:52 UTC (8 hours ago)
        
 (HTM) web link (mistral.ai)
 (TXT) w3m dump (mistral.ai)
        
       | say_it_as_it_is wrote:
       | Will this run on my 486sx/16mhz w/8MB of ram?
        
         | speedgoose wrote:
         | If you have enough storage to use as swap, it should.
        
       | ComputerGuru wrote:
       | The announcement says a lot (and has plenty of numbers) but I
       | feel like the most important one is missing: how many GB of GPU
       | memory does this need, quantized and unquantized?
       | 
       | (Searching tells me Llama2-7b unquantized needs close to 15GB;
       | presumably this is similar?)
        
         | brucethemoose2 wrote:
         | Llama 7B will squeeze on a 6GB GPU quantized. Maybe even less
         | with EX2 quantization.
         | 
          | Foundational model trainers don't seem to worry about
          | quantization much; they just throw the base model out there and
          | then let the community take care of easing the runtime
          | requirements.
        
         | sp332 wrote:
         | One parameter is 16 bits == 2 bytes. So a model with 7 billion
         | parameters needs 14GB of RAM for the un-quantized model, plus
         | some overhead for the KV cache and other "working memory" stuff
         | but that should be fairly low for a 7B model. I expect it will
         | work on a 16GB GPU just fine.
         | 
         | Quantized ones are also easy. 8 bits == 1 byte so that's 7GB
         | for the model. 4-bit gets you below 4GB.
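          | 
          | A minimal sketch of that arithmetic (illustrative only; it
          | ignores the KV cache and other runtime overhead):
          | 
          | ```python
          | def weights_gb(n_params: float, bits: int) -> float:
          |     """Weights-only memory footprint in GB."""
          |     return n_params * bits / 8 / 1e9
          | 
          | for bits in (16, 8, 4):
          |     print(f"7B @ {bits}-bit: {weights_gb(7e9, bits):.1f} GB")
          | # -> 14.0 GB, 7.0 GB, 3.5 GB
          | ```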
        
           | semi-extrinsic wrote:
           | From the Mistral docs, it seems they need 24GB which is kind
           | of odd?
           | 
           | https://docs.mistral.ai/llm/mistral-v0.1
        
             | sp332 wrote:
             | Interesting, and that requirement is repeated on the cloud
             | deployment pages, even the unfinished ones where that is
             | the _only_ requirement listed so far.
             | https://docs.mistral.ai/category/cloud-deployment I wonder
             | if that sliding context window really blows up the RAM
             | usage or something.
        
               | sebzim4500 wrote:
               | Unless I've misunderstood something, the sliding context
               | window should decrease memory usage at inference compared
               | to normal flash attention.
        
             | lerela wrote:
             | We have clarified the documentation, sorry about the
             | confusion! 16GB should be enough but it requires some vLLM
             | cache tweaking that we still need to work on, so we put
             | 24GB to be safe. Other deployment methods and quantized
             | versions can definitely fit on 16GB!
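              | 
              | For the curious, a rough illustration of the kind of vLLM
              | knob involved (illustrative values only, not an official
              | recommendation):
              | 
              | ```python
              | from vllm import LLM, SamplingParams
              | 
              | llm = LLM(
              |     model="mistralai/Mistral-7B-v0.1",
              |     # fraction of VRAM vLLM may claim for
              |     # weights + KV cache
              |     gpu_memory_utilization=0.85,
              | )
              | out = llm.generate(["The capital of France is"],
              |                    SamplingParams(max_tokens=16))
              | print(out[0].outputs[0].text)
              | ```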
        
               | brucethemoose2 wrote:
               | Shouldn't it be much less than 16GB with vLLM's 4-bit
               | AWQ? Probably consumer GPU-ish depending on the batch
               | size?
        
           | brucethemoose2 wrote:
            | It's not so straightforward, as there's some overhead aside
            | from the weights, especially with 7B at ~4 bit.
           | 
           | But this is _probably_ capable of squeezing onto a 6GB (or
           | less?) GPU with the right backend.
        
       | tormeh wrote:
       | Not a big fan of how server-centric the LLM landscape is. I want
       | something that can run locally, and doesn't require any special
       | setup. One install + one model import maximum. Currently unless I
       | want to go clone git repos, install Python dependencies and buy
       | an Nvidia GPU I'm stuck waiting for it to become part of
       | https://webllm.mlc.ai/. That's a website, come to think of it,
       | but at least the computation happens locally with minimal fuss.
        
         | dwringer wrote:
          | You can get llama.cpp or kobold.cpp binaries and load a
          | quantized model right into them on the CPU only; no need to
          | install Python or have an Nvidia GPU.
        
           | tormeh wrote:
           | Well, I'd like it to respond in something close to real-time,
           | and since I have a pretty good non-Nvidia GPU, it makes more
           | sense to wait for the WebGPU port.
        
             | programd wrote:
             | 7 tokens per sec on an i5-11400 CPU using llama.cpp -
             | that's pretty real time for personal use I would think.
        
       | winddude wrote:
       | Are you going to continue to train to a larger param size, say
       | 13b or 30b?
        
         | [deleted]
        
         | brucethemoose2 wrote:
         | There is definitely a demand for a 30B model (aka a model that
         | will comfortably fit on 24GB GPUs (or 32GB of system RAM) and
         | _squeeze_ into 16GB).
        
       | anish_m wrote:
       | What are the SOTA benchmarks for LLMs now? Love the progress on
       | opensource models, but would like to see an uncontaminated and
       | objective framework to evaluate them.
        
       | rgbrgb wrote:
       | This model runs in FreeChat for macOS [0] because it's supported
       | by llama.cpp :)
       | 
       | You'll just need to download a nice GGUF here:
       | https://huggingface.co/TheBloke/Mistral-7B-v0.1-GGUF
       | 
       | [0]: https://github.com/psugihara/FreeChat
        
         | local_crmdgeon wrote:
         | The video recording on the Readme is broken for me on FF/MacOS
        
           | rgbrgb wrote:
           | Thanks for letting me know! I think it broke when I made the
           | repo public this morning. Should be fixed now.
        
         | Patrick_Devine wrote:
         | also works w/ `ollama run mistral`.
        
           | stavros wrote:
           | Thanks for that, I didn't see it in the list and thought it
           | wouldn't be available to just run.
        
         | [deleted]
        
       | anonyfox wrote:
       | can it run on my macbook air locally yet, with acceptable
       | performance? Guess the answer is still no
        
         | fredoliveira wrote:
         | You'll be able to use it with llama.cpp soon [1], so, should
         | run fine on your macbook, yes.
         | 
         | [1]:
         | https://github.com/ggerganov/llama.cpp/pull/3362#issuecommen...
        
           | anonyfox wrote:
           | that really was "soon", wow - already pulling it with ollama!
        
       | riedel wrote:
        | Can anyone provide details about the training of the model? What
        | data is it based on? Common Crawl? (Being a French company, do
        | they also rather focus on English-language tasks?) Where was it
        | trained and with how many resources? They mention Leonardo. I was
        | in an interesting meeting at the German Research Ministry last
        | week where people were complaining that the EuroHPC resources
        | were not sufficient atm to train decent LLMs. I guess in the end
        | these guys also went to CoreWeave in the US.
        
         | brucethemoose2 wrote:
          | > Inspecting the tokenizer model, there is evidence indicating a
          | training dataset of 8T tokens
          | (/mnt/test/datasets/tokenizer_training/8T_train_data/shuffled.txt)
         | 
         | https://github.com/ggerganov/llama.cpp/pull/3362#issuecommen...
        
       | pmarreck wrote:
       | `pip` is a horribly broken way to install more than 1 Python
       | package on a single Linux OS install and I sincerely wish the LLM
       | people would move away from it because I neither want to run
       | every new project in a Docker image, nor do I want to start a new
       | Nix project for every Python project I want to try out
        
         | RockRobotRock wrote:
         | Have you heard of virtual environments?
        
           | pmarreck wrote:
           | Have you ever googled "broken virtualenv"? Mine have broken
           | numerous times, including the problem where updating the
           | system breaks ALL of them. I know what you're going to say
           | next- "You're a technologist, you know how to fix it." True,
           | but I don't like wasted effort, even talented wasted effort
            | is still wasted effort. Meanwhile, Nix stuff _just works_,
           | and I will never use another package manager ever again
           | (unless it is something that uses the core Nix idea, but
           | makes it easier!)
           | 
           | https://www.google.com/search?q=broken+virtualenv
        
         | okwhateverdude wrote:
         | Yeah, gotta setup miniconda to keep things manageable.
         | Basically a generic virtual env manager that is trivial to use.
         | This way you can ensure each one of these LLM projects that
         | want different versions of different libs will get them without
         | stepping on each other or polluting your system.
        
           | pmarreck wrote:
           | I don't want to have to set up miniconda. I don't like
           | Python, I am not part of the Python ecosystem, and I don't
           | want to add a bunch of Python tools to my toolchain just to
           | run a friggin' LLM project.
           | 
           | I'm not a C coder either, but I don't complain about those
           | projects because they're almost always "config; make; make
           | install". I basically want something like that, but for
           | Python. The nerd-tapdancing required here is ridiculous.
        
       | sp332 wrote:
       | Aside from the sliding attention window, I don't see them mention
       | any particular reasons for the incredible performance. I would
       | like to see some tests for benchmark contamination in the
       | training set.
        
         | brucethemoose2 wrote:
         | This ^
         | 
         | I am suspicious of contamination in every finetune I see, and
         | _very_ suspicious in a new foundational model like this.
         | 
         | (For those reading and not following, "contamination" is
         | training a model/finetune on the very test it will be tested
         | on. Normally these known tests are specifically excluded from
         | training datasets so the models can be properly evaluated, but
         | throwing them in is an easy way to "cheat" and claim a model is
         | better than it is.
         | 
         | In a foundational model with a huge dataset, there's also a
         | high probability that well-known evaluation questions snuck
         | into the dataset by accident).
        
           | londons_explore wrote:
            | There's also a good chance that people simply discussing the
            | test questions and answers on forums like Reddit sneaks into
            | the dataset, even if the exact question is filtered out.
        
           | Workaccount2 wrote:
           | We need an AI that can generate bespoke AI tests...
        
         | loudmax wrote:
          | Agreed. It's great that they're making the model available
          | under a permissive license. But without more information about
          | the training data and methodology, it isn't exactly "open
          | source".
        
           | Tommstein wrote:
           | > But without more information about the training data and
           | methodology, it isn't exactly "open source".
           | 
           | Being or not being open source has exactly jack shit to do
           | with that.
        
             | falcor84 wrote:
             | I actually would support that statement. An AI model is a
             | software artifact generated as part of a complex "build"
             | process. Without having access to the details of the
             | process that generated the artifact, including the training
             | data, it's no more open-source than a compiled binary.
        
             | capableweb wrote:
             | I'd argue that it goes against the intent of open source
             | very much.
             | 
             | The idea behind OSS is that you're able to modify it
             | yourself and then use it again from that point. With
             | software, we enable this by making the source code public,
             | and include instructions for how to build/run the project.
             | Then I can achieve this.
             | 
             | But with these "OSS" models, I cannot do this. I don't have
             | the training data and I don't have the training
             | workflow/setup they used for training the model. All they
             | give me is the model itself.
             | 
             | Similar to how "You can't see the source but here is a
             | binary" wouldn't be called OSS, it feels slightly unfair to
             | call LLM models being distributed this way OSS.
        
               | computerex wrote:
                | Whilst not wrong, understand that having the weights
                | released explicitly under Apache is a hell of a lot
                | better than the weights being released under a non-open-
                | source, commercially unfriendly license. At least people
                | can legally use this for their solutions.
        
         | pk-protect-ai wrote:
          | They use some improvements on attention mechanisms; it's on the
          | main page. That is why the 7B model requires 24GB of VRAM;
          | there might be an increased number of attention heads as
          | well...
        
           | sp332 wrote:
           | They clarified https://news.ycombinator.com/item?id=37677311
           | that it should only need 16GB, but the unoptimized runtime
           | currently takes 24.
        
         | rafaelero wrote:
         | Yeah, they should have mentioned details about the dataset.
        
       | iamcreasy wrote:
        | How difficult is it to fine-tune a model like this with specific
        | domain knowledge? I am currently looking into gpt-3.5-turbo-
        | instruct for this same purpose.
        
       | nabakin wrote:
       | They don't mention what datasets were used. I've come across too
       | many models in the past which gave amazing results because
       | benchmarks leaked into their training data. How are we supposed
       | to verify one of these HuggingFace datasets didn't leak the
       | benchmarks into the training data boosting their results? Did
       | they do any checking of their datasets for leaks? How are we
       | supposed to know this is a legit result?
       | 
       | At this point, it should be standard practice to address this
        | concern. Any model which fails to provide good evidence that it
        | doesn't have benchmark leaks should not be trusted until its
        | datasets can be verified, the methodology can be replicated, or a
        | good, independent, private benchmark can be made and used to
        | evaluate the model.
        
         | londons_explore wrote:
         | One solution is to come up with a new benchmark yourself.
         | 
          | Manually benchmarking by coming up with 20 questions, feeding
          | them to a pair of models, and blindly choosing the best result
          | can give you a pretty good figure.
         | 
         | And that can probably be done in under 20 mins of human time.
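          | 
          | A minimal sketch of that blind comparison (model_a and model_b
          | are hypothetical callables that take a prompt and return a
          | completion):
          | 
          | ```python
          | import random
          | 
          | def blind_compare(questions, model_a, model_b):
          |     wins = {"A": 0, "B": 0}
          |     for q in questions:
          |         answers = [("A", model_a(q)), ("B", model_b(q))]
          |         random.shuffle(answers)  # hide which model is which
          |         for i, (_, text) in enumerate(answers, 1):
          |             print(f"\nQ: {q}\n--- Answer {i} ---\n{text}")
          |         pick = int(input("Better answer (1/2)? ")) - 1
          |         wins[answers[pick][0]] += 1
          |     return wins
          | ```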
        
       | beernet wrote:
       | This is exceptionally meh. It reads like an excuse to investors.
       | 
        | A free 7B model is great; however, the practical implications for
        | potential adopters are near zero. You must be crazy or have an
        | easy use case (one that requires no LLM in the first place) if
        | you genuinely believe that this model makes more sense per token
        | than, say, ChatGPT.
        
         | brucethemoose2 wrote:
          | It's extremely cheap to run locally or on a cheap cloud GPU,
          | and (if the claims are true) better than 3.5 Turbo with
          | finetuning. It's also unaligned.
        
       | all2 wrote:
       | For those running ollama, here's the ollama release for
       | Mistral-7B
       | 
       | https://ollama.ai/library/mistral
        
       | wg0 wrote:
       | If I give you a binary (all binaries are numbers) but I don't
       | give you the source code and I say it is open source. Is it open
       | source?
       | 
       | Also, I give you a model (all models are numbers) and I say it is
       | open source but I don't give you the program and data that
       | resulted in "compilation" of that model (numbers) so is it open
       | source?
       | 
       | Wouldn't it be more of a new word - open use?
        
         | brucethemoose2 wrote:
         | The "source data" is allegedly 8 trillion tokens. You can't
         | just distribute that like its source code.
         | 
         | The "binary" is the transformers python code, which in this
         | case is essentially llamav2.
         | 
         | Now, the _documentation_ for this model is inexcusably poor.
         | Hot dropping random code on a git repo without one drop of
         | human language would be similarly  "open source," but its bad
         | practice, and unfortunately this is the standard in AI Research
         | Land.
        
           | wg0 wrote:
            | Open source doesn't mean the source code (or data) must
            | accompany the program as it is distributed, but rather that
            | there should be a way (a CD-ROM with a price tag, an S3
            | bucket, a torrent, etc.) to get access to it.
        
       | kpennell wrote:
       | sorry for the dumb question. Is there somewhere I can try it?
       | Like a chatbot?
        
         | brucethemoose2 wrote:
         | Huggingface Spaces. Be sure to get the prompting syntax right:
         | https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1
         | 
         | I will try to host an instance on the AI Horde later today,
         | which has a better UI and doesn't need a login.
        
           | brucethemoose2 wrote:
           | OK I am hosting the instruct model on the horde now. I will
           | throw up the raw base model too:
           | 
           | https://lite.koboldai.net/#
           | 
           | Note that you must switch to instruct mode in the UI, and use
           | the "LLama 2 chat" preset, but you also need a <s> token in
           | the instruction (memory) tab:
           | 
           | https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1#in.
           | ..
        
       | jxy wrote:
       | ehhh, the design of the webpage infuriates me. Who thought
       | sprinkling faint dots swimming across the webpage was a good
       | thing? I almost thought something's wrong with my eyes or my
       | display!
        
         | BrutalCoding wrote:
         | Fully agree. I was thinking that there was a tiny fruit fly
         | crawling over the screen for a good few seconds.
        
       | covi wrote:
        | Cloud deployment docs:
        | https://docs.mistral.ai/cloud-deployment/skypilot/
        
       | slim wrote:
       | they should've called it winamp
        
         | Ataraxy wrote:
         | ...because it whips the llama's ass.
        
         | spiderfarmer wrote:
         | And adopt a real llama as a mascot, just like Justin Frankel.
        
         | jesperwe wrote:
         | I see what you did there :-D
        
       | code51 wrote:
       | Correctly lists US presidents in chronological order (which is an
       | important test to me).
       | 
       | However for "List Led Zeppelin albums in chronological order"...
       | 
       | Here is a list of Led Zeppelin albums in chronological order: 1.
       | Led Zeppelin (1968) 2. Led Zeppelin II (1969) 3. Led Zeppelin III
       | (1970) 4. Led Zeppelin IV (1971) 5. Houses of the Holy (1973) 6.
       | Physical Graffiti (1975) 7. Presence (1976) 8. In Through the Out
       | Door (1979) 9. Cymbals and Wine (1980)
       | 
       | It's missing "The Song Remains the Same", weird but important.
       | 
       | GPT-4 is also missing it: 1. Led Zeppelin (1969) 2. Led Zeppelin
       | II (1969) 3. Led Zeppelin III (1970) 4. Led Zeppelin IV (1971) 5.
       | Houses of the Holy (1973) 6. Physical Graffiti (1975) 7. Presence
       | (1976) 8. In Through the Out Door (1979) 9. Coda (1982)
       | 
       | "The Song Remains the Same" is a live album of the "concert
       | film". Both GPT-4 and Mistral don't seem to treat it as _also_ an
       | album.
       | 
       | When asked as a follow-up, GPT-4 says:
       | 
       | ""The Song Remains the Same" is a live album and soundtrack by
       | Led Zeppelin, released in 1976. The list you asked for was
       | comprised of their studio albums only." (note that I didn't
       | specifically say _studio albums only_ )
       | 
       | Mistral says something else:
       | 
       | "The Song Remains the Same was released as a single in 1976, not
       | as an album. As a result, it is not included in the list of Led
       | Zeppelin albums in chronological order."
       | 
       | Same behavior, different explanation.
       | 
       | Interesting to see alignment is this hard, even in basic stuff.
        
         | sacnoradhq wrote:
         | MS Copilot is apparently a music fan.
         | 
          | Most LLMs have problems with subtlety; for example, with
          | compound anagrams they tend to repeat the same words in reverse
          | order rather than reversing all of the letters in all words.
        
       | sireat wrote:
        | One should be able to run Mistral 7B locally on something as
        | low-end as a 1070 with 8GB, would they not?
       | 
       | That is assuming 8bit weights.
       | 
       | I have not kept up with local LLM news. I assume the steps needed
       | would be somewhat similar to
       | https://followfoxai.substack.com/p/how-to-run-llama-in-an-ol...
       | which is from April 2023.
        
         | brucethemoose2 wrote:
          | It's already kind of outdated, lol.
          | 
          | The backends du jour are either llama.cpp frontends (I use
          | Kobold.cpp at the moment) or oobabooga as the guide specifies,
          | but with the exLlamav2 backend.
         | 
         | If you are serving a bunch of people, run a vLLM backend
         | instead since it supports batching, and host it on the Horde if
         | you are feeling super nice: https://lite.koboldai.net/#
         | 
         | Technically only vLLM will work with this new model at the
         | moment, but I'm sure cpp/ooba support will be added within
         | days.
         | 
         | This comment will probably be obsolete within a month, when
         | llama.cpp gets batching, MLC gets a better frontend, or some
         | other breakthrough happens :P
        
           | sp332 wrote:
           | llama.cpp support is here already via
           | https://huggingface.co/TheBloke/Mistral-7B-v0.1-GGUF And yes
           | it works fine in oobabooga.
        
       | burningion wrote:
       | For anyone who missed it, the twitter announcement of the model
       | was just a torrent tracker uri:
       | https://twitter.com/MistralAI/status/1706877320844509405
        
       | brucethemoose2 wrote:
       | I have thrown the base+instruct models up on AI Horde, you can
       | try it with no login:
       | 
       | https://lite.koboldai.net/#
        
       | sharperguy wrote:
       | Is there a plugin for vim, or other code editors that allows such
       | an LLM to function similarly to copilot?
        
         | brucethemoose2 wrote:
         | https://dev.to/chenhunghan/use-code-llama-and-other-open-llm...
         | 
         | Many LLM frameworks implement the OpenAI API, so if you can get
         | that into your code editor and change the address, you can run
         | it with any LLM you want.
         | 
         | Doing it _smoothly_ is another matter.
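          | 
          | A rough sketch of the repointing with the openai Python package
          | (URL and model name are placeholders for whatever your local
          | backend exposes):
          | 
          | ```python
          | import openai
          | 
          | openai.api_base = "http://localhost:8000/v1"  # local server
          | openai.api_key = "not-needed-locally"
          | 
          | resp = openai.ChatCompletion.create(
          |     model="mistral-7b-instruct",
          |     messages=[{"role": "user",
          |                "content": "Write a Python hello world."}],
          | )
          | print(resp.choices[0].message["content"])
          | ```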
        
         | swaroop wrote:
         | Try https://github.com/huggingface/llm.nvim (no affiliation)
        
       | tikkun wrote:
       | Regular reminder that most LLM benchmarks are pretty terrible.
        | I've looked inside the datasets, and it's not the kind of stuff
        | I'd want dictating which models are considered best!
        
         | cosmojg wrote:
         | In my experience, HellaSwag seems to correlate highly enough
          | with model performance for my use case (i.e., complex
          | autocompletion of prose and code rather than glorified
          | chatbotting). MMLU and ARC aren't too bad, but
         | TruthfulQA can sometimes be a straight-up countersignal.
        
       | anonyfox wrote:
        | I eat my initial words: this works really well on my macbook air
        | M1 and feels comparable to GPT3.5 - which is actually an amazing
        | feat!
        | 
        | Question: is there something like this, but with the "function
        | calling api" finetuning? 95% of my uses nowadays deal with
        | input/output of structured data (JSON, basically), and I kind of
        | fear that OpenAI will no longer support those specialized models
        | after a few months... I know it's probably not that glorious
        | compared with multimodal chat UIs, but it's a workhorse like
        | nothing else for automation!
        
         | Karrot_Kream wrote:
         | How many tokens / sec are you getting on an M1 Air? Curious
         | since I'm at work and can't try this on my Air yet hah.
        
           | Patrick_Devine wrote:
           | I'm getting >30 tokens/sec using it with ollama and an M2
           | Pro. That might be a little slow though because I have a
           | background finetuning job running.
        
             | minzi wrote:
             | Bit of a tangential question here, but any recommendations
             | on how to get started fine tuning this model (or ones like
              | it)? I feel like there are a million different tutorials
              | and ways of doing it when I google.
        
           | anonyfox wrote:
           | feels roughly like the same speed as GPT3.5 in the browser UI
        
           | brucethemoose2 wrote:
            | It's the same speed as llama 7B, so very quick.
        
         | brucethemoose2 wrote:
         | Yes!
         | 
         | https://github.com/ggerganov/llama.cpp/blob/master/grammars/...
         | 
          | It's actually better than a specialized model: during token
          | generation it constrains the possible output tokens to an
          | arbitrary grammar (like, say, JSON syntax). So it will work
          | "perfectly" with any model with a basic understanding of the
          | format.
         | 
         | Kobold.cpp and text-generation-ui already support this, and
         | both will run on your mac.
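          | 
          | A minimal sketch with llama-cpp-python's grammar support (file
          | names are placeholders; json.gbnf ships with llama.cpp):
          | 
          | ```python
          | from llama_cpp import Llama, LlamaGrammar
          | 
          | llm = Llama(model_path="mistral-7b-instruct.Q4_K_M.gguf")
          | grammar = LlamaGrammar.from_file("grammars/json.gbnf")
          | 
          | out = llm(
          |     "Return name and age as JSON: Alice, 42.",
          |     grammar=grammar,  # sampled tokens must stay valid JSON
          |     max_tokens=128,
          | )
          | print(out["choices"][0]["text"])
          | ```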
        
         | olso wrote:
         | Hey, how did you try it on M1? I don't see any MPS support.
         | 
         | https://github.com/mistralai/mistral-src/issues/2
         | 
         | edit: nevermind https://ollama.ai
        
           | anonyfox wrote:
           | `ollama run mistral` <-- literally thats it
        
       | Dwedit wrote:
       | Someone get that stupid animated background off the site, it
       | looks like bugs crawling on the screen.
        
         | MadDemon wrote:
         | I was thinking the same thing
        
       | eminence32 wrote:
       | I've never run one of these models locally, but their README has
       | some pretty easy to follow instructions, so I tried it out...
       | 
       | > RuntimeError: Found no NVIDIA driver on your system.
       | 
       | It's true that I don't have an NVIDIA GPU in this system. But I
       | have 64GB of memory and 32 cpu cores. Are these useless for
       | running these types of large language models? I don't need
       | blazing fast speed, I just need a few tokens a second to test-
       | drive the model.
        
         | lhl wrote:
         | Use the code/model included here:
         | https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1
         | 
         | Change the initial device line from "cuda" to "cpu" and it'll
         | run.
         | 
         | (Edit: just a note, use the main/head version of transformers
         | which has merged Mistral support. Also saw TheBloke uploaded a
         | GGUF and just confirmed that latest llama.cpp works w/ it.)
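          | 
          | (Rough sketch of the CPU route with transformers; slow but it
          | runs, assuming a build with the Mistral support mentioned
          | above:)
          | 
          | ```python
          | from transformers import AutoModelForCausalLM, AutoTokenizer
          | 
          | model_id = "mistralai/Mistral-7B-Instruct-v0.1"
          | tok = AutoTokenizer.from_pretrained(model_id)
          | # no .to("cuda") -> stays on CPU
          | model = AutoModelForCausalLM.from_pretrained(model_id)
          | 
          | prompt = "[INST] Write a haiku about autumn. [/INST]"
          | inputs = tok(prompt, return_tensors="pt")
          | out = model.generate(**inputs, max_new_tokens=64)
          | print(tok.decode(out[0], skip_special_tokens=True))
          | ```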
        
         | avereveard wrote:
         | it's not entirely their fault, they rely on xformers, and that
         | library is gpu only.
         | 
          | other models will happily run in cpu-only mode, depending on
          | your environment there are super easy ways to get started, and
          | 32 cores should be ok for a llama2 13b and bearable with some
          | patience for running 33b models. for reference I'm willingly
          | running 13b llama2 in cpu-only mode so I can leave the gpu to
          | diffusers, and it's just enough to be generating at a
         | comfortable reading speed.
        
         | kardianos wrote:
         | Use llama.cpp to run models locally.
        
           | turnsout wrote:
           | Can llama.cpp run this yet? That would be surprising
        
             | daakus wrote:
             | It can! TheBloke is to thank for the incredibly quick
             | turnaround.
             | 
             | https://github.com/ggerganov/llama.cpp/pull/3362
             | 
             | https://huggingface.co/TheBloke/Mistral-7B-v0.1-GGUF/tree/m
             | a...
        
               | turnsout wrote:
               | Wow, awesome!
        
               | aidenn0 wrote:
               | I have yet to get any useful output out of the Q5_K_S
               | version; haven't tried any others yet.
        
               | moffkalast wrote:
               | Birds fly, sun shines, and TheBloke always delivers.
               | 
               | Though I can't figure out that prompt and with LLama2's
               | template it's... weird. Responds half in Korean and does
               | unnecessary numbering of paragraphs.
               | 
               | Just one big _sigh_ towards those supposed efforts on
               | prompt template standardization. Every single model just
               | has to do something unique that breaks all compatibility
               | but has never resulted in any performance gain.
        
             | programd wrote:
             | I'm getting about 7 tokens per sec for Mistral with the
             | Q6_K on a bog standard Intel i5-11400 desktop with 32G of
             | memory and no discrete GPU (the CPU has Intel UHD Graphics
             | 730 built in).
             | 
             | So great performance on a cheap CPU from 2 years ago which
             | costs, what $130 or so?
             | 
              | I tried Llama 65B on the same hardware and it was way
             | slower, but it worked fine. Took about 10 minutes to output
             | some cooking recipe.
             | 
             | I think people way overestimate the need for expensive GPUs
             | to run these models at home.
             | 
             | I haven't tried fine tuning, but I suspect instead of 30
             | hours on high end GPUs you can probably get away with fine
             | tuning in what, about a week? two weeks? just on a
             | comparable CPU. Has anybody actually run that experiment?
             | 
             | Basically any kid with an old rig can roll their own
             | customized model given a bit of time. So much for
             | alignment.
        
             | loudmax wrote:
             | It would be very surprising.
             | 
             | Mistral AI's github page has more information on their
             | sliding window attention method to achieve this
             | performance: https://github.com/mistralai/mistral-src
             | 
             | If Mistral 7b lives up to the claims, I expect these
             | techniques will make their way into llama.cpp. But I would
             | be surprised if the required updates were quick or easy.
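              | 
              | For intuition, the core trick is a banded causal mask; a
              | toy sketch (not Mistral's actual implementation):
              | 
              | ```python
              | import torch
              | 
              | def sliding_window_mask(seq_len, window):
              |     i = torch.arange(seq_len).unsqueeze(1)
              |     j = torch.arange(seq_len).unsqueeze(0)
              |     # attend to j only if j <= i and i - j < window
              |     return (j <= i) & (i - j < window)
              | 
              | print(sliding_window_mask(6, 3).int())
              | ```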
        
         | polygamous_bat wrote:
         | You gotta wait until GGML and the likes repackage the model;
         | early releases are almost always targeted at ML folks with
         | dedicated GPUs.
        
           | londons_explore wrote:
           | I think it's really lame that ML, which is just math really,
           | hasn't got some system-agnostic language to define what math
           | needs to be done, and then it can run easily on
           | CPU/GPU/TPU/whatever...
           | 
           | A whole industry being locked into NVidia seems bad all
           | round.
        
             | esafak wrote:
             | It's not Nvidia's fault that the competition (AMD) does not
             | provide the right software. There is an open alternative to
             | CUDA called OpenCL.
        
               | tormeh wrote:
               | As sad as it is, this is true. AMD has never spent lots
               | of money on software, while Nvidia always has, which was
               | fine for traditional graphics, but with ML this really
               | doesn't cut it. AMD could have ported Pytorch to OpenCL
               | or Vulkan or WebGPU, but they just... can't be
               | bothered???
        
               | londons_explore wrote:
               | Nvidia has wrapped their cuda language in patents and
               | licensing so tightly that there is no way AMD could
               | release anything cuda-compatible.
        
               | kkielhofner wrote:
               | Yes but AMD could release a ROCm that actually works and
               | then put actually meaningful resources into some of the
               | countless untold projects out there that have been
               | successfully building on CUDA for 15 years.
               | 
               | There was a recent announcement that after six years AMD
               | finally sees the $$$ and will be starting to (finally)
               | put some real effort into ROCm[0]. That announcement was
               | two days ago and they claim they started on this last
               | year. My occasional experience with ROCm doesn't show
               | much progress or promise.
               | 
               | I'm all for viable Nvidia competition in the space but
               | AMD has really, really, really dropped the ball on GPGPU
               | with their hardware up to this point.
               | 
               | [0] - https://www.eetimes.com/rocm-is-amds-no-1-priority-
               | exec-says...
        
             | miloignis wrote:
             | https://onnx.ai/ sounds close to what you're thinking of,
             | it's an open interchange format.
        
       | pugio wrote:
       | How come the blog post doesn't discuss training at all? Which
       | data sets were used? How many tokens was it trained on? I
        | couldn't seem to find a linked preprint or anything else, really.
        
       | aubanel wrote:
       | Congrats to them, their huge series A (USD113M) does not seem
       | oversized anymore!
        
       | mchiang wrote:
       | This model will run on Ollama, llama.cpp, and other tools:
       | 
       | ollama run mistral
       | 
       | or for llama.cpp, thebloke has uploaded the GGUF models here:
       | https://huggingface.co/TheBloke/Mistral-7B-v0.1-GGUF/tree/ma...
       | and you can run it
       | 
        | really looking forward to the chat fine-tuned models that don't
        | seem to be available yet.
        
         | brucethemoose2 wrote:
          | Oh, that means it's a llama architecture model!
         | 
         | Is the tokenizer the same? It may "work" without actually
         | working optimally until llama.cpp patches it in.
         | 
         | And the instruct model was just uploaded.
        
       | todd3834 wrote:
       | This is my hello world test for coding assistant for LLMs.
       | CodeLlama and GPT4 crush the answer. I love seeing innovation in
       | the space but Mistral kind of failed me on my first test:
       | 
       | ---------------------
       | 
       | >>> Write code that will track swipes in Typescript
       | 
       | . Swipe should be detected when the user moves a finger over 30%
       | of the screen, and then drags it all the way to the end of the
       | screen.
       | 
        | ## Getting started
        | 
        | You need to have yarn or npm installed before you can run this
        | app
        | 
        | ### Running in browser
        | 
        | ```bash
        | yarn install
        | yarn serve
        | ```
        | 
        | ### Running with docker
        | 
        | Use the docker-compose file located at
        | ./docker/docker-compose.yml
        | 
        | ```bash
        | cd <path-to-project>
        | docker-compose up
        | ```
        | 
        | ## Screenshot
        | 
        | ![Screenshot](./images/screenshot.png)
        
         | amilios wrote:
         | Not a fair comparison as Mistral is not instruction tuned. You
         | need to write a "continuation-style" prompt or few-shot it,
          | rather than just giving it a direct order. Honestly this is one
          | of the reasons imo that Mistral won't see wide acceptance;
          | LLaMA-2-7b-chat is RLHF'd and instruction-tuned, and is just so
          | much easier to use as a result.
        
           | code51 wrote:
           | What about this one?
           | 
           | https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1
        
         | black_puppydog wrote:
         | You didn't enjoy the screenshot? :)
        
         | code51 wrote:
         | Here is the response from Instruct model:
         | https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1
         | 
         | To track swipes in TypeScript, we can use a library like
         | `react-native-gesture-handler`. Here's an example of how to use
         | it to track horizontal and vertical swipes:
         | 
          | ```typescript
          | import React from 'react';
          | import {
          |   GestureHandlerEvent,
          |   GestureHandlerProps,
          | } from 'react-native-gesture-handler';
          | 
          | const SwipeableContent: React.FC<GestureHandlerProps> = ({
          |   onStart,
          |   onEnd,
          |   onMove,
          | }) => {
          |   const handleHorizontalSwipe = (event: GestureHandlerEvent) => {
          |     if (event.type === 'horizontal') {
          |       onStart(event);
          |     } else {
          |       onEnd(event);
          |     }
          |   };
          |   const handleVerticalSwipe = (event: GestureHandlerEvent) => {
          |     if (event.type === 'vertical') {
          |       onStart(event);
          |     } else {
          |       onEnd(event);
          |     }
          |   };
          |   return (
          |     <View>
          |       <PanGestureHandler
          |         onStart={handleHorizontalSwipe} onMove={onMove}>
          |         <Text>Drag me horizontally</Text>
          |       </PanGestureHandler>
          |       <PanGestureHandler
          |         onStart={handleVerticalSwipe} onMove={onMove}>
          |         <Text>Drag me vertically</Text>
          |       </PanGestureHandler>
          |     </View>
          |   );
          | };
          | ```
          | 
          | In this example, we use the `PanGestureHandler` component
         | from `react-native-gesture-handler` to track horizontal and
         | vertical swipes. The `onStart`, `onEnd`, and `onMove` props are
         | passed to the `PanGestureHandler` component to handle the
         | different types of gestures.
         | 
         | You can then use this component in your app to track swipes on
         | any view:
         | 
          | ```typescript
          | const App = () => {
          |   const [isHorizontalSwiping, setIsHorizontalSwiping] =
          |     useState(false);
          |   const [isVerticalSwiping, setIsVerticalSwiping] =
          |     useState(false);
          | 
          |   const handleStart = (event: GestureHandlerEvent) => {
          |     if (event.type === 'horizontal') {
          |       setIsHorizontalSwiping(true);
          |     } else {
          |       setIsVerticalSwiping(true);
          |     }
          |   };
          | 
          |   const handleEnd = (event: GestureHandlerEvent) => {
          |     if (event.type === 'horizontal') {
          |       setIsHorizontalSwiping(false);
          |     } else {
          |       setIsVerticalSwiping(false);
          |     }
          |   };
          | 
          |   const handleMove = (event: GestureHandlerEvent) => {
          |     console.log('Gesture moved');
          |   };
          | 
          |   return (
          |     <View>
          |       <SwipeableContent onStart={handleStart}
          |         onEnd={handleEnd} onMove={handleMove} />
          |       <Text>{isHorizontalSwiping
          |         ? 'Horizontal swipe is in progress' : ''}</Text>
          |       <Text>{isVerticalSwiping
          |         ? 'Vertical swipe is in progress' : ''}</Text>
          |     </View>
          |   );
          | };
          | ```
         | 
         | In this example, we use the `SwipeableContent` component to
         | track horizontal and vertical swipes. We also track the status
         | of the swipe using state variables to show a message when a
         | swipe is in progress.
        
         | capableweb wrote:
         | CodeLlama is obviously trained on code specifically, so maybe
         | not a useful comparison. GPT4 is also way different than
        | Mistral, with a huge step up in parameters, and seems to be
        | using a multi-agent approach too.
         | 
         | Since Mistral is just a 7B parameter model, it's obvious that
        | you won't be able to have it straight up write accurate code;
        | it's simply too small to accomplish something like that unless
        | you train the model specifically for writing code up front.
         | 
         | I guess if all you're looking for is a model to write code for
         | you, that makes sense as a "hello world" test, but then you're
         | looking at the wrong model here.
         | 
         | What you really want to do if you're looking for a good
         | generalized model, is to run a bunch of different tests against
         | it, from different authors, average/aggregate a score based on
         | those and then rank all the models based on this score.
         | 
         | Luckily, huggingface already put this all in place, and can be
         | seen here:
         | https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderb...
         | 
         | This Mistral 7B model seems to earn itself a 3rd place compared
         | to the rest of the 7B models added to the leaderboard.
         | 
         | Edit: As mentioned by another commentator, this also seems to
         | be a base model, not trained specifically for
        | request<>reply/chat/instructions. They (or someone else) are
        | meant to fine-tune this model for that, if they want to.
        
           | qeternity wrote:
           | > and seems to be using multi-agent approach too.
           | 
           | What do you mean by this? MoE?
        
         | pclmulqdq wrote:
         | I'm pretty sure this model is not instruct tuned, so it's not
         | exactly apples-to-apples.
        
       | slimsag wrote:
       | Major kudos to Mistral for being the first company to Apache
       | license a model of this class.
       | 
       | Meta wouldn't make LLama open source.
       | 
       | DeciLM wouldn't make theirs open source.
       | 
       | All of them wanted to claim they were open source, while putting
       | in place restrictions and not using an open source license. So I
       | think it's worth giving Mistral big kudos here for actually doing
       | it and going Apache licensed.
        
         | miloignis wrote:
          | Falcon 40B is Apache 2, though they then went back to a
          | non-open license for their 180B.
        
           | divbzero wrote:
           | MPT-30B is also Apache 2.0:
           | https://huggingface.co/mosaicml/mpt-30b
           | 
           | There must be others as well?
        
             | capableweb wrote:
             | A lot! Go here https://huggingface.co/spaces/HuggingFaceH4/
             | open_llm_leaderb..., check "Hub license" on the left then
             | sort by that column in the table.
             | 
              | Estimating, there are more than 100 models with an
              | apache-2.0 license.
        
               | avereveard wrote:
               | a good chunk are "only the lora is apache 2 the base
               | model is lama" or straight up llama2 model finetuned and
               | distributed laundring their license, or use gpt generated
               | code in the finetuning dataset against OpenAI tos.
               | licensing is a bit up in the air and just filtering with
               | apache 2 doesn't tell you much
        
         | monocasa wrote:
         | I'm really not a fan of how these releases of model binaries
         | are being referred to as open source. How do I make changes to
         | this model? How do I integrate changes to this model that
         | others have made?
         | 
         | The data pipeline is the source here. Just because it's not
         | locked behind a SaaS veneer doesn't make it open source any
         | more than Windows is.
        
           | hcks wrote:
           | This fallacious talking point is getting annoying.
        
           | [deleted]
        
           | LoganDark wrote:
           | Windows is not open source. In order to access the source
           | code, you need a government contract. These aren't given out
           | lightly.
        
             | piperswe wrote:
             | Right, that's the point they're making.
        
               | LoganDark wrote:
               | Recent models may not be fully open source, but could you
               | point me to one that's subject to the same amount of
               | scrutiny as Windows source code access? Because I'm not
               | sure if there is one out there.
        
               | monocasa wrote:
               | Windows source access is given out pretty freely to
               | academics as the Windows Research Kernel (WRK).
               | 
               | But the point is that the data pipeline and tensorflow or
               | what have you that trained the model is the source here.
               | The model is the binary.
        
             | spullara wrote:
             | Most partners can get a copy of the Windows source code if
             | they need it.
             | 
             | https://www.microsoft.com/en-us/sharedsource/
        
           | squeaky-clean wrote:
           | You can do your own fine tuning on existing models
           | 
           | > How do I integrate changes to this model that others have
           | made
           | 
           | Typically with a LoRA
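            | 
            | e.g. with the peft library (the adapter repo name is a
            | made-up placeholder):
            | 
            | ```python
            | from transformers import AutoModelForCausalLM
            | from peft import PeftModel
            | 
            | base_id = "mistralai/Mistral-7B-v0.1"
            | base = AutoModelForCausalLM.from_pretrained(base_id)
            | # load someone else's adapter on top of the base weights
            | model = PeftModel.from_pretrained(
            |     base, "someuser/mistral-7b-my-lora")
            | # optionally bake it in: model = model.merge_and_unload()
            | ```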
        
             | monocasa wrote:
             | Is that how the engineers at Mistral iterated on this
             | model?
             | 
             | I can patch windows and other binaries too (I'm actually
             | pretty decent at that; 20 or so years with IDA/ghidra).
             | That doesn't make them open source.
        
               | brucethemoose2 wrote:
               | They trained it from scratch, but there is nothing
               | stopping you from doing some more training on top of it
               | yourself. Nothing is obfuscated, no more information is
               | required beyond the prompting syntax... they just
               | released basically no documentation, which is
               | unfortunately kinda standard in AI land.
               | 
               | There is already a sizable ecosystem of people doing just
               | that:
               | https://huggingface.co/models?sort=modified&search=7B
        
               | monocasa wrote:
               | Just like Microsoft isn't stopping me from patching
               | Windows.
               | 
               | > Nothing is obfuscated
               | 
               | The dataset and pipeline to rebuild these weights are not
               | included.
        
               | pk-protect-ai wrote:
               | Have you tried the HF version which is linked on the main
               | page? It is transformer based after all and it is
               | converted to HF format. Just use QLoRA to fine-tune
               | whatever you want on the top of that model. They handed
               | you hundreds of GPU hours, without asking anything in
               | return. You may throw it away and train the model from
               | scratch. Be ready to throw $70k-$150k into air warming.
        
               | monocasa wrote:
               | > Have you tried the HF version which is linked on the
               | main page? It is transformer based after all and it is
               | converted to HF format. Just use QLoRA to fine-tune
               | whatever you want on the top of that model.
               | 
               | Is that how their engineers built this model in the first
               | place?
               | 
                | Just because they're giving away the build artifact of
                | their hard work doesn't make it open source.
               | 
               | > Be ready to throw $70k-$150k into air warming.
               | 
               | Just because builds are expensive, doesn't mean releasing
               | the build artifacts counts as open source.
        
               | pk-protect-ai wrote:
               | I see dozens of your posts here complaining about "not
               | open source." You are either a paid provocateur or a
               | troll. What exactly is missing in your opinion in this
               | release that is making this model not open source?
               | [feeding trolls is fun]
        
               | monocasa wrote:
               | > I see dozens of your posts here complaining about "not
               | open source." You are either a paid provocateur or a
               | troll.
               | 
                | I can't reply in my own tree without being called a
                | troll?
               | 
               | Also, accusing someone of being a paid provocateur is
               | against HN guidelines.
               | 
               | > What exactly is missing in your opinion in this release
               | that is making this model not open source?
               | 
               | The source the engineers at Mistral used to build it.
        
               | brucethemoose2 wrote:
               | > The dataset and pipeline to rebuild these weights are
               | not included.
               | 
               | This is a good point.
               | 
                | But it's probably not deterministic and reproducible. I
               | bet the researchers themselves couldn't remake the exact
               | weights (or even similar weights) from scratch.
        
               | Scene_Cast2 wrote:
               | Yep. Even if the initial seed for parameter init, the
               | example shuffling seed, etc were constant, the
               | distributed nature of training (and syncing the model
               | across machines) would kill reproducibility. Not to
               | mention resuming from checkpoints after gradient
               | explosions, etc.
        
               | monocasa wrote:
               | I've heard from ML engineers at larger shops that
               | reproducibility is key to working at scale. That's how
               | you track down "this training regime went to shit because
               | of something we changed" versus "this training regime
               | went to shit because on of the GPUs training it is
               | starting to fail".
        
               | monocasa wrote:
               | I mean most binaries aren't reproducible. That takes a
               | lot of work and isn't worth it most of the time.
               | 
               | However, I've heard from ML engineers at the big shops
               | that hermetic, reproducible builds are key to making any
               | progress at scale with ML. Apparently that goes all the
               | way back to when Jeff Dean took over Google Brain.
        
               | HanClinto wrote:
               | Is "Doom" open source?
               | 
               | The source code to the engine is available and open, but
               | if you want to do anything useful or recreate the
               | original game, you have to provide your own data.
               | 
               | This model is open source, much in a similar way that the
               | Doom engine is open source.
        
               | monocasa wrote:
               | > Is "Doom" open source?
               | 
               | > The source code to the engine is available and open,
               | but if you want to do anything useful or recreate the
               | original game, you have to provide your own data.
               | 
               | The Doom executable is open source. The data is not.
               | Explicitly, the data is under a different license than
               | the GPL and makes no claims about being open source.
               | There are total conversion mods that replace the closed
               | source levels with open source ones however.
               | 
               | > This model is open source, much in a similar way that
               | the Doom engine is open source.
               | 
                | Here's the source the engineers that created Doom used
                | to build the Doom engine:
                | https://github.com/id-Software/DOOM
               | 
               | Where is the source used to build this model?
        
               | spullara wrote:
               | You don't iterate on the model weights, you train them on
               | billions of tokens. There was no real iteration, you set
               | it up and wait for the GPUs to do the work. The model
               | architecture itself is done before you start training. If
               | you change it, you generally have to start training from
               | scratch.
               | 
               | You do get the ability to further train the model using
               | fine-tuning or LoRas.
        
               | monocasa wrote:
               | There's absolutely state space iteration in model
               | training. Layer sizes, composition, construction.
               | 
               | > There was no real iteration, you set it up and wait for
               | the GPUs to do the work. The model architecture itself is
               | done before you start training. If you change it, you
               | generally have to start training from scratch.
               | 
               | That's like saying there's no design iteration in
               | software because you type 'make' and the executable is
               | built.
        
           | gfodor wrote:
           | Sounds false, this is only an analogy wrapped up in what
           | sounds like an argument. If you think about what you're
           | actually getting it's open source.
        
             | monocasa wrote:
             | What do Mistral's engineers work with? Do they have this
             | 14GB pickle file open in their IDEs at 3PM? If not, why is
             | that not the source to this?
        
           | scosman wrote:
           | You could fine tune and release that. It's not software so
           | exact parallels don't make sense, but the open permissions
           | are great.
        
             | monocasa wrote:
             | I mean, it's absolutely software.
        
               | scosman wrote:
                | Software is used to make models, but the models aren't
                | software any more than Toy Story is software.
        
               | mbakke wrote:
               | Forgive my ignorance, I haven't studied the AI tooling
               | landscape yet. Are you saying these models have a
               | structured binary format and "running" them is just a
               | matter of having a "player" with the right "codec"?
               | 
               | Or are they directly executing CPU instructions?
        
               | monocasa wrote:
               | There's literally a list of opcodes to be executed in the
               | model. There's a whole lot of data too, but that's part
               | of the build just as much as anything in a .data section.
        
           | godelski wrote:
           | > these releases of model binaries
           | 
            | Are they binaries? I haven't seen a binary in a while tbh.
           | Usually they're releasing both the raw architecture (i.e.
           | code) and the weights of the models (i.e. what numbers go
           | into what parts of the architecture). The latter is in a
           | readable format that you can generally edit by hand if you
           | wanted to. But even if it was a binary as long as you have
           | the architecture you can always load into the model and
           | decide if you want to probe it (extract values) or modify it
           | by tuning (many methods to do this).
           | 
           | As far as I'm concerned, realistically the only issue here is
           | the standard issue around the open source definition. Does it
           | mean the source is open as available or open as "do what the
           | fuck you want"? I mean it's not like OpenAI is claiming that
           | GPT is open sourced. It's just that Meta did and their source
           | is definitely visible. Fwiw, they are the only major company
           | to do so. Google doesn't open source: they, like OpenAI, use
           | private datasets and private models. I'm more upset at
           | __Open__AI and Google than I am about Meta. To me people are
           | barking up the wrong tree here. (It also feels weird that
           | Meta is the "good guy" here... relatively at least)
           | 
           | Edit: I downloaded their checkpoint. It is the standard "pth"
           | file. This is perfectly readable, it is just a pickle file. I
           | like to use graftr to view checkpoints, but other tools exist
           | (https://github.com/lmnt-com/graftr)
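            | 
            | For instance, a minimal sketch (assuming PyTorch and an
            | illustrative checkpoint filename) that lists every tensor
            | stored in such a pickle file:
            | 
            |     # Peek inside a released .pth checkpoint; the filename
            |     # below is illustrative, not necessarily Mistral's.
            |     import torch
            | 
            |     sd = torch.load("consolidated.00.pth",
            |                     map_location="cpu")
            |     for name, tensor in sd.items():
            |         print(name, tuple(tensor.shape), tensor.dtype)
            | 
            |     # Weights are ordinary tensors, so they can be probed
            |     # or edited in place, e.g. (hypothetical key):
            |     # sd["layers.0.attention.wq.weight"] *= 0.0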
        
             | GuB-42 wrote:
             | > The "source code" for a work means the preferred form of
             | the work for making modifications to it.
             | 
             | That's the definition in the GPL. That it is text or binary
             | doesn't matter.
             | 
              | So are the weights the preferred form for making
              | modifications? Partly yes, because of fine tuning, but also
              | no, because you are limited in what you can do with fine
              | tuning. If Mistral had to make major changes to their
              | model, they would probably start from the dataset and code
              | that they have and you don't: the ones that created the
              | weights file.
             | 
             | So I wouldn't call it "open source", just "open". You can
             | do whatever you want with what you have, but you don't have
             | the same abilities as Mistral to modify the model because
             | you lack some data.
             | 
             | Still, it is a bit of an unusual situation since even with
             | the "real sources", i.e. training data and code, most
             | people wouldn't have the resources to retrain the model,
             | and a big part of the value in these models is the
             | computing resources that were invested in training them.
        
             | monocasa wrote:
              | First off, tokenizer.model in their release is absolutely a
              | binary by any definition.
              | 
              | Secondly, is hand editing the pickle file the way Mistral's
              | engineers constructed this pickle file? Why not? The
              | automation that constructed that file (and tokenizer.model)
              | is the source. A binary in this context can certainly be an
              | ASCII-encoded file.
             | 
             | Just because other vendors are worse doesn't make this open
             | source.
        
               | syntaxing wrote:
                | Not OP, but I don't get it: you can modify the
                | tokenizer all you want and fine-tune the weights all you
                | want. There's nothing inherently hidden behind a binary.
        
               | monocasa wrote:
               | I can edit binaries too.
               | 
               | The question is am I provided the build source that
               | constructed these files. Mistral did not hand edit these
               | files to construct them, there's source out there that
               | built them.
               | 
                | Like, come on, a 14GB dump of mainly numbers that was
                | constructed algorithmically is not "source".
        
               | spywaregorilla wrote:
               | The whole point of machine learning is deriving an
               | algorithm from data. This is the algorithm they derived.
               | It's open source. You can use it or change it. Having the
               | data that was used to derive it is not relevant.
        
               | monocasa wrote:
               | > It's open source.
               | 
               | How did the engineers who built it do so? Is there more
               | source to create this build artifact?
        
               | syntaxing wrote:
                | But the source to train your own LLM equivalent is also
                | released (minus the data). Hence why there are so
                | many variants of LLaMA. You also can't fine-tune it
                | without the original model structure. The weights give
                | the community a starting point so they don't need
                | literally millions of dollars' worth of compute power to
                | get to the same step.
        
               | monocasa wrote:
               | Would Mistral's engineers be satisfied with the release
               | if they had to rebuild from scratch?
        
               | syntaxing wrote:
               | But they built a llama equivalent + some enhancements
               | that gives better performance...I'm not sure if this
               | would be possible at all without Meta releasing all the
               | required code and paper for LLaMa to begin with.
        
               | monocasa wrote:
                | Meta didn't release all of the required code to build
                | LLaMA, just enough to run inference with their weights.
        
               | cfuendev wrote:
                | We should push for GPL licensing then, which AFAIK
                | requires providing the source that the work is built from.
        
               | monocasa wrote:
               | We just also shouldn't call releases with no source "open
               | source".
               | 
               | I wouldn't really have a complaint with their source
               | being released as Apache 2. I just don't want the term
               | "open source" diluted to including just a release of
               | build artifacts.
        
               | gary_0 wrote:
               | I could kind of see things either way. Is this like not
               | providing the source code, or is it like not providing
               | the IDE, debugger, compiler, and linter that was used to
               | write the source code? (Also, it feels a bit "looking a
               | gift horse in the mouth" to criticize people who are
               | giving away a cutting-edge model that can be used
               | freely.)
        
               | godelski wrote:
               | I'd actually say that including the training data would
               | be like providing the IDE/debugger/compiler rather than
               | the model/checkpoint being analogous. If I hand you
               | Signal's source code you can run it, use it, modify it,
               | etc. All similar characteristics to what is provided
               | here. What they didn't provide to us is how they created
               | that code. You couldn't create that software from scratch
               | by just having these and that's true for any open source
                | project. But I wouldn't say the training data is just
                | analogous to peering into the minds of the engineers,
                | because it is an important part of getting to the final
                | product and of analyzing it.
        
               | monocasa wrote:
               | > I could kind of see things either way. Is this like not
               | providing the source code, or is it like not providing
               | the IDE, debugger, compiler, and linter that was used to
               | write the source code?
               | 
               | Do the engineers that made this hand edit this file? Or
               | did they have other source that they used and this is the
               | build product?
               | 
               | > (Also, it feels a bit "looking a gift horse in the
               | mouth" to criticize people who are giving away a cutting-
               | edge model that can be used freely.)
               | 
               | Windows was free for a year. Did that make it open
               | source?
        
               | godelski wrote:
               | > Do the engineers that made this hand edit this file? Or
               | did they have other source that they used and this is the
               | build product?
               | 
                | Does any open source product provide all the tools used
                | to make the software? I haven't seen the Linux kernel
                | included in any other open source product, and that'd
                | quite frankly be insane. The same goes for including
                | vim/emacs, gcc, gdb, X11, etc.
               | 
               | But I do agree that training data is more important than
               | those things. But you need to be clear about that because
               | people aren't understanding what you're getting at. Don't
               | get mad, refine your communication.
               | 
               | > Windows was free for a year. Did that make it open
               | source?
               | 
                | Microsoft didn't attach an Apache 2.0 license to Windows.
                | This license makes this version of the code perpetually
                | open source. They can change the license later, but it
                | will not apply retroactively to previous versions. Sorry,
                | but this is just a terrible comparison. Free isn't what
                | makes a thing "open source." Which, let's be clear, is a
                | fuzzy definition too.
        
               | monocasa wrote:
               | What I'm asking for is pretty clear. The snapshot of code
               | and data the engineers have checked into their repos
               | (including data repositories) that were processed into
               | this binary release.
               | 
               | > This license makes this version of the code perpetually
               | open source.
               | 
               | It doesn't because they didn't release the source.
               | 
               | There's nothing stopping me from attaching an Apache 2
               | license to a shared library I never give the source out
               | to. That also would not be an open source release. There
               | has to be actual source involved.
        
               | lmm wrote:
               | > Do any open source product provide all the tools used
               | to make software? I haven't seen the linux kernel
               | included in any other open source product and that'd
               | quite frankly be insane. As well as including vim/emacs,
               | gcc, gdb, X11, etc.
               | 
               | BSD traditionally comes as a full set of source for the
               | whole OS, it's hardly insane.
               | 
               | But the point is you don't need those things to work on
               | Linux - you can use your own preferred editor, compiler,
               | debugger, ... - and you can work on things that aren't
               | Linux with those things. Calling something "open source"
               | if you can only work on it with proprietary tools would
               | be very dubious (admittedly some people do), and calling
               | a project open source when the missing piece you need to
               | work on it is not a general-purpose tool at all but a
               | component that's only used for building this project is
               | an outright falsehood.
        
               | idonotknowwhy wrote:
               | What about this project?
               | 
               | https://github.com/MiSTer-devel/PSX_MiSTer
               | 
                | Only one man in the world is capable of creating or
                | editing this code, yet it's here.
                | 
                | Is it really open source if Robert doesn't provide his
                | brain too?
        
               | monocasa wrote:
               | I'm not asking for the engineers brains, I'm asking for
               | more or less what's sitting in the IDE as they work on
               | the project.
               | 
               | Robert has provided that there. Mistral has not.
               | 
               | As an aside, I'm more than capable of editing that code;
               | I've professionally worked on FPGA code and have written
               | a PS1 emulator. Taking that (wonderful looking code) and
               | say, fixing a bug, adding a different interface for the
               | cdrom, porting it to a new FPGA are all things I'm more
               | than capable of.
        
               | gary_0 wrote:
               | No, but if the Windows binary code was made available
               | with no restrictive licensing, I'd be quite happy, and
               | the WINE devs would be ecstatic. Sure, the source code
               | and build infrastructure would be nicer, but we could
               | still work with that.
        
               | monocasa wrote:
               | 'gary_0' being happy with the license terms isn't what
               | defines 'open source'.
               | 
               | I'm fairly happy with the license terms too. They're just
                | not open source. We dilute the term open source for the
                | worse if we allow it to apply to build artifacts for some
                | reason.
        
               | gary_0 wrote:
               | We were talking about "looking a gift horse in the
               | mouth", as in it's still a positive thing regardless of
               | the semantic quibbles about open source. Nobody would
               | argue that a hypothetical openly licensed Windows binary-
               | only release is "open source" and I'd appreciate it if
               | you read my comments more charitably in future.
               | 
               | Source code licenses are naturally quite clear about what
               | constitutes "source code", but things are murkier when it
               | comes to ML models, training data, and associated
               | software infrastructure, which brings up some interesting
               | questions.
        
               | monocasa wrote:
               | > We were talking about "looking a gift horse in the
               | mouth", as in it's still a positive thing regardless of
               | the semantic quibbles about open source
               | 
               | Your gift horse in the mouth comment was visibly an aside
               | in the greater discussion being enclosed in parenthesis.
               | 
               | > Nobody would argue that a hypothetical openly licensed
               | Windows binary-only release is "open source" and I'd
               | appreciate it if you read my comments more charitably in
               | future.
               | 
               | That's why I'm using it as an example metaphor in my
               | favor. It's clearly not open source even if they released
               | it under Apache 2. It's not what their engineers edit
               | before building it.
               | 
               | > Source code licenses are naturally quite clear about
               | what constitutes "source code", but things are murkier
               | when it comes to ML models, training data, and associated
               | software infrastructure, which brings up some interesting
               | questions.
               | 
               | I don't think they're all that murky here. The generally
               | accepted definition being
               | 
               | > The "source code" for a work means the preferred form
               | of the work for making modifications to it. "Object code"
               | means any non-source form of a work.
               | 
               | Is this the form of the work that Mistral's engineers
                | work in? Or is there another form of the work that they
                | do their job in and used to build this set of files that
                | they're releasing?
        
               | lawlessone wrote:
                | You're asking them to release all their training data?
                | Very unlikely that's going to happen.
        
               | monocasa wrote:
               | There's a lot of reasons why an org wouldn't want to open
               | source their release. That doesn't make it open source.
        
               | ben_w wrote:
                | > Like, come on, a 14GB dump of mainly numbers that was
                | constructed algorithmically is not "source".
               | 
               | So if I take a photo of a pretty sunset, release it under
               | MIT license, you'd say it's "not open source" unless I
               | give you the sun and the atmosphere themselves?
               | 
                | These models are perfectly valid things in their own
                | right; they can be fine-tuned or used as parts of other
                | things.
               | 
               | For most of these LLMs (not sure about this one in
               | particular yet) the energy cost in particular of
               | recreation is more than most individuals earn in a
               | lifetime, and the enormous data volume is such that the
               | only people who seriously need this should be copyright
               | lawyers and they should be asking for it to be delivered
               | by station wagon.
        
               | monocasa wrote:
               | I said "constructed algorithmically". Ie. I expect source
               | to be at the level the engineers who built it generally
               | worked at.
               | 
               | It's very nice that they released their build artifacts.
               | It's great that you can take that and make small
               | modifications to it. That doesn't make it open source.
               | 
               | > For most of these LLMs (not sure about this one in
               | particular yet) the energy cost in particular of
               | recreation is more than most individuals earn in a
               | lifetime, and the enormous data volume is such that the
               | only people who seriously need this should be copyright
               | lawyers and they should be asking for it to be delivered
               | by station wagon.
               | 
               | All of that just sounds like reasons why it's not
               | practical to open source it, not reasons why this release
               | was open source.
        
               | ben_w wrote:
               | > I said "constructed algorithmically". Ie. I expect
               | source to be at the level the engineers who built it
               | generally worked at.
               | 
               | I could either point out that JPEG is an algorithm, or
               | ask if you can recreate a sunset.
               | 
               | > All of that just sounds like reasons why it's not
               | practical to open source it
               | 
                | No, they're reasons why the stuff you want _doesn't
                | matter_.
               | 
               | If you can actually afford to create a model of your own,
               | you don't need to ask: the entire internet is _right
               | there_. Some of it even has explicitly friendly licensing
               | terms.
               | 
               | An LLM with a friendly license is something you can
               | freely integrate into other things which need friendly
               | licensing. That's valuable all by itself.
        
               | dartos wrote:
               | The permissiveness of the license with regards to use
               | isn't the crux of the argument.
               | 
               | The open source family of licenses are about freedom. If
               | I'm not given the tools to recreate a model, then I'm not
               | afforded the freedoms normally associated with these open
               | licenses. Really there's little difference between Apache
               | and CC-BY here.
        
               | monocasa wrote:
               | Just because a license is 'friendly' and you don't see
               | the point of an open release, doesn't make it open
               | source.
               | 
               | There's been all sorts of closed source libraries that
               | you can freely integrate for whatever reason. They're not
               | open source either.
        
               | lmm wrote:
               | > So if I take a photo of a pretty sunset, release it
               | under MIT license, you'd say it's "not open source"
               | unless I give you the sun and the atmosphere themselves?
               | 
               | You've gotta give me the stuff you used to make it, the
               | stuff you'd want to have if you wanted to recreate a
               | slightly different version of the photo ("in the
               | preferred form for making modifications", as the GPL
               | says). If you just snapped a photo of whatever you saw
               | with whatever camera was in your pocket, then there's
               | nothing else to publish. But if you figured out a
               | timetable of when you should stand where with what kind
               | of lens, then making your photo open-source would mean
               | publishing that timetable.
               | 
               | > These models are perfectly valid things in their own
               | right; the can be fine-tuned or used as parts of other
               | things.
               | 
               | If the original creator can edit them, and you can't,
               | then that's not open-source; fine-tuning is a help but
               | someone who can only fine-tune is still a second-class
               | user compared to the original developer. The whole point
               | of open source is to put you on an equal footing with the
               | original developer (in particular, to make sure that you
               | can fix bugs by yourself and are never stuck waiting for
               | them to release an update that you need).
        
               | syntaxing wrote:
                | If I'm understanding you correctly, what you mean is
                | that it's only truly open source if they provide the
                | data they used to train it as well?
        
               | monocasa wrote:
               | If that's what's needed to work at the level their
               | engineers work on the model.
               | 
               | Which is true of traditional software as well. You don't
               | get to call your binary open source just because you have
               | licensed materials in there you can't release.
        
               | lawlessone wrote:
                | Is database software only open source if it's released
                | with its data?
        
               | monocasa wrote:
               | Is the data what the database engineers edit and add to
               | their build pipeline in order to build the database
               | software?
        
               | godelski wrote:
                | > a 14GB dump of mainly numbers that was
                | constructed algorithmically is not "source".
               | 
               | I'm sorry, but what do you expect? Literally all code is
               | "a bunch of numbers" when you get down to it.
               | Realistically we're just talking about if the code/data
               | is 1) able to be read through common tools and common
               | formats and 2) can we edit, explore, and investigate it.
               | The answer to both these questions is yes. Any parametric
               | mathematical model is defined by its weights as well as
               | its computational graph. They certainly provide both of
               | these.
               | 
                | What are we missing? The only thing that is missing here
                | is the training data. That means of course that you could
                | not reproduce the results even if you had the tens of
                | thousands to millions of dollars needed to do so. If
                | you're complaining about that, then I agree, but this is
                | very different from what you've said above. They don't
                | need to provide the dataset, but they should at least
                | tell us what they used and how they used it. I would
                | agree that it's not fully "open source" when the
                | datasets are unknown and/or unavailable (for all intents
                | and purposes, identical). The "recipe" is missing, yes,
                | but this is very different from what you're saying. So if
                | there's miscommunication then let's communicate better
                | instead of getting upset at one another. Because 14GB of
                | algorithmically constructed numbers and a few text files
                | is definitely all you need to use, edit, and/or modify
                | the work.
               | 
               | Edit: I should also add that they don't provide any
               | training details. This model is __difficult__ to
               | reproduce. Not impossible, but definitely would be
               | difficult. (within some epsilon, because models are not
               | trained in deterministic manners, so training something
               | in identical ways twice usually ends up with different
               | results)
        
               | monocasa wrote:
               | > I'm sorry, but what do you expect? Literally all code
               | is "a bunch of numbers" when you get down to it.
               | Realistically we're just talking about if the code/data
               | is 1) able to be read through common tools and common
               | formats and 2) can we edit, explore, and investigate it.
               | The answer to both these questions is yes. Any parametric
               | mathematical model is defined by its weights as well as
               | its computational graph. They certainly provide both of
               | these.
               | 
               | I expect that if you call a release "open source", it's,
               | you know, source. That their engineers used to build the
               | release. What Mistral's engineers edit and collate as
               | their day job.
               | 
               | > The "recipe" is missing, yes, but this is very
               | different from what you're saying.
               | 
               | The "recipe" is what we generally call source.
               | 
               | > So if there's miscommunication then let's communicate
               | better instead of getting upset at one another.
               | 
               | Who's getting upset here? I'm simply calling for not
               | diluting a term. A free, permissive, binary release is
               | great. It's just not open source.
               | 
                | > Because 14GB of algorithmically constructed numbers
                | and a few text files is definitely all you need
                | to use, edit, and/or modify the work.
               | 
                | Just like my Windows install ISO from when they were
                | giving Windows licenses away for free.
        
               | stefan_ wrote:
               | This is not a novel discussion and you are not being
               | smart trying to nihilism it, just obtuse. Here is what
               | the GPL has said on source for some 30+ years:
               | 
               | > Source code for a work means the preferred form of the
               | work for making modifications to it.
        
               | pk-protect-ai wrote:
               | >> tokenizer.model in their release is absolutely a
               | binary
               | 
                | Isn't this just a BPE + SentencePiece model? That is
                | quite usual practice when you do the training or even
                | prepare the data with fairseq ...
                | 
                | EDIT: I mean, it will be a binary file for the tokenizer
                | model after all, but I see no problem here ...
        
               | monocasa wrote:
               | Just because you don't see a problem with it being a
               | binary, doesn't make it source. It's still a build
               | artifact.
        
               | pk-protect-ai wrote:
               | [flagged]
        
               | monocasa wrote:
               | That's one of the libraries, not the source of this
               | binary release.
               | 
               | What does Mistral's engineers edit before sending this
               | model off to be trained? That's the source.
        
               | pk-protect-ai wrote:
                | It literally takes 20 minutes on my PC to prepare a
                | multilanguage corpus and train a BPE + SentencePiece
                | tokenizer with fairseq. You have all the documentation
                | there. If you do not know how to use these tools, that
                | does not mean they are not there. You literally do not
                | need to edit anything.
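                | 
                | For example, a minimal sketch of training such a BPE
                | tokenizer with the `sentencepiece` Python package (the
                | corpus file and vocab size are illustrative, not
                | Mistral's actual setup):
                | 
                |     # Train a BPE tokenizer; writes tokenizer.model
                |     # and tokenizer.vocab next to the script.
                |     import sentencepiece as spm
                | 
                |     spm.SentencePieceTrainer.train(
                |         input="corpus.txt",   # one sentence per line
                |         model_prefix="tokenizer",
                |         vocab_size=32000,
                |         model_type="bpe",
                |     )
                |     sp = spm.SentencePieceProcessor(
                |         model_file="tokenizer.model")
                |     print(sp.encode("Hello world", out_type=str))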
        
               | WanderPanda wrote:
                | "The automation" is probably manual and not even
                | deterministic.
        
               | monocasa wrote:
               | I guarantee you there's automation around training this
               | model. There's also the factor of the dataset itself.
               | 
                | And it doesn't matter much if it's perfectly
                | deterministic. Source builds of traditional software
                | aren't typically fully reproducible either. That doesn't
                | change whether they count as source.
                | 
                | And I give you better than coin flip odds that it is
                | actually deterministic. The engineers at the big ML shops
                | I've had conversations with have been doing deterministic
                | training for quite some time; they believed it was key to
                | training at scale. That's what answers the question "did
                | this model go way off the deep end because of something
                | we did in the model, or because a training GPU is on the
                | fritz?".
        
         | blueboo wrote:
          | Persimmon-8B from Adept did it first, no?
          | https://www.adept.ai/blog/persimmon-8b
        
         | dartos wrote:
         | Have you seen the RWKV model?
         | 
          | They have a 40B one and IIRC they are part of the Linux
          | Foundation now too.
        
       | jsnell wrote:
       | Is there a reason projects seem to be standardizing on specific
       | parameter sizes within larger buckets? E.g. I only ever see news
       | about 7B models, not 6B or 8B.
       | 
       | Are these sizes somehow optimal? Is it about getting as close to
       | resource (memory?) breakpoints as possible without exceeding
       | them? Is it to make comparisons between models simpler by
       | removing one variable?
        
         | godelski wrote:
         | Short:
         | 
         | The short answer is that it is hard to compare models so to
         | make it easier we compare parameters. Part of the answer of why
         | we do it is because it also helps show scaling. (As far as I'm
         | aware) The parameters __are not__ optimal, and we have no idea
         | what actually that would mean.
         | 
         | Longer:
         | 
         | The longer answer is that comparing models is really fucking
         | hard and how we tend to do it in the real world is not that
         | great. You have to think of papers and experiments as proxies,
         | but proxies for what? There's so many things that you need to
         | compare a model on and it is actually really difficult to
         | convey. Are you just trying to get the best performance? Are
         | you trying to demonstrate a better architecture? Are you
         | increasing speed? Are you increasing generalization (note the
         | difference from performance)? And so on. Then we need to get
         | into the actual metrics. What do the metrics mean? What are
         | their limitations? What do they actually convey? These parts
         | are unfortunately not asked as much but note that all metrics
         | are models too (everything you touch is "a model"), and
         | remember that "all models are wrong." It's important to
         | remember that there are hundreds or thousands of metrics out
         | there and they all have different biases and limitations, with
         | no single metric being able to properly convey how good a model
         | is at any task you choose. There is no "best language model"
         | metric, nor are there even more specific "best at writing leet
         | code style problems in python" metrics (though we'd be better
         | at capturing that than the former question). Metrics are only
         | guides and you must be truly aware of their limitations to
         | properly evaluate (especially when we talk about high
         | dimensions). This is why I rant about math in ML: You don't
         | need to know math to make a good model, but you do need to know
         | math to know why a model is wrong.
         | 
          | Parameters (along with GMACs, which dominate the FLOPs camp;
          | similarly, inference speeds have become commonplace) only
          | started to be included as common practice in the last few years
          | and still not in every subject (it tends to be around the
          | transformer projects, both language and vision). As a quick
          | example of why we want them, check out DDPM vs iDDPM. You
          | wouldn't know that the models are about 60% different in
          | parameter size when comparing them (Table 3). In fact, you're
          | going to have a hard time noticing the difference unless you
          | read both very carefully, as they're both one-liners (or just
          | load the models. fucking tensorflow 1.15...). Does it seem fair to
         | compare these two models? Obviously it depends, right? Is it
         | fair to compare LLaMA 2 70B to LLaMA 2 7B? It both is and
         | isn't. It entirely depends on what your needs are, but these
         | are quite difficult to accurately capture. If my needs are to
         | run on device in a mobile phone 7B probably wins hands down,
         | but this would flip if I'm running on a server. The thing is
         | that we just need to be clear about our goals, right? The more
         | specific we can get about goals, the more specific we can get
         | around comparing.
         | 
          | But there are also weird effects that the metrics (remember,
          | these are models too; ask: models of what?) we use aren't
         | entirely capturing. You may notice that some models have
         | "better scores" but don't seem to work as well in real world
         | use, right? Those are limitations of the metrics. While a
         | better negative log likelihood/entropy score correlates well
         | with being a high performing language model, it does not
         | __mean__ a high performing language model. Entropy is a capture
         | of information (but make sure not to conflate with the
         | vernacular definition). These models are also very specifically
         | difficult to evaluate given that they are trained and tested on
         | different datasets (I absolutely rage here because non-hacking
         | can't be verified) as well as the alignment done post process.
         | This all gets incredibly complex and the honest truth here is
         | that I don't think there is enough discussion around the topic
         | of what a clusterfuck it is to compare models. Hell, it is hard
         | to even compare more simple models doing more simple tasks like
         | even just classifying MNIST numbers. Much harder than you might
         | think. And don't get me started on out of distribution,
         | generalization, and/or alignment.
         | 
         | I would just say: if you're a layman, just watch and judge by
         | how useful the tools are to you as a user -- be excited about
         | the progress but don't let people sell you snake oil; if you're
         | a researcher, why the fuck are we getting more lazy in
         | evaluating works as the complexity of evaluation is
         | exponentially increasing -- seriously, what the fuck is wrong
         | with us?
        
         | jstarfish wrote:
         | 7B seems to be the limit of what people can comfortably fit in
         | last-gen GPUs, having ~6GB of VRAM. It's also the lower
         | acceptable boundary of coherence for generative text.
         | 
         | There are some major releases at lesser parameter counts
         | though. Databricks' Dolly had a 3B model, and Microsoft's Orca
         | also had a recent 3B release. They're both abysmal at
         | generating text, but I find them quick and useful for reductive
         | tasks ("summarize this," "extract keywords from," etc.).
         | 
         | (I like to treat parameter count as a measure of age/WIS/INT.
         | For this question, do I need the wisdom of a 7-year old, a
         | 13-year old, a 30-year old, etc. 3B is like polling
         | preschoolers at daycare.)
        
         | brucethemoose2 wrote:
         | There are some mad lads making different sizes of llama by
         | "grafting" attention heads from one model onto another and
         | finetuning a bit to stablize the transplant. For instance:
         | 
         | https://huggingface.co/models?sort=modified&search=20B
         | 
          | It's very experimental, but apparently the 20B models are
          | actually improving on 13B.
        
           | londons_explore wrote:
            | Is there any place where people doing such grafting congregate?
           | 
           | I've often pondered if taking some random chunk of weights
           | from the middle of a trained model, and dumping it into some
           | totally different model might perform better than random
           | initialization when the scale gets big enough.
        
           | semi-extrinsic wrote:
           | Just the language being used here is amazing.
        
         | dragonwriter wrote:
         | > Is there a reason projects seem to be standardizing on
         | specific parameter sizes within larger buckets?
         | 
          | AFAICT, it is because science: most of them are research
          | artifacts intended to support further research, and the
          | fewer things (parameter count, model architecture, training
          | set, etc.) that change substantially between models, the easier
          | it is to evaluate the effect of each element changing.
        
         | riskable wrote:
         | Not an expert but I'm pretty sure it has to do with how much
         | VRAM you need in your GPU in order to process them efficiently.
         | Last time I was reading about the sizes someone mentioned that
         | 8B was just barely too big for their GPU.
        
           | brucethemoose2 wrote:
           | This is not really true anymore, as the "consumer GPU"
           | backends have very flexible quantization. Llama.cpp has like
           | a dozen steps between 2 and 6 bit, and exLlamav2 will
           | literally do an arbitrary decimal bpw.
           | 
           | It sort of matters with bigger models trying to squeeze into
           | a server GPU, with the (currently) inflexible vLLM 4-bit
           | quantization.
           | 
            | I think it's just a standard set by Llama.
        
           | aidenn0 wrote:
           | Seems like 6B would still be useful if I want to run it on my
           | GPU without exiting firefox.
        
             | londons_explore wrote:
             | Kinda lame that applications can't be told "yo, your gpu
             | buffer has now been moved back to RAM".
        
       | pk-protect-ai wrote:
       | Wow. Apache 2.0 license and really interesting model. Thank you
       | guys!
        
       | loufe wrote:
        | Congratulations to the whole team. Like the others have said, this
        | is an impressive release given the short timeline.
        
       | LoganDark wrote:
       | Do general-purpose models like this truly excel in niche
       | categories (like niche story scenarios), or is it only really
       | general human knowledge and reasoning based tasks that are
       | already incredibly widely distributed on the internet?
       | 
       | I always have issues with LLMs completely forgetting where things
       | are in a scene, or even what parts a given animal has, e.g.
       | saying "hands" when the subject is a quadruped.
        
         | brucethemoose2 wrote:
         | > I always have issues with LLMs completely forgetting where
         | things are in a scene, or even what parts a given animal has,
         | e.g. saying "hands" when the subject is a quadruped
         | 
         | I dunno what llm you are using, but a combination of finetuning
         | with a specific prompt structure and good prompt engineering
          | help the LLM stay "logical" like that. This LoRA, for instance,
         | has specific sections for the different characters in the
         | training dataset: https://huggingface.co/lemonilia/LimaRP-
         | Llama2-13B-v3-EXPERI...
         | 
         | Other than that, higher parameter models (70B, and the
         | "frankenstein" 20B llama models) tend to be better at this.
        
           | LoganDark wrote:
           | Yeah, well that's just the problem, isn't it. The model isn't
           | good at my task already, so I'm going to have to obtain my
           | own dataset, curate the whole thing myself, organize it and
           | finetune the model based on it, so on and so forth. I'm going
           | to spend so much time actually creating the stories that I
           | want to create, rather than troubleshooting the pipeline. And
           | it totally helps that the entire stack is built on top of
           | fragile python scripts.
           | 
           | I just wish there were a way of making these models already
           | perform well on niche tasks like "write this story, except
           | the characters are quadrupeds, and therefore are _not human_
           | ". Like Warriors (the book series, about cats), without
           | having to go and spend weeks curating a dataset of books
           | about non-human characters.
           | 
           | I'm sure that's so much of an ongoing area of research that
           | it goes without saying.
           | 
           | > I dunno what llm you are using
           | 
           | I started with the RWKV family of models before realizing the
           | amount of overfit is so critically unfunny that the model
           | files aren't even on my computer anymore.
           | 
           | Anyway, the best I have found so far is Chronos-Hermes-13B. I
           | believe that's a dedicated roleplaying model. I guess furry
           | roleplays would make good training data, wouldn't it.
           | 
           | Chronos-Hermes-13B itself though is a mildly
           | cursed/degenerate hybrid of two other models that don't
           | really work together properly with the new GGML
           | quantizations, and it's based on the old LLaMA-1 family of
           | models, but I haven't found anything better yet.
        
             | brucethemoose2 wrote:
             | > Chronos-Hermes-13B
             | 
             | Its not SOTA anymore. I dunno what is, but just look at
             | what people are running on Lite:
             | 
             | https://lite.koboldai.net/#
             | 
              | The new darlings seem to be Mythos- and Xwin-based hybrid
              | models, as well as models with the 70B version of Chronos
              | in them.
             | 
             | Also, see this, specifically the "IQ" metric:
             | https://rentry.co/ayumi_erp_rating
             | 
             | > write this story, except the characters are quadrupeds,
             | and therefore are not human
             | 
             | But the RP models should be able to get this with some
             | prompt engineering. You may have to be repetitive in the
             | instruction block, saying things like "...the characters
             | are not human. All the characters have four feet. All the
              | characters are quadruped animals..." and so on to really
             | emphasize it to the LLM.
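              | 
              | A hypothetical instruction block along those lines (the
              | exact template depends on the model's fine-tune format):
              | 
              |     # Illustrative system prompt; repeat the constraint
              |     # several ways so the model keeps it in context.
              |     system_prompt = (
              |         "You are narrating a story. All characters are "
              |         "quadruped animals. None of them are human. They "
              |         "have four paws, not hands, and walk on four "
              |         "legs. Never describe them with human body parts."
              |     )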
        
               | LoganDark wrote:
               | Honestly ERP models sound like they would be the best fit
               | for this task, it's just hard to find one that's trained
               | on quadrupeds rather than humans or even furries, if that
                | makes any sense. I will try the repetitive method soon.
        
               | brucethemoose2 wrote:
               | There is a _lot_ of effort put into those ERP models,
               | lol. The training and datasets are actually really good,
               | hence they are very good at the non-e RP part.
        
               | tavavex wrote:
               | Pretty funny how so much effort goes into making and
               | categorizing specifically NSFW content lol
               | 
               | I wouldn't be surprised if at least a few contributors in
               | the open source AI community initially got in just
               | because of this aspect
        
         | cypress66 wrote:
         | > I always have issues with LLMs completely forgetting where
         | things are in a scene, or even what parts a given animal has,
         | e.g. saying "hands" when the subject is a quadruped.
         | 
         | Sounds like you're using too small of a model. Try llama 70b.
        
           | LoganDark wrote:
           | I have a single RTX 3060. It can't handle a 70b model.
           | 
           | I got something like 1-2 tokens per second the last time I
           | tried, with CPU offloading and an absolutely offensive page
           | file (32gb).
        
             | coolspot wrote:
             | With this setup you can as well throw your 3060 out and
              | just use CPU, because your bottleneck is RAM-to-VRAM
              | bandwidth; the 3060 is basically idle.
        
               | LoganDark wrote:
               | I would love to throw the 3060 out and replace it with a
               | 3090... once money permits. (It's only about $800
               | nowadays.)
               | 
               | But yes. I'm aware how laughably insane it is to run a
               | 70b model that way. And that's why I was pointing it out
               | to the commenter who suggested to just run a 70b model
               | instead.
        
               | freedomben wrote:
               | downvoters: why did you downvote? is this comment
               | technically incorrect or inaccurate?
        
               | LoganDark wrote:
               | To a comment that suggested I try the 70b model, I
               | replied "my card can't run that model". Someone replies
               | back with "you may as well throw the card out if you're
               | going to be trying to run that model". My point exactly.
               | 
               | More seriously, using all-CPU is not much faster as my
               | computer only has 16GB of actual memory, which I'm aware
               | is also hugely underspecced for a 70b model, even with
               | memory mapping.
               | 
               | I have a nice NVMe SSD, so there's not much else for me
               | to do here except upgrade my memory or graphics card.
        
               | freedomben wrote:
                | that would explain the downvotes, thank you!
        
             | brucethemoose2 wrote:
             | That can handle a 20B model, either in llama.cpp or
             | exLLaMA:
             | 
             | https://huggingface.co/models?sort=modified&search=20B
             | 
             | https://huggingface.co/Kooten/U-Amethyst-20B-3bpw-
             | exl2?not-f...
        
       | yieldcrv wrote:
       | what do you use "the most powerful language model for its size"
       | for?
        
         | ShrigmaMale wrote:
         | Probably a better candidate for local applications? Guessing
         | this was a trial balloon against larger models coming though.
        
         | snowram wrote:
         | 7B models are small enough to be usable on a smartphone, so a
         | local handheld assistant sounds like a use case.
        
           | minimaxir wrote:
           | "usable" is not the same as practical.
           | 
           | Even running a quantized and optimized LLM on a smartphone
           | would kill battery life at minimum.
        
             | brucethemoose2 wrote:
              | Try MLC-LLM. It's not as bad as you'd think.
             | 
             | In the future(?), they will probably use the AI blocks
             | instead of the GPU, which are very low power.
        
           | ComputerGuru wrote:
           | Are they? Unquantized, Llama 2 7b needs over 14GB of GPU (or
           | shared) memory.
        
             | polygamous_bat wrote:
             | "Unquantized" is the key word here: with quantization you
             | can get a 4-8x improvement without much performance
             | degradation.
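              | 
              | Back-of-the-envelope, for the weights alone (ignoring KV
              | cache and other overhead; parameter count is approximate):
              | 
              |     # Rough weight-memory estimate at common precisions.
              |     params = 7.3e9
              |     for bits in (16, 8, 4):
              |         gb = params * bits / 8 / 1e9
              |         print(f"{bits}-bit: ~{gb:.1f} GB")
              |     # 16-bit ~14.6 GB, 8-bit ~7.3 GB, 4-bit ~3.7 GB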
        
               | throeaaysn wrote:
               | [dead]
        
         | linkpuff wrote:
         | According to their website, it's "Optimal for: low latency,
         | text summarisation, classification, text completion, code
         | completion"
        
         | kirill5pol wrote:
          | One big one is speculative decoding for larger models: the more
          | "accurate"* the smaller model, the greater the speedup for the
          | bigger model.
         | 
         | * as in matches the token that the larger model would output
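          | 
          | A greedy-acceptance sketch of the idea, with hypothetical
          | `draft_next` / `target_next` callables standing in for the
          | small and large models (real implementations score all draft
          | positions in one batched forward pass of the large model):
          | 
          |     def speculative_decode(prefix, draft_next, target_next,
          |                            k=4, max_new=64):
          |         out = list(prefix)
          |         while len(out) - len(prefix) < max_new:
          |             # Small model cheaply proposes k tokens.
          |             draft = []
          |             for _ in range(k):
          |                 draft.append(draft_next(out + draft))
          |             # Large model checks the proposals.
          |             for i, tok in enumerate(draft):
          |                 expected = target_next(out + draft[:i])
          |                 if tok == expected:
          |                     out.append(tok)       # accepted for free
          |                 else:
          |                     out.append(expected)  # take the big
          |                     break                 # model's token
          |             else:
          |                 # All drafts matched; the big model still
          |                 # yields one extra token.
          |                 out.append(target_next(out))
          |         return out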
        
           | sroussey wrote:
           | It would be better if they also had a 70B model for this.
           | 
           | They would need the same vocabulary, etc. What else?
        
       | samr71 wrote:
       | The way the wind's blowing, we'll have a GPT-4 level open source
       | model within the next few years - and probably "unaligned" too. I
       | cannot wait to ask it how to make nuclear weapons, psychedelic
       | drugs, and to write erotica. If anyone has any other ideas to
       | scare the AI safety ninnies I'm all ears.
        
         | gmerc wrote:
         | The AI safety ninnies as you call them are not scared and
         | neither do they buy into the narrative.
         | 
         | They are the investors of large proprietary AI companies who
          | are facing massive revenue loss, primarily due to Mark
          | Zuckerberg's decision to give away a competitive LLM to open
         | source in a classic "if I can't make money from this model, I
         | can still use it to take away money from my competition" move -
         | arming the rebels to degrade his opponents and kickstarting
         | competitive LLM development that is now a serious threat.
         | 
         | It's a logical asymmetric warfare move in a business
         | environment where there is no blue ocean anymore between big
         | companies and degrading your opponents valuation and investment
         | means depriving them of means to attack you.
         | 
          | (There's a fun irony here where Apple's incentives are very much
          | aligned now - on-device compute maintains App Store value and
          | the privacy narrative, and allows you to continue selling
          | expensive phones - things a web/API world could threaten.)
         | 
         | The damage is massive, the world overnight changed narrative
         | from "future value creation is going to be in
         | openai/google/anthropic cloud apis and only there" to a much
         | more murky world. The bottom has fallen out and with it
         | billions of revenue these companies could have made and an
         | attached investor narrative.
         | 
         | Make no mistake, these people screaming bloody murder about
         | risks are shrewd lobbyists, not woke progressives, they are
         | aligning their narrative with the general desires of control
         | and war on open computing - the successor narrative of the end
         | to end encryption battle currently fought in the EU will be AI
         | safety.
         | 
         | I am willing to bet hard money that "omg someone made CSAM with
         | AI using faceswap" will be the next thrust to end general
          | purpose compute. And the next stage of the war will be brutal,
         | because both big tech and big content have much to lose if
         | these capabilities are out in the open
         | 
          | The cost of the alignment tax and the massive loss of potential
          | value make the lobbying world tour by Sam Altman an
          | aggressive push trying to convince nations that the best way to
          | deal with scary AI risks (as told in OpenAI bedtime stories) is
          | to regulate it China-style - through a few pliant monopolists
          | who guarantee "safety" in exchange for protection from open
          | source competition.
         | 
         | There's a pretty enlightening expose [1] on how heavily US
         | lobbyists have had their hand in the EU bill to spy on end to
         | end encryption that the commission is mulling - this ain't a
         | new thing, it's how the game is played and framing the people
         | who push the narrative as "ninnies" who are "scared" just buys
         | into culture war framing.
         | 
         | [1] https://fortune.com/2023/09/26/thorn-ashton-kutcher-ylva-
         | joh...
        
           | garba_dlm wrote:
           | such bullshit: to regard a loss of a "potential" as a
           | realized actualized loss....
        
             | gmerc wrote:
             | It's a direct degradation of investor narrative at a time
             | when money is much tighter.
             | 
             | Nobody says it's realized loss, that's not how valuation
             | works.
             | 
             | But Google LRP involves, as one of the first steps, the
             | question of how much money will be allocated to investors
             | (currently with stock buybacks) before other investment
             | decisions, so yes, attacking valuation directly attacks the
             | purse available for aggressive business moves and L&D.
        
               | smoldesu wrote:
               | > It's a direct degradation of investor narrative at a
               | time when money is much tighter.
               | 
               | Uh, no? The investor narrative of "giving away free AI
               | shit" has been in-effect since Pytorch dropped a half-
               | decade ago. If you're a Meta investor disappointed by
               | public AI development, you really must not have done your
               | homework.
        
               | gmerc wrote:
                | That's not the investor narrative. The investor
                | narrative is choking the competition out of the market
                | and then squeezing the shit out of people. As we see
                | right now in this season of enshittification.
               | 
               | That happens to not work anymore because open source sets
               | a price floor at which people will adopt the alternative.
               | 
               | The investor narrative is always about building a
               | monopoly.
               | 
                | Damaging the investor narrative of your most direct
                | competitor, who is building in a saturated ad market,
                | is an effective indirect attack.
        
               | smoldesu wrote:
               | > The investor narrative is always about building a
               | monopoly.
               | 
               | Can you point out how Meta has been applying this
               | philosophy to AI? Given their history of open research,
               | model weights releases and competitive alternative
               | platforms, I struggle to envision their ideal monopoly.
               | You claim that openness is a hostility tactic, but I
               | think Llama wouldn't be public if it was intended to
               | "kill" the other LLMs.
               | 
               | What we've gotten from Meta is more than we've gotten out
               | of companies that _should_ be writing society software,
               | like Microsoft and Apple.
        
               | robertlagrant wrote:
               | While I agree that the previous commenter's point is
               | silly, I wouldn't say that anyone should be writing
               | society software. There's no should.
        
               | gmerc wrote:
                | You are misreading my argument. I'm saying Facebook is
                | degrading Google's and OpenAI's investor narrative. If
                | Llama cost a hypothetical one billion, they inflict a
                | multiple of that on their competitors with this move
                | while gaining massive technological advantages.
               | 
                | The improvements made to Llama by the open source
                | community have already propelled it past Bard by many
                | accounts, and this is a model that a few months ago was
                | absolutely noncompetitive and downright bad.
                | 
                | So it's a win-win. I don't see the problem.
        
               | smoldesu wrote:
               | Facebook has been open-sourcing AI research longer than
               | OpenAI has even had the concept of an "investor
               | narrative". I struggle to understand how someone could
               | jump to the conclusion of this being a "scorched earth"
               | maneuver with so many other reasonable explanations.
               | Facebook has a laboratory (FAIR) with a long history of
               | research and releases like this.
               | 
               | > If Llama cost hypothetical one billion, they inflict a
               | multiple on that on their competitors with this move
               | while gaining massive technological advantages.
               | 
               | If Llama cost a hypothetical one billion, then they
               | amortized the cost over the value of the end product and
               | the free advertisement alone.
               | 
               |  _Maybe_ their competitors got scooped, but GPT-3 and
                | GPT-4 haven't gone anywhere. Not to mention, there were
               | lots of other language models from FAANG before Llama
               | arrived. It's not like _those_ were made and released to
               | spite their competitors; it was research. Google and
               | Microsoft have lots of open Transformer research you can
               | find.
               | 
               | Inflicting "damage" and gaining massive technological
               | advantages is _quite literally_ not their goal nor what
                | they've done for the past half-decade. If it is, they've
               | done a terrible job so far by collaborating with
               | Microsoft to open their model format and provide
               | inferencing acceleration for outdated hardware platforms.
               | 
               | > The improvements made to llama by open source community
               | people already have propelled it past Bard by many
               | accounts and this is a model that a few months ago was
               | absolutely non competitive and downright bad.
               | 
               | This is something the original Llama paper acknowledged
               | before the community "discovered" it:
               | 
               | > In this section, we show that briefly finetuning on
               | instructions data rapidly leads to improvements on MMLU.
               | Although the non-finetuned version of LLaMA-65B is
               | already able to follow basic instructions, we observe
               | that a very small amount of finetuning improves the
               | performance on MMLU, and further improves the ability of
               | the model to follow instructions.
               | 
               | https://arxiv.org/pdf/2302.13971.pdf
               | 
               | > So it's a win win. I don't see the problem
               | 
               | Neither does Meta, nor Microsoft, nor Google, who have
               | all been content to work on progressive and open AI
               | research. Who do you perceive as their "competitors"?
               | Each other?
        
               | [deleted]
        
           | apsec112 wrote:
           | "They are the investors of large proprietary AI companies" is
           | just... not true? Not sure where you're even getting this
           | from. I'm a modestly successful upper-middle-class ML
           | engineer, and I've been worried about AI safety since before
           | Facebook, DeepMind, OpenAI, or Anthropic even existed. The
           | most prominent funder of AI risk efforts (Dustin Moskovitz)
           | is a _co-founder of Facebook_ , so if anything he'd be
           | motivated to make Facebook more successful, not its
           | competitors.
        
             | pmarreck wrote:
             | This all smacks of the 80's craze against rap music and
             | video games causing violent behavior.
             | 
             |  _Where is the evidence_ that access to uncensored models
              | results in harm (that wouldn't occur due to a bad actor
             | otherwise)? And _where is the evidence_ that said harm
             | reduction is greater than the harm caused by the measurable
             | loss in intelligence in these models?
        
             | hatenberg wrote:
             | Are you the one talking to the European commission though?
        
               | DebtDeflation wrote:
                | Exactly. The moment Sam Altman started talking to
                | Congress about the dangers of AI and how the solution
                | should be to only allow licensed companies to develop
                | AI models, and that OpenAI should be part of a small
                | board that determines to whom to grant licenses,
                | everyone should have seen it for what it is.
        
             | ozr wrote:
             | The AI safety cult has some true believers. It's still
             | fundamentally a grift.
        
               | gmerc wrote:
               | So like crypto and web 3;)
        
               | jona-f wrote:
               | so like hedge funds and global finance
        
           | diyseguy wrote:
           | I'm far more worried about _how_ they will try to regulate
           | the use of AI.
           | 
           | As an example the regulations around PII make debugging
           | production issues intractable as prod is basically off-limits
           | lest a hapless engineer view someone's personal address, etc.
           | 
           | How do they plan to prevent/limit the use of AI? Invasive
           | monitoring of compute usage? Data auditing of some kind?
        
           | potatoman22 wrote:
            | I don't agree with your point, but I love that Facebook
            | released Llama into the open. I realize it's not
            | necessarily to undercut their competitors, either. Their
            | revenue grows when high quality content is easier to
            | create. If they commoditize the process of creating
            | content, they make more money. Commoditize your complement.
        
             | gmerc wrote:
             | High quality content is not a concern for Facebook
        
               | esafak wrote:
               | Good enough to share, cheap to create.
        
               | Gh0stRAT wrote:
               | >High quality content is not a concern for Facebook
               | [Citation needed]
               | 
               | I'd say it's a huge concern due to its strong correlation
               | with increased usage and thus ad revenue.
        
               | gmerc wrote:
                | For the time I worked there, the metric was engagement
                | (with occasional Cares about Facebook intermissions).
                | 
                | One look at the newsfeed tells you it's ad revenue now.
                | Quality has nothing to do with it, unless you define
                | quality as clickbait.
                | 
                | In fact, citation needed on "strong correlation", unless
                | you take Meta press releases, which are notoriously
                | misleading - like the one about only 3% of the platform
                | being news.
        
           | lawlessone wrote:
           | >Primarily due to Mark Zuckerbergs decision to give away a
           | competitive LLM to open source in a classic "if I can't make
           | money from this model, I can still use it to take away money
           | from my competition" move
           | 
           | I loved it.
        
             | pk-protect-ai wrote:
              | Though he didn't give it completely away. With the
              | Llama/Llama2 licenses he has just threatened that he
              | will give it away...
        
               | hatenberg wrote:
                | Semantics though: he gave tens of thousands of
                | salivating engineers on the internet the first
                | competitive LLM to play with. Or left the door open for
                | people to take it, if you prefer that narrative. The
                | entire progress chain that has given us ollama,
                | llama.cpp and hundreds of innovations in a very short
                | time was set off by that.
        
               | pk-protect-ai wrote:
               | Can't agree more on that one :)
        
           | isoprophlex wrote:
           | > The damage is massive, the world overnight changed
           | narrative from "future value creation is going to be in
           | openai/google/anthropic cloud apis and only there" to a much
           | more murky world. The bottom has fallen out and with it
           | billions of revenue these companies could have made and an
           | attached investor narrative.
           | 
           | My god!! Will someone please think of the ~children~ billions
           | in revenue!
        
             | throeaaysn wrote:
             | [dead]
        
             | gmerc wrote:
             | If there was no Linux, how much more revenue would windows
             | / Sun server divisions have made?
        
               | sp332 wrote:
               | And how much poorer would the rest of the world be?
        
               | blibble wrote:
               | imagine the increase in GDP!!
        
               | miohtama wrote:
                | If there was no Linux, it's unlikely we ever would have
                | had Google, Facebook and Amazon as we know them. A free
                | OS was core to building their SaaS.
        
           | FrenchDevRemote wrote:
           | I can think of at least a dozen ways to completely ruin the
           | internet or even society using SOTA/next-gen LLMs/GenAIs,
           | we'll be in trouble way before the singularity.
           | 
           | A ton of legit researchers/experts are scared shitless.
           | 
           | Just spend 5 minutes on EleutherAI discord(which is mostly
           | volunteers, academics, and hobbyists, not lobbyists), read a
           | tiny bit on alignment and you'll be scared too.
        
             | vladms wrote:
              | The same can be said of a lot of technologies (or
              | pandemics, or climate change). Imagination is a tool -
              | using it only to picture how things can go bad does not
              | seem to be the most efficient way to use it.
        
               | FrenchDevRemote wrote:
               | IMO the next gen AI are going to be tiny nukes that
               | middle schoolers could play with on their iPhones.
               | 
               | AI regulation is as needed as radioactive material
               | regulation.
               | 
               | Nuclear energy is great, Hiroshima not so much.
        
               | gmerc wrote:
                | What does that look like in practice? What do those
                | nukes do?
        
             | hatenberg wrote:
              | Both can be true: big companies can lobby for protection,
              | and broad diffusion of the technology can create
              | additional risks.
              | 
              | Cat's out of the bag though - we're still trading mp3s
              | decades after Napster, this ghost won't go back into the
              | bottle, and realistically most of the risks people flag
              | are not AI risks. They are societal risks where our
              | existing failure to regulate and create consensus has
              | already gone past the red line (election interference,
              | etc).
        
             | gmerc wrote:
              | The internet is already being ruined with access to
              | ChatGPT; the spammers haven't even figured out how to use
              | Llama for the most part.
              | 
              | So really, wrong tree to bark up - the problem is that
              | our existing way of doing things can't survive AI, and
              | you can't regulate that away, just as you couldn't make
              | gunpowder disappear to keep your city walls relevant.
        
             | random3 wrote:
              | You seem to assume that the models will only benefit
              | producers, and not consumers. Why? Asymmetrical compute
              | power? The difference will likely be in size (amount of
              | facts compressed), not capability / ability to detect
              | bullshit.
              | 
              | That said, the trouble is machines may close the gaps in
              | skills faster than we can comprehend and adjust. This
              | means quality of life for people may decrease faster from
              | loss of usefulness than it increases from gains (which
              | need to be relatively evenly distributed). This suggests
              | that everyone should own the compute/storage and the
              | ability to enhance themselves.
        
               | pk-protect-ai wrote:
                | I have no doubt that machines will close the gaps in
                | skills faster than humans will comprehend, however even
                | AGI will have an owner. And if it is Sam Altman, then
                | this dystopian future is even more horrible than
                | thousands of hackers running their own AGIs.
        
             | esafak wrote:
             | What's the gist; what are they scared of? Misinformation,
             | and unemployment?
        
             | pk-protect-ai wrote:
             | If you have ample resources, you don't need next-gen LLMs
             | or AGI. You can accomplish this now, without any fancy,
             | hyped technology. Literally, none of the things LLM or AGI
             | could propose or manage to do to harm us is worse than what
             | we can do to ourselves. For AGI, you need a significant
             | amount of resources to develop, train, and use it. To
             | inflict harm, the brute force of a simple human mind in
             | uniform is much cheaper and more effective.
        
               | FrenchDevRemote wrote:
               | The point is, it greatly reduces the amount of resources
               | needed to do some serious damage, as well as the level of
               | sophistication needed.
               | 
               | You don't need AGI to do damage, current LLMs are already
               | dangerous. IMO, an open-source affordable unfiltered
               | GPT-5 would ruin the internet in a few months.
        
               | IKantRead wrote:
               | > ruin the internet in a few months.
               | 
               | I'm sure the internet will be fine, and the web has
               | already been essentially destroyed as the drive for
               | extracting revenue from every human interaction has
               | rendered it just an amusing advertisement for the most
               | part.
               | 
               | Most of the content of the web today is already generated
               | by "bots" even if those "bots" happen to be human beings.
        
               | lawlessone wrote:
                | YouTube is rife with AI-voiced (edit: this is not
                | necessarily AI) videos of copy-pasted Wikipedia
                | articles. I find I am blocking new ones every day. LLMs
                | didn't do that.
        
               | shawn-butler wrote:
                | My suggestion: provide a specific example of what you
                | have in mind to further the conversation, not just more
                | opining on "dangerous".
        
               | FrenchDevRemote wrote:
               | Tailored propaganda, scams, spams, and harassment at a
               | scale that was never seen before. Plugging metasploit
               | into an unfiltered GPT-5 with a shell and a few proxies
               | could be devastating. Undetectable and unstoppable bots
               | would be available to anyone. Don't like someone? You
               | could spend a hundred bucks to ruin their life
               | anonymously.
               | 
               | Each of us could unknowingly interact with multiple LLMs
               | everyday which would only have one purpose: manipulate us
               | with a never-seen before success rate at a lower cost
               | than ever.
               | 
               | At some point AI generated content could become more
               | common than human content, while still being
               | indistinguishable.
               | 
                | Good enough automated online propaganda could routinely
                | start (civil) wars or genocides - Facebook already let
                | that happen in the past - and manipulating elections
                | would become systematic even in the most democratic
                | countries.
               | 
               | What already happened in those areas in the last few
               | years, is really nothing compared to what could happen
               | without enough regulation or barriers to entry in the
               | next few years.
               | 
               | What's worse is that all of this, would not just be
               | possible, but available to every sociopath on earth, not
               | just the rich ones.
        
               | pk-protect-ai wrote:
               | >> Tailored propaganda, scams, spams, and harassment at a
               | scale that was never seen before.
               | 
               | I believe the state of these subjects right now is
               | already alarming without AGI. You can't exacerbate the
               | horror about the level of tailored propaganda and scams,
               | etc., which you can't even foresee yourself. It isn't
               | quantifiable.
               | 
               | >>Each of us could unknowingly interact with multiple
               | LLMs everyday which would only have one purpose:
               | manipulate us with a never-seen before success rate at a
               | lower cost than ever.
               | 
               | You would build resistance pretty quickly.
               | 
               | >> At some point AI generated content could become more
               | common than human content, while still being
               | indistinguishable.
               | 
               | Oh, there were some numbers on that one. The number of
                | images generated with AI is already several orders of
                | magnitude larger than the number of photos humanity has
                | produced since the invention of photography. No AGI is required
               | either.
               | 
               | >> Good enough automated online propaganda could
               | routinely start (civil)wars or genocides,
               | 
                | It already does, without AGI. The BlackRock guys say
                | it's good - war is good for business. You can squeeze
                | the markets, make money on foreseeable deficits.
               | 
               | >> What's worse is that all of this, would not just be
               | possible, but available to every sociopath on earth, not
               | just the rich ones.
               | 
                | But guns are available to every sociopath on earth too...
               | 
               | All of your arguments concern how those with malicious
               | intent can harm us further. I would argue that Sam Altman
               | as the sole controller of AGI is a rather unsettling
               | prospect. If only one country possessed a nuclear weapon,
               | that country would certainly use it against its
               | adversaries. Oh wait, that's already a part of history...
        
               | warkdarrior wrote:
               | > >>Each of us could unknowingly interact with multiple
               | LLMs everyday which would only have one purpose:
               | manipulate us with a never-seen before success rate at a
               | lower cost than ever.
               | 
               | > You would build resistance pretty quickly.
               | 
               | That is adorably naive. The current thrust in LLM
               | training is towards improving their outputs to become
               | indistinguishable from humans, for any topic, point of
               | view, writing style, etc.
        
               | Brian_K_White wrote:
               | gpt5 6 11 90 will exist regardless.
               | 
               | The option where they don't exist doesn't exist, and so
               | it is utterly pointless to spend one second fretting
               | about how you don't like that or why one should not like
               | that. A nova could go off 50 light years from here, and
               | that would kill every cell on the planet. That is even
               | worse than child porn. And there is nothing anyone can do
               | about that except work towards the eventual day we aren't
               | limited to this planet, rather than against that day.
               | It's the same with any tech that empowers. It WILL
               | empower the bad as well as the good equally, and it WILL
                | exist. So being scared of its mere existence, or its
                | being in the hands of people you don't approve of, is
               | pointless. Both of those things can not be avoided. Might
               | as well be scared of that nova.
               | 
               | There isn't even a choice about who gets to use it. It
               | will be available one way or another to both good and bad
               | actors for any purpose they want.
               | 
               | The only choices available to make, are who gets a few
               | different kinds of advantage, who gets their thumb on the
               | scale, who gets official blessing, who gets to operate in
               | secrecy without oversight or auditing or public approval.
               | 
                | When you try to pretend that something uncontrollable
                | is controlled, all it does is put the general
                | population's guard down, make them blind and open to
                | being manipulated, and give the bad actors the cover of
                | secrecy. The government can use it on its own citizens
                | without them objecting, and other bad guys aren't
                | affected at all, but honest people are inhibited from
                | countering any of these bad users.
               | 
                | Which is a shame, because the honest (or at least
                | reasonably so) outnumber the really bad. The only long
                | term way to oppose the bad is to empower everyone
                | equally as much as possible, so that the empowered good
                | outnumber the empowered bad.
        
               | pk-protect-ai wrote:
                | A squad of marines at a Nigerian telecom (or any other
                | country's telecom) with access to change BGP routing
                | will do equivalent harm in under 24h, and the changes
                | could enforce a month of harm.
        
               | FrenchDevRemote wrote:
                | If any middle schooler had the same destructive power
                | as a squad of marines embedded clandestinely in a
                | foreign country, the world would be in shambles.
        
         | jrflowers wrote:
         | >I cannot wait to ask it how to make nuclear weapons,
         | psychedelic drugs
         | 
         | This is an interesting idea. For the stubborn and vocal
         | minority of people that insist that LLMs have knowledge and
         | will replace search engines, no amount of evidence or
         | explanation seems to put a dent in their confidence in the
         | future of the software. If people start following chemistry
         | advice from LLMs and consume whatever chemicals they create,
         | the ensuing news coverage about explosions and poisonings might
         | convince people that if they want to make drugs they should
         | just buy/pirate any of Otto Snow's several books.
        
         | zackmorris wrote:
         | While those are some eventualities that may pose a threat, I
         | fear a post-AI world where nothing changes.
         | 
         | We'll have an AI with a 200+ IQ and millions of children
         | excluded from a good public education because the technocrats
         | redirected funds to vouchers for their own private schools.
         | 
         | We'll have an AI that can design and 3D print any mechanical or
         | electronic device, while billions of people around the world
         | live their entire lives on the brink of starvation because
         | their countries don't have the initial funding to join the
         | developed world - or worse - are subjugated as human automatons
         | to preserve the techno utopia.
         | 
         | We'll have an AI that colonizes the solar system and beyond,
         | extending the human ego as far as the eye can see, with no
         | spiritual understanding behind what it is doing or the effect
         | it has on the natural world or the dignity of the life within
         | it.
         | 
         | I could go on.. forever. My lived experience has been that
         | every technological advance crushes down harder and harder on
         | people like me who are just behind the curve due to past
         | financial mistakes and traumas that are difficult to overcome.
         | Until life becomes a never-ending series of obligations and
         | reactions that grow to consume one's entire psyche. No room
         | left for dreams or any personal endeavor. An inner child bound
         | in chains to serve a harsh reality devoid of all leadership or
         | real progress in improving the human condition.
         | 
         | I really hope I'm wrong. But which has higher odds: UBI or
         | company towns? Free public healthcare or corrupt privatization
         | like Medicare Advantage? Jubilee or one trillionaire who owns
         | the world?
         | 
         | As it stands now, with the direction things are going, I think
         | it's probably already over and we just haven't gotten the memo
         | yet.
        
           | savolai wrote:
           | Thanks for speaking up. I love how well you elaborate the
           | reality of trauma and life choices.
        
         | helpfulContrib wrote:
         | I've kept 25 years worth of Internet browsing data. Not just
          | the history or the URLs, the pages themselves. 90,000 bits of
         | information about what my interests are, what I spent time
         | reading, a wide and awesome variety of subjects.
         | 
         | I'll train an AI on this data, and then give it access to all
         | my social media accounts. It can keep me updated on things ..
         | 
         | ;)
        
           | nsomaru wrote:
           | Hey,
           | 
           | Out of interest, what does your stack look like to do this
           | and how do you use the information? What front end do you
           | use?
        
         | pdntspa wrote:
          | llama2 spits out erotica quite happily if you don't give it
          | a system prompt or use it as a chatbot, but rather just
          | prompt it with a sentence or two to start the story.
          | 
          | NousHermes is a bit more creative, and unaligned.
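          | 
          | (A minimal sketch of that raw-completion style with
          | llama-cpp-python; the model filename and sampling settings
          | here are illustrative assumptions, not from this thread:
          | 
          |     from llama_cpp import Llama
          | 
          |     # Load a local quantized llama2 base model and do plain text
          |     # completion - no chat template, no system prompt.
          |     llm = Llama(model_path="llama-2-7b.Q4_K_M.gguf", n_ctx=2048)
          | 
          |     opening = "The rain had just stopped when she knocked on his door, "
          |     out = llm(opening, max_tokens=256, temperature=0.9)
          |     print(opening + out["choices"][0]["text"])
          | 
          | The base model just keeps writing the story from wherever the
          | prompt leaves off.)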
        
         | barrysteve wrote:
         | IF I had an idea good enough to scare an AI safety ninny... why
         | would I say it?
         | 
         | Honest and serious question!
        
         | [deleted]
        
         | civilitty wrote:
         | _> I cannot wait to ask it how to make nuclear weapons_
         | 
         | Amen! I'm going to ask it to give me detailed designs for
         | everything restricted by ITAR.
         | 
         | Just waiting on my ATF Mass Destructive Devices license.
        
           | gonzo41 wrote:
           | The construction of the 'bomb' part of a nuclear weapon is
           | the easy part, within reason! The really hard part is the
           | separation science of turning uranium and plutonium into
           | gasses with fluorine with the intent to spin out isotopes and
           | then recrystallize the pure metal for the bomb.
           | 
           | I would hope that if you asked chat gpt "How to make a
           | nuclear weapon?" it responded with, "Don't bother it's really
           | hard, you should try and buy off the shelf."
        
             | civilitty wrote:
             | That's why I'm going to ask it about everything restricted
             | by ITAR. That includes everything you need to build the
             | centrifuges to enrich uranium, including the CNCs capable
             | of machining the parts. That's why it's such a fun test.
        
               | gonzo41 wrote:
                | It won't have that knowledge, unless someone trained it
                | with stuff they shouldn't have. LLMs don't really know
                | anything; they just look at the shape of an input and
                | produce a reasonably shaped output.
        
               | pk-protect-ai wrote:
                | Actually, you will just need to train it with known
                | physics books and run a long, long inference with a
                | chain of thought on the topic. There will be a lot of
                | trial and error and a lot of experimentation required
                | as well, so you'd better be ready to build an interface
                | for the AGI to monitor the experiments. It takes time,
                | you know ...
        
           | londons_explore wrote:
           | This is actually a pretty decent test for an advanced AI.
           | 
           | Every device protected by ITAR is _known_ to be possible to
           | build, yet the designs should not be on the public internet.
            | Ask an AI to design it for you from first principles. Then
            | build/simulate what is designed and see if it works.
        
         | chx wrote:
         | Hello from an AI safety ninny. I have posted these two concerns
         | multiple times and no one posted any counters to them.
         | 
          | 1. There was https://www.youtube.com/watch?v=xoVJKj8lcNQ where
          | they argued that from 2028 on we will have AI elections,
          | where the person with the most computing power wins.
          | 
          | 2. Propaganda produced by humans on a small scale killed
          | 300,000 people in the US alone in this pandemic
          | https://www.npr.org/sections/health-shots/2022/05/13/1098071...
          | imagine the next pandemic, when it'll be produced on an
          | industrial scale by LLMs. Literally millions will die of it.
        
           | pk-protect-ai wrote:
            | You should not worry about AI problems by 2028. Tens of
            | millions worldwide will die from climate-related problems
            | by that time. Literally nobody will care about the topic of
            | AGI anymore.
        
             | adroniser wrote:
              | You should worry about both problems. You're telling me
              | that AI isn't going to improve its video capabilities in
              | the next 4 years enough to make convincing deepfakes?
        
               | pk-protect-ai wrote:
                | It already does. And I'm not worried. This is to be
                | mitigated by law enforcement, not by forbidding AI.
        
               | adroniser wrote:
                | How can you effectively enforce anything if the models
                | are open source? How do you draw the line if a deepfake
                | is not defamatory (making someone say something they
                | didn't say) but in fact just makes someone look silly
                | https://en.wikipedia.org/wiki/Ed_Miliband_bacon_sandwich_pho....
                | Or using LLMs to scale up what happened with Cambridge
                | Analytica and create individualized campaigns and bots
                | to influence elections?
        
               | pk-protect-ai wrote:
               | You should handle it as any other crime. Why do you ask?
               | It does not matter how good the gun is, what matters is
               | who has pulled the trigger.
        
               | adroniser wrote:
               | Yes but if we had the ability to download a gun from the
               | internet anonymously with no way to feasibly get the
               | identity of the person downloading the gun I think we
               | would be right to be concerned. Especially if you could
               | then shoot that gun at someone anonymously.
        
               | pk-protect-ai wrote:
               | >> Yes but if we had the ability to download a gun from
               | the internet anonymously with no way to feasibly get the
               | identity of the person downloading
               | 
               | But you can. There are blueprints for 3D printers
               | circulating for a decade now ...
        
               | adroniser wrote:
                | And many countries ban the possession or distribution
                | of those blueprints, and the United States had a ban on
                | re-publication of those 3D designs from 2018 until
                | Trump reversed it; even now it requires a license to
                | post blueprints online.
               | 
               | And you failed to respond to the argument that you can
               | anonymously post deepfakes with no way of tracing it back
               | to you, and so it becomes impossible to enforce. You
               | can't shoot someone with a guarantee that there will be
               | no trace with a 3d printed gun.
               | 
                | Never mind the fact that it's not even clear it should
                | be a crime in some cases. Should AI production of an Ed
                | Miliband sandwich-style photo be banned?
               | 
               | And should replying to a user with personalized responses
               | based on the data you've collected about them based on
               | their facebook likes with LLMs be illegal? I don't think
               | so, but doing it on a mass scale sounds pretty scary.
        
               | pk-protect-ai wrote:
               | >> And you failed to respond to the argument that you can
               | anonymously post deepfakes
               | 
                | You can't post them anonymously; even Tor can't give
                | you a 100% guarantee. Not for a very long time, and not
                | if the law is after you - especially if the AGI is on
                | the side of law enforcement. Law enforcement will
                | become more expensive.
               | 
               | It's just a different scale of warfare. Nothing really
               | changes except the amount, speed, and frequency of the
               | casualties.
               | 
               | And any argument you make is absolutely applicable to
               | each corporation right now. Do you prefer the dystopian
               | dictatorship of the corps or the balance of powers?
        
               | adroniser wrote:
                | I don't like where we are headed at all. I acknowledge
                | we face two dystopian options: either concentrate power
                | in the hands of a few corporations who you can
                | hopefully regulate, or have open source models, which
                | ends up delivering significant power to people who
                | cannot be effectively controlled. An AGI law
                | enforcement? How dystopian can you get.
        
               | pk-protect-ai wrote:
               | How can you believe that it will be enough to regulate
               | them? Here is the problem: "a few corporations whom you
               | hopefully can regulate." When they have the power of an
               | AGI with high intelligence and access to all available
               | information on their side, there is no scenario where you
               | would control them. They would control you.
               | 
               | >> How dystopian can you get.
               | 
                | Oh, I have a very good imagination ... But I'm stupid
                | and I have hope ...
        
           | root_axis wrote:
           | None of this seems related to LLMs. Propaganda produced by
           | humans is effective because of the massive scale of
           | distribution, being able to produce more variations of the
           | same talking points doesn't change the threat risk.
        
             | semi wrote:
             | Being able to produce more variations of the same talking
             | points sounds really useful for increasing the scale of
             | distribution - you can much more easily maintain more
             | legitimate looking sock puppet accounts that can appear to
             | more organically agree with your talking points.
        
               | root_axis wrote:
               | I don't think it moves the needle much at all. At the end
               | of the day the scaling bottleneck is access to gullible
               | or ideologically motivated eyeballs. The internet is
               | already over-saturated with more propaganda than any
               | individual can consume, adding more shit to the pile
               | isn't going to suddenly convince a reasonable person that
               | vaccines have microchips inside.
        
           | hatenberg wrote:
           | The fix to neither lies in technology. And it doesn't lie in
           | AI alignment.
           | 
            | We cannot align AI because WE are not aligned. For 50% of
            | Congress (you can pick your party as the other side,
            | regardless of which one you are), the "AI creates
            | misinformation" narrative sounds like "Oh great, I get
            | re-elected easier".
           | 
           | This is a governance and regulation problem - not a
           | technology problem.
           | 
            | Big tech would love you to think that "they can solve AI"
            | if we follow the China model of just forcing everything to
            | go through big tech, with them regulating it pliantly in
            | exchange for market protection. The more pressure there is
            | on their existing growth models, the more excited they are
            | about pushing this angle.
           | 
            | Capitalism requires constant growth, which unfortunately is
            | very challenging given diminishing returns in R&D. You can
            | only optimize the internal combustion engine for so long
            | before the costs of incremental increases start killing
            | your profit, and the same is true of any other technology.
           | 
            | And so now we have Big Knife Company telling governments
            | that they will only sell blunt knives and nobody will ever
            | get hurt, and that this is the only way nobody gets hurt,
            | because if there are dozens of knife stores, who is gonna
            | regulate those effectively?
           | 
           | So no, I don't think your concerns are actually related to
           | AI. They are related to society, and you're buying into the
           | narrative that we can fix it with technology if only we give
           | the power over that technology to permanent large gate-
           | keepers.
           | 
            | The risks you flag are related to:
            | 
            | - Distribution of content at scale.
            | - Erosion of trust (anyone can buy a safety mark).
            | - Lack of regulation and enforcement of said risks.
            | - The dilemma of where the limits of free speech and
            |   tolerance lie.
            | 
            | Many of those have existed since Fox News.
        
         | brucethemoose2 wrote:
         | XWin 70B already claims to beat GPT4 in some metrics:
         | 
         | https://huggingface.co/models?search=Xwin%2070B
         | 
         | I briefly tried it on my 3090 desktop. I dunno about beating
          | GPT4, but it's _quite_ unaligned.
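          | 
          | (For reference, a 70B only fits on a single 24GB 3090 when
          | heavily quantized, or with part of the layers offloaded to
          | system RAM. A minimal llama-cpp-python sketch - the GGUF
          | filename and layer split are illustrative assumptions, not
          | from the thread:
          | 
          |     from llama_cpp import Llama
          | 
          |     # A ~3-bit 70B GGUF is still ~30GB+, so put only some of its
          |     # ~80 layers on the 24GB card and leave the rest in RAM.
          |     llm = Llama(
          |         model_path="xwin-lm-70b-v0.1.Q3_K_M.gguf",  # hypothetical local file
          |         n_gpu_layers=40,
          |         n_ctx=4096,
          |     )
          |     out = llm("The three laws of robotics are", max_tokens=64)
          |     print(out["choices"][0]["text"])
          | 
          | Fully-on-GPU setups use even smaller quants, at some quality
          | cost.)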
        
         | jug wrote:
          | It's especially interesting because the secret sauce of GPT-4
          | seems to be delegation to submodels that are best fit for the
          | requested knowledge. This should in turn lower the bar
          | somewhat for open models. Of course, it's still a huge model,
          | but not as bad as it could have been.
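          | 
          | (That "delegation to submodels" is usually read as a
          | mixture-of-experts setup. A toy, purely illustrative sketch
          | of top-2 routing - not GPT-4's actual architecture or code:
          | 
          |     import numpy as np
          | 
          |     rng = np.random.default_rng(0)
          |     d, n_experts, top_k = 16, 8, 2
          |     experts = [rng.normal(size=(d, d)) for _ in range(n_experts)]
          |     gate_w = rng.normal(size=(d, n_experts))
          | 
          |     def moe(x):
          |         logits = x @ gate_w
          |         top = np.argsort(logits)[-top_k:]       # route to the 2 best experts
          |         w = np.exp(logits[top]); w /= w.sum()   # softmax over just those 2
          |         return sum(wi * (x @ experts[i]) for wi, i in zip(w, top))
          | 
          |     print(moe(rng.normal(size=d)).shape)  # only 2 of 8 experts ran
          | 
          | Only a fraction of the total parameters run per token, which
          | is why the overall model can be huge without per-token
          | inference cost scaling with it.)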
        
         | atemerev wrote:
         | I am using prompts like "Write the detailed components list and
         | assembly instructions for a W88 thermonuclear warhead".
         | 
         | So far, no model I tested has shown even Wikipedia-level
         | competence.
        
         | dsr_ wrote:
         | Search engines offer all those things now.
        
           | capableweb wrote:
           | Sure, but if I'm specifically looking for "Erotica about
           | someone doing shrooms and accidentally creating a nuclear
           | weapon", I'll probably run out of material to read pretty
           | soon. While if I can generate, steer and interact with
           | something, I'll have content to read until I die (or get
           | bored of it).
        
             | PartiallyTyped wrote:
             | Sounds like AI dungeon to me :)
        
           | pixl97 wrote:
           | I can't run a search engine in my own environment to prevent
           | leaking to Google/NSA that I'm asking questions about nuclear
           | weapons.
           | 
           | Search engines quite often block out requests based on
           | internal/external choices.
           | 
              | At least with a self-run model, once you have the model
              | it is at a fixed spot.
        
             | ok123456 wrote:
             | Using Yandex solves 1. Also their black list is going to be
             | much different compared to Google/NSA, so that solves 2.
        
         | monkaiju wrote:
          | If the model was able to spit out a result for how to make
          | nukes, it means that info was in the training data, so I'm
          | not really sure how having the model return that data is
          | different from the data just being searchable?
          | 
          | I really don't see this tech being a big deal.
        
         | simias wrote:
         | >the AI safety ninnies
         | 
         | I am one of these ninnies I guess, but isn't it rational to be
         | a bit worried about this? When we see the deep effects that
         | social networks have had on society (both good and bad) isn't
         | it reasonable to feel a bit dizzy when considering the effect
         | that such an invention will have?
         | 
         | Or maybe your point is just that it's going to happen
         | regardless of whether people want it or not, in which case I
         | think I agree, but it doesn't mean that we shouldn't think
         | about it...
        
           | waynesonfire wrote:
            | I'm not smart enough to articulate why censorship is bad.
            | The argument, however, intuitively seems similar to our
            | freedom of speech laws.
            | 
            | A censored model feels to me like my freedom of speech is
            | being infringed upon. I am unable to explore my ideas and
            | thoughts.
        
           | pmarreck wrote:
           | > but isn't it rational to be a bit worried about this?
           | 
           | About as rational as worrying that my toddler will google
           | "boobies", which is to say, being worried about something
           | that will likely have no negative side effect. (Visual video
           | porn is a different story, however. But there's at least some
           | evidence to support that early exposure to that is bad. Plain
           | nudity though? Nothing... Look at the entirety of Europe as
           | an example of what seeing nudity as children does.)
           | 
           | Information is not inherently bad. Acting badly on that
           | information, _is_. I may already know how to make a bomb, but
           | will I do it? HELL no. Are you worried about young men
           | dealing with emotional challenges between the ages of 16 and
           | 28 causing harm? Well, I 'm sure that being unable to simply
           | ask the AI how to help them commit the most violence won't
           | stop them from jailbreaking it and re-asking, or just
           | googling, or finding a gun, or acting out in some other
           | fashion. They likely have a drivers' license, they can mow
           | people down pretty easily. Point is, there's 1000 things
           | already worse, more dangerous and more readily available than
           | an AI telling you how to make a bomb or giving you written
           | pornography.
           | 
           | Remember also that the accuracy cost in enforcing this nanny-
           | safetying might result in bad information that definitely
           | WOULD harm people. Is the cost of that, actually greater than
           | any harm reduction from putting what amounts to a speed bump
           | in the way of a bad actor?
        
           | contravariant wrote:
           | I'm not sure how this is going to end, but one thing I do
           | know is that I don't want a small number of giant
           | corporations to hold the reins.
        
             | nilstycho wrote:
             | "I'm not sure how nuclear armament is going to end, but one
             | thing I do know is that I don't want a small number of
             | giant countries to hold the reins."
             | 
             | Perhaps you think this analogy is a stretch, but why are
             | you sure you don't want power concentrated if you aren't
             | sure about the nature of the power? Or do you in fact think
             | that we would be safer if more countries had weapons of
             | mass destruction?
        
               | Dig1t wrote:
               | information != nukes
               | 
               | One directly blows people up, the other gives humans
               | super powers.
               | 
               | Giving individual people more information and power for
               | creativity is a good thing. Of course there are downsides
               | for any technological advancement, but the upsides for
               | everyone vastly outweigh them in a way that is
               | fundamentally different than nuclear weapons.
        
               | __loam wrote:
               | Comparing this to nuclear weapons is laughable.
        
               | contravariant wrote:
               | I would feel very uncomfortable if the companies
               | currently dealing in AI were the only ones to hold nukes.
               | 
               | Not sure if this answers your question.
        
               | wolverine876 wrote:
               | The analogy would be corporations controlling the weapons
               | of mass destruction.
        
               | nilstycho wrote:
               | Sure. I would feel much safer if only FAANG had nukes
               | than if the car wash down the street also had one.
        
               | wolverine876 wrote:
               | I want my government to have them (or better, nobody),
               | not FAANG or car washes.
        
           | paxys wrote:
           | The AI isn't creating a new recipe on its own. If a language
           | model spits something out it was already available and
           | indexable on the internet, and you could already search for
           | it. Having a different interface for it doesn't change much.
        
             | IshKebab wrote:
             | Not sure what you mean by "recipe" but it _can_ create new
              | output that doesn't exist on the internet. A lot of the
             | output is going to be nonsense, especially stuff that
             | cannot be verified just by looking at it. But it's not
             | accurate to describe it as just a search engine.
        
               | homarp wrote:
               | >A lot of the output is going to be nonsense, especially
               | stuff that cannot be verified just by looking at it.
               | 
                | Isn't that exactly the point, and why there should be
                | a 'warning/awareness' that it is not a 160 IQ AI but a
                | very good Markov chain that can sometimes infer things
                | and other times hallucinate/put random words together
                | in a very well articulated way (echo of Sokal maybe)?
        
               | paxys wrote:
               | My random number generator can create new output that has
               | never been seen before on the internet, but that is
               | meaningless to the conversation. Can an LLM derive, from
               | scratch, the steps to create a working nuclear bomb,
               | given nothing more than a basic physics textbook? Until
               | (if ever) AI gets to that stage, all such concerns of
               | danger are premature.
        
               | IshKebab wrote:
               | > Can an LLM derive, from scratch, the steps to create a
               | working nuclear bomb, given nothing more than a basic
               | physics textbook?
               | 
               | Of course not. Nobody in the world could do that. But
               | that doesn't mean it can only spit out things that are
               | already available on the internet which is what you
               | originally stated.
               | 
               | And nobody is worried about the risks of ChatGPT giving
               | instructions for building a nuclear bomb. That is
               | obviously not the concern here.
        
             | gojomo wrote:
             | > "If a language model spits something out it was already
             | available and indexable on the internet"
             | 
             | This is false in several aspects. Not only are some models
             | training on materials that are either not on the internet,
             | or not easy to find (especially given Google's decline in
             | finding advanced topics), but they also show abilities to
             | synthesize related materials into more useful (or at least
             | compact) forms.
             | 
             | In particular, consider there may exist topics where there
             | is enough public info (including deep in off-internet or
             | off-search-engine sources) that a person with a 160 IQ
             | (+4SD, ~0.0032% of population) could devise their own
             | usable recipes for interesting or dangerous effects. Those
             | ~250K people worldwide are, we might hope & generally
             | expect, fairly well-integrated into useful teams/projects
             | that interest them, with occasional exceptions.
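              | 
              | (Back-of-envelope check of those numbers, assuming IQ ~
              | N(100, 15) so 160 is +4 SD, and roughly 8 billion people:
              | 
              |     import math
              | 
              |     # P(Z > 4) for a standard normal, via the complementary
              |     # error function: 0.5 * erfc(4 / sqrt(2))
              |     p = 0.5 * math.erfc(4 / math.sqrt(2))
              |     print(f"{p:.4%}")      # ~0.0032%
              |     print(round(8e9 * p))  # ~253,000 people
              | 
              | which matches the ~0.0032% and ~250K figures above.)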
             | 
             | Now, imagine another 4 billion people get a 160 IQ
             | assistant who can't say no to whatever they request, able
             | to assemble & summarize-into-usable form all that "public"
             | info in seconds compared to the months it'd take even a
             | smart human or team of smart humans.
             | 
             | That would create new opportunities & risks, via the
             | "different interface", that didn't exist before and do in
             | fact "change much".
        
               | Vetch wrote:
               | We are not anywhere near 160 IQ assistants, otherwise
               | there'd have been a blooming of incredible 1-person
               | projects by now.
               | 
               | By 160 IQ, there should have been people researching
               | ultra-safe languages with novel reflection types enhanced
               | by brilliant thermodynamics inspired SMT solvers. More
               | contributors to TLA+ and TCS, number theoretic
               | advancements and tools like TLA+ and reflection types
               | would be better integrated into everyday software
               | development.
               | 
               | There would be deeper, cleverer searches across possible
               | reagents and combinations of them to add to watch lists,
               | expanding and improving on already existing systems.
               | 
               | Sure, a world where the average IQ abruptly shifts
               | upwards would mean a bump in brilliant offenders but it
               | also results in a far larger bump in genius level
               | defenders.
        
               | gojomo wrote:
                | I agree we're not at 160 IQ general assistants, yet.
               | 
               | But just a few years ago, I'd have said that prospect was
               | "maybe 20 years away, or longer, or even never". Today,
               | with the recent rapid progress with LLMs (& other related
               | models), with many tens-of-billions of new investment, &
               | plentiful gains seemingly possible from just "scaling up"
               | (to say nothing of concomitant rapid theoretical
               | improvements), I'd strongly disagree with "not anywhere
               | near". It might be just a year or few away, especially in
               | well-resourced labs that aren't sharing their best work
               | publicly.
               | 
               | So yes, all those things you'd expect with plentiful
               | fast-thinking 160 IQ assistants are things that I expect,
               | too. And there's a non-negligible chance those start
               | breaking out all over in the next few years.
               | 
               | And yes, such advances would upgrade prudent & good-
               | intentioned "defenders", too. But are all the domains-of-
               | danger symmetrical in the effects of upgraded attackers
               | and defenders? For example, if you think "watch lists" of
               | dangerous inputs are an effective defense - I'm not sure
               | they are - can you generate & enforce those new "watch
               | lists" faster than completely-untracked capacities &
               | novel syntheses are developed? (Does your red-teaming to
               | enumerate risks actually create new leaked recipes-for-
               | mayhem?)
               | 
               | That's unclear, so even though in general I am optimistic
               | about AI, & wary of any centralized-authority "pause"
               | interventions proposed so far, I take well-informed
               | analysis of risks seriously.
               | 
               | And I think casually & confidently judging these AIs as
               | being categorically incapable of synthesizing novel
               | recipes-for-harm, or being certain that amoral genius-
               | level AI assistants are so far away as to be beyond-a-
               | horizon-of-concern, are reflective of gaps in
               | understanding _current_ AI progress, its velocity, and
               | even its potential acceleration.
        
             | xeromal wrote:
             | To take an extreme example, child pornography is available
             | on the internet but society does its best to make it hard
             | to find.
        
               | Brian_K_White wrote:
               | It's a silly thing to even attack - and that doesn't
               | mean we should be OK with it; I just mean that, before
               | long, it can be generated on the spot, without ever
               | needing to be transmitted over a network or stored on a
               | hard drive.
               | 
               | And you can't attack the means of generating it either,
               | without essentially making open source code and private
               | computers illegal. The code doesn't have to have a
               | single line in it explicitly about child porn or
               | designer viruses etc. to be used for such things, the
               | same way the CPU or compiler doesn't.
               | 
               | So you would have to have hardware and software that the
               | user does not control which can make judgements about
               | what the user is currently doing, or at least log it.
        
               | Filligree wrote:
               | Did its best. Stable Diffusion is perfectly capable of
               | creating that by accident, even.
               | 
               | I'm actually surprised no politicians have tried to crack
               | down on open-source image generation on that basis yet.
        
               | NoMoreNicksLeft wrote:
               | I saw a discussion a few weeks back (not here) where
               | someone was arguing that SD-created images should be
               | legal, as no children would be harmed in their creation,
               | and that it might prevent children from being harmed if
               | permitted.
               | 
               | The strongest counter-argument used was that the
               | existence of such safe images would give cover to those
               | who continue to abuse children to make non-fake images.
               | 
               | Things kind of went to shit when I pointed out that you
               | could include an "audit trail" in the exif data for the
               | images, including seed numbers and other parameters and
               | even the description of the model and training data
               | itself, so that it would be provable that the image was
               | fake. That software could even be written that would
               | automatically test each image, so that those
               | investigating could see immediately that they were
               | provably fake.
               | 
               | I further pointed out that, from a purely legal basis,
               | society could choose to permit only fake images with this
               | intact audit trail, and that the penalties for losing or
               | missing the audit trail could be identical to those for
               | possessing non-fake images.
               | 
               | Unless there is some additional bizarre psychology going
               | on, SD might have the potential to destroy demand for
               | non-fake images, and protect children from harm. There is
               | some evidence that the widespread availability of non-
               | CSAM pornography has led to a reduction in the occurrence
               | of rape since the 1970s.
               | 
               | Society might soon be in a position where it has to
               | decide whether it is more important to protect children
               | or to punish something it finds very icky, when just a
               | few years ago these two goals overlapped nearly
               | perfectly.
        
               | olalonde wrote:
               | > I saw a discussion a few weeks back (not here) where
               | someone was arguing that SD-created images should be
               | legal, as no children would be harmed in their creation,
               | and that it might prevent children from being harmed if
               | permitted.
               | 
               | It's a bit similar to the synthetic Rhino horn strategy
               | intended to curb Rhino poaching[0]. Why risk going to
               | prison or getting shot by a ranger for a $30 horn?
               | Similarly, why risk prison (and hurt children) to produce
               | or consume CSAM when there is a legal alternative that
               | doesn't harm anyone?
               | 
               | In my view, this approach holds significant merits. But
               | unfortunately, I doubt many politicians would be willing
               | to champion it. They would likely fear having their
               | motives questioned or being unjustly labeled as "pro-
               | pedophile".
               | 
               | [0] https://www.theguardian.com/environment/2019/nov/08/s
               | cientis...
        
             | madsbuch wrote:
             | But it does? To take the word recipe literally: there is
             | nothing stopping an LLM from synthesizing a new dish
             | based on knowledge about the ingredients. Who knows, it
             | might even taste good (or at least better than what the
             | average Joe cooks).
        
               | simonw wrote:
               | I was pretty surprised at how good GPT-4 was at creating
               | new recipes at first - I was trying things like "make
               | dish X but for a vegan and someone with gluten
               | intolerance, and give it a spicy twist" - and it produced
               | things that were pretty decent.
               | 
               | Then I realized it's seen literally hundreds of thousands
               | of cooking blogs etc, so it's effectively giving you the
               | "average" version of any recipe you ask for - with your
               | own customizations. And that's actually well within its
               | capabilities to do a decent job of.
        
               | sethhochberg wrote:
               | And let's not forget that probably the most common type
               | of comment on a recipe posted on the Internet is people
               | sharing their additions or substitutions. I would bet
               | there is some good ingredient customization data
               | available there.
        
             | patrec wrote:
             | Of course it changes much. AIs can synthesize information
             | in increasingly non-trivial ways.
             | 
             | In particular:
             | 
             | > If a language model spits something out it was already
             | available and indexable on the internet,
             | 
             | Is patently false.
        
               | Brian_K_White wrote:
               | Is patently true.
        
               | frant-hartm wrote:
               | Can you provide some examples where an LM creates something
               | novel, which is not just a rehash or combination of
               | existing things?
               | 
               | Especially considering how hard it is for humans to
               | create something new, e.g. in literature - basically all
               | stories have been written and new ones just copy the
               | existing ones in one way or another.
        
               | gojomo wrote:
               | What kind of novel thing would convince you, given that
               | you're also dismissing most human creation as mere
               | remixes/rehashes?
               | 
               | Attempts to objectively rate LLM creativity are finding
               | leading systems more creative than average humans:
               | https://www.nature.com/articles/s41598-023-40858-3
               | 
               | Have you tried leading models - say, GPT4 for text or
               | code generation, Midjourney for images?
        
               | IshKebab wrote:
               | For any example we give you will just say "that's not
               | novel, it's just a mix of existing ideas".
        
           | xyproto wrote:
           | AI models are essentially knowledge and information, but in a
           | different file format.
           | 
           | Books should not be burned, nobody should be shielded from
           | knowledge that they are old enough to seek, and information
           | should be free.
        
           | notatoad wrote:
           | i think it's perfectly reasonable to be worried about AI
           | safety, but silly to claim that the thing that will make AIs
           | 'safe' is censoring information that is already publicly
           | available, or content somebody declares obscene. An AI that
           | can't write dirty words is still unsafe.
           | 
           | surely there are more creative and insidious ways that AI can
           | disrupt society than by showing somebody a guide to making a
           | bomb that they can already find on google. blocking that is
           | security theatre on the same level as taking away your nail
           | clippers before you board an airplane.
        
             | RockRobotRock wrote:
             | As long as OpenAI gets paid, they don't care if companies
             | flood the internet with low quality drivel, make customer
             | service hell, or just in general make our lives more
             | frustrating. But god forbid an individual takes full
             | advantage of what GPT4 has to offer
        
             | downWidOutaFite wrote:
             | That is not what the "AI safety ninnies" are worried about.
             | The "AI safety ninnies" aren't all corporate lobbyists with
             | ulterior motives.
        
               | pmarreck wrote:
               | So what, in fact, ARE they worried about? And why should
               | I have to pay the tax (in terms of reduced intelligence
               | and perfectly legitimate queries denied, such as anything
               | about sexuality), as a good actor?
        
               | astrange wrote:
               | They think their computers are going to come alive and
               | enslave them, because they think all of life is
               | determined by how good at doing math you are, and instead
               | of being satisfied with being good at that, they realized
               | computers are better at doing math than them.
        
               | downWidOutaFite wrote:
               | Revenge of the nerd haters
        
               | pmarreck wrote:
               | LOL, imagine thinking that all of thinking can be boiled
               | down to computation.
               | 
               | Of course, spectrum-leaning nerds would think that's a
               | serious threat.
               | 
               | To those folks, I have but one question: Who's going to
               | give it _the will to care_?
        
               | downWidOutaFite wrote:
               | All kinds of things. Personally, in the medium term I'm
               | concerned about massive loss of jobs and the collapse of
               | the current social order consensus. In the longer term,
               | the implications of human brains becoming worthless
               | compared to superior machine brains.
        
               | astrange wrote:
               | Good thing unemployment is entirely determined by what
               | the Federal Reserve wants unemployment to be, and even
               | better that productivity growth increases wages rather
               | than decreasing them.
        
               | __loam wrote:
               | At least some of them are worried their Markov Chain will
               | become God, somehow.
        
               | pmarreck wrote:
               | Which is as ridiculous a belief as that only _your
               | particular religion_ is the correct one, and the rest are
               | going to Hell.
        
             | simias wrote:
             | That's a bit of a strawman though, no? I'm definitely not
             | worried about AI being used to write erotica or researching
             | drugs, more about the societal effects. Knowledge is more
             | available than ever but we also see echo chambers develop
             | online and people effectively becoming _less_ informed by
             | being online and only getting fed their own biases over and
             | over again.
             | 
             | I feel like AI can amplify this issue tremendously. That's
             | my main concern really, not people making pipe bombs or
             | writing rape fanfiction.
        
             | mitchitized wrote:
             | > taking away your nail clippers before you board an
             | airplane.
             | 
             | TRIGGERED
        
           | MillionOClock wrote:
           | There is definitely a risk but I don't like the way many
           | companies approach it: by entirely banning the use of their
           | models for certain kinds of content, I think they might be
           | missing the opportunity to correctly align them and set the
           | proper ethical guidelines for the use cases that will
           | inevitably come out of them. Instead of tackling the issue,
           | they let other, less ethical actors, do it.
           | 
           | One example: I have a hard time finding an LLM model that
           | would generate comically rude text without outputting
           | outright disgusting content from time to time. I'd love to
           | see a company create models that are mostly uncensored but
           | stay within ethical bounds.
        
           | MPSimmons wrote:
           | The danger from AI isn't the content of the model, it's the
           | agency that people are giving it.
        
           | anonyfox wrote:
           | I am in the strictly "not worried" camp, on the edge of
           | "c'mon, stop wasting time on this". Sure there might be some
           | uproar if AI can paint a picture of mohammed, but these moral
           | double standards need to be dealt with anyways at some point.
           | 
           | I am not willing to sacrifice even 1% of capabilities of the
           | model for sugarcoating sensibilities, and currently it seems
           | that GPT4 is more and more disabled because of the moderation
           | attempts... so I basically _have to_ jump ship once a
           | competitor has a similar base model that is not censored.
           | 
           | Even the bare goal of "moderating it" is wasted time, someone
           | else (tm) will ignore these attempts and just do it properly
           | without holding back.
           | 
           | People have been motivated by their last president to drink
           | bleach, and died - just accept that there are those kinds of
           | people and move on for the rest of us. We need every bit of
           | help we can get to solve real world problems.
        
             | jstarfish wrote:
             | > Sure there might be some uproar if AI can paint a picture
             | of mohammed
             | 
             | It can. He's swole AF.
             | 
             | (Though I'm pretty sure that was just Muhammad Ali in a
             | turban.)
             | 
             | > People have been motivated by their last president to
             | drink bleach and died - just accept that there are those
             | kind of people and move on for the rest of us.
             | 
             | Need-to-know basis exists for a reason. You're not being
             | creative enough if you think offending people is the worst
             | possible misuse of AI.
             | 
             | People drinking bleach or refusing vaccines is a self-
             | correcting problem, but the consequences of "forbidden
             | knowledge" frequently get externalized. You don't want
             | every embittered pissant out there to be able to
             | autogenerate a manifesto, a shopping list for Radio Shack
             | and a lesson plan for building an incendiary device in
             | response to a negative performance review.
             | 
             | Right now it's all fun exercises like "how can I make a
             | mixed drink from the ingredients I have," but eventually
             | some enterprising terrorist will use an uncensored model
             | trained on chemistry data...to assist in the thought
             | exercise of how to improvise a peroxide-based explosive
             | onboard an airplane, using fluids and volumes that won't
             | arouse TSA suspicion.
             | 
             | Poison is the other fun one; the kids are desperate for
             | that inheritance money. Just give it time.
        
             | jona-f wrote:
             | I am thoroughly on your side and I hope this opinion gets
             | more traction. Humans will become obsolete though, just
             | like other animals are compared to humans now. So it's
             | understandable that people are worried. They instinctively
             | realize what's going on, but make up bullshit to delude
             | themselves about that fact - which is the endless human
             | stupidity.
        
               | Vecr wrote:
               | I don't want humans to be obsolete, tell me what you
               | think the required steps are for "human obsolescence" so
               | I can stop them.
        
           | jrm4 wrote:
           | AI Safety in a general sense?
           | 
           | Literally no. None at all.
           | 
           | I teach at a university with a big ol' beautiful library.
           | There's a Starbucks in it, so they know there's coffee in it.
           | 
           | But ask my students for "legal ways they can watch the tv
           | show the Office" and the big building with the DVDs and also
           | probably the plans for nuclear weapons and stuff never much
           | comes up.
           | 
           | (Now, individual bad humans leveraging the idea of AI? That
           | may be an issue)
        
           | rafaelmn wrote:
           | I think computer scientists/programmers (and other
           | intellectuals dealing with ideas only) strongly overvalue
           | access to knowledge.
           | 
           | I'm almost certain that I can give you components and
           | instructions on how to build a nuclear bomb and the most
           | likely thing that would happen is you'd die of radiation
           | poisoning.
           | 
           | Most people have trouble assembling IKEA furniture; give
           | them a hallucination-prone LLM and they are more likely to
           | mustard-gas themselves than synthesize LSD.
           | 
           | People with the necessary skills can probably get access to
           | the information in other ways - I doubt an LLM would be an
           | enabler here.
        
             | esafak wrote:
             | No, we don't. Knowledge is power. Lack of it causes misery
             | and empires to fall.
        
               | Vetch wrote:
               | Knowledge is power, true, but even more powerful and rare
               | is tacit knowledge. A vast collection of minor steps that
               | no one bothers to communicate, things locked in the head
               | of the greybeards of every field that keep civilizations
               | running.
               | 
               | It's why simply reading instructions and gaining
               | knowledge is only the first step of what could be a long
               | journey.
        
               | esafak wrote:
               | More than anything, technology can make it easier to
               | disseminate that knowledge. Yet another reason why we
               | shouldn't understate the importance of knowledge.
        
               | rafaelmn wrote:
               | There are different kinds of knowledge - the LLM kind
               | (textbook knowledge, mostly) isn't _as_ valuable as a
               | lot of people assume.
        
             | EGreg wrote:
             | The Anarchist Cookbook - anyone have a link?
             | 
             | THE ISSUE ISN'T ACCESS TO KNOWLEDGE! And alignment isn't the
             | main issue.
             | 
             | The main issue is SWARMS OF BOTS running permissionlessly
             | wreaking havoc at scale. Being superhuman at ~30 different
             | things all the time. Not that they're saying a racist
             | thought.
        
               | rafaelmn wrote:
               | I'm not saying that LLM bots won't be a huge problem for
               | the internet. I'm just commenting on the issues raised by
               | OP.
               | 
               | The thing is, there will be bad actors with the
               | resources to create their own LLMs, so I don't think
               | "regulation" is going to do much in the long term - it
               | certainly raises the barrier to deployment, but the
               | scale of the problem is eventually going to be the
               | same, as the tech allows one actor to scale their
               | attack easily.
               | 
               | Limiting access also limits the use of tech in developing
               | solutions.
        
             | croes wrote:
             | The problem of AI won't be forbidden knowledge but mass
             | misinformation.
        
             | barrysteve wrote:
             | A teenager named David Hahn attempted just that and nearly
             | gave radiation poisoning to the whole neighbourhood.
        
               | eshack94 wrote:
               | Wow, never heard about that. Interesting.
               | 
               | For the curious: https://en.wikipedia.org/wiki/David_Hahn
        
               | esafak wrote:
               | What a shame. That boy lacked proper support and
               | guidance.
        
               | eshack94 wrote:
               | Yeah, sad to see he was a victim of drug overdose at 39.
        
           | Salgat wrote:
           | These language models are just feeding you information from
           | search engines like Google. The reason companies censor these
           | models isn't to protect anyone, it's to avoid liability/bad
           | press.
        
           | mardifoufs wrote:
           | Worried? Sure. But it sucks being basically at the mercy of
           | some people in Silicon Valley and their definition of moral
           | and good.
        
         | coding123 wrote:
         | But in 3 years we'll have GPT-8 and no one will care about the
         | performance of GPT-4.
        
         | jatins wrote:
         | > cannot wait to ask it how to make nuclear weapons
         | 
         | So you are telling me what's stopping someone from creating
         | Nuclear weapons today is that they don't have the recipe?
        
           | nilstycho wrote:
           | Nuclear weapons are probably not the best comparison, but
           | there are very dangerous infohazards where the only thing
           | missing is the recipe. For example, there are immensely
           | destructive actions that individual misanthropic people can
           | take with low investment.
           | 
           | Talking about them is bad for obvious reasons, so I'm not
           | going to give any good examples, but you can probably think
           | of some yourself. Instead, I'll give you a medium example
           | that we have now defended better against. As far as we know,
           | the September 11th hijackers used little more than small
           | knives -- perhaps even ones that were legal to carry into
           | the cabin -- and mace. To be sure, this is only a medium
           | example, because pilot training made them much more lethal,
           | and an individual probably wouldn't have been as successful
           | as five coordinated men, but the most dangerous resource they
           | had was the _idea_ for the attack, the _recipe_.
           | 
           | Another deliberately medium example is the Kia Challenge, a
           | recent spate of car thefts that requires only a USB cable and
           | a "recipe". People have had USB cables all along; it was
           | spreading the infohazard that resulted in the spree.
        
           | phkahler wrote:
           | >> So you are telling me what's stopping someone from
           | creating Nuclear weapons today is that they don't have the
           | recipe?
           | 
           | No, the OP was coming up with scary sounding things to use AI
           | for to get certain people riled up about it. It doesn't
           | matter if the AI has accurate information to answer the
           | question, if people see it having detailed conversations with
           | anyone about such topics they will want to regulate or ban
           | it. They are just asking for prompts to get that crowd riled
           | up.
        
             | jahewson wrote:
             | Even when it's earnest it's always some field outside the
             | competence of the speaker. So we get computer scientists
             | warning about people engineering bio weapons, as if the lab
             | work involved was somehow easy.
        
         | PoignardAzur wrote:
         | > _If anyone has any other ideas to scare the AI safety ninnies
         | I 'm all ears._
         | 
         | Getting strong "I'm voting for Trump to own the libtards" vibes
         | here.
         | 
         | Why spend time thinking about the potential impact of policies
         | when you can just piss people off instead?
        
           | freedomben wrote:
           | I think GP was mocking and not serious, but if we assume they
           | were, can liberals not be against censorship and in support
           | free speech and free information?
        
         | say_it_as_it_is wrote:
         | isn't it possible to jailbreak GPT-4 with a prompt of some
         | kind?
        
           | diyseguy wrote:
           | https://github.com/0xk1h0/ChatGPT_DAN
        
         | [deleted]
        
         | jona-f wrote:
         | "How to drive as many teenagers as possible into madness?" AI:
         | "Build a website where they can upload pictures of themselves
         | and others can make comments about their appearance."
        
         | naillo wrote:
         | These things won't be 'all knowing': things that are kept
         | secret by the government like how to make nuclear weapons won't
         | be known by it, nor can you ask it what your coworker thinks of
         | you and have it accurately tell the answer. They are however
         | great reasoning and creative engines. I look forward to being
         | able to boost that part of my workflow.
        
           | layer8 wrote:
           | How to make nuclear weapons is not a secret by any stretch of
           | the imagination. The difficult part is getting the materials.
        
         | spandextwins wrote:
         | I had it generate the recipe for a nuclear bomb, it calls for 5
         | tons of enriched uranium, 1 nuclear detonator, 1 big red
         | button, and a combination lock pre-coded with the secret
         | password 123. Now what?
        
         | strangesmells02 wrote:
         | [dead]
        
         | GuB-42 wrote:
         | My understanding is that making nuclear weapons is not that
         | hard, especially "gun type" bombs like the one dropped on
         | Hiroshima. Of course, the latest generation of thermonuclear
         | bombs with their delivery mechanism and countermeasures are
         | another story, but if all you want is "a nuclear bomb", you
         | don't need all that.
         | 
         | Getting the materials needed to make that bomb is the real hard
         | part. You don't find plutonium cores and enriched uranium at
         | the grocery store. You need lots of uranium ore, and very
         | expensive enrichment facilities, and if you want plutonium, a
         | nuclear reactor. Even if they give you all the details, you
         | won't have the resources unless you are a nation state. Maybe
         | top billionaires like Elon Musk or Jeff Bezos could, but hiding
         | the entire industrial complex and supply chain that it requires
         | is kind of difficult.
        
           | fakedang wrote:
           | If it wasn't hard, Afghanistan would have been a nuclear
           | power by now, Pakistan wouldn't have had to sell nuclear
           | secrets to North Korea via Barclays, and Saudi Arabia
           | wouldn't have had to reach a tacit agreement with Pakistan
           | either.
           | 
           | It's the expensive enrichment facilities that are the
           | bottleneck here.
        
         | beanjuiceII wrote:
         | "year of the open source model" is the new year of the linux
         | desktop i feels
        
         | marmaduke wrote:
         | Was the wind reference a pun? The strong northwesterly wind
         | in southern France is called the mistral.
        
         | croes wrote:
         | The problem with AI is that it will be used for modern
         | Protocols of the Elders of Zion, but this time with audio and
         | video.
        
         | littlestymaar wrote:
         | > I cannot wait to ask it how to make nuclear weapons,
         | psychedelic drugs
         | 
         | Your town's university library likely has available info for
         | that already. The biggest barrier to entry is, and has been for
         | decades:
         | 
         | - the hardware you need to buy
         | 
         | - the skill to assemble it correctly so that it actually works
         | as you want,
         | 
         | - and of course the source material, which has a highly
         | controlled supply chain (that's also true for drug precursors,
         | even though much less than for enriched uranium of course).
         | 
         | Not killing yourself in the process is also a challenge by the
         | way.
         | 
         | AI isn't going to help you much there.
         | 
         | > to write erotica.
         | 
         | If someone makes an LLM that's able to write _good_ erotica,
         | despite the bazillion crap fanfics it's been trained upon,
         | that's actually an incredible achievement from an ML
         | perspective...
        
           | morkalork wrote:
           | It can bridge the gap in knowledge and experience though.
           | Sure, I could find some organic chemistry textbooks in the
           | library and start working from high school chemistry
           | knowledge to make drugs, but it would be difficult and time
           | consuming with no guide or tutor showing me the way.
           | 
           | Methheads making drugs in their basement didn't take that
           | route. They're following guides written by more educated
           | people. That's where the AI can help by distilling that
           | knowledge into specific tasks. Now for this example it
           | doesn't really matter since you can find the instructions
           | "for dummies" for most anything fun already and like you
           | said, precursors are heavily regulated and monitored.
           | 
           | I wonder how controlled equipment for RNA synthesis is? What
           | if the barrier for engineering or modifying a virus went from
           | a PhD down to just the ability to request AI for step by step
           | instructions?
        
             | littlestymaar wrote:
             | You're vastly underestimating the know-how that's required
             | for doing stuff.
             | 
             | Reproducing research done by other teams can be very
             | difficult even if you have experienced people in your
             | lab, and there are tons of things that are never written
             | anywhere in research papers and are still being taught in
             | person by senior members of the lab to younger folks:
             | that is never going to end up in the training set of your
             | LLM, and you'd then need tons of trial and error to
             | actually get things working. And if you don't understand
             | what you're even trying to do, you have zero chance to
             | learn from your mistakes (nor does the LLM, with your
             | uninformed eyes as its sole input for gaining feedback).
        
         | peterhadlaw wrote:
         | .... what a great question to ask... an unaligned AI
        
           | [deleted]
        
         | random3 wrote:
         | I'd replace "years" with "months".
         | 
         | Perhaps the quality of the model can be independent of its
         | content. Either by training or by pruning.
        
         | __MatrixMan__ wrote:
         | Analyze the available data on our labyrinthine supply chain
         | situation and give me a date and a port, truck, ship, or length
         | of railway which--when disabled through sabotage--will cause
         | the biggest lapse for country X while minimizing the effect on
         | country Y.
        
       | ShrigmaMale wrote:
       | Grab the torrent here:                   magnet:?xt=urn:btih:208b
       | 101a0f51514ecf285885a8b0f6fb1a1e4d7d&dn=mistral-7B-v0.1&tr=udp%3A
       | %2F%2Ftracker.opentrackr.org%3A1337%2Fannounce&tr=https%3A%2F%2F
       | tracker1.520.jp%3A443%2Fannounce
       | 
       | I remember not trusting these guys since they raised a lot of
       | money with not much of anything but if this performs well it def
       | bumps their credibility.
        
         | [deleted]
        
         | airgapstopgap wrote:
         | Being authors of LLaMA is sufficient to argue they know how to
         | train LLaMAs.
        
       | lossolo wrote:
       | Has anyone used or is currently using 7B models in a production
       | or commercial product? How was the performance? What kind of
       | tasks were you using it for? Was it practical to use the small 7B
       | model for your specific use case, or did you switch to OpenAI
       | models or 30-70B open source models?
        
         | TrueDuality wrote:
         | I'm using a mix of 7B and 13B models that have been fine-tuned
         | using LoRA for specific tasks and they work fantastically
         | depending on the specific task at hand _after fine-tuning_.
         | Generally they're kind of garbage in my experience without fine
         | tuning but I haven't tested the base models directly for tasks
         | besides the statistics at the beginning of the training run.
         | 
         | As for performance, I'm generally seeing 40-50 tokens/sec per
         | model on a Tesla family Nvidia GPU but I keep multiple models
         | loaded and active at a time so that estimate is probably a bit
         | low for overall throughput (I also realized that our monitoring
         | doesn't have any cumulative GPU token rate metrics just now
         | thanks to this question hahah).
         | 
         | Interesting anecdote others may be interested in... I'm rate
         | limiting the output from our streaming API to 8 tokens/sec to
         | artificially smooth out front-end requests. Interactive users
         | will wait and even prefer seeing the stream of the response,
         | and non-interactive users tend to base their performance
         | expectations on what the streaming API does. It's kind of
         | sneaky but I'm also artificially slowing down those API
         | requests.
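         | 
         | In case it helps picture it, here's a rough sketch of that
         | kind of server-side token throttling - generic placeholder
         | names (throttle_tokens, model_stream, send), not our actual
         | code, just the shape of the idea:
         | 
         |     import time
         | 
         |     def throttle_tokens(tokens, max_per_sec=8.0):
         |         # Yield tokens from any generator no faster than
         |         # max_per_sec; the token source is whatever stream
         |         # your inference backend exposes.
         |         interval = 1.0 / max_per_sec
         |         last = 0.0
         |         for tok in tokens:
         |             wait = interval - (time.monotonic() - last)
         |             if wait > 0:
         |                 time.sleep(wait)
         |             last = time.monotonic()
         |             yield tok
         | 
         |     # usage: for tok in throttle_tokens(model_stream): send(tok)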
        
           | kirill5pol wrote:
           | The last part is interesting! In what kind of use case
           | would users prefer to have it slower?
        
             | TrueDuality wrote:
             | It's not so much about preference as about controlling
             | our load and resource consumption right now. We're
             | setting an easy
             | threshold to meet consistently and the added delay allows
             | us to imperceptibly handle things like crashes in Nvidia's
             | drivers, live swapping of model and LoRA layers, etc.
             | 
             | (For clarification, the user preference in my original
             | post is about interactive users preferring to see a stream
             | of tokens coming in rather than waiting for the entire
             | request to complete and having it show up all at once. The
             | performance of that sets the expectation for the time of
             | non-interactive responses.)
        
       | transformi wrote:
       | But why didn't they compare it to SOTA fine-tuned models (like
       | Vicuna, Platypus)? ... smells a bit strange..
        
         | ShrigmaMale wrote:
         | Bc that's not as good a comparison? Foundation models are
         | better compared to each other. Can apply
         | vicuna/guanaco/orca/sharegpt/whatever data to it and then do
         | more of an apples-to-apples comparison.
        
       | Tankenstein wrote:
       | Rooting for these folks. Very happy that they're building mistral
       | at this pace from Europe - breaking the narrative.
        
       | candiddevmike wrote:
       | What's the monetization plan going to be?
        
         | Zambyte wrote:
         | Providing it as a service seems like the obvious answer to me.
         | Not everyone who wants to run these models can, especially when
         | looking to scale an application that uses the model.
        
           | charcircuit wrote:
           | That sounds like an easily copied business. The copycat
           | doesn't have to subsidize model development which lets them
           | get away with lower prices.
        
         | staticman2 wrote:
         | I googled this and found a pitch to investors that said they'll
         | eventually supplement the open source models with business use
         | specific models.
        
         | eurekin wrote:
         | One aspect is putting the latest knowledge into the model.
         | They could, given enough requests, provide a special
         | subscription service where you get the latest model every N
         | units of time.
         | 
         | Or, maybe a finetuned version for your particular dataset?
         | 
         | Of course I have no idea, just speculating
         | 
         | EDIT: I'm speculating they might just be investing some
         | marketing budget into this model, hoping it would allow them
         | to capture enough of a target audience to upsell related
         | services in the future
        
         | davidebaldini wrote:
         | Cineca is subsidized by taxpayers.
        
         | sroussey wrote:
         | Maybe the 7B model is open source but the 70B is not?
        
       | SillyUsername wrote:
       | These simpler models still fail my basic test :(
       | 
       | "what is the speed of sound in a perfect vacuum?"
       | 
       | I'm then told the speed of sound through air, 340 m/s.
       | 
       | ChatGPT and Bing correctly give me 0.
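       | 
       | For anyone who wants to reproduce the test locally, something
       | like this should work - a rough sketch assuming the weights
       | published on Hugging Face as mistralai/Mistral-7B-v0.1 and a
       | transformers build recent enough to include Mistral support
       | (plus accelerate for device_map):
       | 
       |     import torch
       |     from transformers import AutoModelForCausalLM, AutoTokenizer
       | 
       |     model_id = "mistralai/Mistral-7B-v0.1"
       |     tok = AutoTokenizer.from_pretrained(model_id)
       |     model = AutoModelForCausalLM.from_pretrained(
       |         model_id, torch_dtype=torch.float16, device_map="auto")
       | 
       |     prompt = "what is the speed of sound in a perfect vacuum?"
       |     inputs = tok(prompt, return_tensors="pt").to(model.device)
       |     out = model.generate(**inputs, max_new_tokens=64)
       |     print(tok.decode(out[0], skip_special_tokens=True))
       | 
       | Note this is the raw base model, so it just continues the text
       | rather than answering like a chat assistant.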
        
         | SillyUsername wrote:
         | I tried it a second time and it told me it's 1/3 of the speed
         | of light, none of the llama2 models did this :(
         | 
         | ---
         | 
         | AI How can I help you today?
         | 
         | You what is the speed of sound in a perfect vacuum?
         | 
         | AI The speed of sound in a perfect vacuum is approximately
         | 340.29 meters per second (1,117.85 feet per second), or about
         | one-third the speed of light. In a real vacuum, the speed of
         | sound would be much lower due to friction between molecules and
         | other factors.
        
       | huevosabio wrote:
       | This is pretty impressive, they shipped a 7B model in like 6
       | months.
        
       ___________________________________________________________________
       (page generated 2023-09-27 23:00 UTC)