[HN Gopher] The LLama Effect: Leak Sparked a Series of Open Sour...
       ___________________________________________________________________
        
       The LLama Effect: Leak Sparked a Series of Open Source Alternatives
       to ChatGPT
        
       Author : gardenfelder
       Score  : 373 points
       Date   : 2023-04-09 16:57 UTC (6 hours ago)
        
 (HTM) web link (thesequence.substack.com)
 (TXT) w3m dump (thesequence.substack.com)
        
       | hrpnk wrote:
       | What's amazing to see is the effort to attempt to run the models
       | on consumer-grade hardware, going as far as running 4-bit
        | quantized models on phones or a Raspberry Pi. The whole debacle
        | around the mmap optimizations to llama.cpp [1] and the way they
        | were committed to the project is a great testament to open source.
       | Both in the positive aspect (progress) and the negative ones
       | (visibility affecting human judgement and collaboration). The
       | sheer amount of experimentation is also providing a standard
       | interface for different models that can easily be integrated and
       | tried out.
       | 
       | [1] https://github.com/ggerganov/llama.cpp
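        | 
        | For anyone wondering what "4-bit quantized" actually means here,
        | a toy sketch of block-wise quantization (the block size and
        | mapping below are simplifications, not llama.cpp's actual q4_0
        | format):
        | 
        |   import numpy as np
        | 
        |   def quantize_q4(w, block=32):
        |       # one float scale per block; values -> ints in [-8, 7]
        |       blocks = w.reshape(-1, block)
        |       m = np.abs(blocks).max(axis=1, keepdims=True) + 1e-12
        |       q = np.clip(np.round(blocks * 7.0 / m), -8, 7)
        |       return q.astype(np.int8), m / 7.0
        | 
        |   def dequantize_q4(q, scales):
        |       return (q.astype(np.float32) * scales).reshape(-1)
        | 
        |   w = np.random.randn(4096 * 4096).astype(np.float32)
        |   q, s = quantize_q4(w)
        |   print("mean abs error:", np.abs(w - dequantize_q4(q, s)).mean())
        | 
        | Four-bit values plus one scale per block is what takes a ~13 GB
        | fp16 7B model down to roughly 4 GB, which is what makes the phone
        | and Raspberry Pi experiments feasible at all.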
        
       | alfor wrote:
        | What kind of GPU is needed to run the 65B models? Can a consumer-
        | grade card do the job? Is it worth it? Or do you use cloud
       | instances?
        
         | AeiumNE wrote:
          | It's possible to run it on a CPU and then use system memory.
        
         | superkuh wrote:
          | You can't fit a 65B model even at 4 bits (~50GB) on a consumer-
          | grade GPU. With an average geek's home desktop running 65B
         | inference on a CPU with 64GB of system RAM you could expect
         | about 1 token every 2 seconds.
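          | 
          | A rough back-of-envelope for those numbers (every constant
          | below is an assumption, not a measurement):
          | 
          |   # ~65e9 params at ~4.5 bits/weight incl. per-block scales
          |   model_bytes = 65e9 * 4.5 / 8
          |   print(model_bytes / 1e9)        # ~37 GB before KV cache etc.
          | 
          |   # CPU generation is roughly memory-bandwidth bound: every
          |   # token streams the whole model from RAM. Assume ~40 GB/s.
          |   bandwidth = 40e9
          |   print(bandwidth / model_bytes)  # ~1.1 tokens/sec upper bound
          | 
          | Real-world overhead easily halves that bound, which lines up
          | with the roughly 1 token every 2 seconds figure above.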
        
       | [deleted]
        
       | adeon wrote:
       | I'm a bit worried the LLaMA leak will make the labs much more
       | cautious about who they distribute models to for future projects,
       | closing down things even more.
       | 
       | I've had tons of fun implementing LLaMA, learning and playing
       | around with variations like Vicuna. I learned a lot and probably
       | wouldn't have got so interested in this space if the leak didn't
       | happen.
        
         | echelon wrote:
         | If the copyright office determines model weights are
         | uncopyrightable (huge if), then one might imagine any
         | institutional leak would benefit everyone else in the space.
         | 
         | You might see hackers, employees, or contractors leaking models
         | more frequently.
         | 
         | And since models are distilled functionality (no microservices
         | and databases to deploy), they're much easier to run than a
         | constellation of cloud infrastructure.
        
           | pclmulqdq wrote:
           | Shouldn't that be the default position? The training methods
           | are certainly patentable, but the actual input to the
           | algorithm is usually public domain, and outputs of algorithms
           | are not generally copyrightable as new works (think of
           | to_lowercase(Harry Potter), which is not a copyrightable
           | work), so the model weights would be a derivative work of
           | public domain materials, and hence also forced into the
           | public domain from a copyright perspective.
           | 
           | They are generally trade secrets now, which is what actually
           | protects them. Leaks of trade secrets are serious business
           | regardless of the IP status of the work otherwise.
        
             | vkou wrote:
             | I like your legal interpretation, but it's way too early to
             | tell if it is one that accurately represents the reality of
             | the situation.
             | 
             | We won't know until this hits the courts.
        
               | pclmulqdq wrote:
               | For what it's worth, I've been working on a startup that
               | involves training some models, and this is likely how
               | we're going to be treating the legal stuff (and being
               | very careful about how customers can interact with the
               | models as a consequence). I assume people who have
               | different incentives will take a different view, though.
        
           | mattmcknight wrote:
           | Even if the weights are copyrighted, running one more epoch
           | of fine-tuning will result in different weights. At a certain
           | point, they'd have to copyright the shapes of the weight
           | vectors.
        
             | Szpadel wrote:
              | It's uncertain. As with coding, you need clean-room methods
              | to prove that new code is not contaminated with a patented
              | implementation, and it might be the same here, so anything
              | based on an existing model could also be covered by copyright.
        
               | 0x0000000 wrote:
               | Clean room implementation is not a defense against
               | patents, it is a defense against copyright infringement.
        
         | amrb wrote:
         | Devil's Advocate: The EU comes down hard on any AI company that
         | doesn't work with researchers and institutions in future.
        
           | RhodesianHunter wrote:
           | Outright banning due to fear seems far more likely.
        
             | amrb wrote:
              | I mean, it's a good power tool: it cuts fast with little
              | effort.
              | 
              | But what's it gonna do in the hands of your parents or
              | kids? When it gets things wrong, it could have a far worse
              | impact if it's integrated into government, health care,
              | finance, etc.
        
         | [deleted]
        
         | [deleted]
        
         | oliwarner wrote:
          | On the other side of the coin, they've drawn a huge amount of
          | attention away from OpenAI and have open source optimisations
          | appearing for every platform they could ever consider running
          | it on, at no extra expense.
         | 
         | If it was a deliberate leak, it was a good idea.
        
         | lagniappe wrote:
          | An alternative interpretation is that the LLaMa leak was an
          | effort to shake or curtail the progress of ChatGPT's viral
          | dominance at the time.
        
           | seydor wrote:
           | "And as long as they're going to steal it, we want them to
           | steal ours. They'll get sort of addicted, and then we'll
           | somehow figure out how to collect sometime in the next
           | decade".
           | 
           | That was ironically Bill Gates
           | 
           | https://www.latimes.com/archives/la-xpm-2006-apr-09-fi-
           | micro...
        
             | rileyphone wrote:
             | It took him a while to come around
             | 
             | https://en.wikipedia.org/wiki/An_Open_Letter_to_Hobbyists
        
         | elcomet wrote:
         | They clearly expected the leak, they distributed it very widely
         | to researchers. The important thing is the licence, not the
          | access: you are not allowed to use it for commercial purposes.
        
           | nkzd wrote:
           | How could Meta ever find out your private business is using
           | their model without a whistleblower? It's practically
           | impossible.
        
             | tel wrote:
             | Have reasonable suspicion, sue you, and then use discovery
             | to find any evidence at all that your models began with
             | LLaMA. Oh, you don't have substantial evidence for how you
             | went from 0 to a 65B-parameter LLM base model? How curious.
        
             | halotrope wrote:
             | You can just ask if there is no output filtering
        
               | guwop wrote:
               | The future is going to be hilarious. Just ask the model
               | who made it!
        
               | barbariangrunge wrote:
               | Does the model know, or will it just hallucinate an
               | answer?
        
             | PufPufPuf wrote:
             | Yes, that's how software piracy has always worked.
        
             | ben_w wrote:
             | I think you can make that argument for _all_ behind-the-
             | scenes commercial copyright infringement, surely?
        
       | isoprophlex wrote:
       | Is this a tactical leak, stemming from a "commoditize your
       | complement" strategy? Open source as a strategic weapon, without
        | having to explain to board members/shareholders/whatever why you
        | threw money at training an open-sourced model?
        
         | nailer wrote:
         | It's not open source. Llama is proprietary, the license hasn't
          | changed. Just like the source code to Windows leaking doesn't
          | make Windows open source.
        
         | bugglebeetle wrote:
         | I would assume so. Meta's ML/AI team is very strong, but they
         | probably don't have a comparable product offering to ChatGPT
         | ready for public use. So instead, they bought themselves some
         | time by letting the open source community run wild with a
         | lesser model and eat into OpenAI's moat.
        
           | oezi wrote:
           | What would you think is holding back Meta and Google? Why
           | can't they just scale up the compute they throw at the
           | problem?
           | 
           | What are they tinkering on?
        
             | bugglebeetle wrote:
             | I think Meta's problems are largely similar to Google's:
             | they have very bad leadership, specifically Zuckerberg, and
             | thus can't leverage their existing talent to
             | execute/compete. The whole Metaverse fiasco would seem to
             | demonstrate he's effectively a "mad king" at this point,
             | and probably surrounded by a sycophantic c-suite. Having
             | the best talent in the world (which they obviously do by
             | how fast LLama was spit out) isn't going to matter that
              | much if it's all serving at the behest of someone who has
             | become deluded by their initial success and has no ability
             | to course correct.
        
           | BulgarianIdiot wrote:
           | They didn't leak it. Someone else did.
        
             | blameitonme wrote:
              | I don't think they're saying Meta AI leaked it, but that they
              | anticipated someone else would and still went ahead with it
              | because they wanted the consequences.
        
             | jjoonathan wrote:
             | It's extremely common for a "leak" to actually be fully
             | intentional, but the organization in question just wants
             | plausible deniability to mitigate legal/political/popular
             | blowback.
             | 
             | In order to preserve plausible deniability, the leak will
             | look genuine in all aspects that are easy to simulate.
             | "Someone else did it" is easy to simulate. A better gauge
             | would be to see if anyone is caught and punished. If so, it
             | was probably a real leak.
        
             | ericpauley wrote:
             | I think the key here is that, given the way that Meta
             | distributed the model, a leak was inevitable. So while they
             | may not have directly orchestrated a leak it must have been
             | an intended result.
        
             | greyface- wrote:
             | They have tacitly endorsed the leak. https://github.com/fac
             | ebookresearch/llama/pull/73#issuecomme...
        
               | Mathnerd314 wrote:
               | That's a contributor to the repo, not someone with commit
               | access.
        
               | greyface- wrote:
               | A contributor who is also a Facebook employee and co-
               | author of the LLaMA paper, presumably speaking in
               | official capacity.
        
               | barbariangrunge wrote:
               | Why would you presume that by default? Need a quote to
               | conclude that
        
               | jagrsw wrote:
               | It's widely presumed within faang-type-of companies that
               | anything an employee says or does can be interpreted as
               | an official company statement, especially by the press.
               | As a result, many of these companies offer, often
               | mandatory, trainings that underscore the importance of
               | speaking carefully in public, since one's words can end
               | up on the front page of a popular newspaper.
               | 
               | Although I don't know how FB rolls internally, it seems
               | more likely than not to me, that it was ack'd by someone
               | higher up in the organization than line engineers or
                | managers: someone with permission to speak publicly for a
                | given area of the company - doesn't need to be the CEO,
                | more likely a VP/Director.
        
               | [deleted]
        
               | yieldcrv wrote:
               | Only because publicly visible actions are worse for them
               | 
               | People have gotten DMCA takedown requests from them over
               | Llama repositories
        
               | greyface- wrote:
               | If they were interested in limiting distribution, saying
               | essentially "go ahead and seed this torrent more" is
               | worse for them than doing nothing.
        
               | yieldcrv wrote:
                | I've actually beaten the Streisand effect before by not
                | responding.
                | 
                | The crowd gets bored, my DMCA requests go out in a flurry a
                | month later, and all the evidence disappears; individuals
                | who might notice don't have the crowd to amplify that they
                | noticed.
               | 
               | You can call that "tacit consent" if you want. But
                | the Streisand effect removes all leverage.
        
       | whimsicalism wrote:
       | It appears there is this genre of articles pretending that LLAMA
       | or its RL-HF tuned variants are somehow even close to an
       | alternative to ChatGPT.
       | 
       | Spending more than a few moments interacting even with the larger
       | instruct-tuned variants of these models quickly dispels that
       | idea. Why do these takes around open-source AI remain so popular?
       | What is the driving force?
        
         | tyfon wrote:
         | > Why do these takes around open-source AI remain so popular?
         | 
         | I can only speak for myself, but I have a great desire to run
         | these things locally, without network and without anyone being
         | able to shut me out of it and without a running cost except the
         | energy needed for the computations. Putting powerful models
         | behind walls of "political correctness" and money is not
         | something that fits well with my personal beliefs.
         | 
         | The 65B llama I run is actually usable for most of the tasks I
         | would ask chatgpt for (I have premium there but that will lapse
         | this month). The best part is that I never see the "As a large
         | language model I can't do shit" reply.
        
           | [deleted]
        
           | itake wrote:
           | I think it's hard to verify and those articles get clicks.
           | 
            | Similar in vein to articles promising self-driving cars in
            | 202x.
        
           | kristianp wrote:
           | How do you run it locally? llama.cpp + 64GB RAM + 4bit
           | quantized?
        
             | tyfon wrote:
             | I have a 5950x with 64 gb ram and they are quantized to 4
             | bit yes :)
             | 
             | The weights are stored on a samsung 980 pro so the load
             | time is very fast too. I get about 2 tokens/second with
             | this setup.
             | 
             | edit: forgot to confirm, it is llama.cpp
             | 
              | edit2: I am going to try the FP16 version after Easter as I
              | ordered 64 GB of additional RAM. But I suspect the speed
              | will be abysmal with the 5950x having to churn through
              | 120 GB of weights. Hopefully some smart person will come up
             | with a way to allow the GPU to run off system memory via
             | the amd infinity fabric or something.
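              | 
              | edit3: on the load time, the mmap work mentioned upthread is
              | presumably why it feels instant - the file is mapped rather
              | than read up front, so pages only come in as weights are
              | touched. A toy illustration (hypothetical file name, nothing
              | like the real ggml layout):
              | 
              |   import numpy as np
              | 
              |   w = np.memmap("ggml-model-q4_0.bin", dtype=np.uint8,
              |                 mode="r")
              |   print(w.shape)   # "loaded" instantly, nothing read yet
              |   part = w[:4096]  # slicing faults in just those pages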
        
               | barbariangrunge wrote:
               | I thought it needed 64gb of vram. 64gb of ram is easy to
               | obtain
        
           | apetresc wrote:
           | How have you managed to run the 65B model? Cloud resources,
           | or you have a very kitted-out homelab?
        
             | sp332 wrote:
             | If you're not running on GPU, you can upgrade your system
             | RAM instead of finding a card with lots of VRAM. 64GB of
             | DDR4 is only $120.
        
             | trifurcate wrote:
             | All you need is 2 3090s.
        
         | digitallyfree wrote:
         | Privacy and security is a big desire for the people running
         | these lower-quality local models. These assistants are becoming
         | more and more powerful and people want to use them for personal
         | tasks. For instance someone might want to chat about their
         | inner feelings or ask the AI to pen a confidential letter,
         | things that they wouldn't want to send to a cloud service. We
         | saw the same thing with Stable Diffusion and how people would
         | spend a lot of effort getting it to run on their machines so
         | they wouldn't have to rely on a public instance.
         | 
         | I'm talking about individual people here as the fact that this
         | is a leak means that corps probably won't take the legal risk
         | of trying this out (maybe some are doing so in secret). In the
         | business world there definitely is a want for locally hosted
         | models for employees that can safely handle confidential inputs
         | and outputs.
         | 
         | The Llama models are not as good as ChatGPT but there are new
         | variants like Alpaca and Vicuna with improved quality. People
         | are actively using them already to help with writing and as
         | chatbots.
        
         | micimize wrote:
         | I had the same reaction after seeing lots of "chatgpt on a
         | phone" etc hype around alpaca. Like I knew it wouldn't be
         | close, but was surprised at just how useless it was given the
         | noise around it. Nobody who was talking about it had used it
         | for even five minutes.
         | 
         | This article is almost criminally imprecise around the "leak"
         | and "Open Source model" discussion as well.
        
         | matrix_overload wrote:
          | ChatGPT is an ultra-hot topic, so every article tangentially
         | related to it gets twice the views?
        
           | anonzzzies wrote:
           | It is vastly better than anything else so far though. The
           | rest will catch up but openai is not sleeping and they are
           | well funded.
        
             | jhbadger wrote:
             | I thought that was the case before trying Vicuna. I agree
             | that LLaMA and Alpaca are inferior to ChatGPT but I'm
             | really not sure Vicuna is. It even (unfortunately) copies
             | some of ChatGPT's quirks, like getting prudish when asking
             | it to write a love scene ("It would not be appropriate for
             | me to write...")
        
               | whimsicalism wrote:
               | I admittedly have not interacted with Vicuna yet.
        
               | sp332 wrote:
               | Have you tried koala?
        
               | seydor wrote:
               | is gpt4Xalpaca better?
        
               | nabakin wrote:
               | I've tried Vicuna but it still seems inferior to ChatGPT
               | imo. Maybe if it was applied to a version of LLaMA with a
               | number of parameters matching GPT-4 but I'm not sure of
               | that either
        
         | seydor wrote:
         | > Why do these takes around open-source AI remain so popular?
         | What is the driving force?
         | 
         | people like to tinker with things until they break and fix
         | again. that's how we find their limits
         | 
          | People constantly try to break ChatGPT too (I'd wager they
         | spend more time on that than real work). However talking to an
         | opaque authoritarian chatbot, no matter how smart, gets boring
         | after a while
        
       | throw778899 wrote:
       | quite funny that the weights leaked & the senior author then left
       | to build an llm startup
        
       | amrb wrote:
        | I love the human factor: LLaMA was a thing only researchers would
        | have had access to; of course it leaked, everyone swarmed to give
        | it a try, and we got some first-class support for Apple hardware!
        
       | seydor wrote:
       | Someone needs to legally challenge openAI on using the output of
       | their models to train other commercial models. If web scraping is
        | legal, then this must be legal too, even if OpenAI tries to
       | curtail it. After all it was all trained on data they don't have
       | rights to.
        
         | api wrote:
         | ... and given that models may not be copyrightable unless
         | trained on data to which the trainer has full rights.
        
         | qwertox wrote:
         | Website content can be copyrightable, so web scraping for
         | commercial use being legal would be dubious. But even OpenAI
         | can't tell what ChatGPT will output, so I don't see how this
         | can be copyrightable. Should the outputted sentences really be
         | owned by OpenAI?
        
           | int_19h wrote:
           | They are not claiming copyright on the output, but instead
           | make it a part of their terms of use, so it's basically the
           | EULA debate all over again.
        
         | coding123 wrote:
         | what's weird to me though, is that we're all trained on both
         | open source and closed source source material. And our output
         | is totally 100% copyrightable by us.
         | 
          | Why wouldn't we extend the same standard to computer-generated
          | text? If there is a copyrighted sentence, go after that?
         | 
         | I don't work for openai, but I don't like 1 sided arguments
         | that are just looking for some bottom line. At the end of the
         | day we all have something to protect. When it benefits us to
         | protect something, we're all for it. When it benefits us to NOT
         | protect something, no one has a single argument for that.
        
           | cycomanic wrote:
            | We make distinctions between humans and computers all the
            | time. They function in distinctly different ways. For example,
            | I can "turn off" a computer, but it would be illegal to do that to
           | a person.
           | 
            | More seriously, and closer to the case at hand: I need a
            | licence to copy a program into a computer's memory, but I
            | don't need that licence to do the same for a human. So why
            | should there not be a difference for the material they output?
        
           | danShumway wrote:
           | Copyright is a practical right, not an inherent right. The
           | only reasons humans get copyright at all is because it's
           | useful for society to give it to them.
           | 
           | The onus should be on OpenAI to prove that it will benefit
           | society overall if AIs are given copyright. We've already
           | decided that many non-human processes/entities don't get
           | copyright because there doesn't seem to be any reason to
           | grant those entities copyright.
           | 
           | ----
           | 
           | The comparison to humans is interesting though, because
           | _teaching_ a human how to do something doesn 't grant you
           | copyright over their output. Asking a human to do something
           | doesn't automatically mean you own what they create. The
           | human actually doing the creation gets the copyright, and the
           | teacher has no intrinsic intellectual property claim in that
           | situation.
           | 
           | So if we really want to be one-to-one, teaching an AI how to
           | do something wouldn't give you copyright over everything it
           | produces. The AI would get copyright, because it's the thing
           | doing the creation. And given that we don't currently grant
           | AIs personhood, they can't own that output and it goes into
           | the public domain.
           | 
           | But in a full comparison to humans, OpenAI is the teacher.
           | OpenAI didn't create GPT's output, it only taught GPT how to
           | produce that output.
           | 
           | ----
           | 
           | The followup here though is that OpenAI claims that it's OK
           | to train on copyrighted material. So even if GPT's output was
           | copyrightable, that still doesn't mean that they should be
           | able to deny people the ability to train on it.
           | 
           | I mean, talk about one-sided arguments here: if we treat GPT
           | output the same as human output, then is OpenAI's position
           | that it can't train on human output? OpenAI has a TOS around
           | this basically banning people from using the output in
           | training, which... probably that shouldn't be enforceable
           | either, but people who haven't agreed to that TOS should
           | absolutely be able to train AI on any ChatGPT logs that they
           | can get a hold of.
           | 
           | That is exactly what OpenAI did with copyrighted material to
           | train GPT. It's not one-sided to expect the same rules to
           | apply to them.
        
             | oceanplexian wrote:
             | > The comparison to humans is interesting though, because
             | teaching a human how to do something doesn't grant you
             | copyright over their output.
             | 
              | Ehh, in rare cases it can though. If you have someone sign
             | an NDA, they can't go and publish technical details about
             | something confidential that they were trained on. For
             | example, this is fairly common in the tech industry when we
             | send engineers to train on proprietary hardware or
             | software.
        
               | vkou wrote:
                | > Ehh, in rare cases it can though. If you have someone
               | sign an NDA, they can't go and publish technical details
               | about something confidential that they were trained on.
               | For example, this is fairly common in the tech industry
               | when we send engineers to train on proprietary hardware
               | or software.
               | 
               | And I think nearly everyone would agree that it would be
               | perfectly fine and reasonable for an AI trained on a
               | proprietary corpus of information to produce
               | copyrightable/secret material in response to questions.
               | 
               | Just because I built an internal corporate search tool,
               | doesn't mean that _you_ get to view its output.
               | 
               | The question at play here is when the AI is trained on
               | information that's in the public commons. The 'teacher'
               | analogy is, in this sense, a very good one.
        
               | danShumway wrote:
               | I would push back on that for a couple of reasons:
               | 
               | First, what's happening in those scenarios where an
               | artist grants copyright to a teacher/commissioner is that
               | the artist gets the copyright, and then separately signs
               | an agreement about what they want to do with that
               | copyright.
               | 
               | But an NDA/transfer-agreement doesn't change how that
               | copyright is generated. It's a separate agreement not to
               | use knowledge in a particular way or to transfer
               | copyright to someone else.
               | 
               | More importantly, is the claim here that GPT is capable
               | of signing a contract? Because problems of personhood
               | aside, that immediately makes me wonder:
               | 
               | - Is GPT mature enough to make an informed decision on
               | that contract in the eyes of the law?
               | 
               | - Is that "contract" being made under duress given that
               | OpenAI literally owns GPT and controls its servers and is
               | involved in the training process for how GPT "thinks"?
               | 
               | Can you call it informed consent when the party drawing
               | up the contract is doing reinforcement training to get
               | you to respond a certain way?
               | 
               | ----
               | 
               | I mean, GPT does not qualify for personhood and it's not
               | alive, so it can't sign contracts period. But even if it
               | could, that "contract" would be pretty problematic
               | legally speaking. And NDAs/contracts don't change
               | anything about copyright. It's just that if you own
               | copyright, you have the right to transfer it to someone
               | else.
               | 
               | Just to push the NDA comparison a little harder as well:
               | NDAs bind the people who sign them, not everyone else. If
               | you sign an NDA and break it and I learn about the
               | information, I'm not in trouble. So assuming that ChatGPT
               | has signed an NDA in specific -- that would not block me
               | from training on ChatGPT logs I found online. It would (I
               | guess) allow OpenAI to sue GPT for contract violation?
        
           | [deleted]
        
           | [deleted]
        
           | tdullien wrote:
           | We decided that animals can't create copyrightable works and
           | hence limited the ability to create copyrightable works to
           | _humans_.
           | 
           | I am fine with granting AIs the ability to create
           | copyrightable works _provided_ we grant that right, and human
           | rights, to Orcas and other intelligent species.
        
             | kbrkbr wrote:
             | Animals seem ok with it. At least they did not indicate
             | otherwise so far.
        
           | crote wrote:
           | Let's say I were to create an algorithm which generated every
           | possible short story in the English language using Markov
           | chains. Should I be able to copyright all those generated
           | stories, thus legally preventing any other author from ever
           | writing a story again?
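            | 
            | For concreteness, the mechanism in that thought experiment is
            | trivial to build; a toy word-level version (nowhere near an
            | exhaustive enumerator, just to show there's no human
            | authorship in any particular output):
            | 
            |   import random
            |   from collections import defaultdict
            | 
            |   def build_chain(text):
            |       words = text.split()
            |       chain = defaultdict(list)
            |       for a, b in zip(words, words[1:]):
            |           chain[a].append(b)   # word -> possible next words
            |       return chain
            | 
            |   def generate(chain, length=15):
            |       out = [random.choice(list(chain))]
            |       for _ in range(length):
            |           nxt = chain.get(out[-1])
            |           if not nxt:
            |               break
            |           out.append(random.choice(nxt))
            |       return " ".join(out)
            | 
            |   corpus = "the cat sat on the mat and the dog sat on the rug"
            |   print(generate(build_chain(corpus)))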
        
         | danShumway wrote:
         | IANAL but I really don't see how a case here would go in
         | OpenAI's favor in the long run, except maybe if someone
         | actually agreed to their EULA?
         | 
         | And I really suspect that a lot of AI companies are putting out
         | a lot of bluster about this and are just kind of hoping that
         | nobody challenges them. _Maybe_ LLaMA weights are
         | copyrightable, but I would not take it as a given that they
         | are.
         | 
         | I vaguely suspect (again IANAL) that companies like
         | Facebook/OpenAI might not be willing to even force the issue,
         | because they might be happier leaving it "unsettled" than going
         | into a legal process that they're very likely to lose. I would
         | love to see some challenges from organizations that have the
         | resources to issue them and defend themselves.
         | 
         | Hiding behind the EULA is one thing, but there are a lot of
         | people that have never signed that EULA.
        
         | rmdashrfstar wrote:
          | > If web scraping is legal
          | 
          | Source? That LinkedIn case did not resolve the way you think it
          | did.
        
           | mountainriver wrote:
           | It's legal but if you don't consent to people doing it in
           | your robots.txt you can sue them civilly
        
           | bri3d wrote:
           | My understanding is that the current web scraping situation
           | is this:
           | 
           | * Web scraping is not a CFAA violation. (EF Travel v. Zefer,
           | LinkedIn v. hiQ).
           | 
           | * Scraping in spite of clickthrough / click-in ToS
           | "violation" on public websites does not constitute an
           | enforceable breach of contract, chattel trespass (ie -
           | incidental damage to a website due to access), or really mean
           | anything at all. This is not as clear once a user account or
           | log-in process is involved. (Intel v. Hamidi, Ticketmaster v.
           | Tickets.com)
           | 
           | * Publishing or using scraped data may still violate
           | copyright, just as if the data had been acquired through any
           | means other than scraping. (AP v. Meltwater, Facebook v.
           | Power.com)
           | 
           | So this boils down to two fundamental questions that will
           | need to get answered regardless of "scraping" being involved:
           | "is GPT output copyrightable" and "is training a model on
           | copyrighted data a copyright infringement."
        
             | visarga wrote:
             | Is training a model on second-hand data laundering
             | copyright? Second-hand data is data generated from a model
             | that has been trained on copyrighted content.
             | 
             | Let's say I train a diffusion model on ten million images
             | generated by diffusion models that have seen copyrighted
             | data. I make sure to remove near duplicates from my
             | training set. My model will only learn the styles but not
             | the exact composition of the original dataset. So it won't
             | be able to replicate original work, because it has never
             | seen any original work.
             | 
             | Is this a neat way of separating ideas from their
             | expression? Copyright should only cover expression. This
             | kind of information laundering follows the definition to
             | the letter and only takes the part that is ok to take - the
             | ideas, hiding the original expression.
        
           | sebzim4500 wrote:
           | The judgement of the LinkedIn case was that if the scraping
           | bots had 'clicked the button' to accept terms then they
           | should be held to those terms.
        
           | seydor wrote:
            | If OpenAI tries to make a legal claim against this, they will
            | be reminded that their model is trained on tons of unlicensed
            | content scraped without consent. If their training is legal,
            | then this is legal too.
        
         | EGreg wrote:
         | I am shocked that it speaks the way it does when it was trained
         | on random stuff it doesn't have rights to.
         | 
         | They say they trained it on databases they had bought access to
         | etc. And it seems that way.
         | 
         | Because how does ChatGPT:
         | 
         | 1. Do what you ask instead of continuing your instructions?
         | 
         | 2. Use such nice and helpful language as opposed to just random
         | average of what people say?
         | 
         | 3. And most of all -- how does it have a structure where it
         | helpfully restates things, summarizes things, warns you against
         | doing dangerous stuff... no way is it just continuing the most
         | probable random Internet text!!
        
           | jacquesm wrote:
           | There is a lot of massaging of inputs and outputs but at the
            | same time: that's done by tweaking the model - reinforcing
           | those parts that are desirable and suppressing those parts
           | that are not, not by rewriting the output, though there may
           | be filters that check for 'forbidden fruits'. And it isn't
           | the 'random average' of what people say, that would give you
           | junk, the whole idea is that it tries to get to something
           | _better_ than a random average of what people say.
           | 
           | And by curating your sources you are of course going to help
           | the model to achieve something a bit more sensible as well.
           | Finally: you are probably not looking at just one model, but
           | at a set of models.
        
             | WalterBright wrote:
             | I'd like the version without the "sanitized for my
             | protection" part. I cannot imagine being offended by what a
             | computer program generates.
        
           | rajansaini wrote:
           | Read the InstructGPT paper and see alpaca. You just need
           | instruction fine-tuning.
           | 
           | Unlike what the other commenters are saying, RLHF, while
           | powerful, isn't the only way to get an LLM to follow
           | instructions.
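            | 
            | The trick is mostly in the template the fine-tuning data is
            | rendered into. Roughly the Alpaca format (paraphrased from
            | memory, so treat the exact wording as an assumption):
            | 
            |   TEMPLATE = (
            |       "Below is an instruction that describes a task. "
            |       "Write a response that appropriately completes "
            |       "the request.\n\n"
            |       "### Instruction:\n{instruction}\n\n### Response:\n"
            |   )
            | 
            |   def training_example(instruction, output):
            |       # the model is fine-tuned to continue this template
            |       # with `output`, which teaches it to answer requests
            |       # instead of just continuing your text
            |       return TEMPLATE.format(instruction=instruction) + output
            | 
            |   print(training_example(
            |       "Explain RLHF in one sentence.",
            |       "RLHF fine-tunes a model on human preference ratings."))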
        
           | seydor wrote:
           | It is steered by RLHF to give helpful, nice, structured
           | continuations. it was totally trained on random text they
           | never paid a dime for.
        
             | EGreg wrote:
             | Can you please elaborate and answer 1,2,3 directly -- I'd
              | love to find out more. Maybe links to the techniques?
        
               | seydor wrote:
               | I am not an expert but as others have said, the
               | InstructGPT and Alpaca models do that
               | 
               | https://arxiv.org/abs/2203.02155#
               | 
               | https://crfm.stanford.edu/2023/03/13/alpaca.html
        
         | klyrs wrote:
         | Yeah, I'm particularly curious about that -- there's already
         | legal precedent in the US that an AI cannot author copyrighted
          | or patented work.
         | clickwrap agreement, but those are notoriously weak.
        
           | egillie wrote:
           | And even if they do, if someone uses the api and makes a
           | publicly available website with no clickwrap, it would be
           | legal to scrape that right?
        
           | YetAnotherNick wrote:
           | It is just a copyright violation. My guess is that it would
           | be fine if you use already scraped data as you haven't
           | accepted TOS, but they have every right to block you or
           | access to your business if you violate this.
        
             | itake wrote:
              | I thought the copyright office said that AI-generated
              | material isn't copyrightable?
        
               | YetAnotherNick wrote:
               | Sorry, can't edit my comment now. I meant it is "not a
               | copyright violation, but just a TOS violation."
        
               | aardvarkr wrote:
               | You're correct. US law states that intellectual property
               | can be copyrighted only if it was the product of human
               | creativity, and the USCO only acknowledges work authored
               | by humans at present. Machines and generative AI
               | algorithms, therefore, cannot be authors, and their
               | outputs are not copyrightable.
        
               | hanselot wrote:
               | Sorry for asking a stupid question.
               | 
               | How much Theseus do I need to ship before I can copyright
               | it as my own? Is there some threshold for how much of an
               | AI generated work needs to be modified by "human
               | creativity" prior to it being copyrightable?
        
               | klyrs wrote:
               | As far as I can tell, you can claim credit for the output
               | of a tool which produces something at your direction. If
               | you write an algorithm to generate patents, and you
               | execute that algorithm and submit its output to USPTO
               | under your own name, no problem. If you gave credit to
               | the algorithm, their policy is to deny the claim.
        
               | JumpCrisscross wrote:
               | Can OpenAI claim copyright on GPT's outputs?
        
         | tintedfireglass wrote:
          | Definitely. I don't think it was right when OpenAI scraped data
          | without consent from other sources. I feel that if OpenAI can
          | get data from the internet, then Bard or someone else can do it
          | too. And given that ChatGPT is also a part of the internet, it's
          | fair game IMHO.
        
       | doubtfuluser wrote:
        | [Edited] Isn't the copyright question a red herring? We are
        | talking about models on the verge of generating output that is
        | indistinguishable from human output. How is a copyright breach
        | then enforced long term, if it's caused only by an AI and not by
        | a human?
       | 
       | I'm not in favor of the 6 month moratorium- but seriously, we are
       | going to face tough questions very soon - and they will shake a
       | lot of assumptions we have.
       | 
       | We should now really act as society to get standards in place,
        | standards that are enforceable. Otherwise the LeCuns et al. will
       | have some pretty bad impact before we start doing something.
       | 
       | We need to work on this globally and fast to not screw it up. I'm
       | nowadays more worried than ever about elections in the near
       | future. Maybe we will have something like real IDs attached to
       | content (First useful use case for crypto) or maybe we will all
       | stop getting information from people we don't know (yay filter
       | bubble). I hope people smarter than me will find something.
        
       | superkuh wrote:
        | I've spent an embarrassing amount of time since the llamas leaked
       | playing with them, the tools to run them, and writing wrappers
       | for them. They are technically alternatives in the sense that
       | they're incomparably better chat bots than anything in the past.
       | But at least for the 30B and under versions (65B is too big for
       | me to run), no matter what fine tuning is done (alpaca, gpt4all,
       | vicuna, etc), the llamas themselves are incomparably worse at
       | doing useful tasks than openai's gpt3.5 models like text-
       | davinci-003, or even the gimped gpt3.5-turbo.
       | 
       | I wish it wasn't so, but the llamas are toys. Amazing toys, but
        | toys. What openai is getting out of gpt3.5 (and presumably 4,
       | though I have no access) are actually useful responses for
       | getting work done.
        
         | seydor wrote:
         | > (just a hobby, won't be big and professional like gnu)
         | 
         | Llamas are creating the linux of AI and the ecosystem around
         | it. Even though openAI has a head start, this whole thing is
          | just starting. Llamas are showing the world that it doesn't
         | take monopoly-level hardware to run those things. And because
         | it's _fun_ , like, video-game-fun there is going to be a lot of
         | attention on them. Running a fully-owned, uncensored chat is
         | the kind of thing that gets people creative
        
           | danShumway wrote:
           | This is my hope as well. It would be disastrous if the future
           | of AI is one where only megacorps can run it and where they
           | control all access to it. In that sense, LLaMA is really
           | encouraging and I'm seriously rooting for it to improve.
           | 
           | It's just not there yet. I tend to be kind of bearish on LLMs
           | in general, I think there's a lot more hype than is
           | warranted, and people are overlooking some pretty significant
           | downsides like prompt-injection that are going to end up
           | making them a lot harder to use in ubiquitous contexts in
           | practice, but... I mean, the big LLMs (even GPT-3.5) are
           | definitely still in a class above LLaMA. I understand why
           | they're hyped.
           | 
           | I look at GPT and think, "I'm not sure this is worth the
           | trouble of using." But I look at LLaMA and I'm not sure
           | how/where to use it at all. It's a whole different level of
           | output.
           | 
           | But that doesn't mean I'm not rooting for the "hobbyists" to
           | succeed. And it doesn't mean LLaMA _can 't_ succeed, it
           | doesn't necessarily need to be better than GPT-4, it just
           | needs to be good enough at a lot of the stuff GPT-4 does to
           | be usable, and to have the accessibility and access outweigh
           | everything else. It's just not there yet.
        
             | amrb wrote:
              | LoRA has been pretty popular, and until the LLaMA leak I was
              | not aware of it. Maybe we'll see something cool out of the
              | Open Assistant project; we have a lot of English and
              | Spanish prompts, and it was crazy to see people doing a
              | massive open source project for ML.
        
             | syntheweave wrote:
             | I think there's a case to be made for the bottom of the
             | market being the important part.
             | 
             | The aspects of LLMs that resemble AGI are pretty exciting,
             | but there's a huge playspace for using the model just as an
             | interface, a slightly smarter one that will understand the
             | specific computing tasks you're looking for and connect
             | them up with the appropriate syntax without requiring
             | direct encoding.
             | 
             | A lot of what software projects come down to is in the
             | syntax, and a conversational interface that can go a little
             | bit beyond imperative command and a basic search box
             | creates possibilities for new types of development
             | environments.
        
           | inciampati wrote:
           | They can be modified to produce qualities of output that are
           | unique. This puts them back in the realm of individual
            | control. It will put the human in the artificial in a way that
            | is not true of the industrial models.
        
         | iamflimflam1 wrote:
         | GPT-4 is pretty mind blowing. It can follow very complex
         | prompts that 3.5 struggles with.
        
         | hrpnk wrote:
         | What this also shows is the degree of control that a single
          | company has over the market. At a time when GPT-4 is
          | integrated into products while others are on the waitlist,
          | competitors can move far ahead of a company that's still trying
          | to gain access to the technology without clear insight into the
          | prioritization approach. As said, there just are no
          | alternatives at the moment, so to a degree competition is
         | skewed.
        
         | Al-Khwarizmi wrote:
         | In my particular tests (YMMV), even 13B vicuna beats Bard,
         | though... tough times for Google.
        
           | morrbo wrote:
           | massive YMMV moment for me. my particular usecase was
           | "extract the following attributes from a load of unstructured
           | text, format the results as JSON". ChatGPT was the best (but
           | only on 4 and Davinci), Vicuna just didn't perform at all
           | (nor other variants of llama 7/13/33). Bard smashed it,
           | relatively speaking, in terms of speed. I gave up pretty
           | quickly though because of no information on pricing and/or
           | API. It's funny how all-or-nothing these things seem to be
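            | 
            | For concreteness, that kind of extraction call is roughly the
            | following (pre-1.0 openai Python client assumed; the prompt,
            | model name and attribute list are just illustrative):
            | 
            |   import json
            |   import openai  # assumes OPENAI_API_KEY is set
            | 
            |   def extract(text, attrs):
            |       prompt = ("Extract these attributes from the text: "
            |                 + ", ".join(attrs)
            |                 + ". Reply with only a JSON object.\n\n"
            |                 + "Text:\n" + text)
            |       resp = openai.ChatCompletion.create(
            |           model="gpt-4",
            |           messages=[{"role": "user", "content": prompt}],
            |           temperature=0)
            |       content = resp["choices"][0]["message"]["content"]
            |       # will raise if the model wraps the JSON in prose
            |       return json.loads(content)
            | 
            |   print(extract("Acme Corp raised $12M in Berlin in 2021.",
            |                 ["company", "amount", "city", "year"]))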
        
         | cubefox wrote:
         | Your conclusion seems not to be warranted since you haven't
         | tried out the 65B model.
        
           | superkuh wrote:
           | I agree, but I think my experience is representative. So far
            | most people don't have the resources to be able to use
           | 65B. And most small companies / university groups don't have
           | the resources to fine-tune a 65B.
           | 
           | I've talked to a couple dozen people in real time who've
           | played with up to 30B but no one I know has the resources to
           | run the 65B at all or fast enough to actually use and get an
           | opinion of. None of the open source llama projects out there
           | are using 65B in practice (despite support for it) so I think
           | my 30B and under conclusions are applicable to the topic the
           | article covers. I'd love to be wrong and I'm excited for this
           | to change in the future.
        
             | danShumway wrote:
             | This is a good point. Even if ordinary people did have the
             | resources to run the 65B well on their existing devices,
             | the speed would limit its usefulness quite a bit. In
             | practice, 30B is what most people are going to interact
             | with (if even, I've seen a lot of projects use 13B).
             | 
             | My experience here is pretty similar. I'm heavily
             | (emotionally at least) invested in models running locally,
             | I refuse to build something around a remote AI that I can
             | only interact with through an API. But I'm not going to
             | pretend that LLaMA has been amazing locally. I really
             | couldn't figure out what to build with it that would be
             | useful.
             | 
             | I'm vaguely hoping that compression actually gets better
             | and that targeted reinforcement/alignment training might
             | change that. GPT can handle a wide range of tasks, but for
             | a smaller AI it wouldn't be too much of a problem to have a
             | much more targeted domain, and at that point maybe the 30B
             | model is actually good enough if it's been refined around a
             | very specific problem domain.
             | 
             | For that to happen, training needs to get more accessible
             | though. Or communities need to start getting together and
             | deciding to build very targeted models and then
             | distributing the weights as "plug-and-play" models you can
             | swap out for different tasks.
             | 
             | And if there's a way to get 65B more accessible, that would
             | be great too.
        
             | matthewdgreen wrote:
             | Is it hard to spin up an appropriate EC2 instance with
             | 64GB+ of additional RAM? The m6a.8xlarge seems to have
             | 128GB and costs $1.38 per hour. Was going to try this
             | myself, but now I'm wondering if even that won't be worth
             | the trouble. (I know this is not "your own devices" but
             | would give a feel for what you're missing with the smaller
             | models.)
        
             | qeternity wrote:
             | I agree with your premise: I have used 65b variants and of
             | course they're not as good as OpenAI. GPT3 has 175b
             | parameters, and OpenAI has done more RLHF than anyone else.
             | Why would we expect to get comparable performance with
             | models a fraction of the size and a pittance of the fine
             | tuning?
             | 
             | That said, it's clear that replicating GPT4+ performance is
             | within the resources of a number of large tech orgs.
             | 
             | And the smaller models can definitely still be useful for
             | tasks.
        
             | lolinder wrote:
             | llama.cpp has great support for 65B, and I've been using it
             | on a Linux box (I upgraded my RAM for that purpose). 64GB
             | of RAM for a desktop is like $160 today, so it's not out of
             | reach for most people here if they cared to.
             | 
             | Admittedly, it's quite slow and therefore not useful for
             | chatting or real-time applications, and it's unreliable
             | enough in its quality that I'd like to be able to iterate
             | faster. Definitely more of a toy at this point, at least
             | when run on CPU.
        
               | Scene_Cast2 wrote:
               | Could you quantify "quite slow"?
        
               | lolinder wrote:
               | A token per second-ish with a Ryzen 7 5800X. If I run it
               | for too long it gets slower as heat throttling kicks in,
               | I need a better cooling system if I'm going to run it
               | non-stop.
        
               | dandongus wrote:
               | For a bit of comparison, if you've tested, how fast are
               | 13B or 7B on the same setup?
        
               | lolinder wrote:
               | Really fast. I didn't bother timing, but they're faster
               | than ChatGPT by a long shot. I didn't spend very long
               | with them because the quality is so much worse than the
               | 65B.
               | 
               | I should probably go back and try again to see if it's
               | worth it for the extra speed, now that I've played with
               | 65B for a while.
        
               | morrbo wrote:
               | i've had the same experience tbh, 7/13/30 on ryzen
               | (local) and intel (server) both on rhel/centos. It's a
               | shame really
        
           | muyuu wrote:
           | his conclusion is simultaneously not warranted and correct
           | 
           | a like-for-like comparison would be GPT-4 against the larger
           | models like LLaMA 65B, but those cannot be run on consumer-
           | grade hardware
           | 
           | so one ends up comparing the stuff one can run... against the
           | top stuff from OpenAI running on high-end GPU farms, and this
           | technology clearly benefits a lot still from much larger
           | scale than most people can afford
           | 
            | the great revelation this year is how much better it gets as
            | it gets much, much bigger, with no clear horizon on where
            | diminishing returns will be hit
           | 
           | but at the same time, some useful stuff can be done on
           | consumer hardware - just not the most impressive stuff
        
             | Szpadel wrote:
             | if "leaks" are true then even a 65B model is a toy when
             | GPT-4 is supposed to be 1000B I would really like to have
             | an open source model that could compete with even gpt3, but
             | basically all lammas I tried failed to create correct email
             | validation regex, where gpt3 is able to even refer to what
             | rfc it is
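             | 
             | (For context, the sort of check being asked about - just a
             | rough Python sketch, and the pattern below is a simplified
             | approximation, nowhere near the full RFC 5322 grammar:)
             | 
             |     import re
             | 
             |     # Simplified email check; real RFC 5322 is far more
             |     # permissive than this.
             |     EMAIL_RE = re.compile(
             |         r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
             | 
             |     def looks_like_email(s: str) -> bool:
             |         return EMAIL_RE.fullmatch(s) is not None
             | 
             |     print(looks_like_email("user@example.com"))  # True
             |     print(looks_like_email("not-an-email"))      # False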
        
               | muyuu wrote:
               | 13B on my computer does produce useful output, it's
               | pretty hit and miss though
               | 
               | GPT-3.5 OTOH is much better, but it's also much better at
               | producing convincing-sounding but completely incorrect
               | answers
        
         | jonplackett wrote:
         | The difference between 3.5 and 4 is gigantic even in my fairly
         | limited experience.
         | 
         | I gave them both some common sense tests and this one stuck out
         | to me.
         | 
          | Q: A glass door has 'push' written on it in mirror writing.
          | Should you push or pull it?
         | 
         | GPT-3.5:
         | 
         | If the word "push" is written in mirror writing on a glass
         | door, you should push the door to open it
         | 
         | GPT-4:
         | 
         | Since the word "push" is written in mirror writing, it suggests
         | that the instruction is intended for people on the other side
         | of the door. Therefore, you should pull the door to open it.
        
           | FLT8 wrote:
           | This is what I get out of Vicuna-13b:
           | 
           | Human: A glass door has 'push' written on it in mirror
           | writing. To open the door should you 'push' or 'pull' it?
           | 
           | Assistant: _Since the word "push" is written in mirror
           | writing on the glass door, you should actually "pull" the
           | door open instead of "push" it. Mirror writing is a writing
           | method where the characters are reversed, so when you see the
           | word "push" written in mirror writing, it is actually "pull"
           | in the normal writing orientation._
        
           | amrb wrote:
            | Would like to see yearly benchmarks for models like this!
        
           | two_in_one wrote:
           | [dead]
        
           | PoignardAzur wrote:
           | Wait, how does GPT-4 even...
           | 
           | Does it benefit from its visual attention, or is it a case of
           | "the question wasn't in GPT-3's training set but it was in
           | GPT-4's"?
        
             | hutzlibu wrote:
              | I would suspect that this is one of the manually fine-tuned
              | questions.
              | 
              | Meaning that in earlier versions people used this question
              | to show flaws, and now this specific flaw is fixed.
              | 
              | Otherwise it would indeed be reasoning, in my understanding.
        
               | steveBK123 wrote:
                | The evolution of answers from version to version makes it
                | clear there are insane amounts of manual fine-tuning
                | happening. I think this is largely overlooked by the
                | "it's learning" crowd.
        
               | moritzwarhier wrote:
               | They have infinite amounts of training data, and probably
               | lots of interested users who also like to push the limits
               | of what the model is capable of and provide all kinds of
               | test cases and RLHF base data.
               | 
                | They have millions of people training the AI for free,
                | basically, and they have engineers who pick and rate
                | pieces of training data and use them together with other
                | sources and manual training.
        
               | tel wrote:
               | This is an interesting argument as it's easy to apply it
               | nearly universally to any example of learning.
               | 
               | What sort of evidence would convince you that it is
               | learning?
        
             | Method-X wrote:
             | It has the ability to reason. It may not be conscious, but
             | it is intelligent.
        
               | PoignardAzur wrote:
               | That's not an answer.
               | 
               | The given question is one which requires some spatial
               | reasoning to understand. By default, GPT can only
               | understand spatial questions as described by text tokens
               | which is a pretty noisy channel. So it's not obvious how
               | GPT-4 could answer a spatial reasoning question (aside
               | from memorizing it).
        
               | Method-X wrote:
               | This is a good explanation:
               | https://www.youtube.com/watch?v=qbIk7-JPB2c
        
             | tel wrote:
             | I think it's hard to deny that it's doing some level of
             | reasoning. It's quite clear that these models do not merely
             | echo elements of their training data and that they can
             | solve simple and novel puzzles.
             | 
             | What that reasoning is, exactly, is hard to know. One can
             | suppose that ideas like "glass", "transparent", "mirror"
             | are all reasonable concepts that show up in the training
              | set and are demonstrated thoroughly.
        
             | vharuck wrote:
             | The GPT models do not reason or hold models of any reality.
             | They complete text chunks by imitating the training corpus
             | of text chunks. They're amazingly good at it because they
             | show consistent relations between semantically and/or
             | syntactically similar words.
             | 
              | My best guess about this result is that mentions of
              | "mirror" often occur near opposites (syntax) of direction
              | words (semantics). Which does sound like a good trick
              | question for these models.
        
               | DangitBobby wrote:
               | Or they are capable of some level of reasoning.
        
         | Closi wrote:
         | I've got access to 4 and it's a huge leap up from 3.5 - much
          | more subtlety in the responses, fewer hallucinations, less
         | hitting a brick wall, but all of it adding up to a giant leap.
        
           | WXLCKNO wrote:
           | It's funny how big the difference feels between 3.5 and 4 but
           | also feels subtle. Like it's just doing what I expect it to
           | versus seeing the limitations more clearly in 3.5.
        
       | nailer wrote:
       | > Llama was not initially open-sourced, but a week after its
       | release, the model was leaked on 4chan, sparking thousands of
       | downloads.
       | 
        | The author very clearly does not know what open source is.
        | Proprietary code that's been leaked isn't open source, and code
        | that is derived from proprietary code is still proprietary.
        | 
        | Windows had its source code leaked; that doesn't make it open
        | source.
       | 
       | So did the game Portal. Not open source either.
       | 
       | Something being leaked does not change the license.
        
       | drcode wrote:
       | Meta before leak: we take safety very seriously and will only
       | give access to responsible researchers
       | 
       | Meta after leak: lol lmfao
        
         | seydor wrote:
         | ... and thanks for the open source fast implementations that we
         | can now embed wherever we want or sell
        
       | imjonse wrote:
        | This makes it sound as if the Stanford and Berkeley teams also
        | benefited from the leak, whereas I doubt they lacked official
        | access. So the Alpaca/Vicuna/Koala projects would probably have
        | happened anyway. The leak helped with popularity and demand,
        | and also generated somewhat positive PR for Meta, which makes
        | me think they do not mind the leak that much.
        
         | kmeisthax wrote:
         | Meta is actively trying to take down publicly available copies
         | of LLaMA:
         | https://github.com/github/dmca/blob/master/2023/03/2023-03-2...
        
           | techdragon wrote:
           | Haha good luck with that now... it's in the digital ether
           | available to all on IPFS... at worst you might have to ask
            | around for someone to help you, but it's "distributed" widely
            | enough now that I don't think even a billionaire can put this
            | back in the bottle.
        
           | jhbadger wrote:
           | Given that free alternatives like Vicuna (from the University
           | of California and CMU) are better than LLaMA, are freely and
           | legally available for download, and are compatible with code
           | like llama.cpp, even if every copy of LLaMA is taken down it
           | will have no effect on the development of chatbots. It might
            | even improve things as people who would otherwise go for the
            | better-known LLaMA will move towards these newer, better
            | models.
        
             | Tepix wrote:
             | They are all built on top of Llama...
        
               | jhbadger wrote:
                | Yes, but that doesn't matter now. The University of
               | California has released Vicuna as open source. It doesn't
               | need the Llama model to be installed at this point. Nor
               | do you need any of Meta's code to run it either as you
               | can use llama.cpp (not created by Meta). That's the whole
               | point of the article. It's open source now. There's
               | nothing Meta can do.
        
               | sp332 wrote:
               | This is incorrect. According to the official
               | https://github.com/lm-sys/FastChat#vicuna-weights you
               | need the original Llama weights before applying the
               | Vicuna diff.
        
               | jhbadger wrote:
               | Seriously, you can download the Vicuna model and run it
               | locally with llama.cpp. I've done it!
        
         | Tepix wrote:
         | Right. Most of the fine-tuned models we've seen so far have
         | been by university teams.
         | 
         | Meta is not being very selective here. I applied for the
         | download myself and got the links after two days (using a
         | university email address).
        
       | 0xDEF wrote:
       | The "leak" is being portrayed as something highly subversive done
       | by the darn 4chan hackers.
       | 
       | Before the "leak" Meta was sending the model to pretty much
       | anyone who claimed to be a PhD student or researcher and had a
       | credible college email.
       | 
        | Meta has probably been planning to release the model sooner
        | rather than later. Let's hope they release it under a true open
        | source
       | license.
        
         | mtkd wrote:
         | A cynic might say FB are confident in the limitations of
         | current models and want to pour cold water on the ChatGPT
         | excitement (esp. with what appears to be a goldrush this month
         | to use it to generate numbers and insight into numbers)
        
         | pingwing wrote:
         | Doesn't surprise me that Meta wants everyone to install their
         | code onto their machines, lol.
        
         | whimsicalism wrote:
         | Feel like if they cared they would have at least watermarked
         | the weights in some way but maybe I'm an idiot.
        
         | drowsspa wrote:
          | It sounds like that king who wanted people to overcome their
          | aversion to potatoes. So he put armed guards around the potato
          | fields, but instructed them to be very lax and to allow the
          | people to raid them.
        
           | boppo1 wrote:
           | Tell me more. Real or anecdote?
        
             | grugagag wrote:
             | https://www.farmersalmanac.com/parmentier-made-potatoes-
             | popu...
        
         | DANmode wrote:
         | Seems unlikely at this point if they're machine-gun firing DMCA
         | Notices.
        
         | mrtweetyhack wrote:
         | [dead]
        
         | seydor wrote:
          | It's risky, and they don't benefit. They would prefer to keep
          | plausible deniability.
        
       | throwawayapples wrote:
       | "The training and serving code, along with an online demo, are
        | publicly available for _non-commercial_ use." (from Vicuna's
        | home page.)
       | 
       | In what universe is that "open source"?!
        
         | nailer wrote:
         | Nothing in the article is open source. A proprietary model got
         | leaked and there are other proprietary apps that are stupidly
         | building on the leaked model.
        
         | justinjlynn wrote:
         | It isn't.
        
         | sp332 wrote:
         | They also said they have no plans to release the dataset, so
          | it's not reproducible either.
        
       | irrational wrote:
       | > OpenAI published a detailed blog post outlining some of the
       | principles used to ensure safety in their models. The post
       | emphasize in areas such as privacy, factual accuracy
       | 
        | Am I the only one amused by the phrase "factual accuracy"? How
        | many stories have we read like the one where it tries to ghost
        | light the guy into believing that this year is actually last
        | year? "Oh, your phone must be wrong too, because there is no way
        | I could be wrong." Though, maybe that is what factually accurate
        | means: it is convinced that it is always factually accurate,
        | even though it is not.
        
         | smoldesu wrote:
         | > It is convinced that it is always factually accurate, even
         | though it is not.
         | 
          | I don't think that's true. ChatGPT (or any LLM) isn't convinced
          | of much of anything. It might present something confidently
          | (which is what most people want) but that's a side-effect of
          | its programming, not an indication of how good it feels about
          | the answer. If you reply to _anything_ ChatGPT says with "No,
          | you're wrong." it will try to write a new, confident and
          | satisfying answer that responds to your assertion.
         | 
         | LLMs will always be "wrong" because they have no distinction
         | between fiction and fact. Everything it reads is mapped into
         | language, not concept space or an attitude or a worldview.
        
           | moonchrome wrote:
           | >Everything it reads is mapped into language, not concept
           | space
           | 
           | Umm I'm pretty sure it's discovered concepts through
           | compressing text - it seems perfectly capable of generalizing
           | concepts
        
             | musha68k wrote:
             | Would be interesting to read some unbiased science on that.
             | 
             | My hunch would be that any concept it might detect still
             | depends on 2nd order patterns acquired through training?
             | 
             | The ultimate "book smart" at best, basically.
             | 
             | I'm not a ML scientist though, again would be interesting
             | to read an unbiased evaluation of those aspects.
             | 
             | Edit: downvoted, not sure why though. Isn't this a valid
             | question a lot of non-ML folks would ask themselves?
        
             | mdp2021 wrote:
             | > _it seems perfectly capable of generalizing concepts_
             | 
             | How would you support that perception?
        
               | firatsarlar wrote:
               | With hope and living? It is a dream come true for people.
               | An abstract perception of knowledge is like sniffing a
               | rose. It feels, yes, I get there. This 40,000-page book,
               | wow, will I make time to live it or sniff another daisy?!
        
               | mdp2021 wrote:
               | > _It is a dream come true for people_
               | 
               | For as long as they are an active part of society, they
               | had better live in reality. We need to know what we are
               | dealing with factually, because water is not
               | interchangeable with iron in practice.
               | 
               | The perception of a machine as being able <<to generalize
               | concepts>> may be an illusion, so it requires some
               | support: we want to avoid illusions for the purposes of
               | this context.
        
             | smoldesu wrote:
             | Text compression isn't a deterministic process,
             | unfortunately. Its "concept" of compression is clearly
             | derived from token sampling, in the same way its concept
             | of "math" is based on guessing the number/token that comes
             | next.
             | 
             | While I do agree that ChatGPT exhibits pattern-recognizing
             | qualities, that's basically what it was built to do. I'm
             | not arguing against emergent properties, just against
             | emergent intelligence or even the idea of "understanding"
             | in the first place.
        
             | groffee wrote:
             | [dead]
        
           | rcxdude wrote:
           | There's been quite a few different iterations of ChatGPT and
           | bing with different behaviours in this regard: it depends
           | somewhat on the base GPT version, the fine-tuning, and the
           | prompt. Bing very famously at one point was _extremely_
           | passive aggressive when challenged on basically anything.
           | 
           | And while there's nothing intrinsic to the structure and
           | training goals of LLMs which directs them towards more
           | structured reasoning, it does seem that in the process of
           | learning to predict language they pick up some degree of it,
           | however flawed it may be.
        
             | seba_dos1 wrote:
             | > Bing very famously at one point was extremely passive
             | aggressive when challenged on basically anything.
             | 
             | It still wasn't an indication of how confident it "felt"
             | with its answers. It was just role-playing a more confident
             | and aggressive chat bot than ChatGPT does.
        
               | int_19h wrote:
               | Outside of the roleplay, there's no "it". The thing that
               | actually interacts with the user is the persona. But for
               | the same reason, it doesn't matter what the underlying
               | LLM "really thinks" - it'll be the persona that'll try to
               | con you, or write broken or malicious code.
        
         | rrauenza wrote:
         | I've been googling trying to figure out what "ghost light" is
         | in this context .. did you get an autocorrect for gas light?
        
           | gleenn wrote:
           | Looks like they meant "gaslight" but I did find it on Urban
           | Dictionary:
           | 
           | ghost light
           | 
           | Lighting in a video game that has no apparent source for the
           | light to come from. Its like going out on a bright day, but
           | not being able to find the sun in the sky even though the
           | surroundings are brightly lit. Dead Rising on XBOX is a good
           | example.
           | 
           | http://ghost-light.urbanup.com/2450357
        
             | harrylove wrote:
             | Agree on gaslight as the intended word. Ghost light also
             | has a theatrical origin, still in use today.
             | https://en.m.wikipedia.org/wiki/Ghost_light_(theatre)
        
         | isoprophlex wrote:
         | I find the thing incredibly smart and yet utterly useless at
         | times.
         | 
         | I just spent 20 minutes getting the current iteration of
         | ChatGPT to agree with me that a certain sentence is
         | palindromic. Even when you make it print the unaccented
         | characters one by one, spaces excluded, backwards and forwards,
         | it still insists "Elu par cette crapule" isn't palindromic.
         | 
         | I understand how tokenization makes this difficult but come
         | on... this doesn't feel like a difficult task for something
         | that supposedly passes the LSATs and whatnot.
         | 
         | * _French for "Elected by this piece of shit"_
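          | 
          | (For comparison, the check itself is a few lines of plain
          | Python - a minimal sketch, not something the model produced,
          | assuming stripping accents, spaces and case is fair game:)
          | 
          |     import unicodedata
          | 
          |     def is_palindrome(s: str) -> bool:
          |         # Decompose accents (NFD), keep only letters,
          |         # lowercase, then compare against the reverse.
          |         decomposed = unicodedata.normalize("NFD", s)
          |         letters = [c.lower() for c in decomposed
          |                    if c.isalpha()]
          |         return letters == letters[::-1]
          | 
          |     print(is_palindrome("Elu par cette crapule"))  # True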
        
           | ShannonLimiter wrote:
           | Just ask it to figure it out step-by-step and to remove
           | accents:
           | 
           | GPT-4:
           | 
           | Figure this out step by step and check your work: Remove the
            | accents and tell me if this is a palindrome: Elu par cette
           | crapule
           | 
           | GPT-3.5 (Needs to know that removing spaces and lowercasing
           | is okay):
           | 
           | Figure this out step by step and check your work: Lowercase
            | it and remove the accents and spaces and tell me if this is
            | a palindrome: Elu par cette crapule
        
             | [deleted]
        
         | im3w1l wrote:
         | That was Bing/Sydney. ChatGPT has a very different
         | "personality".
        
         | kolinko wrote:
         | Well, unless they designed it to have zero confidence in
         | itself, we are bound to have situations like this.
         | 
          | When I was trying to troll it by saying that the IPCC had just
          | released a report stating that climate change is not real, and
          | that they were completely wrong after all, it properly said
          | that this is not very likely and that I'm probably mistaken. It
          | admitted that it doesn't have internet access, but still
          | refused to believe the outrageous thing I was saying.
         | 
         | I can also imagine GPT's super-low confidence leading to errors
         | in other places - e.g. when I mistakenly claim that it's wrong,
          | and it sheepishly takes my claim at face value.
         | 
         | Finally, considering that the whole world is using it,
          | including some people detached from reality, I would really
          | prefer it to be overconfident than to follow someone down some
          | conspiracy hole.
        
           | Guvante wrote:
            | For better or worse, in the current age of the internet,
            | prose is a good first-pass filter for quality.
           | 
           | Someone arguing incoherently is seen as less believable.
           | 
           | Unfortunately the prose for these chat models doesn't change
           | based on how certain it is of the facts. So you can't tell
           | based on how it is talking whether it is true or not.
           | 
            | Certainly people online speak well while lying, either
            | intentionally or unintentionally, but usually
            | well-intentioned people will couch things they aren't as
            | certain about, helping to paint a more accurate picture.
           | 
           | I haven't taken a deep dive on the latest models but
           | historically most AI haven't worried about "facts" as much as
           | associating speech patterns. It knows how to talk about facts
           | because other people have done so in the past kind of thing.
           | 
           | This means you need to patch in arbitrary rules to
           | reintroduce some semblance of truth to the outputs which
           | isn't an easy task.
           | 
           | False training is a whole different area IMO. Especially when
           | there is a difference between responding to a particular user
           | and responding to everyone based on new information.
        
         | abstractbill wrote:
         | The models are a lot of fun to play with, but yeah, every time
         | I've tried to use them for something "serious" they nearly
         | always invent stuff (and are _so_ convincing in how they write
         | about it!).
         | 
         | Most recently I've been interested in what's happened with the
         | 4-color theorem since the 1976 computer-assisted proof, and
          | decided to use ChatGPT instead of Google+Wikipedia. ChatGPT had
          | me convinced and excited that, apparently, the computer-assisted
         | part of the proof has been getting steadily smaller and smaller
         | over the years and decades, and we're getting close to a proof
         | that might not need computer assistance at all. It wrote really
         | convincingly about it! And then I went and looked for the
         | papers it had talked about. They didn't exist, and their
         | authors either didn't exist, or worked in completely unrelated
         | fields.
        
           | riceart wrote:
           | Before the inevitable idiots come in to say hurr durr but
           | have you tried ChatGPT 4... yes I paid for it, and it is just
           | as prone to hallucinations of factual information. It loves
            | to make up new names for people's initials.
        
             | vidarh wrote:
             | While it is still prone to hallucinations, with GPT4 I've
             | had it tell me "X is not a real project, but assuming it
             | exists and is designed to do Y [based on the name], I can
              | provide you with a general guide on how to use it." I never
              | got earlier versions to do that. So it does appear to be
              | _better_, though it's certainly still an issue.
        
             | [deleted]
        
             | dumbaccount123 wrote:
              | Also please refrain from calling other members idiots; next
              | time you won't be warned.
        
             | dumbaccount123 wrote:
              | I found the opposite to be true, I mean sure, if you're
              | tricking it. Wait for GPT 5-6 in a year or two and see,
              | haha.
        
         | senko wrote:
         | I hold a more charitable interpretation.
         | 
         | We (the public) have found an important bug in the system, ie.
         | GPT can lie (or "hallucinate"), even if you try to convince it
         | not to lie. The bug is definitely lowering the usefulness of
         | their product, as well as the public option about it. But I'll
         | let the programmer who has never coded a bug cast the first
         | stone.
         | 
         | I wouldn't be surprised if they're scrambling internally to
         | minimize the problem (in the product, not in public
         | perception). They have also recently added a note to ChatGPT:
         | "ChatGPT may produce inaccurate information about people,
          | places, or facts" which is an acknowledgement that yes, watch
         | out (I compare it to "caution: contents hot" labels).
         | 
         | On the topic of dealing with it, I like the stance that simonw
         | recently took: "We need to tell people ChatGPT will lie to
         | them, not debate linguistics" [0].
         | 
         | I don't attach intentions to a machine algorithm (to me,
         | "gaslight" definitely implies an evil intent), and I don't
         | think OpenAI people are evil, stupid, corrupted or something
         | else because they put out a product that has a bug. But since
         | the wide public can't handle nuances, I'd agree it's better to
         | say "chatgpt lies, use it for things where it either doesn't
         | matter or you can verify; don't use it for fact-finding" to get
         | the point across.
         | 
         | [0] https://simonwillison.net/2023/Apr/7/chatgpt-lies/
        
           | firatsarlar wrote:
            | Meet my darling https://gist.github.com/firatsarlar/5f25ebcc4
            | c33ec484e6cd95b... We need a - pure - clear, non-commercial,
            | non-owned AI first. It - hope, let's say - no, most of us are
            | still guessing, sniffing, ... suspicions about who owns GPT -
            | the LLM we need to abstract. Who owns this - so called - open
            | source product? MS - with the power of NVIDIA's - ? OpenAI
            | people? Why can't we talk about our intentions, or whose
            | product it is, or not generalize it as LLMs. Because, let's
            | say, the public - and ignore our delusions about the thing we
            | made - because we're in a delusion, what is this? People
            | could try to eat and sue some; do we - purely - care for the
            | public? I need to skip my own experience, but, yes, if you
            | are ready to be delusional, you could easily make an LLM your
            | girlfriend, a liar, ... No, we - the software community -
            | need to stop and see what we have. We name it, the public
            | shapes it, some abuse it for money that it made me ..., and
            | sue some ... What is the end product? A constant, a machine -
            | let's say a system - working to make this happen. Justice, we
            | - devs -, the world, ... money, people, public, we're
            | spending time and resources on a thing. Some claim it's mine;
            | yes, do we need a consensus? No. Open source is the only
            | thing, if we - pure - let's say. I do not want to waste my
            | time to figure out or poke FB's LLM, or - please - Open AI
            | girl. Let's learn to share, at least for us - devs -. I'm
            | really tired of thinking about the commercial facts of this
            | thing. I'm really tired of this game we play. MS, say "I have
            | it", and make it yours. Boy - sorry, OpenAI - I do not know
            | you; is it yours, MS's, FB's? They somehow managed to make a
            | thing - international, in Paris. Skip gaslighting, delusions;
            | what do you own?
            | 
            | Are we really there?
        
           | [deleted]
        
           | dylan604 wrote:
           | >as well as the public option about it.
           | 
            | assuming you mean the public opinion about it, and I would
            | have to agree that I do hold ChatGPT in very low regard
            | because of this. people will argue that it is impressive that
            | they built this thing that can do what it does, but yeah, it
            | might not be totally accurate. so my response is, what's the
            | point of it then?
           | 
            | say someone was to invent a robot baseball player that can
            | throw the ball 200+mph, or hit it about a literal mile, but
            | it can't hit a target with that 200+mph ball or hit the ball
            | without it going foul. yes, the tech would be impressive, but
            | what's the point? yeah yeah, language is hard, but it's just
            | an example of building something impressive that at best
            | almost does the job it was designed for. unless we're saying
            | it's designed this way on purpose??
        
           | vlovich123 wrote:
           | It's not a bug. It's an architectural defect / limitation in
           | our understanding of how to build AI. That makes it a
           | strictly harder problem that will take longer. And it's not
           | totally clear to me that you'll get there purely with LLMs.
            | LLMs accomplish a good chunk of what we classify as
            | intelligence, for sure. But they're missing the cognition /
            | reasoning skills, and the open question is whether you can
            | solve that by just bolting more techniques onto the LLM or
            | whether you need a totally different kind of model that you
            | can marry to an LLM.
        
             | vidarh wrote:
             | GPT 4 will admit to not knowing things in many cases where
             | 3.5turbo does not (tested the same prompt), and either will
             | stop there or go off on a "but if it did exist it might go
             | something like this" type continuation. It still
             | hallucinates a lot, but it's not at all clear that this
             | will be all that difficult an issue to solve given the
             | progress.
        
               | vlovich123 wrote:
               | We generally only hallucinate while dreaming / using our
               | imagination. And we can distinguish those two states.
               | Admitting lack of knowledge is of course good but, for
               | example, if you ask it to write some code that isn't
               | boilerplate API integrations, it'll do so happily even
               | when it's wildly wrong and it can't tell the difference,
               | and that is also the case with GPT4 afaik. Moreover, you
               | can't solve it through prompt engineering because there's
               | clearly a lack of context it's unable to understand to
               | figure out what non-trivial thing you're asking it.
        
               | vidarh wrote:
               | The point is there's been progress in making it admit
               | when it doesn't know, and we simply don't know how fast
               | that will improve in future versions. It may continue to
               | be an issue, or turn out to be easily solved. The
               | improvement with GPT4 does suggest it is at least
               | possible to make it recognise its limits.
        
       ___________________________________________________________________
       (page generated 2023-04-09 23:00 UTC)