[HN Gopher] What does copyright say about generative models?
       ___________________________________________________________________
        
       What does copyright say about generative models?
        
       Author : BerislavLopac
       Score  : 31 points
       Date   : 2022-12-15 09:00 UTC (14 hours ago)
        
 (HTM) web link (www.oreilly.com)
 (TXT) w3m dump (www.oreilly.com)
        
       | theGnuMe wrote:
       | Isn't this just the riff sampling thing? So depending on the
       | output you could be infringing.
       | 
       | Like how Vanilla Ice "stole" Queen's bass line.
       | 
       | https://en.wikipedia.org/wiki/Sampling_(music)
        
         | dcow wrote:
         | Which is ridiculous. You can't copyright 8 notes of music. This
         | is the festering disease we need to rid ourselves of.
        
           | lcnPylGDnU4H9OF wrote:
           | Well, now I'm disappointed. I was going to point to this TED
           | talk (https://www.ted.com/talks/damien_riehl_copyrighting_all
           | _the_...) but it appears to have been removed as the video no
           | longer loads for me. However, this (https://www.ted.com/talks
           | /damien_riehl_why_all_melodies_shou...) appears to be an
           | abridged version of the same talk.
           | 
           | The disappointing bit: I say it's an abridged version because
           | I distinctly remember him talking about how he actually
           | claimed copyright for _all_ 8-note melodies under a
           | permissive license, which I can't find in the one that
           | actually loads. He "brute forced" every 8-note melody with a
           | program, saved them to a disk and claimed copyright under (I
           | believe) the MIT license. (I can see legal issues with doing
           | that, of course, so it's not hard to imagine why the video
           | was replaced with a different one.)
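
A rough sketch of what such a brute-force enumeration might look like, assuming eight pitches (one octave of a C major scale, chosen here purely for illustration) and melodies of eight notes as described in the comment above. The names `PITCHES`, `count_melodies`, and `iter_melodies` are hypothetical and are not taken from the talk or the actual project.

```python
from itertools import islice, product

# Assumption: one octave of a C major scale as the pitch set (illustrative only).
PITCHES = ["C", "D", "E", "F", "G", "A", "B", "C'"]


def count_melodies(num_pitches: int, length: int) -> int:
    """Number of distinct note sequences: each position can hold any pitch."""
    return num_pitches ** length


def iter_melodies(pitches, length):
    """Lazily yield every sequence of `length` notes drawn from `pitches`."""
    return product(pitches, repeat=length)


# Total size of the search space for 8 pitches and 8 notes.
total = count_melodies(len(PITCHES), 8)

# Peek at the first few melodies without materializing all of them.
first_three = list(islice(iter_melodies(PITCHES, 8), 3))

print(total)
for melody in first_three:
    print(" ".join(melody))
```

The generator is lazy, so the full space is never held in memory; writing every sequence to disk would just be a loop over `iter_melodies`.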
        
           | an_ko wrote:
           | Damien Riehl and Noah Rubin have all the melodies on a hard
           | drive. (More info in this Adam Neely video:
           | https://www.youtube.com/watch?v=sfXn_ecH5Rw and the TED talk
           | linked in a nearby comment.) So depending on a court's kinda
           | arbitrary definition of creativity on any given day, all of
           | them may or may not be copyrighted already.
           | 
           | An infringement being accidental also doesn't seem to stop
           | copyright holders from successfully suing people. I think
           | this means you could technically be successfully sued for
           | dropping some bricks on a piano, since no matter where they
           | land, the action constitutes a public performance of a
           | copyrighted work. Fun stuff.
        
             | dcow wrote:
             | Which is most certainly not the original intention of
             | copyright, if we are to even give the concept credence. I'm
             | of the opinion that the entire system of western copyright
             | _law_ is absolutely broken. You definitely _can't_
             | infringe on copyright by dropping bricks on a piano, in a
             | world with a flourishing creative arts scene, anyway.
             | Something went terribly wrong.
             | 
             | PS: I also second the Adam Neely video. I was actually
             | wondering if someone would link it (:
        
           | kevingadd wrote:
           | What's the threshold you _can_ copyright? 9 notes? 16? Why is
           | that the threshold and not 8?
           | 
           | I may not agree that it should be 8, but I don't see any
           | rigorous or well-reasoned model here to explain why "you can
           | copyright 8 notes" is a festering disease when it seems like
           | people are not reasoning through why something should or
           | shouldn't be copyrightable.
        
           | geoelectric wrote:
           | Robin Thicke wasn't exactly a sympathy-inspiring celebrity,
           | but the Blurred Lines case was another egregious example.
           | 
           | As someone who grew up in the 80s and 90s, I also really wish
           | we'd gotten the explosion of almost completely sample-based
           | music we were starting to see with bands like the Beastie
           | Boys, the KLF, and Pop Will Eat Itself. The Biz Markie
           | sampling case basically shitcanned an entire nascent
           | subgenre.
        
       | dcow wrote:
       | Copyright as we know it is, simply put, entirely broken.
       | 
       | There is no world imaginable where the concept of _owning_ ideas
       | should be protected. Imagine if chefs had to license recipes...
       | 
       | The arts communities need to take cues from the scientific
       | community where _citations_ are the cultural norm. And as a
       | society we need to figure out how to protect artists by
       | protecting and celebrating the _expression_ of ideas. Not the
       | ideas themselves.
        
         | thewataccount wrote:
         | > And as a society we need to figure out how to protect artists
         | by protecting and celebrating the expression of ideas.
         | 
         | GitHub Copilot would (until they manually changed it) spit out
         | the famous fast inverse square root function word for word,
         | comments and everything, which is more than just the idea (why
         | would you want the comments with them swearing?).
        
           | dcow wrote:
           | I'll admit, writing something down is where it gets very
           | murky.
           | 
           | On one side, you have music and musicians, where the written
           | form is somewhat of a meaningless tool used to produce
           | creative expression, which is the performance of sound on an
           | instrument.
           | 
           | On the other hand you have authors (and recently, software
           | engineers), where the creative expression, as most understand
           | it, is in the choice and arrangement of words (or statements).
           | 
           | You can't really resolve the two. And so I lean towards a
           | stance that it is fraught to try and build a flourishing
           | creative society on the idea that simply writing an idea down
           | (or saving a file of data) means you own it and can legally
           | bring a case against someone else who happens to write the
           | same idea down, be it music or a book.
           | 
           | And then there's https://libraryofbabel.info, which contains
           | all possible combinations of characters that ever have been,
           | or ever will be, written. Should that be copyrightable?
           | 
           | Anyway, I am of the opinion that Copilot _does_ infringe on
           | copyright _as we know it_. But I'm also of the opinion that
           | simply giving individuals universal claim to any written text
           | they produce is problematic. And to answer your question,
           | maybe I _do_ want the comments. Maybe they communicate
           | something that the code cannot, swearing be damned.
           | 
           | I'd prefer a culture of citations for written/stored/saved
           | works. And a culture of celebrating performance of the arts,
           | not storage of them.
        
         | EMIRELADERO wrote:
         | > There is no world imaginable where the concept of owning
         | ideas should be protected.
         | 
         | Copyright does not apply to ideas, it applies to specific
         | expressions/implementations of those ideas, which is what it
         | seeks to incentivize. The fact that many companies
         | (particularly Disney) sued people who only copied the ideas and
         | not the expressions shows a failure of the legal system, not
         | copyright.
        
           | dcow wrote:
           | Why, then, is one artist even able to bring a case against
           | another artist for composing a song with a riff similar to
           | one they published? It seems it's not so simple. Perhaps I
           | should have been more specific to say that _copyright law_ as
           | it exists in society today is rather broken so as not to
           | disparage the original intent of copyright as a concept. But,
           | I feel like we're just talking semantics, really. The point
           | remains that the implementation is broken, copyright _as we
           | know it_.
        
             | kevingadd wrote:
             | Because riffs aren't ideas, they're expressions of ideas.
             | This seems pretty obvious? The concept of a riff is not
             | copyrighted, but a specific one apparently can be.
             | Obviously copyright overreach is a problem today, but
             | you're attacking the wrong problem. Musical works can be
             | plagiarized by other musicians and it _does happen_,
             | sometimes by accident (due to modern technology and
             | attribution problems), sometimes on purpose.
             | 
             | The developer of a notable mobile game had to pull some
             | music from their title a couple years ago because the
             | composer they hired blatantly plagiarized some other works,
             | for example.
        
             | [deleted]
        
         | [deleted]
        
       | Guthur wrote:
       | It should say that the notion of intellectual property is
       | nonsensical.
        
       | Rochus wrote:
       | > _What was originally intended to protect artists has turned
       | into a rent-seeking game in which artists who can afford lawyers
       | monetize the creativity of artists who can't._
       | 
       | It's rather a rent-seeking industry. The vast majority of artists
       | benefit only marginally from copyright. One original intent, to
       | give an income to composers who would otherwise be out of the
       | monetary loop, is long forgotten. Instead, early on it was all
       | about protecting the publishers' business by restricting copies.
       | Ironically, composers (or musicians in general) today earn best,
       | on average, when they work as clerks for a copyright collecting
       | society. I don't think patent and copyright law can be fixed; it
       | can only be made even more complicated and unwieldy, so that it
       | gets even further away from composers and authors, and instead
       | plays into the hands of trolls or monopolists.
       | 
       | > _fixing copyright law to accommodate works used to train AI
       | systems, and developing AI systems that respect the rights of the
       | people who made the works on which their models were trained_
       | 
       | And then also charge each student of art for studying original
       | works with the intent to create new works based on what they have
       | learned to make a living? This idea can be extended in any
       | direction and quickly leads to a system that is rather against an
       | open society where people benefit from each other.
        
         | ROTMetro wrote:
         | The system sure seems to have created a ton of music. I know
         | many people who have benefited from being able to actually earn
         | money from their work and would strongly disagree with you. I
         | respect your opinion, but that is all you have written here,
         | your opinion, not some great truth about copyright.
        
       | Imnimo wrote:
       | I'm not sure I buy this line of reasoning about "inputs" rather
       | than "outputs". If there is some prohibition about using an image
       | as an input, regardless of whether any vestige of that image
       | exists in an output, doesn't that equally prohibit using an image
       | to train a neural network that just says whether an image is a cat
       | or a dog? Or how about a network that just tries to denoise a
       | photo from your camera?
        
       | akira2501 wrote:
       | > Copilot itself is a commercial product that is built on a body of
       | training data, even though it is completely different from that
       | data. It's clearly "transformative."
       | 
       | Is it, though?
       | 
       | The article wrestles with the notion of the gap between "idea"
       | and "expression." To me, I wonder if this is the same gap. The
       | training data is equivalent to the "idea," and the output of
       | using that training data in a particular way is the "expression."
       | 
       | In this view, the result of your training isn't transformative,
       | and it might not even be something you can claim copyright over.
       | What is it other than a particular arrangement of facts that have
       | been fed into it? Merely adding weights in a high-dimensional
       | space does not seem "transformative."
       | 
       | This article feels like it's wrestling with the wrong side of the
       | problem.
        
       | avereveard wrote:
       | You already get protection for characters and stories, so it's not
       | like AI will destroy publishing, and it's not like existing
       | content isn't protected already.
       | 
       | Copyright doesn't protect skill, as it doesn't protect
       | algorithms, because these are tools to create and not finished
       | products.
        
       | seydor wrote:
       | Why do society and technology have to constantly adapt to
       | ancillary legal requirements, while the laws themselves rarely
       | adapt (e.g. never expire)?
        
       | nonrandomstring wrote:
       | Copyright vs. AI
       | 
       | I'll grab some popcorn. This could be one of the all time epic
       | battles.
       | 
       | Earlier someone posted, and then deleted, an Ask HN: "Are artists
       | fighting AI Art repeating Metallica versus Napster?"
       | 
       | This is actually an entertaining question, because it brings in
       | the power of the entertainments business.
       | 
       | Where is Napster today? Didn't Metallica win that one? It might
       | not be a great comparison, but what if RIAA, MPAA, Sony and the
       | game industry decide that "generative AI" occupies the same
       | threat space as "piracy"?
       | 
       | It was of course Metallica and an army of ten thousand lawyers
       | and goonies from a vast, wealthy, moribund industry that actually
       | _did_ manage to block the road of progress and frighten the genie
       | back into the bottle.
       | 
       | In fact, the power of the film and music industry to shape
       | technology has been so immense, you have to wonder whether they
       | could do it again over "AI".
       | 
       | Right now I think the entertainments industry is shitting itself
       | over LLM technology, but is split over whether it can gain enough
       | control to allow it on its own terms, or mobilise to fight it.
       | We haven't yet reached the stage of commodity proliferation. That
       | will be the watershed.
        
         | amelius wrote:
         | > I'll grab some popcorn. This could be one of the all time
         | epic battles.
         | 
         | No, it will be boring, the outcome is clear: those with the
         | deepest pockets will win this battle.
         | 
         | Just look at Disney who had copyright law changed back in the
         | previous century.
        
         | geoelectric wrote:
         | The artists didn't exactly win Metallica vs. Napster either.
         | 
         | The end result was a compromise that still funneled some degree
         | of money into the labels, with the artists getting an equally
         | raw deal as before proportionally speaking--but now with a
         | micro share of your $10/mo to Spotify or wherever instead of
         | their share of $10+ for a single album.
         | 
         | I'm not saying the prior business model was sustainable (at
         | least ethically) but at the end of the day, "if you can't beat
         | them, join them" is still one hell of a compromise to make.
        
           | nonrandomstring wrote:
           | Yes I think you're right. The technology was tamed and
           | brought to heel. It starts out looking "disruptive". What
           | will that look like when Big Media figure out how to take
           | legal control of generative AI and become effective arbiters
           | of all that can be cheaply, mechanically created?
        
       | kmeisthax wrote:
       | >But how much of a song or a painting can you reproduce?
       | 
       | The reason why fair use is vague is specifically to confuse
       | people who ask these kinds of questions. The Supreme Court needed
       | a tool that artists could use to legally smack down people who
       | republish fragments of other people's work, but didn't want to
       | abolish the 1st Amendment in the process. So basically judges
       | have the final say as to whether or not something is novel
       | creativity or in debt to the original. Any hard-and-fast rule
       | beyond "binding precedent applies" is effectively copyright
       | abolition by degrees.
       | 
       | >We lost most of Elizabethan theater because there was no
       | copyright. [..] Without some kind of protection, authors had no
       | interest in publishing at all, let alone publishing accurate
       | texts.
       | 
       | This is a dated example, if only because creative works leave a
       | lot more evidence now than they used to. People today will act to
       | preserve art _against the artist's own wishes_ and at great
       | personal risk.
       | 
       | >and it's easy to suspect that the actual payments will be
       | similar to the royalties musicians get from streaming services:
       | microcents per use
       | 
       | Given the amount of data these systems need (read: more than
       | humanity can provide) I'd say microcents is arguably too high.
       | Remember that you can't actually derive a clear chain of value
       | between one particular training set entry and one particular
       | execution of the model. It's all chucked into a blender that runs
       | on almost-linear algebra and calculus. At best you can detect if
       | parts of the image resemble specific training set examples[0] and
       | pay people slightly more if the model regurgitates training set
       | data.
       | 
       | Let's also keep in mind that a good chunk of the licensing system
       | is based on being able to say no to specific users, or write very
       | tailor-made licensing agreements for specific works or
       | conditions. That's still going to be threatened, even if we can
       | pay sub-Spotify-tier royalties every time a model trains itself
       | on your work.
       | 
       | >It is easy to imagine an AI system that has been trained on the
       | (many) Open Source and Creative Commons licenses.
       | 
       | Working on it: https://github.com/kmeisthax/PD-Diffusion
       | 
       | The thing is, we _already have_ a good database of reusable,
       | public-domain, no-attribution-necessary images; it's called
       | Wikimedia Commons. I really can't fathom why OpenAI didn't start
       | there, other than just an assumption that they were entitled to
       | larger datasets or a feeling that they could get established
       | before anyone sued.
       | 
       | Even then, OpenAI already tried this with computer code and
       | they're getting sued for it anyway, because they never bothered
       | with attribution in the case of training set regurgitation.
       | 
       | [0] This is possible because part of the prompt guidance process
       | involves a thing called CLIP which can do both image and text
       | classification in the same coordinate system.
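
To illustrate the footnote: if images are mapped to vectors in a shared coordinate system, resemblance to a training example can be approximated by cosine similarity between vectors. The sketch below uses tiny made-up vectors rather than real CLIP embeddings, and `flag_near_duplicates`, the entry names, and the 0.95 threshold are hypothetical choices for illustration, not part of CLIP or any particular model pipeline.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def flag_near_duplicates(output_vec, training_vecs, threshold=0.95):
    """Return names of training entries whose vectors closely match the output.

    `threshold` is an arbitrary illustrative cutoff, not a recommended value.
    """
    return [
        name
        for name, vec in training_vecs.items()
        if cosine_similarity(output_vec, vec) >= threshold
    ]


# Toy stand-ins for embeddings; real ones would have hundreds of dimensions.
training = {
    "train_a": [1.0, 0.0, 0.0],
    "train_b": [0.0, 1.0, 0.0],
}
generated = [0.99, 0.1, 0.0]

flagged = flag_near_duplicates(generated, training)
print(flagged)
```

A check like this only says that an output sits close to a training example; it does not attribute value to every training entry that influenced the model.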
        
         | Nadya wrote:
         | Just an FYI but your link 404s. I assume it is a private repo.
        
       ___________________________________________________________________
       (page generated 2022-12-15 23:01 UTC)