[HN Gopher] What does copyright say about generative models?
___________________________________________________________________
Author : BerislavLopac
Score  : 31 points
Date   : 2022-12-15 09:00 UTC (14 hours ago)

(HTM) web link (www.oreilly.com)
(TXT) w3m dump (www.oreilly.com)

| theGnuMe wrote:
| Isn't this just the riff sampling thing? So depending on the output you could be infringing.
|
| Like how Vanilla Ice "stole" Queen's bass line.
|
| https://en.wikipedia.org/wiki/Sampling_(music)
| dcow wrote:
| Which is ridiculous. You can't copyright 8 notes of music. This is the festering disease we need to rid ourselves of.
| lcnPylGDnU4H9OF wrote:
| Well, now I'm disappointed. I was going to point to this TED talk (https://www.ted.com/talks/damien_riehl_copyrighting_all_the_...) but it appears to have been removed as the video no longer loads for me. However, this (https://www.ted.com/talks/damien_riehl_why_all_melodies_shou...) appears to be an abridged version of the same talk.
|
| The disappointing bit: I say it's an abridged version because I distinctly remember him talking about how he actually claimed copyright for _all_ 8-note melodies under a permissive license, which I can't find in the one that actually loads. He "brute forced" every 8-note melody with a program, saved them to a disk and claimed copyright under (I believe) the MIT license. (I can see legal issues with doing that, of course, so it's not hard to imagine why the video was replaced with a different one.)
| an_ko wrote:
| Damien Riehl and Noah Rubin have all the melodies on a hard drive. (More info in this Adam Neely video: https://www.youtube.com/watch?v=sfXn_ecH5Rw and the TED talk linked in a nearby comment.) So depending on a court's kinda arbitrary definition of creativity on any given day, all of them may or may not be copyrighted already.
|
| An infringement being accidental also doesn't seem to stop copyright holders from successfully suing people. I think this means you could technically be successfully sued for dropping some bricks on a piano, since no matter where they land, the action constitutes a public performance of a copyrighted work. Fun stuff.
| dcow wrote:
| Which is most certainly not the original intention of copyright, if we are to even give the concept credence. I'm of the opinion that the entire system of western copyright _law_ is absolutely broken. You definitely _can't_ infringe on copyright by dropping bricks on a piano, in a world with a flourishing creative arts scene, anyway. Something went terribly wrong.
|
| PS: I also second the Adam Neely video. I was actually wondering if someone would link it (:
| kevingadd wrote:
| What's the threshold you _can_ copyright? 9 notes? 16? Why is that the threshold and not 8?
|
| I may not agree that it should be 8, but I don't see any rigorous or well-reasoned model here to explain why "you can copyright 8 notes" is a festering disease when it seems like people are not reasoning through why something should or shouldn't be copyrightable.
| geoelectric wrote:
| Robin Thicke wasn't exactly a sympathy-inspiring celebrity, but the Blurred Lines case was another egregious example.
|
| As someone who grew up in the 80s and 90s, I also really wish we'd gotten the explosion of almost completely sample-based music we were starting to see with bands like the Beastie Boys, the KLF, and Pop Will Eat Itself. The Biz Markie sampling case basically shitcanned an entire nascent subgenre.
| dcow wrote:
| Copyright as we know it is, simply put: entirely broken.
|
| There is no world imaginable where the concept of _owning_ ideas should be protected. Imagine if chefs had to license recipes...
|
| The arts communities need to take cues from the scientific community where _citations_ are the cultural norm.
| And as a society we need to figure out how to protect artists by protecting and celebrating the _expression_ of ideas. Not the ideas themselves.
| thewataccount wrote:
| > And as a society we need to figure out how to protect artists by protecting and celebrating the expression of ideas.
|
| GitHub Copilot would (until they manually changed it) spit out the famous fast inverse square root function word for word - comments and everything, which is more than just the idea (why would you want the comments with them swearing?)
| dcow wrote:
| I'll admit, writing something down is where it gets very murky.
|
| On one side, you have music and musicians, where the written form is somewhat of a meaningless tool used to produce creative expression, which is the performance of sound on an instrument.
|
| On the other hand you have authors (and recently, software engineers), where the creative expression, as most understand it, is in the choice of arrangement of words (or statements).
|
| You can't really resolve the two. And so I lean towards a stance that it is fraught to try and build a flourishing creative society on the idea that simply writing an idea down (or saving a file of data) means you own it and can legally bring a case against someone else who happens to write the same idea down, be it music or a book.
|
| And then there's https://libraryofbabel.info, which contains all possible combinations of characters that ever have been, or ever will be, written. Should that be copyrightable?
|
| Anyway, I am of the opinion that Copilot _does_ infringe on copyright _as we know it_. But I'm also of the opinion that simply giving individuals universal claim to any written text they produce is problematic. And to answer your question, maybe I _do_ want the comments. Maybe they communicate something that the code cannot, swearing be damned.
|
| I'd prefer a culture of citations for written/stored/saved works.
| And a culture of celebrating performance of the arts, not storage of them.
| EMIRELADERO wrote:
| > There is no world imaginable where the concept of owning ideas should be protected.
|
| Copyright does not apply to ideas, it applies to specific expressions/implementations of those ideas, which is what it seeks to incentivize. The fact that many companies (particularly Disney) sued people who only copied the ideas and not the expressions shows a failure of the legal system, not copyright.
| dcow wrote:
| Why, then, is one artist even able to bring a case against another artist for composing a song with a riff similar to one they published? It seems it's not so simple. Perhaps I should have been more specific to say that _copyright law_ as it exists in society today is rather broken so as not to disparage the original intent of copyright as a concept. But, I feel like we're just talking semantics, really. The point remains that the implementation is broken, copyright _as we know it_.
| kevingadd wrote:
| Because riffs aren't ideas, they're expressions of ideas. This seems pretty obvious? The concept of a riff is not copyrighted, but a specific one apparently can be. Obviously copyright overreach is a problem today, but you're attacking the wrong problem. Musical works can be plagiarized by other musicians and it _does happen_, sometimes by accident (due to modern technology and attribution problems), sometimes on purpose.
|
| The developer of a notable mobile game had to pull some music from their title a couple years ago because the composer they hired blatantly plagiarized some other works, for example.
| [deleted]
| [deleted]
| Guthur wrote:
| It should say that the notion of intellectual property is nonsensical.
| Rochus wrote:
| > _What was originally intended to protect artists has turned into a rent-seeking game in which artists who can afford lawyers monetize the creativity of artists who can't._
|
| It's rather a rent-seeking industry; the vast majority of artists benefit only marginally from copyright. An original intent, to give an income to composers who would otherwise be out of the monetary loop, is long forgotten; instead, early on it was all about protecting the publishers' business by restricting copies. Ironically, composers (or musicians in general) today earn best, on average, when they work as clerks for a copyright collecting society. I don't think patent and copyright law can be fixed; they can only make it even more complicated and unwieldy, so that it gets even further away from composers and authors, and instead plays into the hands of trolls or monopolists.
|
| > _fixing copyright law to accommodate works used to train AI systems, and developing AI systems that respect the rights of the people who made the works on which their models were trained_
|
| And then also charge each student of art for studying original works with the intent to create new works based on what they have learned to make a living? This idea can be extended in any direction and quickly leads to a system that is rather against an open society where people benefit from each other.
| ROTMetro wrote:
| The system sure seems to have created a ton of music. I know many people who have benefited from being able to actually earn money from their work and would strongly disagree with you. I respect your opinion, but that is all you have written here, your opinion, not some great truth about copyright.
| Imnimo wrote:
| I'm not sure I buy this line of reasoning about "inputs" rather than "outputs".
| If there is some prohibition about using an image as an input, regardless of whether any vestige of that image exists in an output, doesn't that equally prohibit using an image to train a neural network that just says whether an image is a cat or a dog? Or how about a network that just tries to denoise a photo from your camera?
| akira2501 wrote:
| > Copilot itself is a commercial product that is built [on] a body of training data, even though it is completely different from that data. It's clearly "transformative."
|
| Is it, though?
|
| The article wrestles with the notion of the gap between "idea" and "expression." To me, I wonder if this is the same gap. The training data is equivalent to the "idea," and the output of using that training data in a particular way is the "expression."
|
| In this view, the result of your training isn't transformative, and it might not even be something you can claim copyright over. What is it other than a particular arrangement of facts that have been fed into it? Merely adding weights in a high-dimensional space does not seem "transformative."
|
| This article feels like it's wrestling with the wrong side of the problem.
| avereveard wrote:
| You already get protection for characters and stories so it's not like AI will destroy publishing and it's not like existing content is not protected already.
|
| Copyright doesn't protect skill, as it doesn't protect algorithms, because these are tools to create and not finished products.
| seydor wrote:
| Why do society and technology have to constantly adapt to ancillary legal requirements, while the laws themselves rarely adapt (e.g. never expire)?
| nonrandomstring wrote:
| Copyright vs. AI
|
| I'll grab some popcorn. This could be one of the all-time epic battles.
|
| Earlier someone posted, and then deleted, an Ask HN: "Are artists fighting AI Art repeating Metallica versus Napster?"
|
| This is actually an entertaining question, because it brings in the power of the entertainments business.
|
| Where is Napster today? Didn't Metallica win that one? It might not be a great comparison, but what if RIAA, MPAA, Sony and the game industry decide that "generative AI" occupies the same threat space as "piracy"?
|
| It was of course Metallica and an army of ten thousand lawyers and goonies from a vast, wealthy, moribund industry that actually _did_ manage to block the road of progress and frighten the genie back into the bottle.
|
| In fact, the power of the film and music industry to shape technology has been so immense, you have to wonder whether they could do it again over "AI".
|
| Right now I think the entertainments industry is shitting itself over LLM technology, but is split over whether it can gain enough control to allow it on its own terms, or mobilise to fight it. We haven't yet reached the stage of commodity proliferation. That will be the watershed.
| amelius wrote:
| > I'll grab some popcorn. This could be one of the all time epic battles.
|
| No, it will be boring, the outcome is clear: those with the deepest pockets will win this battle.
|
| Just look at Disney who had copyright law changed back in the previous century.
| geoelectric wrote:
| The artists didn't exactly win Metallica vs. Napster either.
|
| The end result was a compromise that still funneled some degree of money into the labels, with the artists getting an equally raw deal as before proportionally speaking--but now with a micro share of your $10/mo to Spotify or wherever instead of their share of $10+ for a single album.
|
| I'm not saying the prior business model was sustainable (at least ethically) but at the end of the day, "if you can't beat them, join them" is still one hell of a compromise to make.
| nonrandomstring wrote:
| Yes I think you're right. The technology was tamed and brought to heel.
| It starts out looking "disruptive". What will that look like when Big Media figure out how to take legal control of generative AI and become effective arbiters of all that can be cheaply, mechanically created?
| kmeisthax wrote:
| >But how much of a song or a painting can you reproduce?
|
| The reason why fair use is vague is specifically to confuse people who ask these kinds of questions. The Supreme Court needed a tool that artists could use to legally smack down people who republish fragments of other people's work, but didn't want to abolish the 1st Amendment in the process. So basically judges have the final say as to whether or not something is novel creativity or in debt to the original. Any hard-and-fast rule beyond "binding precedent applies" is effectively copyright abolition by degrees.
|
| >We lost most of Elizabethan theater because there was no copyright. [..] Without some kind of protection, authors had no interest in publishing at all, let alone publishing accurate texts.
|
| This is a dated example, if only because creative works leave a lot more evidence now than they used to. People today will act to preserve art _against the artist's own wishes_ and at great personal risk.
|
| >and it's easy to suspect that the actual payments will be similar to the royalties musicians get from streaming services: microcents per use
|
| Given the amount of data these systems need (read: more than humanity can provide) I'd say microcents is arguably too high. Remember that you can't actually derive a clear chain of value between one particular training set entry and one particular execution of the model. It's all chucked into a blender that runs on almost-linear algebra and calculus. At best you can detect if parts of the image resemble specific training set examples[0] and pay people slightly more if the model regurgitates training set data.
|
| Let's also keep in mind that a good chunk of the licensing system is based on being able to say no to specific users, or write very tailor-made licensing agreements for specific works or conditions. That's still going to be threatened, even if we can pay sub-Spotify-tier royalties every time a model trains itself on your work.
|
| >It is easy to imagine an AI system that has been trained on the (many) Open Source and Creative Commons licenses.
|
| Working on it: https://github.com/kmeisthax/PD-Diffusion
|
| The thing is, we _already have_ a good database of reusable, public-domain, no-attribution-necessary images; it's called Wikimedia Commons. I really can't fathom why OpenAI didn't start there, other than just an assumption that they were entitled to larger datasets or a feeling that they could get established before anyone sued.
|
| Even then, OpenAI already tried this with computer code and they're getting sued for it anyway, because they never bothered with attribution in the case of training set regurgitation.
|
| [0] This is possible because part of the prompt guidance process involves a thing called CLIP which can do both image and text classification in the same coordinate system.
| Nadya wrote:
| Just an FYI but your link 404s. I assume it is a private repo.
___________________________________________________________________
(page generated 2022-12-15 23:01 UTC)