How will AI shape IP law?

by Mónika Mercz

Generative Artificial Intelligence hit the world like a storm. It created several possibilities, such as easier production of educational and entertainment content,[1] but it also produced negative outcomes: artists are slowly losing their monopoly over creative work because AI can clone someone’s voice and generate new written content in the style of any specific author.[2] These results of technological innovation bring new challenges with them, most notably in the field of intellectual property law. Who owns the work created by AI? Can a human have the right to control the creation of a generative AI system that replicates their individual “creative process”? What control does the owner even have over the data that the AI system is trained on? These are complex questions, which necessitate a broader framework of analysis.

Several generative AI systems have used copyrighted works as part of their training data without the explicit permission of the copyright holders. A group of artists, Sarah Andersen, Kelly McKernan, and Karla Ortiz, filed a class-action lawsuit[3] against three companies: Stability AI, over Stable Diffusion, a “general-purpose” software program created and released in August 2022; DeviantArt, over its “DreamUp” product, which relies on Stable Diffusion to produce images and is only available to paying DeviantArt customers; and Midjourney, over its commercial product that produces images in response to text prompts. They claimed that these companies infringed the copyright of countless artists by training their models on creative work those artists had uploaded. Ultimately, the judge dismissed all but one claim. He stated that the diffusion process associated with AI training “involves not copying of images, but instead, the application of mathematical equations and algorithms to capture concepts from the Training Images.”

In July of 2023, a group of authors[4] filed a proposed class-action lawsuit against OpenAI, alleging that their written works were unlawfully incorporated into the datasets used to train ChatGPT. Actions were brought against Meta Platforms[5] and Google[6] as well. Two class actions against Meta, Kadrey v Meta[7] and Chabon v Meta[8], were treated jointly. The plaintiffs were authors of books who did not consent to the use of their works as training material for Meta’s AI product, LLaMA. The causes of action in both cases include direct copyright infringement, vicarious copyright infringement, and removal of copyright management information in violation of the Digital Millennium Copyright Act (‘DMCA’). In November of 2023, Meta’s motion to dismiss was granted for every claim except the one alleging that the unauthorized copying of the plaintiffs’ books for purposes of training LLaMA constitutes copyright infringement. It is vital to note, however, that the judge concentrated on the very core of the copyright issue raised by generative AI tools, namely their alleged training on resources made public on the internet.[9] While U.S. case law is slowly being built, and all of these cases serve as an important litmus test for the viability of copyright infringement claims against AI platforms, in the European Union the AI Act has introduced “obligations for providers of general-purpose AI models”, with two distinct requirements related to copyright.

Currently, when applying the fair use doctrine in the U.S., judges consider four factors: the purpose and character of the use; the nature of the copyrighted work; the amount and substantiality of the portion taken; and the effect of the use on the potential market. Generative AI systems tend to be trained on all kinds of works, both factual and creative. The training process uses the copyrighted works in their entirety, but it seems undeniable that training a machine learning model is a quintessential example of transformative use. At this time, the effect of generative AI systems on the potential market for the works they are trained on is unclear, because it remains unresolved whether it is legitimate to explicitly replicate the exact “style” of a human creator, perhaps in a manner that identifies that creator.

Of course, there are arguments in favor of the unfettered use of copyrighted material for training purposes. This justification relies on the analogy between the use of data for training and a human reading copyrighted material, listening to a song, or viewing visual art. However, we must recognize that while it may be theoretically possible for a human to “read” a document in the way a machine learning system does when it is trained, in the sense that both can be described as learning new things, in reality it is practically impossible for any human to learn so much material. At a very high level of abstraction, what “reading” produces when a machine is trained is a system that accurately predicts the logical next word in a sentence. Therefore, the two concepts cannot be treated as the same.

Another argument often made in favor of using prior works as training data is that the training process of AI systems only replicates the ideas found in these works rather than copying the specific expressions of those ideas. However, if left unchanged, the current IP regime will very likely favor a dramatic shift away from human-led creation and towards one where more and more works are generated by machines. In addition, an excessively restrictive technological or licensing regime could have the unintended consequence of favoring early movers and larger players, for example because licensing deals are bilateral and private rather than de jure.

We must move forward carefully, as granting a human no IP rights over what they consider their highly individual artistic style or process of creation may have a chilling effect on both human innovation and creativity.

 

[1] AI for Education: Teach for tomorrow, https://www.aiforeducation.io/

[2] O. M. Ijiga – I. P. Idoko – L. A. Enyejo – O. Akoh – S. I. Ugbane – A. I. Ibokette: Harmonizing the voices of AI: Exploring generative music models, voice cloning, and voice transfer for creative expression, WJAETS, 2023. https://wjaets.com/sites/default/files/WJAETS-2024-0072.pdf

[3] Andersen et al. v. Stability AI et al., Case No. 3:23-cv-00201 (N.D. Cal. filed Jan. 13, 2023)

[4] Silverman et al. v. OpenAI, Case No. 4:2023-cv-03416 (N.D. Cal. filed July 7, 2023)

[5] Kadrey et al. v. Meta Platforms, Case No. 3:2023-cv-03417 (N.D. Cal. filed July 7, 2023)

[6] J.L. et al v. Alphabet, Case No. 3:23-cv-3440-AMO (N.D. Cal. filed July 11, 2023)

[7] Kadrey v. Meta Platforms, Inc. (3:23-cv-03417)

[8] Chabon v. Meta Platforms Inc. (3:23-cv-04663)

[9] Gianluca Campus: Generative AI: admissibility and infringement in the two US class actions against Meta’s LLaMA, Kluwer Copyright Blog, 2024. https://copyrightblog.kluweriplaw.com/2024/01/17/generative-ai-admissibility-and-infringement-in-the-two-us-class-actions-against-metas-llama/