By Emil Bjerg, journalist and editor
A series of prominent authors and creators have sued OpenAI for using copyrighted material to train their AI. In the wake of the lawsuit, we trace the central legal conflicts that may play a defining role in the future of generative AI.
While NFTs were pitched as a handy way to protect artworks in our digital times, the current AI boom presents artists and authors with new trouble protecting their craft.
Image-generating AIs like Midjourney and Dall-E and text-generators like ChatGPT and Bard have hit the mainstream in the last year. You can ask Midjourney or Dall-E to create images in the style of a particular artist, just like you can prompt ChatGPT to write like your favorite author. Want a Monet painting of a modernist building or Hemingway writing about your cat? Everything is possible with generative AI.
Generative AI operates through a method known as deep learning, a subset of machine learning, where algorithms called neural networks are trained to recognize patterns in data. These neural networks are fed vast amounts of data – visual and textual – to analyze and learn from it.
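To make the idea of "recognizing patterns in data" concrete, here is a deliberately tiny, illustrative sketch (not any company's actual system): a single artificial "neuron" with one learnable weight is shown example pairs and nudged, via gradient descent, until it captures the pattern y = 2x. Deep networks behind tools like Stable Diffusion or ChatGPT apply the same basic mechanism with billions of parameters and vast image and text datasets.

```python
# Toy example: one "neuron" learns the pattern y = 2x from examples.
# This is the core training loop of deep learning in miniature.

examples = [(1, 2), (2, 4), (3, 6), (4, 8)]  # (input, target) pairs
weight = 0.0           # the model's single learnable parameter
learning_rate = 0.01

for epoch in range(1000):
    for x, target in examples:
        prediction = weight * x
        error = prediction - target
        # Nudge the weight in the direction that reduces squared error.
        weight -= learning_rate * 2 * error * x

print(round(weight, 2))  # converges close to 2.0
```

The legal disputes below arise precisely because the "examples" fed to real systems are not synthetic number pairs but millions of human-made images and texts.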
As Harvard Business Review writes, that gives generative AI an intellectual property problem. Generative AI can seem like magic when it's really (just) a set of tools that skillfully merge and mix materials created by humans – texts and images that would usually be protected under intellectual property law.
The picture or text that a generative AI offers users will, to a varying degree, be inspired by artists and authors, even if it's then mixed into something entirely new – not unlike the deconstruction that takes place when blending fruit and vegetables into a smoothie.
This creates a complex quandary that’s led to a series of legal conflicts.
Controversies in the art world
Earlier this year, illustrators Sarah Andersen, Kelly McKernan, and Karla Ortiz sued the companies Midjourney, DeviantArt, and Stability A.I. Ltd (the creator of Stable Diffusion) in California. In the lawsuit, the artists called the generative AIs "21st-century collage tools that violate the rights of millions of artists." So far, the District Judge handling the case has said he is inclined to dismiss the lawsuit, stating that the group of artists needs to "provide more facts" about the alleged copyright infringements.
Do AI-generated pictures deserve copyrights?
The discussion has gone beyond humans arguing that human art should be protected by copyright. Earlier this year, Stephen Thaler, a computer scientist, sought to overturn a ruling that denied AI-generated art its own copyright. In doing so, he wanted to secure copyright for work created with his "creativity machine" – an algorithm that blends pictures to create something new. With the court finding that his work "lacks the human authorship necessary to support a copyright claim," Thaler lost his attempt at copyrighting his works.
So far, neither visual artists nor artists using AI have succeeded in securing copyright protection for their work in relation to AI. But big companies are also suing AI image generators in what might become more defining cases.
Stockphotos versus AI image generators
In a potential landmark case, Getty Images has taken legal action against Stability AI, accusing the AI developer of infringing on copyrighted photographs and misusing copyright management information. Stable Diffusion is trained on more than 100 terabytes of data containing two billion images, including copyrighted Getty stock photos.
Getty alleges that Stability AI unlawfully copied photos from its platform to train its Stable Diffusion AI image generator, using over 12 million images in violation of Getty's terms of use.
In the lawsuit, Getty highlights that the AI-generated content often showcases altered versions of Getty Images’ watermarks, suggesting a direct link between the copyrighted images and the AI’s output. Getty further claims that Stability AI intentionally tampered with its watermarks and metadata to hide copyright breaches.
Stability AI, on the other hand, claims that their use of the pictures falls under fair use because their technology alters and transforms the images.
This case is predicted to be vital, as it's among the first in which a prominent company challenges a generative AI developer over the unauthorized use of copyrighted content for training. It could therefore establish a precedent for how AI systems handle copyrighted materials.
Lawsuits spilling over to texts and Big Tech
Prominent authors, including Jonathan Franzen and George R.R. Martin, have recently initiated legal action against OpenAI, voicing concerns over the AI company’s impact on the creative sector. The lawsuit, supported by the Authors Guild and filed by over a dozen writers, alleges that OpenAI used their books without permission to train its ChatGPT, thereby infringing on their copyrights. The authors argue that the chatbot can create “derivative works” resembling their books, potentially undermining the market for their original writings. They emphasize that they received neither compensation nor notification from OpenAI.
Previously, other notable creators and authors have sued OpenAI on similar grounds. Earlier this year, comedian Sarah Silverman sued OpenAI and Meta for alleged copyright infringement. The latest lawsuit brings the number of prominent ongoing copyright cases against OpenAI to at least three.
The complaint highlights that OpenAI, despite not disclosing its training data, has acknowledged using copyrighted content. It further claims that ChatGPT can generate book summaries with details not found in public reviews, indicating the full books were likely used for training.
The likely defense from generative AI companies: The fair use doctrine
Experts in copyright say that in cases like these, as is the case with Stability AI, the AI companies are likely to use the “fair use doctrine” in their defense. The doctrine allows for using a work without permission for particular use cases like research, reporting, teaching, or criticism.
A few legal precedents are likely to influence the ongoing AI copyright debates.
One of them, a 2015 ruling, declared Google's digital scanning of books for its library to be "fair use," emphasizing that it didn't serve as a "significant market substitute" for the original works. This poses a challenge for OpenAI, as proving a lack of competition with original content is crucial.
Whether AI training will be added to the fair use doctrine is for future court cases to tell.
AI and IP: What does the future look like?
Generative AI has led the world into uncharted territory regarding copyright. We're in the middle of it, and the picture is messy and full of conflicts. While independent artists fighting potential copyright infringements could have some impact on the future of generative AI, it's the latter cases, opened by Getty and a series of renowned writers, that might most deeply affect the way generative AI works.
With a prominent company and a series of prominent authors suing some of the biggest generative AI companies, these cases could jeopardize the whole business model of generative AI. Generative AI works by scraping massive amounts of often copyrighted data – without it, the AIs have little to create with.
So what are some possible solutions to the quandary?
One possibility might be giving artists, authors, and creators the choice to opt out of having their work used to train generative AI. The artist and technologist Holly Herndon, along with her partner Matt Dryhurst, has created the site haveibeentrained.com, which allows creators, from writers to visual artists, to see if their work has been used to train generative AI. If the AI company agrees, the material will subsequently be removed from the training set. So far, two big AI companies – Hugging Face and Stability AI – have agreed to follow the wishes of creators. By August this year, 1.4 billion images had been requested for opt-out. The site, among her other AI projects, landed Herndon a spot on the 2023 TIME100 AI list.
Another option might be a compensation system for artists, authors, and creators. With Firefly, Adobe is pursuing a model of compensation for contributors to the data used to train its generative AI. Recently, Adobe's president of digital media, David Wadhwani, said to CNBC: "We want to be very clear that we are doing this in a way that will ultimately be commercially good for them. And we're committing to making sure that we compensate them for revenues generated from Firefly."
Clearly, a legal framework is needed to establish common ground for the use of copyrighted works. Creators are concerned about the dilution of their work, while major companies are apprehensive about potential copyright violations through the use of generative AI. Some of that common ground will be created in the courts, some by the demands of artists, and some by generative AI companies who might find a way to meet creators halfway.