Authors sue OpenAI for copyright infringement, claim ChatGPT unlawfully 'ingested' their books

Name: ChatGPT company facing lawsuit over data privacy
Uploaded: 2023-06-29T16:23:36-04:00
Duration: 4 min 13 s
Description: Digits co-founder and CEO Jeff Seibert says AI companies should be more ‘transparent’ on their data usage and says the government should start ‘developing guidelines’ on how AI should be trained.

Authors Paul Tremblay and Mona Awad filed a class-action complaint in California federal court alleging OpenAI broke copyright law by training its software to "ingest" their books without permission.

ChatGPT, a large language model, is "trained" by copying massive amounts of text and extracting expressive information from it to form a compilation of input material known as the "training dataset," according to the complaint filed in U.S. District Court in San Francisco.

The lawsuit says neither Tremblay nor Awad, both writers who live in Massachusetts, consented to the use of their copyrighted books as training material for ChatGPT. Nonetheless, "their copyrighted materials were ingested and used to train ChatGPT."

Tremblay owns registered copyrights in several books, including "The Cabin at the End of the World." Awad owns registered copyrights in several books, including "13 Ways of Looking at a Fat Girl" and "Bunny."

OpenAI is facing a new copyright infringement claim in San Francisco court. (Nikolas Kokovlis/NurPhoto via Getty Images / Getty Images)

"Indeed, when ChatGPT is prompted, ChatGPT generates summaries of Plaintiffs’ copyrighted works — something only possible if ChatGPT was trained on Plaintiffs’ copyrighted works," the 17-page complaint says. "Defendants, by and through the use of ChatGPT, benefit commercially and profit richly from the use of Plaintiffs’ and Class members’ copyrighted materials."

The complaint cites a June 2018 paper in which OpenAI revealed it trained its GPT-1 tool on BookCorpus, a collection of "over 7,000 unique unpublished books from a variety of genres, including Adventure, Fantasy, and Romance."

"OpenAI confirmed why a dataset of books was so valuable: ‘Crucially, it contains long stretches of contiguous text, which allows the generative model to learn to condition on long-range information.’ Hundreds of large language models have been trained on BookCorpus, including those made by OpenAI, Google, Amazon, and others," the complaint notes.

Author Paul Tremblay arrives for the world premiere of Universal Pictures' "Knock at the Cabin" at Jazz at Lincoln Center's Frederick P. Rose Hall in New York City Jan. 30, 2023. He is suing OpenAI for copyright infringement. (Angela Weiss/AFP via Getty Images / Getty Images)

Andres Guadamuz, a reader in intellectual property law at the University of Sussex, told The Guardian the complaint represents the first against OpenAI regarding copyright law.

Joseph Saveri and Matthew Butterick, attorneys representing the authors, told the newspaper that books are ideal for training large language models because they contain "high-quality, well-edited, long-form prose," essentially forming "the gold standard of idea storage for our species."

Authors filed a lawsuit against OpenAI for alleged copyright infringement. (CFOTO/Future Publishing via Getty Images / Getty Images)

"Defendants breached their duties by negligently, carelessly, and recklessly collecting, maintaining and controlling Plaintiffs’ and Class members’ Infringed Works and engineering, designing, maintaining and controlling systems — including ChatGPT — which are trained on Plaintiffs’ and Class members’ Infringed Works without their authorization," the complaint says.

The lawsuit seeks an award of statutory and other damages.

Fox News Digital reached out to OpenAI for comment Wednesday but did not immediately hear back.
