More than 8,000 writers have signed an open letter demanding that the companies behind ChatGPT, Bard and other generative AI platforms stop stealing their books to train them.
The letter, which was published by the Authors Guild, is addressed to the CEOs of the main companies in the sector: Sam Altman (OpenAI), Sundar Pichai (Google), Satya Nadella (Microsoft), Mark Zuckerberg (Meta), Emad Mostaque (Stability AI) and Arvind Krishna (IBM).
The authors demand that the companies in question stop using their works without permission to train chatbots such as ChatGPT and Bard. This means not only obtaining the consent of those who wrote the works, but also giving them the corresponding credit and “compensating them fairly,” considering that the material is protected by copyright law.
The writers accuse the major artificial intelligence companies not only of using their work without authorization, but also of obtaining it illegally. Specifically, they mention the use of “notorious piracy websites” to gain access to books that are later included in the datasets used to train platforms such as ChatGPT and Bard.
The signatories include Margaret Atwood (The Handmaid’s Tale), James Patterson (Alex Cross), Jonathan Franzen (The Corrections) and Dan Brown (The Da Vinci Code). The petition remains open for more authors to add their support.
“Generative AI technologies based on large language models owe their existence to our writing. These technologies mimic and regurgitate our language, stories, style, and ideas. Millions of copyrighted books, articles, essays, and poetry provide the ‘food’ for AI systems, endless meals for which there has been no bill. Billions of dollars are being spent to develop AI technology. It is only fair that we are compensated for the use of our writing, without which AI would be banal and extremely limited.”
Open letter from writers against OpenAI, Microsoft, Meta, Google, Stability AI and IBM.
Authors demand compensation for the use of their books to train Bard and ChatGPT
The writers’ claim is not without support. Today you can ask ChatGPT or Bard to write a text in the literary style of a particular author, and the chatbots can do it in a matter of seconds. This is only possible if the language models that power them have prior, extensive knowledge of that author’s works.
In general, companies such as OpenAI, Google and Meta maintain that their artificial intelligence platforms are trained on data that is publicly available on the web. This is not necessarily false, but it is never entirely clear what each company counts as “publicly available.”
Weeks ago it emerged that OpenAI, Sam Altman’s company, is facing a class action lawsuit for allegedly stealing personal data to train ChatGPT. According to the complaint, OpenAI also “systematically collected 300 billion words from the internet, books, articles, websites and publications.” For their part, authors such as Sarah Silverman, Christopher Golden and Richard Kadrey have also taken the Californian startup to court over the alleged misuse of their books in its AI chatbot.
In the open letter, which was released in the last few hours, the Authors Guild warns that chatbots like ChatGPT and Bard are becoming a real threat. “As a result of incorporating our writing into their systems, generative AI threatens to harm our profession by flooding the market with mediocre machine-written books, stories and pieces of journalism based on our work,” it says.
Seeking to mitigate the impact of generative AI
To mitigate the impact of the unauthorized use of their works to train artificial intelligence models, the authors propose several measures. First, that companies obtain permission before including copyrighted material in the training datasets for their chatbots. Second, that writers be fairly compensated for past and current use of their books.
Most strikingly, the writers also ask to be paid for the use of their work in content generated by Bard or ChatGPT, even when that content does not constitute a copyright violation. What does this mean? That if, for example, a chatbot writes “original content” in the literary style of Margaret Atwood, the writer would receive a kind of royalty for it.
It remains to be seen how this story develops. For now, the Authors Guild is not threatening to take the targeted companies to court. But that may depend on what kind of response it gets, if any.
What is interesting about this case is that it exposes a growing concern across different areas of the arts. In the United States, the use of generative AI in Hollywood is under discussion these days, both as a replacement for actors and extras and as a replacement for scriptwriters.
In Europe, meanwhile, the debate around the Artificial Intelligence Act is also putting the spotlight on the use of copyrighted material. In fact, it has already caused the first clashes with OpenAI and other companies, since they would have to document which copyrighted material is used to train ChatGPT and similar platforms.