OpenAI faces numerous lawsuits over its use of copyrighted articles, books and artwork to train its generative artificial intelligence (AI) tools.
OpenAI, the company behind the artificial intelligence (AI) chatbot ChatGPT, said it would be “impossible” to train its AI tools without using copyrighted materials.
This comes as OpenAI faces multiple lawsuits related to its use of copyrighted articles, books and art to train ChatGPT. Other AI companies face similar lawsuits.
Generative AI tools are trained on large amounts of content from the internet, which they analyze to learn patterns and generate new human-like content.
“Because copyright today covers virtually every type of human expression – including blog posts, photographs, forum posts, snippets of software code, and government documents – it would be impossible to train today’s leading AI models without using copyrighted materials,” OpenAI said in written evidence presented to the UK House of Lords last month.
The company’s response to an investigation into large language models (LLMs) was first reported by British newspaper The Telegraph.
OpenAI said that “limiting” training data to public domain content “would not deliver AI systems that meet the needs of today’s citizens.”
It added that while the company believes that “copyright law does not prohibit training,” it recognizes that “there is still work to be done to support and empower creators.”
ChatGPT, released in November 2022, has accelerated the advancement of AI tools thanks to its growing popularity over the past year.
But it has also raised concerns that AI tools that produce written content and artwork will lead to job losses across multiple industries.
OpenAI responds to New York Times lawsuit
The New York Times was the latest company to file a lawsuit against OpenAI for copyright infringement, claiming that the AI company owed it “billions of dollars in statutory and actual damages.”
The extensive 69-page lawsuit claims that OpenAI illegally used the New York Times’ work to create artificial intelligence systems that can compete with media companies.
OpenAI’s tools generate “output that recites the Times’ content verbatim, faithfully summarizes it, and imitates its expressive style, as demonstrated by dozens of examples,” the lawsuit claims.
An example in the lawsuit shows text from GPT-4 that closely resembled a 2019 Pulitzer Prize-winning New York Times investigation into the taxi industry.
The lawsuit points out that these tools have also been extremely profitable for OpenAI and for Microsoft, its largest investor.
OpenAI responded this week in a separate blog post addressing the US newspaper’s lawsuit, arguing that training AI models with material available on the internet is “fair use” and that the New York Times’ case was “without merit”.
It said it has worked to forge partnerships with news organizations to “create mutually beneficial opportunities” and said media is a “small slice” of the content used to train AI systems.
The AI company has struck deals with media companies such as the Associated Press and Axel Springer, which owns the media outlets Politico, Business Insider, Bild and Welt, to license their content for training.
OpenAI also said in its blog post that it offers a simple opt-out that lets publishers prevent its tools from accessing their websites.
It added that memorization and regurgitation of training content constitutes a “failure” of the system, which is intended to apply concepts to “new problems.”