Britannica Hits OpenAI with Copyright Suit

Britannica sues OpenAI for allegedly scraping nearly 100K articles to train ChatGPT, claiming 'digital cannibalisation' and trademark harm. OpenAI cites fair use in an escalating battle between AI companies and publishers. Will the case reshape content licensing?

Encyclopedia Britannica has filed a lawsuit against the AI giant OpenAI, alleging massive copyright infringement. It accuses OpenAI of scraping and copying nearly 100,000 Britannica articles, encyclopedia entries, and Merriam-Webster dictionary definitions. OpenAI allegedly used these materials to train GPT-4 and the ChatGPT chatbot, enabling the AI to generate responses that compete directly with Britannica's core products.

OpenAI has stood firm on its fair use defence and has publicly stated that it uses publicly available data. It has not yet filed a formal court response as the case enters its early stages, amid a growing wave of similar legal challenges from content creators globally. The lawsuit is a clash between technological innovation and intellectual property rights: Britannica argues that OpenAI's practices threaten the very foundation of high-quality, human-curated knowledge. It will be interesting to see how OpenAI counters that argument.

Britannica's Allegations Against OpenAI

Sam Altman and OpenAI are currently facing a large-scale series of lawsuits related to data privacy, copyright, and product liability. *Credit: Steve Jurvetson / CC BY 2.0 / Wikimedia Commons*

Britannica and Merriam-Webster allege that OpenAI ingested large portions of their premium content during model training, enabling ChatGPT to produce 'full or partial verbatim reproductions' of articles and definitions. According to the complaint, this goes beyond simple summarisation and amounts to a distillation of the encyclopedia itself. The lawsuit characterises ChatGPT as a 'high-tech photocopier' that repeats exact phrases on subjects ranging from scientific ideas to historical events. Britannica says this undermines its subscription-based business model.

Another damaging claim concerns ChatGPT's retrieval-augmented generation (RAG) system, which allegedly pulls from scraped Britannica material in real time to answer users' questions. The suit calls this 'digital cannibalisation': ChatGPT is said to draw millions of potential visitors away from Britannica's websites, whose advertising and subscription revenue funds its editors, scholars, and researchers. Without that revenue, Britannica argues, its 250-year legacy could be wiped out, leaving the internet flooded with unchecked AI-generated content.

Furthermore, the lawsuit includes claims of trademark infringement under the Lanham Act in addition to copyright infringement. The complaint alleges that ChatGPT frequently "hallucinates", making up details such as incorrect definitions and historical dates and falsely attributing them to Merriam-Webster or Britannica, thereby trading on the brands' reputation and credibility. Britannica says this not only misleads users but also erodes the public trust it has built over centuries. The plaintiffs are seeking substantial damages, disgorgement of OpenAI's profits from the alleged infringement, and a permanent injunction to stop future use of their content.

OpenAI's Defence and Industry Ripple Effects

OpenAI has consistently defended its data practices, asserting that it trains only on 'publicly available information' and that this use is protected under the fair use doctrine. The company describes AI training as a transformative process that turns raw data into novel outputs rather than rote copying. It also emphasises that its AI models encourage broader innovation in education, research, and creativity.

Legal experts anticipate a protracted battle in the case, which was filed on March 13, 2026, in the US District Court for the Southern District of New York. Britannica aims to subpoena OpenAI's training datasets and internal logs to quantify the scope of the copying. The strategy resembles the ongoing lawsuit filed by The New York Times against OpenAI in December 2023, in which the newspaper alleges that OpenAI used millions of its articles without authorisation. Britannica also filed a similar lawsuit against Perplexity AI in the same court in September 2025 over similar issues.

The knock-on effects could be significant. The case is one of a growing number of lawsuits brought by news organisations, music labels, and authors (such as Sarah Silverman and John Grisham) against AI companies for training on protected works. A victory for Britannica could mean billions of dollars in licensing fees, forcing AI developers to negotiate data access agreements and invest in synthetic data creation.

The Future of AI and Publishing

The outcome of this lawsuit is likely to change how information is accessed globally. As of now, OpenAI has about three weeks to respond formally to the allegations. From Silicon Valley in California to Silicon Hills in Austin, experts are watching the case closely. A win for Britannica could lead to "AI taxes," under which companies must pay creators for using their data, helping to keep human-led research financially viable.

For content creators and web publishers, this case signals a major shift in digital strategy. Many publishers are already using invisible watermarks and stronger security tools to prevent AI companies from scraping their content without permission. In the future, high-quality information may stay behind paywalls, accessible only through licensed partnerships. Whether AI giants start paying for human knowledge or continue using the public web for free, the balance of the digital economy is changing.
