Authors sue Meta alleging it used pirated books to train its AI systems

The authors claim that their books and other documents were illegally pirated for use in "training" AI systems. They are seeking compensation. Image (c) ConsumerAffairs

Comedian Sarah Silverman, activist Ta-Nehisi Coates say "fair use" doctrine doesn't apply

A group of authors, including Ta-Nehisi Coates and Sarah Silverman, has accused Facebook parent Meta Platforms of using pirated books to train its artificial intelligence systems with approval from CEO Mark Zuckerberg.

Coates is an author, journalist, and activist. He gained a wide readership during his time as national correspondent at The Atlantic, where he wrote about cultural, social, and political issues, particularly regarding African Americans and white supremacy. Silverman is a comedian who first rose to prominence for her brief stint as a writer and cast member on Saturday Night Live.

The authors allege that Meta Platforms CEO Mark Zuckerberg knew about the pirating but approved using the material anyway. Their suit, filed in 2023, claims that Meta trained its AI on works from LibGen, a database that includes millions of pirated books, and distributed them through torrents.

The suit accuses Meta of copyright infringement, arguing that the company misused their books to train its AI system, Llama. Meta has argued that its use of such material falls under the "fair use" doctrine.

The fair use doctrine is a legal principle that allows limited use of copyrighted material without permission from the copyright holder. It is part of U.S. copyright law and provides exceptions to the exclusive rights granted to copyright holders. The fair use doctrine is intended to balance the rights of creators with the public's interest in using creative works for certain purposes like education, commentary, or research.

The authors are now seeking to update their complaint, based on new evidence showing that Meta knowingly used pirated content. They also want to revive certain claims, including allegations that Meta illegally stripped their books' copyright information.

What about fair use?

Whether or not Zuckerberg knew about the copying, the case revolves around whether copying an entire book can possibly be considered fair use.

In determining whether a use qualifies as fair use, courts consider four factors, according to Cornell Law School's Legal Information Institute:

  1. Purpose and Character of the Use: This looks at whether the use is for commercial or non-commercial purposes and whether it transforms the original work. Uses that are educational, non-profit, or transformative (e.g., using a work in a way that adds new meaning or value) are more likely to be considered fair use.

  2. Nature of the Copyrighted Work: This considers whether the work is factual or creative. Factual works are more likely to be subject to fair use than highly creative works (like novels or movies).

  3. Amount and Substantiality of the Portion Used: The less of the work you use, the more likely it is to be considered fair use. However, even a small portion may be too much if it's considered the "heart" of the work.

  4. Effect on the Market or Value of the Work: If the use negatively impacts the market value or potential sales of the original work, it's less likely to be considered fair use. If the use does not affect the market for the original work, it’s more likely to be deemed fair use.

Examples of fair use include:

  • Quoting or paraphrasing a work in a review, commentary, or academic paper.
  • Using brief excerpts of a copyrighted work in a parody or satire.
  • Reproducing a copyrighted work for educational or non-profit purposes.

Fair use is not always clear-cut and often requires a case-by-case analysis or, as in this case, litigation.

The fair use doctrine is discussed in Section 107 of the Copyright Act of 1976 (Title 17 of the U.S. Code). This section outlines the factors that courts should consider when determining whether a particular use of copyrighted material qualifies as fair use.

For further details on the fair use doctrine, you can refer to the U.S. Copyright Office's official page: