Richard Marke, head of Bates Wells’ Corporate and Commercial team, recently spoke at the International Federation of Computer Law Associations (IFCLA) conference in Madrid on the topic of the ongoing battle between artificial intelligence (AI) companies and the publishing industry. Here we summarise the key developments he highlighted.

It is generally acknowledged that AI developers have been using copyright-protected works to train their AI models without the consent of rights holders. Publishers and other content owners have begun to fight back and are asserting their rights more aggressively. There are several ongoing cases, especially in the US.

In the UK, Getty Images has brought a case against Stability AI alleging infringement of its intellectual property rights in the training and operation of Stability’s AI image generation tool.

In parallel, content owners are using ever more sophisticated technical means to limit web scraping. These include CAPTCHAs to ensure that only human users can access content, mandatory log-ins, and rate limiting (which prevents excessive requests from a single IP address).

There is also heavy lobbying to improve legal protection for rights owners. As is often the case, legislation is struggling to keep pace with technological developments, and lawmakers are looking to balance competing interests. The EU AI Act will (from August this year) impose a range of obligations on AI developers, including an obligation to be transparent about the training materials used to develop their AI models. In the UK, an equivalent transparency obligation is hotly debated, with no resolution in sight.
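To give a sense of how the rate limiting mentioned above typically works, the sketch below tracks how many requests each IP address has made within a rolling time window and refuses anything over a set threshold. It is a minimal, purely illustrative Python example; the window length, request limit and function names are our own assumptions rather than any particular publisher’s implementation.

    # Minimal, illustrative per-IP rate limiter (hypothetical example, not production code).
    # Each IP address is allowed a fixed number of requests per rolling time window;
    # anything above that threshold is refused.
    import time
    from collections import defaultdict, deque

    WINDOW_SECONDS = 60             # length of the rolling window (assumed)
    MAX_REQUESTS_PER_WINDOW = 100   # requests allowed per IP in that window (assumed)

    _request_log = defaultdict(deque)  # IP address -> timestamps of recent requests

    def allow_request(ip_address: str) -> bool:
        """Return True if this IP address is still within its request allowance."""
        now = time.monotonic()
        timestamps = _request_log[ip_address]
        # Discard timestamps that have fallen outside the rolling window.
        while timestamps and now - timestamps[0] > WINDOW_SECONDS:
            timestamps.popleft()
        if len(timestamps) >= MAX_REQUESTS_PER_WINDOW:
            return False  # too many recent requests from this IP address
        timestamps.append(now)
        return True

In practice this sort of logic usually sits in a web server or a dedicated service in front of the content, often combined with the CAPTCHAs and mandatory log-ins mentioned above.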

Meanwhile, AI companies seem to sense that the era of the content free-for-all is coming to an end and have been signing licensing agreements with major media outlets. OpenAI has announced deals with multiple media businesses including Axel Springer, News Corp, The Atlantic, the Financial Times, Condé Nast and Le Monde. The news agency Associated Press has announced a deal with OpenAI granting access to AP’s news archive going back to 1985, both for training OpenAI’s models and for providing ChatGPT responses based on that data.

There are also opportunities for publishers and journal owners to capitalise on AI developers’ desire to use their content. In the last year, academic publisher Wiley has signed content licensing deals worth a total of $44 million, showing just how valuable publishers’ data and content can be.

Where publishers, journal owners and content creators enter into licensing agreements with AI developers to use their materials and data, there are risks which these organisations should be aware of, including:

  • inadvertently licensing rights they do not own (publishers often do not have permission from their authors for all the uses AI developers envisage);
  • the potential for AI to misrepresent original authors’ work, leading to inaccuracies and misinterpretations;
  • the risk of false attribution, which can harm authors’ reputations; and
  • the replication of biases present in source materials, resulting in discriminatory outputs.

It is important to put in place well-structured and flexible licensing agreements so that publishers and journal owners can protect their intellectual property and reputation.

If you would like to discuss reviewing your own licensing agreements, or these issues more broadly, get in touch.