This is not the first time that the publishing world has complained about the behaviour of big tech. The New York Times is also currently in dispute with OpenAI over copyright infringement. The arguments are pretty much identical to those expressed by Arden Global Capital this week.
There is a commonly held view, unsurprisingly, that the New York Times and Arden Global Capital have initiated legal proceedings to force favourable settlement terms from OpenAI and Microsoft on the licensed use of the publishers' content.
After all, OpenAI agreed a deal with the Financial Times just last week. In the press release, FT Group CEO John Ridding used the deal to underline the quality of the FT's journalism and editorial copy, but critically also acknowledged the key commercial implications: “It’s right, of course, that AI platforms pay publishers for the use of their material. OpenAI understands the importance of transparency, attribution, and compensation.”
While OpenAI did not publicly echo that need to compensate publishers (why would it?), actions speak louder than words. Its deal with the FT is not unique – it has reached similar understandings with the Associated Press, French newspaper Le Monde, El Pais owner Prisa Media and German publisher Axel Springer, owner of titles including Politico, Bild and Business Insider. (I hadn’t appreciated how many deals had been completed until I read this rather excellent article in the Guardian last Saturday, so I’m referencing my source here.)
The FT’s reporting on the Axel Springer deal stated that the agreement with OpenAI would be worth “tens of millions of Euros a year” and that it “reflects much of what the industry has been puzzling over all year: how to value content archives that can stretch back for decades while creating an income stream from new journalism.”
We’ve been consulting with AI expert Andrew Bruce Smith, founder of Escherman, for the past few months to help us with our own AI integration. Andrew sums up the latest round of litigation rather well: “The fact OpenAI has agreed to pay some publishers for use of their content proves they believe they should pay ‘something’ – the question is how much? Therefore, if OpenAI is using litigation to control its costs, there will likely be a limit to how far they push legal proceedings before they discuss a settlement.”
As is often the case, the commercial argument is taking precedence over the ethical one. Determining whether OpenAI and other big tech companies have contravened copyright law is a complicated business, however. Andrew has also drawn my attention to a quote from Professor Ethan Mollick’s book, Co-Intelligence: Living and Working with AI.
“…it is likely that most AI training data contains copyrighted information, like books used without permission, whether by accident or on purpose. The legal implications of this are still unclear. Since the data is used to create weights, and not directly copied into AI systems, some experts consider it to be outside standard copyright law. In the coming years, these issues are likely to be resolved by courts and legal systems, but they create a cloud of uncertainty, both ethically and legally, over this early stage of AI training.”
I can’t help feeling that, with so much interest and so much money being generated by the promise of generative AI, there will eventually be enough on the table for all parties to agree a constructive way forward. AI needs high-quality original content, and publishers need lucrative new sources of revenue.
It just looks like the lawyers will have the most to gain in the immediate term while the value of the content is decided.