OpenAI and AP News Partner to Train AI on News Articles


Arnold Kirimi

OpenAI, the company behind ChatGPT, and The Associated Press (AP) have entered a partnership under which OpenAI will license part of AP’s news archive, which spans more than three decades, to train its AI models.

Quick facts:

  • OpenAI and The Associated Press (AP) form an unprecedented partnership, granting OpenAI access to AP’s extensive news archive for training AI models.
  • AP has been a pioneer in AI exploration, using automation for company earnings reports and sports coverage.
  • OpenAI’s collaboration with AP adds to its growing list of partnerships, driving AI innovation.
  • The use of public data without permission has sparked legal and ethical debates, leading to lawsuits and regulatory investigations.

OpenAI strikes a historic deal with the Associated Press to pay for using their news stories in AI training, fueling discussions on fair compensation for web content used by tech companies to build AI tools.

As part of the deal, AP will also draw on OpenAI’s technology and product expertise, although the precise details of that exchange remain undisclosed. AP itself has long been at the forefront of AI exploration.

Starting in 2014, the news organization pioneered automated reports on company earnings, subsequently expanding into automated stories on Minor League Baseball and college sports. This track record highlights AP’s unwavering commitment to embracing cutting-edge technologies.

The AP deal adds to OpenAI’s growing roster of partnerships. OpenAI recently made headlines with a six-year agreement with Shutterstock that lets it license images, videos, music, and metadata to train its text-to-image model, DALL-E.

OpenAI is also collaborating with media company BuzzFeed, which aims to use AI tools to enhance and personalize its content. In addition, OpenAI’s strategic alliance with Microsoft, which includes substantial investment and joint product development, underscores its push to advance AI.

OpenAI’s access to AP’s vast and factual text archive will contribute to the ongoing enhancement of its AI systems. Brad Lightcap, OpenAI’s Chief Operating Officer, expressed appreciation for AP’s industry leadership in AI adoption and emphasized the valuable role AP’s high-quality text archive will play in advancing OpenAI’s capabilities.

Although AP has embraced AI-powered projects, including Spanish-language news alerts and public safety incident documentation, the news organization has explicitly said that it does not use generative AI to write its news stories. AP says maintaining the integrity of its journalism remains paramount.

OpenAI and Big Tech Data Usage Sparks Debate and Scrutiny

OpenAI, Google, and various AI companies have harnessed the power of billions of sentences sourced from the open internet to build their sophisticated “large language models” that fuel chatbot capabilities. 

This includes incorporating news stories, Wikipedia articles, social media comments, and blog posts into their models, all without seeking permission from the content creators. 

While these tech companies argue that the data is publicly available and fair game, a rising chorus of authors, musicians, news organizations, and social media companies is challenging this approach.

The Washington Post conducted an illuminating analysis, revealing that the AP’s main news website ranked as the 68th most cited source in a database utilized to train one of OpenAI’s earlier AI models. 

However, this unprecedented use of content to train AI models has sparked a vigorous debate. Critics argue that leveraging human-generated content to train AI tools marks a seismic shift in how the internet operates, especially considering that some of these tools are already replacing human workers.

In the past two weeks, a wave of lawsuits has swept the industry, alleging improper data usage. OpenAI and Google face class-action suits, while comedian Sarah Silverman and two acclaimed fiction authors have filed lawsuits against OpenAI.

Recently, the Federal Trade Commission has launched an investigation into OpenAI’s utilization of consumer data for training its models. This development underscores the heightened scrutiny surrounding data practices and highlights the need to address the ethical and legal dimensions of AI training.
