Physical Address

304 North Cardinal St.
Dorchester Center, MA 02124

Meta executives obsessed with beating OpenAI’s GPT-4 internally, court documents reveal


Executives and researchers leading Meta’s artificial intelligence efforts are obsessed with beating OpenAI’s GPT-4 model as they develop Llama 3, according to internal messages unsealed by a court on Tuesday in one of the company’s ongoing AI copyright cases, Kadrey v.

“Honestly… Our goal should be GPT-4,” said Meta’s VP of Generative AI, Ahmad Al-Dahle, in an October 2023 message to Meta researcher Hugo Touvron. “We have 64k GPUs coming! We need to learn how to build the frontier and win this race.”

Although Meta releases open AI models, the company’s AI leaders were much more focused on beating competitors who generally do not release their model weights, such as Anthropic and OpenAI, and instead bring them behind an API. Meta managers and researchers considered Anthropic’s Claude and OpenAI’s GPT-4 as a gold standard to work with.

The French startup AI Mistral, one of the biggest competitors open to Meta, was mentioned several times in internal messages, but the tone was dismissive.

“Mistral is peanuts for us,” Al-Dahle said in a message. “We have to be able to do better,” he said later.

Tech companies are racing to one-up each other with cutting-edge AI models these days, but these court documents reveal just how competitive Meta’s AI chiefs really were — and apparently. i still am. At various points in the message exchanges, Meta’s AI drivers talked about how they were “very aggressive” in getting the right data to train Llama; at one point, an executive even said that “Llama 3 is literally all I care about,” in a message to colleagues.

Prosecutors in this case allege that Meta executives occasionally cut corners in their mad rush to ship AI models, training on copyrighted books in the process.

Touvron noted in a message that the mix of datasets used for Llama 2 “was bad”, and talked about how Meta could use a better mix of data sources to improve Llama 3. Touvron and Al-Dahle talked about removing the way to use the LibGen dataset, which contains copyrighted works from Cengage Learning, Macmillan Learning, McGraw Hill, and Pearson Education.

“We have the right datasets in there (?),” said Al-Dahle. “Is there something you wanted to use but couldn’t for some stupid reason?”

Meta CEO Mark Zuckerberg has previously said that he is trying to close the performance gap between Llama’s AI models and closed models from OpenAI, Google and others. Internal messages reveal intense pressure within the company to perform.

“This year, Llama 3 is competitive with the most advanced models and leads in some areas,” said Zuckerberg in a letter from July 2024. “Starting next year, we expect future Llama models to become the most advanced in the industry.”

When Meta lately released Llama 3 in April 2024the open AI model was competitive with Google’s leading closed models, OpenAI and Anthropic, and outperformed the open options from Mistral. However, the Meta data used to train their models – data Zuckerberg said he gave the green light to use, despite its copyright status – are facing scrutiny in several ongoing processes.



Source link