Meta unveils biggest Llama 3 AI model, touting language and math gains

Meta Platforms unveiled the latest and largest version of its mostly free Llama 3 AI models on Tuesday, showcasing multilingual capabilities and benchmark results that closely rival those of paid models from competitors like OpenAI. The new Llama 3 model has 405 billion parameters, making it significantly larger than its predecessor released last year but still smaller than leading models from rivals. For comparison, OpenAI’s GPT-4 model reportedly has 1 trillion parameters, while Amazon is developing a model with 2 trillion parameters.

Chief Executive Mark Zuckerberg said he expects future Llama models to surpass proprietary competitors by next year. He also expects Meta’s AI chatbot, which is powered by these models and already serves hundreds of millions of users, to become the most popular AI assistant by the end of this year.

The release is part of a broader push by tech companies to demonstrate that their large language models, despite their enormous costs, deliver significant gains in areas like advanced reasoning. Meta’s top AI scientist has suggested that while these models have improved, breakthroughs may require other types of AI systems.

In addition to the flagship 405 billion parameter model, Meta is also releasing updated versions of its smaller 8 billion and 70 billion parameter Llama 3 models, which were initially introduced in the spring. All three models are multilingual and feature an expanded “context window,” which enhances their ability to handle complex user requests and generate higher-quality code.

Ahmad Al-Dahle, Meta’s head of generative AI, explained that the larger context window acts like a longer memory, improving the model’s ability to process multi-step requests. The team has also enhanced the model’s performance on math problems by incorporating AI-generated data for training.

Meta provides these Llama models free of charge to developers, a strategy that Zuckerberg believes will lead to innovative products and reduce dependency on competitors, while also boosting engagement on Meta’s core social networks. However, some investors have raised concerns about the associated costs.

The release aims to attract developers to Meta’s free models rather than paid alternatives, potentially undercutting rivals’ business models. Meta’s results on key benchmarks suggest that its largest Llama 3 model performs almost as well as or better than other leading models, such as Anthropic’s Claude 3.5 Sonnet and OpenAI’s GPT-4. On the MATH benchmark, for instance, the largest Llama 3 model scored 73.8, compared to GPT-4’s 76.6 and Claude 3.5 Sonnet’s 71.1. On the MMLU benchmark it scored 88.6, just shy of GPT-4’s 88.7 and above Claude 3.5 Sonnet’s 88.3.

Meta researchers also hinted at upcoming “multimodal” versions of Llama 3 models, which will integrate image, video, and speech capabilities. Early tests suggest these models will compete effectively with other multimodal models such as Google’s Gemini 1.5 and Anthropic’s Claude 3.5 Sonnet.
