Llama 3.3 70B: Meta announces performance comparable to that of Llama 3.1 405B, but at a significantly reduced cost

Published on 21 February 2025 at 05:01
Modified on 21 February 2025 at 05:02

Llama 3.3 70B: performance comparable to Llama 3.1 405B

Llama 3.3 70B, recently announced by Meta, targets the open-source model market. Meta says the model matches the performance of Llama 3.1 405B, which has 405 billion parameters, at a significantly lower cost. That is a major advantage for companies that want to integrate AI while keeping their budgets under control.

A rapid series of launches

Meta is not slowing its release pace: it introduced Llama 3.1 in July, Llama 3.2 in late September, and Llama 3.3 last week. According to Meta, Llama 3.3 70B delivers superior quality and performance for text applications at a reduced cost.

Preparation and training data

For this latest version, Meta pre-trained its model on approximately 15 trillion tokens from publicly available sources. Fine-tuning used public instruction datasets and over 25 million synthetically generated examples. Researchers indicate that the pre-training data extends up to December 2023.

Architecture and development

Llama 3.3 70B is an autoregressive language model based on a Transformer architecture. Development involved supervised fine-tuning as well as reinforcement learning with human feedback (RLHF). The model offers a context window of 128,000 tokens, which lets it handle long prompts and documents in a single request.
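
To make the 128,000-token window concrete, here is a minimal sketch of a budget check. The function name `fits_in_context` and the example request sizes are illustrative assumptions, not part of Meta's tooling; in practice the token counts would come from the model's tokenizer.

```python
CONTEXT_WINDOW = 128_000  # context window stated in the article, in tokens


def fits_in_context(prompt_tokens: int, max_new_tokens: int,
                    window: int = CONTEXT_WINDOW) -> bool:
    """Return True if the prompt plus the requested output fits in the window."""
    if prompt_tokens < 0 or max_new_tokens < 0:
        raise ValueError("token counts must be non-negative")
    # An autoregressive model sees the prompt and its own output in one window.
    return prompt_tokens + max_new_tokens <= window


# Hypothetical requests: a long document plus a 4,000-token answer.
print(fits_in_context(120_000, 4_000))  # True  (124,000 <= 128,000)
print(fits_in_context(126_000, 4_000))  # False (130,000 >  128,000)
```

Reserving room for the output up front avoids requests that would be truncated partway through generation.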

Performance comparison

Benchmark results show that Llama 3.3 70B matches the performance of Llama 3.1 70B and Amazon’s recently presented Nova Pro model. Across various tests, Llama 3.3 70B reportedly outperforms competitors such as Gemini Pro 1.5 and GPT-4o. It stands out by offering performance comparable to Llama 3.1 405B at roughly one-tenth of the cost.

Multilingualism and commercial applications

The model supports eight languages: German, Spanish, French, Hindi, Italian, Portuguese, Thai, and English. Llama 3.3 is designed for commercial and research uses, capable of functioning as a chatbot assistant or for text generation tasks. Meta encourages developers to leverage the model’s extensive linguistic capabilities while highlighting the importance of fine-tuning for unsupported languages.
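
An application can guard against unsupported languages with a simple lookup before routing a request to the model. The sketch below is only an illustration: the ISO 639-1 codes and the helper name `is_supported` are assumptions added here, and only the list of eight languages comes from the article.

```python
# Languages listed in the article, keyed by ISO 639-1 code (codes are an assumption).
SUPPORTED_LANGUAGES = {
    "en": "English", "de": "German", "es": "Spanish", "fr": "French",
    "hi": "Hindi", "it": "Italian", "pt": "Portuguese", "th": "Thai",
}


def is_supported(lang_code: str) -> bool:
    """Return True if the code maps to one of the eight supported languages."""
    return lang_code.strip().lower() in SUPPORTED_LANGUAGES


for code in ("fr", "ja"):
    status = "supported" if is_supported(code) else "unsupported, fine-tune and evaluate first"
    print(f"{code}: {status}")
```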

Infrastructure and resources

Training required considerable resources: 39.3 million GPU hours on H100-80GB hardware. Pre-training, fine-tuning, annotation, and evaluation were all carried out on Meta’s production infrastructure.
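
For a rough sense of scale, the GPU-hour figure above can be converted into wall-clock time for a given cluster size. The cluster sizes below are hypothetical (the article does not say how many GPUs ran in parallel), and the calculation assumes perfect utilization with no downtime.

```python
GPU_HOURS = 39.3e6  # cumulative H100-80GB GPU hours cited in the article


def wall_clock_days(gpu_hours: float, num_gpus: int) -> float:
    """Idealized wall-clock days if num_gpus run in parallel at full utilization."""
    if num_gpus <= 0:
        raise ValueError("num_gpus must be positive")
    return gpu_hours / num_gpus / 24


for n in (8_000, 16_000):  # hypothetical cluster sizes, for illustration only
    print(f"{n} GPUs: about {wall_clock_days(GPU_HOURS, n):.0f} days")
# 8000 GPUs: about 205 days
# 16000 GPUs: about 102 days
```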

Potential and recommendations

Meta highlights that Llama 3.3 offers cost-effective performance, with inference feasible on common workstations. While the model can produce text in other languages, Meta advises against using it for conversations in unsupported languages without prior fine-tuning.
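
Whether a 70-billion-parameter model fits on a workstation depends largely on numeric precision. The back-of-the-envelope sketch below estimates the memory needed for the weights alone. The precisions listed are common choices assumed for illustration, not settings named by Meta, and the estimate ignores the KV cache, activations, and runtime overhead.

```python
PARAMS = 70e9  # nominal parameter count implied by the "70B" name

# Bytes per parameter for some common precisions (illustrative assumptions).
BYTES_PER_PARAM = {"fp16/bf16": 2.0, "int8": 1.0, "4-bit": 0.5}


def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Memory for the weights only, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


for name, nbytes in BYTES_PER_PARAM.items():
    print(f"{name}: ~{weight_memory_gb(PARAMS, nbytes):.0f} GB")
# fp16/bf16: ~140 GB
# int8: ~70 GB
# 4-bit: ~35 GB
```

By the same arithmetic, a 405-billion-parameter model needs about 810 GB for 16-bit weights, which illustrates why the smaller model is so much cheaper to serve.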

Frequently asked questions about Llama 3.3 70B

What is the main difference between Llama 3.3 70B and Llama 3.1 405B?
The main difference is that Llama 3.3 70B offers similar performance to Llama 3.1 405B while requiring fewer financial and computational resources.
What financial advantages does Llama 3.3 70B provide compared to other models?
The Llama 3.3 70B model allows companies to access advanced AI technology at a significantly reduced cost, making AI more accessible.
How does Llama 3.3 70B achieve such performance with fewer parameters?
This performance is achieved through the optimization of algorithms and training on a larger volume of data, as well as an advanced model architecture.
What languages are supported by Llama 3.3 70B?
Llama 3.3 70B supports eight languages: English, German, Spanish, French, Hindi, Italian, Portuguese, and Thai.
How is Llama 3.3 70B pre-trained?
The model was pre-trained on approximately 15 trillion tokens from publicly available sources. Fine-tuning then used public instruction datasets and over 25 million synthetically generated examples.
What types of applications can benefit from Llama 3.3 70B?
Llama 3.3 70B is ideal for multilingual dialogue applications, chatbots, and various text generation tasks in a commercial and research context.
What is the context window capacity of Llama 3.3 70B?
The model has a context window of 128,000 tokens, allowing it to handle longer and more complex textual contexts.
Is Llama 3.3 70B recommended for unsupported languages?
Although it can produce text in other languages, Meta advises against its use without fine-tuning and safety checks in those unsupported languages.
What technical infrastructures were used for training Llama 3.3 70B?
Pre-training was conducted on a custom GPU cluster from Meta, using a total of 39.3 million GPU hours on H100-80GB hardware.
Is Llama 3.3 70B still an open-source model?
Yes, Llama 3.3 70B remains an open-source model, released under a community license that allows a variety of commercial and research applications.
