Mistral AI presents Voxtral, an open-source model dedicated to audio: speech recognition and transcription in the spotlight.

Published on 17 July 2025 at 09:42
Modified on 17 July 2025 at 09:42

Voxtral is Mistral AI's open-source model for speech recognition and audio transcription. The company positions it as a highly accurate option at less than half the cost of competing solutions. Beyond plain transcription, the model offers native semantic understanding, automatic language recognition, and the ability to generate detailed summaries. With this release, Mistral AI aims to establish itself as a serious contender in audio-focused artificial intelligence.

Mistral AI unveils Voxtral

Mistral AI, a prominent French company in the artificial intelligence sector, recently launched Voxtral, its first family of open-source models dedicated to speech recognition and transcription. The offering comes in two variants, Voxtral (24B) and Voxtral Mini (3B). According to Mistral AI, these are the most capable speech understanding models currently on the market.

Technical characteristics

Voxtral targets a broad audience, combining high accuracy with native semantic understanding, with API pricing starting at $0.001 per minute. The models can be downloaded from Hugging Face or accessed through the Mistral API. Voxtral handles up to 30 minutes of audio for transcription and up to 40 minutes for understanding tasks such as answering questions and generating summaries. It detects the spoken language automatically, and the supported languages include Spanish, Hindi, and French.
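To make these figures concrete, here is a minimal Python sketch of the arithmetic involved: it estimates the API cost of a recording and splits a long recording into windows that fit the 30-minute transcription limit. The function names `plan_segments` and `estimate_cost_usd` are hypothetical helpers written for illustration, not part of any Mistral library, and the sketch assumes billing is simply proportional to audio duration at the quoted starting rate, which the article does not confirm.

```python
# Illustrative arithmetic only: no Mistral API calls are made here.
PRICE_PER_MINUTE_USD = 0.001     # starting API price quoted in the article
MAX_TRANSCRIPTION_MINUTES = 30   # transcription limit quoted in the article


def plan_segments(total_minutes, max_minutes=MAX_TRANSCRIPTION_MINUTES):
    """Split a recording into consecutive (start, end) windows, in minutes,
    so that no window exceeds max_minutes."""
    if total_minutes < 0 or max_minutes <= 0:
        raise ValueError("durations must be non-negative and the limit positive")
    total = float(total_minutes)
    segments = []
    start = 0.0
    while start < total:
        end = min(start + max_minutes, total)
        segments.append((start, end))
        start = end
    return segments


def estimate_cost_usd(total_minutes, price_per_minute=PRICE_PER_MINUTE_USD):
    """Estimate cost, assuming (hypothetically) purely proportional billing."""
    return total_minutes * price_per_minute


if __name__ == "__main__":
    for start, end in plan_segments(75):
        print(f"transcribe minutes {start:g} to {end:g}")
    print(f"estimated cost: ${estimate_cost_usd(75):.3f}")
```

Under these assumptions, a 75-minute recording would be cut into three windows (0 to 30, 30 to 60, and 60 to 75 minutes) and would cost roughly $0.075 to transcribe. Real billing granularity and any per-request minimums may differ.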

Performance compared to competitors

Mistral AI claims that Voxtral outperforms established competitors on various benchmarks. According to the company, the model significantly exceeds Whisper large-v3, currently regarded as one of the most advanced open-source speech models. Mistral AI also says Voxtral is competitive with Gemini 2.5 Flash and other solutions on both transcription and multilingual tasks.

Audio analysis features

Voxtral is set to be integrated into Le Chat, Mistral AI's conversational assistant, in the coming weeks. Users will be able to record audio or import audio files, then obtain transcriptions, ask questions about the content, and generate summaries.

Options for businesses

Mistral AI also offers advanced options for the professional sector. Companies can have the model fine-tuned for specific fields such as healthcare, law, or customer service. Private deployment on a company's own infrastructure will also be available, accompanied by integration support.

Frequently asked questions

What are the main models available with Voxtral?
Voxtral comes in two main models: Voxtral (24B) and Voxtral Mini (3B), suited for various needs in voice recognition and transcription.

How do I access Voxtral and its features?
The Voxtral models are available for download on Hugging Face and through the Mistral AI API, where pricing starts at $0.001 per minute.

What languages are supported by Voxtral?
Voxtral can automatically recognize multiple languages, including Spanish, Hindi, and French, allowing for effective multilingual use.

What transcription and comprehension capabilities does Voxtral offer?
Voxtral allows for the transcription of up to 30 minutes of audio and understanding up to 40 minutes of recording, while generating summaries and answering questions.

How does Voxtral differentiate itself from competitors like Whisper large-v3?
According to Mistral AI, Voxtral outperforms Whisper large-v3 on multiple benchmarks while offering top-notch accuracy at a reduced cost.

What types of customizations are possible with Voxtral for businesses?
Mistral AI offers fine-tuning options to adapt the model to specific fields such as healthcare, law, or customer support.

When will Voxtral be integrated into Le Chat?
Voxtral will be rolled out gradually in Le Chat over the coming weeks, allowing users to record or import audio files and interact with their content.

How does Voxtral handle speaker differentiation?
Voxtral may, in a future update, differentiate speakers and detect certain characteristics like age or gender, making the transcription more contextual.
