DeepSeek launches Janus-Pro, a direct competitor to OpenAI’s DALL-E 3

Published on 18 February 2025 at 20:18
Updated on 18 February 2025 at 20:18

DeepSeek is making waves with the launch of Janus-Pro, a generative AI model aimed squarely at OpenAI's DALL-E 3. With an optimized approach and a larger parameter count, the model targets both image understanding and image generation from text, and DeepSeek reports that it outperforms competing models on several benchmarks, particularly when interpreting complex instructions. The ecological stakes of this technology cannot be ignored. The arrival of this challenger marks a turning point in the AI ecosystem, where innovation is expected to combine accessibility with power. Companies must now prepare for a landscape in which competition is intensifying.

DeepSeek unveils Janus-Pro

The start-up DeepSeek recently launched its new AI model, Janus-Pro, designed for image generation. The model, which follows DeepSeek-R1, aims to match the best solutions on the market, such as OpenAI's DALL-E 3, and positions itself as a direct competitor to these established players.

Underlying technology of Janus-Pro

Janus-Pro builds on DeepSeek's earlier work in multimodal AI. By the end of 2024, the company had already presented JanusFlow, a framework that integrates autoregressive language models with a generative modeling technique called rectified flow. The new model generates images by interpreting textual instructions.

Performance and evaluation

Researchers at DeepSeek tested Janus-Pro on several benchmarks, with promising results. The 7-billion-parameter version achieved a score of 79.2 on MMBench, a multimodal understanding benchmark, ahead of models such as its predecessor Janus and TokenFlow.

Comparative capabilities with DALL-E 3

The performance of Janus-Pro in terms of following instructions also stands out as a major asset. The Janus-Pro-7B model, for example, achieved a score of 0.80 on the GenEval benchmark, surpassing DALL-E 3 (0.67). This demonstrates a significant advance, strengthening DeepSeek’s position in the generative AI market.
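To put that gap in perspective, the short Python sketch below computes the absolute and relative margin between the two GenEval scores quoted above. The function name `score_margin` is purely illustrative, and the only inputs are the figures reported in this article.

```python
def score_margin(candidate: float, baseline: float) -> tuple[float, float]:
    """Return (absolute gap, relative gap in percent) of candidate over baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    absolute = candidate - baseline
    relative_pct = absolute / baseline * 100
    return absolute, relative_pct


# GenEval scores as reported in this article.
JANUS_PRO_7B = 0.80
DALL_E_3 = 0.67

absolute, relative_pct = score_margin(JANUS_PRO_7B, DALL_E_3)
print(f"Absolute gap: {absolute:.2f}")       # 0.13
print(f"Relative gap: {relative_pct:.1f}%")  # 19.4%
```

On these figures, Janus-Pro-7B leads by 0.13 points, or roughly 19% relative to DALL-E 3's score.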

Expansion of the model range

Janus-Pro is offered in two sizes, with 1 billion and 7 billion parameters. This range reflects the scalability of the visual encoding and decoding method adopted by DeepSeek. The company has released its code and models as open source, encouraging community adoption and contribution.

Limitations and future perspectives

Although Janus-Pro achieves remarkable results, some limitations remain. The input resolution is limited to 384×384 pixels, which may impact the quality of the generated images. Reconstruction losses caused by the visual tokenizer have been identified, leading to the production of images with rich semantic content but lacking in detail.

Researchers believe that increasing the resolution of images could bring notable improvements to Janus-Pro’s performance. By identifying these limitations, DeepSeek is committed to continually improving its models to ensure a competitive offering.
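As a rough illustration of why a 384×384 input constrains fine detail, the sketch below compares that pixel budget with a larger source image. The 1024×1024 source size is a hypothetical example chosen for illustration, not a figure from DeepSeek, and the helper name `pixel_retention` is likewise an assumption.

```python
def pixel_retention(src_w: int, src_h: int, dst_w: int = 384, dst_h: int = 384) -> float:
    """Fraction of the source pixel count that remains after resizing to dst_w x dst_h."""
    if min(src_w, src_h, dst_w, dst_h) <= 0:
        raise ValueError("all dimensions must be positive")
    return (dst_w * dst_h) / (src_w * src_h)


# Hypothetical 1024x1024 source image, resized to the 384x384 limit cited above.
kept = pixel_retention(1024, 1024)
print(f"Pixels retained: {kept:.1%}")  # 14.1%
```

Under this assumption, only about one pixel in seven survives the resize, which is consistent with small features such as printed text becoming hard to recover.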

Frequently asked questions about DeepSeek’s Janus-Pro

What are the main features of Janus-Pro?
Janus-Pro stands out for its integration of an optimized training strategy, extensive training data, and its ability to interpret and generate images from text commands through advanced multimodal modeling.
How does Janus-Pro compare to DALL-E 3?
Janus-Pro comes in 1-billion- and 7-billion-parameter versions. On the GenEval instruction-following benchmark, Janus-Pro-7B scored 0.80 against 0.67 for DALL-E 3, and it reached 79.2 on the multimodal understanding benchmark MMBench.
Is Janus-Pro an open source model?
Yes, DeepSeek offers Janus-Pro as an open source model, allowing the community to access the code and models for ongoing use and enhancement.
What are the limitations of Janus-Pro?
One of the main limitations of Janus-Pro is the input resolution, which is limited to 384×384 pixels, potentially affecting its performance in tasks requiring high precision, such as optical character recognition.
How can I access Janus-Pro?
Janus-Pro is publicly available on platforms dedicated to sharing artificial intelligence models, where users can download and explore it.
What improvements does Janus-Pro bring compared to Janus?
Janus-Pro improves on Janus in both multimodal understanding and visual generation, interpreting textual instructions more reliably thanks to an optimized training strategy, more extensive training data, and a larger model.
Is Janus-Pro intended for professional users or the general public?
Janus-Pro is designed to be used by a variety of users, ranging from researchers and developers to artists and designers, thanks to its open-source approach and high-performance image generation.
What are the benefits of using a multimodal model like Janus-Pro?
Multimodal models, such as Janus-Pro, offer a better level of understanding of the relationships between text and images, thereby allowing for more accurate and contextually appropriate image generation.
