Janus-Pro: DeepSeek's answer to OpenAI's DALL-E 3

DeepSeek has launched Janus-Pro, a multimodal generative AI model aimed squarely at OpenAI's DALL-E 3. Its optimized training approach and updated architecture target both image understanding and image generation from text. Janus-Pro outperforms several competing models on published benchmarks, and its larger parameter count improves its ability to follow complex instructions. The environmental cost of this technology cannot be ignored. The arrival of this challenger marks a turning point in the AI ecosystem, where innovation is expected to combine accessibility with power. Companies must now prepare for a landscape of intensifying competition.

DeepSeek unveils Janus-Pro

The start-up DeepSeek recently launched its new AI model, Janus-Pro, designed for image generation. Released after DeepSeek-R1, it aims to match the best solutions on the market, such as OpenAI's DALL-E 3, and positions itself as a direct competitor to these established players in generative AI.

Underlying technology of Janus-Pro

Janus-Pro builds on DeepSeek's earlier work in multimodal AI. In late 2024, DeepSeek had already presented JanusFlow, a framework that integrates autoregressive language models with a generative modeling technique called rectified flow. The new model generates images by interpreting textual instructions.
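For intuition about rectified flow, the following minimal Python sketch shows the core idea on a toy two-dimensional example: a noise sample and a data sample are joined by a straight line, the training target is the constant velocity along that line, and sampling integrates that velocity with Euler steps. This is not DeepSeek's code. The function names and toy values are hypothetical, and the learned network is replaced by the exact velocity so the example stays self-contained.

```python
def interpolate(x0, x1, t):
    """Point on the straight path from noise x0 (t=0) to data x1 (t=1)."""
    return [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]


def target_velocity(x0, x1):
    """Regression target in rectified flow: the constant velocity x1 - x0."""
    return [b - a for a, b in zip(x0, x1)]


def euler_sample(x0, velocity_fn, steps=4):
    """Integrate dx/dt = v(x, t) from t=0 to t=1 with fixed-size Euler steps."""
    x = list(x0)
    dt = 1.0 / steps
    for i in range(steps):
        v = velocity_fn(x, i * dt)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


noise = [0.5, -1.0]  # hypothetical noise sample
data = [2.0, 3.0]    # hypothetical data sample

# Stand-in for a trained network: always return the exact straight-line velocity.
v_exact = target_velocity(noise, data)
sample = euler_sample(noise, lambda x, t: v_exact, steps=4)
print(sample)
```

Because the path is a straight line, a handful of Euler steps is enough to travel from the noise sample to the data sample in this toy case. In a real system the velocity comes from a trained network and the paths are only approximately straight.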

Performance and evaluation

Researchers at DeepSeek tested Janus-Pro on several benchmarks, with promising results. The 7-billion-parameter version achieved a score of 79.2 on MMBench, a multimodal understanding benchmark, surpassing models such as Janus and TokenFlow.

Comparative capabilities with DALL-E 3

Janus-Pro’s ability to follow instructions when generating images is another major asset. The Janus-Pro-7B model achieved a score of 0.80 on the GenEval benchmark, ahead of DALL-E 3 (0.67). This is a significant advance that strengthens DeepSeek’s position in the generative AI market.
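To put the two GenEval scores quoted above in perspective, the short Python sketch below computes the absolute and relative gap between them. It uses only the figures reported in this article. The variable names are illustrative.

```python
janus_pro_7b = 0.80  # GenEval score reported for Janus-Pro-7B
dalle_3 = 0.67       # GenEval score reported for DALL-E 3

absolute_gap = janus_pro_7b - dalle_3
relative_gain = absolute_gap / dalle_3

print(f"absolute gap: {absolute_gap:.2f}")    # absolute gap: 0.13
print(f"relative gain: {relative_gain:.1%}")  # relative gain: 19.4%
```

A gap of 0.13 points corresponds to a relative improvement of roughly 19% on this benchmark. It says nothing about other benchmarks or about perceived image quality.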

Expansion of the model range

Janus-Pro is offered in two sizes, with 1 billion and 7 billion parameters. This range reflects the scalability of the visual encoding and decoding method adopted by DeepSeek. The company has released its code and models as open source, encouraging community adoption and contribution.

Limitations and future perspectives

Although Janus-Pro achieves strong results, some limitations remain. The input resolution is limited to 384×384 pixels, which can hurt fine-grained tasks and limits the quality of the generated images. Reconstruction losses introduced by the visual tokenizer have also been identified: the generated images have rich semantic content but lack fine detail.

Researchers believe that increasing the resolution of images could bring notable improvements to Janus-Pro’s performance. By identifying these limitations, DeepSeek is committed to continually improving its models to ensure a competitive offering.
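The cost of raising the resolution can be estimated with a little arithmetic. The Python sketch below counts how many discrete tokens a square image becomes, assuming the visual tokenizer downsamples each side by a factor of 16. That factor is an assumption made for illustration and is not stated in this article. The helper name `token_count` is hypothetical.

```python
def token_count(side_px, downsample=16):
    """Number of discrete tokens for a square image of side_px pixels.

    downsample is the assumed per-side reduction factor of the tokenizer.
    """
    if side_px % downsample != 0:
        raise ValueError("side must be a multiple of the downsample factor")
    per_side = side_px // downsample
    return per_side * per_side


for side in (384, 768, 1024):
    print(side, token_count(side))
# 384 576
# 768 2304
# 1024 4096
```

Under this assumption, doubling the side length quadruples the number of tokens the model must produce, which is why higher resolution is not a free improvement for a model that generates an image token by token.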

Frequently asked questions about DeepSeek’s Janus-Pro

What are the main features of Janus-Pro?
Janus-Pro stands out for its integration of an optimized training strategy, extensive training data, and its ability to interpret and generate images from text commands through advanced multimodal modeling.
How does Janus-Pro compare to DALL-E 3?
Janus-Pro is available with 1 billion and 7 billion parameters. On the GenEval instruction-following benchmark, Janus-Pro-7B scored 0.80, ahead of DALL-E 3 at 0.67, and it reached 79.2 on the multimodal understanding benchmark MMBench.
Is Janus-Pro an open source model?
Yes, DeepSeek offers Janus-Pro as an open source model, allowing the community to access the code and models for ongoing use and enhancement.
What are the limitations of Janus-Pro?
One of the main limitations of Janus-Pro is the input resolution, which is limited to 384×384 pixels, potentially affecting its performance in tasks requiring high precision, such as optical character recognition.
How can I access Janus-Pro?
Janus-Pro is publicly available on platforms dedicated to sharing artificial intelligence models and code, such as Hugging Face and GitHub, where users can download and explore it.
What improvements does Janus-Pro bring compared to Janus?
Janus-Pro improves on Janus in both multimodal understanding and visual generation. It follows textual instructions more closely thanks to an optimized training strategy, more extensive training data, and a larger model.
Is Janus-Pro intended for professional users or the general public?
Janus-Pro is designed to be used by a variety of users, ranging from researchers and developers to artists and designers, thanks to its open-source approach and high-performance image generation.
What are the benefits of using a multimodal model like Janus-Pro?
Multimodal models, such as Janus-Pro, offer a better level of understanding of the relationships between text and images, thereby allowing for more accurate and contextually appropriate image generation.
