DeepSeek launches Janus-Pro, a direct competitor to OpenAI’s DALL-E 3

Published on 18 February 2025 at 20:18
Updated on 18 February 2025 at 20:18

DeepSeek is making waves with the launch of Janus-Pro, a new generative AI model. Aimed squarely at an established player like DALL-E 3, it marks a notable step forward in multimodal generation. Its optimized training approach and architecture are meant to raise the bar for both understanding images and generating them from text. Janus-Pro outperforms earlier competing models on the benchmarks DeepSeek reports, and its larger parameter count improves its ability to follow complex instructions. The environmental stakes of this technology cannot be ignored either. The arrival of this challenger marks a turning point in the AI ecosystem, where innovation is expected to combine accessibility with power, and companies must now prepare for intensifying competition.

DeepSeek unveils Janus-Pro

The start-up DeepSeek recently launched Janus-Pro, its new AI model designed for image generation. The model, which follows DeepSeek-R1, aims to match the best solutions on the market, such as DALL-E 3 from OpenAI. Within the generative AI ecosystem, Janus-Pro positions itself as a direct competitor to these giants.

Underlying technology of Janus-Pro

Janus-Pro builds on DeepSeek's earlier advances in multimodal AI. In late 2024, the company presented JanusFlow, a framework that integrates autoregressive language models with rectified flow, a generative modeling technique. The new model generates images by interpreting textual instructions.

Performance and evaluation

Researchers at DeepSeek subjected Janus-Pro to rigorous tests across several benchmarks, and the results were promising. The 7-billion-parameter version in particular achieved a score of 79.2 on the multimodal understanding benchmark MMBench, surpassing models such as its predecessor Janus and TokenFlow.

Comparative capabilities with DALL-E 3

Janus-Pro's ability to follow instructions is another major asset. The Janus-Pro-7B model, for example, achieved a score of 0.80 on the GenEval benchmark, surpassing DALL-E 3 (0.67). This is a significant lead that strengthens DeepSeek's position in the generative AI market.
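To put the GenEval gap in perspective, the short Python sketch below computes the absolute and relative margin between the two reported scores. The helper name `score_margin` is hypothetical and introduced here only for illustration; the scores 0.80 and 0.67 are the only values taken from the article.

```python
def score_margin(candidate: float, baseline: float) -> tuple[float, float]:
    """Return (absolute margin, relative margin in percent) of candidate over baseline."""
    absolute = candidate - baseline
    relative_pct = absolute / baseline * 100
    return round(absolute, 2), round(relative_pct, 1)


# GenEval scores as reported in the article.
janus_pro_7b = 0.80
dall_e_3 = 0.67

absolute, relative_pct = score_margin(janus_pro_7b, dall_e_3)
print(f"Absolute margin: {absolute:.2f} points")
print(f"Relative margin: {relative_pct:.1f}%")
```

With these figures, the margin is 0.13 points, or roughly 19% relative to DALL-E 3's score.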

Expansion of the model range

Janus-Pro is offered in two sizes, with 1 billion and 7 billion parameters. This range reflects the scalability of the visual encoding and decoding method adopted by DeepSeek. The company has released its code and models as open source, encouraging community adoption and contribution.

Limitations and future perspectives

Although Janus-Pro achieves remarkable results, some limitations remain. The input resolution is limited to 384×384 pixels, which hampers fine-grained tasks such as optical character recognition and may affect the quality of the generated images. Reconstruction losses introduced by the visual tokenizer have also been identified, so generated images can be semantically rich yet lacking in fine detail.

Researchers believe that increasing the resolution of images could bring notable improvements to Janus-Pro’s performance. By identifying these limitations, DeepSeek is committed to continually improving its models to ensure a competitive offering.
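To give a sense of scale for the 384×384 limit, the following Python sketch computes what fraction of the pixels survives when a larger image is reduced to that size. The 1024×1024 source size and the helper name `pixel_fraction_kept` are illustrative assumptions, not details from DeepSeek, and the sketch does not reproduce Janus-Pro's actual preprocessing.

```python
MAX_SIDE = 384  # input resolution limit cited in the article (384×384)


def pixel_fraction_kept(width: int, height: int, side: int = MAX_SIDE) -> float:
    """Fraction of the original pixel count that remains after resizing to side×side.

    Images already at or below the target pixel count are reported as 1.0.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    return min(1.0, (side * side) / (width * height))


# Hypothetical 1024×1024 source image, chosen only for illustration.
fraction = pixel_fraction_kept(1024, 1024)
print(f"Pixels kept: {fraction:.1%}")
```

For a 1024×1024 image, about 14% of the pixels remain, which helps explain why small text and fine textures are hard to preserve at this resolution.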

Frequently asked questions about DeepSeek’s Janus-Pro

What are the main features of Janus-Pro?
Janus-Pro stands out for its integration of an optimized training strategy, extensive training data, and its ability to interpret and generate images from text commands through advanced multimodal modeling.
How does Janus-Pro compare to DALL-E 3?
Janus-Pro comes in 1 billion and 7 billion parameter versions. On the GenEval instruction-following benchmark, Janus-Pro-7B scores 0.80 against 0.67 for DALL-E 3, and the model also posts strong results on multimodal understanding benchmarks such as MMBench.
Is Janus-Pro an open source model?
Yes, DeepSeek offers Janus-Pro as an open source model, allowing the community to access the code and models for ongoing use and enhancement.
What are the limitations of Janus-Pro?
One of the main limitations of Janus-Pro is the input resolution, which is limited to 384×384 pixels, potentially affecting its performance in tasks requiring high precision, such as optical character recognition.
How can I access Janus-Pro?
Janus-Pro is publicly available on platforms dedicated to sharing artificial intelligence models, where users can download and explore it.
What improvements does Janus-Pro bring compared to Janus?
Janus-Pro improves on Janus in both multimodal understanding and visual generation, and it follows textual instructions more closely, thanks to an optimized training strategy, more extensive training data, and a larger model.
Is Janus-Pro intended for professional users or the general public?
Janus-Pro is designed to be used by a variety of users, ranging from researchers and developers to artists and designers, thanks to its open-source approach and high-performance image generation.
What are the benefits of using a multimodal model like Janus-Pro?
Multimodal models, such as Janus-Pro, offer a better level of understanding of the relationships between text and images, thereby allowing for more accurate and contextually appropriate image generation.
