The latest model from OpenAI faces a hurdle: a lack of sufficient data for its training, according to a report

Published on 20 February 2025 at 15:46
Updated on 20 February 2025 at 15:46

OpenAI is facing a *major obstacle* in developing its latest artificial intelligence model. A recent report points to a *lack of sufficient data* to train a system of this complexity. The *valuation of OpenAI*, at 157 billion dollars, relies on the success of this technology. The difficulties encountered during development highlight the limits of the data available in today's digital ecosystem. The implications go beyond the purely technical, raising the question of whether artificial intelligence can keep progressing under these constraints.

Problems encountered by OpenAI’s latest model

A report from the Wall Street Journal revealed that OpenAI’s artificial intelligence project, known as GPT-5 or Orion, is significantly behind schedule. The model needs a colossal volume of training data and faces a troubling reality: the world does not hold enough data for its development.

Astronomical development costs

The push to develop a cutting-edge AI model has led to significant expenditures. Training Orion over a six-month period could cost nearly 500 million dollars. By comparison, training its predecessor, GPT-4, cost approximately 100 million dollars, about a fifth as much. These sums underscore the financial pressure on OpenAI, made worse by the need to deliver a functional model.

Project guidelines

Designed to bridge the gap between data creation and the desired outcomes, Orion was meant to surpass everything the company had built before, notably by making significant scientific discoveries and performing routine human tasks. Large-scale training runs, however, revealed significant limitations.

Lack of data on the Internet

OpenAI researchers found that the public internet, the source often used to train previous models, does not hold enough data. This shortfall has prompted the company to consider alternatives. Software engineers and mathematicians have been hired to generate new data, but the process is laborious and time-consuming.

Use of synthetic data

At the same time, OpenAI is using synthetic data, created by AI itself, to feed Orion’s training. This method carries risks, however: it has produced noticeable malfunctions and inappropriate responses that harm the model’s credibility. Such problems emerge only after intensive training phases.

Absence of significant progress

Ongoing testing has shown no significant progress toward the anticipated targets. Orion’s results so far do not justify the exorbitant costs incurred. The initial projection anticipated a model that would set the benchmark for AI, with capabilities equivalent to a PhD in artificial intelligence.

Internal challenges and external competition

OpenAI must also contend with internal governance issues, including organizational instability. Many leaders, including its co-founder and chief scientist, have left the company. This instability almost certainly weighs on the project’s progress.

Moreover, rivals such as Anthropic and Google are reaching significant milestones. Their models, often deemed superior, threaten OpenAI’s leading position in the market. As GPT-4 grows dated, the pressure on Orion will only increase.

Frequently Asked Questions

Why is OpenAI’s latest model, GPT-5, encountering difficulties during training?
The model faces obstacles due to a lack of sufficient data available on the Internet for training, complicating its effective development.
What are the consequences of the lack of data for the training of the GPT-5 model?
The lack of data can lead to suboptimal model performance, making it difficult to function as intended upon launch.
How is OpenAI attempting to address the data problem for GPT-5?
OpenAI is trying to create data from scratch by hiring software engineers and mathematicians while using synthetic data, but this proves to be a long and complicated process.
What are the financial implications of the development delays of GPT-5?
Delays at OpenAI can lead to high costs, with expenditures potentially reaching hundreds of millions of dollars without a guarantee of delivering a finished product.
Can you explain the concept of synthetic data used by OpenAI?
Synthetic data refers to data generated by artificial intelligence to train the model, but its use has shown limitations, such as incoherent or erroneous responses.
What is the link between OpenAI’s valuation and the success of GPT-5?
The valuation of OpenAI, estimated at 157 billion dollars, heavily depends on the success of GPT-5; if the model does not perform as expected, it could negatively impact investor confidence.
What alternatives does OpenAI have for developing higher-performing AI models?
OpenAI could consider collaborating with other companies to share resources or exploring different training methods that require less data.
How much time does OpenAI anticipate for the complete development of GPT-5?
The model was initially expected to be available around mid-2024, but given the difficulties encountered, that timeline has slipped.
