The National Audiovisual Institute (INA) is reinventing itself in response to the rapid rise of Artificial Intelligence. The technology does more than optimize processes; it is changing how heritage archives are accessed and showcased. *More than 700,000 hours of audiovisual content* have been analyzed to give them unprecedented visibility. INA thus becomes a pioneer in applying AI to archives, changing how cultural resources are perceived. *Enriched metadata allows for a more refined exploration of this heritage*. The stakes go beyond mere access: the initiative reshapes how digital culture is defined and appropriated.
A New Era for INA
The National Audiovisual Institute (INA) has crossed a decisive threshold by integrating Artificial Intelligence (AI) into the way it showcases its archives. The data.ina.fr project offers a new perspective on decades of audiovisual archives by applying automated analysis tools. For the site, launched in October 2024, more than 700,000 hours of audiovisual content were analyzed.
Analysis of Audiovisual Content
The platform relies on three key AI-based tools. One of them, INASpeechSegmenter, distinguishes speakers by gender. This improves the discoverability of content and makes it easier to analyze major media trends.
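To make the idea concrete, here is a minimal Python sketch of how gender-labeled speech segments could be turned into a speaking-time share. It does not call INASpeechSegmenter itself: the segment layout `(label, start, end)`, the label names, and the `speech_share` helper are assumptions made for illustration, not INA's actual pipeline.

```python
from collections import defaultdict

# Hypothetical segments: (label, start_seconds, end_seconds).
# The labels and the tuple layout are assumptions for illustration only.
segments = [
    ("male", 0.0, 12.5),
    ("music", 12.5, 20.0),
    ("female", 20.0, 27.5),
    ("male", 27.5, 30.0),
]


def speech_share(segments, speech_labels=("male", "female")):
    """Return each speech label's share of the total speaking time."""
    totals = defaultdict(float)
    for label, start, end in segments:
        # Non-speech segments (music, noise, silence) are ignored.
        if label in speech_labels:
            totals[label] += end - start
    total_speech = sum(totals.values())
    if total_speech == 0:
        return {}
    return {label: duration / total_speech for label, duration in totals.items()}


shares = speech_share(segments)
for label, share in sorted(shares.items()):
    print(f"{label}: {share:.1%}")
```

Aggregated over many programs, shares of this kind are what would let a reader compare speaking time by gender across channels or periods.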
The platform does not merely store data. It draws on metadata to link archive elements together and reveal unexpected correlations. Through data visualization, users can explore *the evolution of terms* or how often certain personalities appear in the media.
The Implementation Methodology
Camille Pettineo, editorial manager, emphasizes the importance of human verification in the process. After an initial phase of analysis, the results are checked by experts. This verification is essential to the reliability of the data, since *700,000 hours analyzed* represent a colossal volume in which even a small error rate adds up. A dedicated verification step is therefore needed to ensure the accuracy of the information presented.
Xavier Lemarchand, mission director at INA, describes the creation of a representative archive corpus, which serves as a basis for comparing results generated by AI. This method brings the rigor needed to catch recurring errors in the analysis.
The Added Value of AI in Archives
Advances in AI make it possible to explore the archives in *historical depth* and to highlight concepts covered in the media. Users have access to several years of archives and can view the data by year, month, or day. This flexibility makes for a rich experience for anyone interested in media history.
Going forward, the platform will be enriched every six months, increasing its historical depth. Scheduled updates keep the information relevant and adapted to contemporary challenges.
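Viewing data by year, month, or day amounts to grouping the same dated records at different granularities. The sketch below shows one way this could work in Python; the `mentions` records, the `FORMATS` table, and the `count_by_period` helper are hypothetical and are not taken from data.ina.fr.

```python
from collections import Counter
from datetime import date

# Hypothetical records: (broadcast date, term mentioned on air).
mentions = [
    (date(2023, 1, 5), "climate"),
    (date(2023, 1, 5), "climate"),
    (date(2023, 1, 20), "climate"),
    (date(2023, 2, 3), "climate"),
    (date(2024, 2, 3), "inflation"),
]

# One date format per granularity; the formatted string is the grouping key.
FORMATS = {"year": "%Y", "month": "%Y-%m", "day": "%Y-%m-%d"}


def count_by_period(mentions, term, granularity="month"):
    """Count mentions of `term` per year, month, or day, in chronological order."""
    fmt = FORMATS[granularity]
    counts = Counter(d.strftime(fmt) for d, t in mentions if t == term)
    return dict(sorted(counts.items()))


print(count_by_period(mentions, "climate", "year"))
print(count_by_period(mentions, "climate", "month"))
```

Because the keys are zero-padded ISO-style strings, sorting them alphabetically also sorts them chronologically, which keeps the helper short.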
The Challenges Related to the Use of AI
A major challenge related to the use of AI is the risk of bias in data processing. INA has opted for a transparent approach, reporting the biases it identifies without attempting to correct them. This decision avoids introducing human biases in an attempt to offset algorithmic ones.
At this volume, even rare errors become numerous, so the validation process is particularly rigorous. Analysis and verification unfold in three stages: ground-truth verification, a check on the completeness of processing, and a relevance check.
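As an illustration only, the first two checks could be sketched in Python as follows: AI output is compared with human-validated labels on a reference corpus, and the share of reference segments that were actually processed is measured. The `evaluate` function, the segment identifiers, and the labels are hypothetical; the relevance check is an editorial judgment and is not modeled here.

```python
def evaluate(predicted, reference):
    """Compare AI labels with human-validated labels on a reference corpus.

    Both arguments map a segment id to a label.
    """
    # Completeness: share of reference segments for which the AI produced output.
    processed = [seg_id for seg_id in reference if seg_id in predicted]
    completeness = len(processed) / len(reference) if reference else 0.0

    # Accuracy: share of processed segments whose label matches the human one.
    correct = sum(1 for seg_id in processed if predicted[seg_id] == reference[seg_id])
    accuracy = correct / len(processed) if processed else 0.0

    return {"completeness": completeness, "accuracy": accuracy}


# Hypothetical data: four human-validated segments, three processed by the AI.
reference = {"seg1": "female", "seg2": "male", "seg3": "male", "seg4": "female"}
predicted = {"seg1": "female", "seg2": "male", "seg3": "female"}

report = evaluate(predicted, reference)
print(report)
```

Keeping the two figures separate matters: a tool can be accurate on what it processes while silently skipping part of the corpus.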
INA’s Place in Digital Transformation
INA’s transition to the digital era is reflected in the development of its services. The madelen platform and initiatives like INA Hip Hop testify to this evolution and increase its influence in the media landscape.
The implications of this integration of AI are not limited to showcasing the archives. INA is also considering how to use the data collected through the legal deposit of the web, although the heterogeneity of that data remains a challenge.
Conclusion on the Implications of AI
The implications of AI for INA go beyond the technological framework. The project reflects a societal approach that combines heritage and innovation, giving the public simplified and intuitive access to an audiovisual treasure. By positioning itself as a key player in showcasing audiovisual content, INA illustrates a model of transformation suited to contemporary digital demands.
Frequently Asked Questions
How does Artificial Intelligence improve access to INA’s archives?
Artificial Intelligence allows for better description and analysis of audiovisual content, making the archives easier to discover and explore. This includes processing large amounts of data to detect media trends over the long term.
What types of archives have been analyzed by INA using AI?
INA has analyzed more than 700,000 hours of audiovisual content, mainly from news broadcasts, news channels, and other audiovisual programs over several years, to provide relevant and actionable data.
What AI tools did INA use for the data.ina.fr project?
INA developed several internal tools and used third-party solutions like Whisper for transcription and TextRazor for text analysis to improve the quality of the metadata associated with the archives.
How does INA guarantee the reliability of data processed by AI?
INA has established a three-step control process: “ground truth,” which compares AI-generated data with data validated by humans, as well as checks to ensure the completeness and relevance of the results.
What are the implications of using AI for audiovisual heritage?
This use raises ethical issues, notably the risk of bias. INA chooses not to correct the biases it identifies but to report them, preserving the integrity of large-scale data while offering transparency and raising public awareness.
How often is the data on the data.ina.fr platform updated?
The platform is updated every six months with new data, continuously enriching the analysis of the archives and allowing for the exploration of increasingly extensive historical periods.
What types of data visualizations are available on data.ina.fr?
Users can access a variety of interactive visualizations, including timeline graphs, rankings of the 10 or 20 personalities most cited in the media, and other temporal filtering options.