Bluesky may not train AI on your posts, but others can, and users are angry

Published on 21 February 2025 at 15:10
Updated on 21 February 2025 at 15:10

Bluesky promises to respect *user privacy* by not training AI on users’ posts, but concerns remain about how third parties use that data. The resulting loss of trust has sparked *genuine outrage* among members of this growing community, who are demanding *transparency* and guarantees against potential abuse. Bluesky’s commitments on machine learning collide with a reality in which an open architecture raises essential questions about data control.

Bluesky, a safe haven for users?

Bluesky has attracted users dissatisfied with the practices of large platforms like X and Meta. Its decentralized model promises them greater control over their personal data. The social network announced that it would not train artificial intelligence (AI) on users’ posts, setting it apart from many competitors.

A regrettable incident: the case of Daniel van Strien

Daniel van Strien, a librarian specializing in machine learning, recently created a dataset containing one million Bluesky posts. Collected through the platform’s Firehose API, the dataset included content that could be traced back to its authors through *decentralized identifiers* (DIDs). Although his intention was to support AI research, the lack of anonymization raised concerns among users.
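
To illustrate what the missing anonymization step could have looked like, here is a minimal Python sketch that replaces each author identifier with a keyed hash before a dataset is shared. It is not the method used for the dataset in question: the record layout (`did` and `text` fields), the sample identifiers, and the function names are assumptions made for illustration. Note also that this is pseudonymization rather than full anonymization, since the text of a post can still reveal who wrote it.

```python
import hashlib
import hmac


def pseudonymize_did(did: str, secret_key: bytes) -> str:
    """Replace a DID with a keyed hash so the raw identifier is not published."""
    digest = hmac.new(secret_key, did.encode("utf-8"), hashlib.sha256).hexdigest()
    return "anon-" + digest[:16]


def pseudonymize_posts(posts, secret_key):
    """Return copies of the posts with the author DID replaced by a pseudonym."""
    return [
        {**post, "did": pseudonymize_did(post["did"], secret_key)}
        for post in posts
    ]


# Hypothetical sample records; the field names are assumptions,
# not Bluesky's actual schema.
sample_posts = [
    {"did": "did:plc:exampleuser1", "text": "Hello, world"},
    {"did": "did:plc:exampleuser2", "text": "Another public post"},
    {"did": "did:plc:exampleuser1", "text": "A second post by the same author"},
]

anonymized = pseudonymize_posts(sample_posts, secret_key=b"keep-this-key-private")
```

Using a keyed hash (HMAC) rather than a plain hash means that someone who knows a user’s DID cannot recompute the pseudonym without the secret key, while posts by the same author still share a pseudonym for research purposes.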

Users react vehemently

Following the publication of this dataset, many users expressed their outrage. They condemned the use of their content without explicit consent as a direct affront to the principles on which Bluesky was founded. The controversy escalated, and van Strien ultimately withdrew the dataset and apologized.

The implications of the Firehose API

Bluesky designed its Firehose API to promote transparency. This feature gives users and researchers access to real-time streams of public posts. However, that accessibility raises questions about respect for *user privacy* and the risk of data misuse. The API is a double-edged sword that leaves the door open to questionable uses.

A promise to respect personal data

Despite the incident, Bluesky reaffirms its commitment not to use user data for training AI models. A spokesperson mentioned the idea of creating mechanisms that would allow users to signal their consent for the use of their content in such projects. No concrete solution has emerged yet.
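
Since no such mechanism exists yet, the following Python sketch is purely hypothetical. It only illustrates the general idea of an opt-in signal: a dataset builder keeps a post only if its author appears in a set of users who have consented. The `did` field, the opt-in set, and the function name are all assumptions, not part of any Bluesky API.

```python
def filter_by_consent(posts, opted_in_dids):
    """Keep only posts whose author appears in the opt-in set."""
    return [post for post in posts if post["did"] in opted_in_dids]


# Hypothetical records and consent list, for illustration only.
collected_posts = [
    {"did": "did:plc:exampleuser1", "text": "Happy to share this for research"},
    {"did": "did:plc:exampleuser2", "text": "I never agreed to anything"},
]
opted_in = {"did:plc:exampleuser1"}

training_candidates = filter_by_consent(collected_posts, opted_in)
```

An opt-in design like this excludes content by default. An opt-out design would do the reverse, including every post unless its author objects, which is closer to the situation users are criticizing.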

A call for vigilance for users

The situation illustrates the ambiguities surrounding decentralization. Users who left other platforms over concerns about how their content was treated are realizing that this model is not an absolute guarantee. The scale of the debate over data usage highlights the growing tension between technological innovation and the protection of privacy.

Echoes from the past on Bluesky

This debate recalls the heated discussions that once raged on older platforms like Twitter. The anger now rising against Bluesky may represent the first major crisis for this social network. How it plays out could influence the path the platform takes as it seeks to grow and improve its user experience.

Political initiatives in response to the situation

User mistrust of large tech firms has prompted political action. British MPs are seeking to call Elon Musk to discuss the impact of X on user data, illustrating the pressure that major platforms now face. These events highlight the need for stricter regulation of the digital ecosystem.

Frequently asked questions about data usage on Bluesky

Does Bluesky really use user content to train AIs?
No. Bluesky says it does not use user content to train generative AI models, unlike other platforms. However, third parties may use this data.
Why are users angry about data on Bluesky?
Users are concerned that, although Bluesky does not train AI with their posts, other entities may exploit this data, raising privacy and consent issues.
What is Bluesky’s Firehose API, and how does it affect my data?
Bluesky’s Firehose API provides a real-time stream of all public posts on the platform. This means that data can be collected and used by third parties without explicit user consent.
Does Bluesky plan to implement measures to protect user data in the future?
Bluesky has expressed interest in developing tools to allow users to signal their consent, but no concrete solution is in place yet.
What is the difference between data usage by Bluesky and that of other social networks like X?
While platforms like X include clauses in their terms of use that allow user data to be used for AI models, Bluesky positions itself as an alternative that does not do this, although the risk of unwanted use by others persists.
Can users request the deletion of their data on Bluesky?
Currently, Bluesky allows users to delete their account, which will result in data deletion, but there is not yet a specific mechanism for keeping data out of the hands of third parties.
How can I be sure that my posts on Bluesky will not be used by third parties?
There is no total guarantee since Bluesky’s open architecture allows third parties to freely access public data. It is essential to remain vigilant about what you post.
What does the concept of decentralization mean for my privacy on Bluesky?
Decentralization aims to give users more control over their data, but it can also allow third parties to access information without restrictions, raising privacy concerns.
What should I do if I don’t want my posts to be used by AI researchers?
It is advisable to exercise caution when posting sensitive information and explore the privacy settings available on Bluesky, although they remain limited.
