Cloudflare accuses Perplexity of illegal web crawling

Published on 6 August 2025 at 09:55
Updated on 6 August 2025 at 09:55

Cloudflare accuses Perplexity of illegal crawling, behavior it says undermines trust on the web. The artificial intelligence startup stands accused of using evasive methods to access protected content. The term *“illegal crawling”* crystallizes the questions surrounding digital ethics.

The collection of data is often governed by strict standards. Ignoring these rules weakens the very foundation of cybersecurity. Perplexity claims to operate differently, but its actions raise serious concerns.

The tension between technological innovation and respect for creators’ rights lies at the heart of this debate, and the potential consequences reach across the digital landscape.

Accusations by Cloudflare

The cybersecurity company Cloudflare has made formal accusations against Perplexity, alleging that it crawls websites illegally and bypasses both security measures and data collection rules.

“Stealth crawling” behavior

According to Cloudflare, Perplexity adopts a “stealth crawling” strategy, characterized by unauthorized exploration of websites. This method allows it to access data while ignoring the instructions specified in the robots.txt files of the affected sites.
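For context, robots.txt is a voluntary convention: a compliant crawler reads the file and checks each URL against it before fetching. The following is a minimal sketch of that check using Python's standard `urllib.robotparser`. The bot name `ExampleBot`, the rules, and the example.com URLs are illustrative assumptions and do not come from any real site or from Cloudflare's report.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one named bot is fully excluded,
# every other bot is only kept out of /private/.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("ExampleBot", "OtherBot"):
        for url in ("https://example.com/page", "https://example.com/private/x"):
            print(agent, url, is_allowed(ROBOTS_TXT, agent, url))
```

Nothing in this mechanism enforces the answer: a crawler that skips the check, or that announces itself under a different name, is not stopped by robots.txt alone.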

When firewalls block its requests, Perplexity appears to change its digital identity. According to Cloudflare, the startup disguises its user agent and switches autonomous system numbers (ASNs) to circumvent restrictions.
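To see why such a change of identity matters, consider a simplified firewall rule that blocks requests by user-agent token or by source ASN. The sketch below is an assumption-laden illustration, not Cloudflare's actual detection logic: the `Request` type, the `ExampleBot` token, and the ASNs (taken from the range reserved for documentation) are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical block rules: a declared bot token and the ASN it is known to use.
BLOCKED_UA_TOKENS = ("examplebot",)
BLOCKED_ASNS = {64496}  # ASN from the range reserved for documentation


@dataclass(frozen=True)
class Request:
    user_agent: str  # value of the User-Agent header
    asn: int         # autonomous system the source IP belongs to


def is_blocked(req: Request) -> bool:
    """Block when the user agent contains a listed token or the ASN is listed."""
    ua = req.user_agent.lower()
    return any(token in ua for token in BLOCKED_UA_TOKENS) or req.asn in BLOCKED_ASNS


if __name__ == "__main__":
    declared = Request("ExampleBot/1.0", 64496)
    disguised = Request("Mozilla/5.0 (Macintosh)", 64497)
    print(is_blocked(declared))   # matches both rules
    print(is_blocked(disguised))  # matches neither rule
```

A rule of this kind only works against clients that identify themselves consistently, which is why a crawler that changes both its user agent and its network would slip past it.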

Tests conducted by Cloudflare

Cloudflare acted on complaints from several clients who reported unauthorized access to their sites by Perplexity’s bots. The security company then set up tests: it created web pages unknown to Perplexity’s bots and blocked them from being crawled.

Despite these measures, Perplexity provided information drawn from these new pages. The tests suggest problematic behavior that raises questions about the compliance of the startup’s data collection methods.

Response from Perplexity

In response to Cloudflare’s accusations, Perplexity defended itself. According to its representatives, the tools in question are not robots in the traditional sense but “AI assistants” activated by the user. This distinction aims to blunt the accusations of illegal crawling.

Perplexity insists that its system neither stores the collected data nor uses it for training. Unlike conventional crawlers, the tool simply retrieves relevant information at the user’s request.

Ongoing controversies

This situation fuels the already numerous criticisms against artificial intelligence companies. They are often accused of mass data collection on the web without explicit consent. These practices raise ethical questions regarding privacy and individual rights.

As these legal conflicts emerge, the reputation of companies in the AI sector continues to suffer. Cloudflare’s accusations against Perplexity, although unresolved, reinforce concerns about data collection practices in an increasingly digital world.

For a similar dispute, see the case involving Reddit and Anthropic over alleged illegal data exploitation. Questions of ethics and legality remain at the core of the debate in the technology sector.

For more on the evolution of AI, the related article on improvements to AI models, such as those of Mistral AI, offers further perspective on these fast-changing technologies.

Frequently Asked Questions

What accusations has Cloudflare made against Perplexity?
Cloudflare accuses Perplexity of accessing websites and harvesting data from them without authorization, bypassing the security measures implemented by those sites.

What is the “stealth crawling” mentioned by Cloudflare?
“Stealth crawling” refers to covert crawling in which a search engine, like Perplexity, accesses a site’s content without respecting its robots.txt rules or firewall blocks.

How does Perplexity justify its operation in the face of accusations?
Perplexity argues that its “AI assistants” are not merely crawling robots, but agents that retrieve data in response to specific user queries without storing that information.

What are the potential consequences for Perplexity if the accusations are proven?
If the accusations of violating data access rights are proven, Perplexity could face legal actions, fines, and damage to its reputation in the AI market.

Have there been other similar incidents involving AI companies?
Yes, artificial intelligence companies have been criticized for collecting data without consent on the web, raising ethical and legal concerns.

What measures can websites take to protect their data?
Websites can use robots.txt files, firewall rules, and other security systems to prevent unauthorized access from these crawling tools.
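As a rough illustration of the first of these measures, a site owner could generate a robots.txt that excludes specific crawlers entirely. The helper below is a hypothetical sketch, and `ExampleBot` is a placeholder name; it only writes the rules, and a robots.txt file has an effect only on crawlers that choose to honor it.

```python
from typing import Iterable


def build_robots_txt(blocked_agents: Iterable[str],
                     disallow_for_all: Iterable[str] = ()) -> str:
    """Build robots.txt text that excludes the given agents from the whole site.

    blocked_agents: user-agent tokens to exclude entirely.
    disallow_for_all: paths that every other crawler should stay out of.
    """
    lines = []
    for agent in blocked_agents:
        # One group per blocked agent, disallowing the site root.
        lines += [f"User-agent: {agent}", "Disallow: /", ""]
    lines.append("User-agent: *")
    paths = list(disallow_for_all)
    if paths:
        lines += [f"Disallow: {path}" for path in paths]
    else:
        # An empty Disallow value means nothing is off limits.
        lines.append("Disallow:")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_robots_txt(["ExampleBot"], ["/private/"]))
```

Because compliance is voluntary, such a file is usually paired with firewall rules that enforce the block at the network level.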

Is it legal for Perplexity to access data on websites?
The legality depends on the circumstances, such as compliance with the rules set by the sites on crawling and data collection. Ignoring these rules may constitute a violation of copyright or terms of service.

How did Cloudflare discover Perplexity’s behavior?
Cloudflare received complaints from clients and ran tests by creating new, unindexed sites, which Perplexity was able to access despite the restrictions in place.
