Anthropic’s ClaudeBot Violates Website Anti-AI Scraping Policies
Anthropic’s ClaudeBot web crawler has come under fire for ignoring websites' anti-AI scraping policies, causing significant issues for site owners like iFixit.
iFixit CEO's Complaint
iFixit CEO Kyle Wiens revealed that ClaudeBot hit their website's servers nearly a million times in just 24 hours, violating the company's Terms of Use. Wiens took to X to express his frustration, stating, “If any of those requests accessed our terms of service, they would have told you that use of our content is expressly forbidden. But don’t ask me, ask Claude!” He posted images showing Anthropic’s chatbot acknowledging that iFixit’s content was off-limits and added, “You’re not only taking our content without paying, you’re tying up our devops resources. If you want to have a conversation about licensing our content for commercial use, we’re right here.”
Impact on iFixit
Wiens described the situation as an anomaly, stating, “The rate of crawling was so high that it set off all our alarms and spun up our devops team.” Despite iFixit's familiarity with handling web crawlers due to its high traffic, the aggressive scraping by ClaudeBot was unprecedented.
Terms of Use Violations
iFixit’s Terms of Use explicitly prohibit the reproduction, copying, or distribution of any content from their website without prior written permission, specifically including “training a machine learning or AI model.” When questioned, Anthropic pointed to an FAQ page stating that its crawler can be blocked via an extension to a site's robots.txt file.
Measures Taken
Wiens confirmed that iFixit added the crawl-delay extension to its robots.txt, which stopped the scraping. “Based on our logs, they did stop after we added it to the robots.txt,” Wiens said. Anthropic spokesperson Jennifer Martinez stated, “We respect robots.txt and our crawler respected that signal when iFixit implemented it.”
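For illustration, a robots.txt entry along these lines could throttle or block the crawler. This is a hedged sketch, not iFixit's actual file: the exact directives a site uses, and whether a given crawler honors the nonstandard Crawl-delay extension, vary.

```text
# Slow down Anthropic's crawler (Crawl-delay is a nonstandard
# extension; support depends on the crawler honoring it)
User-agent: ClaudeBot
Crawl-delay: 10

# Or block it from the site entirely
# User-agent: ClaudeBot
# Disallow: /
```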
Wider Issues with AI Scraping
iFixit is not alone in this experience. Read the Docs co-founder Eric Holscher and Freelancer.com CEO Matt Barrie reported similar issues with Anthropic’s crawler. ClaudeBot’s aggressive scraping behavior has been a concern for months, with several reports on Reddit and an incident involving the Linux Mint web forum in April attributing site outages to ClaudeBot’s activities.
Challenges with robots.txt
Disallowing crawlers via robots.txt is the standard opt-out mechanism offered by AI companies like OpenAI. However, the method lacks flexibility: it gives site owners no way to specify which kinds of scraping are permissible. Another AI company, Perplexity, is known to ignore robots.txt exclusions entirely. Despite its limitations, the robots.txt file remains one of the few tools companies have to keep their data from becoming AI training material, as seen in Reddit's recent crackdown on web crawlers.
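The opt-out model described above depends entirely on crawlers checking the rules voluntarily. As a minimal sketch of what a well-behaved crawler is expected to do, Python's standard-library robots.txt parser can evaluate whether a given user agent may fetch a URL (the rules and URLs below are hypothetical):

```python
from urllib import robotparser

# Hypothetical robots.txt rules blocking ClaudeBot from the whole site.
rules = """\
User-agent: ClaudeBot
Disallow: /
"""

parser = robotparser.RobotFileParser()
parser.parse(rules.splitlines())

# A compliant crawler identifying as ClaudeBot should treat the site
# as off-limits, while agents without a matching rule are unaffected.
print(parser.can_fetch("ClaudeBot", "https://example.com/guide"))  # False
print(parser.can_fetch("OtherBot", "https://example.com/guide"))   # True
```

Nothing enforces this check, which is why the only real recourse for site owners is whether a crawler's operator chooses to respect the signal.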