This is amazing. The number of crawlers that don’t respect crawling directives is out of hand. Feed them garbage till they can’t pay their bills.
Cloudflare announces AI Labyrinth, which uses AI-generated content to confuse and waste the resources of AI Crawlers and bots that ignore “no crawl” directives.
Submitted 2 days ago by Tea@programming.dev to technology@lemmy.zip
https://blog.cloudflare.com/ai-labyrinth/
Comments
draughtcyclist@lemmy.world 1 day ago
vermaterc@lemmy.ml 2 days ago
On the one hand this is good news, because allowing data to be stolen from websites would financially kill those who produce legitimate data (e.g. a news site would have its articles stolen without receiving ad revenue, as only bots visit it).
On the other hand, AI tools will get dumber, which makes me sad because I personally hoped to one day have a supercharged assistant that reads and summarises the Internet for me. Tools like Perplexity that research given topics looked very promising.
possiblylinux127@lemmy.zip 1 day ago
Calling it “stealing” is a stretch. The website is just serving up a page to a server. There is no theft in that. Is it fair to journalism? Not really, but by your definition any person who views a page is stealing by sending the request to the server.
Unless you are saying that it is somehow theft to deprive them of revenue. In that case it would be theft just by walking into a store and not buying what the store wants you to buy.
vermaterc@lemmy.ml 1 day ago
Imagine a website showing a weather forecast. The maintainers of this page are running a webserver and a service that analyses raw meteorological data to calculate tomorrow’s weather. In exchange they are making money from ads.
Now there is an AI agent that visits that page on behalf of a user, gets the forecast and shows it back to the user. The user never sees an ad, and the maintainer never sees his revenue. How is that not stealing?
jlow@beehaw.org 1 day ago
Not sure if throwing resource-wasting technology at a resource-wasting problem is a cool idea on an already burning planet …
possiblylinux127@lemmy.zip 1 day ago
Honestly we need a proof of work web standard. There should be an optional proof of work that a server can require.
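To make the idea concrete, here is a rough hashcash-style sketch, not any existing web standard. The server hands out a random challenge, the client has to find a counter whose SHA-256 hash starts with enough zero bits, and the server checks the answer with a single hash. The function names, the 8-byte counter encoding and the difficulty of 12 bits are all assumptions picked for illustration.

```python
import hashlib
import os


def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the start of a digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        # First non-zero byte: add its leading zero bits and stop.
        bits += 8 - byte.bit_length()
        break
    return bits


def make_challenge() -> bytes:
    """Server side: issue a fresh random challenge per request (hypothetical)."""
    return os.urandom(16)


def verify(challenge: bytes, counter: int, difficulty: int) -> bool:
    """Server side: one hash is enough to check the client's work."""
    digest = hashlib.sha256(challenge + counter.to_bytes(8, "big")).digest()
    return leading_zero_bits(digest) >= difficulty


def solve(challenge: bytes, difficulty: int) -> int:
    """Client side: brute-force a counter that meets the difficulty."""
    counter = 0
    while not verify(challenge, counter, difficulty):
        counter += 1
    return counter


if __name__ == "__main__":
    challenge = make_challenge()
    counter = solve(challenge, difficulty=12)  # about 4096 hashes on average
    print("accepted:", verify(challenge, counter, difficulty=12))
```

The asymmetry is the point: solving costs roughly 2^difficulty hashes on average, while checking costs one. A single visitor barely notices, but a crawler fetching millions of pages pays that cost every time.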