Multiple AI companies bypassing web standard to scrape publisher sites, licensing firm says

Submitted 2 years ago by BrikoX@lemmy.zip to technology@lemmy.zip

https://www.reuters.com/technology/artificial-intelligence/multiple-ai-companies-bypassing-web-standard-scrape-publisher-sites-licensing-2024-06-21/

Multiple artificial intelligence companies are circumventing a common web standard used by publishers to block the scraping of their content for use in generative AI systems, content licensing startup TollBit has told publishers.

Comments

Arghblarg@lemmy.ca 2 years ago
Sounds like we’re all going to need to start putting the equivalent of Trap Streets in all our web content, source code, etc.

I heard someone has already had success placing nonsense in a white-on-white box of their site, later querying commercial AI to prove it was ingested w/o permission.
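
A minimal sketch of how such a "trap street" could be set up, not a description of what that site owner actually did. The function names (`make_canary`, `hidden_snippet`, `was_ingested`) and the inline styling are assumptions made for illustration; `aria-hidden="true"` is included so that screen readers skip the nonsense text.

```python
import secrets


def make_canary(prefix: str = "canary") -> str:
    """Return a unique, unguessable marker string (hypothetical helper)."""
    return f"{prefix}-{secrets.token_hex(8)}"


def hidden_snippet(token: str) -> str:
    """Wrap the token in white-on-white markup that assistive tech ignores."""
    return (
        '<span aria-hidden="true" '
        'style="color:#fff;background:#fff;font-size:1px">'
        f"{token}</span>"
    )


def was_ingested(token: str, model_output: str) -> bool:
    """Check whether a model's reply reproduces the marker (case-insensitive)."""
    return token.lower() in model_output.lower()


if __name__ == "__main__":
    token = make_canary()
    print(hidden_snippet(token))  # paste this into the page template
    # Later: feed the text of a model's answer to was_ingested(token, answer).
```

A hit is only suggestive: it shows the marker reached the model somehow, not which crawler fetched it or whether it ignored any opt-out.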

- LiveLM@lemmy.zip 2 years ago
  My fear is that those techniques will make the lives of people using screen readers increasingly harder
  
- recursive_recursion@programming.dev 2 years ago
  Here’s another example/variant (The Office - Recorder)
  
- onlinepersona@programming.dev 2 years ago
  There is probably a way to poison AI training material, and it could be a handy feature for social media.
  
  Anti Commercial-AI license
  
homesweethomeMrL@lemmy.world 2 years ago
Hey, member when Google drove around and sopped up everyone’s Wi-Fi info and was all like, “What? We found it.” Then they threw it on the pile of data-4-sale that they’re still drowning in cash from?

Message received and understood! Oh, uh, here’s a couple-hundred-million fine for the uh, imposition. We’ll just leave it on the nightstand.

possiblylinux127@lemmy.zip 2 years ago
The Linux Mint forums got AI DDoSed.

onlinepersona@programming.dev 2 years ago
Red light? Sorry, didn’t see it. Was going too fast. Don’t worry, it’s water under the bridge. I forgive you for putting it there.

Anti Commercial-AI license
