phx
@phx@lemmy.world
- Comment on Amazon discovered a 'high volume' of CSAM in its AI training data but isn't saying where it came from 13 hours ago:
They fed them on the Internet, including libraries of pirated material. It’s like drinking from a fountain at a sewage plant.
- Comment on Amazon discovered a 'high volume' of CSAM in its AI training data but isn't saying where it came from 13 hours ago:
Yeah, a lot of people seem to think that these companies built these AIs by buying or building some sort of special training dataset, when in reality no such thing really existed.
They’ve basically just scraped every bit of data they can. When it comes to big corps, at least some of that data is likely from scraping customers’ data. There’s also scraping of the Internet in general, including sites such as Reddit (which is a big reason why Reddit locked down its API: they wanted to sell that data), but many have also been caught with a ton of ‘pirated’ data from torrents, etc.
I’m sure there was a certain amount of sludge in customers’ synced files and on sites like Reddit, but I’d also hazard a guess that the stuff grabbed from torrents etc. likely had some truly heinous material that they simply added to what was getting force-fed to the AIs, especially the early ones.
- Comment on Latest Steam Deck update will warn you if an Xbox controller needs upgrading 1 month ago:
Agreed. Multi-pairing is such an underrated feature. It’s especially great when it comes with a hardware selector (in case you’ve got multiple devices paired in range).
I picked up a Logitech mouse that pairs with up to 3 devices and has a button and LEDs on the bottom to switch between them. It’s awesome.