Comment on Data centers contain 90% crap data

spankmonkey@lemmy.world 1 week ago

The Cloud made the crap data problem infinitely worse. The Cloud is what happens when the cost of storing data is less than the cost of figuring out what to do with the crap.

Yeah, cheaper to hold it just in case is actually a best case scenario for audit trails and the occasional look back. If 99.9999% is useless down the road but one file answers some obscure question, and it would have been more expensive to sort through it all up front, then the cost savings and benefit were worth it financially.

And nobody in management cares because it’s so ‘cheap’ to store data. And this is what AI is being trained on. And we wonder why AI gets stuff wrong so often? Crap data in. Crap data out. And nobody cares.

Hold up. No, you don’t get to blame cheap data retention for AI being shit. AI is shit because they train it on this shitty data instead of curating better quality data. AI gets shit wrong because they are training it on reddit data, humor subreddits included, instead of educationally verified content. Libraries curate their content; AI developers just jam whatever they can find into their models.

People and companies are not responsible for AI using their shitty content and presenting it as a reliable source of information.
