From skimming that linked page, I think this download perhaps doesn’t include recent pages, because the section about enterprise stuff mentions the paid API for recent articles.
Comment on Wikipedia urges AI companies to use its paid API, and stop scraping
usernameusername@sh.itjust.works 13 hours ago
I don’t get it though… Why would any company use this when Wikimedia also offers a download of the entirety of Wikipedia, for free?
Maybe it’s just that if the AI companies don’t know about the free download, Wikimedia can hopefully get a little money from them?
AnarchistArtificer@slrpnk.net 13 hours ago
usernameusername@sh.itjust.works 12 hours ago
It seems you’re right, I’m just dumb and didn’t read the article I linked.
Crashumbc@lemmy.world 12 hours ago
You think AI companies care what they scrape? Their system is set up to scrape anything it can get.
BanMe@lemmy.world 40 minutes ago
They can scrape an ongoing log of interactions between editors about the articles themselves, which is probably fairly worthwhile content honestly. More content there than in articles probably as well.