Wikipedia is giving AI developers its data to fend off bot scrapers

26 likes

Submitted 3 weeks ago by Tea@programming.dev to technology@lemmy.zip

https://enterprise.wikimedia.com/blog/kaggle-dataset/

Comments

  • SaltSong@startrek.website 3 weeks ago

    Is this “surrender to avoid being defeated,” or am I misunderstanding the case?

    • spankmonkey@lemmy.world 3 weeks ago

      The post title is phrased that way, but you can already download Wikipedia, and the article sounds like they are presenting it in a new way for a new audience.

      • p03locke@lemmy.dbzer0.com 3 weeks ago

        It’s a common problem. People write bot scrapers for public data, which costs a lot of bandwidth, when they could have easily just downloaded the entire dataset from a dedicated link. Finding better ways to tell them “Hey, morons, go download the goddamn link!” saves on that bandwidth and web server CPU.
