AI agents get office tasks wrong around 70% of the time, and a lot of them aren't AI at all

43 likes

Submitted 1 year ago by cm0002@lemmy.world to technology@lemmy.zip

https://www.theregister.com/2025/06/29/ai_agents_fail_a_lot/

Comments

EpeeGnome@feddit.online 1 year ago

"Basically their methodology was that they asked ChatGPT whether the job could be automated," he explained. "They also asked people whether the job could be automated and then they said ChatGPT and people agreed some portion of the time."

lol. This is such an idiotic thing to do. "Hey, you know that linguistic pattern matcher that doesn't actually reason or introspect? Since it can talk, why don't we just ask it what it can and can't do?" Seeing this, published by an AI research institute no less, is what inspired the creators of the actually rigorous test the article is about. It inspired me with a desire to smack those idiots upside their empty heads.
