Why it’s a mistake to ask chatbots about their mistakes

18 likes

Submitted 1 day ago by misk@piefed.social to technology@lemmy.zip

https://arstechnica.com/ai/2025/08/why-its-a-mistake-to-ask-chatbots-about-their-mistakes/


Comments

  • givesomefucks@lemmy.world 1 day ago

    Why would an AI system provide such confidently incorrect information about its own capabilities or mistakes? The answer lies in understanding what AI models actually are—and what they aren’t.

    What’s ironic is this is one of the most human things about AI…

    when an object is presented in the right visual field, the patient responds correctly verbally and with his/her right hand. However, when an object is presented in the left visual field the patient verbally states that he/she saw nothing, and identifies the object accurately with the left hand only (Gazzaniga et al., 1962; Gazzaniga, 1967; Sperry, 1968, 1984; Wolman, 2012). This is concordant with the human anatomy; the right hemisphere receives visual input from the left visual field and controls the left hand, and vice versa (Penfield and Boldrey, 1937; Cowey, 1979; Sakata and Taira, 1994). Moreover, the left hemisphere is generally the site of language processing (Ojemann et al., 1989; Cantalupo and Hopkins, 2001; Vigneau et al., 2006). Thus, severing the corpus callosum seems to cause each hemisphere to gain its own consciousness (Sperry, 1984). The left hemisphere is only aware of the right visual half-field and expresses this through its control of the right hand and verbal capacities, while the right hemisphere is only aware of the left visual field, which it expresses through its control of the left hand.

    academic.oup.com/brain/article/140/5/…/2951052?lo…

    Tldr:

    They severed the connection between people’s brain hemispheres, and only the hemisphere controlling the right side of the body could speak.

    So if you showed text that said “draw a circle” to the left visual field only, the left hand would draw a circle.

    Ask the patient why, and they’d invent a reason and 100% believe it’s true.

    It’s why it seems like people are just doing shit and rationalizing it later…

    That’s kind of how we’re wired to work, and why humans can rationalize almost anything.

    • Technus@lemmy.zip 21 hours ago

      A neurotypical human mind, acting rationally, is able to remember the chain of thought that led to a decision, understand why they reached that decision, find the mistake in their reasoning, and start over from that point to reach the “correct” decision.

      Even if they don’t remember everything they were thinking about, they can reason based on their knowledge of themselves and try to reconstruct their mental state at the time.

      This is the behavior people expect from LLMs, without understanding that it’s something LLMs are fundamentally incapable of.

      One major difference (among many others, obviously) is that AI models as currently implemented don’t have any kind of persistent working memory. All they have for context is the last N tokens they’ve generated, the last N tokens of user input, and any external queries they’ve made. All the intermediate calculations (the “reasoning”) that led to them generating that output are lost.

      Any instance of an AI appearing to “correct” its mistake is just the model emitting what it thinks a correction would look like, given the current context window.
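
      A minimal sketch of what that means in practice (Python, with a hypothetical generate() stand-in rather than any real vendor SDK): the only state that crosses between turns is the visible transcript, clipped to a bounded window, so an “explanation” of an earlier mistake can only ever be new text conditioned on that transcript.

          from typing import Dict, List

          Message = Dict[str, str]  # e.g. {"role": "user", "content": "..."}

          def generate(context: List[Message]) -> str:
              """Hypothetical model call: the output depends only on the tokens
              visible in `context`; no internal activations survive the call."""
              return f"(completion conditioned on {len(context)} visible messages)"

          history: List[Message] = []
          for user_text in ["Write a sort function.", "Why did you get that wrong?"]:
              history.append({"role": "user", "content": user_text})
              reply = generate(history[-20:])  # only a bounded window is ever seen
              history.append({"role": "assistant", "content": reply})
              # Nothing else persists between iterations: there is no record of the
              # computation that produced `reply`, only the text of `reply` itself.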

      Humans also learn from their mistakes and generally make an effort to avoid them in the future. That doesn’t happen for LLMs until the data gets incorporated into the training of the next version of the model, which can take months to years. That’s also why AI companies are trying to capture and store everything from user interactions, which is a privacy nightmare.

      It’s not a compelling argument to compare AI behavior to that of a dysfunctional human brain and go “see, humans do this too, teehee!” Not when the whole selling point of these things is that they’re supposed to be smarter and less fallible than most humans.

      I’m deliberately trying not to be ableist in my wording here, but it’s like saying, “hey, you know what would do wonders for productivity and shareholder value? If we fired half our workforce, then found someone with no experience, short-term memory loss, ADHD and severe untreated schizophrenia, then put them in charge of writing mission-critical code, drafting laws, and making life-changing medical and business decisions.”

      I’m not saying LLMs aren’t technically fascinating and a breakthrough in AI development, but the way they have largely been marketed and applied is scammy, misleading, and just plain irresponsible.

      • givesomefucks@lemmy.world 21 hours ago

        A neurotypical human mind, acting rationally, is able to remember the chain of thought that led to a decision, understand why they reached that decision, find the mistake in their reasoning, and start over from that point to reach the “correct” decision.

        No.

        What we learned from those experiments was that if we don’t know the reason why we did something, we’d invent and wholeheartedly believe the first plausible explanation we come up with.

        I didn’t read any further because you had a fundamental misunderstanding about what those studies actually proved.

  • lvxferre@mander.xyz 23 hours ago

    But with AI models, this approach rarely works, and the urge to ask reveals a fundamental misunderstanding of what these systems are and how they operate.

    Okay, this explanation might work for the masses, but never assume a person is ignorant based on their behaviour. Never. Some people know it and just don’t give a fuck.

    Example of that later.

    What you’re actually doing is guiding a statistical text generator to produce outputs based on your prompts.

    Right… go on.

    Once an AI language model is trained (which is a laborious, energy-intensive process), its foundational “knowledge” about the world is baked into its neural network

    Here’s the example. Judging by the quotation marks, odds are the author knows that those models do not have world knowledge stricto sensu. But they’re still using the idiotic analogy. Why?

    A: they can’t be arsed not to use it; it’s misleading, but easier than finding some idiot-friendly way to convey the same thing.
