mindbleach
@mindbleach@sh.itjust.works
- Comment on LLMs’ “simulated reasoning” abilities are a “brittle mirage,” researchers find 3 hours ago:
Please don’t mistake vindication for a lack of ambiguity. When this took off, we had no goddamn idea what the limit was. The fact it works half this well is still absurd.
Simple examples like addition were routinely wrong, but they were wrong in a way that indicated the model might actually infer the rules of addition. That’s a compact way to predict a lot of arbitrary symbols. Seeing that abstraction emerge would be huge, even if it was limited to cases with a zillion examples. And it was basically impossible to reason about whether that was pessimistic or optimistic.
A consensus for “that doesn’t happen” required all of this scholarship. If we had not reached this point, the question would still be open. Remove all the hype from grifters insisting AGI is gonna happen now, oops I mean now, oops nnnow, and you’re still left with a series of advances previously thought impossible. Backpropagation doesn’t work… okay now it does. Training only plateaus… okay it gets better. Diffusion’s cute, avocado chairs and all, but… okay that’s photoreal video. It really took people asking weird questions on high-end models to distinguish actual reasoning capability from extremely similar sentence construction.
And if we’re there, can we please have models ask a question besides ‘what’s the next word?’
- Comment on ChatGPT Is Still a Bullshit Machine 10 hours ago:
Ass-pull nonsense metric.
Someone already told you ‘you have to know how to use the tool’ and it didn’t fucking help. Excuse me for trying to politely guide you toward what should be obvious.
- Comment on ChatGPT Is Still a Bullshit Machine 15 hours ago:
Charles Babbage was once asked, ‘But if someone puts in the numbers wrong, how will your calculator get the right answer?’
Using a chatbot to code is useful if you don’t know how to code. You still need to know how to chatbot. You can’t grunt at the machine and expect it to read your mind.
Have you never edited a Google search, because the first try didn’t work?
- Comment on ChatGPT Is Still a Bullshit Machine 15 hours ago:
This kind of assertion wildly overestimates how well we understand intelligence.
Higher levels of bullshitting require more abstraction and self-reference. Meaning must be inferred from observation, to make certain decisions, even when picking words from a list.
Current models are abstract enough to see a chessboard in an Atari screenshot, figure out which pieces each jumble of pixels represents, and provide a valid move. Scoffing because it’s not actually good at chess is a bizarre line to draw, to say there’s zero understanding involved.
Current models might be abstract enough to teach them a new game by explaining the rules.
Current models are not abstract enough to explain why they’re bad at a game and expect them to improve.
- Comment on GPT-5: Overdue, overhyped and underwhelming. And that’s not the worst of it. 1 day ago:
Seriously. Neural networks can approximate literally any function, and the lumbering giants have all decided ‘what’s the next word?’ is the only function worth pursuing.
It’d take a sliver of their current budget to try starting over like it’s 2020. Compare with benchmarks that now look quaint. Enjoy some wisdom where previously they could only guess. Buuut nope: all LLM, all the time, and big big big.
- Comment on GPT-5: Overdue, overhyped and underwhelming. And that’s not the worst of it. 1 day ago:
It’s a chatbot that can see and draw, but rich idiots keep pushing it as an oracle. As if “a chatbot that can see and draw” isn’t impressive enough.
Three years ago, ‘label a tandem bicycle’ would’ve produced a tricycle covered in squiggles. Four years ago it was impossible. I don’t mean ‘really really hard.’ I mean we had no fucking idea how to make that program. People have been trying since code came on punchcards.
LLMs can almost-sorta-kinda do it, despite being completely the wrong approach. It’s shocking that ‘guess the next word’ works this well. I’m confused by the lack of experimentation in, just… asking a different question. Diffusion’s doing miracles with ‘estimate the noise.’ Video generators can do photorealism faster and cheaper than an actual camera.
The problem is, rich idiots claim this makes it an actual camera. In that context, it’s fair to point out when a video shows the Eiffel Tower in Berlin. It’s deeply impressive that computers can do that, now. But it might ruin people’s vacation plans.
- Comment on LEAKED: A New List Reveals Top Websites Meta Is Scraping of Copyrighted Content to Train Its AI 3 days ago:
Outright piracy? It’s not allowed, but it’s supposed to be a civil matter.
Videos posted without permission? I don’t think the audience is liable for that.
Scraping despite robots.txt? If that’s illegal for its own sake, then it’s overreaching on ‘unauthorized access.’
Training on any of this? … nah, it’s probably fine.
A pile of linear algebra that knows what pornography looks like does not serve the same function as any particular example. No more than one video infringes on another for the general idea of cameras pointed at naked people. Producing the same kind of thing is not infringement. (Though if it involves Shrek, the trademark people will have angry and confusing questions.)
Reproducing any particular input is a failure of training. Even the Bible should be paraphrased past about Genesis 1:9. The whole idea is getting the vibe of everything we’ve ever published. Cliff notes, passable imitation of the writing style, couple passages everyone’s quoted verbatim.
An encyclopedia article about a book doesn’t become illegal if we learn the author shoplifted it.
- Comment on LEAKED: A New List Reveals Top Websites Meta Is Scraping of Copyrighted Content to Train Its AI 3 days ago:
Arguments are easy when you make shit up.
- Comment on LEAKED: A New List Reveals Top Websites Meta Is Scraping of Copyrighted Content to Train Its AI 3 days ago:
Correct - only the filesharing is against the law. Training is transformative use.
You can’t cram a billion images into one gigabyte. They’d be one byte each. What these models do is very different from the bootlegging you’re trying to make it sound like.
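For anyone who wants to check the napkin math, here is a minimal sketch in Python. The helper name `bytes_per_item` and the 64x64 thumbnail comparison are illustrative assumptions, not anything from the article; the gigabyte and image counts are the round numbers above.

```python
def bytes_per_item(model_bytes: int, item_count: int) -> float:
    """Average number of model bytes available per training item."""
    if item_count <= 0:
        raise ValueError("item_count must be positive")
    return model_bytes / item_count


GB = 10**9  # decimal gigabyte, matching the round numbers in the comment

# Assumption for scale: an uncompressed 64x64 RGB thumbnail, 3 bytes per pixel.
THUMBNAIL_BYTES = 64 * 64 * 3

budget = bytes_per_item(1 * GB, 1_000_000_000)
print(budget)                    # 1.0 byte per image
print(THUMBNAIL_BYTES / budget)  # 12288.0 times over budget for one tiny thumbnail
```

Plug in whatever model size and dataset size you like; the per-image budget stays tiny next to any actual image.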
- Comment on LEAKED: A New List Reveals Top Websites Meta Is Scraping of Copyrighted Content to Train Its AI 3 days ago:
Seeking distinctions is pretense. They’re just shuffling cards.
You can ask about models made from public-domain data, and most critics will not budge an inch. Mentioning copyright is working backwards from a gut feeling. The ones who say, sure, okay, it’d be different if… maybe they have a consistent rationale. But even some of them haven’t examined how they’d feel about this technology, if all their complaints were addressed.
- Comment on LEAKED: A New List Reveals Top Websites Meta Is Scraping of Copyrighted Content to Train Its AI 3 days ago:
Lemmy really hates piracy… in this specific context.
And a lot of the extreme and extremist content going into these things is just Twitter. People post all kinds of shit from all kinds of places. At what point is this like clutching pearls over what the Internet Archive has saved? They’re trying to grab anything you could see.
It’s not some hacking and exfiltration campaign. Meta’s just bad at spidering. How do you go breadth-first across the entire internet and still DDoS any particular site? You don’t decide to check every DeviantArt account, at the same time, you dolts.
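To be concrete about what not hammering one site looks like, here is a rough sketch of interleaving a crawl frontier by host. It is an illustration under my own assumptions, not how Meta's crawler or any real spider works; `interleave_by_host` and the `*.example` URLs are made up, and nothing here touches the network.

```python
from collections import OrderedDict, deque
from urllib.parse import urlsplit


def interleave_by_host(urls):
    """Yield URLs round-robin across hosts, preserving per-host order."""
    queues = OrderedDict()
    for url in urls:
        host = urlsplit(url).netloc
        queues.setdefault(host, deque()).append(url)
    while queues:
        # Take one URL from each host per pass, dropping hosts that run dry.
        for host in list(queues):
            yield queues[host].popleft()
            if not queues[host]:
                del queues[host]


# Hypothetical frontier: three pages on one host, one page each on two others.
frontier = [
    "https://a.example/1",
    "https://a.example/2",
    "https://a.example/3",
    "https://b.example/1",
    "https://c.example/1",
]
for url in interleave_by_host(frontier):
    print(url)
```

Once only one host is left, its URLs come out back to back, so a real crawler would also need a per-host delay. The point is only that spreading requests across hosts is not exotic.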
- Comment on 70 seconds makes all the difference under the new Marvel Rivals approach to rage-quitters 1 week ago:
This is a weird use of MOBA-style punishment games, if we’re talking about an entire minute of wasted time. Surely you’d be better off making repeat disconnectors more likely to pair with other disconnectors. Just teach them that it sucks… or leave them mad at each other. Like pairing cheaters with other cheaters.
- Comment on Echelon kills smart home gym equipment offline capabilities with update 2 weeks ago:
Theft.
- Comment on Grim and joyful deckbuilder The Royal Writ is coming for genre king Balatro next month 2 weeks ago:
Rule of thumb: anything that’s pushed as ‘the X killer’ won’t.
- Comment on Delta Air Lines is using AI to set the maximum price you’re willing to pay 3 weeks ago:
Blaming AI is not what makes this a crime.
- Comment on Come slide with me through loopy shooter maps in this homage to Counter-Strike's beloved Source engine surfing 3 weeks ago:
Put the name in the goddamn title.
- Comment on Now Microsoft’s Copilot Vision AI can scan everything on your screen 3 weeks ago:
- Comment on Minecraft Creator Says That If Buying a Game Is Not a Purchase, Then Pirating It Is Not Theft 4 weeks ago:
Nazi parrots popular sentiment.
- Comment on A project to bring CUDA to non-Nvidia GPUs is making major progress — ZLUDA update now has two full-time developers, working on 32-bit PhysX support and LLMs, amongst other things 5 weeks ago:
If past is prologue, expect it to mysteriously shut down in three months, only for another very promising effort to emerge circa 2026.
- Comment on 11 Bit confirm that they used generative AI for The Alters "in a very limited manner" 1 month ago:
These tools aren’t going anywhere.
Make your peace.
- Comment on Federal judge sides with Meta in lawsuit over training AI models on copyrighted books 1 month ago:
Most anti-AI sentiments seem like misplaced hatred of awful companies forcing nonsense on everybody, or a refusal to place judgement on capitalism itself.
- Comment on Windows is getting rid of the Blue Screen of Death after 40 years 1 month ago:
Guru meditation.
- Comment on Federal judge sides with Meta in lawsuit over training AI models on copyrighted books 1 month ago:
Meta got shit on for pirating books.
Anthropic got shit on for not pirating books.
- Comment on Federal judge sides with Meta in lawsuit over training AI models on copyrighted books 1 month ago:
Right, a 12 GB model trained on 100,000,000 images works out to about 120 bytes per image. That’s room for a handful of 16-byte MD5 checksums each, not for the images.
The same people expect it to identify the authorship of sentence fragments, but never quote one whole paragraph from any book ever. Now: gigabytes of text could be a significant fraction of all books. But finding a single recognizable page is news. Storing text is not what these companies spent a bajillion dollars on.
- Comment on Federal judge sides with Meta in lawsuit over training AI models on copyrighted books 1 month ago:
Turning books into a language model is transformative. No LLM is a substitute for the original works.
- Comment on Reminder that you do not own digital games 1 month ago:
Copyright is about copying. When someone sells you a product, and you buy it, then you own it. No license is involved. Under the first sale doctrine, no license can be involved - a book can’t have an insert with a EULA. They can print it… but it doesn’t matter. You bought that slip of paper, too.
If you stubbornly believe there’s some instant contract required to look at the logo on a candy wrapper, why are you tutting at people for calling that intolerable nonsense, instead of demanding a change to that intolerable nonsense?
- Comment on Reminder that you do not own digital games 1 month ago:
Two crimes, then.
- Comment on Reminder that you do not own digital games 1 month ago:
No. I own that copy. It’s not a license to anything. I own it. It’s mine. That’s what the money was for.
Don’t play corporate word games with concepts as basic as having things.
- Comment on Reminder that you do not own digital games 1 month ago:
I own every book on my shelf. That copy is not the same as copyright. Grow up.
- Comment on Reminder that you do not own digital games 1 month ago:
Books can’t say “by buying this book, nuh uh, you secretly agreed to blah blah blah.”
That shit got thrown out a century ago. Fuck off making excuses for corporate bastards in a new medium.