brucethemoose
@brucethemoose@lemmy.world
- Comment on Zuckerberg's Huge AI Push Is Already Crumbling Into Chaos 1 day ago:
But they are putting the cart before the horse. These APIs and models are unsexy commodities, and Meta doesn’t have anything close to something they can charge for. Even OpenAI and Anthropic can barely justify it these days.
Others building on top of Llama get them there, though. Which all the Chinese companies recognize now and are emulating: they can open the model, let it snowball with communal development to wipe out closed competitors, then offer products on top of it.
What’s conspicuous is that (at least some) in Meta recognized this. But Zuck is so fickle he won’t stick with any good idea.
- Comment on Zuckerberg's Huge AI Push Is Already Crumbling Into Chaos 1 day ago:
Yeah the article is pretty bad… But the missing context is Zuckerberg let a lot of devs go, and the lab that actually built something neat (Llama 1-4) has all but been dismantled.
The new hires reek of tech bros and big egos butting heads, especially with the (alleged) talk of closed-sourcing their next models. ‘TBD Lab’ is supposedly tasked with the next Llama release, but I am not holding my breath.
- Comment on Civilization 7's latest update has "hit mods harder than usual", but for a good reason 2 days ago:
Aside:
“We wanted to acknowledge that this update hit mods harder than usual,” community manager Sarah Engel wrote on the game’s Discord.
I despise Discord. Every single niche I love is now locked behind a bunch of unsearchable banter in closed, often invite-only apps.
- Comment on Civilization 7's latest update has "hit mods harder than usual", but for a good reason 2 days ago:
Yeah, I think early access is a great model. Certainly better than “release it and (maybe) fix it later”
- Comment on AI experts return from China stunned: The U.S. grid is so weak, the race may already be over 6 days ago:
I mean, I’m a local AI evangelist and have made a living off it. The energy use of AI thing is total nonsense, as much as Lemmy doesn’t like to hear it.
I keep a 32B or 49B loaded pretty much all the time.
You are right about the theft vs social media thing too, even if you put it a little abrasively. Why people are so worked up in the face of machines like Facebook and Google is mind boggling.
…But AI is a freaking bubble, too.
Look at company valuations vs how shit isn’t working, and how much it costs.
Look around the ML research community: they all know Altman and his infinite-scaling-to-AGI pitch is just a big fat tech bro lie. AI is going to move forward as a useful tool by getting smaller and more efficient, but transformer LLMs with randomized sampling are not just going to turn into real artificial intelligence if enough investors throw money at these closed-off enterprises.
- Comment on AI experts return from China stunned: The U.S. grid is so weak, the race may already be over 1 week ago:
The irony is Zuck shuttered the absolute best asset they have: the Llama LLM team.
Cuz, you know, he’s a fickle coward who would say and do anything to hide his insecurity.
- Comment on Tencent doesn’t care if it can buy American GPUs again – it already has all the chips it needs 1 week ago:
I think the underlying message is making/serving AI isn’t a mythical goldmine: it’s becoming a dirt cheap commodity, and a tool for companies to use.
- Comment on AI experts return from China stunned: The U.S. grid is so weak, the race may already be over 1 week ago:
This is all based on the assumption that AI will need exponential power.
It will not.
- AI is a bubble.
- Even if it isn’t, fab capacity is limited.
- The actual ‘AI’ market is racing to the bottom with smaller, task-focused models.
- A bunch of reproduced papers (like BitNet) that cut power dramatically are just waiting for someone to try a larger test.
- Altogether… inference moves to smartphones and PCs.
This is just the finance crowd parroting Altman. Not that the US doesn’t need a better energy grid like China’s, but the justification is built on lies about things that just aren’t going to happen.
- Comment on The New Yorker Asks: Is the A.I. Boom Turning Into an A.I. Bubble? 1 week ago:
because there’s seemingly not enough power infrastructure
This is overblown. I mean, even if you take TSMC’s entire capacity and assume every data center GPU they make runs at full TDP 100% of the time (which is not true), the net consumption isn’t that high. The local power/cooling infrastructure issues are more about corpo cost cutting.
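To put rough numbers on that back-of-envelope (every figure below is an assumption I picked for illustration, not sourced data):

```python
# Back-of-envelope only: every number here is an assumed round figure, not sourced data.
accelerators_per_year = 5_000_000   # assumed annual data center GPU output, order of magnitude
watts_per_accelerator = 1_000       # assumed ~1 kW each, counting cooling/overhead generously
utilization = 1.0                   # worst case: full TDP, 100% of the time

added_load_gw = accelerators_per_year * watts_per_accelerator * utilization / 1e9
assumed_us_capacity_gw = 1_200      # assumed ballpark for total US generating capacity

print(f"One year of new accelerators: ~{added_load_gw:.0f} GW")
print(f"Share of assumed US capacity: ~{added_load_gw / assumed_us_capacity_gw:.1%}")
```

Even under those worst-case assumptions you land in the single-digit gigawatts per year, a sliver of the grid, which is the point.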
Altman’s preaching that power use will be exponential is a lie that’s already crumbling.
But there is absolutely precedent for underused hardware flooding the used markets, or getting cheap on cloud providers. Honestly this would be incredible for the local inference community, as it would give tinkerers (like me) actually affordable access to experiment with.
- Comment on Please don't promote Wayland 1 week ago:
Yeah, not to speak of the stuff that doesn’t work (or doesn’t work well) in X, and its bizarre quirks.
- Comment on The New Yorker Asks: Is the A.I. Boom Turning Into an A.I. Bubble? 1 week ago:
Not a lot? The quirk is they’ve hyper specialized nodes around AI.
The GPU boxes are useful for some other things, but they will be massively oversupplied, and they mostly aren’t networked like supercomputer clusters.
- Comment on The New Yorker Asks: Is the A.I. Boom Turning Into an A.I. Bubble? 1 week ago:
I mean, hardware prices will fall if there’s a crash, like they did with crypto GPU mining.
I am salivating over this. Bring out the firesale A100s.
- Comment on The New Yorker Asks: Is the A.I. Boom Turning Into an A.I. Bubble? 1 week ago:
Ohhh yes. Altman’s promotion for it was the Death Star coming up from behind a planet.
Maybe something on the corporate side, like big players not seeing a return of their investment.
Ohhh, it is. The big corporate hosters aren’t making much money and are burning cash, and it’s not getting any better as specialized open models eat them from the bottom up.
- Comment on GPT-5: Overdue, overhyped and underwhelming. And that’s not the worst of it. 1 week ago:
Nah, I tried them. For the size, they suck, mostly because there’s a high chance they will randomly refuse anything you ask unless it’s STEM or code.
…And there are better models if all you need is STEM and code.
- Comment on GPT-5: Overdue, overhyped and underwhelming. And that’s not the worst of it. 1 week ago:
Meanwhile, the Chinese and other open models are killing it. GLM 4.5 is sick. Jamba 1.7 is a great sleeper model for stuff outside coding and STEM. The 32Bs we have like EXAONE and Qwen3 (and finetuned experiments) are mad for 20GB files, and are crowding out APIs. There are great little MCP models like Jan, too.
Are they AGI? Of course not. They are tools, and that’s what was promised; but the improvements are real.
- Comment on The Media's Pivot to AI Is Not Real and Not Going to Work 5 weeks ago:
, especially since something like a Mixture of Experts model could be split down to base models and loaded/unloaded as necessary.
It doesn’t work that way. All MoE experts are ‘interleaved’, and you need all of them loaded at once, for every token. Some API servers can hot-swap whole models, but it’s not fast, and it’s rarely done, since LLMs are pretty ‘generalized’ and API servers tend to serve requests in parallel.
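A toy sketch of why (illustrative only, not any particular model’s code): the router picks a different expert subset per token, so every expert’s weights have to stay resident.

```python
# Toy mixture-of-experts layer; illustrative only, not any specific model's implementation.
import torch
import torch.nn as nn

class ToyMoE(nn.Module):
    def __init__(self, dim=512, n_experts=8, top_k=2):
        super().__init__()
        # ALL experts live in memory; the router only decides which ones run per token.
        self.experts = nn.ModuleList([nn.Linear(dim, dim) for _ in range(n_experts)])
        self.router = nn.Linear(dim, n_experts)
        self.top_k = top_k

    def forward(self, x):                      # x: (tokens, dim)
        weights, idx = self.router(x).topk(self.top_k, dim=-1)
        weights = weights.softmax(dim=-1)
        out = torch.zeros_like(x)
        for t in range(x.shape[0]):            # each token routes to a *different* expert subset
            for w, e in zip(weights[t], idx[t]):
                out[t] += w * self.experts[int(e)](x[t])
        return out
```

Which experts fire changes every token, so unloading any of them just means stalling on a reload almost immediately.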
The closest to what you’re thinking of is LoRAX (which basically hot-swaps LoRAs efficiently). But it needs an extremely specialized runtime derived from its associated paper, so people tend not to use it, since it doesn’t support quantization and some other features: github.com/predibase/lorax
There is a good case for pure data processing, yeah… But it has little integration with LLMs themselves, especially with the API servers generally handling tokenizers/prompt formatting.
But, all of its components need to be localized
They already are! Local LLM tooling and engines are great and super powerful compared to ChatGPT (which offers no caching, no raw completion, primitive sampling, hidden thinking, and so on).
- Comment on The Media's Pivot to AI Is Not Real and Not Going to Work 5 weeks ago:
SGLang is partly a scripting language for prompt building that leverages its caching/logprobs output, for doing stuff like filling in fields or branching over choices, so it’s probably best done in that. It also requires pretty beefy hardware for the model size (as opposed to backends like exllama or llama.cpp that focus more on tight quantization and unbatched performance), so I suppose there’s not a lot of interest from local tinkerers?
It would be cool, I guess, but ComfyUI does feel more geared toward diffusion. Image/video generation is more multi-model and benefits from dynamically loading/unloading/swapping all sorts of little submodels, LoRAs and masks, applying them, piping them into each other and so on.
LLM running is more monolithic: you have the one big model, maybe a text embeddings model as part of the same server, and everything else is just processing strings to build the prompts, which one does linearly in Python or whatever. Stuff like CFG and LoRAs does exist, but isn’t used much.
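For example, a typical ‘monolithic’ LLM call is just string processing plus one HTTP request; a minimal sketch, assuming some OpenAI-compatible local server (llama.cpp’s llama-server, vLLM, whatever) on localhost:8080, with the port, model name and file path as placeholders:

```python
# Minimal sketch: build the prompt as plain strings, send one request to a local
# OpenAI-compatible server. Port, model name and file path are placeholder assumptions.
import requests

context = open("notes.txt").read()   # plain string processing, nothing graph-like
prompt = f"Summarize the following notes in three bullet points:\n\n{context}"

resp = requests.post(
    "http://localhost:8080/v1/chat/completions",
    json={
        "model": "local",            # many local servers ignore or override this
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 256,
    },
    timeout=120,
)
print(resp.json()["choices"][0]["message"]["content"])
```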
- Comment on The Media's Pivot to AI Is Not Real and Not Going to Work 5 weeks ago:
Not specifically. Ultimately, ComfyUI would build prompts/API calls, which I tend to do in Python scripts.
I tend to use Mikupad or Open Web UI for more general testing.
There are some neat tools with ‘lower level’ integration into LLM engines, like SGLang (which leverages caching and constrained decoding), to do things one can’t do over standard APIs: docs.sglang.ai/frontend/frontend.html
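Roughly what that frontend looks like, sketched from memory of those docs; treat the exact function and argument names as assumptions that may differ between versions:

```python
# Hedged sketch of SGLang's frontend DSL; names/arguments may differ between versions.
import sglang as sgl

@sgl.function
def triage(s, ticket_text):
    s += "Ticket: " + ticket_text + "\n"
    # Constrained decoding: the model must emit one of these strings.
    s += "Category: " + sgl.gen("category", choices=["billing", "bug", "feature request"]) + "\n"
    s += "Draft reply: " + sgl.gen("reply", max_tokens=128, stop="\n")

# Assumes an SGLang server is already running locally on port 30000.
sgl.set_default_backend(sgl.RuntimeEndpoint("http://localhost:30000"))
state = triage.run(ticket_text="My invoice was charged twice this month.")
print(state["category"], state["reply"])
```

The fill-in/branching style is the point: shared prefixes hit the KV cache instead of being recomputed, which you can’t easily orchestrate over a plain chat-completions API.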
- Comment on The Media's Pivot to AI Is Not Real and Not Going to Work 5 weeks ago:
I mean, I run Nemotron and Qwen every day, you are preaching to the choir here :P
- Comment on The Media's Pivot to AI Is Not Real and Not Going to Work 5 weeks ago:
AI is a tool (sorry)
This should be a bumper sticker. Also, thanks for this, bookmarking 404, wish I had the means to subscribe.
My hope is that the “AI” craze culminates in a race to the bottom where we end up in a less terrible state: local models on people’s phones, reaching out to reputable websites for queries and redirection.
And this would be way better for places like 404, as they’d have to grab traffic individually and redirect users there.
- Comment on Confirmed - China bans NVIDIA chips and accelerates its total independence from US technology 2 months ago:
Yeah honestly the Nvidia ban was stupid.
Everyone in the AI research space was saying it, but no, our old policymakers are captured by Altman, Musk and tech bros who would burn anything for their two years of pure anticompetitiveness.
The running joke is that the Nvidia ban was the best thing to ever happen to Chinese research, as it made them thrifty, while big US companies are lazily burning huge GPU farms scaling up and… not improving anything.
- Comment on X's new 'encrypted' XChat feature seems no more secure than the failure that came before it 2 months ago:
I have to wonder who this appeals to?
Most are already trapped in something established like Discord, WeChat, FB Messenger. As said, security isn’t a strong point, and there’s no engagement angle.
I guess if you already spend tons of time on X it’s kinda convenient?
- Comment on EA never grasped Dragon Age's value as an RPG, says Inquisition writer 2 months ago:
Side note, but even with all their troubles/turnover, I still love RPS’s hint of bite in their news writing (outside the columns).
- Comment on Tiny Corp heralds world's first AMD GPU driven via USB3 — eGPUs tested on Apple Silicon, with Linux and Windows also supported 3 months ago:
Tinycorp generates these headlines every once in a while, but as far as I can tell no one uses it. At least not in the tinkerer space I can see.
It’d be cool if they could eat away at PyTorch, XLA and whatever else… Some day…
- Comment on Researchers unveil LegoGPT, an AI model that designs physically stable Lego structures from text prompts and currently supports eight standard brick types 3 months ago:
A pretty long time.
Niche models are tons of fun though.
- Comment on Consumers make their voices heard as Microsoft's huge venture flatlines in popularity 3 months ago:
Comment from the source:
Microsoft poisoned their own well with all the changes they have been forcing on users lately. The update nagging, resetting the default browser to Edge, the ads in Windows features, and integrating Bing into the start menu have all trained users that when Microsoft starts pushing something new, it probably isn’t great and should just be ignored, like ads in phone apps.
That ^. So much that.
Also, the Copilot LLM itself sucks. Local models are neat within their limitations, and they’d be even better if Microsoft made them trainable/customizable, did better RAG, or whatever, but they just shoved a bad thing down users’ throats, and now they’ve poisoned another well.
- Comment on Microsoft removes Windows 11 24H2 official support on 8th 9th 10th Gen Intel CPUs 5 months ago:
As crazy as that is, playing devil’s advocate, Comet Lake is basically the aging Skylake architecture.
Ice Lake though? WTF.
- Comment on Meta asks the US government to block OpenAI’s switch to a for-profit 8 months ago:
The limit is already (apparently) starting to be data… and capital, lol.
There could be a big compute breakthrough, though, like, say, fast BitNet training that makes the multimodal approach much easier to train.
- Comment on Meta asks the US government to block OpenAI’s switch to a for-profit 8 months ago:
Well, for one, I directly disagree with Altman’s fundamental proposition: they don’t need to “scale” AI so dramatically to make it better.
See: Qwen 2.5 from Alibaba, a fraction of the size, made with a tiny fraction of the H100 GPUs, and highly competitive (and (mostly) Apache licensed). And frankly, OpenAI is pointedly ignoring all sorts of open research that could make their models notably better or more power efficient, even with the vast resources and prestige they have… They seem most interested in anticompetitive efforts to regulate competitors that would make them look bad, using the spectre of actual AGI (which has nothing to do with transformer LLMs) to scare people.
Even if it did so for the wrong reasons, I feel like Google would be right to oppose Mozilla axing its nonprofit division if Mozilla were somehow in a similar position to OpenAI. Its stated mission of producing a better, safer browser would basically be a lie.
- Comment on Meta asks the US government to block OpenAI’s switch to a for-profit 8 months ago:
And in that case, would the Llama fork be the same as the Meta fork? We are talking about AI that takes considerable development; companies would probably not participate because it is not an open source license and its clauses limit them in those respects.
Llama has tons of commercial use even with its “non open” license, which is basically just a middle finger to companies the size of Google or Microsoft. And yes, companies would keep using the old weights like nothing changed… because nothing did. Just like they keep using open source software that goes through drama.
Also you have to think that if the new version of Llama with the new license is 3 times better than Llama with the previous license, do you really think that the community will continue to develop the previous version?
Honestly I have zero short term worries about this because the space is so fiercely competitive. Also much of the ecosystem (like huggingface and inference libraries) is open source and out of their control.
And if they go API only, honestly they will just get clobbered by Google, Claude, Deepseek or whomever.
In the longer term… transformers will be obsolete anyway.