With sparse attention, very interesting. It seems GQA is a thing of the past.
GLM 4.6 is reportedly about to drop too.
Submitted 2 days ago by cm0002@lemmy.world to technology@lemmy.zip
https://huggingface.co/collections/deepseek-ai/deepseek-v32-68da2f317324c70047c28f66
BroBot9000@lemmy.world 2 days ago
New version of the propaganda machine dropped 🤦‍♂️
brucethemoose@lemmy.world 2 days ago
DeepSeek is only bad via the chat app and whatever prefilter (or finetune?) they censor it with.
The model itself (via API or run locally) isn't too bad. Obviously there are CCP-mandated gaps, but it's not as tankie as you'd think.
cm0002@lemmy.world 2 days ago
Just ignore them on anything AI-related; they are the polar opposite of the AI tech bros, shitting on anything and everyone using AI in any form for anything.