
The dark deep side of DeepSeek: Fine-tuning attacks against the safety alignment of CoT-enabled models.

3 likes

Submitted 1 year ago by Cat@ponder.cat to technology@lemmy.zip

https://arxiv.org/abs/2502.01225

