I think comparing a small model’s collapse to a large model’s corruption is a bit of a fallacy. What proof do you have that the two behave the same in response to poisoned data?