Nemeski to [email protected] • English • 2 months ago
OpenAI’s latest model will block the ‘ignore all previous instructions’ loophole (www.theverge.com)
439 upvotes • 96 comments
Pasta Dental • English • 65 points • 2 months ago
I’ll believe it when I see it: an LLM is basically a black box, so you can’t patch it with 100% certainty. The only way to stop it generating bomb recipes is to remove that data from the training set.