Meet Nightshade, the new tool allowing artists to ‘poison’ AI models with corrupted training data

@[email protected] · 1 year ago

@[email protected] · edit-2 1 year ago

In the case of Nightshade, the counterattack for artists against AI goes a bit further: it causes AI models to learn the wrong names of the objects and scenery they are looking at.

Sounds like it’s just adding fake tags to images, in the event that the image is scraped for AI training.

It’s a pretty trivial matter for these guys to add another AI that checks whether the information matches what’s expected, to be honest.

@[email protected] · 1 year ago

It’s going to be an arms race

@[email protected] · 1 year ago

Good thing it’s not a fingers race, AI would lose for sure

@[email protected] · 1 year ago

The scary thing about this joke is that AI has been able to do hands for a relatively long time now.

It’s going much faster than people are able to process.

The thumbnail in this article is by DALL-E 3.

@[email protected] · 1 year ago

Yeah, by the time the joke was really making the rounds all the newest images had pretty good hands.

@[email protected] · 1 year ago

Yeah I know, I’ve got ControlNet 1.1.4 also

@[email protected] · 1 year ago

Which is incredibly favorable for the AI side. Like current countermeasures are either almost completely worthless, or degrade the quality of the protected medium so much that you wouldn’t use it.

@[email protected] · 1 year ago

It’s going to be an all-out, drag-down AI vs AI cage match.

Norgur · 1 year ago

Until the hype and thus the ridiculous worth estimations dry up and the AI companies suddenly can’t just throw money at every problem anymore.

FaceDeer · 1 year ago

Open source AI has been keeping up pretty well of late.

AphoticDev · 1 year ago

Most of the cool shit for AI these days is done by users, so at this point the companies aren’t super important anymore.

@[email protected] · 1 year ago

Up until then AWS and Azure party

@[email protected] · 1 year ago

The arms race will soon be AGI versus AGI and us humans will be on the sideline not even sure who is winning.

@[email protected] · 1 year ago

Do you think an authentic AGI would have ethical/moral boundaries completely divorced from what was originally programmed into the software? In other words, would it be able to make its own decisions without interference?

@[email protected] · 1 year ago

I am certain it will happen. Perhaps not with all AGIs, but for sure some. That day is coming.

@[email protected] · 1 year ago

I hope they will, because I feel like if AGIs have ethical decision-making skills, that Terminator-esque dystopian future becomes remote. If they never have that, then we very well might be at the mercy of the world’s largest conglomerates.

@[email protected] · 1 year ago

Not really. If you read the paper, what they’re doing is creating an image that looks like a dog, is labeled as a dog, but is very close to the model’s version of a cat in feature space. This means manual review of the training set won’t help.

@[email protected] · 1 year ago

What are the implications for the non-AI viewer? I have to assume that these changes aren’t perceptible to humans, but I find that to be a stretch too. I don’t see artists being willing to have an AI manipulate their art so that AI can’t recreate it.

@[email protected] · 1 year ago

I don’t think the idea is to protect specific images, it’s to create enough of these poisoned images that training your model on random free images you pull off the internet becomes risky.

@[email protected] · 1 year ago

Which, honestly, should be criminal.

@[email protected] · 1 year ago

Hmm, sounds more like they are adding structures to the images such that what is clearly a picture of a dog registers as a picture of a cat to an AI. I suppose this can be done by altering the pixels in a way invisible to humans, but visible to AI, adding a cat into the “ghost pixels”.

@[email protected] · edit-2 1 year ago

I went and skimmed the paper because I was curious too.

If my skimming is correct, what they do is similar to adversarial attacks on classifiers, where a second model learns to change as few pixels as possible to confuse a classifier into giving a wrong prediction.
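For anyone who hasn't seen adversarial attacks on classifiers before, here is a minimal sketch of the general idea. It swaps in the fast gradient sign method (FGSM), a much simpler one-step attack than the learned "second model" approach mentioned above, and it attacks a made-up logistic-regression classifier whose weights `w` and `b` are purely hypothetical. It is not taken from the Nightshade paper.

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(size=64)  # hypothetical fixed weights of a toy linear classifier
b = 0.0

def predict_proba(x):
    """Probability that the toy classifier assigns to class 1."""
    return 1.0 / (1.0 + np.exp(-(w @ x + b)))

def confidence(x, y):
    """Probability the classifier assigns to label y (0.0 or 1.0)."""
    p = predict_proba(x)
    return p if y == 1.0 else 1.0 - p

def fgsm(x, y, eps):
    """One-step FGSM: shift every pixel by eps in the direction that
    increases the cross-entropy loss for label y."""
    grad = (predict_proba(x) - y) * w  # d(loss)/dx for logistic regression
    return x + eps * np.sign(grad)

x = rng.uniform(0.0, 1.0, size=64)           # toy 64-pixel "image"
y = 1.0 if predict_proba(x) >= 0.5 else 0.0  # use the model's own prediction as the label
x_adv = fgsm(x, y, eps=0.05)

print("confidence before:", confidence(x, y))
print("confidence after: ", confidence(x_adv, y))
```

No pixel moves by more than `eps`, yet the classifier's confidence in its original answer drops. Whether the prediction actually flips depends on `eps` and on how confident the model was to begin with.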

Looking at the examples of dogs and cats: They find pictures of dogs where, by making only minimal changes invisible to the naked eye, they can get the autoencoder to spit out (almost) the same latent representation as an image of a cat would have. Done to enough dog images, this will then confuse the underlying diffusion model into producing latent representations of cat images when prompted to generate a dog. Edit for clarity: Those generated latent representations would then decode into cat images.
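A rough sketch of that latent-matching step, under heavy assumptions: the "autoencoder" here is just a fixed random linear map `W` (a hypothetical stand-in, real encoders are nonlinear), the images are random vectors, and the optimizer is plain projected gradient descent with a per-pixel budget `eps`. The paper's actual objective and optimizer may well differ.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a frozen image encoder: a fixed random linear map
# from 64 "pixels" to an 8-dimensional latent.
W = rng.normal(size=(8, 64)) / 8.0

def encode(x):
    return W @ x

def poison(x_src, x_target, eps=0.05, steps=200, lr=0.01):
    """Projected gradient descent on a perturbation delta with |delta_i| <= eps,
    pulling encode(x_src + delta) toward encode(x_target)."""
    z_target = encode(x_target)
    delta = np.zeros_like(x_src)
    for _ in range(steps):
        residual = encode(x_src + delta) - z_target
        grad = 2.0 * W.T @ residual  # gradient of ||residual||^2 w.r.t. delta
        delta = np.clip(delta - lr * grad, -eps, eps)  # step, then project onto the budget
    return x_src + delta

x_dog = rng.uniform(0.0, 1.0, size=64)  # toy "dog" image
x_cat = rng.uniform(0.0, 1.0, size=64)  # toy "cat" image
x_poisoned = poison(x_dog, x_cat)

dist_before = np.linalg.norm(encode(x_dog) - encode(x_cat))
dist_after = np.linalg.norm(encode(x_poisoned) - encode(x_cat))
max_change = np.max(np.abs(x_poisoned - x_dog))

print("latent distance to cat, before:", dist_before)
print("latent distance to cat, after: ", dist_after)
print("largest per-pixel change:      ", max_change)
```

The poisoned image stays within `eps` of the dog in pixel space while its latent moves toward the cat's. How close it can get depends on the budget and on the encoder.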

If my thinking doesn’t fail me, this attack could easily be thwarted by unfreezing the pretrained autoencoder. In the paper that introduced latent diffusion they write that such approaches already exist. If “Nightshade” takes off, I’m sure those approaches would be refined and used. Even just finetuning the autoencoder for a few epochs first should be enough to move the latent representations of the poisoned dog images and those of the cat images they’re meant to resemble far enough apart to make the attack meaningless.

Edit: I also wonder how robust this attack is against just adding an imperceptible amount of noise to the poisoned images.
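One way to actually measure that, sketched with toy pieces: add Gaussian noise to a poisoned image many times and count how often its latent is still closer to the target than to the clean original. The `survival_rate` helper, the linear encoder `W`, and the pseudo-inverse trick for building the poisoned image are all assumptions made up for illustration. The numbers from this toy say nothing about how the real attack behaves.

```python
import numpy as np

def survival_rate(encode, x_poisoned, x_clean, x_target, sigma, trials=100, seed=0):
    """Fraction of noisy copies of x_poisoned whose latent is still closer
    to the target's latent than to the clean image's latent."""
    rng = np.random.default_rng(seed)
    z_clean, z_target = encode(x_clean), encode(x_target)
    hits = 0
    for _ in range(trials):
        noisy = x_poisoned + rng.normal(0.0, sigma, size=x_poisoned.shape)
        z = encode(noisy)
        if np.linalg.norm(z - z_target) < np.linalg.norm(z - z_clean):
            hits += 1
    return hits / trials

# Toy setup: a hypothetical linear "encoder" and random images.
rng = np.random.default_rng(1)
W = rng.normal(size=(4, 256))

def encode(x):
    return W @ x

x_clean = rng.uniform(0.0, 1.0, size=256)
x_target = rng.uniform(0.0, 1.0, size=256)

# Minimum-norm perturbation that makes the toy latents match; this closed form
# only exists because the toy encoder is linear.
delta = np.linalg.pinv(W) @ (encode(x_target) - encode(x_clean))
x_poisoned = x_clean + delta

rates = {sigma: survival_rate(encode, x_poisoned, x_clean, x_target, sigma)
         for sigma in (0.0, 0.01, 0.1)}
for sigma, rate in rates.items():
    print(f"sigma={sigma}: attack survives in {rate:.0%} of trials")
```

With a real model you would swap in its encoder for `encode` and use actual poisoned and clean images. The interesting question is how fast the survival rate drops as `sigma` grows.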

@[email protected] · 1 year ago

I bet adding some blur might also defeat it.
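A quick toy check of the intuition, assuming the perturbation is high-frequency noise (it may not be: an optimized perturbation could sit in lower frequencies that survive blurring). The `box_blur` helper and the synthetic image are made up for illustration.

```python
import numpy as np

def box_blur(img, k=3):
    """Mean filter with a k x k window (k odd); edges are handled by replication."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.zeros(img.shape, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

rng = np.random.default_rng(0)
clean = np.tile(np.linspace(0.0, 1.0, 32), (32, 1))              # smooth toy image
perturbation = 0.02 * rng.choice([-1.0, 1.0], size=clean.shape)  # low-amplitude, high-frequency
poisoned = clean + perturbation

# Mean absolute difference between poisoned and clean, before and after blurring both.
residual_before = np.abs(poisoned - clean).mean()
residual_after = np.abs(box_blur(poisoned) - box_blur(clean)).mean()

print("perturbation left before blur:", residual_before)
print("perturbation left after blur: ", residual_after)
```

The blur averages neighboring pixels, so random sign flips partly cancel and the leftover perturbation shrinks. It also softens the image itself, which is the usual trade-off with this kind of countermeasure.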