If AI spits out stuff it’s been trained on, doesn’t it follow that AI-generated CSAM can only be generated if the AI has been trained on CSAM?

This article even explicitly says as much.

My question is: why aren’t OpenAI, Google, Microsoft, Anthropic… sued for possession of CSAM? It’s clearly in their training datasets.

  • PM_ME_VINTAGE_30S [he/him]

    If AI spits out stuff it’s been trained on

    For Stable Diffusion, it really doesn’t just spit out what it’s trained on. Very loosely, it starts from pure noise and then repeatedly denoises it, guided by your prompt (re-injecting a little noise along the way), until the result converges to an image matching the prompt.
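
    As a rough illustration of that loop, here’s a minimal DDPM-style sampling sketch in Python. The names (`model`, `prompt_embedding`, the noise schedule values) are assumed placeholders for illustration, not Stable Diffusion’s actual API:

    ```python
    import torch

    # Minimal sketch of a DDPM-style sampling loop, for illustration only.
    # `model` is assumed to predict the noise present in `x` at step `t`,
    # conditioned on an embedding of the text prompt.
    def sample(model, prompt_embedding, shape=(1, 4, 64, 64), timesteps=50):
        x = torch.randn(shape)                      # start from pure Gaussian noise
        betas = torch.linspace(1e-4, 0.02, timesteps)
        alphas = 1.0 - betas
        alphas_cumprod = torch.cumprod(alphas, dim=0)

        for t in reversed(range(timesteps)):
            eps = model(x, t, prompt_embedding)     # predicted noise, guided by the prompt
            alpha_t, alpha_bar_t = alphas[t], alphas_cumprod[t]
            # "Denoise": remove the predicted noise component
            x = (x - (1 - alpha_t) / torch.sqrt(1 - alpha_bar_t) * eps) / torch.sqrt(alpha_t)
            if t > 0:
                # Re-inject a little noise before the next step, as DDPM sampling does
                x = x + torch.sqrt(betas[t]) * torch.randn_like(x)
        return x
    ```

    Nothing in that loop copies a training image into `x`; the training data only shapes the weights of the noise-prediction network.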

    IMO your premise is closer to true in practice for large language models, but even there it’s not strictly true.

    • @[email protected]

      It’s akin to virtually starting with a block of marble and removing every part (pixel) that isn’t the resulting image. Crazy how it works.