comparative scale of the content involved
PhotoDNA is based on perceptual image hashes, plus some fuzzy matching that works on partial hashes: resizing the image, changing the focus point, or fiddling with the color depth won't break a PhotoDNA identification.
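For a rough feel of how a resize-tolerant hash can work, here's a toy "average hash" in Python. To be clear, this is not PhotoDNA's actual algorithm, just a generic perceptual-hash illustration; the function names and the synthetic gradient "images" are made up for the example.

```python
def average_hash(pixels, size=8):
    """Toy average hash (NOT PhotoDNA): shrink a grayscale image to
    size x size by block averaging, then emit one bit per cell
    (1 if the cell is brighter than the mean of all cells)."""
    h, w = len(pixels), len(pixels[0])
    cells = []
    for r in range(size):
        for c in range(size):
            r0, r1 = r * h // size, (r + 1) * h // size
            c0, c1 = c * w // size, (c + 1) * w // size
            block = [pixels[y][x] for y in range(r0, r1) for x in range(c0, c1)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming(a, b):
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


# Synthetic 32x32 grayscale gradient standing in for an image (assumption).
original = [[4 * x + 4 * y for x in range(32)] for y in range(32)]
# Same picture scaled up 2x by pixel doubling.
upscaled = [[original[y // 2][x // 2] for x in range(64)] for y in range(64)]
# Same picture with the color depth cut to 16 gray levels.
reduced_depth = [[(v // 16) * 16 for v in row] for row in original]
# A genuinely different picture (the gradient reversed).
different = [[248 - v for v in row] for row in original]

base = average_hash(original)
distances = {
    "upscaled": hamming(base, average_hash(upscaled)),
    "reduced_depth": hamming(base, average_hash(reduced_depth)),
    "different": hamming(base, average_hash(different)),
}
print(distances)
```

The point is only that the altered copies should land at a small Hamming distance from the original while an unrelated image lands far away, so matching can use a distance threshold instead of exact equality.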
But, of course, that means for PhotoDNA to be useful, the training set is literally 'every CSAM image in existence', so it's not really like you're training on a lot less data than an AI model would want or need.
The big safeguard, such as it is, is that you basically only query an API with an image and it tells you if PhotoDNA has it in the database, so there's no chance of the training data being shared.
Of course, there's also no reason you can't do that with an AI model, either, and I'd be shocked if that's not exactly how they've configured it.
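As a sketch of the "you only get a yes/no back" idea, something like the following would do. Everything here is hypothetical (class name, threshold, the integer hashes); a real service would enforce the boundary at the network/API level, not with Python attribute hiding.

```python
class MatchOnlyIndex:
    """Hypothetical query-only wrapper: callers submit a hash and learn
    only whether it matches. There is deliberately no method that
    lists or exports the stored hashes."""

    def __init__(self, known_hashes, max_distance=4):
        # Name-mangled attributes are a convention, not real security;
        # in practice this data would live behind a remote API.
        self.__known = frozenset(known_hashes)
        self.__max_distance = max_distance

    def is_match(self, candidate_hash):
        """Return True if the candidate is within max_distance bits
        (Hamming distance) of any stored hash, else False."""
        return any(
            bin(candidate_hash ^ known).count("1") <= self.__max_distance
            for known in self.__known
        )


index = MatchOnlyIndex([0b1111000011110000])
print(index.is_match(0b1111000011110001))  # one bit off from the stored hash
print(index.is_match(0b0000111100001111))  # every bit flipped
```

The same shape works whether the thing behind `is_match` is a hash database or a model: the caller hands over an image (or its hash) and gets a verdict, never the reference data.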
Yeah, I think you've made a mistake in thinking that this is going to be usable as generative AI.
I'd bet $5 this is just a fancy machine learning algorithm that takes a submitted image, does machine learning nonsense with it, and returns a 'there is a high probability this is an illicit image of a child', and not something you could use to actually generate CSAM with.
You want something that's capable of assessing the similarities between a submitted image and a group of known bad images, but that doesn't mean the dataset is in any way usable for anything other than that one specific task. AI/ML in use cases like this is super broad and has been a thing for decades, long before the whole 'AI == generative AI' thing became what everyone thinks of.
But, in any case: the PhotoDNA database is in one place, and access to it is gated by, uh, lots of money?
And of course, any 'unscrupulous engineer' with plans to do anything with this is probably not a complete idiot, even if a pedo. Whoever hosts the database is going to have shockingly good access controls and logging, and, if you're in the US, the penalty for taking this database and generating a couple of CSAM images with it is, for most people, spending the rest of their life in prison.
Feds don't fuck around with creation or distribution charges.