The problem I ran into is that every single platform that primarily interacted with Mastodon (the keys, etc.) had the exact same set of problems.
While yes, my Firefish instance had search, what was it searching? Local data only, and once I figured out that Mastodon-style replies didn't federate to all of someone's followers, it became pretty clear that it was, uh, not very useful.
You can search, but any given server may or may not have access to the data you actually want, and thus, well, you just plain cannot meaningfully search for shit unless you go to one of the mega instances, or join giant piles of relays and store gigabyte upon gigabyte upon gigabyte of garbage data you do not care about.
The whole implementation is kinda garbage for search-based discovery, from its very basic design all the way through to everyone's implementations.
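To make the "local data only" problem concrete, here's a rough toy model in Python. Everything in it (the `Server` class, `federate`, the example domains) is made up for illustration and isn't how any real fediverse software is written; the only thing it assumes is what's described above, that a server can search just the posts that were actually delivered to it.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    """Hypothetical instance: it only knows about posts delivered to it."""
    name: str
    posts: list = field(default_factory=list)

    def receive(self, author, text):
        self.posts.append((author, text))

    def search(self, term):
        # Search runs over the local store only; nothing is fetched remotely.
        term = term.lower()
        return [(a, t) for a, t in self.posts if term in t.lower()]


def federate(author, text, home, follower_servers):
    """Deliver a post to its home server and to servers that have followers."""
    home.receive(author, text)
    for server in follower_servers:
        server.receive(author, text)


big = Server("mega.example")
small = Server("tiny.example")

# Someone on tiny.example follows alice, so her post is delivered there.
federate("alice@mega.example", "New keyboard firmware released", big, [small])
# Nobody on tiny.example follows bob, so his post never arrives.
federate("bob@mega.example", "Keyboard firmware bug report", big, [])

small_hits = small.search("firmware")
big_hits = big.search("firmware")
print(len(small_hits), len(big_hits))
```

Same query, different answers: the small server finds one post, the big one finds both, and the small server has no way of knowing it's missing anything.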
PhotoDNA is based on image hashes, as well as some magic that works on partial hashes: resizing the image, or changing the focus point, or fiddling with the color depth or whatever won't break a PhotoDNA identification.
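PhotoDNA's actual algorithm isn't public, so I can't show it, but the general idea of a hash that survives resizing can be sketched with a much simpler stand-in, the "average hash". To be clear, this is not PhotoDNA: the function names and the 8x8 grid size below are my own assumptions, and the "image" is just a grid of grayscale numbers so the example runs without any image library.

```python
def shrink(pixels, size=8):
    """Downsample a 2D grayscale grid to size x size by block averaging.

    Assumes the grid is at least size pixels in each dimension.
    """
    h, w = len(pixels), len(pixels[0])
    out = []
    for by in range(size):
        y0, y1 = by * h // size, (by + 1) * h // size
        row = []
        for bx in range(size):
            x0, x1 = bx * w // size, (bx + 1) * w // size
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def average_hash(pixels, size=8):
    """One bit per cell: 1 if the cell is brighter than the overall mean."""
    flat = [v for row in shrink(pixels, size) for v in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for v in flat:
        bits = (bits << 1) | (1 if v > mean else 0)
    return bits


def hamming(a, b):
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


# A 32x32 diagonal gradient as a stand-in image.
original = [[4 * (x + y) for x in range(32)] for y in range(32)]
# The same image scaled up 2x (nearest neighbour).
upscaled = [[original[y // 2][x // 2] for x in range(64)] for y in range(64)]
# The same image, uniformly brighter (not clipped to 255, for simplicity).
brighter = [[v + 10 for v in row] for row in original]
# A genuinely different image: the inverted gradient.
inverted = [[255 - v for v in row] for row in original]

h0 = average_hash(original)
print(hamming(h0, average_hash(upscaled)))
print(hamming(h0, average_hash(brighter)))
print(hamming(h0, average_hash(inverted)))
```

The resized and brightened copies hash to the same bits as the original, while the inverted image lands far away. Matching is then "is the Hamming distance under some threshold" rather than "are the bytes identical", which is roughly why fiddling with the file doesn't help.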
But, of course, that means for PhotoDNA to be useful, the training set is literally 'every CSAM image in existence', so it's not really like you're training on a lot less data than an AI model would want or need.
The big safeguard, such as it is, is that you basically only query an API with an image and it tells you if PhotoDNA has it in the database, so there's no chance of the training data being shared.
Of course, there's also no reason you can't do that with an AI model, either, and I'd be shocked if that's not exactly how they've configured it.
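The "you only get a yes or no back" arrangement can be sketched like this. It's a hypothetical interface, not the real PhotoDNA service: `MatchService`, `is_match`, and the distance threshold are names and values I made up, and for simplicity the caller passes in a 64-bit hash instead of uploading an image.

```python
class MatchService:
    """Hypothetical match-only service: the database never leaves this object."""

    def __init__(self, known_hashes, max_distance=4):
        # The reference hashes stay private; no method returns them.
        self._known = frozenset(known_hashes)
        self._max_distance = max_distance

    def is_match(self, candidate_hash):
        """Return only True/False, never which entry matched or how closely."""
        return any(
            bin(candidate_hash ^ known).count("1") <= self._max_distance
            for known in self._known
        )


service = MatchService({0xF0F0F0F0F0F0F0F0})

near_copy = 0xF0F0F0F0F0F0F0F3   # differs from the known hash in 2 bits
unrelated = 0x0F0F0F0F0F0F0F0F   # differs in all 64 bits

print(service.is_match(near_copy), service.is_match(unrelated))
```

Swap the hash lookup for a classifier behind the same `is_match`-style call and the caller can't tell the difference, which is the point: the data or the model weights stay on the other side of the API either way.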