this post was submitted on 04 Dec 2023
888 points (97.9% liked)

Technology

[–] [email protected] 313 points 9 months ago (22 children)

How can the training data be sensitive if no one ever agreed to give their sensitive data to OpenAI?

[–] [email protected] 138 points 9 months ago (15 children)

Exactly this. And how can an AI which "doesn't have the source material" in its database be able to recall such information?

[–] [email protected] 70 points 9 months ago (7 children)

"Model" is the right term, not "database".

We learned something about how LLMs work from this. It's like a bunch of paintings were chopped up into pixels to be used to make other paintings. No one knew it was possible to break the model and have it spit out the pixels of a single painting in order.

I wonder if diffusion models have some other weird quirks we have yet to discover.

[–] [email protected] 9 points 9 months ago* (last edited 9 months ago)

The compression technology a diffusion model would need in order to store “the training data” realistically (without being too lossy) would be more valuable than the entirety of the machine learning field right now.

They do not “compress” images.
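
To put rough numbers on that, here is a back-of-the-envelope sketch. The model size, image count, and average image size below are made-up round figures chosen for illustration, not measurements of any real model or dataset, and the function names are hypothetical.

```python
def bytes_per_training_image(model_size_bytes: float, num_training_images: float) -> float:
    """Average number of model bytes available per training image."""
    if num_training_images <= 0:
        raise ValueError("num_training_images must be positive")
    return model_size_bytes / num_training_images


def required_compression_ratio(avg_image_bytes: float, budget_bytes: float) -> float:
    """How many times smaller each image would have to get to fit the budget."""
    if budget_bytes <= 0:
        raise ValueError("budget_bytes must be positive")
    return avg_image_bytes / budget_bytes


# ASSUMED illustrative values, not real figures for any model:
MODEL_SIZE_BYTES = 4e9       # a 4 GB checkpoint
NUM_TRAINING_IMAGES = 2e9    # 2 billion training images
AVG_IMAGE_BYTES = 100e3      # 100 KB per already-compressed image

budget = bytes_per_training_image(MODEL_SIZE_BYTES, NUM_TRAINING_IMAGES)
ratio = required_compression_ratio(AVG_IMAGE_BYTES, budget)
print(f"{budget:.1f} bytes per image, required ratio {ratio:,.0f}:1")
```

Under those assumptions the model has only a couple of bytes per image to work with, far below what any practical image codec achieves. This is an average, so it does not rule out a model memorizing a small number of heavily duplicated examples.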
