this post was submitted on 20 Sep 2023
556 points (95.6% liked)
Technology
Ok, so why not wait until those hypothetical violations occur and then sue?
Because that is far harder to prove than showing OpenAI used his IP without permission.
In my opinion, training a generative model on data without the rights holder's permission should not be allowed. So at the very least, OpenAI should publish (references to) the training data they have used so far, and probably restrict the dataset to public-domain and opt-in works for future models.
Assuming that the books used for GPT training were indeed purchased, not pirated, and since "AI training" was not prohibited at the time of purchase, the engineers had every right to use them. Maybe authors could prohibit "AI training" in the future, but for books purchased before they do, "AI training" is fair use.
I think whether or not that is true will be decided in a trial like this.