this post was submitted on 23 Jan 2024
47 points (66.7% liked)


I fucked with the title a bit. What I linked to was actually a Mastodon post linking to the actual thing. But in my defense, I found it because Cory Doctorow boosted it, so, in a way, I am providing the original source here.

please argue. please do not remove.

[–] [email protected] 35 points 10 months ago (1 children)

Google scanned millions of books and made them available online. Courts ruled that was fair use because the purpose and interface didn't lend themselves to actually reading the books in Google Books, only to searching them for information. If that is fair use, then I don't see how training an LLM (which doesn't retain an exact copy of the training data, at least in the vast majority of cases) isn't fair use. You aren't going to get an argument from me.

I think most people who will disagree are reflexively anti-AI, and that's fine. But I just haven't heard a good argument that AI training isn't fair use.

[–] [email protected] 5 points 10 months ago (2 children)

here's a sidechannel attack on your position: every use, even an infringing use, is fair use until adjudicated, because what fair use means is that a court has agreed that your infringing use is allowed. so of course ai training (broadly) is always fair use. but particular instances of ai training may be found to not be fair use, and so we can't be sure that you are always going to be right (for the specific ai models that may come into question legally).

[–] [email protected] 10 points 10 months ago (1 children)

"It's perfectly legal unless you get caught!"

[–] [email protected] 0 points 10 months ago* (last edited 10 months ago)

Considering most copyright cases come down to the individual judge's decision, essentially yes.

[–] [email protected] 3 points 10 months ago (1 children)

I am no lawyer, but I suspect what will be considered either fair use or infringing will probably depend on how the programmed AI model is used.

For example, if you train it on a book of poetry, asking it questions about the poetry will probably be considered fair use. If you ask the AI to write poetry in the style of the book's poems and you publish the AI's poetry, I suspect it might be considered laundering copyright and infringing. Especially if it is substantially similar to specific poems in the book.

[–] [email protected] 11 points 10 months ago

If you ask the AI to write poetry in the style of the book’s poems and you publish the AI’s poetry, I suspect it might be considered laundering copyright and infringing.

is the image of a cabin in a snowy landscape copyrighted by Thomas Kinkade? fuck no. That's an idea. ideas can't be copyrighted. a style isn't a discrete work. it is an idea. it can't be copyrighted. if I produce something in the style of Keats or Stephen King or Rowling, they can't sue me for copyright infringement unless I make a substantially infringing use of their work. The style isn't sufficient, because the style can't be copyrighted.