this post was submitted on 08 Jan 2024
335 points (96.1% liked)
Technology
If it's not infringement to input copyrighted materials, then it's not infringement to take the output.
Copyright can be enforced at both ends or neither end, not one or the other.
Because... why?
A better question is: Why not?
If Copyright doesn't protect what goes in, why should it protect what comes out?
Because sometimes it spits its training data out verbatim, and in Copilot's case, what gets spat out is sometimes GPLed code.
See: the time Copilot spat out the Quake III fast inverse square root algorithm, comments and all.
Also, if it's legal to disregard libre/open source licenses for this, then why isn't it legal for me to look at leaked code, which I also do not have permission to use, and use the knowledge gained from that to write something else?
Which is exactly why the output of an AI trained on copyrighted inputs should not be copyrightable. It should not become the private property of whichever company owns the language model. That would be bad for a lot more reasons than the potential for laundering open source code.
Well. That sounds perfectly legal. However, mind that "leaked" implies unauthorized copying and/or a violation of trade secrets. But it's not a given that looking at such code violates any law.
And if they're not going to respect the copyleft, they are also performing unauthorised copying.
"Copyleft" means certain types of copyright licenses. Since these licenses generally allow and encourage public distribution/copying, such code is certainly not leaked. Laws pertaining to trade secrets cannot be involved in principle.
I think the copies made during AI training would typically be allowed under copyleft licenses. In any case, since a copyleft license is a copyright license, it is subject to the same limitations.
Public distribution and copying are allowed, but only if the license in its entirety is respected.
And when the license is void, it's all rights reserved, right?
Sure. Is there a problem with any copyleft license?
? I'm not sure what your point is
You asked?
The part that you're apparently having trouble understanding is that a language model is not a human mind and a human mind is not a language model.