this post was submitted on 17 Feb 2024
1059 points (98.8% liked)
Technology
The more accessible training data there is, the easier it is for new AI projects to enter the field and the less dominant those "giant corporations" become.
The free labour was already freely given. If someone doesn't want to have shitposted on Reddit for free then maybe they shouldn't have shitposted on Reddit for free.
"if you didn't want me to steal your intellectual property, you shouldn't have thought of it in the first place"
No, you shouldn’t have posted it to Reddit, where you were required to give them a perpetual license to use your IP in any way they see fit.
For the record, I’m here because Reddit pissed me off when they axed the free API, and I’m pissed at myself for not expecting it. That’s what I get for accepting their terms and conditions, I guess.
Edit: I also don’t accept the idea that using my content for training data is “fair use” when it is used to train proprietary models, especially ones that let the end user prompt them to plagiarize or otherwise imitate my content.
So, for an example of what the other user was talking about: I'm just some guy, and for my first foray into programming / machine learning (I kind of just threw myself into the deep end) I modified StyleGAN3 and trained it on about 500 GB of porn that I scraped off Reddit.
Now, I stopped the training after about a week (it was going to take about a solid month on my RTX 2080 Ti) when I found out Stable Diffusion existed, but I learned a LOT from that experience.
I couldn't do that now. Arguably none of that was how any of that should be done but whatever.
I'm not sure what you mean here. Nothing's being stolen. Even if you think there needs to be permission for training an AI off of data, Reddit has that permission.
I assume you're more of a moron than a troll, which is disappointing. Regardless, you're not worth my time, as I don't think any argument could convince you to have an open mind and be willing to change. Good luck out there!