this post was submitted on 14 Jun 2024
836 points (97.3% liked)

memes

[–] [email protected] 73 points 5 months ago (7 children)

Think of the money we'd all have if we were the ones selling our data

[–] [email protected] 43 points 5 months ago* (last edited 5 months ago) (22 children)

Big tech companies making vast profits off of users providing data for free instead of paying workers wages in exchange for manufacturing goods is only going to deepen the disparity of wealth in society.

What we desperately need is essentially a Digital Bill of Rights so that we can legally own our own data.

[–] [email protected] 7 points 5 months ago* (last edited 5 months ago)

The amount is incredibly vast. If we go by quantity, no one here is getting a dime, and if we go by quality... it's probably the same. Not to mention the logistics of getting everyone their penny or two.

And the data right now belongs to everyone. For example, Reddit technically 'owns' its content, but anyone can use it for ML purposes.

It's why a lot of these campaigns about data ownership are being pushed. If the gov passes laws, it won't be to the benefit of the individual but the data aggregators like Reddit, Shutterstock, etc.

They are playing on emotions and manipulating people into thinking killing AI FOSS and erecting data barriers is in their interest.

[–] [email protected] 3 points 5 months ago (2 children)

Don't mind my tipsy Friday rambling, but this is actually an interesting thing to think about. Kinda wonder how that would work, if it were to be real. Maybe there'd be a single centralized data broker, or we could choose from a list of vendors, like how sharing cookies works.
Would it be per a specific amount of data, or per identifiable data? What if we just dumped 10 years of chats into it?

[–] [email protected] 4 points 5 months ago

Maybe there’d be a single centralized data broker

Hmm, like a government office? With the power to affix marks to IP that prevent copying, and to grant rights to people? Like a department of copy-right or something?

[–] [email protected] 51 points 5 months ago (4 children)

IMO it’s one thing if you posted things publicly on the internet and it’s getting scraped, in the same way a human would find it.

But it’s disgusting when all these companies retroactively update their TOS, or force you into zero privacy to continue using their service.

[–] [email protected] 22 points 5 months ago (2 children)

They were already doing that stuff with your data but now they're telling you about it.

[–] [email protected] 11 points 5 months ago (2 children)

They were already doing this, but now they're beginning to become afraid of the legal ramifications. Using data like this has never come this close to redistribution of Common Law copyrighted work.

If someone can provide evidence of an AI regurgitating their work verbatim or with no alterations, the company behind it can be in serious legal trouble. We all have this right, but companies have the right to sign a contract with us outlining the terms of the copyright and/or granting them a perpetual license to the work. Just remember that any work you do for your company isn't yours but the company's, as per the work-for-hire doctrine.

[–] [email protected] 3 points 5 months ago

They told us before. But memes

[–] [email protected] 5 points 5 months ago* (last edited 5 months ago)

This is the entire issue for me.

Privatizing what is otherwise public content, and then privatizing the models that are trained on that content and making me pay for having it regurgitated back at me.

I think AI would be really cool, IF:

  • it wasn't being shoved into every goddamn thing
  • it wasn't being used as justification to cut jobs
  • it was an open source project and wasn't being gatekept by capitalist interests
[–] [email protected] 4 points 5 months ago

Exactly THIS ☝️! Well put! Thanks mate!

[–] [email protected] 36 points 5 months ago (4 children)

There is no "data ownership". It's all made up. If you don't want people to copy and build off your ideas, don't share them. That's not to defend corpos, btw. I posit that any AI models trained on public data must be open sourced by default.

[–] [email protected] 26 points 5 months ago (2 children)

Autodesk has mandatory cloud saves, and MS got caught training on private GitHub repos. They don't care whether it's public or not.

[–] [email protected] 5 points 5 months ago

What did the user agreement say? Also just out of curiosity do you remember all those privacy nuts back in the day who warned us all about the dangers of closed source software?

[–] [email protected] 3 points 5 months ago

"Private GitHub repos" aren't really private. You have to self-host for that.

[–] [email protected] 16 points 5 months ago (1 children)

Your heart rate. Your step count. Your location. Your searches. Your browser history. Your call history. Your contacts. Your transactions. Your credit history. Your medical history. This is data that you didn't choose to create or share, but that you exhaust in the day-to-day things you do.

Surveillance capitalism has grown too unfathomably huge and ingrained to choose not to share this data; that would be akin to checking out of modern life wholesale in a lot of ways. Guarding this data takes not only the realisation that it needs guarding, but changing law and culture such that the parties that have to have all that data to provide you with services cannot take it from you to sell.

[–] [email protected] 2 points 5 months ago (1 children)

There's a difference between private data and content. Obviously this is not what we're talking about here

[–] [email protected] 8 points 5 months ago (1 children)

You were talking about data ownership, not intellectual property.

[–] [email protected] 4 points 5 months ago (2 children)

AI is here, and it's here to stay whether we want it or not. Either it's free and legal for everyone to develop (i.e. training on copyrighted data does not violate copyright), or only the massively rich corporations will be able to afford to pay for the sheer amounts of data needed to adequately train it (or will already happen to hold the rights, as the case may be; see stock photo companies or Reddit for examples).

[–] [email protected] 34 points 5 months ago (2 children)

Yeah, I'm genuinely feeling like I don't want to publish things I create onto the internet, because these companies will gladly break laws to use it. Companies spent decades building up ridiculous copyright laws and when they go to violate those laws themselves, law enforcement fails.

[–] [email protected] 10 points 5 months ago (1 children)

Don't. Please stop. Don't publish anything you don't want shared. There was so much cool free stuff that everyone shared until content creators showed up trying to sell us shit like a bunch of car salesmen. Please stop

[–] [email protected] 5 points 5 months ago (3 children)

Oh, I do want it shared. I just don't want to be taken advantage of by immoral companies. That's why I would share it under licenses like AGPLv3 or CC BY-NC-SA. In a sense, I'm very much blocking others from taking the free stuff I share and turning it into a commercial product, because I do feel the same as you.

[–] [email protected] 2 points 5 months ago

Just create demented shitposts that will poison any AI, like the ones trained on Reddit posts telling users to put glue on their pizza and make chlorine gas.

[–] [email protected] 12 points 5 months ago* (last edited 5 months ago) (1 children)

We're speeding towards a point where only the obscenely rich resource hoarders and their corporations actually own anything.

The rest of us will just use anything, including what is still technically our own bodies, at their pleasure.

Can't say I'm quite gleeful about it tbh..

[–] [email protected] 7 points 5 months ago (1 children)

We're going back to the feudal system

[–] [email protected] 4 points 5 months ago (1 children)

That's a more concise way of putting it, yes 😁

[–] [email protected] 5 points 5 months ago (1 children)
[–] [email protected] 12 points 5 months ago* (last edited 5 months ago) (5 children)

I'm [email protected] all the way (even my email).

Highly recommend it, even if you start small with like just your calendar or something.

Even if you can't self-host, maybe one of your friends can/does and would set you up on their stuff. I've got a handful of friends and family hooked into my stack (email, Nextcloud, Matrix, Lemmy, AdGuard DNS, etc).

[–] [email protected] 6 points 5 months ago (1 children)
[–] [email protected] 4 points 5 months ago

I felt that lol.

[–] [email protected] 5 points 5 months ago (1 children)

Do you have a good "getting started" resource for self-hosting that you would point people to?

[–] [email protected] 8 points 5 months ago* (last edited 5 months ago)

Not really, though there's probably something like that out there. It's more a collection of skills that build on each other, finding a problem to solve, and then solving it (with occasional detours along the way to fill in any knowledge gaps).

Basically, just stack these on top of each other:

  1. Learn basic Linux skills (I can't in good conscience recommend hosting on or even using Windows)
  2. Familiarize yourself with web standards. Don't have to be an expert, just understand the basic concepts (web traffic is HTTP based, HTTP usually runs on port 80, HTTPS is secure/encrypted HTTP, don't send passwords over HTTP, etc).
  3. Find a self-hosted project you'd like to play with. Usually you can just google "self hosted {thing}", such as "self hosted Trello".
  4. The previous step will typically land you on a GitHub or other project page. Review the getting-started docs there.
  5. You'll likely encounter terms or things you don't understand. Detour to familiarize yourself with them.
  6. Follow the steps to get your first service up and running.
  7. Enjoy!
  8. Once you're past that, you can fine tune, re-deploy in a better way, or otherwise optimize.
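
To make the web basics from step 2 a bit more concrete, here's a rough Python sketch. It isn't taken from any particular project: the helper names `describe_endpoint` and `safe_to_send_password` and the `nextcloud.example.lan` hostname are made up for illustration. It only shows the idea that HTTP defaults to port 80, HTTPS defaults to port 443, and passwords should only go over HTTPS.

```python
from urllib.parse import urlsplit

# Conventional default ports for the two web schemes.
DEFAULT_PORTS = {"http": 80, "https": 443}


def describe_endpoint(url: str) -> dict:
    """Return the host, effective port, and whether the URL is encrypted."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    if scheme not in DEFAULT_PORTS:
        raise ValueError(f"not a web URL: {url!r}")
    # Use the explicit port if one was given, otherwise the scheme default.
    port = parts.port if parts.port is not None else DEFAULT_PORTS[scheme]
    return {"host": parts.hostname, "port": port, "encrypted": scheme == "https"}


def safe_to_send_password(url: str) -> bool:
    """Only treat HTTPS endpoints as acceptable for credentials."""
    return describe_endpoint(url)["encrypted"]


if __name__ == "__main__":
    # Hypothetical self-hosted service URLs, purely for illustration.
    for url in ("http://nextcloud.example.lan/login",
                "https://nextcloud.example.lan/login"):
        print(url, describe_endpoint(url), safe_to_send_password(url))
```

A real deployment would normally handle this with a reverse proxy and TLS certificates rather than a check like this, which is exactly the kind of detour step 5 is talking about.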

The next thing you decide to deploy will usually be easier and will further extend and cement the skills you've just used.

It's definitely a process and collection of skills rather than just one monolithic thing, but each one builds off the other. There's a learning curve, sure, but just reading the docs for different things will usually get you going or provide a "jumping off" point. E.g. many services utilize Docker, so you'll see that a lot in the docs and probably end up detouring to learn the basics of working with it.

Some self-hostable applications do have easy deploy scripts, which can definitely be good for beginners, but I tend not to like those because if/when something goes wrong, you're ill-equipped to do any meaningful troubleshooting.

Members of various selfhosted communities are usually happy to help as long as you're willing to learn; we typically don't like to just do it for you lol.

[–] [email protected] 5 points 5 months ago

...and then happily selling it back to you for a profit.

[–] [email protected] 4 points 5 months ago (1 children)

A better image would be the "Me" drinking from the company's pee stream.

[–] [email protected] 3 points 5 months ago (1 children)

This is how they get their revenge on piracy.

[–] [email protected] 6 points 5 months ago (1 children)

But the solution is more piracy
