this post was submitted on 28 Jan 2025
871 points (94.4% liked)

Office space meme:

"If y'all could stop calling an LLM "open source" just because they published the weights... that would be great."

[–] [email protected] 1 points 1 day ago (1 children)

The difference is that the dataset is baked into the weights of the model. Your emulation analogy simply doesn't have a leg to stand on. I don't think you know how neural networks work.

The standards are literally the basis of open source.

[–] [email protected] 1 points 1 day ago (1 children)

I made my level of understanding pretty clear at the start. You say it's not open source, most say it is, and they explained why; when I checked, all their points were true, and I tried to understand as best I could. The bottom line is that the disagreement comes down to this: you say the training data and the weights together are an inseparable part of the whole, and if any part of that is not open, then the project as a whole is not open. I don't see how that tracks when the weights are open, and both they and the training data can be removed and switched for something else. But I have come to believe the response would just boil down to "you can't separate it." There really is nowhere else to go at this point.

[–] [email protected] 1 points 1 day ago

You can read all the other comments, which explain why it is not open source. You can't really retrain the model without petabytes of data. Even if you "train" it on your own dataset, that's fine-tuning: tweaking the existing model weights a bit rather than building the model from scratch (see the sketch below).

"Open source" is PR talk by Meta and deepseek.