Microblog Memes


A place to share screenshots of Microblog posts, whether from Mastodon, tumblr, ~~Twitter~~ X, KBin, Threads or elsewhere.

Created as an evolution of White People Twitter and other tweet-capture subreddits.

Rules:

Please put at least one word relevant to the post in the post title.
Be nice.
No advertising, brand promotion or guerilla marketing.
Posters are encouraged to link to the toot or tweet etc in the description of posts.

founded 2 years ago

1092

deepseek (lemmy.ml)

submitted 2 days ago by [email protected] to c/[email protected]

266 comments

(page 3) 50 comments

[–] [email protected] 3 points 1 day ago (7 children)

I mean, it seems to do a lot of China-related censoring, but otherwise it seems to be pretty good

[–] [email protected] 2 points 1 day ago (3 children)

I think the big question is how the model was trained. There's speculation (though unproven afaik) that they may have gotten ahold of some of the backend training data from OpenAI and/or others. If so, they kinda cheated their way to the efficiency claims that are wrecking the market. But evidence is needed.

Imagine you're writing a dictionary of all words in the English language. If you're starting from scratch, the first and most difficult step is finding all the words you need to define. You basically have to read everything ever written to look for more words, and 99.999% of what you'll actually be doing is finding the same words over and over and over, but you still have to look at everything. It's extremely inefficient.

What some people suspect is happening here is the AI equivalent of taking that dictionary that was just written, grabbing all the words, and changing the details of the language in the definitions. There may not be anything inherently wrong with that, but its "efficiency" comes from copying someone else's work.

Once again, that may be fine for use as a product, but saying it's a more efficient AI model is not entirely accurate. It's like paraphrasing a few articles based on research from the LHC and claiming that makes you a more efficient science contributor than CERN since you didn't have to build a supercollider to do your work.

[–] [email protected] 1 points 1 day ago

So here's my take on the whole stolen training data thing. If that is true, then OpenAI should have literally zero issues building a new model off of the full output of the old model, just like DeepSeek did. But even better, because they run it in house. If this is such a crisis, then they should do it themselves just like China did. In theory (and I don't personally think this makes a ton of sense), if training an LLM on the output of another LLM results in a more power-efficient, less hardware-hungry, and overall better LLM, then why aren't they doing that with their own LLMs to begin with?


[–] [email protected] 6 points 1 day ago (1 children)

Deepsink

[–] [email protected] 5 points 1 day ago (1 children)

What is it sinking deeply about?

[–] [email protected] 2 points 1 day ago

GPU proudly run by OceanGate

[–] [email protected] 19 points 2 days ago (1 children)

if you can imagine a fish enjoying a succulent chinese meal rn, rolling its eyes

[–] [email protected] 18 points 2 days ago (2 children)
