this post was submitted on 23 Nov 2024

360 points (89.3% liked)

OpenAI, Google, Anthropic admit they can’t scale up their chatbots any further (pivot-to-ai.com)

submitted 4 weeks ago* (last edited 4 weeks ago) by [email protected] to c/[email protected]

88 comments

I'm usually the one saying "AI is already as good as it's gonna get, for a long while."

This article, in contrast, is quotes from folks making the next AI generation - saying the same.

[–] [email protected] 100 points 4 weeks ago (4 children)

It's absurd that some of the larger LLMs now use hundreds of billions of parameters (e.g. Llama 3.1 with 405B).

This doesn't really seem like a smart use of resources if you need several of the largest GPUs available just to run one conversation.

[–] [email protected] 30 points 4 weeks ago (3 children)

I wonder how many GPUs my brain is

[–] [email protected] 65 points 4 weeks ago (1 children)

It's a lot. Like a lot a lot. GPUs have about 150 billion transistors, but each of those transistors only makes 1 connection, in what is essentially a 2D layout printed on silicon.

Each neuron makes thousands of connections, and there are on the order of 100 billion neurons in a blobby lump of fat and neurons that takes up 3D space. Combine that with the fact that everything actually functions through patterns of multiple neurons firing, and you get an absurdly high ceiling on how powerful human brains can be.

At this point, I'm not sure there are enough GPUs in the world to mimic what a human brain can do.

[–] [email protected] 22 points 4 weeks ago

That's also just the electrical portion of our mind. There are whole levels of chemistry and chemical potentials at work. Neurones will fire differently depending on the chemical soup around them. Most of our moods are chemically based, e.g. adrenaline and testosterone making us more aggressive.

Our mind also extends out of our heads. Organ transplant recipients have reported personality changes, with changes in food preferences being the most prevalent.

The neurons only deal with 'fast' thinking. 'Slow' thinking is far more complex and distributed.

[–] [email protected] 20 points 4 weeks ago (2 children)

[–] [email protected] 3 points 4 weeks ago

The Answer to the Ultimate Question of Life, The Universe, and Everything

[–] [email protected] 2 points 4 weeks ago (1 children)

Sounds generous.

[–] [email protected] 1 points 4 weeks ago

You said GPUs, not CPUs and threading capabilities

[–] [email protected] 13 points 4 weeks ago (1 children)

I don't think your brain can be reasonably compared with an LLM, just like it can't be compared with a calculator.

[–] [email protected] 21 points 4 weeks ago (1 children)

LLMs are based on neural networks, which are a massively simplified model of how our brain works. So you kind of can, as long as you keep in mind they are orders of magnitude simpler.

[–] [email protected] 6 points 4 weeks ago (1 children)

At some point it becomes so “simplified” it’s arguably just not the same thing, even conceptually.

[–] [email protected] 0 points 3 weeks ago* (last edited 3 weeks ago) (1 children)

It is conceptually the same thing. A series of interconnected neurons with a firing threshold and weighted connections.
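To make "a firing threshold and weighted connections" concrete, here is a minimal sketch of a single artificial neuron. The function name, the inputs, the weights and the threshold are all made up for illustration; real networks use smooth activation functions and learn their weights, so treat this as the textbook idea rather than how any particular LLM is built.

```python
def neuron_fires(inputs, weights, threshold):
    """Return True if the weighted sum of the inputs reaches the threshold."""
    # Each connection scales its input signal by a weight.
    activation = sum(x * w for x, w in zip(inputs, weights))
    # The neuron "fires" only when the combined signal is strong enough.
    return activation >= threshold


# Hypothetical neuron with three incoming connections.
weights = [0.5, 0.9, 0.3]
threshold = 0.7

print(neuron_fires([1, 0, 1], weights, threshold))  # 0.5 + 0.3 reaches 0.7
print(neuron_fires([0, 0, 1], weights, threshold))  # 0.3 alone does not
```

A network is many of these wired together, with the outputs of one layer feeding the inputs of the next.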

The simplification comes with how the information is transmitted and how our brain learns.

Many functions in the human body rely on quantum mechanical effects to function correctly. So to simulate it properly each connection really needs to be its own super computer.

But it has been shown to be able to encode information in a similar way. The learning part is not even close.

[–] [email protected] 1 points 3 weeks ago (1 children)

It is conceptually the same thing. [...] The learning part is not even close.

Well... isn't the "learning part" precisely the point? I don't think anybody is excited about brains as "just" a computational device, rather the primary function of a brain is ... learning.

[–] [email protected] 1 points 3 weeks ago (1 children)

No, we are nowhere close to learning as the human brain does. We don't even really understand how it does it at all.

The point is to encode solutions to problems that we can't solve with standard programming techniques. Like vision, speech recognition and generation.

These problems are easy for humans and very difficult for computers. The same way maths is super easy for computers compared to humans.

By applying techniques our neurones use, computer vision and speech have come on in leaps and bounds.

We are decades from getting anything close to a computer brain.

[–] [email protected] 1 points 2 weeks ago

No, we are nowhere close to learning as the human brain does. We don’t even really understand how it does it at all.

Sorry then if I sound like a broken record but again, doesn't that mean that the analogy itself is flawed? If the goal remains the same but there is close to no explanatory power, even if we do get pragmatically useful results (i.e. it "works" in some useful cases), it's basically "just" inspiration, which is nice but is basically branding more than anything else.

[–] [email protected] 17 points 4 weeks ago

Seeing as how the full unquantized FP16 for Llama 3.1 405B requires around a terabyte of VRAM (16 bits per parameter + context), I'd say way more than several.
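For anyone who wants to check the arithmetic, here is a rough sketch. It counts the weights only, so context (KV cache) and activations come on top. The 80 GB per-card figure is my assumption for a large datacenter GPU and is not from the thread.

```python
import math


def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Memory needed for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


# Llama 3.1 405B in FP16: 405e9 parameters at 16 bits (2 bytes) each.
fp16_gb = weight_memory_gb(405e9, 16)
print(f"FP16 weights: {fp16_gb:.0f} GB")  # 810 GB

# Assumed card size (hypothetical): 80 GB of VRAM per GPU.
VRAM_PER_GPU_GB = 80
gpus_needed = math.ceil(fp16_gb / VRAM_PER_GPU_GB)
print(f"GPUs needed just to hold the weights: {gpus_needed}")  # 11
```

So 810 GB before any context is loaded, which is how you end up at "around a terabyte".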

[–] [email protected] 9 points 4 weeks ago

That's capitalism

[–] [email protected] 6 points 3 weeks ago* (last edited 3 weeks ago) (1 children)

Larger models train faster (need less compute), for reasons not fully understood. These large models can then be used as teachers to train smaller models more efficiently. I've used Qwen 14B (14 billion parameters, quantized to 6-bit integers), and it's not too much worse than these very large models.

Lately, I've been thinking of LLMs as lossy text/idea compression with content-addressable memory. And 10.5GB is pretty good compression for all the "knowledge" they seem to retain.
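The 10.5GB figure follows directly from the parameter count and the bit width. A quick sketch, assuming a flat 6 bits for every weight (real quantized files also store scales and metadata, so they run a bit larger); the function name is just for illustration:

```python
def quantized_size_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate size of the weights in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


# Qwen 14B at a few bit widths; 6-bit is the one mentioned above.
for bits in (16, 8, 6, 4):
    print(f"{bits:>2}-bit: {quantized_size_gb(14e9, bits):.1f} GB")

six_bit_gb = quantized_size_gb(14e9, 6)  # 14e9 * 6 / 8 bytes = 10.5 GB
```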

[–] [email protected] 1 points 3 weeks ago (1 children)

I don't think Qwen was trained with distillation, was it?

It would be awesome if it was.

Also you should try Supernova Medius, which is Qwen 14B with some "distillation" from some other models.

[–] [email protected] 1 points 3 weeks ago (1 children)

Hmm. I just assumed 14B was distilled from 72B, because that's what I thought llama was doing, and that would just make sense. On further research it's not clear if llama did the traditional teacher method or just trained the smaller models on synthetic data generated from a large model. I suppose training smaller models on a larger amount of data generated by larger models is similar though. It does seem like Qwen was also trained on synthetic data, because it sometimes thinks it's Claude, lol.
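To show what I mean by the "traditional teacher method": the student is trained to match the teacher's whole output distribution, not just text the teacher generated. A minimal sketch of that loss in plain Python follows. The logits and the temperature of 2.0 are made-up values, and I've left out the usual temperature-squared scaling and the mix with the normal next-token loss, so this is the core idea only and not anyone's actual training recipe.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn raw logits into probabilities, softened by the temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from the teacher's soft distribution to the student's."""
    p = softmax(teacher_logits, temperature)  # teacher: the target
    q = softmax(student_logits, temperature)  # student: being trained
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


# Hypothetical logits over a 4-token vocabulary.
teacher = [4.0, 1.5, 0.2, -1.0]
student = [2.5, 2.0, 0.1, -0.5]

print(distillation_loss(teacher, student))  # positive: distributions differ
print(distillation_loss(teacher, teacher))  # zero: student matches teacher
```

Training on synthetic data only gives the student the single token the teacher sampled at each step, while this loss gives it the teacher's probabilities for every token, which is why the two approaches are similar but not identical.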

Thanks for the tip on Medius. Just tried it out, and it does seem better than Qwen 14B.

[–] [email protected] 1 points 3 weeks ago* (last edited 3 weeks ago)

Llama 3.1 is not even a "true" distillation either, but it's kinda complicated, like you said.

Yeah, Qwen undoubtedly has synthetic data lol. It's even in the base model, which isn't really their "fault" as it's presumably part of the web scrape.