this post was submitted on 12 Jun 2024

393 points (95.4% liked)

Technology

Tim Cook is “not 100 percent” sure Apple can stop AI hallucinations (www.theverge.com)

submitted 5 months ago by [email protected] to c/[email protected]

[–] [email protected] 58 points 5 months ago* (last edited 5 months ago) (1 children)

Wow whoosh. The point is that "AI" isn't actually "intelligent" like a human and thus can't "hallucinate" like an intelligent human.

All of this anthropomorphic terminology is just misleading marketing bullshit.

[+] [email protected] -31 points 5 months ago (5 children)

Who said anything about human intelligence? AIs have a different kind of intelligence, an artificial kind. I'm tired of pretending they don't.

Ever heard of the Turing test? Ever since AIs could pass it it became not a thing. Before that, playing Go was the mark of AI.

Any time an AI achieves a new thing people move goalposts. So I ask you: what does AI need to achieve to have intelligence?

[–] [email protected] 36 points 5 months ago (1 children)

The Turing Test says that any person could have any conversation with a machine and there's no chance you could tell it's a machine. It does not say that one person could have one conversation with a machine and not be able to tell.

Current text generation models out themselves all the damn time. They can't actually understand the underlying concepts of words. They just predict what bit of text would be most convincing to a human based on previous text.

Playing Go was never the mark of AI, it was the mark of improving game-playing machines. It doesn't represent "intelligence", only an ability to predict what should happen next based on a set of training data.

It's worth noting that after Lee Sedol lost to AlphaGo, researchers found a fairly trivial Go strategy that could reliably beat the machine. It was simply such an easy strategy to counter that none of the games in the training data had included anyone attempting that strategy, so the algorithm didn't account for how to counter it. The computer doesn't know Go theory; it only knows how to predict what to do next based on the training data.

[+] [email protected] -8 points 5 months ago (1 children)

Detecting the machine correctly once is not enough. You need to guess correctly most of the time to statistically prove it's not by chance. It's possible for some people to do this, but I've seen a lot of comments on websites accusing HUMAN answers of being written by AIs.
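To put a number on "guess correctly most of the time", here is a minimal sketch of a one-sided binomial test. It assumes each judgment is an independent human-or-machine call and that pure guessing is right half the time; the function name and the 16-out-of-20 figure are made up for illustration, not taken from any study.

```python
from math import comb

def p_value_at_least(correct, trials, chance=0.5):
    """Probability of getting at least `correct` calls right out of
    `trials` purely by guessing (binomial tail)."""
    return sum(
        comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(correct, trials + 1)
    )

# One correct call proves nothing: a coin flip does that half the time.
print(p_value_at_least(1, 1))

# Hypothetical judge who spots the machine in 16 of 20 conversations:
# far less likely to be luck.
print(round(p_value_at_least(16, 20), 4))
```

The smaller the second number, the harder it is to explain the judge's hit rate as chance.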

If the current chat bots improve to reliably not be detected, would that be intelligence then?

KataGo just fixed that bug by putting those positions into the training data. The reason it wasn't in the training data is that the training data at first was just self-play games. When games that are losses for the AI from humans are included, the bug is fixed.

[–] [email protected] 8 points 5 months ago (1 children)

When games that are losses for the AI from humans are included, the bug is fixed.

You're not grasping the fundamental problem here.

This is like saying a calculator understands math because when you plug in the right functions, you get the right answers.

[–] [email protected] -2 points 5 months ago (1 children)

The AI grasps the strategic aspects of the game really well, to the point that if you don't let it "read" deeply into the game tree, but only "guess" moves (that is, only use the policy network), it still plays at a high level (below professional, but strong amateur).

[–] [email protected] 6 points 5 months ago (1 children)

How does it "understand the strategic aspects of the game really well" if it can't solve problems it hasn't seen the answers to?

[–] [email protected] -2 points 5 months ago

It doesn't get fed answers in the training data, only positions. If it sees a position, it will eventually learn to solve it by itself.

[–] [email protected] 22 points 5 months ago (2 children)

The same thing actually passing a Turing test would require. You've obviously read the words "Turing test" somewhere and thought you understood what it meant, but no robot we've ever produced as a species has passed the Turing test. It EXPLICITLY requires that intelligence equal to (or indistinguishable from) HUMAN intelligence is shown. Without a liar reading responses, no AI we'll produce for decades will pass the Turing test.

No large language model has intelligence. They're just complicated call and response mechanisms that guess what answer we want based on a weighted response system (we tell it directly or tell another machine how to help it "weigh" words in a response). Obviously with anything that requires massive amounts of input or nuance, like language, it'll only be right about what it was guided on, which is limited to areas it is trained in.

We don't have any novel interactions with AI. They are regurgitation engines, bringing forward sentences that aren't theirs piecemeal. Given ten messages, I'm confident no major LLM would pass a Turing test.

[–] [email protected] 3 points 5 months ago (1 children)

The Turing test is flawed, because while it is supposed to test for intelligence it really just tests for a convincing fake. Depending on how you set it up I wouldn't be surprised if a modern LLM could pass it, at least some of the time. That doesn't mean they are intelligent, they aren't, but I don't think the Turing test is good justification.

For me the only justification you need is that they predict one word (or even letter!) at a time. ChatGPT doesn't plan a whole sentence out in advance, it works token by token... The input to each prediction is just everything so far, up to the last word. When it starts writing "As..." it has no concept of the fact that it's going to write "...an AI language model" until it gets through those words.

Frankly, given that fact it's amazing that LLMs can be as powerful as they are. They don't check anything, think about their answer, or even consider how to phrase a sentence. Everything they do comes from predicting the next token... An incredible piece of technology, despite its obvious flaws.
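The token-by-token loop described above can be sketched with a toy model. This is only an illustration under loud assumptions: the corpus is made up, the "model" is a table of which word followed which, and it looks only at the last word, whereas a real LLM conditions on the whole context with a neural network. The shape of the loop is the part that carries over: predict one token, append it, feed everything back in.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus; a real LLM is trained on vastly more text.
CORPUS = "as an ai language model i cannot browse the web"

def train(text):
    """Count which word follows which (a bigram table)."""
    followers = defaultdict(Counter)
    words = text.split()
    for prev, nxt in zip(words, words[1:]):
        followers[prev][nxt] += 1
    return followers

def predict_next(followers, context):
    """Pick the next word given everything written so far.

    Simplification: this toy only uses the last word of the context.
    """
    options = followers.get(context[-1])
    if not options:
        return None  # nothing ever followed this word in the corpus
    return options.most_common(1)[0][0]

def generate(followers, start, max_new_words):
    """Generate text one word at a time, with no plan for the sentence."""
    out = [start]
    for _ in range(max_new_words):
        word = predict_next(followers, out)
        if word is None:
            break
        out.append(word)  # the new word becomes part of the next input
    return out

model = train(CORPUS)
print(" ".join(generate(model, "as", 4)))  # as an ai language model
```

At the moment the loop emits "an", nothing in the program holds the rest of the sentence; "language model" only exists once those steps have run.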

[–] [email protected] 7 points 5 months ago (1 children)

The Turing test is flawed, because while it is supposed to test for intelligence it really just tests for a convincing fake.

This is just conjecture, but I assume this is because the question of consciousness is not really falsifiable, so you just kind of have to draw an arbitrary line somewhere.

Like, maybe tech gets so good that we really can't tell the difference, and only god knows it isn't really alive. But then, how would we know not to give the machine legal rights?

For the record, ChatGPT does not pass the Turing test.

[–] [email protected] 2 points 5 months ago

ChatGPT is not designed to fool us into thinking it's a human. It produces language with a specific tone & direct references to the fact it is a language model. I am confident that an LLM trained specifically to speak naturally could do it. It still wouldn't be intelligent, in my view.

[–] [email protected] 17 points 5 months ago (1 children)

Have you ever heard of the Turing test?

https://en.m.wikipedia.org/wiki/Turing_test

Here you go since you've heard of it but don't understand it.

[+] [email protected] -15 points 5 months ago (1 children)

Current AIs pass it, since most people can't reliably tell the difference between AI-written and human-written stuff every time.

[–] [email protected] 17 points 5 months ago (1 children)

It's dead simple to see if you're talking to an LLM. The latest models don't pass the Turing test, not even close. Asking them simple shit causes them to crap themselves really quickly.

Ask ChatGPT how many r's there are in "veryberry". When it gets it wrong, tell it you're disappointed and expect a correct answer. If you do that repeatedly, you can get it to claim there's more r's in the word than it has letters.
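For reference, the ground truth is a one-liner. A minimal sketch (the helper name is made up) that counts the letter and also shows why a count larger than the word's length is impossible:

```python
def count_letter(word, letter):
    """Case-insensitive count of a single letter in a word."""
    return word.lower().count(letter.lower())

for word in ("veryberry", "raspberry"):
    n = count_letter(word, "r")
    # A letter count can never exceed the number of letters in the word.
    assert n <= len(word)
    print(f"{word}: {n} r's out of {len(word)} letters")
```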

[+] [email protected] -8 points 5 months ago (2 children)

[–] [email protected] 8 points 5 months ago (2 children)

that's it? you asked one question and that was enough for you?

[–] [email protected] 1 points 5 months ago

xD God damn that was funny.

[–] [email protected] 6 points 5 months ago (1 children)

Here's what I got:

[–] [email protected] 0 points 5 months ago (1 children)

Can you show the question you asked that led to this and which model was used? I just tested in several models, even slightly older ones and they all answered precisely. Of course if you follow up and tell it the right answer is wrong you can make it say stuff like this, but not one got it wrong out of the gate.

[–] [email protected] 8 points 5 months ago (1 children)

My point is that telling it a right answer is wrong often causes LLMs to completely shit the bed. They used to argue with you nonsensically, now they give you a different answer (often also wrong).

The only question missing at the start was "How many r's are there in the word 'veryberry'?" I think raspberry also worked when I tried it. This was ChatGPT-4o. I did mark all the answers as bad, so perhaps they've fixed this one by now.

Still, it's remarkably trivial to get an LLM to provide a clearly non-human response.

[–] [email protected] 1 points 5 months ago (1 children)

Fair enough, but it does somewhat undercut your message that every model I’ve tested, including quite old ones, answers this question correctly on the first try. This image is ChatGPT-4o.

[–] [email protected] 7 points 5 months ago

Perhaps it was being influenced by the chat history. But try asking how many r's in raspberry, it does get that consistently wrong for me. And you can ask it those followup questions to easily get it to spout nonsense, and that was mostly my point; figuring out if you're talking to an LLM is fairly trivial.

[–] [email protected] 15 points 5 months ago

Ever heard of the Turing test? Ever since AIs could pass it it became not a thing.

In place of the Turing test we have a new test that informs us whether an individual can properly identify a stochastic parrot.

[–] [email protected] 5 points 5 months ago

People can mean different things. Intelligence can mean a calculator doing a sum, and it can mean the way humans talk to each other. AI can do some intelligent things without people agreeing that it's intelligent in the latter sense.