LLMs have some difficulty with reasoning, especially low-parameter models like this one. This is pretty typical of the current state of the art. Bigger LLMs do a much better job.
Yes, but I'm sure any other model, 7B or even smaller, wouldn't give "three" right after having written "two-headed", simply because of the way next-token probability works.
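If you want to check that intuition directly, you can look at the model's actual next-token probabilities instead of just sampled text. Here's a rough sketch using llama-cpp-python; the model path and the prompt wording are placeholders of mine, not the exact ones from this thread:

```python
from llama_cpp import Llama

# Load a local GGUF model. The path is a placeholder -- point it at
# whatever 7B-class model you have on disk.
llm = Llama(model_path="./models/mistral-7b-instruct.Q4_K_M.gguf")

# Ask for the top alternatives at each step so we can see which tokens
# the model actually considers likely after writing "two-headed".
out = llm(
    "Q: How many horns does a two-headed unicorn have? A:",
    max_tokens=8,
    temperature=0.0,  # greedy decoding, so the run is reproducible
    logprobs=5,       # return the 5 most likely tokens at each position
)

print(out["choices"][0]["text"])
# Each entry is a dict mapping candidate tokens to their log-probabilities:
for top in out["choices"][0]["logprobs"]["top_logprobs"]:
    print(top)
```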
I just fired up Llama2-70B, the biggest model I happen to have handy on my local machine, and repeated your exact prompt to it four times. The answers it gave were:
That's one correct guess out of four attempts. Not a great showing. So I tried a more prompt-engineerish approach and asked it:
And gave it another four attempts. Its responses were:
Response 1:
Response 2:
Response 3:
Response 4:
So that was kind of interesting. It didn't get any more accurate - still just one "success" out of four - but by rambling on about its reasoning I think we can see how this particular model is getting tripped up and ending up at four so often. It's correctly realizing that it needs to double the number of horns, but it's mistakenly doubling it twice. Perhaps mixtral-8x7b is going down a similar erroneous route. Try asking it to explain its reasoning step by step.
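For anyone who wants to reproduce this, here's roughly the harness I'd use for the "explain your reasoning step by step" variant. A sketch assuming llama-cpp-python again; the model path and the exact prompt wording are my own placeholders, not the prompt used above:

```python
from llama_cpp import Llama

# Placeholder path -- substitute your local Mixtral (or other) GGUF file.
llm = Llama(model_path="./models/mixtral-8x7b-instruct.Q4_K_M.gguf")

PROMPT = (
    "How many horns does a two-headed unicorn have? "
    "Explain your reasoning step by step before giving a final answer."
)

# Sample the same prompt several times. With a nonzero temperature,
# each attempt can wander down a different reasoning path, which is
# what exposes the "doubling twice" failure mode.
for attempt in range(4):
    out = llm(PROMPT, max_tokens=256, temperature=0.8)
    print(f"--- Response {attempt + 1} ---")
    print(out["choices"][0]["text"].strip())
```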