This isn't the argument I am making. The argument is that LLMs make certain types of errors which suggest that they are unable to create models of the world, something which is arguably necessary for understanding.
You gave no justification for why the types of errors you listed are in some different category than the types of errors I listed. Both involve answering questions that require some understanding of how the real world works. I'd argue committing the Conjunction Fallacy is just as egregious an error as failing the car wash "puzzle" (which thinking models mostly get right, anyway). I think the distinction only feels real because you're used to the limitations of humans, so you don't think our blind spots "count".
I flip five coins. Am I more, less, or equally likely to see the pattern HHT (at some point) vs the pattern HTH? If you don't know the answer, then you're failing at mentally modeling a very very simple real-world scenario of five coin flips, which has only 32 possibilities.
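If you'd rather check than reason it out, a brute-force enumeration of all 32 outcomes settles it. A quick Python sketch (the `H`/`T` string encoding is just my own convention):

```python
from itertools import product

# Enumerate all 2^5 = 32 possible sequences of five coin flips.
sequences = ["".join(flips) for flips in product("HT", repeat=5)]

# Count how many sequences contain each pattern as a contiguous substring.
hht = sum(1 for s in sequences if "HHT" in s)
hth = sum(1 for s in sequences if "HTH" in s)

print(f"HHT appears in {hht}/32 sequences")
print(f"HTH appears in {hth}/32 sequences")
```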
In fairness, the goalposts were moved because we realized LLMs couldn't do certain AGI things despite passing the "AGI" tests.
Yeah, no argument here. Like you said, it's kind of natural that we adjusted our expectations as we learned more about the nature of intelligence (now that we have more than just one kind to generalize from). We sort of assumed that a lot of other human-like capabilities would necessarily come along for the ride when an AI passed the Turing Test, and that was wrong.
Just as long as we don't keep using "it's not true AGI!" as a cognitive stop sign to avoid recognizing the incredible progress we've made.
Although I agree with SnapDragon that they're "partial AGI". I believe the missing component is continuous learning: they start out producing output like a human, because that's what they've been trained to do, so if they continued to be "trained" on their observations, presumably they'd continue to output like a human.
Indeed. I've heard of efforts to graft a learning layer onto LLMs (with a "memory" that's an embedding rather than just CoT text), but obviously it hasn't worked so far, and maybe it never will. Also that still seems like a short-term solution.
Oh wow. Um, ok, how can I dumb this down as much as possible for you?
- You: LLMs make cognitive errors, so they can't understand the world!
- Me: Humans make cognitive errors, so they can't understand the world!
Having intellectual blind spots - like reading comprehension in your case - is not proof that you're not "modeling the world".
And what about the cognitive errors that humans make all the time? The rationalist community was founded around cataloguing a long list of widespread "fallacies", after all. To pick one field, I would argue that humans lack a true capability to understand probability. We lose to even basic computer programs at Rock Paper Scissors. Gamblers think Red coming up 3 times in a row makes Black more likely next time. There are actual medical professionals who don't understand that a positive result on a 90%-accurate test for a rare disease does not mean you are 90% likely to have it. Simpson's Paradox will fool almost anyone, including me.
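To make the rare-disease point concrete, here's the Bayes arithmetic (the prevalence and accuracy numbers are purely illustrative assumptions, not from any real test):

```python
# Bayes' rule for a rare disease, with illustrative (assumed) numbers:
# 1-in-1000 prevalence, 90% sensitivity, 90% specificity.
prevalence = 0.001
sensitivity = 0.90   # P(positive | disease)
specificity = 0.90   # P(negative | no disease)

p_positive = sensitivity * prevalence + (1 - specificity) * (1 - prevalence)
p_disease_given_positive = sensitivity * prevalence / p_positive

print(f"P(disease | positive) = {p_disease_given_positive:.3f}")  # ~0.009, i.e. under 1%
```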
And on this very forum (and ACX's), every so often I try to correct people about the Doomsday Argument, which, like Monty Hall, is easily modeled - and the modeling shows it to be false. Yet Scott - and a motivated subset of Wikipedia editors - believe it anyway. Somebody who believes something false is clearly lacking a "true capability to understand probability". But they can still be intelligent.
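Monty Hall, at least, you can settle with a dozen lines of simulation (the Doomsday Argument takes a bit more setup, but it yields to the same "just model it" treatment). A quick sketch:

```python
import random

def play(switch: bool) -> bool:
    """One round of Monty Hall; returns True if the player wins the car."""
    doors = [0, 1, 2]
    car = random.choice(doors)
    pick = random.choice(doors)
    # Host opens a door that is neither the player's pick nor the car.
    opened = random.choice([d for d in doors if d != pick and d != car])
    if switch:
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car

trials = 100_000
print("stay:  ", sum(play(False) for _ in range(trials)) / trials)  # ~1/3
print("switch:", sum(play(True) for _ in range(trials)) / trials)   # ~2/3
```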
Heh. See, the AI making that Dyson Sphere doesn't have general intelligence - I bet it can't get the Wordle 6 days in a row like me.
I have the unpopular (and, ok, partially tongue-in-cheek) position that we've already hit AGI. What LLMs can do is already very general, just not fully general. But I wish it was emphasized more that we messy meaty humans don't have fully general intelligence, either - it doesn't matter how you bring up a precocious child, they're not going to be able to rotate 50-dimensional shapes or approximate partial differential equations in their head, and all but the best of us max out at fluency in a few languages, or memorizing a few thousand digits of pi. We're just so used to the things we (and everyone else we've ever known) can't do in our heads that we intuitively don't even think of them as tests of "intelligence".
Someone from the early 2000s, having LLM capabilities described to them, would indeed think they meet the definition of general intelligence. What we kind of subconsciously expected, but didn't happen, was that someone would just suddenly launch an AI product that lit up a giant neon sign saying "AGI ACHIEVED!". Instead, the AI we've developed so far just turned out to have a different set of strengths and weaknesses than us. By the time we're able to bring those weak points up to human level - i.e., where an AI can perform as well as an average human on any task, which is what a lot of people think of when they say "AGI" - it'll actually be vastly superhuman in the things that come naturally to it. (LLMs are already superhuman at language comprehension, after all.)
I don't think you can say for sure that they don't have a "world model" hidden somewhere in their trillion-dimensional space. I've certainly used them in ways that seem to require one, and while it's possible that they're faking it with statistics and I'm overestimating the difficulty of what I ask ... that line of argument has to bottom out at some point, right? You have to allow for some way they could show that they really do "understand causal relationships" (even if it's just through preponderance of evidence); otherwise you're using unfalsifiable, faith-based reasoning to assert that only human intelligence is real intelligence.
What they definitely don't have is temporal persistence of thought, simply because of how they actually work. (CoT reasoning is a patch for this, but an imperfect one.) A priori, I would have thought that was necessary for complex reasoning.
And I would advise heavily discounting anything Gary Marcus says. He's just enjoying a career as a self-styled "expert" that the media can go to whenever they want a skeptical quote, but almost every testable claim he's made has been wrong.
I agree that his posts are far below the average level of quality here, but they're not THAT frequent, and I wouldn't want them to be modded. This is supposed to be a free speech forum, after all. Our whole thing is that the good ideas are supposed to win over the bad ideas, and fortunately he seems to be getting plenty of pushback. And there are some decent debates happening in the replies, it's just that they're happening despite the OP, not because of him.
No, you presented it as a conceptual proof that LLMs will never get better. All it takes is one innovation that addresses your concern about recycled data to make it invalid. All arguments about intelligence are necessarily a bit wishy-washy, mind you, so I'm not saying your thought experiment is useless.
I think if you really want to argue that LLMs have an inherent cap on their capability, you should address their actual algorithm rather than how they're trained. However much we rejigger them with CoT thinking and non-text data sources, they're fundamentally not designed for anything more than next-token prediction. It should be a source of constant surprise that they do so well on such a wide variety of non-creative-writing tasks (look at early SSC posts about GPT-3's output to see this surprise evolve in real time). You could argue that if LLMs end up hitting a soft or hard limit, that's really just the "surprise" petering out, that we really can't just take a glorified text completer and keep pumping neurons into it until it's a genius.
I don't personally believe this will happen, but hey, I don't think anyone really knows for sure.
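To put the "glorified text completer" point in concrete terms, here's roughly the entire inference loop, stripped of detail (the `model` and `tokenizer` here are hypothetical stand-ins, not any particular library's API):

```python
# A minimal sketch of what autoregressive decoding boils down to.
# `model` and `tokenizer` are hypothetical stand-ins, not a real library's API;
# the point is the shape of the loop, not the specifics.

def generate(model, tokenizer, prompt: str, max_new_tokens: int = 100) -> str:
    tokens = tokenizer.encode(prompt)           # prompt -> list of token ids
    for _ in range(max_new_tokens):
        scores = model(tokens)                  # the model's only job: score every candidate next token
        next_token = max(range(len(scores)), key=scores.__getitem__)  # greedy pick
        tokens.append(next_token)
        if next_token == tokenizer.eos_id:      # stop at the end-of-sequence token
            break
    return tokenizer.decode(tokens)             # token ids -> text
```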

Yeah, I kind of lost it. Sorry. Maybe it's time to take a break from AI threads for a bit.