
Small-Scale Question Sunday for January 4, 2026

Do you have a dumb question that you're kind of embarrassed to ask in the main thread? Is there something you're just not sure about?

This is your opportunity to ask questions. No question too simple or too silly.

Culture war topics are accepted, and proposals for a better intro post are appreciated.

What does everyone think of Eliezer Yudkowsky?

I just realized this website was born from /r/TheMotte, which came from /r/slatestarcodex, which came from Scott Alexander, who came from LessWrong, which was created by Eliezer himself. He's technically a grandfather of this community.

I for one think he's more influential than he's given credit for, and I consider myself lucky to have come across his writings in my younger days.

[context and genealogy]

He's... hard to talk about.

The critique has long echoed the old Samuel Johnson quote about being "both good and original; but the part that is good is not original, and the part that is original is not good" -- and the man has had a hatedom since before 2012, so it's been echoing for a while. Most of his more accessible scientific writing is 'just' repackaging well-established popsci in a more accessible form (sometimes without sufficient citations), while a lot of his predictive or even cutting-edge scientific analysis has to get suffixed with 'from a certain point of view' at best and 'ah yes, but' at worst. If anything, that's only become more true over time: both the Golden Age Sequences and HPMoR have long relied on some of the sociology research that's been found the most wanting under the replication crisis.

Yudkowsky's been moderately open about some of this stuff, and his pro-AI, AI-is-easy, AI-is-hard, anti-AI changes have been a part of his whole story. I like that more than the people insisting they've always been right. It's still not something everyone likes, or that he can do consistently. There's never been a good retrospective on how MIRI's output was so absolutely bad on both the academic paper and popular-reader sides for so long, or the time they had an employee embezzle (tbf, not an unusual thing to hit new non-profits), or yada yada.

But that's a bit of a victim-of-his-own-success thing. Yudkowsky can't claim the whole replication movement any more than he can claim the whole effective altruism one. He was at least in the general vicinity early enough that he can't be accused of jumping in front of the parade post hoc, though. "Map is not the territory" and "fake answers" might have been well-known and obvious before 2008, but it wasn't until after that anyone put them together to actually poke at the core tools we thought we were using to find deep truths about reality. And these movements have been a large part of why so many of the older posts have aged so poorly, though not the only part.

((Although he's also a weird writer to have had as big an impact as it seems he's had? The Sequences, fine: if a good blog should change people's minds, it's a good enough blog. Why is HPMoR a more effective AI Safety program than Friendship is Optimal? Why is The Sword of Good so much more effective than a lot of more recent attempts at its take?))

... but all that's kinda side stories, at this point. Today, if you care about him, it's the AI safety stuff, not whether he guessed correctly on Kahneman vs Elisabeth Bik, or even on neural networks versus agentic AI research.

Which gets messy, because, like Reading Philosophy Backwards, today all of his demonstrated successful predictions are incredibly obvious, his failed ones ludicrous-sounding, and only the ones we can't evaluate yet relevant. Why would anyone care about the AI Box experiment when corporations or even complete randos are giving LLMs a credit card and saying have fun? (Because some extremely well-credentialed people were sure that this sort of AI would be perfectly harmless if not allowed access to the outside world, even months after LLMs were being given credit card info.) Why would anyone be surprised that an AI might disclose private or dangerous information, if not told otherwise, when we now know LLMs can and do readily do those things? (Because 'the machine will only do what we program it to do' was a serious claim for over a decade.) Who could possibly believe that an LLM couldn't improve code performance? (Uh, except all the people talking about stochastic parrots today, who were convinced it was philosophically impossible for years before then.)

And the big unsolved questions are very important.

But in turn, that doesn't make his proposed answers better or useful. Say what you will for the ethos of singularitarian races, but at least they have something more credible than the 'you can't just tell people not to do something' guy telling people not to do something, and ultimately that's all that policies like an AI pause boil down to. The various attempts to solve morality have made some progress, despite my own expectations. It might seem like the difference between timeless decision theory and functional decision theory is just huffing fumes, but it does have some genuine benefits... and we have no way to implement them, and no way to validate or even seriously consider whether we're even looking at the most important measures. We don't know what the system they'd need to be implemented on looks like, it's speculative (though increasingly likely) that there will even be such a system, and it's not clear the people building that system will be interested in, or even aware of, the general AI safety issues.
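(For anyone who hasn't waded through the decision-theory posts: Newcomb's problem is the standard toy case where that family of theories earns its keep, so here's a rough Python sketch of it. The payoffs and the perfectly-accurate-predictor setup are the textbook ones; the function and variable names are just mine, and it only shows the FDT/TDT family against plain causal reasoning, not the finer FDT-vs-TDT distinctions.)

```python
# Toy Newcomb's problem: a predictor fills an opaque box with $1,000,000
# only if it predicts you will take just that box ("one-box"); a
# transparent box always holds $1,000.

def payoff(one_box: bool, predicted_one_box: bool) -> int:
    opaque = 1_000_000 if predicted_one_box else 0
    transparent = 0 if one_box else 1_000
    return opaque + transparent

# Causal decision theory treats the prediction as an already-fixed fact,
# so two-boxing dominates no matter what was predicted...
cdt_choice = max([True, False],
                 key=lambda one_box: payoff(one_box, predicted_one_box=True))
# ...but against an accurate predictor a two-boxer walks away with $1,000.

# FDT/TDT-style reasoning treats the prediction as the output of the
# same decision procedure you're running, so choice and prediction match.
fdt_choice = max([True, False],
                 key=lambda one_box: payoff(one_box, predicted_one_box=one_box))

print("CDT one-boxes?", cdt_choice)  # False -> $1,000 against an accurate predictor
print("FDT one-boxes?", fdt_choice)  # True  -> $1,000,000 against an accurate predictor
```

The 'genuine benefit' is that the one-boxing family handles cases like this cleanly where causal reasoning loses money; nothing about it tells you how to get that reasoning into an actual system, which is the complaint above.)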

So there are big unsolved questions that have largely been left unasked.

> But in turn, that doesn't make his proposed answers better or useful.

> but it does have some genuine benefits... and we have no way to implement them, and no way to validate or even seriously consider whether we're even looking at the most important measures.

Note that these are useful if you share the Yudkowskian view of neural nets. Specifically, the view that it is impossible to align a neural net smarter than you; "a technique, inventable before the Singularity, that will allow us to make neural-net ASI and not die" is a contradiction in terms. There are thus no "useful" answers, if you define "useful" as "works on neural nets".

In this paradigm, 100% of surviving worlds follow this two-point plan:

  1. Neural nets are totally abandoned until after the Singularity; they are banned in all countries (convincing everyone is hard; it's easier to convince enough nuclear powers, firmly enough, that the holdout countries get either occupied or obliterated).

  2. Non-doomed versions of AI research (e.g. GOFAI, uploads) continue.

The reason you need #1 is that #2 is going to take at least 50 years to hit the Singularity. The reason you need #2 is that #1 is only metastable, not actually stable; sooner or later, in a hundred years or a million, the Butlerian Jihad will break down, at which point everybody dies unless we've hit the Singularity in the meantime.

And hence, work on how to make non-neural-net AI work is necessary (if less urgent than stopping neural nets, on which point Yudkowsky is indeed currently focusing).