Like many people, I've been arguing about the nature of LLMs a lot over the last few years. There is a particular set of arguments that I found myself having to recreate from scratch over and over again in different contexts, so I finally put it together in a larger post, and this is that post.
The crux of it is that I think both the maximalist and minimalist claims about what LLMs can do/are doing are simultaneously true, and not in conflict with one another. A mind made out of text can vary along two axes: the quantity of text it has absorbed, which here I call "coverage," and the degree to which that text has been unified into a coherent model, which here I call "integration." At the extremes, a search engine is high coverage, low integration, while an individual person is low coverage, high integration, and LLMs sit somewhere in between. Most importantly, every point in that space is useful for different kinds of tasks.
I'm hoping this will be a more useful way of thinking about LLMs than the ways people have typically talked about them so far.
This is an excellent post that does a great job of tackling the problem in terms both sides of the argument can understand. Of course, the link included elevates it to greatness by association, but I won't say which, to encourage the reader to chase it down.
I've already bookmarked this, reported it as an AAQC, and would go so far as to say you should cross-post it to LessWrong. Even if some aspects aren't entirely novel to a well-read audience, the concise and clear packaging and presentation make it something worth sharing.
I'd hope LessWrongers have at least read Moulton themselves (surprised to see him here). "Why Deep Learning Works Even Though It Shouldn't" is a classic.