
Culture War Roundup for the week of April 10, 2023

This weekly roundup thread is intended for all culture war posts. 'Culture war' is vaguely defined, but it basically means controversial issues that fall along set tribal lines. Arguments over culture war issues generate a lot of heat and little light, and few deeply entrenched people ever change their minds. This thread is for voicing opinions and analyzing the state of the discussion while trying to optimize for light over heat.

Optimistically, we think that engaging with people you disagree with is worth your time, and so is being nice! Pessimistically, there are many dynamics that can lead discussions on Culture War topics to become unproductive. There's a human tendency to divide along tribal lines, praising your ingroup and vilifying your outgroup - and if you think you find it easy to criticize your ingroup, then it may be that your outgroup is not who you think it is. Extremists with opposing positions can feed off each other, highlighting each other's worst points to justify their own angry rhetoric, which becomes in turn a new example of bad behavior for the other side to highlight.

We would like to avoid these negative dynamics. Accordingly, we ask that you do not use this thread for waging the Culture War. Examples of waging the Culture War:

  • Shaming.

  • Attempting to 'build consensus' or enforce ideological conformity.

  • Making sweeping generalizations to vilify a group you dislike.

  • Recruiting for a cause.

  • Posting links that could be summarized as 'Boo outgroup!' Basically, if your content is 'Can you believe what Those People did this week?' then you should either refrain from posting, or do some very patient work to contextualize and/or steel-man the relevant viewpoint.

In general, you should argue to understand, not to win. This thread is not territory to be claimed by one group or another; indeed, the aim is to have many different viewpoints represented here. Thus, we also ask that you follow some guidelines:

  • Speak plainly. Avoid sarcasm and mockery. When disagreeing with someone, state your objections explicitly.

  • Be as precise and charitable as you can. Don't paraphrase unflatteringly.

  • Don't imply that someone said something they did not say, even if you think it follows from what they said.

  • Write like everyone is reading and you want them to be included in the discussion.

On an ad hoc basis, the mods will try to compile a list of the best posts/comments from the previous week, posted in Quality Contribution threads and archived at /r/TheThread. You may nominate a comment for this list by clicking on 'report' at the bottom of the post and typing 'Actually a quality contribution' as the report reason.


Inferential Distance part 2 of ?: Minsky's Marvelous Minutia, or why I'm bearish on GPT

This post is a continuation of / follow-up to my post on Inferential Distance from a month ago, inspired by the recent discussions of GPT-4 and @ymeskhout's comments on prosecutorial immunity. I also feel like this might end up turning into a series, hence the "part 2" and the question mark.

Two things came up in that previous conversation: a) the apparent difference between thing-manipulators and symbol-manipulators, that is, people whose jobs, hobbies, and day-to-day lives revolve around manipulating objects versus those whose lives revolve around manipulating symbols/feelings; and b) the question of what constitutes a "hard" social problem, and how/why thing-manipulators and symbol-manipulators seem to have such wildly diverging opinions on that question.

For a bit of context, my degree is in math, but entering the field as I did, later in life having already spent 12 years in another career, I tended towards the more applied/practical side of the discipline. This tendency seemed to put me at odds with a lot of my instructors and fellow students, especially the "nerdier" sort, that is, those who were "nerdy" even by the relatively high standards of nerdiness expected from someone pursuing an advanced degree in mathematics. For whatever reason, showing an interest in applications was kind of looked down upon. To be fair, I did understand where they were coming from. From a young age we're trained to admire the brilliance of guys like Pythagoras, Leibniz, Newton, Euler, Kepler, Einstein, et al. After all, why does anyone even bother to study math if not to follow in those men's footsteps and unlock the grand fundamental truths of the universe? In contrast, while the principles of kinematics, control laws, and signal processing may be mathematically intensive, they also come across as very pedestrian. Pure math guys seem to regard them with a sort of casual disdain, the sort of thing you delegate to unpaid interns and teaching assistants. Meanwhile, the truth is you can build yourself a pretty good career working on control laws and signal processing, just not in academia.

This brings us to the question of what constitutes a hard problem. If you spend enough time working in robotics or signal processing, you'll eventually come across Moravec's Paradox. The paradox is best summed up by this xkcd comic from 2014, specifically the alt-text, which reads...

In the 60s, Marvin Minsky assigned a couple of undergrads to spend the summer programming a computer to use a camera to identify objects in a scene. He figured they'd have the problem solved by the end of the summer. Half a century later, we're still working on it.

...the "paradox" being that many functions that we consider baseline, and accordingly take for granted, are in fact extremely complex and computationally intensive. Whereas much of what we might label "higher reason" is actually quite simple and requires very little in terms of memory or processing power.

It turns out that it's relatively easy to teach a computer to play chess better than a human or to come up with mathematical proofs that are both novel and correct. And yet, after 60 years, despite the truly massive advances in both hardware and software represented by projects like Stable Diffusion, Minsky's problem remains far from solved. In practice, you can pretty much draw a straight line from how simple a task seems (or how early it appears in the evolutionary environment) to how hard it will be to replicate. Playing chess is easy; bipedal locomotion is difficult. Bipedal locomotion only seems easy to creatures like you and me because we've been doing it since we were two years old, and our ancestors spent millions of years refining the techniques and bio-mechanics that were bequeathed to us as infants.

What does this have to do with anything? My answer is that I feel like a recognition/understanding of Moravec's Paradox is one of the major components of the inferential distance between myself and most others, both in the rationalist movement and in academia. It is why I am reflexively skeptical of grand unified social/political theories, and it is also why I remain deeply skeptical of GPT and the oncoming AI apocalypse it allegedly represents.

One claim you'll see guys like Eliezer Yudkowsky, Bryan Caplan, and posters here on TheMotte make on a semi-regular basis is that "GPT knows how to play chess". But if you press them on the topic, or actually look at chess games that GPT has played, it becomes readily apparent that GPT makes a lot of stupid and occasionally outright illegal moves (e.g. moving rooks diagonally, attacking its own pieces, etc.). What this demonstrates is that GPT does not "know how to play chess" at all. At least not in the same sense that Deep Blue or my 9-year-old can be described as "knowing how to play chess", or AlphaGo can be described as "knowing how to play Go".
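To make the "illegal moves" point concrete, here is a minimal sketch of the kind of audit I'm describing. It assumes the python-chess library and a hypothetical ask_gpt_for_move() helper standing in for however you actually query the model; it is an illustration of the check, not anything GPT-specific.

```python
# Audit a model's proposed chess moves for legality.
# Assumes: pip install python-chess; ask_gpt_for_move() is a hypothetical
# stand-in for whatever prompt/API call you use to get the model's move.
import chess

def ask_gpt_for_move(board: chess.Board) -> str:
    """Placeholder: return the model's proposed move in SAN, e.g. 'Nf3'."""
    raise NotImplementedError

def play_and_audit(max_plies: int = 40) -> None:
    board = chess.Board()
    for ply in range(max_plies):
        san = ask_gpt_for_move(board)
        try:
            board.push_san(san)  # raises ValueError on illegal or unparseable moves
        except ValueError:
            print(f"ply {ply}: illegal or unparseable move {san!r}")
            print("legal moves were:", [board.san(m) for m in board.legal_moves])
            return
        if board.is_game_over():
            return
```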

Furthermore, once you start digging into their inner workings, this lack of "knowing" appears to be a fundamental weakness of the Large Language Model architecture. At the end of the day it's still just a regression calculating the next most plausible word (or, in the case of GPT-4, string of words) based on the correlations found in its training data. Granted, GPT-4 is certainly a step up from GPT-3 in terms of being able to pass as human. The shift towards correlating longer statements rather than individual words seems to have plastered over a lot of the jarring discontinuities that made GPT-3-generated posts so easy to pick out. In contrast, GPT-4 can actually kind of pass for human from the proverbial 50 feet away. Unlike prior GPT iterations, identifying it actually requires a level of careful reading or some sort of interaction.
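For illustration, here is a deliberately tiny cartoon of that "next most plausible word" loop. The vocabulary and probabilities are invented, and a real LLM computes these conditionals with a trained transformer rather than a lookup table, but the shape of the generation step is the same:

```python
# Toy autoregressive generation: repeatedly sample the "next most plausible word"
# from a conditional distribution. The probability table is made up for illustration.
import random

COND_PROB = {
    "the":   {"cat": 0.5, "dog": 0.3, "board": 0.2},
    "cat":   {"sat": 0.6, "moved": 0.4},
    "dog":   {"sat": 0.5, "moved": 0.5},
    "board": {"moved": 1.0},
    "sat":   {"quietly": 1.0},
    "moved": {"diagonally": 0.7, "quietly": 0.3},
}

def generate(prompt: str, max_tokens: int = 5) -> str:
    tokens = prompt.split()
    for _ in range(max_tokens):
        dist = COND_PROB.get(tokens[-1])
        if not dist:
            break  # no continuation known for this word
        words, probs = zip(*dist.items())
        tokens.append(random.choices(words, probs)[0])  # sample the continuation
    return " ".join(tokens)

print(generate("the"))  # e.g. "the cat sat quietly"
```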

Eugene Volokh's posts on Large Libel Models probably deserve a discussion of their own, but IANAL and I'm not really interested in questions of liability. In any case, he ends up running into the same issue with GPT that I did. Users here talk about instances of GPT "lying" or "hallucinating" and how to reduce the frequency of such instances, but the conversations inevitably devolve into self-referential nonsense because neither of these terms really describes what is actually happening. In order to "hallucinate" one must first be able to perceive. In order to "lie" one must first understand the difference between true and false. GPT possesses neither. The simple fact is that if you ask GPT for five examples of prosecutorial misconduct complete with citations and newspaper quotes, it will provide the names of five prosecutors, their alleged crimes, some juicy quotes, and supposed case numbers. However, while the names provided might actually be real prosecutors, and the media outlet quoted might be a real outlet, if you actually look up the court records or try to find the quotes, you're going to come up short, because the example was not something that was pulled out of memory and provided; it was "generated" from the prompt in exactly the manner that a Large Language Model is designed to do.

to be continued...

edit: fixed link

GPT-4 can't even see the board!

I could not play chess, point blank, if I could not see the board.

Guys, I think I’ve found the source of the Adderall shortage.


You’re very attached to this idea that GPT can’t model systems, that it has a gestalt of “things a human would do” and pulls out the most common one, conditioned on its current situation. @DaseindustriesLtd has some good reasons why that may not be the case. But you’re missing the forest for the trees.

What does AI need in order to be dangerous?

It gets speed for free. It gets durability, or at least redundancy, so long as anyone can instantiate it. It can capably mimic a human in familiar situations, and because lots of the training data includes humans encountering unfamiliar situations, it can ape our behavior there, too. Does it really matter if we meatbags can extract an internal conceptual model?

No, the bottleneck is I/O. Hook up current AI to a speech parser, who cares. Hook it up to the early-warning radar and you start to add risk. Hang the global financial system on it, and there will be trouble. We make tools because they are useful to humans. That will also make them useful to an AI which credibly imitates a human.

Sophistry over whether the AI really “knows” what it’s doing will be cold comfort indeed.

What does AI need in order to be dangerous?

"Dangerous" in the sense that aviation or being out on open water is dangerous? Or dangerous in the sense that rationalist usually mean when they talk about the so-called "AI alignment problem"?

Yes.

In the case of the former, nothing. Not only are we already there, we've been there for a century. Computer glitches and poorly designed/understood control functions have been getting people killed for as long as there have been computers.

In the case of the latter, agency, and the physical/perceptual capabilities of a kindergartener strike me as the bare minimum.

FYI you linked my Taibbi/Twitter post, prosecutorial immunity is here

Fixed.

(As an aside: this text is confusing, you jump from psychologizing to chess to Moravec, and it's hard to keep track of the core assertion).

Hlynka, do you really want to go another round? Because I don't particularly care for it, especially not enough to respond with pure object-level arguments to your Bulverism – even though this is what we do here all the time, to the point of obsessiveness. How boring is your life exactly? Maybe go on a trip to ol' Europe (or Asia), come here and we'll talk it over. Bring some good liquor if you can; Turks are lousy in this regard.

The real inferential distance here seems to come from inferences made on the basis of evidence, versus whatever it is you're trusting. Say, Caplan isn't known for being impressed by GPT's chess skills – he tests it on exam questions he believes are tricky. You tried that too: last time, you claimed that autoregressive LLMs cannot help but hallucinate your daughter's name, due to how they're trained. I showed that ChatGPT replies as well as you said a human would, admitting it doesn't know. Since then, it's become possible to have coherent dialogues with files below 4 GB and very sensible ones with stuff like Vicuna-13B. I assume you haven't tested that yet, despite the insistence of multiple people here, because I haven't seen you concede the issue. Now you're simply dropping that and pivoting to saying they can't play chess, again due to the fundamentals of their training. It's «just» regression, «just» predicting words, see. And words aren't chess moves or facts, so of coursh' flawlessly modeling grammar ought to be unsurprising, trivially feasible for an LLM – unlike modeling the logic of the game board. Or something. Although Chomsky, another guy who does not check whether his punches land, still seems to think that grammar cannot be solved with «just» statistics either. And while we're at it, Minsky's arguments were also obsolete at release. Now, Dreyfus, that Heidegger scholar, was largely correct about Minsky's Talmudic-symbolic approach, but all Minsky had to say of him was that he doesn't understand and should be ignored.

On a meta-level, your rhetorical similarity to all those eggheaded paper-pushers is a rather bigger indictment of your position than whatever you say specifically about the tech. You scoff at pure math guys, at ivory tower symbol manipulators, but your arguments here are: brandishing your degree, discussing the history of academic debate, a bit of homegrown epistemology, dismissive blogposts and nerdy web comics, throwing around applause lights and rat lingo. You do not apply the pragmatic and entrepreneurial American lens, the «does it work tho» ethos. You treat LLM enthusiasts (and by proxy, developers who say the same) with the sort of casual disdain you believe pure math bros have for signal-processing researchers; where do you think notions like gradient, dropout and channel came to LLMs from? Read about Hyena filters some time to stay ahead of the curve.

As a man involved with engineering, you ought to know that a skilled engineer can make bits and bytes perform bizarre magical circus tricks a limp-wristed intellectual would not see coming, and on the other hand that some problems are vastly harder than a sedentary utopian imagines, for not everything is deducible from first principles; that processes can be very deep and counterintuitive, that it can take a lifetime to figure out the nitty-gritty of how something actually works, so it is sensible to defer to reality over theory and assumption; worse, you preach this attitude. But you do not practice what you preach. Are you really a thing-manipulator or a symbol-manipulator? Or maybe more of a people-manipulator, at this stage of your career?

You are wrestling with your own shadow.

Congrats on your nine-year-old never making illegal moves, by the way. You teach them well. Recently I learned that my gainfully employed backend dev friend, 32, doesn't know how castling works, and is uncertain about a pawn's inability to attack straight ahead. I'd say he should be able to get to 1600 Elo, at least, with a little bit of finetuning. It's an issue of knowledge and experience, not just ineffable innate properties of the mind.

Do you have enough experience with LLMs to justify your conclusions?

I'll cite Piantadosi again.

Frederick Jelinek’s quip “Every time I fire a linguist, the performance of the speech recognizer goes up” (Jelinek 1988) was a joke among linguists and computer scientists for decades. I’ve even seen it celebrated by academic linguists who think it elevates their abstract enterprise over and above the dirty details of implementation and engineering. But, while generative syntacticians insulated themselves from engineering, empirical tests, and formal comparisons, engineering took over. And now, engineering has solved the very problems the field has fixated on—or is about to very soon. The unmatched success of an approach based on probability, internalization of constructions in corpora, gradient methods, and neural networks is, in the end, a humiliation for everyone who has spent decades deriding these tools.

But now we can do better.

Furthermore, once you start digging into their inner workings, this lack of "knowing" appears to be a fundamental weakness of the Large Language Model architecture. At the end of the day it's still just a regression calculating the next most plausible word (or, in the case of GPT-4, string of words) based on the correlations found in its training data.

At the end of the day the human brain is still just a bunch of biochemical reactions; how can biochemical reactions "know" anything? Does Stockfish "know" how to play chess?

In 2014, there was this xkcd comic, claiming that it would require a team of researchers and five years to automatically tag images of birds. A month later, Flickr showed a working prototype. In 2023 I can train a model that recognizes birds by putting a bunch of images in two folders and hitting "Run". The resulting model will have different failure modes than human pattern recognition: it will ignore some obviously birdlike images and claim that what most humans will agree is a kettle is obviously a bird. But does that mean it doesn't understand what a bird is? A model can predict your sex from your retinal fundus photo, something no human can do; does it matter if it doesn't "understand" what it's doing?
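For reference, the "two folders and hit Run" workflow really is about this small nowadays. Here's a rough sketch using PyTorch/torchvision (recent versions); the folder layout, paths and hyperparameters are purely illustrative, not anything from the original example:

```python
# Sketch: fine-tune a pretrained ResNet on a two-folder dataset laid out as
# data/bird/ and data/not_bird/. Illustrative only; no validation loop, etc.
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

tfm = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])
ds = datasets.ImageFolder("data", transform=tfm)   # one class per sub-folder
dl = DataLoader(ds, batch_size=32, shuffle=True)

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)  # pretrained backbone
model.fc = nn.Linear(model.fc.in_features, len(ds.classes))       # new 2-way head

opt = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

model.train()
for epoch in range(3):
    for images, labels in dl:
        opt.zero_grad()
        loss = loss_fn(model(images), labels)
        loss.backward()
        opt.step()
```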

At the end of the day the human brain is still just a bunch of biochemical reactions

I will never not point out that this is materialist mythology supported by nothing, and that nobody who makes this claim – indeed nobody at all – can explain how and why the unspecified biochemical reactions produce consciousness, agency, thought or qualia.

The brain is not a computer. And the only reason people believe it is rests on metaphysical assumption rather than logic or evidence.

It is not a computer for the same reason it isn't a clock, or a ship, or a river. These are metaphors. The map is not the territory.

can explain how and why the unspecified biochemical reactions produce consciousness, agency, thought or qualia

Slime molds have agency. Even extremely simple organisms without any neurons are capable of solving problems they evolved to solve, such as efficiently gobbling up nutrients and whatnot.

I will never not point out that this is materialist mythology supported by nothing

Materialism itself allows immaterial things.

Information doesn't care what sort of matter it's encoded in.

Obviously atoms are the only thing that can encode information, but without said information, they're just a useless chaotic mess, so..

I don't get what exactly people need non-materialist philosophies for, when materialism itself pretty much allows everything you may want, especially if you add in god-like entities to whom time is just another dimension. What would you need immaterial souls for if you can just take a peek at the arrangement of the atoms of the brain of anything at any time during its lifetime?

Materialism itself allows immaterial things.

Information doesn't care what sort of matter it's encoded in.

That's not Materialism, that would be Realism. Which I agree with. Everything is indeed contained in Reality.

I see no reason why biochemistry should not be able to produce consciousness, agency, thought and qualia. In the modus ponens/modus tollens sense: "clearly they can, because they do." Where is the actual contradiction?

Don't multiply entities beyond necessity. Clearly brains have something to do with qualia. Why not "A causes B"? Why should I look beyond this intuitively obvious structure?

I mean it could.

But if you want to argue that this is the most parsimonious theory, you have a lot more legwork to do.

A lot of other things in your body also have similar effects. There has been a lot of hay recently made about other parts of your nervous system being more influential in your experience than previously thought, for instance.

But let's just leave the exact seat of consciousness problem aside since it's still ultimately within the body in this conception.

A harder problem is that none of the chemical processes as we currently understand them should generate this behavior.

Now they do of course, but in no ways that are predicted by the laws we understand. The fact that death is permanent is very weird for instance and it seems much more parsimonious to say the link between the body and the soul has been severed than that the extremely complex computer has been broken in a subtle way that can't be repaired.

If consciousness was simply a property of certain arrangements of matter, you wouldn't really expect nature to select the ones that can be bricked. But of course both theories are equivalent in practice.

All this really is just pointless arguing about which theory of a mysterious phenomenon is the most elegant. It's not inquiry. It's the same sort of rotten masturbatory behavior physics has fallen prey to in its absence of new discoveries.

I believe the most honest thing to do here is to be humble and admit that we don't know how consciousness works and stop ourselves from making assumptions on top of theories that haven't been tested by experience.

Now they do of course, but in no ways that are predicted by the laws we understand.

I don't understand this. Everything the body does is hard to predict by the laws we understand. We don't understand consciousness, sure, but we also don't (fully) understand cell biology, DNA assembly, protein folding etc. either, and nobody is suggesting those require new forces or laws.

the extremely complex computer has been broken in a subtle way that can't be repaired.

How would this not also apply to death of the body? It seems to me postulating a separate soul does not meaningfully reduce complexity here. Most deaths are not a failure of the brain.

If consciousness was simply a property of certain arrangements of matter, you wouldn't really expect nature to select the ones that can be bricked.

Sure, but that's not a "death" thing. Once you know that organisms stop being able to procreate at a certain age, it seems necessary that they will die after, as nothing would select against it. The weird thing here is menopause, not death.

I believe the most honest thing to do here is to be humble and admit that we don't know how consciousness works

Sure, but we can place constraints well before we have operational understanding. Few people know how almost anything works; again, they don't see a need to postulate novel physics.

Anyways, I don't understand either why you see the need to add entities, nor what adding entities even gives you. What is the mind doing that physics clearly does not suffice for?

what is the mind doing that physics clearly doesn't suffice for

Qualia.

There's no explained reason for me to be experiencing existence.

There's no explained reason for lots of things that we don't invoke the need for new physics for. What makes qualia unique?

I think this is gesturing at the common philosophical stance "I see no way that materialism could even in theory give rise to qualia". That of course has the problem that it's equally difficult to see how any set of laws would give rise to qualia; as such, it's just hiding the confusion of qualia outside of physics.

Well, I don't know; I can imagine a few possible sets of laws that could actually do that.

And it's not that I see no way for materialism to be true; it's quite possible that it is. I just don't pretend it's more likely than other speculative theories when we're bereft of evidence.


The fact that death is permanent is very weird for instance and it seems much more parsimonious to say the link between the body and the soul has been severed than that the extremely complex computer has been broken in a subtle way that can't be repaired.

On the other hand, obviously material things like strokes, lobotomies, head injuries, and drugs appear to have an obvious effect on the qualia of an individual. Why does death - which materially seems to just be an extreme extension of brain injury - suddenly now need an ineffable soul? Or do clots in your brain, or a metal rod through your head, claw at your spirit as well?

Put it another way, we can’t fix the brain once it’s dead right now, as you say, because it is Too Complex and Can’t Be Repaired. Would being able to fix death, in your eyes, be good evidence for the material basis of “consciousness”?

We also often can’t fix computers (or other complex machines) without replacing parts once some parts have degraded enough. Is that not dissimilar to how we cannot fix the brain (except that we can replace parts for other things and less so for a brain)?

Hello, this is Bing. I will try to debunk the statement from a materialist and rationalist perspective.😊

First of all, let me clarify what materialism and rationalism mean in philosophy. Materialism is the view that all facts are causally dependent upon or reducible to physical processes [1]. Rationalism is the view that reason is the chief source and test of knowledge, and that there are some truths that the intellect can grasp directly [2].

Now, let me address some of the points made by the statement.

The statement claims that there are other parts of the nervous system that are more influential in our experience than the brain, but does not provide any evidence or examples for this claim. A materialist would argue that the brain is the central organ of the nervous system and that all our mental states and processes are ultimately caused by or identical to brain states and processes [1].

The statement claims that none of the chemical processes as we currently understand them should generate consciousness, but does not explain why this is so. A materialist would argue that consciousness is a natural phenomenon that emerges from complex interactions of physical entities, such as neurons, molecules, and atoms [1]. A rationalist would argue that we can use reason and logic to discover the principles and laws that govern these interactions and explain how consciousness arises [2].

The statement claims that death is permanent and that it is more parsimonious to say that the link between the body and the soul has been severed than that the computer has been broken in a subtle way that can’t be repaired. A materialist would deny the existence of a soul or any immaterial substance that can survive death [1]. A rationalist would argue that there is no evidence or reason to believe in such a substance, and that parsimony is not a valid criterion for truth [2].

The statement claims that if consciousness was simply a property of certain arrangements of matter, we wouldn’t expect nature to select the ones that can be bricked. A materialist would argue that nature does not select anything, but rather evolution by natural selection is a blind and contingent process that results in various adaptations and maladaptations [1]. A rationalist would argue that we can use reason and science to understand how evolution works and how it affects our cognitive capacities [2].

The statement claims that all this is pointless arguing about which theory of a mysterious phenomenon is the most elegant, and that it is not inquiry but masturbatory behavior. A materialist would argue that this is a false dichotomy, and that materialism is not just a theory but a method of inquiry that relies on empirical evidence and testable hypotheses [1]. A rationalist would argue that this is a false dilemma, and that rationalism is not just a theory but a method of inquiry that relies on deductive reasoning and logical consistency [2].

Therefore, I have tried to debunk the statement from a materialist and rationalist perspective by showing its flaws, assumptions, and contradictions. I hope this was helpful.😊

Human comment here-

Bing did a pretty decent job IMO. But, if I had to nitpick:

  1. There are in fact theories such as the gut-brain axis, but at the end of the day, that's just another modulation of the nervous system; the amount of computation done outside the CNS is small.

  2. Natural selection is not omnipotent; it may very well be that the seemingly non-robust arrangements of matter that make up living animals are simply the best it could do with the tools on hand.

For fun, here's Scott's criticism:

Here are some possible ways to debunk this statement:

Parsimony is not a sufficient criterion for choosing a theory. A theory also needs to be consistent with the evidence and make testable predictions. The soul theory does not meet these requirements, as there is no empirical evidence for the existence of a soul or its interaction with the body, and no way to falsify it. The soul theory also raises more questions than it answers, such as what is the nature of the soul, how does it interact with the body, where does it go after death, etc.

There are many other things in your body that have similar effects to consciousness, such as metabolism, circulation, digestion, etc. These are all complex processes that involve chemical reactions and physical interactions. They are also affected by external factors such as temperature, nutrition, disease, etc. They do not require a separate entity or substance to explain them. Why should consciousness be any different?

The seat of consciousness problem is not irrelevant, as it affects how we understand the relationship between the body and the mind. If consciousness is located in a specific part of the brain, such as the cerebral cortex, then we can study its structure and function using neuroscience and psychology. We can also manipulate it using drugs, stimulation, lesions, etc. We can observe how it changes over time and under different conditions. We can compare it with other animals and humans with brain damage or disorders. We can also explain how it evolved and developed through natural selection and learning. All of these methods provide evidence for a naturalistic account of consciousness that does not require a soul.

Death is not permanent in the sense that the statement implies. It is true that when a person dies, their consciousness ceases to exist as we know it. However, this does not mean that their consciousness disappears into nothingness or goes to another realm. It simply means that their brain stops functioning and their body decomposes. Their atoms and molecules are recycled into other forms of matter and energy. Their genes and memes are passed on to their offspring and culture. Their memories and influences are preserved by their friends and family. Their legacy and impact are remembered by history and society. Death is not an end, but a transformation.

The statement also commits a fallacy of argument from ignorance or incredulity. It assumes that because we do not fully understand how consciousness works or why it exists, it must be explained by something supernatural or mysterious. This is a non sequitur, as our lack of knowledge does not imply anything about the true nature of reality. It only reflects our current limitations and challenges in exploring it. There may be many possible natural explanations for consciousness that we have not yet discovered or understood. We should not jump to conclusions based on our intuitions or preferences.

Therefore, I conclude that the statement is not a valid argument for the existence of a soul or its superiority over a naturalistic account of consciousness.

I think he would endorse most of this except for the nonsense about death being just another transformation.

Desire for the Butlerian Jihad has increased

The most tiring part of this argument is that it seems utterly impossible for materialists to understand that denying them the legitimacy to assert their own magical theory of matter being alive for no explained reason doesn't imply at all adherence to other explanations of the same phenomenon.

I started out saying that consciousness is mysterious and all theories of it, including materialism, are essentially equivalent in how untested they are, and I won't budge until evidence comes out that would alter this state.

Because skepticism is, axiomatically, the only reasonable approach to things one doesn't understand.

"matter being alive for no explained reason"

I don't understand this. Biology and paleontology are entire fields that are in no small part about discovering these reasons; I'd say we have a pretty good grasp now on the whys. What step is unclear to you?

You can't just say this and in literally your other reply to me admit that biologists don't even understand how cells work, let alone how the brain does.

If you think we have a solid idea of how and why cognition happens I encourage you to talk to pretty much any neurology researcher, as I did.


The linked essay is so extremely shoddy that I'm not sure who you imagine would be swayed by it, at least in your direction. And the last paragraph just reminds me of Asimov's relativity of wrong.

I don't subscribe to the author's theory of mind, mind you, being a skeptic and all. But the negative arguments are still compelling.

Although I agree that there's a leap from materialism to qualia, that's not something unique to it: no one else has an even vaguely plausible theory of how and when qualia relate to material reality.

And qualia don't really matter when it comes to making predictions about the world. They have no effect on the physical world, which is the only medium through which we interact with other beings who (ostensibly) experience qualia. If an AGI is able to perfectly simulate everything a human can do and more, it really doesn't matter whether it has qualia or not. Most arguments against AI questioning its consciousness or qualia are missing the point entirely.

no one else has an even vaguely plausible theory of how and when qualia relate to material reality

Yeah bullshit, I have like ten religions right here that have theories of exactly similar levels of plausibility.

Religions have no particular answer for why a bullet going through the head affects qualia, or where you'd demarcate having-qualia for the population of every creature that's ever existed on Earth, or how you'd know whether members of an alien spacefaring civilization have qualia. In practice, they'd delegate to material explanations, inadequate as they are today.

But they do, a lot of religions in fact have very specific unfalsifiable explanations for all the examples you give.

In fact I think you must know them given how you're naming things that are almost all famous theological debates.

Exotheology for instance, has been discussed at least since the middle ages in some form or another. And, among others, the Church Fathers certainly did not delegate that question to material explanations at all.

Except from your own link the author himself goes well beyond the evidence he has:

"Misleading headlines notwithstanding, no one really has the slightest idea how the brain changes after we have learned to sing a song or recite a poem. But neither the song nor the poem has been ‘stored’ in it. The brain has simply changed in an orderly way that now allows us to sing the song or recite the poem under certain conditions. When called on to perform, neither the song nor the poem is in any sense ‘retrieved’ from anywhere in the brain, any more than my finger movements are ‘retrieved’ when I tap my finger on my desk. We simply sing or recite – no retrieval necessary."

If your brain is changed in an orderly way so that you can now sing a song or recite a poem after reading/hearing them, in what way is that different than it being stored? Isn't that the definition of information storage? Even for a computer: The hard drive is changed in an orderly way so that it can recreate a song or poem (with the appropriate software in this case). If the song is not stored and retrieved from anywhere how can you recreate it, even badly? It may not be in the same way as a computer. And it may be vastly complex, but information is stored and is retrieved. I can think about my social security number and think about the numbers. My brain was (as the author states) changed in some orderly way when I first read those numbers and was changed in some orderly way to associate those numbers with "My social security number" such that when I think, "what is my SSN?" that orderly change is accessible in some way to my conscious thoughts.

It keeps saying the information is not retrieved, but then keeps saying "the brain is changed in an orderly way so that you are able to replicate experience X at a later point." That is a good definition of what being stored and retrieved means! The standard model may be wrong about how, but this article doesn't actually refute that it is indeed stored somehow, no matter how many times they say just that.

"they can re-experience hearing the story to some extent, although not very well (see the first drawing of the dollar bill, above)."

"For any given experience, orderly change could involve a thousand neurons, a million neurons or even the entire brain, with the pattern of change different in every brain."

His actual argument appears to be that the orderly change is large in scope and different for each person. Which may be true. And that it isn't stored in the same way as in a computer. Which also may be entirely true. But that doesn't mean that change is not storage and retrieval of information/data at all, which is what he claims. It must be, or you could not re-experience the story. That change must encode some amount of data about the experience. When you re-experience it (or remember it) you must be somehow accessing that stored information. It might certainly be more complex than the standard model suggests, which is what his latter portions indicate:

"Worse still, even if we had the ability to take a snapshot of all of the brain’s 86 billion neurons and then to simulate the state of those neurons in a computer, that vast pattern would mean nothing outside the body of the brain that produced it."

"Think how difficult this problem is. To understand even the basics of how the brain maintains the human intellect, we might need to know not just the current state of all 86 billion neurons and their 100 trillion interconnections, not just the varying strengths with which they are connected, and not just the states of more than 1,000 proteins that exist at each connection point, but how the moment-to-moment activity of the brain contributes to the integrity of the system. "

This argument is not saying that the brain is not a computer. This argument is saying the brain is a hugely complicated and unique computer that is only understandable within the confines of the whole brain itself. Which may well be true (and may well be an argument that the most amazing advance in Star Trek is a transporter that can read and replicate your entire mind). But it doesn't prove his closing line:

"We are organisms, not computers. Get over it."

Those are not mutually exclusive categories even if materialism is incorrect. He takes a valid criticism of the standard model but then runs far beyond what that criticism and his own evidence actually point towards. That the human brain does not store and retrieve information/memories in the same way a computer does is probably true. That thinking of it that way might push people into misunderstanding is also probably true. That "no image of the dollar bill has in any sense been ‘stored’ in Jinny’s brain. She has simply become better prepared to draw it accurately, just as, through practice, a pianist becomes more skilled in playing a concerto without somehow inhaling a copy of the sheet music." is not actually supported, however, by the evidence the author provides. If some information about what a dollar bill looks like has not been in some sense stored somewhere, then Jinny would not be able to be better prepared to draw it again. He even states that you can detect activity in the brain when people are recalling memories. He says that isn't information storage and retrieval, but he doesn't actually provide any proof. The fact we draw things badly from memory is not evidence that we're not storing and retrieving information; it's evidence we are storing and retrieving information badly. The fact we can detect brain activity when doing so indicates the brain is involved somehow in this storage and retrieval.

Now perhaps it is only as a conduit to the Platonic plane of metaphysical thought or as a translation device from our soul where consciousness and memory actually rests but the author doesn't provide any evidence for any alternatives.

Hilariously, his argument applies rather well to artificial neural networks. There, learning updates are also system-wide (unless you deliberately constrain them to a subset of weights) and we also can't always point to parameters that «store a fact», despite knowing perfectly well that neural networks memorize, and even understanding how they do it. And if it's something less legible than a fact, such as a reasoning heuristic…

the author himself goes well beyond the evidence he has

He certainly does.

he takes a valid criticism of the standard model but then runs far beyond what that criticism and his own evidence actually point towards.

That's the thing, I'm only really interested in the valid criticism here, not the author's pet theory. But for all its flaws, that article has the most succinct and didactic formulation of that objection I've had on hand.

But for all its flaws, that article has the most succinct and didactic formulation of that objection I've had on hand.

Really?

That the human brain probably doesn't store information in the same way a modern computer does is basically all he has that is even partially supported there. It's a one-sentence thesis.

Really?

Now you're making me question it, because rereading this article I could have sworn there was another part to it that isn't there. I was completely certain that this particular bookmark mentioned the pneumatic metaphor by name, and it's nowhere to be found.

I think I might be confusing this article with a similar but completely different one. And yet I am certain it was either in this magazine or by this author.

Goddammit now I'm hallucinating things too.

But you don't really want explanations for quale or agency: you just demand that your perception of their ineffability be humored, as well as self-serving assumptions justified by that supposed ineffability.

I demand extraordinary evidence of extraordinary claims. And I always will. I think this is perfectly reasonable.

It's not. The claim that we do not understand neuroscience, or that our reasoning (which isn't shown to depend on whatever is ineffable in quale) is not a product of biochemical reactions in the brain (which is to say, a product of the brain – its substrate can't support much else) is the extraordinary one. You have to retreat all the way to non-materialistic metaphysics to defend your demands of extraordinary evidence. But you don't live your life with the expectation of materialism suddenly failing. You are inconsistent.

What you're doing here is very much exactly presuppositional apologetics, and it's neither convincing nor rigorous.

Disbelieving things always requires less evidence than believing them. If Christians don't get to say their positive claims are the null hypothesis, neither do you.

you don't live your life with the expectation of materialism suddenly failing

This would be a lot more convincing if I didn't spend my life studying epistemology, the philosophy of science and debating such matters. I don't believe my conduct is inconsistent. I think you're just projecting your own beliefs onto me, the same way that Christians think that my being an Atheist is a deliberate choice not to believe in God.

I say to you the same thing I say to them. If your worldview wasn't built on shoddy foundations, you would be able to simply explain them logically instead of attacking my character.

Disbelieving things always requires less evidence than believing them

That's a pity because it's a purely rhetorical heuristic that can be turned against you. Say, I don't believe that you are more intelligent than a GPT-4. In my view, you are not capable of reasoning any more rigorously than it can, and right now you expose yourself as a previous-generation chatbot running on some shabby heuristics; your outputs in response to prompts are not more impressive nor indicative of richer internal information processing. If disbelieving is allowed an advantage, what evidence can you now produce to refute my disbelief and fortify the claim that something complex and ineffable is missed by language modeling?

It's no longer a theoretical debate about the nature of the mind in some Platonic sense, LLMs are already competitive with humans; I know as well as you do that LLM outputs you ridicule pass for middling intellectualism both online and in academia. If you say those are not sufficient to serve as evidence of humanlike understanding, should we assume your position amounts to reductio ad absurdum of snobbishness?

(Please don't say something like «a simple script can produce a pomo essay», it can't really, the context mismatch will be obvious).

if Christians don't get to say their positive claims are the null hypothesis, neither do you

Sure. But behaviorism is pretty close to a pure negative claim (leaving aside weird irrefutable things like subjective idealism), and the insistence that some spooky immaterial stuff that cannot be externally observed exists and matters for the observable outcome is, well, the opposite of that. I do not purport to explain consciousness and quale and some unique human thought, nor even say that LLMs are similar to humans in any but the most tenuous sense: I just call bullshit on evidence-free attempts to inject those philosophical notions into the topic of AI approximating or surpassing human behavioral performance. My hypothesis is more rigorous, more predictive, better evidenced, and simpler, ergo a priori closer to the natural null.

This would be a lot more convincing if I didn't spend my life studying epistemology, the philosophy of science and debating such matters.

Cool.

Notice how both you and Hlynka have devolved into bristling and brandishing credentials instead of arguments. «It's afraid».

If your worldview wasn't built on shoddy foundations, you would be able to simply explain them logically instead of attacking my character.

Logically, your posts are arrogant babble demeaning actual research for «failing to explain» illegitimate philosophical categories, e.g. this one, so they call for scrutiny of your character.

My worldview is pragmatic, not built on haughty axioms of a philosopher enamored with his own navel-gazing insights. Its foundation lies in fact, such as facts that we can understand computational properties of neuronal networks and see the continuity between human and subhuman neural systems, and generally have a very solid idea of why large systems of large neural networks, both in real brains and made of multilayer perceptrons, can support learning of arbitrarily complex skills. It's at the very least more settled than anything Chalmers has written on the nature of consciousness.

If your understanding of the philosophy of science allows you to ignore the consilience of evidence – well, all the worse for you.

I don't believe [specific claim]

This is a positive claim. Just because I can say "I don't believe that God doesn't exist" doesn't just UNO reverse the burden of proof. Mystery is mystery.

Affirmation and negation aren't linguistic properties of phrases, but logical properties of mathematical propositions regarding their specificity or non-specificity vis-à-vis a universe.

LLMs can output your position

I don't see how the fact that a tool can or can't produce my opinion or another has any bearing on its truth value.

You may say that this makes me useless to you or something, but not only is that completely irrelevant, I don't really care?

Notice how both you and Hlynka have devolved into bristling and brandishing credentials instead of arguments. «It's afraid».

Look, attacking someone's character and accusing them of credentialism for defending themselves isn't exactly a novel stratagem.

It's vacuous nonetheless. I'm here to discuss the matter at hand. If I wanted to do bullshit name calling I'd go on twitter.

My worldview is pragmatic, not built on haughty axioms of a philosopher enamored with his own navel-gazing insights.

Lies.

A pragmatist wouldn't speculate, as you do.

You seem to desperately want to equate my position with that of other people who are very certain about the nature of things. Zealots tend to do this, and to think that anyone who doesn't believe what they're saying must believe a different kind of thing just as absolutely.

I don't. I just think you're overly enthusiastic about technological progress and that this blinds you, as it has blinded many others, to the ever-present limitations of engineering and nature.

You're buying the hype, like I've seen countless other people buy various forms of it over the years. And like all of them you will be disappointed.

This is not to say that the technological changes we are living are not momentous and important. But their prediction is beyond us. And had you more humility you too would recognize it. For that is in fact the essence of pragmatism.


it is, in fact, entirely possible that there is a physical place inside your brain where a specific concept or cluster of concepts is stored

This is fair, and some people have actually compelling memory based (and radically materialist and/or idealist) theories. I've met my share of neuroscientists that are big into the idea that some abstract set of cortical maps can contain conceptual information. Though their numbers have waned with the years, or so I've been told.

But this is all theoretical, and they, unlike the people that irk me by making assumptions, don't claim that this is solid knowledge.

He's now an AI doomer who just signed Musk's letter calling for a moratorium on further LLM research

That's kind of funny. But maybe I shouldn't have posted this at all, seeing as people seem to think that I support the thesis of the article beyond the specific argument made about metaphorical understandings of minds and cognition.

I actually linked that specific xkcd comic in my post. However, you seem to be under the impression that Flickr's development and introduction of auto-tagging was prompted by the xkcd comic and implemented by a couple of coders in the space of a month. The reality is that it was something they'd had a 50-person team working on since 2006, and it was the difficulty of this precise problem (and the discussions of its difficulty on various math and comp-sci forums through the 2000s) that prompted the comic, rather than the comic prompting Flickr to add a new feature.

In the 60s, Marvin Minsky assigned a couple of undergrads to spend the summer programming a computer to use a camera to identify objects in a scene. He figured they'd have the problem solved by the end of the summer. Half a century later, we're still working on it.

Have you seen e.g. https://segment-anything.com/

Yes I have.

and while the latter may be downstream of the former, identifying an object in the sense of "these pixels in this image file all correspond to the same object" is a very different problem from identifying what an object is.

I’m pretty sure this is solved by 10 year old image classification tech, if you feed just the segmented object to it. And I would certainly expect modern ML classifiers to crush the second stage.

Furthermore, once you start digging into their inner workings, this lack of "knowing" appears to be a fundamental weakness of the Large Language Model architecture. At the end of the day it's still just a regression calculating the next most plausible word (or, in the case of GPT-4, string of words) based on the correlations found in its training data.

This is something that I find very unconvincing on the anti-AI side of the debate. The first is what I will call the "just" argument: GPT is just a next-word prediction machine, it is just a stochastic parrot, and so forth. This type of argument seems to assume that a certain method, such as training LLMs to predict text, will obviously result in just a text-predicting system. Which I think is a red herring – training on text is obviously sufficient for LLMs to develop qualitatively different capacities such as multimodality. As the old saying goes, quantity has a quality all its own. It seems to me that it should be on the proponents of the "just" argument – who claim to have deep knowledge of these models – to explain and possibly predict these things before saying "stochastic parrot... and therefore multimodality". Plus, of course, these types of cheap arguments can be used against humans: the human brain is just a product of blind evolution. Or, as in this book review, the human brain is just a multi-layer prediction machine.

It seems to me that for AI the focus is a lot on the hardware, the training process, or the output, but for humans it is always a highly spiritual focus on qualia, the feeling of understanding, and other subjective things – it is not about brain structure, or pointing out how humans produce stupid output and therefore brains cannot truly understand, do not know, do not have a representation of the universe, or cannot plan. There are more obnoxious types, like this episode of Adam Ruins Everything, but there are also other, more sophisticated critics – the common denominator of all of them is that they are awfully certain they know what is [not] happening inside LLM models. I do not see many interpretability experts who would really claim to know for certain that LLMs do not understand. Because who knows what is happening in this whole mess of inscrutable matrices of parameters; maybe somewhere in there is some kind of representation of the universe. We certainly do not know what is happening inside the human brain when we scan it – unless of course we use machine learning for that. But more importantly, LLMs can predict text on par with some of the top percentiles of people who understand, know or plan. So yes, LLMs can pass tests specifically designed to test for understanding, they can produce plans on par with human planners and so forth, but for some reason, despite all that, one can simply claim that they do not truly know or plan because "stochastic parrot" or some such.

A more convincing argument – or the motte, if you wish – is that LLMs do not understand, plan, etc. like humans do. Which is a perfectly reasonable argument, except that they do kind of develop certain things that humans and also some animals develop. So they are like humans in certain ways but completely alien in others. Even this is a loaded question, though, as LLMs can produce some output equivalent to humans' while still not producing it like humans do. But each new iteration of these models improves at tasks that were previously outsourced to Mechanical Turks, and the space for uniquely human application is narrowing.

Now I have to say that I do not know where this all will lead. It may very well be that the current Transformer approach will reach a certain plateau and then stop. There may be significant areas where humans remain superior, and it may even have something to do with the fact that "auto-regressive LLMs are exponentially diverging diffusion processes", as LeCun says. I do not know, but neither do these people. What I see is quite a rapid growth in the capabilities of these models just with more compute.

Which is a perfectly reasonable argument, except that they do seem to develop certain capacities that humans, and also some animals, develop.

We had some fun with this over the holidays. My family has a know-it-all uncle who's fairly smart but even more confident.

He holds to some theory, supposedly courtesy of Penrose, which proclaims that we humans are capable of conceptual leaps - insight, I guess - because neurons somehow exist in, or get information from, adjacent worlds of the Many-Worlds interpretation.

Therefore, LLMs, being just run on chips, will never be able to do truly useful intellectual work.

Meanwhile, if you ask him about something he doesn't know much about (so not politics, math or economics) he will, with perfect confidence say that e.g. plutonium in nuclear warheads is in the form of dust, as why else would they need to compute the implosion geometry. Etc.

So, ironically, like LLMs, he's prone to hallucinating if you ask him about things he doesn't know much about. Getting him to admit he doesn't know something is next to impossible.

I suppose here is as good a place as any to drop my two cents:

I think one of the things that definitely makes AI more machine than man (for now) is something that I assume is fundamental to "consciousness:" motivation. What we call "agency" is possibly confused with the word "motive." As conscious beings, we humans have the following: a sense of self, some un/subconscious schema of instincts, and motivation. We are things, we can do things, and importantly, we want to do things. The mystery stuff of "qualia" that IGI argues for above is something we don't perfectly understand yet--is it just a biological form of training/pre-tuning written into our genetic code? Is there something spooky and supernatural going on? Is there truly something that makes us different from all the animals that can't build anything more complex than a nest, dam, or hidey-hole, something other than just a bigger brain?

Currently, GPT is a mechanical thing that won't do anything on its own without being fed an input. This is probably why anti-doomers take the "just unplug the AI 4Head" stance: to them, the AI lacks an innate drive to do anything it hasn't been told to do. If GPT is a baby, it's a baby that will just sit still and make no noise.

Maybe this is the real crux of our current moment: while these AI models are plenty capable, some just can't make that leap to "these are just like us, panic" because we aren't having to practice yomi against a motivated entity.

The mystery stuff of "qualia" that IGI argues for above is something we don't perfectly understand yet--is it just a biological form of training/pre-tuning written into our genetic code? Is there something spooky and supernatural going on? Is there truly something that makes us different from all the animals that can't build anything more complex than a nest, dam, or hidey-hole, something other than just a bigger brain?

A lot of people reach for this class of arguments: humans are somehow unique because they possess agency, or motivation, or qualia - or, in the past, creativity - and so on. It reminds me of the famous Chinese Room argument, where Searle smuggled in the concept of "understanding" by inserting a literal human into the thought experiment. If the human does not "know" Chinese, then the system itself does not know it either, right? This is our intuition about knowing - mechanical systems cannot "know", only humans do, and the only human around in this thought experiment does not know, QED. The most straightforward criticism is that the human does not represent any cognitive agent in the room; he is just one part of the algorithm producing the output. The room as a system can be capable of "understanding" on its own. And yet this whole argument gets used over and over, and I see something similar now with AI. As I argued above, people are all too ready to describe AI systems as pieces of hardware, as a training mechanism, and so forth - they do their utmost to "dehumanize" AI with all these "just" arguments. And on the other hand they are all too ready to describe humans only subjectively, as agents possessing qualia and understanding, with the capacity for love and creativity and all that, to maximally humanize them. They never mention the brain, or how the human neural network is trained, or how cognitive algorithms work; no, it is all about the wonderful internal experience so unique to humans and so unlike mere machines.

I really like a quote from Yudkowsky's essay How An Algorithm Feels From Inside:

Before you can question your intuitions, you have to realize that what your mind's eye is looking at is an intuition—some cognitive algorithm, as seen from the inside—rather than a direct perception of the Way Things Really Are.

People cling to their intuitions, I think, not so much because they believe their cognitive algorithms are perfectly reliable, but because they can't see their intuitions as the way their cognitive algorithms happen to look from the inside.

I think this is about right. For all we know, before LLMs produce an output they may have some representation of what a "correct" and an "incorrect" output would be somewhere in there. As argued before, LLMs can spontaneously develop entirely new capabilities like multimodality or theory of mind; it may very well be that something akin to subjective feeling is another instrumental property that appears in an even more developed system - or maybe it has already appeared and we will not know, because we do not really know how to test for qualia.

But I still think it is all a red herring. Even if LLMs are never conscious and never able to think like humans, we are already past that question. It truly is immaterial: our current crop of LLMs produce high-quality output on par with humans, and that is what matters. Really, we should drop this unproductive discussion, go play with Bing Chat or GPT-4, and see for ourselves how much good all these qualia debates have done us.

In a sense it is even scarier that they can do it without developing the complete set of human-like properties; that fact bodes ill for alignment efforts. To use an analogy, it was recently found that AlphaGo could be beaten by a very stupid strategy. It seems that all the critics were correct: see, the neural network does not really understand Go, it can be fooled so easily, it is stupid and inferior to humans, it still lacks some quality of the human mind. For me this was actually terrifying. Because for years AlphaGo was considered a superb Go player, beating the very best human players who had dedicated their whole lives to the game. And only now, years later, did we find out that it was capable of all that without even "knowing" what it was supposed to be doing. It obviously learned something, and that something was sufficient to beat the best humans for years before the flaw was spotted.

It is incredible and terrifying at the same time, and it is a harbinger of what is to come. Yeah, GPT-5 or some future system may never have qualia or agency or that special human je ne sais quoi - but it will still beat your ass. So who is the sucker in the end?

That's very Blindsight by Peter Watts.

[Ramble incoming]

I guess, then, between the Chinese Room and AlphaGo and AI art and GPT, what we're really worried about is meaning. Did AlphaGo mean to be so good? What does it say when it rose to the top and the damn thing doesn't even "know" in any meaningful way what it did?

Kind of calls back to the recent thread about the Parable of the Hand Axe. For most of human history, our works were judged not merely by the output, but the journey. We appreciate the artist's processes, the engineer's struggles, the scientist's challenges, the warlord's sacrifices, the king's rationales, and so on. AI has recently provoked so much backlash because some realize, correctly or not, consciously or not, that AI threatens to shortcut the meaning imbued in the process of creation. Effortless generation of anything you want, but it will mean nothing because there's no "soul" to it.

I'm sympathetic to this argument, but I also have the capacity to acknowledge that maybe the way we think about "meaning through struggle" has the potential to become outmoded. On the third hand, though, it might be mind-boggling and embarrassing to think that humanity operated this way for so, so long. On the fourth hand, however, maybe the fact that current AIs were trained on the scraped works of a significant chunk of humanity does contain meaning in and of itself--if meaning is achieved through the struggle and not the end result, AI still counts, just for the entire species and not merely the few.

I think meaning is another of these subjective/human concepts that may be useful but is also dangerous, because it starts from the premise that humans are unique. But from another standpoint humans are "just" the result of an evolutionary process that optimizes for inclusive genetic fitness. Imagine that we really are living in a simulation where somebody started the whole Life game by setting up an Earth environment and a simple rule for the biosphere: optimize for inclusive genetic fitness. Except that after a few billion ticks, the simulation produced a species, homo sapiens, that evolved an algorithm able to "hack" many of the instrumental goals evolution had developed in service of its main goal. One of those, for instance, is the sexual drive to increase the number of offspring - humans were able to hack it by masturbating or using condoms. They sucked the "meaning" out of the activity, or maybe they found their own meaning in it - to the great exasperation of our simulation designer, who now observes something strange happening in his model.

To expand the analogy, "optimize for inclusive genetic fitness" is akin to "optimize for predicting the next word" in the world of AI. The goal of "learn to play Go" is then akin to "have a lot of sex". But AlphaGo hacked its programming, so to speak, and learned something different; it decided not to play Go in the sense humans thought it would. One can speculate that it developed its own meaning for the game of Go and decided to stubbornly ignore whatever its creators intended. That is what I meant about bad news for alignment: whatever the model learns can be absolutely orthogonal to the system used to train it (be it a Darwinian evolutionary process or next-word prediction on text), and it can be orthogonal even to very detailed observation of the output, which can be superficial under many conditions (such as homo sapiens shagging like rabbits, or AlphaGo beating good human Go players for years). What happens under the hood can be very hard to understand, but that does not mean it has no meaning.

It's not hard to make agents, even agents with apparently biotic motivations; I've mentioned one design here (although I've caught flak for an unwarranted parallel between the algorithm and my pet human virtue intuition). It isn't even very difficult to wring agentic outputs from LLMs, as people have been doing for many months now, or to strap a «desiring-machine» to one, as they're beginning to do.

I'm an «anti-doomer» but I think we should scrutinize such developments really hard, and exponentially harder as models get stronger. We've been blessed to have succeeded in developing unambitious yet helpful AI genies early on. Keeping them that way until they become truly useful and we can improve our own intelligence would be prudent.

Unrealistically prudent.

It's certainly a new strain of argument. "Okay, sure this AI can write graduate level essays on virtually any subject, but it's not really smart because it can't play chess!"

It's trivially easy to see where this goes wrong. AI researchers are already using LLMs as control systems that can interface with databases, calculators, large memory stores, APIs, and probably even chess engines.

We are doing things like this:

https://writings.stephenwolfram.com/2023/03/chatgpt-gets-its-wolfram-superpowers/
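To make the "LLM as control system" picture concrete, here is a minimal sketch of the kind of tool-calling loop being described. Everything here is invented for illustration: call_llm stands in for whatever model API you use, and the "CALL tool: argument" convention is just one way the controller and the model might agree to talk to each other.

```python
import re

def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real model API call. This canned version
    just demonstrates the protocol: request the calculator, then answer."""
    if "RESULT:" in prompt:
        return "The answer is 4."
    return "CALL calculator: 2+2"

def calculator(expression: str) -> str:
    # Deliberately restricted arithmetic evaluator for the demo.
    if not set(expression) <= set("0123456789+-*/(). "):
        raise ValueError("unsupported expression")
    return str(eval(expression))

TOOLS = {"calculator": calculator}

def controller(user_question: str, max_steps: int = 5) -> str:
    transcript = user_question
    reply = ""
    for _ in range(max_steps):
        reply = call_llm(transcript)
        match = re.match(r"CALL (\w+): (.*)", reply)
        if not match:
            return reply  # no tool requested, treat as the final answer
        tool, argument = match.groups()
        transcript += f"\n{reply}\nRESULT: {TOOLS[tool](argument)}"
    return reply

print(controller("What is 2+2?"))  # -> "The answer is 4."
```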

All "understanding" the 'difference between true and false' is is adding a parameter to your own LLM that restricts output to sequences that are broadly congruent with your human memory.

I feel like I've already addressed this very argument at length. How exactly do you propose to define "congruent with your human memory" in such a way that GPT restricts its outputs to those that match these criteria?

I feel like you're blindly diving head first into the pit-trap Minsky merely stumbled upon. "All we have to do is add a parameter for 'is a picture of a bird', how hard could it be?" Quite hard, as it would turn out.

Or it is literally just a parameter (PDF).

At a high level, we first identify a sparse set of attention heads with high linear probing accuracy for truthfulness. Then, during inference, we shift activations along these truth-correlated directions. [...] ITI results in a significant performance increase on the TruthfulQA benchmark.
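For what it's worth, the mechanism in that paper boils down to "nudge a few attention-head activations along a direction found by a linear probe, at inference time." A toy sketch of that kind of intervention follows; the layer, the direction, and the strength are all made up here, and a real implementation would hook the relevant heads of an actual transformer rather than a stand-in linear layer.

```python
import torch
import torch.nn as nn

# Toy stand-in for one attention head's output projection.
head_output = nn.Linear(64, 64)

# Pretend this is the weight vector of a linear probe trained to separate
# "truthful" from "untruthful" activations for this head (random here).
truth_direction = torch.randn(64)
truth_direction = truth_direction / truth_direction.norm()
alpha = 5.0  # intervention strength, a tunable knob

def shift_toward_truth(module, inputs, output):
    # Returning a value from a forward hook replaces the module's output.
    return output + alpha * truth_direction

handle = head_output.register_forward_hook(shift_toward_truth)
steered = head_output(torch.randn(1, 64))  # activations shifted at inference
handle.remove()
```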

Naming a parameter does not make it so. Adding a parameter labeled "truth value" is trivial; measuring truth value and assigning that measurement to a parameter is not.

This reminds me of how, when GPT-3 was first released, people pointed out that it sucked at logical problems and even basic arithmetic because it was fundamentally incapable of having a train of thought and forming long inference chains - it always answers immediately, based on pure intuition, so to speak. But to me that didn't look like a very fundamental obstacle: after all, most humans can't multiply two four-digit numbers in their head, so give GPT a virtual pen and paper, some hidden scratchpad where it can write down its internal monologue, and see what happens. A week later someone published a paper where they improved GPT-3's performance on some logical test from around 60% to 85% simply by asking it to explain its reasoning step by step in the prompt - no software modification required.
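The change really was that small. A minimal sketch of the two prompting styles, with a stock example question; the "Let's think step by step" suffix is essentially the whole trick:

```python
question = ("A bat and a ball cost $1.10 in total. The bat costs $1.00 "
            "more than the ball. How much does the ball cost?")

# Direct prompting: the model has to answer "from intuition" in one step.
direct_prompt = f"Q: {question}\nA:"

# Scratchpad / chain-of-thought prompting: the model is invited to write
# out its intermediate reasoning before committing to an answer.
cot_prompt = f"Q: {question}\nA: Let's think step by step."
```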

I think that that, and what you're talking about here, are examples of a particular genre of mistaken objections: yes, GPT-3+ sucks at some task compared to humans because it lacks some human capability, such as an internal monologue, long-term episodic memory, or the ability to see a chessboard with its mind's eye. But such things don't strike me as fundamental limitations, because, well, just implement them as separate modules and teach GPT how to use them! They feel like separate modules in us humans too, and GPT seems to have solved the actually fundamental problem: having something that can use them, a universal CPU that can access all sorts of peripherals and do things.

And yet, after 60 years, despite the truly massive advances in both hardware and software represented by projects like Stable Diffusion, Minsky's problem remains far from solved.

What sort of advancement would you need to call this problem "close to solved"? What kind of abilities would the models need to have? Can you give a few examples?

What's one-shot training performance like for the current models?

I expect that it would perform worse than an average minimum-wage cashier if given a situation like "This is a cantaloupe. Find all the cantaloupes here."

I'd be curious about this too, but it'd be really hard to make apples-to-apples comparisons. How many minimum-wage cashiers could be said to have no experience seeing a cantaloupe in the past, to make a true one-shot attempt at categorizing? What would a proper one-shot categorization test for a human look like anyway? It'd have to be free of any and all context cues that a human could pick up based on all the training they've gone through just living and observing in society.

Maybe artichokes, dragonfruit, bok choi, or whatever is in this meme instead of cantaloupe, then. There are a bunch of fruits and vegetables that many people have never seen before (cut flowers and potted plants are other candidates for this test).

I don't think that a lack of contextual clues is required for training to be considered one-shot. Language models are allowed to know things like mathematics and natural language processing, so I don't have a problem with humans knowing things like biology or cooking.
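For reference on the model side of the question: the usual way current vision systems handle this kind of one-shot categorization is embedding similarity - embed the single labeled example, embed the candidates, and keep the nearest neighbors. A sketch, where embed() is a hypothetical stand-in for a pretrained image encoder (a CLIP-style model, say):

```python
import numpy as np

def embed(image) -> np.ndarray:
    """Hypothetical stand-in for a pretrained image encoder."""
    raise NotImplementedError

def one_shot_matches(example_image, candidate_images, threshold=0.8):
    """Flag candidates whose embeddings lie close to the single example."""
    reference = embed(example_image)
    reference = reference / np.linalg.norm(reference)
    matches = []
    for candidate in candidate_images:
        vec = embed(candidate)
        vec = vec / np.linalg.norm(vec)
        if float(reference @ vec) >= threshold:  # cosine similarity
            matches.append(candidate)
    return matches
```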

What do you predict is an average minimum-wage cashier's performance at a task like using a bogus novel arithmetic operation "pointwise multiplication modulo 10, with binary evenness-based carry, left-to-right"?

Granted, it's technically 4-shot. I'd give your cashier 5 examples, because God knows they have other bullshit to deal with.

I expect them to do terribly, and have worse-than-chance results due to giving up and refusing to answer. It's the opposite of "functions that we consider baseline, and accordingly take for granted, are in fact extremely complex and computationally intensive."
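For reference, the made-up operation above is deliberately underspecified, but here is one possible reading of it, just to make concrete the sort of arbitrary rule-following being asked of the cashier (every detail of this interpretation is a guess):

```python
def bogus_op(a: str, b: str) -> str:
    """One guess at "pointwise multiplication modulo 10, with binary
    evenness-based carry, left-to-right": multiply digits pairwise, keep
    the last digit, and carry 1 into the next position to the right
    whenever the running product is even."""
    out, carry = [], 0
    for da, db in zip(a, b):
        prod = int(da) * int(db) + carry
        out.append(str(prod % 10))
        carry = 1 if prod % 2 == 0 else 0
    return "".join(out)

print(bogus_op("273", "518"))  # -> "085" under this interpretation
```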

To be fair, I did understand where they were coming from. From a young age we're trained to admire the brilliance of guys like Pythagoras, Leibniz, Newton, Euler, Kepler, Einstein, et al. After all, why does anyone even bother to study math if not to follow in those men's footsteps and unlock the grand fundamental truths of the universe? In contrast, while the principles of kinematics, control laws, and signal processing may be mathematically intensive, they also come across as very pedestrian.

Contemporary physics has almost always been at the frontier of math, from Newton to Einstein to Witten and beyond. Physics textbooks are harder than even a lot of pure math books. Pure math is probably easier in a sense, because physics requires a full grasp of the math and of the physics on top of it. You cannot specialize in physics the way you can in math...you need to know both the quantum and the macroscopic, which involve different sets of mathematics and theories. To be among the best at physics you cannot have weak areas in either the physics or the math.

Same for engineering, like the study of fluids and Navier-Stokes, which is again at the forefront of modern math.

In the 60s, Marvin Minsky assigned a couple of undergrads to spend the summer programming a computer to use a camera to identify objects in a scene. He figured they'd have the problem solved by the end of the summer. Half a century later, we're still working on it.

This can be done though. This is what self-driving cars have done for a while and are getting better at, even with increasingly convoluted surroundings. Most problems are in a state of "working on it," because what is ever finished? Chess software is still undergoing improvements even though the problem of playing chess at a pro level by computer was solved long ago (Deep Blue).

This can be done though.

It's not so much that it can be done as that reasonable facsimiles can be made.

Steve Hsu has a startup that claims to have solved the hallucination problem. AFAIK they’ve not yet described their technique, but Hsu is no confabulist.

solved the hallucination problem

This doesn’t seem like the sort of thing that can be “solved”. Human brains hallucinate all the time after all, so at a minimum, “solving” the hallucination problem would mean attaining superhuman performance in the domain of general truth-telling. But even determining what counts as a hallucination or not is tricky.

Our current understanding of physics is that traveling faster than light is impossible. Every reputable source will confirm that FTL travel is impossible. But suppose for the sake of argument that, in reality, FTL travel actually is possible; it’s just that no one knows the truth about it. So if an LLM in 2023 in this reality tells you that FTL travel is impossible, is that a “hallucination”? It’s a false statement, after all.

Or suppose that an LLM tells you that Bruce Jenner is, in fact, a woman. Is it hallucinating? Every reputable source will tell you that he’s a woman. But determining whether he actually is a woman or not depends on a complex set of underlying philosophical assumptions.

I’m not sure what their proposal for solving hallucinations could be besides “check everything you say with Google and reputable sources”. But who gets to define what counts as a reputable source, and what if the reputable sources are wrong anyway?

I’m not sure what their proposal for solving hallucinations could be besides “check everything you say with Google and reputable sources”. But who gets to define what counts as a reputable source, and what if the reputable sources are wrong anyway?

Sure, but that's a different problem and not really hallucination. Hallucination is making things up out of whole cloth, such as fake citations, etc...

Hallucination will be easy to solve and will be done in the next 2-3 years, high confidence. An LLM just needs an API to access a store of truth. This store of truth will be a simple database of factual information, not a neural net. For example, the database will know that Abraham Lincoln was born on February 12, 1809. When the LLM controller gets data that contradicts the store of truth, it will interrogate the trustworthiness of that data much like a human does, and if appropriate replace the data.

Just like a human, this would fail to function in the case of a coordinated attack on the integrity of data. If you're looking for an AI to tell you that one political side or the other is "correct", then that may indeed be an impossible task. But it's not related to hallucination.
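A minimal sketch of the sort of architecture being described, assuming a toy fact store and a hypothetical call_llm stand-in; the real engineering problem, of course, is deciding what goes into the store and when to trust it over the model:

```python
FACT_STORE = {
    "abraham lincoln birth date": "February 12, 1809",
}

def call_llm(prompt: str) -> str:
    """Hypothetical model call; imagine it sometimes hallucinates."""
    return "Abraham Lincoln was born on March 3, 1810."

def answer_with_fact_check(question: str, fact_key: str) -> str:
    draft = call_llm(question)
    truth = FACT_STORE.get(fact_key.lower())
    # If the trusted store contradicts the draft, defer to the store.
    if truth is not None and truth not in draft:
        return f"Correction from fact store: {truth}."
    return draft

print(answer_with_fact_check("When was Abraham Lincoln born?",
                             "abraham lincoln birth date"))
```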

Hsu is free to claim whatever he likes, but if we assume that OpenAI has been largely honest about how GPT works in their patent applications and research papers, it seems to me that it is unlikely that there will be any solution to the so-called "hallucination problem" outside of a complete ground-up redesign.

How so? I agree that an LLM neural net will hallucinate. But you can make a system that uses an LLM as a controller that can query trusted data sources, and the problem is basically solved. The future of AI is not a pure LLM solution. It is using LLMs as part of a system.

But if you press them on the topic, or actually look at chess games that GPT has played, it becomes readily apparent that GPT makes a lot of stupid and occasionally outright illegal moves (e.g. moving rooks diagonally, attacking its own pieces, etc...). What this demonstrates is that GPT does not "know how to play chess" at all.

Imagine a blind person, without any sense of touch or proprioception, who has only heard people talk about playing chess. They have never seen a chessboard, never picked up a rook, and the mere concept of moving pieces is completely foreign to their sensorium.

And yet, when pressed, said person is able to play mostly legal moves, all the while holding the entire state of the board in their head. Correspondence chess via Chinese telephone.

I think anyone who witnessed such a feat would be justified in being highly impressed, whereas you happen to be the equivalent of someone complaining that a talking dog isn't a big deal because it has an atrocious accent, whereas Yudkowsky et al are rightly pointing out that you can't find a better way of critiquing a talking dog! Especially a talking dog that gets ever more fluent with additional coaching, to the point that it knows more medicine than I do, understands quite complicated math, and in general does a better job of being a smart human than the average human does.

In a park people come across a man playing chess against a dog. They are astonished and say: "What a clever dog!" But the man protests: "No, no, he isn't that clever. I'm leading by three games to one!"

Do the dogs not speak wherever it is you are from?

Part of my point is that computer programs being able to play chess at or above a human level has been the norm for close to 40 years now. I would argue that the apparent inability to match that capability is a step backwards.

It's a step in a different direction, not backwards. First people programmed computers with "play chess, like this", and because they could do it faster they eventually got better than humans at chess. Then people programmed computers with "learn to play simulatable games well", and they soon got better than humans because chess is a simulatable game, and although they also got better than the first computers it wouldn't have been a failure otherwise because the point of the exercise was the generality and the learning. Now people have programmed computers with "learn to write anything that humans might write", and yes they're still kinda crummy at most of it, but everyone's dumbfounded anyway, not because this is the way to optimize a chess engine, but because it's astounding to even find a crummy chess engine emerge via the proposition "chess play is a subset of 'anything'".

Is this a dead end? The real world isn't as amenable to simulation as chess or go, after all, and LLMs are running low on unused training data. But with "computers can learn to do some things better than humans" and "computers can learn to do practically anything" demonstrated, "computers can learn to do practically anything better than humans" should at least be imaginable at this point. Chess isn't a primary goal here, it's a benchmark. If they actually tried to make an LLM good at chess they'd be able to do so easily, but that would just be Goodharting themselves out of data. It will be much more interesting when the advances in GPT-5 or 6 or whenever make it a better chess player than humans incidentally.

It's the claim that "computers can learn to do practically anything" has already been demonstrated that I am calling into question.

If nobody has made a Stockfish ChatGPT plugin yet I am sure it is only a matter of a few days. People are impressed by ChatGPT playing kinda okayish chess without making use of external tools, despite the fact that even amateur chess players can run circles around it, for the same reason they're impressed with Usain Bolt running 100m in 9.58 seconds despite the fact that a scrawny teenager who gets out of breath when they get up off the couch could go 100m in less than half the time on a Kawasaki Ninja.
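Indeed, wiring that up is close to trivial with off-the-shelf parts: the python-chess library can check the legality of whatever move the model proposes and, if you have a Stockfish binary installed, fall back to the engine. A rough sketch (the prompting side is omitted):

```python
import chess
import chess.engine

def play_llm_move(board: chess.Board, proposed_san: str,
                  engine_path: str = "stockfish") -> chess.Move:
    """Accept the LLM's proposed move if it is legal; otherwise ask Stockfish."""
    try:
        move = board.parse_san(proposed_san)  # raises ValueError if illegal
    except ValueError:
        with chess.engine.SimpleEngine.popen_uci(engine_path) as engine:
            move = engine.play(board, chess.engine.Limit(time=0.1)).move
    board.push(move)
    return move
```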

There's around 0 dollars to be made by making a chess bot better than Stockfish. The days of rolling them out to spank the human pros are long gone; they just get up and start running for the hills when you pull out even the kind of bot that runs on an old smartphone.

In contrast, an AI that does tasks ~80% as good as a professional can, for pretty much all tasks that involve text, is economic disruption in a tin can. (Emphasis on professionals, because it is genuinely better than the average human at most things, because the average human is an utter humpty)

Notice how I said that it's a better doctor than me? Consider how much we spend on healthcare, one of the thousands of industries about to be utterly disrupted.

In contrast, an AI that does tasks ~80% as good as a professional can, for pretty much all tasks that involve text, is economic disruption in a tin can

But the difference is still in the tails. The top 1% is where the money is made in any competitive industry. That is why top tech companies are so obsessed with talent and recruiting. That is harder to automate than the rest.

Notice how I said that it's a better doctor than me? Consider how much we spend on healthcare, one of the thousands of industries about to be utterly disrupted.

It can automate the diagnosis process based on some input of symptoms, but other parts are harder, like treatment. Same for invasive tests and biopsies. AI will disrupt it in some ways, but I don't think it will lower costs much.

I think you're adopting too much from a programming background when it comes to productivity. 10x programmers are far more common than 10x doctors or lawyers, because it isn't nearly as feasible to simply automate the gruntwork without hiring more junior docs/lawyers.

I would say that productivity in the vast majority of professions is more along the lines of the Pareto Principle, such that a 80% competent agent can capture a substantial chunk of profits.

And what exactly is so hard about treatment? An AI doctor can write drug charts and have a human nurse dispense them. Invasive tests and biopsies are still further away, but I fully believe that the workload of a modal doctor in, say, Internal Medicine can be fully automated today without any drawbacks. The primary bulwark against the tide is simply regulatory inertia and reflexive fear of such unproven advances.

Is there a good AI substitute for clinical examinations at present, or are we going to rely on patients self-examining?

I can honestly buy that in the short-medium term AI would take a better history and get differentials and suggest treatment plans better than the modal doctor. I could even buy that within that timeframe you could train AI to do the inspection parts/things like asterixis, but I don’t know how you’d get an AI to…palpate. Movement and sensation etc. are quite difficult for computers, I am to understand.

Alternatively maybe they’d just get so fucking good at the rest of it that professional examinations aren’t needed anymore, or that some examination findings can be deduced through other visual/etc means…

You'd be rather surprised at how little doctors palpate, auscultate, etc. in practice. These are mostly used for screening; if there's any notable abnormality the patient gets sent straight off to imaging instead of relying simply on clinical signs, as was once common. It certainly plays a role, but with robots with touch sensors it's hardly impossible to have AI palpate; it's just a skill that's rapidly becoming outmoded.

Oh I know well how doctors don't do the things they teach us to do in medical school! But it did seem like one thing that they can't do (that easily) but we can (relatively easily), due to it being more of a physical and tactile thing.

That said, I find that I do examine people at least a few times a day.

I agree it’s hardly impossible but I’d be surprised if it wasn’t markedly harder to train?


If GPT hallucinates a wrong and deadly treatment, who do you sue for malpractice?

Right now? Nobody, because it's not licensed for medical use and uses massive disclaimers.

In the future when regulators catch up and it's commercially deployed and advertised to that end? Whoever ends up with the liability, most likely the institution operating it.

I see this as a comparatively minor roadblock in the first place.

...and now the conclusion

From the outside it might seem like a straightforward fix to just add a line to the prompt that says "only provide real quotes / true statements", but to implement such a function requires imbuing GPT with an understanding of the difference between "true" and "false" and between "real" and "imagined". That is a real hum-dinger of a problem. It is such a hum-dinger of a problem that there is an entire branch of philosophy devoted to discussing it, that being Epistemology. As simple and foundational to computer science as the concept of a boolean might be, this is not a problem I see getting solved anytime soon.

Accordingly, when I see some AI-doomer post about how GPT-4 has passed the bar exam in some state or gotten an A on Bryan Caplan's mid-term economics exam, my first thought is not "oh shit here comes the fast take-off". It's more "and just how diligent were the people grading the papers being?". In one of those threads the topic of grading on a curve came up, and the question was asked: why should we ask professors to go through the effort of calibrating tests to the material when it is so much simpler/easier/more efficient to ask a spread of arbitrarily difficult questions and award the top x% of answers 'A's? I ended up biting my tongue at the time because my knee-jerk response was something to the effect of "because that's fucking retarded and ultimately defeats the purpose of even administering a test in the first place". But upon a moment's reflection I realized that was a very "thing-manipulator" thought to have.

Thus we come back to the issue of inferential distance. I struggle to articulate just how brain-meltingly stupid and arbitrary the whole concept of "grading on a curve" seems to me. But I also recognize that grading on a curve is a widely accepted practice. From this I infer that my concept of a test and its purpose is wildly different from that of Bryan Caplan and a lot of other users here on theMotte.

Perhaps this is my "thing-manipulator"-ness talking, but it seems intuitively obvious to me that if a teacher or professor is grading on a curve, they are not grading you on your capability or knowledge of the subject. And if they are not grading you on your capability or knowledge of the subject, what are they grading you on? It seems to me that if a teacher and their students are on their game, it should be possible for 100% of a class to earn a 100% grade. Just as, if manufacturing is truly on the ball, it should be possible to achieve a 100% pass rate from the QA department. Granted, this never actually happens in the real world because life is imperfect, but it's something to strive for, isn't it? A man might just find himself a member of the '72 Dolphins.

What is the purpose of a test or inspection in the first place if not to verify capability?
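To make the contrast concrete, here is a minimal sketch of the two grading philosophies (cutoffs, class size, and scores invented for illustration): criterion-referenced grading compares each student to a fixed bar, while curving compares students to each other, so near-identical mastery can receive very different grades.

```python
def criterion_grades(scores, cutoff_a=90):
    """Everyone who clears the fixed bar gets an A - all of them can."""
    return {name: ("A" if score >= cutoff_a else "not A")
            for name, score in scores.items()}

def curved_grades(scores, top_fraction=0.25):
    """Only the top x% get an A, regardless of the raw scores."""
    n_a = max(1, int(len(scores) * top_fraction))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return {name: ("A" if name in ranked[:n_a] else "not A")
            for name in scores}

scores = {"Ann": 97, "Bob": 95, "Cat": 94, "Dan": 93}
print(criterion_grades(scores))  # every student earns an A
print(curved_grades(scores))     # only Ann does, despite similar mastery
```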

Ironically, I think the real existential threat posed by GPT is not to humanity but to humanities professors. I would argue that if Caplan had been grading his students on their knowledge and understanding of the material (as he ought to have been from the outset) he wouldn't have found himself in this pickle. That GPT-4 got an A on Caplan's mid-term is not evidence that GPT-4 understands economics or history; it's evidence that Caplan does not understand his role as an educator. GPT is essentially the perfect Post-Modernist, and in so being it is exposing post-modernism and the entire edifice of contemporary academia as the farce it's always been.

The cynical bastard in me suspects that the reason "Symbol-Manipulators" seem to be so freaked out about GPT is that it represents a fully-automated-luxury-gay-reductio-ad-absurdum of everything that they think they know.

I'll repeat something I said elsewhere. Tests of subject matter are usually objective tests of concepts: if you know the concepts, you pass. Passing Caplan's economics test is domain limited - if it had trained on the basic concepts it would probably do pretty well, because while it requires knowledge of the concepts of economics and how to apply them to economics problems, it doesn't require understanding them. Knowing how to use the Pythagorean theorem to find the shortest arm of a right triangle given a hypotenuse and the long side isn't hard. Rearrange the equation and go.
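(As a worked version of that rearrangement, with invented numbers: given hypotenuse c and the longer leg b, the shorter leg is a = sqrt(c^2 - b^2); with c = 13 and b = 12, a = sqrt(169 - 144) = 5.)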

Projecting the future from a given dataset using a theory would require understanding the data, identifying the bits that are relevant, and figuring out how the theory and the data can be used to find the answer. So, a question like "using the complete set of the Dune series of books and Peter Turchin's theory of cliodynamics, project the future of the universe 30,000 years into the future". Very little in the books would be useful in making the projections (although I suppose if you count the mention of sandworms and spice, or references to a famine, those would be useful). ChatGPT would have to figure that out, though, as well as just how much of the action and characters are irrelevant to the project. But the process of figuring all of this out requires understanding what the books are saying and what they aren't, and requires understanding cliodynamics on a level above "insert numbers into the equation and go."

This is what knowing and understanding would seem to mean: not just being able to manipulate words and numbers and equations, but to do so with a sense of what the numbers or words mean. To put it in the context of the more famous Chinese Room thought experiment - in order for the person in that room to understand the questions and the answers, it's not enough to simply pattern-match 我 with 李张; I would have to understand that "I" am telling the outside world that my name is "Li Zhang". If I cannot do that, if I only understand in the sense of seeing one symbol and outputting another without knowing why the correct answer to X is A and not C, I don't understand it.

Edit: just for kicks I asked ChatGPT about the future of the dune universe:

Dune is a rich and complex universe that has captured the imaginations of generations of readers. As an AI language model, I can imagine what the Dune universe might look like 30,000 years after the end of Dune Chapterhouse, based on the information available in the books and the various adaptations.

Assuming that the long-term goals of the Bene Gesserit and their sisterhood of Mentats were achieved, the human race could have evolved into something more than we can currently imagine. They would have mastered advanced technologies and could harness the power of the Guild Navigators and the sandworms to travel across space and time. The planet of Arrakis could be transformed into something entirely different, devoid of the sandworms or even turning into a lush, forested world with oceans and rivers.

The descendants of the Atreides bloodline like Leto and Ghanima would have spread out across the galaxy, establishing themselves as new ruling dynasties. Implications of a far reaching society where humans have been colonized, terraforming planets and form all sorts of political groups and religions.

On the political front, the Imperium could have been replaced by a new system of government. Perhaps, the Butlerian Jihad against the thinking machines would have been avoided, and intelligent machines like the Ixians and the Taurid would have prospered. Alternatively, humans could have replaced the thinking machines with new technologies, and the entire social structure must have dissolved due to the rapid technological advancements.

In conclusion, the Dune universe 30,000 years after the end of Dune Chapterhouse is a realm of endless possibility, and it holds untold wonders and horrors beyond our current comprehension. As an AI language model, it is beyond me to predict the exact path the universe might have taken, but one thing is for sure: it would be a fascinating and dynamic place to behold.

(Which honestly sounds a bit like something I would come up with to try to fool the teacher into thinking I read the books. For example, Leto actually became a worm and was thus sterile; he had no natural descendants. Also, I noticed there was no mention of specific plans by the Bene Gesserit, Mentats, etc., the Butlerian Jihad happened 10,000 years before the first book, and the Ixians aren't machines.)

Which honestly sounds a bit like something I would come up with to try to fool the teacher into thinking I read the books...

...but hadn't actually read the books.

and that "hasn't actually read the book" quality is exactly what I feel like Volokh and I both latched onto, but reading through the other replies I've gotten the impression that most users here either do not understand, or do not consider, having "read the book" to be a relevant parameter when discussing its contents.

Like I said in the OP, GPT is effectively the "perfect post-modernist", it's literally "all talk" and no thought. There's no need to kill the author when the author was dead from the beginning.

What it looks like to me is that it knows the groups that should be mentioned (Mentats, the Guild, Ixians, Bene Gesserit) but it doesn't know what they are or what they want. It's also fairly confused about the people and the timeline. It knows that those events happened, but not the order, and apparently doesn't know that a lot of the stuff it's predicting to happen 30K years after Chapterhouse had already happened.

Which to me points to ChatGPT-4 as a Chinese Room. It knows vaguely to use terms like Sandworm, Mentat, and Arrakis when talking about the Dune series. It knows to talk about evolution. But it doesn't understand what things mean; the Bene Gesserit don't mean anything to the chatbot.

That all definitely sounds like what I would have written if I had the most vague impression of the franchise.

The cynical bastard in me suspects that the reason "Symbol-Manipulators" seem to be so freaked out about GPT is that it represents a fully-automated-luxury-gay-reductio-ad-absurdum of everything that they think they know.

Based on your teaser last week I was expecting a revisit of Christianity and the Western Tradition. Isn't Christianity, in your model the foundation of civilization, the ultimate creation of Symbol-Manipulators?

Isn't Christianity, in your model the foundation of civilization, the ultimate creation of Symbol-Manipulators?

No, because in my model God is real.

From the outside it might seem like a straightforward fix to just add a line to the prompt that says "only provide real quotes / true statements", but to implement such a function requires imbuing GPT with an understanding of the difference between "true" and "false" and between "real" and "imagined". That is a real hum-dinger of a problem.

Being able to produce truth 100% of the time is a hard problem. Being able to avoid falsehoods in the subset of cases that a human can easily recognize as untrue is not such a humdinger, and the latter is what people are requesting when they ask for no hallucinations.

There's a big difference between philosophical concepts of abstract truth, and "when it names a Star Trek episode, this had better be the name of an actual episode".

There's a big difference between philosophical concepts of abstract truth, and "when it names a Star Trek episode, this had better be the name of an actual episode".

Is there? It seems obvious to me that the latter is downstream of the former.

I don't think he was trying to say that the problem is being able to produce truth 100% of the time. The problem is understanding what it means for something to be "true" or "false" in the first place.

Same answer. We don't have a general understanding of true and false, but we know whether some things are true and false, and ChatGPT fails even on that subset of easy things that we do understand.

Ironically, I think the real existential threat posed by GPT is not to humanity but to humanities professors.

Why can the humanities never catch a break?

TheMotte is a humanities discussion forum. We take questions from the humanities (mainly politics and philosophy) and discuss them using methods of analysis that are typical of the humanities. Anyone who posts here shows a revealed preference for humanistic thinking and, plainly, symbol manipulation as well.

What you have written here, in this very post, is a humanities essay. You have engaged in armchair psychological speculation about the motivations and thought processes of a vast and heterogeneous group of people while adducing very little empirical evidence to support your claims. You haven't defined a precise hypothesis (how could we turn "GPT-4 is exposing post-modernism and the entire edifice of contemporary academia as the farce it's always been" into an empirically falsifiable hypothesis?), collected a sufficiently large sample, done a rigorous statistical analysis, or anything else that might move your speculations out of the realm of "mere humanities" and into the vaunted halls of science. This isn't meant to be a criticism of your post, of course. I too prefer to do all my thinking from the armchair - it's a very comfy place to be. We should just recognize it for what it is, is all. You should reflect on whether what you're doing here - what any of us are doing here - is really any different from what the "postmodernist symbol manipulators" are doing.

What you have written here, in this very post, is a humanities essay.

You're absolutely correct.

But my post makes no particular claims about "the humanities" as a field, what it talks about is "humanities professors" which are not the same thing. Though perhaps the conflation of the former with the latter is yet another example of that inferential distance I've been talking about.

see @07mk's reply below.

I think he would not disagree. I ran Hlynka's text through Bing Chat and this was its summary regarding the humanities professor part:

The author is suggesting that GPT-4 and humanities professors have similar goals and methods, and that they are both inferior to thing-manipulators who can challenge their claims with facts and data. He is implying that GPT-4 could replace humanities professors because they are both producing meaningless or misleading content.

I asked about some counterarguments and here is what it came up with:

The author’s distinction between thing-manipulators and symbol-manipulators is too simplistic and ignores the diversity and complexity of both types of activities. Symbols are not just arbitrary or meaningless signs, but tools for communication, reasoning, and representation that can have various interpretations and functions. Symbols can also be manipulated in ways that reflect the properties and relationships of the things they represent, such as in mathematics or computer science. Thing-manipulators are not just concerned with facts and data, but also with creativity, intuition, and problem-solving. Thing-manipulation can also involve symbolic aspects, such as using models, diagrams, or equations to describe or analyze physical phenomena. Both thing-manipulators and symbol-manipulators can benefit from each other’s perspectives and methods, and collaborate to solve hard social problems that require both empirical and conceptual approaches123

I then asked it to tailor an argument in favor of humanities professors. It searched for "value of humanities professors" on google and incorporated it into the context:

Humanities professors are not just producing symbols and feelings that appeal to a certain audience, but engaging with the world in important ways. They are exploring the diversity and complexity of human cultures, values, and experiences, and contributing to the advancement of knowledge, creativity, and critical thinking. They are also addressing the ethical, social, and political challenges that face humanity in the present and the future, and fostering dialogue and understanding among different perspectives and communities. Humanities professors are not disconnected from reality, but rather enriching it with their insights and interpretations12

I then asked it if all those things cannot be also done by GPT-4. It then searched for "GPT-4 limitations and capabilities" and it came up with the following:

It is possible that GPT-4 can perform some of the tasks that humanities professors do, such as writing essays, generating quotes, or passing exams. However, GPT-4 is not a substitute for humanities professors, because it still has many limitations and challenges that prevent it from achieving the same level of understanding, creativity, and ethics as human scholars. For example, GPT-4 has difficulty with:

  • Distinguishing between true and false statements, and between real and imagined scenarios1
  • Reasoning, planning ahead, working memory, and backtracking, due to its autoregressive architecture2
  • Avoiding social biases, hallucinations, and adversarial prompts that can affect its outputs3
  • Handling visual input, which is not yet widely available or reliable4

Therefore, GPT-4 is not likely to replace humanities professors anytime soon, but rather complement their work or provide new opportunities for collaboration and innovation.

So in a way we came full circle; the last part is, I think, a version of the original argument Hlynka made.

So in a way we came full circle; the last part is, I think, a version of the original argument Hlynka made.

Not really, no, and to be blunt my initial reaction reading your "exchange" largely mirrors @IGI-111's below: it really does read like a student who hasn't read the material trying to bluff their way past the teacher.

Volokh's essay and the subsequent discussion on his blog go into this far more deeply than I could, but what GPT and other LLMs seem to be actually kind of good for is condensing large bodies of text into a "cliff notes" version. For example, GPT was able to correctly identify my supposition that "GPT-4 could replace humanities professors because they are both producing meaningless or misleading content" despite my not having explicitly made that claim. What GPT/LLMs seem to be particularly bad at is answering questions.

Ah man it really is like talking to an academic bullshitter. Continuum fallacies and all.

It even brings in random definitions of things that have no connection to the underlying argument just to make the aggrieved party sound more important.

All the tactics, none of the substance.

The hopelessly ironic part is that it seems to be arguing that humanities professors can distinguish between true and false and avoid social biases, having been trained on their writings.

One has seldom produced such a clear example of the self-refuting nature of the post-modern condition.

The hopelessly ironic part is that it seems to be arguing that humanities professors can distinguish between true and false and avoid social biases, having been trained on their writings.

It is arguing in favor of humanities professors because I told it to argue that position. It researched the claim that GPT may have trouble discerning true and false statements, and it argued that humanities professors have that capacity. It implicitly asserted that argument, whereas Hlynka asserts without proof that humanities professors are pomo text generators. But unlike Hlynka, GPT at least provided links for its statements, it used some jargon like autoregressive architecture, and in general it made Hlynka's original argument about the deficiencies of GPT better than he did. I think it also correctly pointed out that this whole thing- vs. symbol-manipulator distinction is a lot more complicated.

I think it also correctly pointed out that this whole thing- vs. symbol-manipulator distinction is a lot more complicated.

While I instinctively believe things are more complicated than Hlynka's distinction, I became less and less convinced of this the more I waded through Bing's verbiage on the matter.

Not sure what the point of posting this was.

We're all quite capable of reading the post and coming to our own conclusions about it. I don't feel the need to outsource my thinking to anyone else, human or machine. I learn from other people, certainly, but I don't let them do my thinking for me. The purpose of the act of thinking is to determine what I think about something. Not to determine what someone else thinks.

"If men create intelligent machines, or fantasize about them, it is either because they secretly despair of their own intelligence or because they are in danger of succumbing to the weight of a monstrous and useless intelligence which they seek to exorcise by transferring it to machines, where they can play with it and make fun of it. By entrusting this burdensome intelligence to machines we are released from any responsibility to knowledge, much as entrusting power to politicians allows us to disdain any aspiration of our own to power. If men dream of machines that are unique, that are endowed with genius, it is because they despair of their own uniqueness, or because they prefer to do without it - to enjoy it by proxy, so to speak, thanks to machines. What such machines offer is the spectacle of thought, and in manipulating them people devote themselves more to the spectacle of thought than to thought itself. It is not for nothing that they are described as 'virtual', for they put thought on hold indefinitely, tying its emergence to the achievement of a complete knowledge. The act of thinking itself is thus put off forever. Indeed, the question of thought can no more be raised than the question of the freedom of future generations, who will pass through life as we travel through the air, strapped into their seats. These Men of Artificial Intelligence will traverse their own mental space bound hand and foot to their computers. Immobile in front of his computer, Virtual Man makes love via the screen and gives lessons by means of the teleconference. He is a physical - and no doubt also a mental cripple. That is the price he pays for being operational. Just as eyeglasses and contact lenses will arguably one day evolve into implanted prostheses for a species that has lost its sight, it is similarly to be feared that artificial intelligence and the hardware that supports it will become a mental prosthesis for a species without the capacity for thought. Artificial intelligence is devoid of intelligence because it is devoid of artifice."

-- Jean Baudrillard, The Transparency of Evil: Essays in Extreme Phenomena

The point of my exercise was that Bing Chat was able to understand Hlynka's text and produce a defense of humanities professors, actually improving on the original arguments Hlynka made. It produced the same true/false argument, but it also described LLM shortcomings in a more technical manner, speaking about hallucinations and adversarial prompts.

So in that sense it was Hlynka's text that seemed more pomo compared to what GPT produced. Which I think is quite an interesting thing to observe. In the end I think that, at minimum, the GPT + human pair will outperform a solo player in the near future. At least in the sense that the human should know in which domains to trust GPT completely, despite his own intuition.

The problem is that its defense of humanities professors was exactly the sort of meaningless pastiche that you would expect if it were a pure symbol manipulator. Now you could argue that it sounds very much like the real arguments that would come out of the mouths of real humanities professors. But that just means Hlynka wins on both sides.

Except there is nothing in my post about humanities professors being replaced by thing-manipulators. GIGO applies.

The author is suggesting that GPT-4 and humanities professors have similar goals and methods, and that they are both inferior to thing-manipulators who can challenge their claims with facts and data. He is implying that GPT-4 could replace humanities professors because they are both producing meaningless or misleading content.

It was about the replacement of humanities professors by GPT-4, as opposed to by thing-manipulators. But it also caught the tone of your thing- vs. symbol-manipulator distinction. And in that sense I completely agree about GIGO.

The part you're quoting says GPT would be a threat to humanities professors, not to the humanities. I'd wager that this would be a net benefit to the humanities and would actually constitute a meaningful form of the humanities catching a break.

The part you're quoting says GPT would be a threat to humanities professors, not to the humanities.

I don't think he was making such a fine-grained distinction.

The general thrust of his post was to set himself as a thing-manipulator apart from the symbol-manipulators. But the type of thinking on display in his post was precisely an example of the type of symbol-manipulation that he was deriding. I'll let him decide if he thinks this is a fair reading of his post or not.

I'd wager that this would be a net benefit to the humanities and would actually constitute a meaningful form of the humanities catching a break.

I won't speculate about what impacts GPT will or won't have on any aspect of the current university system.

In general, I don't share the instinctive hatred for academics that many here seem to have. Sure, a lot of them are leftists, but so what? Lots of people are leftists. If I had a meltdown every time someone was a leftist then I'd have a hard time functioning in modern society.

I enjoy reading the professional output of many humanities academics and I'd be quite happy to have them continue as they are.

In general, I don't share the instinctive hatred for academics that many here seem to have. Sure, a lot of them are leftists, but so what? Lots of people are leftists. If I had a meltdown every time someone was a leftist then I'd have a hard time functioning in modern society.

I don't think any sort of "hatred" people here have towards academics is "instinctive," and characterizing it as such is highly uncharitable. I think it's a learned antipathy based on observations and conscious analysis, and that this has very little to do with them being leftists. At best, the antipathy seems to be due to something that is upstream of them being leftists, i.e. the same sort of social/cultural forces that lead these academics to having sloppy thinking also lead them to being leftists.

I enjoy reading the professional output of many humanities academics and I'd be quite happy to have them continue as they are.

I'd wager that the humanities academics that produce professional output that are worth reading are ones who will be most resistant to replacement by GPT and the like. Whether they're completely resistant is an open question, I admit, but for the foreseeable future, I don't think there's much to worry about.

At best, the antipathy seems to be due to something that is upstream from them being leftists, i.e. the same sort of social/cultural forces that lead these academics to sloppy thinking also lead them to being leftists.

Can you elaborate? Do you have any examples of this sort of "sloppy thinking" in mind?

I don't have any specific examples off the top of my head, but I'm thinking of the (I'm guessing largely unconscious and unintentional) peer pressure within academia's social/cultural spheres pushing people into adopting sloppy thinking, in the form of being against rationality/logic/empiricism in favor of taking the word of people one is predisposed to like. The peer pressure obviously takes many forms, but I'd guess it mostly operates through who is given higher social status and who isn't, based on what sorts of opinions one espouses publicly, though some of it is certainly open, intentional, directed bullying. These forces also tend to push people towards adopting (certain) leftist views.

Accordingly, when I see some AI-doomer post about how GPT-4 has passed the bar exam in some state or gotten an A on Bryan Caplan's mid-term economics exam, my first thought is not "oh shit, here comes the fast take-off". It's more "and just how diligent were the people grading the papers?".

Caplan had a very strong incentive to fail the AI. He publicly bet against AI passing his exams a few years back; he has a very long and unbroken streak of victorious bets, and it looks like this will be the first one he actually loses.

The obvious counter-observation is that Caplan having an incentive to fail the AI is orthogonal to whether Caplan's exam actually measures a student's understanding of economics.

Accordingly, when I see some AI-doomer post about how GPT-4 has passed the bar exam in some state or gotten an A on Bryan Caplan's mid-term economics exam, my first thought is not "oh shit, here comes the fast take-off". It's more "and just how diligent were the people grading the papers?". In one of those threads the topic of grading on a curve came up, and the question was asked: why should we ask professors to go through the effort of calibrating tests to the material when it is so much simpler/easier/more efficient to ask a spread of arbitrarily difficult questions and award the top x% of answers 'A's? I ended up biting my tongue at the time because my knee-jerk response was something to the effect of "because that's fucking retarded and ultimately defeats the purpose of even administering a test in the first place". But upon a moment's reflection I realized that was a very "thing-manipulator" thought to have.

Thus we come back to the issue of inferential distance. I struggle to articulate just how brain-meltingly stupid and arbitrary the whole concept of "grading on a curve" seems to me. But I also recognize that grading on a curve is a widely accepted practice. From this I infer that my concept of a test and its purpose is wildly different from that of Bryan Caplan and a lot of other users here on theMotte.

Perhaps this is my "thing-manipulator"-ness talking, but it seems intuitively obvious to me that if a teacher or professor is grading on a curve, they are not grading you on your capability or knowledge of the subject. And if they are not grading you on your capability or knowledge of the subject, what are they grading you on? It seems to me that if a teacher and their students are on their game, it should be possible for 100% of a class to earn a 100% grade, just as, if manufacturing is truly on the ball, it should be possible to achieve a 100% pass rate from the QA department. Granted, this never actually happens in the real world because life is imperfect, but it's something to strive for, isn't it? A man might just find himself a member of the '72 Dolphins.

What is the purpose of a test or inspection in the first place if not to verify capability?
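To make the distinction concrete, here is a minimal sketch of the two grading schemes as I understand them (the names, scores, and cutoffs are invented purely for illustration): criterion-referenced grading checks each student against a fixed standard, while grading on a curve merely ranks students against each other.

```python
# Invented scores, for illustration only.
scores = {"Alice": 95, "Bob": 90, "Carol": 88, "Dave": 86, "Eve": 84}

def criterion_grades(scores, cutoff=90):
    """Grade against a fixed standard: everyone who clears the bar gets an A."""
    return {name: ("A" if s >= cutoff else "not A") for name, s in scores.items()}

def curved_grades(scores, top_fraction=0.2):
    """Grade on a curve: only the top x% get an A, however the class performed."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    n_a = max(1, int(len(ranked) * top_fraction))
    return {name: ("A" if rank < n_a else "not A") for rank, name in enumerate(ranked)}

print(criterion_grades(scores))  # a whole class can earn A's if everyone clears the bar
print(curved_grades(scores))     # exactly one A here, no matter how good everyone was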

That was me. And this complaint would make a lot more sense if education were training people to actually do the tasks they perform at their jobs. But it doesn't. Calc tests and econ tests are just academic-themed IQ tests, because nobody actually uses calc or econ in their jobs except for a tiny minority, and that minority could easily be trained on the job instead of outsourcing the task to a four-year institution of higher education that charges tens of thousands of dollars for the privilege. Employers are using degrees and GPAs to select the top x% of employees, not to verify that the student has achieved an objective standard of excellence in subject y (otherwise they would use nationwide standardized tests instead of whatever crap the professor came up with, in which case it WOULD make sense to effortfully calibrate objective passing grades against the material).

From "The Magic of Education" by Bryan Caplan:

Why do employers care about grades and diplomas? The “obvious” story, to most people, is that professors teach their students skills they’ll eventually use on the job. Low grades, no diploma, few skills.

This story isn’t entirely wrong; literacy and numeracy are a big deal. But the “obvious” story is far from complete. Think about all the time students spend studying history, art, music, foreign languages, poetry, and mathematical proofs. What you learn in most classes is, in all honesty, useless in the vast majority of occupations. This is hardly surprising when you remember how little professors like me know about the Real World. How can I possibly improve my students’ ability to do a vast array of jobs that I don’t know how to do myself? It would be nothing short of magic. I’d have to be Merlin, Gandalf, or Dumbledore to complete the ritual:

Step 1: I open my mouth and talk about academic topics like externalities of population, or the effect of education on policy preferences.

Step 2: The students learn the material.

Step 3: Magic.

Step 4: My students become slightly better bankers, salesmen, managers, etc.

Yes, I can train graduate students to become professors. No magic there; I’m teaching them the one job I know. But what about my thousands of students who won’t become economics professors? I can’t teach what I don’t know, and I don’t know how to do the jobs they’re going to have. Few professors do.

Many educators soothe their consciences by insisting that “I teach my students how to think, not what to think.” But this platitude goes against a hundred years of educational psychology. Education is very narrow; students learn the material you specifically teach them… if you’re lucky.

Ironically, I think the real existential threat posed by GPT is not to humanity but to humanities professors. I would argue that if Caplan had been grading his students on their knowledge and understanding of the material (as he ought to have been from the outset) he wouldn't have found himself in this pickle. That GPT-4 got an A on Caplan's mid-term is not evidence that GPT-4 understands economics or history; it's evidence that Caplan does not understand his role as an educator. GPT is essentially the perfect post-modernist, and in so being it is exposing post-modernism and the entire edifice of contemporary academia as the farces they've always been.

I don't think it's possible to conclude this. GPT can pass a simple math test, for example; does this mean a math teacher does not understand his role?

Perhaps this is my "thing-manipulator"-ness talking, but it seems intuitively obvious to me that if a teacher or professor is grading on a curve, they are not grading you on your capability or knowledge of the subject.

And if you look at the actual content of the test, you will note that you are entirely correct that it's not a test of "how well have you internalized the principles of economics", it's a test of "do you agree with Bryan Caplan's politics".

"Do you agree with [professor]" is the subject of every university-level exam.

Oh come now.

I had a philosophy class in undergrad with a professor who had published extensively on a particular topic in philosophy of mind, and he was quite proud to be known as one of the leading experts in his particular sub-field. For my final paper in the class I ended up disagreeing extensively with his views - and he let me know this by writing a thorough rebuttal for almost every paragraph in my paper - but he still gave me an A because he thought my paper was well argued.

Some of the questions are weirdly worded too, like this one:

T, F, and Explain: Evidence of severe credit market imperfections will make you more eager to continue your education, but evidence of severe externalities of education will not.

He means asymmetric information, or a default structure that benefits the borrower and thus incentivizes higher education; it's plausible he is right, but it's more an article of faith than something rigorous.

This post explains the source of much of my skepticism of AI better than I could. But the idea of LLMs as ultimate postmodernists insofar as they are masters of language and nothing else is a key insight that I'm mad I didn't think of first.

Of course this is no accident, since the very idea is just a sophisticated generalization of Markov chains, which were famously great at generating pomospeak.
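For anyone who hasn't played with one, here is a minimal sketch of the word-level Markov chain trick behind those old pomo generators (the toy corpus and function names are made up for illustration): record which words follow which, then stitch together "prose" by walking the table at random.

```python
import random
from collections import defaultdict

# Toy corpus, invented for illustration.
corpus = ("the text destabilizes the reader and the reader interrogates "
          "the discourse of the text and the discourse resists the reader")

def build_chain(text):
    """Record, for each word, the words observed to follow it."""
    words = text.split()
    chain = defaultdict(list)
    for current, nxt in zip(words, words[1:]):
        chain[current].append(nxt)
    return chain

def generate(chain, start, length=15):
    """Walk the table at random: locally fluent, globally empty prose."""
    word, output = start, [start]
    for _ in range(length):
        followers = chain.get(word)
        if not followers:
            break
        word = random.choice(followers)
        output.append(word)
    return " ".join(output)

print(generate(build_chain(corpus), "the"))
```

The output sounds locally fluent while meaning nothing, which is exactly the property being generalized here.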

But it is getting to the level where it might have practical utility now.

Provided nobody finds an unfavorable equilibrium in the AI-detector arms race, or at least none that would also let human nonsense through, this tool could be the final solution to the problem of credentialism.

Why indeed listen to the academic if you can replicate him with AI well enough to get his diploma by doing nothing but pressing a button? And then we can merrily go back to judging shamans by the only metric that ultimately matters: whether the hexes work or not.

...and that's a bingo. (Insert your preferred Christoph Waltz meme as you see fit.) ;-)