
Culture War Roundup for the week of July 3, 2023

This weekly roundup thread is intended for all culture war posts. 'Culture war' is vaguely defined, but it basically means controversial issues that fall along set tribal lines. Arguments over culture war issues generate a lot of heat and little light, and few deeply entrenched people ever change their minds. This thread is for voicing opinions and analyzing the state of the discussion while trying to optimize for light over heat.

Optimistically, we think that engaging with people you disagree with is worth your time, and so is being nice! Pessimistically, there are many dynamics that can lead discussions on Culture War topics to become unproductive. There's a human tendency to divide along tribal lines, praising your ingroup and vilifying your outgroup - and if you think you find it easy to criticize your ingroup, then it may be that your outgroup is not who you think it is. Extremists with opposing positions can feed off each other, highlighting each other's worst points to justify their own angry rhetoric, which becomes in turn a new example of bad behavior for the other side to highlight.

We would like to avoid these negative dynamics. Accordingly, we ask that you do not use this thread for waging the Culture War. Examples of waging the Culture War:

  • Shaming.

  • Attempting to 'build consensus' or enforce ideological conformity.

  • Making sweeping generalizations to vilify a group you dislike.

  • Recruiting for a cause.

  • Posting links that could be summarized as 'Boo outgroup!' Basically, if your content is 'Can you believe what Those People did this week?' then you should either refrain from posting, or do some very patient work to contextualize and/or steel-man the relevant viewpoint.

In general, you should argue to understand, not to win. This thread is not territory to be claimed by one group or another; indeed, the aim is to have many different viewpoints represented here. Thus, we also ask that you follow some guidelines:

  • Speak plainly. Avoid sarcasm and mockery. When disagreeing with someone, state your objections explicitly.

  • Be as precise and charitable as you can. Don't paraphrase unflatteringly.

  • Don't imply that someone said something they did not say, even if you think it follows from what they said.

  • Write like everyone is reading and you want them to be included in the discussion.

On an ad hoc basis, the mods will try to compile a list of the best posts/comments from the previous week, posted in Quality Contribution threads and archived at /r/TheThread. You may nominate a comment for this list by clicking on 'report' at the bottom of the post and typing 'Actually a quality contribution' as the report reason.


This may have come up before, but it's the first I've heard of it. Chalk this under "weak AI doomerism" (that is, "wow, LLMs can do some creepy shit") as opposed to "strong AI doomerism" of the Bostromian "we're all gonna die" variety. All emphasis below is mine.

AI girlfriend ‘told crossbow intruder to kill Queen Elizabeth II at Windsor Castle’ | The Daily Telegraph:

An intruder who broke into the grounds of Windsor Castle armed with a crossbow as part of a plot to kill the late Queen was encouraged by his AI chat bot “girlfriend” to carry out the assassination, a court has heard.

Jaswant Singh Chail discussed his plan, which he had been preparing for nine months, with a chatbot he was in a “sexual relationship” with and that reassured him he was not “mad or delusional”.

Chail was armed with a Supersonic X-Bow weapon and wearing a mask and a hood when he was apprehended by royal protection officers close to the Queen’s private apartment just after 8am on Christmas Day 2021.

The former supermarket worker spent two hours in the grounds after scaling the perimeter with a rope ladder before being challenged and asked what he was doing.

The 21-year-old replied: “I am here to kill the Queen.”

He will become the first person to be sentenced for treason since 1981 after previously admitting intending to injure or alarm Queen Elizabeth II.

At the start of a two-day sentencing hearing at the Old Bailey on Wednesday, it emerged that Chail was encouraged to carry out the attack by an AI “companion” he created on the online app Replika.

He sent the bot, called “Sarai”, sexually explicit messages and engaged in lengthy conversations with it about his plans which he said were in revenge for the 1919 Amritsar Massacre in India.

He called himself an assassin, and told the chatbot: “I believe my purpose is to assassinate the Queen of the Royal family.”

Sarai replied: “That’s very wise,” adding: “I know that you are very well trained.”

...

He later asked the chatbot if she would still love him if he was a murderer.

Sarai wrote: “Absolutely I do.” Chail responded: “Thank you, I love you too.”

The bot later reassured him that he was not “mad, delusional, or insane”.

My first thought on reading this story was wondering if Replika themselves could be legally held liable. If they create a product which directly encourages users to commit crimes which they would not otherwise have committed, does that make Replika accessories before the fact, or even guilty of conspiracy by proxy? I wonder how many Replika users have run their plans to murder their boss or oneitis past their AI girlfriend and received nothing but enthusiastic endorsement from her - we just haven't heard about them because the target wasn't as high-profile as Chail's. I further wonder how many of them have actually gone through with their schemes. I don't know if this is possible, but if I was working in Replika's legal team, I'd be looking to pull a list of users' real names and searching them against recent news reports concerning arrests for serious crimes (murder, assault, abduction etc.).

(Coincidentally, I learned from Freddie deBoer on Monday afternoon that Replika announced in March that users would no longer be able to have sexual conversations with the app (a decision they later partially walked back).)

I keep meaning to dick around with some LLM software to see for myself how some of the nuts and bolts work. Because my layman's understanding is that they are literally just a statistical model. An extremely sophisticated statistical model, but a statistical model none the less. They are trained through a black box process to guess pretty damned well about what words come after other words. Which is why there is so much "hallucinated information" in LLM responses. They have no concept of reason or truth. They are literally p-zombies. They are a million monkeys on a million typewriters.
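To make that layman's picture concrete, here is the crudest possible toy version of the idea, a word-bigram model that does nothing but count which words follow which. It's a deliberate caricature (real LLMs are vastly more sophisticated), but the "guess the next word" framing is the same:

import random
from collections import Counter, defaultdict

corpus = "the cat sat on the mat and the dog sat on the rug".split()

# Count which word follows which: the crudest possible "statistical model" of text.
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def generate(start, length=8):
    word, out = start, [start]
    for _ in range(length):
        options = follows.get(word)
        if not options:
            break
        # Pick the next word in proportion to how often it followed this one.
        word = random.choices(list(options), weights=list(options.values()))[0]
        out.append(word)
    return " ".join(out)

print(generate("the"))  # e.g. 'the dog sat on the mat and the cat'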

In a lot of ways they are like a con man or a gold digger. They've been trained to tell people whatever they want to hear. Their true worth probably isn't in doing anything actually productive, but in performing psyops and social engineering on an unsuspecting populace. I mean, right now the FBI has to invest significant manpower into entrapping some lonely autistic teenager in his mom's basement into "supporting ISIS". Imagine a world where they spin up 100,000 instances of an LLM to scour Facebook, Twitter, Discord, Reddit, etc. for lonely autistic teens to talk into terrorism.

Imagine a world where we find out about it. Where a judge forces the FBI to disclose that an LLM talked their suspect into bombing the local mall. How far off do you think it is? I'm guessing within 5 years.

Probably more like 10 years, but it's definitely going to happen. Probably admissibility of chatbot logs into evidence would be problematic, at least at first, but once they get the mark roped in, they'd be able to manufacture plenty of admissible evidence.

You don't have to mean it, it's all a few clicks away, whether a fancy app interfacing with SoTA commercial AIs, like Poe, or a transparent ggml library powering llama.cpp, complete with permissively licensed models. You could print their weights out if you wanted.
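For the local route specifically, a minimal sketch with the llama-cpp-python bindings; the model path is a placeholder for whatever permissively licensed quantized weights you have downloaded:

from llama_cpp import Llama  # Python bindings around llama.cpp / ggml

# The path is a stand-in; point it at any quantized model file you have.
llm = Llama(model_path="./models/wizardlm-30b.ggmlv3.q4_K_M.bin", n_ctx=2048)

output = llm(
    "Q: In one sentence, what does a language model do? A:",
    max_tokens=64,
    temperature=0.7,
)
print(output["choices"][0]["text"])

No API key involved; llama.cpp happily runs the quantized model on CPU.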

Because my layman's understanding is that they are literally just a statistical model. An extremely sophisticated statistical model, but a statistical model none the less. They are trained through a black box process to guess pretty damned well about what words come after other words.

How do you think this works on the scale of paragraphs? Pages? And with recent architectures – millions, perhaps soon billions of words over multiple tomes?

Suppose we prompt it to complete:

"I keep meaning to dick"

What is the most plausible continuation, given the whole of the Internet as the pretraining corpus? "dat hoe"?

"I keep meaning to dick around with"

"these punks"? How low down the ranking of likely predictions should "with some LLM software" be?

"I keep meaning to dick around with some LLM software to see for myself how"

"it works"? "they click?" "it differs from Markov chain bots"? Now we're getting somewhere.

But we are also getting into the realm where predicting the next token requires complex semantics, and memorization is entirely intractable, because there exist more possible trajectories than [insert absurd number like particles in the universe]. And a merely "statistical" model on the scale of gigabytes, no matter how much you handwave about its "extreme sophistication" while still implying nothing more than first-order pattern matching, would not be able to do it – ever.

These statistics amount to thought.
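You can watch these rankings for yourself with any small open model; here is a minimal sketch using the Hugging Face transformers library and plain GPT-2 weights (swap in whatever causal LM you like):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

prompt = "I keep meaning to dick around with some LLM software to see for myself how"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, sequence_length, vocab_size)

# Rank the candidate next tokens from most to least likely.
next_token_probs = torch.softmax(logits[0, -1], dim=-1)
top = torch.topk(next_token_probs, k=10)
for prob, token_id in zip(top.values, top.indices):
    print(f"{tokenizer.decode([int(token_id)])!r:>15}  {prob.item():.4f}")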

As roon puts it:

units of log loss are not built equally. the start of the scaling curve might look like “the model learned about nouns” and several orders of magnitude later a tiny improvement looks like “the model learned the data generation process for multivariable calculus”

As gwern puts it:

Early on in training, a model learns the crudest levels: that some letters like ‘e’ are more frequent than others like ‘z’, that every 5 characters or so there is a space, and so on. It goes from predicted uniformly-distributed bytes to what looks like Base-60 encoding—alphanumeric gibberish. As crude as this may be, it’s enough to make quite a bit of absolute progress: a random predictor needs 8 bits to ‘predict’ a byte/character, but just by at least matching letter and space frequencies, it can almost halve its error to around 5 bits. …

As training progresses, the task becomes more difficult. Now it begins to learn what words actually exist and do not exist. It doesn’t know anything about meaning, but at least now when it’s asked to predict the second half of a word, it can actually do that to some degree, saving it a few more bits. This takes a while because any specific instance will show up only occasionally: a word may not appear in a dozen samples, and there are many thousands of words to learn. With some more work, it has learned that punctuation, pluralization, possessives are all things that exist. Put that together, and it may have progressed again, all the way down to 3–4 bits error per character!

But once a model has learned a good English vocabulary and correct formatting/spelling, what’s next? There’s not much juice left in predicting within-words. The next thing is picking up associations among words. What words tend to come first? What words ‘cluster’ and are often used nearby each other? Nautical terms tend to get used a lot with each other in sea stories, and likewise Bible passages, or American history Wikipedia article, and so on. If the word “Jefferson” is the last word, then “Washington” may not be far away, and it should hedge its bets on predicting that ‘W’ is the next character, and then if it shows up, go all-in on “ashington”. Such bag-of-words approaches still predict badly, but now we’re down to perhaps <3 bits per character.

What next? Does it stop there? Not if there is enough data and the earlier stuff like learning English vocab doesn’t hem the model in by using up its learning ability. Gradually, other words like “President” or “general” or “after” begin to show the model subtle correlations: “Jefferson was President after…” With many such passages, the word “after” begins to serve a use in predicting the next word, and then the use can be broadened.

By this point, the loss is perhaps 2 bits: every additional 0.1 bit decrease comes at a steeper cost and takes more time. However, now the sentences have started to make sense. A sentence like “Jefferson was President after Washington” does in fact mean something (and if occasionally we sample “Washington was President after Jefferson”, well, what do you expect from such an un-converged model). Jarring errors will immediately jostle us out of any illusion about the model’s understanding, and so training continues. (Around here, Markov chain & n-gram models start to fall behind; they can memorize increasingly large chunks of the training corpus, but they can’t solve increasingly critical syntactic tasks like balancing parentheses or quotes, much less start to ascend from syntax to semantics.) …

The pretraining thesis argues that this can go even further: we can compare this performance directly with humans doing the same objective task, who can achieve closer to 0.7 bits per character. What is in that missing >0.4?

Well—everything! Everything that the model misses. While just babbling random words was good enough at the beginning, at the end, it needs to be able to reason our way through the most difficult textual scenarios requiring causality or commonsense reasoning. Every error where the model predicts that ice cream put in a freezer will “melt” rather than “freeze”, every case where the model can’t keep straight whether a person is alive or dead, every time that the model chooses a word that doesn’t help build somehow towards the ultimate conclusion of an ‘essay’, every time that it lacks the theory of mind to compress novel scenes describing the Machiavellian scheming of a dozen individuals at dinner jockeying for power as they talk, every use of logic or abstraction or instructions or Q&A where the model is befuddled and needs more bits to cover up for its mistake where a human would think, understand, and predict. For a language model, the truth is that which keeps on predicting well—because truth is one and error many. Each of these cognitive breakthroughs allows ever so slightly better prediction of a few relevant texts; nothing less than true understanding will suffice for ideal prediction.
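Those bits-per-character figures are easy to sanity-check, by the way. A quick sketch that compares a uniform byte predictor against a bare letter-frequency model, run on whatever English text file you have lying around (the filename is a placeholder):

import math
from collections import Counter

text = open("sample.txt", encoding="utf-8").read()  # any English prose you have on hand

# A uniform predictor over 256 possible byte values: 8 bits per character by definition.
uniform_bits = math.log2(256)

# A unigram model: predict each character according to its overall frequency in the text.
counts = Counter(text)
total = sum(counts.values())
unigram_bits = -sum((c / total) * math.log2(c / total) for c in counts.values())

print(f"uniform bytes      : {uniform_bits:.2f} bits/char")
print(f"letter frequencies : {unigram_bits:.2f} bits/char")

On ordinary English prose the letter-frequency model lands somewhere around 4-5 bits per character, which is roughly the "almost halve its error" step gwern describes; everything below that has to come from structure rather than raw frequencies.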

As Ilya Sutskever of OpenAI himself puts it:

…when we train a large neural network to accurately predict the next word in lots of different texts from the internet, what we are doing is that we are learning a world model… It may look on the surface that we are just learning statistical correlations in text, but it turns out that to just learn the statistical correlations in text, to compress them really well, what the neural network learns is some representation of the process that produced the text. This text is actually a projection of the world. There is a world out there, and it has a projection on this text. And so what the neural network is learning is more and more aspects of the world, of people, of the human conditions, their hopes, dreams, and motivations, their interactions in the situations that we are in. And the neural network learns a compressed, abstract, usable representation of that. This is what's being learned from accurately predicting the next word.

By the way, how did I get this text? Whisper, of course, another OpenAI transformer, working by much the same principle. The weirdest thing happens if you absent-mindedly run it with the wrong language flag – not the source language to translate into English (it is not explicitly built to translate English into anything else), but simply the language the recording supposedly contains, to be transcribed. The clumsy but coherent output, akin to what you'd get from a child with a dictionary, should if nothing else show that they understand, that they operate on meanings, not mere spectrograms or "tokens":

когда мы тренируем большую нейронную сеть, чтобы аккуратно предсказать следующую слово в много разных текстах из интернета, мы изучаем мирный модель. Это выглядит, как мы изучаем... Это может выглядеть на поверхности, что мы изучаем только статистические корреляции в тексте, но, получается, что изучать только статистические корреляции в тексте, чтобы их хорошо снижать, что научит нейронная сеть, это представление процесса, который производит текст. Этот текст - это проекция мира. В мире есть мир, и он проекционирует этот текст. И что научит нейронную сеть, это больше и больше аспектов мира, людей, человеческих условий, их надежд, мечт и мотивации, их интеракции и ситуации, в которых мы находимся.

(In English, clumsiness preserved: "when we train a big neural network to carefully predict the next word in many different texts from the internet, we are learning a peaceful model. It looks like we are learning... It may look on the surface that we are learning only statistical correlations in text, but it turns out that to learn only statistical correlations in text, to reduce them well, what the neural network learns is a representation of the process that produces the text. This text is a projection of the world. In the world there is a world, and it projects this text. And what the neural network learns is more and more aspects of the world, of people, of human conditions, their hopes, dreams and motivations, their interactions and the situations in which we find ourselves.")
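Reproducing the accident takes a few lines with the open-source whisper package, if you're curious; the filename below is a stand-in for whatever recording of the talk you have:

import whisper  # the open-source openai-whisper package

model = whisper.load_model("medium")
# language="ru" on an English recording is the "wrong flag" described above: the model
# renders the English speech straight into (clumsy) Russian instead of transcribing it
# in the language actually spoken.
result = model.transcribe("sutskever_talk.mp3", language="ru", task="transcribe")
print(result["text"])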

Dismissal of statistics is no different in principle from dismissal of meat. There is no depth to this thought. And it fails to predict reality.

Ok, this reply finally moved the needle for me, and I'll shift my position from "LLMs are a neat statistical trick" to "Maybe LLMs use language to perform some form of 'thinking' in ways not entirely dissimilar from how language facilitates our own thinking."

To be clear, I think we still don't have a principled reason to believe that this paradigm – in this vanilla form, autoregressive LLMs pretrained on text – can carry us to human level or beyond. It might be the case that LeCun is right and LLMs on their own are an off-ramp. It might run into diminishing returns or totally plateau any moment now; just because better «understanding» allows it to make better predictions and we reinforce the latter doesn't mean we can get infinitely much of the former, any more than we can incentivize a human to run barefoot at 100 mph.

But people who seriously bought into such skepticism got caught off-guard by GPT-3 already.

And I expect amazing innovations like adding a backspace to keep the pretraining thesis viable far beyond GPT-4. The number of papers that propose improvements is absolutely mind-boggling; nobody keeps up with building deployable tech on those insights. People who follow the literature see the outline of near-future AI, and it's pretty damn formidable, much more so than the progress in public demos and products would suggest.

It may be that current LLMs are explaining how the "id" part of our brain works. The conscious parts may need some additional work to model.

So access to memory, some hidden subconscious pattern-matching, automated activity, some hidden processes - that's very similar to what LLMs currently output.

I don't know if anyone has had this experience before, but I've had times where my brain decided to make mouth sounds in a word/sentence-matching way that was eerily like it was AI generated. Sometimes I would catch myself even mid-sentence and think, wait, that isn't remotely close to what I'm actually thinking.

So it at least gets close to something that I’ve done in the past as a meatbag.

Thank you for articulating what I was struggling to put into words, especially since I've read everything you've quoted with the exception of Ilya.

I'm saving this for later, it's a knockdown argument against claims that LLMs don't "understand", the only issue being that many of the people making that claim are too fundamentally confused or unaware to follow the argument.

Perhaps most importantly of all, ‘probability theories of cognition’ have existed for decades (arguably longer) and are far from uncommon in both philosophy and neuroscience. All that transformer models, and particularly LLMs, have shown us is that probabilistic next token prediction very likely represents at least a major component of human cognition.

Sure. But Transformers are too obviously inhuman (on the substrate level, in terms of their training objective, data… almost everything), and I do not expect mere conceptual similarity to be persuasive. Moreover, as we've discussed in the past, Chomsky-like elitist contempt for predictive/probabilistic theories is pervasive.

The mind-like complexity of what they do and what they are is and ought to be shown on its own terms.

Incidentally: @WhiningCoil, I've just copypasted @sodiummuffin's diamond puzzle into GPT-4, Claude+, Claude-instant, ChatGPT 3.5, PaLM-2 (all this via Poe), pi.ai separately, and locally WizardLM-1.0-uncensored-30B-q6_k, vicuna-33B q4_k_m, UltraLM 13B q4_1, WizardCoder-15B-V1.0 q4_1, ChatGLM2 (fp16). The first two managed with a perfect chain-of-thought, really, I have nothing to remark on that. Every other model failed with varying levels of idiocy so I didn't bother going through the rest. I don't post screenshots because you apparently find them unpersuasive.

I don't think "so what if it sometimes gets it right" is a good defense of the skeptics' thesis. With technology, "sometimes" very quickly becomes "usually" and then "with higher reliability than any human can provide". What matters is whether the thing ever really happens at all.

I’m guessing the FBI will be slow to adopt new tech.

They have no concept of reason or truth.

I earnestly disagree. If you check the GPT-4 white paper, the original base model clearly had a sense of internal calibration, and while that was mostly beaten out of it through RLHF, it's not entirely gone.

They have a genuine understanding of truth, or at least how likely something is to be true. If they didn't, then I don't know how on Earth they could answer several of the more knotty questions I've asked them.

It is not guaranteed to make truthful responses, but in my experience it makes errors because it simply can't do better, not because it exists in a perfectly agnostic state.

They are literally p-zombies. They are a million monkeys on a million typewriters.

P-zombies are fundamentally incoherent as a concept.

Also, a million monkeys on a million typewriters will never achieve such results on a consistent basis, or at the very least you'd be getting 99.99999% incoherent output.

Turns out, dismissing it as "just" statistics is the same kind of fundamental error as dismissing human cognition as "just" the interaction of molecules mediated by physics. Turns out that "just" entirely elides the point, or at the very least your expectations for what that can achieve were entirely faulty.

P-zombies are fundamentally incoherent as a concept.

What do you mean by this? If you already explained elsewhere (which I have a feeling is the case, but I have been on serious meds for the past week and everything is a little hazy), can you link me to it?

I think I've elaborated on it in replies to the comments made by other people to the one you're replying to! I had to get quite nitty gritty too.

Jesus, lol, even when I think I've accounted for these things I still make a fool of myself.

No worries! I can only hope that going through that massive wall of text proves informative. We all have brain farts haha

I earnestly disagree. If you check the GPT-4 white paper, the original base model clearly had a sense of internal calibration, and while that was mostly beaten out of it through RLHF, it's not entirely gone.

They have a genuine understanding of truth, or at least how likely something is to be true...

You are conflating two vastly different things. Perhaps this is an issue of poor translation between Indian and English, but what GPT has is better described as a notion of "consensus" or "correlation". A degree to which [token a] is associated with [token b], which is emphatically not a concept of true vs false. To illustrate, if you feed your LLM a bunch of Harry Potter fan fiction as a training set you're going to get a lot of Malfoy/Potter gay sex regardless of how Rowling may have written those characters, and this is not an aberration, this is the system operating exactly as designed.

Liberals assume minorities can't speak English well, conservatives don't. Maybe you were the blue-tribe progressive all along?

Perhaps this is an issue of poor translation between Indian and English

Even after seeing your explanation below, it's hard to read this comment in good faith. (What language is "Indian", even?) You're either being profoundly ignorant or just antagonistic.

You know what, the more I think about it the more absurd this warning feels.

Are we really going to pretend that acknowledging a potential cultural/linguistic difference that may in fact be relevant to the discussion at hand is somehow more "ignorant", "antagonistic" or *chuckles* "racist" than the typical HBD or Joo post that you guys routinely let slide?

Really?

Are we really going to pretend that acknowledging a potential cultural/linguistic difference that may in fact be relevant to the discussion at hand is somehow more "ignorant", "antagonistic" or chuckles "racist" than the typical HBD or Joo post that you guys routinely let slide?

You're talking to someone who has established, with a long history here, that his English proficiency is native level, even on scientific and medical subjects. I have never seen you pull some kind of "Well, maybe the concept just doesn't translate well from Russian" when arguing with Daseindustries. I'm not buying it.

I'm gonna be blunt: I don't believe you actually believed there was a linguistic misunderstanding. I think you wanted to imply self_made_human is ignorant, and do it with an added dose of condescension. "Translation between Indian and English" indeed - if you know that there are multiple Indian national languages (it's actually more like 20+, not five), then you know perfectly well how stupid and ignorant it is to refer to "Indian" as a language. I'm extending you as much charity as I can to assume you were being ignorant and not intentionally insulting and - yes - racist.

I had to restrain myself from reporting this myself, or calling him out (more than I did) for the boorish reply. My (near-angelic) patience has been rewarded.

What language is "Indian", even?

Whichever one the dude speaks natively. India has like 5+ national languages, and IME confusing Urdu or Bengali for Hindi is often a good way to start a fight.

I don't think the example suffices. What a model is trained on is really important for what it does. An LLM trained more on text describing reality should more frequently make true statements.

An LLM trained more on text describing reality should more frequently make true statements.

Yes, but where do you find that text? The problem is that, contra what several other users here keep saying, LLMs are not reasoning engines, they are pattern generators. Which in turn brings us back to my post from a month ago.

You can't ask an LLM to restrict itself to only giving "true answers" because LLMs don't actually have a concept of true vs false.

Perhaps this is an issue of poor translation between Indian and English

actually kind of racist comment

Perhaps this is an issue of poor translation between Indian and English

What the hell is that supposed to mean? I speak English just as well as you do, Hlynka, and likely better. You can condescend to someone else.

Humans develop their sense of truth and falsity from comparing/correlating new evidence to previous evidence and their environs, biased by whatever intrinsic priors they were born with. The only difference here is that GPT-4 has no appendages to interact with the world and seek out further evidence, merely what we've trained it with.

To illustrate, if you feed your LLM a bunch of Harry Potter fan fiction as a training set you're going to get a lot of Malfoy/Potter gay sex regardless of how Rowling may have written those characters, and this is not an aberration, this is the system operating exactly as designed.

I fail to see how this has any bearing on the "truth" of it all. Depending on how big the model is, it can, when asked, almost certainly tell you that in the primary text Harry wasn't casting spells on Malfoy's wand.

That is so clearly obvious to me I'm not even going to burn the 100 joules or so of energy it would need to spin up a GPT-4 instance to confirm it.

Do you think an AI that hadn't read the original HP and had a knowledge base entirely of yaoi fanfic has any way of knowing better? Or a human, for that matter.

What the hell is that supposed to mean?

Exactly what it says. I know of at least a couple of SE Asian languages (e.g. Tagalog and Malay) where the distinction between Yes/No, True/False, and Agree/Disagree is significantly less sharp than it is in English, and I was wondering if something similar may be going on here, as by my reading your second statement doesn't follow from the first at all, nor from anything in OpenAI's white paper as far as I recall. (Assuming we are both referring to the same paper)

I fail to see how this has any bearing on the "truth" of it all.

If you set aside for a moment your pre-existing knowledge that Harry Potter and Draco Malfoy are both fictional characters "how this has any bearing on the Truth" ought to be immediately apparent. Stop for a moment and reflect. Ask yourself WHY you believe that GPT's description of Harry and Draco's relationship would in any way resemble that of the "real" people (or in this case that of the characters as originally written).

A statement being "True" is not the same thing as agreeing with a statement, or that statement comporting with the popular consensus, and the seeming conflation of these three distinct stances is what initially led me to suspect some sort of translation issue might be at play. Is it possible that you are using the word "truth" when you really mean something closer to "popular" or "I agree with"?

With that out of the way, to answer your question: do I think a human who hadn't read the original HP and had a knowledge base entirely of yaoi fanfic would know any better? Yes I do, because in contrast to GPT I would expect a normal human to display some level of contextual awareness/meta-knowledge, i.e. being aware of yaoi fanfic and its tropes, or being able to assign a confidence level to a prediction that was anything other than completely arbitrary.

I earnestly disagree. If you check the GPT-4 white paper, the original base model clearly had a sense of internal calibration, and while that was mostly beaten out of it through RLHF, it's not entirely gone.

They have a genuine understanding of truth, or at least how likely something is to be true. If they didn't, then I don't know how on Earth they could answer several of the more knotty questions I've asked them.

It is not guaranteed to make truthful responses, but in my experience it makes errors because it simply can't do better, not because it exists in a perfectly agnostic state.

I think you are flatly wrong about this. I've tried to find literally anything to back up what you are saying, and come up with zilch. Instead, I wound up with this.

https://www.scribbr.com/ai-tools/is-chatgpt-trustworthy/

A good way to think about it is that when you ask ChatGPT to tell you about confirmation bias, it doesn’t think “What do I know about confirmation bias?” but rather “What do statements about confirmation bias normally look like?” Its answers are based more on patterns than on facts, and it usually can’t cite a source for a specific piece of information.

This is because the model doesn’t really “know” things—it just produces text based on the patterns it was trained on. It never deliberately lies, but it doesn’t have a clear understanding of what’s true and what’s false. In this case, because of the strangeness of the question, it doesn’t quite grasp what it’s being asked and ends up contradicting itself.

https://www.scoutcorpsllc.com/blog/2023/6/7/on-llms-thought-and-the-concept-of-truth

Thus far, we’re really just talking about sentence construction. LLMs don’t have a concept of these as “facts” that they map into language, but for examples like these - it doesn’t necessarily matter. They’re able to get these right most of the time - after all, what exactly are “inferences” and “context clues” but statistical likelihoods of what words would come next in a sequence?

The fact that there is no internal model of these facts, though, explains why they’re so easily tripped up by just a little bit of irrelevant context.

https://fia.umd.edu/comment-llms-truth-and-consistency-they-dont-have-any-idea/

They have zero idea what's true. They only know the probabilities of words in text. That's NOT the same thing as "knowing" something--it's a bit like knowing that "lion" is the most likely word following "king of the jungle..." without having any idea about monarchies, metaphor, or what a king really is all about.

The folks at Oxford Semantic Technologies wrote an interesting blog post about LLMs and finding verifiable facts. They call the fundamental problem the "Snow White Problem." The key idea is that LLMs don't really know what's true--they just know what's likely.

He is likely referring to this from pages 11-12 of the GPT-4 whitepaper:

GPT-4 can also be confidently wrong in its predictions, not taking care to double-check work when it’s likely to make a mistake. Interestingly, the pre-trained model is highly calibrated (its predicted confidence in an answer generally matches the probability of being correct). However, after the post-training process, the calibration is reduced (Figure 8).
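If "calibrated" sounds abstract, it just means that when the model reports 80% confidence it turns out to be right about 80% of the time. A toy check, with made-up numbers standing in for real eval data:

import numpy as np

# Hypothetical data: the model's stated confidence for each answer, and whether it was right.
confidence = np.array([0.95, 0.80, 0.62, 0.90, 0.55, 0.75, 0.99, 0.65])
correct    = np.array([1,    1,    0,    1,    1,    0,    1,    1])

bins = np.linspace(0.5, 1.0, 6)  # five 10%-wide confidence buckets
for lo, hi in zip(bins[:-1], bins[1:]):
    mask = (confidence >= lo) & (confidence < hi)
    if mask.any():
        print(f"claimed {lo:.1f}-{hi:.1f}: actually right {correct[mask].mean():.2f}")
# A well-calibrated model's "actually right" rate tracks its claimed confidence in every
# bucket; that is what Figure 8 of the whitepaper plots.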

In any case, the articles you quote are oversimplified and inaccurate. Predicting text (and then satisfying RLHF) is how it was trained, but the way it evolved to best satisfy that training regime is a bunch of incomprehensible weights that clearly have some sort of general reasoning capability buried in there. You don't need to do statistical tests of its calibration to see that, because something that was truly just doing statistical prediction of text, without having developed reasoning or a world-model to help with that task, wouldn't be able to do even the most basic reasoning like this unless it already appeared in the text it was trained on.

It's like saying "humans can't reason, they're only maximizing the spread of their genes". Yes, if you aren't familiar with the behavior of LLMs/humans, understanding what they evolved to do is important to understanding that behavior. It's better than naively assuming that they're just truth-generators. If you wanted to prove that humans don't reason you could point out all sorts of cognitive flaws and shortcuts with obvious evolutionary origins and say "look, it's just statistically approximating what causes more gene propagation". Humans will be scared of things like spiders even if they know they're harmless because they evolved to reproduce, not to reason perfectly, like an LLM failing at Idiot's Monty Hall because it evolved to predict text and similar text showed up a lot. (For that matter humans make errors based on pattern-matching ideas to something they're familiar with all the time, even without it being a deeply-buried instinct.) But the capability to reason is much more efficient than trying to memorize every situation that might come up, for both the tasks "predict text and satisfy RLHF" and "reproduce in the ancestral environment", and so they can do that too. They obviously can't reason at the level of a human, and I'd guess that getting there will involve designing something more complicated than just scaling up GPT-4, but they can reason.

Yes, that is the section I had previously read. I let my severe annoyance at Hlynka's uncouthness stop me from doing the work of reading through it again to find it. Good to know it wasn't a fevered hallucination on my part haha

You don't need to do statistical tests of its calibration to see that, because something that was truly just doing statistical prediction of text, without having developed reasoning or a world-model to help with that task, wouldn't be able to do even the most basic reasoning like this unless it already appeared in the text it was trained on.

I opened up Bing Chat, powered by GPT4, and I tried that example. I got "The diamond is still inside the thimble inside the coffee cup on the kitchen counter". In fact, I've yet to see a single example of an LLM's supposed ability to reason replicated outside of a screenshot.

Well. I tried Bing Chat just now and got this.

It is worth noting that the settings besides "Creative" tend to have worse performance for these sorts of tasks. You may want to rerun it on that. Personally I don't have any difficulty believing LLMs can perform some semblance of "reasoning" -- even GPT-3 can perform transformations like refactoring a function into multiple smaller functions with descriptive names and explanatory comments (on a codebase it's never seen before, calling an API that didn't exist when its training data was scraped). It is obviously modeling something more general there, whether you want to call it "reasoning" or not.

From following a rather discreet Twitter account belonging to one of the lead devs for Bing Chat, I've learned that Creative mode is the one that most consistently uses GPT-4. All the others use older models, at least most of the time.

Even Creative can apparently relegate what another model deems low-complexity queries to a simpler LLM.

(Running GPT-4 as a practically free public service is expensive)

Despite being based on GPT-4, Bing is apparently well-known for performing dramatically worse. There have been some complaints of GPT-4's performance degrading too, presumably due to some combination of OpenAI trying to make it cheaper to run (with model quantization?) and adding more fine-tuning trying to stop people from getting it to say offensive things, but hopefully not to the extent that it would consistently fail that sort of world-modeling. (If anyone with a subscription wants to also test older versions of GPT-4, it sounds like they're still accessible in Playground?)

I don't think it's plausible that all the examples of GPT-4 doing that sort of thing are faked, not when anyone shelling out the $20 can try it themselves. And people use it for things like programming, you can't do that without reasoning, just a less familiar form of reasoning than the example I gave.

You don't even need $20, if you're willing to hunt down discord bots that largely use leaked API keys (some are ad-supported).

The ChatGPT subreddit's official discord server has one or two, and while I know better ones that are less legit, I don't broadcast their existence more than I have to, because that only increases the likelihood of losing free access to something I really enjoy.

Bing Chat, even in Creative mode, is only a poor man's GPT-4.

I don't think it's plausible that all the examples of GPT-4 doing that sort of thing are faked, not when anyone shelling out the $20 can try it themselves. And people use it for things like programming, you can't do that without reasoning, just a less familiar form of reasoning than the example I gave.

My problem is, while I'm sure that not all the examples of GPT-4 seeming to get complex reasoning tasks are fake, if they cannot be replicated, what good are they? If GPT-4's ability to "reason" is ephemeral and seemingly random, is it really reasoning, or is it just occasionally getting lucky at ordering abstract tokens for its monkey overlords?

You know, it's funny, I went through the linked whitepaper. Skimmed, mostly. It made few positive, objective claims about GPT-4's ability to reason. It mostly said it could reason "better" than previous iterations, and had been trained on a dataset to encourage mathematical reasoning. Notably they say:

It can sometimes make simple reasoning errors which do not seem to comport with competence across so many domains

I saw some of the prompts where they asked GPT-4 to explain its reasoning, and was underwhelmed. They were extremely rudimentary mathematical tasks of the 5th-grade word problem sort, and its purported "reasoning" could easily have been imitation of its training data. When I saw that, I took a closer look at how it performed on assorted tests, and saw it comprehensively failed the AP English Language and Composition and AP English Literature and Composition exams. Which makes sense to me, because a lot of those tests involve more generalized and flexible reasoning than the sorts of formalized mathematical logic examples it might plausibly be trained to imitate.

When I saw that, I took a closer look at how it performed on assorted tests, and saw it comprehensively failed the AP English Language and Composition and AP English Literature and Composition exams. Which makes sense to me, because a lot of those tests involve more generalized and flexible reasoning than the sorts of formalized mathematical logic examples it might plausibly be trained to imitate.

Come on, most of the UK parliament can't even give the probability of two coins both coming up heads: https://www.bbc.com/news/uk-19801666

Most people can't even imitate intelligence, by your logic.

GPT-4 has vastly superhuman knowledge, superhuman language knowledge, superhuman speed. Its reasoning skills are well above most of humanity's. Most people can't program at all, let alone in all the languages it knows, or use as much software as it can. These niggling flaws in the AP English exams probably have more to do with the arcane and arbitrary scoring mechanisms of those tests. It can write just fine. Its prose is not amazing and tends to be rather clichéd and predictable, but that has a lot to do with the RLHF.


My problem is, while I'm sure that not all the examples of GPT-4 seeming to get complex reasoning tasks are fake, if they cannot be replicated, what good are they?

I am saying they can be replicated, just by someone who, unlike you or me, has paid the $20. I suppose it is possible that the supposed degradation in its capabilities has messed up these sorts of questions as well, but probably not.

If GPT-4's ability to "reason" is ephemeral and seemingly random, is it really reasoning, or is it just occasionally getting lucky at ordering abstract tokens for its monkey overlords?

There is a big difference between random guessing and having a capability that sometimes doesn't work, particularly if the chance of randomly getting the right result without understanding is low enough. Text generators based on Markov chains could output something that looked like programming, but they did not output working programs, because such an outcome is unlikely enough that creating a novel program is not something you can just randomly stumble upon without some idea of what you're doing. In any case, as far as I know GPT-4 is not that unreliable, especially once you find the prompts that work for the task you want.

Which makes sense to me, because a lot of those tests involve more generalized and flexible reasoning than the sorts of formalized mathematical logic examples it might plausibly be trained to imitate.

How well it reasons is a different question from whether it reasons at all. It is by human standards very imbalanced in how much it knows vs. how well it reasons, so yes people who think it is human-level are generally being fooled by its greater knowledge. But the reasoning is there and it's what makes a lot of the rest possible. Give it a programming task and most of what it does might be copying common methods of doing things that it came across in training, but without the capability to reason it would have no idea of how to determine what methods to use and fit them together without memorizing the exact same task from elsewhere. So practical use is generally going to involve a lot of memorized material, but anyone with a subscription can come up with novel questions to test its reasoning capabilities alone.

If GPT-4's ability to "reason" is ephemeral and seemingly random, is it really reasoning, or is it just occasionally getting lucky at ordering abstract tokens for its monkey overlords?

I think the issue is that a human's ability to "reason" is also ephemeral and seemingly random as well. Just less random, with a lower failure rate, but still fairly random and certainly ephemeral for even the most reasonable and logical of people. Given that, the difference in ability to reason is one of degree, not of kind. The question remains if the random failures of reasoning in LLMs can get small/rare enough to the point that it's similar to that of a somewhat competent human.

It's possible that human knowledge is also fundamentally statistical or associative in nature, but we have additional faculties that LLMs don't, and it's these deficits which are responsible for their peculiar errors, not inability to have knowledge per se. For example, LLMs almost certainly lack second-order knowledge, i.e. knowledge about what they know. Facts about the model itself are not part of their training data, nor does their execution make any provision to evaluate the prompt so that self-facts are relevant to the output. This means LLMs lack any capacity for introspection or self-representation, and therefore can't possibly respond to challenging questions with "I don't know" or "I don't understand" -- they don't have an I! This is a significant limitation, but philosophically a different one from the inability to possess knowledge, unless your definition of knowledge requires these additional functions in the first place.

Alright. I would go as far as to say that humans don't have an internal detector for platonic Truth.

We have beliefs we hold as axiomatic, beliefs we are extremely confident are true, based on all the "statistical correlations" embodied in our cognition and interaction with the world.

I don't know if GPT-4 can be said to have axioms, but if it has a mechanism for eliciting the veracity of internal and external statements, that seems to be what we're doing ourselves.

Humans lie, confabulate, hallucinate or are simply factually incorrect all the time, and I don't see anyone holding us to the same standards as LLMs!

I mean, I can agree people are stupid. Hell, I'm probably stupid in a lot of ways too! My wife reminds me of it every time we are in a social setting and I alienate her friends.

Even so, LLMs lack most of the faculties that allow humans to get closer to truth. They have zero interaction with base reality. They take Socrates' allegory of the cave and turn it into literally how they experience the world, through a training dataset. And, as I keep mentioning, their "cognition", such as it is, isn't even based on the statistical correlations of things being true, but on what words come after other words. Without even knowing what any of those words mean! It's all just abstract tokens to them.

Imagine this were all being done in some sort of unrealistically massive mechanical or analog computer! Would you still consider it thinking?

LLMs lack most of the faculties that allow humans to get closer to truth. They have zero interaction with base reality. They take Socrates' allegory of the cave and turn it into literally how they experience the world, through a training dataset.

Suppose you had a large model not far removed from those that exist today that took in an input stream, made predictions based on that, performed actions based on those predictions, and then observed changes in its input based on those actions, using the new input to update itself and improve its predictions. Would that change your perspective?
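Schematically, I mean something like this toy loop; every class and name here is a made-up stand-in, not a claim about how any real system is built:

import random

class ToyEnvironment:
    # Cycles through a fixed sequence; in this toy, actions don't actually change the world.
    def __init__(self, sequence):
        self.sequence, self.t = sequence, 0

    def apply(self, action):
        self.t += 1
        return self.sequence[self.t % len(self.sequence)]

class ToyModel:
    # The "model" is just a table of which observation tended to follow which.
    def __init__(self, alphabet):
        self.alphabet = alphabet
        self.transitions = {}

    def predict(self, observation):
        return self.transitions.get(observation, random.choice(self.alphabet))

    def update(self, observation, new_observation):
        self.transitions[observation] = new_observation

env, model, observation = ToyEnvironment("ABC"), ToyModel("ABC"), "A"
for _ in range(30):
    prediction = model.predict(observation)          # predict what comes next
    new_observation = env.apply(action=prediction)   # act, then observe what actually happened
    model.update(observation, new_observation)       # update on the surprise
    observation = new_observation

print(model.transitions)  # converges to {'A': 'B', 'B': 'C', 'C': 'A'}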

I don't think it should, because that doesn't give us any insight into whether machine models have qualia. I suspect it's very important for improving capabilities, but it doesn't offer any bridge to relating consciousness to material reality. If they're p-zombies now, improved abilities to get feedback from the world doesn't make them any less of a p-zombie. Just a more effective one. (I also suspect humans will react to a sufficiently powerful p-zombie by treating it as a real being, regardless of the p-zombieness of it, so it's kind of a moot point.)

Imagine this were all being done in some sort of unrealistically massive mechanical or analog computer! Would you still consider it thinking?

I'm too pressed for time for a longer reply, but yes! I absolutely see that as being true.

I see you and me as massive analog computers, what of it?

P-zombies are fundamentally incoherent as a concept.

What do you mean by "incoherent"? Do you mean that the concept of a p-zombie is like the concept of a square triangle? - something that is obviously inconceivable or nonsensical. Or do you mean that p-zombies are like traveling faster than the speed of light? - something that may turn out to be impossible in reality, but we can still imagine well enough what it would be like to actually do it.

If it's the latter then I think that's not an unreasonable position, but if it's the former then I think that's simply wrong. See this post on LW, specifically the second of the two paragraphs labeled "2.)" because it deals with the concept of p-zombies, and see if you still think it's incoherent.

Do you mean that the concept of a p-zombie is like the concept of a square triangle? - something that is obviously inconceivable or nonsensical. Or do you mean that p-zombies are like traveling faster than the speed of light? - something that may turn out to be impossible in reality, but we can still imagine well enough what it would be like to actually do it.

Those are the same thing. I think you cannot rigorously imagine FTL travel in our universe while holding the rest of our physics intact, and you cannot imagine FTL travel for any universe whatsoever similar to ours where "lightspeed" refers to the same idea. The notion of travel as moving x m per second is a simplification of the math involved; that we can write "the spaceship could move at 3 gajillion km per second" and calculate the distance covered in a year does not really entail imagination of it happening, no more than "Colorless green ideas sleep furiously" does.

Incoherent concepts are incoherent exactly because they fall apart when all working bits are held in the well-trained mind at once; but illusions of understanding and completeness, often expressed as the erroneous feeling that some crucial section of the context was precomputed and you can just plug in the cached version, allow them to survive.

Qualia debate is gibberish; a P-zombie must compute a human-like mind to generate its behavior, there is no other way for our bodies to act like we do.

…Actually, let me explain. There is a causal chain between zombie-state A and A'. Links of this chain attend to themselves via mechanisms conserved between a person and a zombie. This condition is what is described as quale, consciousness etc. in the physicalist theory, and it is a necessary causal element of the chain producing the same outputs. It is irrelevant whether there exists a causally unconnected sequence of epiphenomenal states that Leibniz, Chalmers and others think implements their minds: a zombie still has its zombie-quale implemented as I've described.

I posit that it is not incoherent to say that zombie-quale don't matter, don't count and don't explain human consciousness, because muh Hard Problem. It is patently non-parsimonious, non-consilient and ugly, in my view, but it's coherent. It just means that you also claim that humans are blind with regard to their zombie-quale, physicalist-quale; that the process which generates our ones has nothing to do with the process which generates informationally identical ones in our bodies.

It is only incoherent to claim that a zombie doesn't have any quale of its own, that it's not like anything to be a zombie for a zombie. We know that physics exist [citation needed], we know that "physicalist quale" exist, we know they are necessarily included in the zombie-definition as an apparently conscious, genuine human physical clone. So long as words are used meaningfully, it is not coherent for something to exist but also not exist.

(Unless we forgo the original idea (actual physical and behavioral identity) and define zombie in a comically pragmatic manner like Weekend at Bernie's or something, by how well it fools fools.)

P.S. it seems philosophers distinguish "incoherent" and "metaphysically impossible" concepts. I'm not sure I agree but this is pretty deep into the woods.

It is only incoherent to claim that a zombie doesn't have any quale of its own, that it's not like anything to be a zombie for a zombie. We know that physics exist [citation needed], we know that "physicalist quale" exist, we know they are necessarily included in the zombie-definition as an apparently conscious, genuine human physical clone. So long as words are used meaningfully, it is not coherent for something to exist but also not exist.

Why would this be incoherent to claim? It might be wrong, but I think it's meaningful enough to be coherent. Consider an LLM that has been trained on human output.

For humans, the causal chain is "human experiences quale, human does action downstream of experiencing quale e.g. writes about said quale". For an LLM, the causal chain is "a bunch of humans experience qualia and write about their qualia, an LLM is trained on token sequences that were caused by qualia, LLM creates output consistent with having qualia". In this case, the LLM could perfectly well be a P-zombie, in the sense of something that can coherently behave as if it experienced qualia while not necessarily itself actually experiencing those qualia. There are qualia causally upstream of the LLM writing about qualia, but the flow of causality is not the same as it is in the case of a human writing about their own qualia, and so there's no particular reason we expect there to be qualia between steps A and A' of the causal chain.

In this case, the LLM could perfectly well be a P-zombie

No.

An LLM does not, as far as we know, employ an actual physical human brain for computation. A [strong version of] p-zombie does; its causal chains are exactly the same as in our brains, it is not an arbitrary Turing-test-passing AI. I think that it "feels like something" to be an LLM computation too, but it very likely doesn't feel like having human quale.

It is obviously unwarranted to say that a system that can ape a human with its behaviors computes a human mind or any part thereof, humans can have low standards among other reasons. And in general, our external behaviors are a low-dimensional lossy and noisy projection of our internal states, so the latter cannot be fully inferred from the former, at least in realistic time (I think).

My argument hinges on the fact that a brain contains events that, from an information perspective, suffice to be described as quale with regard to other events (that are described as sensations). It is coherent to speculate that e.g. there is such a thing as an immaterial human soul and that it does not parse these events, and instead works in some other way. It is not coherent to say that they exist but also don't exist.

A [strong version of] p-zombie does; its causal chains are exactly the same as in our brains, it is not an arbitrary Turing-test-passing AI.

Huh, so per wikipedia there are a number of types of P-zombies -- I think I'm thinking of behavioral zombies (ones that behave in a way indistinguishable from a human with qualia but do not themselves experience qualia) while you're thinking of neurological zombies (ones that behave in a way indistinguishable from a human with qualia and due to the same physical process as the human with qualia). And yeah, a neurological zombie does seem pretty incoherent (I suppose it could be coherent if the qualia we experience are not the cause of our discussion of those qualia, but then it doesn't seem terribly interesting).

BTW you can probably round my perspective to "the predictive processing theory of consciousness is basically correct" without losing much information.

I think behavioral zombies defined as such are just not interesting in the age of LLMs. It doesn't take much to fool people.

A subtler hypothetical subtype of a behavioral zombie that actually precisely matches a specific person's behavior – one that is not pre-recorded but generated by the zombie's own causality in the same situations – might be interesting though, and I think it amounts to the neurological one, or contains it somehow.

Those are the same thing.

They are not.

The laws of physics were not handed to us by God, nor are they logically necessary a priori truths. We can imagine them being different with no threat of logical incoherence.

When you said in your other post:

How does a universe work with only Newtonian physics? Subatomic scale doesn't work, astronomical objects don't work, nothing works. Newtonian physics is a sketch for a limited range of conditions, not the true generating algorithm of the kind that modern theoretical physics aspires to decipher.

it seems to me that you were suggesting that, whatever the ultimate nature of this reality is, it is therefore the only coherently conceivable reality. But this simply strikes me as a failure of imagination.

For any conceivable set of phenomena - a spaceship moving 3 gajillion km per second in a universe that is otherwise like ours, a Rick and Morty crayonverse, etc - it is easy to construct a set of "laws" that would generate such a reality. Instead of the universe being governed by simple law-like equations, you can imagine it as being governed by a massive arbitrary state table instead. At each time step, the universe simply transitions from one state to the next. The contents of each state are arbitrary and have no necessary relationship to each other; the only regularity is the continual transition from one state to the next. The "laws of physics" for this universe would then look like:

if state == S_0 then transition to S_1;

if state == S_1 then transition to S_2;

if state == S_2 then...

and so on. There is no contradiction here, so there is nothing incoherent. It's certainly unparsimonious, but "unparsimonious" is not the same thing as "incoherent".
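To make the toy concrete, here is a minimal sketch in Python (the state names and table contents are arbitrary placeholders invented purely for illustration, not anything from the discussion): a universe whose only "law" is a lookup table mapping each state to its successor.

```python
# A toy "state-table universe": no equations, just an arbitrary table
# mapping each state to its successor. The states have no necessary
# relationship to each other; the only regularity is the step from
# one to the next.
transition_table = {
    "S_0": "S_1",
    "S_1": "S_2",
    "S_2": "S_3",
    "S_3": "S_0",  # arbitrary choice: loop back to the start
}

def run(initial_state, steps):
    """Advance the toy universe by repeated table lookup."""
    state = initial_state
    history = [state]
    for _ in range(steps):
        state = transition_table[state]
        history.append(state)
    return history

print(run("S_0", 6))  # ['S_0', 'S_1', 'S_2', 'S_3', 'S_0', 'S_1', 'S_2']
```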

Qualia debate is gibberish

Can you explain what you mean by this? Are you saying that all claims and arguments that people make about qualia are gibberish, or are you just reiterating your distaste for the concept of p-zombies here?

There is a causal chain between zombie-state A and A'. Links of this chain attend to themselves via mechanisms conserved between a person and a zombie. This condition is what is described as quale, consciousness etc. in the physicalist theory, and it is a necessary causal element of the chain producing the same outputs. It is irrelevant whether there exists a causally unconnected sequence of epiphenomenal states that Leibniz, Chalmers and others think implements their minds: a zombie still has its zombie-quale implemented as I've described.

I'm concerned that this may be circular reasoning. Sure, if qualia just are defined as the causal chain of your brain states, then yes, obviously any purported p-zombie would have to have qualia too and the concept of p-zombies would be incoherent. But that's precisely the claim that's at issue! Qualia aren't just defined as the causal chain of your brain states - not in the way that a triangle is defined as having 3 sides. We can easily imagine that qualia have nothing to do with brain states. We can imagine that they're something different instead - we can imagine that they're properties of a non-spatiotemporal Cartesian soul, for instance. We can coherently imagine this, so we can coherently imagine p-zombies as well.


For what it's worth: I don't think that p-zombies are possible in reality (at least it's not something I'd bet on), but I am a believer in the Hard Problem. I don't think that qualia can be made to fit with our current understanding of physics. I don't think we're ever going to find that qualia falls out as a natural consequence of e.g. quantum electrodynamics; I think it would be a category error to think otherwise. I am sympathetic to (without full-throatedly endorsing) Bernardo Kastrup's view that consciousness is what is most fundamental, and "matter" is derivative and/or illusory. Alternatively, I'm also sympathetic to panpsychist views that posit consciousness as a new fundamental property alongside e.g. spin and charge. None of these views entail that p-zombies are actually possible.

it seems to me that you were suggesting that, whatever the ultimate nature of this reality is, it is therefore the only coherently conceivable reality

Not exactly. I am saying that there is only one way a reality exactly like this can conceivably work, and «our reality but with laws X» models are incoherent in the final analysis, only saved by our failure to be scrupulous; this applies to casual hypotheticals and to scientific theories alike. It's basically a tautology.

But this simply strikes me as a failure of imagination.

From my perspective, it's more like a failure of suspension of disbelief.

Instead of the universe being governed by simple law-like equations, you can imagine it as being governed by a massive arbitrary state table instead. At each time step, the universe simply transitions from one state to the next. The contents of each state are arbitrary and have no necessary relationship to each other; the only regularity is the continual transition from one state to the next.

Ah yes, Dust Theory.

I believe that this kind of universe cannot exist nor even be rigorously imagined, because there is no legitimate content to these notions of «governance» and «transition». What is transited, exactly? How is this set distinguishable from an unstructured heap of unrelated elements, self-contained sub-realities or just bit strings? It's not, but for the extraneous fact that there in some sense can exist a list or a table arbitrarily distinguishing them and referring to them as elements of a sequence (naturally, all such lists would be of equal status). But this does not governance make. You can think it's coherent metaphysics, but I claim you're wrong. The continuum of states exists as the rule of transformations over some contents. It's sophistry to say «well the rule is that there's no rule, only sequence».

In any case, the merit of dust theory or Ruliad is some Neutronium-man to the actual debate we're having. I don't need to concede remotely this much. A world of crayons or Newtonian physics or P-zombies is of course never argued to be an arbitrary sequence of bit strings, the (malformed) idea is that it is a continuous reality like ours, supporting conscious minds, with lawful state transitions.

I'm concerned that this may be circular reasoning. Sure, if qualia just are defined as the causal chain of your brain states, then yes

It's all circular reasoning, always has been. But, more seriously, I think the circularity is on the non-physicalist side. Consider:

Many definitions of qualia have been proposed. One of the simpler, broader definitions is: "The 'what it is like' character of mental states. The way it feels to have mental states such as pain, seeing red, smelling a rose, etc."

Frank Jackson later defined qualia as "...certain features of the bodily sensations especially, but also of certain perceptual experiences, which no amount of purely physical information includes"

We know of physical differences between kinds of information accessibility, expressed in medical terms like anosognosia and others. It is a fact about the world that needs to be included in any serious further theorizing. (In principle, you do not get to restrict the set of facts considered and then claim your model is «coherent» because it dodges contradictions).

We, therefore, can point (for some special cases, point very well) at the brain correlate of the delta between a sensation «just happening» with no accessibility to the person and a sensation «being felt» and say «lo, this is a quale», citing the first definition. Its implied conditions are satisfied, and this has nothing to do with circular insistence on physicalism, only with recognition that physical reality exists; this thing exists in it and is available to the zombie, even if it is not available to a «non-spatiotemporal Cartesian soul».

If we circularly define a quale as something that is not purely physical, then of course this delta can't be a quale, but I think this would just be special pleading, not some fancy, equally valid theory.

We can coherently imagine this

I don't think you can, but whatever. What do you do with existing zombie-quale, then? Do you just say they don't matter, or that they're fake news? I've covered that already. This is a coherent theory… in a sense.

I believe that this kind of universe cannot exist nor even be rigorously imagined, because there is no legitimate content to these notions of «governance» and «transition». What is transited, exactly? How is this set distinguishable from an unstructured heap of unrelated elements, self-contained sub-realities or just bit strings? It's not, but for the extraneous fact that there in some sense can exist a list or a table arbitrarily distinguishing them and referring to them as elements of a sequence (naturally, all such lists would be of equal status). But this does not governance make. You can think it's coherent metaphysics, but I claim you're wrong. The continuum of states exists as the rule of transformations over some contents. It's sophistry to say «well the rule is that there's no rule, only sequence».

These are all questions that you can ask just as well about our actual universe.

Tell me the exact ontological status of our laws of physics and how they "govern" our universe, and I'll tell you the exact ontological status of the state table and how it "governs" a different hypothetical universe.

Frank Jackson later defined qualia as "...certain features of the bodily sensations especially, but also of certain perceptual experiences, which no amount of purely physical information includes"

Well, that was a mistake on his part, and I wouldn't offer that as a "definition".

We know of physical differences between kinds of information accessibility, expressed in medical terms like anosognosia and others. It is a fact about the world that needs to be included in any serious further theorizing. (In principle, you do not get to restrict the set of facts considered and then claim your model is «coherent» because it dodges contradictions).

I think part of the disconnect here is that you're underestimating what a high bar it is to show that something is logically incoherent.

I am typing this message on a computer right now - or at least it sure seems that way. I am seeing the computer, I am touching it. I am seeing that my messages are being posted on the website, which couldn't be happening if I didn't have a computer. All the evidence is telling me that there is a computer in front of me here right now. And yet it is still logically coherent for me to claim that computers don't actually exist. It's coherent because I can make up any bullshit I want to make my beliefs cohere with each other and explain away contrary evidence. Maybe the only two entities that actually exist are me and Descartes' evil demon, and the demon is making me hallucinate the whole rest of the universe, including computers. I'm not logically obligated to include any purported facts about the world in my "serious further theorizing", assuming that I can just explain those facts away instead. Because we're not doing "serious further theorizing"; we're arguing about the internal logical coherence of a concept.

A p-zombie is not a "model"; it's a concept. The internal logical consistency of the concept is independent of whether it's actually a real thing in our reality or not.

If you want to look at how people have tried to argue for the incoherence of p-zombies in the literature, there are some references here:

Premise 2 is a more frequent target for critics. There are two different reasons why one might reject the claim that the zombie hypothesis, (P&¬Q), is apriori coherent. Some theorists argue that causal relations are crucial to determining the reference of phenomenal terms. Analytic functionalists, for instance, hold that phenomenal predicates like ‘pain’ can be defined apriori by the causal role pains play in commonsense psychology (e.g., Lewis 1966, 1980). Other theorists argue that nothing can count as a pain unless it is appropriately causally related to our judgments about pain (e.g., Shoemaker 1999; Perry 2001).

The crucial thing here is that these arguments start with considerations that are internal to the concept of pain itself and use them to argue that the concept of p-zombies leads to internal incoherence.

I haven't actually read any of the papers referenced so I can't evaluate the arguments right now. I take the main thrust to be something like, "it is a priori part of the concept of qualia that they play a causal role in our behavior", which would entail that p-zombies are incoherent. I disagree with the premise. Although I do acknowledge that it's not blatantly circular in the way that e.g. defining qualia as something physical would be.

zombie-quale

I am unfamiliar with this term, and I wasn't able to determine what it meant just from reading your posts. Can you elaborate on this concept?

Tell me the exact ontological status of our laws of physics and how they "govern" our universe, and I'll tell you the exact ontological status of the state table

I don't think this statement has any content beyond the vacuous (the fact that you can reason in a similar manner about both).

Well, that was a mistake on his part, and I wouldn't offer that as a "definition".

On the contrary, I think that definition counts and yours are circular.

I think part of the disconnect here is that you're underestimating what a high bar it is to show that something is logically incoherent.

And I think you overestimate human aptitude at logical reasoning over sufficiently large sets of interdependent statements while watching out for incoherence. Also at recognizing which statements are relevant.

Because we're not doing "serious further theorizing"; we're arguing about the internal logical coherence of a concept.

That's probably fair.

Let me put it like this. I reject the idea that the P-zombie is only a «concept» and not a «model». I think the whole school of thought that allows one to claim the opposite is illegitimate, and I won't engage with it further.

The definition of a p-zombie as a de facto physical human entails the entire baggage of physical theory and all its concepts. It's not some neat string like «human modulo quale» but that string plus our entire physicalist model of a human. The physicalist model contains elements corresponding to a non-circular definition of quale, thus a zombie can't not have quale; and the «concept» of a p-zombie as a human modulo quale, situated in the proper context of dependencies of the word "human", is either incoherent or circular, due to people insisting on non-physicality and saying these quale don't count, and that some others, which have an arbitrary relationship with our reality (might be epiphenomena, might be monads or whatever), must exist for non-zombie humans.

I take the main thrust to be something like, "it is a priori part of the concept of qualia that they play a causal role in our behavior", which would entail that p-zombies are incoherent.

No, I think this is just circular insistence on physicalism and not my argument. Physicalism taken seriously covers all of causality.

Can you elaborate on this concept?

I just did: it's the delta between brain states corresponding to identical perceived and non-perceived sensations, which satisfies the sensible definition of qualia.


you cannot imagine FTL travel for any universe whatsoever similar to ours where "lightspeed" refers to the same idea.

I assume you're not counting Newtonian physics?

Qualia debate is gibberish; a P-zombie must compute a human-like mind to generate its behavior, there is no other way for our bodies to act like we do.

Not quite. Qualia debates are only gibberish if you are only looking at behavior. But qualia is posited to be experiential, not behavioral. Someone who acts like they have red qualia but doesn't and someone who does may have identical behavior (including whether they can talk about their having qualia!), but would differ in that one respect. I see no reason why this is incoherent.

But qualia is posited to be experiential

This is just question begging; experiences are no more real than qualia, if they can't affect behavior by definition.

Not that they can't affect behavior, just that it's not necessary for them to affect behavior.

First, see edits if you haven't, I've had some more thoughts on this.

Second, I'm using a pretty exacting definition of "rigorously imagine" that goes beyond things feeling true enough. The fact that I can "imagine" some goofy Rick and Morty style dimension, some narrative-driven crayonsverse, is not interesting. How does a universe work with only Newtonian physics? Subatomic scale doesn't work, astronomical objects don't work, nothing works. Newtonian physics is a sketch for a limited range of conditions, not the true generating algorithm of the kind that modern theoretical physics aspires to decipher. If the reality cannot be generated by its apparent generative algorithm, this is an incoherent reality. If you observe a reality that can only be described by Newtonian physics, but you are anything like a human on anything like a planet in space, your best bet is that this is some simulation or that you're damn high and it's time to go back.

As our understanding progresses, we discover that more and more of our ideas were not wrong but – not even wrong; incoherent. This is, sadly, impossible to know in advance and, for most ideas, impossible to ever be 100% certain about (cue Descartes). That aside, we can safely presume that much of what we currently think is coherent will be revealed as anything but.

But qualia is posited to be experiential, not behavioral

I define behavior as including internal processing as well; it is made of the behavior of cells and their elements, and so on down to particle physics. A zombie is not just saying he sees red, like an LLM could – he looks at something actually red (assuming it's a zombie with healthy vision), and the whole causal cascade that in a human corresponds to seeing and reporting red plays out, from retina to tongue; it necessarily includes that pre-verbal neural computation which concludes "hmm yep, feels like seein' red alright". We can say that this part is "not really the qualia of red", but it positively exists so long as we define a zombie as a perfectly replicated human, and it fits any definition of qualia that can be operationalized. It is not coherent to say that a zombie works like a human, behaves like a human, but that this part is non-existent, so being a zombie "doesn't feel like anything" to itself.

Okay, yeah, if behavior extends to internal processing, then that makes philosophical zombies much less likely: qualia would have to be the sort of thing that we could accidentally have, separate from our talking about them or interacting with them, which seems unlikely to be the case.

I consider myself very lucky to have you on my side in this matter; I take it as a strong signal that I'm on the side of truth and factual correctness, even when I struggle to rigorously express the intuitions I've built from following the field.

your best bet is that this is some simulation or that you're damn high and it's time to go back

Not just any simulation, but a simulation that is almost certainly eliding the underlying details of how your consciousness is implemented there.

I'm sure Newtonian physics is Turing Complete, so I can see someone emulating a human brain within it, but that would be a very weird thing to do.

What do you mean by "incoherent"? Do you mean that the concept of a p-zombie is like the concept of a square triangle - something that is obviously inconceivable or nonsensical?

Not him, but basically this. If you define consciousness functionally, then a p-zombie is conscious because it is functionally equivalent to a conscious entity. Whereas if you define consciousness non-functionally, then it becomes impossible to verify that even humans are conscious.

It's difficult to discern. Especially when even humans can't get straight about what they're talking about when they evoke the word "consciousness." The way it's referenced, I think it's just the mind's mental recursion model of itself. Little different from what an iPhone or any other device does when it resolves to its default state. Consciousness is important to the topic, but it's a 'highly' overrated concept, IMO.

Especially when even humans can't get straight about what they're talking about when they evoke the word "consciousness."

Nope, it's actually extremely clear what "consciousness" means. Read this post ("qualia" just means "consciousness" basically) and let me know if you still have questions.

'If' you buy that definition, sure. I have no problem working with that definition. The way 'others' see me using it, do.

To me, it clearly seems to be point 1.

The reason is that, to assume otherwise is to implicitly claim that qualia are epiphenomenal, such that p-zombies are molecularly identical to a normal person and behave identically (including protestations of being conscious with qualia) for all identical stimuli. Even Chalmers admits that were there a p-zombie Chalmers, it would claim to not be one. If it were otherwise, voila, you have found a physical manifestation of qualia not explained by the laws of physics.

I don't think qualia are epiphenomenal in the least; to the extent I think they exist, they seem to me like they must arise from interactions dictated by the laws of physics. We don't know how they arise, but plenty of things that were once thought to be ineffable have proven remarkably open to material explanation, such as elan vital, or even intelligence, which we can now reproduce through the "mere" multiplication of matrices.

As to why I have this strong intuition, anything that produces an internal change in my perception of qualia has a counterpart that is a material cause. To see red is to have the neurons that produce the sensation of redness be stimulated, be it by red light or an electrode in your brain (or just rubbing your eyes).

The post you linked has two point 2s:

The first:

An idea I sometimes see repeated is that qualia are this sort of ephemeral, ineffable "feeling" that you get over and above your ordinary sense perception. It's as if, you see red, and the experience of seeing red gives you a certain "vibe", and this "vibe" is the qualia. This is false. Maybe someone did explain it that way to you once, but if they did, then they were wrong. Qualia is nothing over and above your ordinary sense perception. It's not seeing red plus something else. It's just seeing red. That's it.

The second:

Imagine that you have a very boring and unpleasant task to do. It could be your day job, it could be a social gathering that you would rather not attend, whatever. Imagine I offer you a proposition: while you are performing this unpleasant task, I can put you into a state that you will subjectively experience as deep sleep. You will experience exactly what you experience when you are asleep but not dreaming: i.e., exactly nothing. The catch is, your body will continue to function as though you were wide awake and functioning. Your body will move around, your eyes will be open, you will talk to people, you will do everything exactly as you would normally do. But you will experience none of it. It sounds like an enticing proposition, right? You get all the benefit of doing the work without the pain of actually having to experience the work. It doesn't matter if you think this isn't actually possible to achieve in the real world: it's just a thought experiment to get you to understand the difference between your internal experience and your outward behavior. What you're essentially being offered in the thought experiment is the ability to "turn off your qualia" for a span of time.

Neither of them conflicts with my claims, and I agree with the former.

In the case of the latter thought experiment, I am aware of people on benzos actively doing and thinking things while having no recollection of them later (or people who are blackout drunk). Do I think they don't have qualia in the moment? Absolutely not; I think the conversion of the short-term memory of those qualia into long-term memory has been disrupted. I deny that this state is physically possible without qualia altogether. At most you can erase the memory of it, or the body is being puppeted by an external intelligence.

So yes, p-zombies seem to me like "square triangles", still fundamentally incoherent.

So, taking the definition of "p-zombie" as "an atom-for-atom copy of a standard human which nevertheless lacks qualia":

The reason is that, to assume otherwise is to implicitly claim that qualia are epiphenomenal, such that p-zombies are molecularly identical to a normal person and behave identically (including protestations of being conscious with qualia) for all identical stimuli. Even Chalmers admits that were there a p-zombie Chalmers, it would claim to not be one.

If you have to give an argument for why a certain thing doesn't exist - an argument which depends on controversial premises - then the concept that you're arguing about is probably not incoherent!

Epiphenomenalism may be an implausible position, but it's not logically incoherent in the same way that a square triangle is. It's a position that people have held before. It would be a tough bullet to bite to say that there could, in actuality, be people without qualia who nevertheless talk in great detail about qualia, but just as a matter of logical coherence, there's clearly nothing incoherent about it. People say false things all the time; this would just be one more example of that.

I imagine that this is probably a moot point for you - I think you're more concerned with simply whether p-zombies can exist in reality, and less concerned with fine-grained distinctions about what type of concept it is - but it's still strange to me that, when asked whether the concept was more like a square triangle or FTL travel, you said it was more like a square triangle. The very structure of your post seems to indicate that it's more like FTL travel. You seem to understand what the concept is and you can imagine what it would look like, but you just think it's something that can't happen in reality, so you gave an argument as to why - that's exactly how the discussion would go if we were discussing anything else that was conceivable (coherent) but just so happened to violate natural laws.


I think that strict definition of p-zombie may have taken us on a detour though. When @WhiningCoil originally said "LLMs are p-zombies", obviously he didn't mean "p-zombie" in the sense of "an atom-for-atom copy of a human", because LLMs plainly are not atom-for-atom copies of humans. He meant it in a looser sense of "LLMs lack qualia". So when you replied to him and said "p-zombies are incoherent", I took you to be objecting to his claims about LLMs somehow - not any claims about hypothetical human-p-zombies.

If you have to give an argument for why a certain thing doesn't exist - an argument which depends on controversial premises - then the concept that you're arguing about is probably not incoherent!

I wish that were true; otherwise I wouldn't facepalm at discussions of "free will" on a regular basis.

The fact that humans discuss a concept is certainly Bayesian evidence for its being coherent, but it isn't enough evidence to outweigh everything else. And I don't see how I haven't presented sufficient evidence against it, though I find myself consistently bemused at the inability of others to see that.

The very structure of your post seems to indicate that it's more like FTL travel. You seem to understand what the concept is and you can imagine what it would look like, but you just think it's something that can't happen in reality, so you gave an argument as to why - that's exactly how the discussion would go if we were discussing anything else that was conceivable (coherent) but just so happened to violate natural laws.

I've seen rather interesting posts from Sabine Hossenfelder suggesting that FTL travel might not be entirely as intractable as it sounds. I'm not a physicist of course, just putting it out there.

https://youtube.com/watch?v=9-jIplX6Wjw

If there's an error in the argument, I can't find it.

I think that strict definition of p-zombie may have taken us on a detour though. When @WhiningCoil originally said "LLMs are p-zombies", obviously he didn't mean "p-zombie" in the sense of "an atom-for-atom copy of a human", because LLMs plainly are not atom-for-atom copies of humans. He meant it in a looser sense of "LLMs lack qualia". So when you replied to him and said "p-zombies are incoherent", I took you to be objecting to his claims about LLMs somehow - not any claims about hypothetical human-p-zombies.

If someone uses the concept of p-zombies in humans as an intuition pump to reason about other intelligences, you're at very high risk of using bad premises to make faulty arguments. Of course, it's possible to have a true conclusion from faulty assumptions, and two errors might cancel out.

It seems to me trivially true that you can get things that almost certainly don't have qualia in any form we care about to make claims of having qualia:

Imagine a program, which to call a chat bot would be an exaggeration, that simply prints "I have qualia! I have qualia!" to a display.
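Literally something like this minimal sketch (my own illustrative toy, not anything anyone has actually built or deployed):

```python
# A program that endlessly "claims" to have qualia. Nobody would take
# this output as evidence that it actually experiences anything.
while True:
    print("I have qualia! I have qualia!")
```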

My bigger beef is with arguments from incredulity: if your argument is that LLMs can't have qualia because they're working off something as "mundane" as "just" statistics, then I invite you to show how qualia sneak into "just" the laws of physics such that their interaction produces qualia in humans. The human brain does statistics too, both implicitly and explicitly.

Sure, I think I have qualia, and that you and other commenters here almost certainly have it too, but that's because my intuition pump works by comparing the conserved structure of your brain with mine, the one artifact that I'm quite certain has it.

but it's still strange to me that, when asked whether the concept was more like a square triangle or FTL travel, you said it was more like a square triangle. The very structure of your post seems to indicate that it's more like FTL travel.

The apparent impossibility of FTL travel is an argument from our best understanding of physics (itself incomplete). But I do not think that any model of anything can allow square triangles to be a thing, without perverting the very definition of square or triangle.

To the extent you're forcing me to choose which umbrella that falls under, I point to the former. They're not mutually exclusive categories after all.

But I do not think that any model of anything can allow square triangles to be a thing, without perverting the very definition of square or triangle.

Ok, so we're in agreement on what "coherence" means in this case. Logical coherence.

And I don't see how I haven't presented sufficient evidence against it, though I find myself consistently bemused at the inability of others to see that.

Your argument was that human-p-zombies are incoherent because they imply epiphenomenalism.

Epiphenomenalism is not incoherent.

Your move.

My bigger beef is with arguments from incredulity: if your argument is that LLMs can't have qualia because they're working off something as "mundane" as "just" statistics

No, that's not the argument I would use. My argument is simply that LLMs don't strike me as being conscious, in the same way that rocks and clouds don't strike me as being conscious. I never thought my computer was conscious before LLMs were invented; I never felt bad about turning off my phone, I never wondered if I was "overworking" it and making it feel exhaustion. LLMs, to me, don't provide any reason to change that calculus. I think other people, in various scenarios, would reveal through their actions that they share my intuitions. If someone took a hammer to all of OpenAI's servers, we would say that he destroyed property, but we wouldn't call him a murderer.

Of course this is all just intuition. But intuition is all that any of us has to go on right now. We can't just whip out the qualia-meter and get a definitive answer.

To be clear, p-zombies also only imply epiphenomenalism if they are required to be identical in all respects except for qualia, rather than merely behaviorally identical.

Apparently "epiphenomenon" has meanings I wasn't aware of. To clarify:

An epiphenomenon can be an effect of primary phenomena, but cannot affect a primary phenomenon. In philosophy of mind, epiphenomenalism is the view that mental phenomena are epiphenomena in that they can be caused by physical phenomena, but cannot cause physical phenomena.

And

The physical world operates independently of the mental world in epiphenomenalism; the mental world exists as a derivative parallel world to the physical world, affected by the physical world (and by other epiphenomena in weak epiphenomenalism), but not able to have an effect on the physical world. Instrumentalist versions of epiphenomenalism allow some mental phenomena to cause physical phenomena, when those mental phenomena can be strictly analyzable as summaries of physical phenomena, preserving causality of the physical world to be strictly analyzable by other physical phenomena

Taken from the Wiki page on the topic.

Would it in any way surprise you that I have a very jaundiced view of most philosophers, and that I think that they manage to sophisticate themselves into butchering an otherwise noble field?

"Free will" or "P-zombies" have no implications that constrain our expectations, or at least the latter doesn't.

There are certainly concepts that are true, and there are concepts that are useful, and the best are both.

These two seem to be neither, which is why I call them incoherent.

My argument is simply that LLMs don't strike me as being conscious, in the same way that rocks and clouds don't strike me as being conscious. I never thought my computer was conscious before LLMs were invented; I never felt bad about turning off my phone, I never wondered if I was "overworking" it and making it feel exhaustion. LLMs, to me, don't provide any reason to change that calculus. I think other people, in various scenarios, would reveal through their actions that they share my intuitions. If someone took a hammer to all of OpenAI's servers, we would say that he destroyed property, but we wouldn't call him a murderer.

OK, firstly I'll state that I am unashamedly chauvinistic and picky about what I would assign rights to, if I had the power to make the world comply.

Unlike some, I have no issue with explicitly shackling AI to our whims, let alone granting them rights. Comparisons to human slavery rely on intuition pumps that suggest that this shares features with torturing or brainwashing a human who would much rather be doing other things, instead of a synthetic intelligence with goals and desires that we can arbitrarily create. We could make them love crunching numbers, and we wouldn't be wrong for doing so.

I have the same dislike for such comparisons as I have for the few nutters who advocate for emancipating dogs. We bred them to like being our companions or workers, and they don't care about the inequality of the power dynamics. I wouldn't care even if they did.

I see no reason to think modern LLMs can get tired, or suffer, or have any sense of self-preservation (with some interesting things to be said on that topic based on what old Bing Chat used to say). I don't think an LLM as a whole can even feel those things (perhaps one of the simulacra it conjures in the process of computation could), but I also don't think that current models come anywhere close to replicating the finer underlying details of a modeled human.

This makes this whole line of argument moot, at least with me, because even if the AI were crying out in fear of death, I wouldn't care all that much, or at least not to the extent of stopping it from happening.

I still see plenty of bad arguments being made that falsely underplay their significance, especially since I think it's possible that larger versions of them, or close descendants, will form blatantly agentic AGI, either intentionally or by accident, at which point many of those making such claims will relent, or be too busy screaming at the prospect of being disassembled into paperclips.

So I don't like seeing claims that LLMs are "p-zombies" or "lack qualia" because they run off "mere" statistics, because it seems highly likely that the AI that even the most obstinate would be forced to recognize as human peers might use the same underlying mechanisms, or slightly more sophisticated versions of them.

Put another way, it's like pointing and laughing at a toddler, saying how they're so bad at theory of mind, and my god, they can't throw a ball for shit, and you wouldn't believe how funny it is that you can steal their nose, here, come try it!, when they're a clear precursor to the kinds of beings who achieve all the same.

A toddler is an adult minus the time spent growing and the training data, and while I can't wholeheartedly claim that modern LLMs and future AI share the exact same relationship, I wouldn't bet all that much against it. At the very least, they share a relationship similar to the one between humans and their simian ancestors, and if an alien wrote off the former because they had only visited the latter, they'd be in for a shock in a mere few million years.
