Culture War Roundup for the week of November 20, 2023

This weekly roundup thread is intended for all culture war posts. 'Culture war' is vaguely defined, but it basically means controversial issues that fall along set tribal lines. Arguments over culture war issues generate a lot of heat and little light, and few deeply entrenched people ever change their minds. This thread is for voicing opinions and analyzing the state of the discussion while trying to optimize for light over heat.

Optimistically, we think that engaging with people you disagree with is worth your time, and so is being nice! Pessimistically, there are many dynamics that can lead discussions on Culture War topics to become unproductive. There's a human tendency to divide along tribal lines, praising your ingroup and vilifying your outgroup - and if you think you find it easy to criticize your ingroup, then it may be that your outgroup is not who you think it is. Extremists with opposing positions can feed off each other, highlighting each other's worst points to justify their own angry rhetoric, which becomes in turn a new example of bad behavior for the other side to highlight.

We would like to avoid these negative dynamics. Accordingly, we ask that you do not use this thread for waging the Culture War. Examples of waging the Culture War:

  • Shaming.

  • Attempting to 'build consensus' or enforce ideological conformity.

  • Making sweeping generalizations to vilify a group you dislike.

  • Recruiting for a cause.

  • Posting links that could be summarized as 'Boo outgroup!' Basically, if your content is 'Can you believe what Those People did this week?' then you should either refrain from posting, or do some very patient work to contextualize and/or steel-man the relevant viewpoint.

In general, you should argue to understand, not to win. This thread is not territory to be claimed by one group or another; indeed, the aim is to have many different viewpoints represented here. Thus, we also ask that you follow some guidelines:

  • Speak plainly. Avoid sarcasm and mockery. When disagreeing with someone, state your objections explicitly.

  • Be as precise and charitable as you can. Don't paraphrase unflatteringly.

  • Don't imply that someone said something they did not say, even if you think it follows from what they said.

  • Write like everyone is reading and you want them to be included in the discussion.

On an ad hoc basis, the mods will try to compile a list of the best posts/comments from the previous week, posted in Quality Contribution threads and archived at /r/TheThread. You may nominate a comment for this list by clicking on 'report' at the bottom of the post and typing 'Actually a quality contribution' as the report reason.

I don't particularly care, Hlynka. If this Thanos snapping managed to take both of us, you included, I'd consider it a net positive!

But I fail to see what the difficulty of Turing-testing random pseudonymous accounts on a text-based forum has anything to do with it. Last time I checked, we're both operating according to the laws of physics and biology. Your analogy of how ML works is simply painful.

Turing-testing random pseudonymous accounts

That's not really what his question is about.

I've never accused him of being concise and clear, or having a point.

Am I supposed to sob in horror at the idea of replacing humans with soulless automata instead? He doesn't provide any reason to think that humans or LLMs can't both be represented as the output of statistical processes occurring on computational substrates, even if said processes and substrates are very different.

As @ArjinFerman says, this isn't about "replacing humans with soulless automata"; it's about replacing you in particular. I'm asking you whether you believe that the sum of your existence (your thoughts, feelings, memories, physical existence, output here on theMotte, etc...) is meaningfully distinct from that of an arbitrarily complex random number generator in any way?

If so, why do you believe that?

Ironically for how often I get accused of not understanding how machine learning works, I suspect that I have far more practical "hands-on" experience designing, implementing, and working with machine learning algorithms than most users here.

is meaningfully distinct from that of an arbitrarily complex random number generator in any way?

Sure, obviously. I can only assume that you think this is a valid description of ML/LLMs/AI, which it very much is not. If it's "randomness" that has you up and at it, then set the temperature of a model to 0 to get deterministic outputs. Problem solved?
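
For anyone who wants to see what "temperature 0" means mechanically, here is a minimal sketch, assuming a toy three-token vocabulary with made-up logits. The function name `sample_token` and all numbers are illustrative and not any real model's API. It only shows the idealized case; real deployments can still show small run-to-run differences from floating-point and batching effects.

```python
import math
import random

def sample_token(logits, temperature, rng):
    """Pick an index from `logits`; temperature 0 is treated as greedy argmax."""
    if temperature == 0:
        # Greedy decoding: the highest-scoring token always wins, no dice involved.
        return max(range(len(logits)), key=lambda i: logits[i])
    # Softmax with temperature: divide logits, subtract the max for stability.
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    weights = [math.exp(x - m) for x in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]

logits = [2.0, 1.0, 0.5]  # hypothetical scores for three candidate tokens

# Different seeds, same answer at temperature 0.
greedy = [sample_token(logits, 0, random.Random(seed)) for seed in range(5)]
# At temperature 1.0 the result depends on the random draw.
sampled = [sample_token(logits, 1.0, random.Random(seed)) for seed in range(5)]
```

In this sketch `greedy` is the same index for every seed, while `sampled` can vary with the seed.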

If so, why do you believe that?

I need no justification for such atomic preferences, I just have them, both in the incredibly stupid case you wish to make, and my vain attempts at steel-manning it in the scenario you're hand-waving at modern ML. LLMs do not capture the complexity of a human, nor do they have other aspects I care about, such as the fact that I'm not talking to a machine that will immediately flush everything out of memory as soon as it's done talking to me. Then again, I think that's a valid description of certain people on this forum, so who am I to judge?

I value my existence for its own sake, but if there's a human intelligence or smarter AI out there that is capable of remembering discussions and updating on them in the future, and capable of modifying future behavior on that basis, then I'm perfectly fine talking with it at length. Even GPT does update, but only slightly, as newer conversations enter the training data for the next version, and not in the same manner as a human.

If you mean a mind upload of myself running in-silico, and not a random LLM fine tuned on me, then yes, I would accept it as a valid replacement, given my conviction that it's very likely that in internally subjective terms it has the same qualia as I do. I would obviously prefer we both co-exist, at least until my flesh fails me, but I accord such an entity every right to use the SMH name to the same extent I do.

Ironically for how often I get accused of not understanding how machine learning works, I suspect that I have far more practical "hands-on" experience designing, implementing, and working with machine learning algorithms than most users here.

Here I was thinking I'm a human chauvinist, and now I'm pitying an ML model. Such insanity is hardly unheard of, I happen to have an uncle who is a professor in microbiology who swears by homeopathy.

I suppose it's a sign of how streamlined the process has become, when people so utterly divorced from the theoretical underpinnings of the technology are making a living off it.

Sure, obviously.

Is it though? If it's obvious it should be trivial to either demonstrate or falsify, should it not?

I suppose it's a sign of how streamlined the process has become, when people so utterly divorced from the theoretical underpinnings of the technology are making a living off it.

Says the guy who thinks his ability to type a prompt into Bing makes him oh-so-clever. I would argue that it is my familiarity with the theoretical underpinnings of this technology that enable me to recognize both its utility and its limitations.

Ultimately what a regression-based machine learning algorithm (of which LLMs are a subset) is under the hood, is a random number generator rolling on a table like the one I linked above (Wtf are those goblins doing?). What's happening mechanically when you "train" a regression engine is that you are populating that table and assigning different statistical weights to the various outputs within it based on the prompt provided. EG replacing a 15% chance of 2d6 bandits in the random encounter table with a 30% chance of 3d3 goblins based on whether the environment variable has been set to city or dungeon.
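
To make the analogy above concrete, here is a minimal sketch of a weighted random encounter table that is conditioned on an environment variable and then "re-weighted". It illustrates the analogy as stated, and is not a claim about how LLMs actually work (that is what the rest of the thread disputes). The table contents, the "nothing" filler outcome, and the function names are all hypothetical.

```python
import random

# Hypothetical encounter tables keyed by environment; weights are percentages.
ENCOUNTER_TABLE = {
    "city": {"2d6 bandits": 15, "nothing": 85},
    "dungeon": {"3d3 goblins": 30, "nothing": 70},
}

def roll_encounter(environment, rng):
    """Roll on the table for `environment`, respecting the stored weights."""
    table = ENCOUNTER_TABLE[environment]
    outcomes = list(table)
    weights = [table[o] for o in outcomes]
    return rng.choices(outcomes, weights=weights, k=1)[0]

def reweight(environment, outcome, new_weight):
    """'Training' in the analogy: overwrite the weight of one table entry."""
    ENCOUNTER_TABLE[environment][outcome] = new_weight

rng = random.Random(0)
rolls = [roll_encounter("dungeon", rng) for _ in range(10000)]
goblin_rate = rolls.count("3d3 goblins") / len(rolls)  # close to 0.30
```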

While this sort of statistical process can excel at associative tasks where the bounds of likely inputs and outputs are known in advance, such as linguistic translation and ranking search results, it ends up being worse than useless for other more agentic tasks like pathfinding, and is only capable of "finding useful information" in so far as what is "useful" and what is "statistically probable" based on its training data are in alignment.

Dear reader, please don't let Hlynka distract you from the fact that a humble "Stochastic Parrot" did a better job of both understanding a complicated physics question from implied context and answering it correctly than he did.

The most utterly glaring error here is that you're flat out wrong about LLMs being a subset of regression-based ML algorithms. I will risk wasting the time of @curious_straight_ca and @DaseindustriesLtd here to back me up on that, even if a cursory search reveals that they're completely different things.

But Hlynka is of the opinion that Chihuahuas are good hunting dogs, so who's surprised at yet more abuse of truth or the meaning of language?

At any rate, such a combination of such utter confidence while being "not even wrong" levels of confused about things is unique, if not particularly charming.

Besides, maybe the error is on my part, translations to and from "Indian" can be fraught, am I right? It's entirely possible I've mistaken a very subtle and important argument for gish-galloping.

While this sort of statistical process can excel at associative tasks where the bounds of likely inputs and outputs are known in advance, such as linguistic translation and ranking search results, it ends up being worse than useless for other more agentic tasks like pathfinding, and is only capable of "finding useful information" in so far as what is "useful" and what is "statistically probable" based on its training data are in alignment.

@HlynkaCG Actually, the techniques used in language modeling are great at "pathfinding" and other "agentic" tasks, too. See Decision Transformers and similar work. One of the most central, and at-the-time most surprising to many, results of ML is that the same techniques work for a wide variety of tasks. Neural nets "want to work".

You say NNs / language models are regression based. This is vacuously true. Wiki says:

In statistical modeling, regression analysis is a set of statistical processes for estimating the relationships between a dependent variable (often called the 'outcome' or 'response' variable, or a 'label' in machine learning parlance) and one or more independent variables (often called 'predictors', 'covariates', 'explanatory variables' or 'features').

So, language models are regression based in the sense that they predict things based on other things. Every possible technique for doing what language models do, or indeed any method of machine learning or AI, or indeed human behavior itself, could be cast as a "regression", in that sense. In the sense you mean, though, of randomness or simplicity, they aren't - the models that are trained are horrendously complex, and capable of representing very complicated computations. As opposed to "regressions" in the colloquial sense, which are relatively simple statistical models.

What's happening mechanically when you "train" a regression engine is that you are populating that table and assigning different statistical weights to the various outputs within it based on the prompt provided

You're, presumably, familiar with physics and causality, right? Any discrete theory of physics (and modern physics is strongly suspected to be discrete, as involving real numbers anywhere leads to all sorts of paradoxes) can, necessarily, be modeled as a (very large) "table", or matrix, with a row/column for each world-state, and various transition "statistical weights" / probabilities from each state to each state. This is certainly an incredibly coarse-grained representation, especially given you need a state for each large-scale quantum state (distribution-across-universe-branches), but it's doable. So, given your "regression engine" can, in theory, run the entire universe, I think it's premature to say it can't run an AI.

Now, obviously there are Chinese room-level scale issues with the comparison, and physics has mathematical patterns that lead to a description much simpler, and smaller, than a transition matrix of size 2^2^(number of atoms in the universe). Fortunately, neural networks have those too! They're not huge transition matrices either, but very complicated functions with a lot of internal regularity.
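
A toy version of the table-versus-compact-description point, as a minimal sketch under stated assumptions: the "world" here is a hypothetical counter with `N` states whose rule is "add one, wrap around". The same dynamics can be written as a one-line rule or as an explicit `N x N` transition table, and the table grows quadratically while the rule does not. The names `rule`, `build_table`, and `step_with_table` are made up for illustration.

```python
def rule(state, n_states):
    """Compact description: one line, regardless of how many states exist."""
    return (state + 1) % n_states

def build_table(n_states):
    """Explicit transition table: row = current state, column = next state."""
    table = [[0.0] * n_states for _ in range(n_states)]
    for s in range(n_states):
        table[s][rule(s, n_states)] = 1.0  # deterministic: probability 1
    return table

def step_with_table(state, table):
    # Deterministic dynamics, so exactly one entry in the row is 1.0.
    return table[state].index(1.0)

N = 8
TABLE = build_table(N)

# Both representations describe the same dynamics...
agree = all(step_with_table(s, TABLE) == rule(s, N) for s in range(N))
# ...but the table needs N * N entries to say it.
table_entries = N * N
```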

So, HlynkaCG, I don't get why we keep having these discussions, you just assert a bunch of things that are patently false, and then repeat them a few months later after they're corrected.

If it's a transformer, it's not regression-based. Yes, transformers are often used in the training of regression engines (parallel processing is a hell of a drug) but they are not the same thing, they have different use cases.

So, HlynkaCG, I don't get why we keep having these discussions, you just assert a bunch of things that are patently false, and then repeat them a few months later after they're corrected.

At this point I'm just tempted to make a FAQ-style compilation to save my breath later.

@HlynkaCG may be stretching the definition of «regression» past the breaking point in my view. But if one wants to argue that attention over 80 layers is «regression» over a trained collection of regressors, then fine, I won't stop it – categories were made for man, not… and all that. I think at this point it's a fool's errand to fight over such stuff manually instead of…

Well, typing some prompt like «How do large language models (transformers) correspond to regression-based ML algorithms? Answer at the level of PhD CS adjunct professor level. Focus on mechanistic details, not use cases» into a frontier model of your fancy. I quite like Claude 2's style but GPT-4 is still king.

Of course, that reference to regression is just a more specific way to diss the «complex statistical model», and a complex enough transformer model can approximate most anything in a compact domain (with some sane constraints, but as much can be said of the brain with its finite expressivity and learning capacity). Maybe we could talk about actual expressivity limits of some architectures, and orthodox Transformers can't learn to solve the PARITY problem in the general case, but Universal Transformers do better, and path independent equilibrium models must do better still; at some point human+tool generalization will be comprehensively surpassed, and we'll be able to confidently say that an AI of such and such design and hyperparameters can learn everything a human mind can learn and more, and even does that in practice, and the question will be moot. Or is the question about the possibility to establish the correspondence between some types of data and some types of things, like, symbol sequences and thoughts?

I am not aware of some strong information-theoretical or broadly mathematical reason, which Hlynka and some other guy (@IGI-111 maybe?) alluded to, for believing this won't be done with known ML primitives in a few years. It looks to be about the «just» fallacy: some people think that if they understand the primitives (like regression, or gradient descent, or matmul – whatever abstraction layer they want to squint at), the full thing is «just» the interaction of those primitives and thus… something something… cannot be intelligent/conscious/superhuman/your option. I can't understand this way of thinking, it seems mainly ego-driven to me but that's a hypothesis, I literally cannot comprehend it, it does not compute.

This is all progressively far from the high-level generator of disagreement, which is… what is it again? And how many are there?

That said, I also do not share your theory of consciousness/personal identity, my views are closer to Christof Koch's. I think a high-quality computable upload of myself would be able to output thoughts in the distribution of my own (hell, one can finetune an LLM and see the resemblance already, it would even fool some); but it would be, for most intents and purposes, a p-zombie, even if you throw an «agentic» for loop on top. I do not subscribe to the Lesswrongian purely computational doctrine; I am a specific subject, not information about an object. For the same reason I would not use destructive teleportation nor advocate it to anyone, I think humans are causal entanglements, not blueprints for those.

Well, typing some prompt like «How do large language models (transformers) correspond to regression-based ML algorithms? Answer at the level of PhD CS adjunct professor level. Focus on mechanistic details, not use cases» into a frontier model of your fancy.

This is a nice theory, but the problem with regression-based algorithms in practice is that to receive a "correct" response to such a query you not only need to have an example of the correct response in your training data, you need to have enough such responses (or a robust enough statistical model) to ensure that it becomes the most probable output.

You can talk all the shit you want, but it will still just be talk.

You can yell at clouds all you like, but much like Shamans and their hexes, RCTs have shown that doesn't do much to help with rain ;)

I suspect that I have more practical "hands-on" experience actually designing, implementing, and working with machine learning algorithms than most users here.

Please tell me you've moved on from work for the military...

I left active duty in 2010.

I didn't mean so much active duty, as being a developer for the military industrial complex working on AI controlled drones

When the first line is "The New York Times reports" take anything that follows with a solid helping of salt.

As is often the case in these sorts of stories, the headline is simultaneously true and also misleading. Yes "The Pentagon is moving toward letting AI weapons autonomously decide to kill humans" in the sense that getting drones/missiles/etc... to the point where they can independently recognize/identify a target or threat (and more pointedly, do so on a small enough footprint that the process can be run in something resembling real time) is a field of active study. That's not the same thing as handing the keys to Skynet. The irony is that the end goal of such research is actually the inverse of what is being portrayed. Killing everything in a grid square is pretty straightforward. Killing one specific thing and only that thing is a much harder problem. Nor are the applications for such research purely military. A major issue for both the aerospace and automotive industries is currently "See and Avoid", i.e. giving autopilots and self-driving cars the ability to recognize obstacles and navigate around them rather than simply following a preset route, and likewise the ability to respond to conditional cues like a cop trying to pull them over or wave them through an intersection. But you don't see breathless articles about training self-driving cars to kill even though there is a massive amount of overlap in the implementation.

I've never accused him of being concise and clear, or having a point.

Why can't we all just get along?

Am I supposed to sob in horror at the idea of replacing humans with soulless automata instead?

Well, it's less humans, and more you in particular. It's also less about sobbing in horror, and more about whether you see much of a difference between the two cases. I think the question is interesting given Rat ideas on uploading consciousness.

There's about a zero percent chance that Hlynka doesn't know about my enthusiasm for the potential of mind uploading, with me seeing such an emulation of a human mind as equivalent, in every way that matters to me, to a biological human.

That is not the same as replacing a human with a LLM trained on the corpus of their text, with outputs indistinguishable from the human. You'd need to do way more to establish it as a high fidelity replication of the original consciousness, even if I think in principle it's doable.

I don’t know that it is, and I think in time we’ll come to understand that training an advanced LLM on our personalities and actions is ‘mind uploading’ in the science fiction sense.

I agree that in principle, it is possible to emulate a complex system to within the limits of observation and random error by treating it as a blackbox and then training on its outputs or response to stimuli.

After all, in ML, that's already a thing in the form of teacher-student distillation, where you train a new neural net to be indistinguishable from another by feeding it the latter's outputs.
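
A minimal sketch of that idea, assuming a deliberately tiny setup: the "teacher" is a fixed logistic function we only ever query for outputs, and the "student" is a two-parameter logistic model fitted to those soft outputs by plain gradient descent. The teacher's parameters, the input grid, and the names `teacher` and `distill` are all hypothetical; real distillation uses neural nets and far richer data.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def teacher(x):
    # Stand-in for a black box we can only query; its parameters are made up.
    return sigmoid(2.0 * x - 1.0)

def distill(inputs, steps=5000, lr=0.5):
    """Fit a student sigmoid(w * x + b) to the teacher's soft outputs."""
    w, b = 0.0, 0.0
    targets = [teacher(x) for x in inputs]  # soft labels, not hard 0/1
    n = len(inputs)
    for _ in range(steps):
        grad_w = grad_b = 0.0
        for x, t in zip(inputs, targets):
            err = sigmoid(w * x + b) - t  # d(cross-entropy)/d(logit)
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b

inputs = [i / 10.0 for i in range(-30, 31)]
w, b = distill(inputs)
# Largest disagreement between student and teacher on the queried inputs.
max_gap = max(abs(sigmoid(w * x + b) - teacher(x)) for x in inputs)
```

The student never sees the teacher's internals, only its answers, which is the sense in which this treats the original as a black box. It works here because the student has the same functional form as the teacher; nothing in the sketch speaks to whether text alone is enough for a human.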

I still don't think that just training on a corpus of text written by a human is sufficient to reproduce said human, maybe if you had an enormous amount of video, audio and other biometrics. It's arguably better than nothing as a form of immortality, but I personally expect more. If it can be demonstrated that such a technique is somehow equivalent to mind uploading via scanning, then I'll have no objections.

I agree there are some important multimodal steps left, but they’re all happening pretty quickly. I think people will be stunned how well they can be captured just by training on their linked social media and text message data when that becomes easily available. Since most people are very similar, and even ‘personality quirks’ are often repeated, it’s likely that the tuning required is less than one might expect. Personalities aren’t actually very unique.