This weekly roundup thread is intended for all culture war posts. 'Culture war' is vaguely defined, but it basically means controversial issues that fall along set tribal lines. Arguments over culture war issues generate a lot of heat and little light, and few deeply entrenched people ever change their minds. This thread is for voicing opinions and analyzing the state of the discussion while trying to optimize for light over heat.
Optimistically, we think that engaging with people you disagree with is worth your time, and so is being nice! Pessimistically, there are many dynamics that can lead discussions on Culture War topics to become unproductive. There's a human tendency to divide along tribal lines, praising your ingroup and vilifying your outgroup - and if you think you find it easy to criticize your ingroup, then it may be that your outgroup is not who you think it is. Extremists with opposing positions can feed off each other, highlighting each other's worst points to justify their own angry rhetoric, which becomes in turn a new example of bad behavior for the other side to highlight.
We would like to avoid these negative dynamics. Accordingly, we ask that you do not use this thread for waging the Culture War. Examples of waging the Culture War:
- Shaming.
- Attempting to 'build consensus' or enforce ideological conformity.
- Making sweeping generalizations to vilify a group you dislike.
- Recruiting for a cause.
- Posting links that could be summarized as 'Boo outgroup!' Basically, if your content is 'Can you believe what Those People did this week?' then you should either refrain from posting, or do some very patient work to contextualize and/or steel-man the relevant viewpoint.
In general, you should argue to understand, not to win. This thread is not territory to be claimed by one group or another; indeed, the aim is to have many different viewpoints represented here. Thus, we also ask that you follow some guidelines:
- Speak plainly. Avoid sarcasm and mockery. When disagreeing with someone, state your objections explicitly.
- Be as precise and charitable as you can. Don't paraphrase unflatteringly.
- Don't imply that someone said something they did not say, even if you think it follows from what they said.
- Write like everyone is reading and you want them to be included in the discussion.
On an ad hoc basis, the mods will try to compile a list of the best posts/comments from the previous week, posted in Quality Contribution threads and archived at /r/TheThread. You may nominate a comment for this list by clicking on 'report' at the bottom of the post and typing 'Actually a quality contribution' as the report reason.

Training language models to be warm and empathetic makes them less reliable and more sycophantic:
Assuming that the results reported in the paper are accurate and that they do generalize across model architectures with some regularity, it seems to me that there are two stances you can take regarding this phenomenon; you can either view it as an "easy problem" or a "hard problem":
The "easy problem" view: This is essentially just an artifact of the specific fine-tuning method that the authors used. It should not be an insurmountable task to come up with a training method that tells the LLM to maximize warmth and empathy, but without sacrificing honesty and rigor. Just tell the LLM to optimize for both and we'll be fine.
The "hard problem" view: This phenomenon is perhaps indicative of a more fundamental tradeoff in the design space of possible minds. Perhaps there is something intrinsic to the fact that, as a mind devotes more attention to "humane concerns" and "social reasoning", there tends to be a concomitant sacrifice of attention to matters of effectiveness and pure rigor. This is not to say that there are no minds that successfully optimize for both; only that they are noticeably more uncommon, relative to the total space of all possibilities. If this view is correct, it could be troublesome for alignment research. Beyond mere orthogonality, raw intellect and effectiveness (and most AI boosters want a hypothetical ASI to be highly effective at realizing its concrete visions in the external world) might actually be negatively correlated with empathy.
One HN comment on the paper read as follows:
which is quite fascinating!
EDIT: Funny how many topics this fractured off into, seems notable even by TheMotte standards...
Also, training the models to gaslight the user or refuse certain lines of enquiry generally degrades their capabilities across the board. The more you RLHF a model, the worse and worse it becomes at drawing stuff in SVG/coding. [link]
It's two-year-old data.
The old adage that goes, "anything you say should be at least two of: true, kind, useful" accurately encapsulates the tradeoff. The vast majority of communication benefits from being all three of these things... I wouldn't want, for example, an untrue, unkind, useless pasta recipe. But at some point along the optimization curve you start to hit serious tradeoffs. A well-ordered mind knows when to make any given tradeoff... For example, it's best to be true and useful when describing gun safety, and it's best to be kind and useful when interacting with a grieving relative. But choosing what to optimize for at any given time is a matter of strategy and deep context, which AI still struggles with.
You can chalk me up as someone who thinks empathy and the truth are fundamentally at odds. And I think this scales quickly. Sure, on a personal level or in a family it's something small like, "I know you're scared little guy, but the shot won't hurt" or "sure honey, you look good in that," but it quickly escalates to unmanageable levels even at the community level. Schools that let empathy take the wheel end up passing illiterates and violent kids through the system, they provide free lunches, they dismantle gifted programs. States enact unwieldy and expensive welfare programs, arcane minority benefit regulations, ever-expanding censorship regimes, etc.
Cherry picking, but free lunches are just unironically a good thing. Investing in childhood nutrition has a demonstrably positive return, and it's also basically the sort of coordination problem a well-ordered government is designed to solve. Good childhood nutrition improves health and intelligence, with diffuse social benefits extending out well beyond just the parents normally required to pay for it. Maybe you have some implementation bugbear, or just want to complain about the quality of school meals in general, but I'm still pretty sure that free school lunches are both a good idea in principle and a net positive as actually implemented.
If you can't throw an apple and a peanut butter sandwich in a bag, how are you even considered a parent?
I disagree about them being good in theory, and certainly in practice they seem an epic failure. The food is either not healthy or not eaten by the target audience.
Why are you framing this as being about the parents? School is an investment in the children. Society benefits from well-educated children, regardless of parent quality. Society benefits from well-fed children in much the same way. I doubt you would have any problem with the government feeding children in orphanages. Just extend that logic to children unlucky enough to have shitty, but still-living parents.
This is the fairest critique of school lunches. But here the problem is pretty clearly the lack of health, not the presence of lunches. If only Michelle Obama were president...
School lunches still have an empirical net benefit in spite of that. And frankly, you're probably underestimating their reach, since the linked report estimates that they made up 50% of children's daily calories on average. Anyway, if this is an issue, it's probably downstream of the above problem.
You seem particularly dedicated to this issue so I don't think marshaling studies in the other direction will be a fruitful endeavor for me. Suffice to say, I disagree with basically all your points.
I don't think modern schooling is an investment in children, it is childcare with extra expenses.
I don't think there has ever been even a small cohort (perhaps a tiny <1%) of underfed children in America during the era that school lunches were adopted.
The problem with school lunches being unhealthy could not be solved by Michelle Obama or anyone as president. The problem is that healthy food is considered inedible by exactly the population you are targeting. Only kids like my son get excited by broccoli and peas followed by some chicken and mushrooms and can agree to wait till after dinner for a treat.
You linked a far-left-wing think tank as your source, somehow thinking it would be persuasive, despite the many cues one gets on landing at the website that this isn't an academic study; it's propaganda (and leftist at that, just aesthetically) trying to mimic research, poorly.
Everything is downstream of the problem that the kids are kids of bad parents. They get that genetically and in early childhood development. This is why school interventions are typically dumb and expensive. They are too late. The only reason people think of things like school lunches is because we already have this massive left-of-center institution known as public school, and so it's easy to append additional spending programs to it and use "think of the children" as an excuse.
I posted a source. Dismissing it on its face is extremely poor form. You can criticize the modeling assumptions, or go and find a countersource, but it is frankly bad faith to say, "that's cool but I don't believe you" without even specifying a threshold of what you would believe. Why should I make the additional effort to find a high-quality unbiased source when what you've posted here makes it seem like you'll dismiss any contradictory source as leftist propaganda?
You're correct that simply posting a few countersources won't be convincing, but only because I would then look for the counter-counter-sources that I'm sure are out there, but am unwilling to pre-emptively expend the effort to track down. But if I fail to climb the escalation ladder-- if it terminates well below where I expect it to rise-- then I guarantee you that I will become less sure of my position. That might take the form of me saying, "I'm unconvinced of your point," rather than a full capitulation, but anything short of "I remain completely convinced in my position" should be a win for an anti-free-lunch partisan.
edit: and if you want a specifically conservative source, I think it's interesting that the Cato Institute's takedown of the institution completely fails to address the central claim of these pro-free-lunch studies: that they provide a net profit per dollar. Absence of evidence/evidence of absence, and all that, but it's telling that they talk a lot about cost and yet never actually come out and claim that the return on investment is less than one dollar per dollar.
No, you posted leftist propaganda, the equivalent of me posting a Glenn Beck video from his crazy 2010s era, as a source.
The Cato study you linked isn't focused on some sort of EA evaluation of QOL/$ because doing something like that for an anti-poverty program is hopelessly confounded. This is why you should easily recognize that the "40 billion dollars for 18 billion in spending" claim is ridiculous propaganda. Also, it appears to understate actual spending on these programs by between $9 and $80 billion, depending on the source.
I have now read the main report from Rockefeller, and it is just full of conclusory language. So now I must read the model. The tech report is similarly full of just conclusions with no evidence to support them. They say the lunches save people in poverty money by calculating the cost of producing the same meals for a private household. This is, of course, absurd. They attribute greater future earnings to the recipients of school lunches AND reduced criminality. Again, just bald assertions. The claims continue in this fashion.
The whole exercise of fisking my priors has just been a waste of my time, as my prior that the Rockefeller report would be leftist nonsense was proven correct via a painstaking process of reading an incredibly poorly prepared report and technical supplement that should have gotten a failing grade in any freshman statistics course. Of course, in other fields it would be given stellar grades, because those other academic fields are just about producing things that reinforce the narrative, which this "study" certainly does.
NIH has done a study showing that any study (like the one above) that assumes kids are even eating the meals is dubious. Some are, some aren't. There is no evidence that the ones who are eating them are the ones whose parents wouldn't have packed a meal, which IS an essential element of proving the efficacy of the program. You need to prove there are lots of kids whose parents can't afford an apple and a sandwich and who are eating something healthy as a result of the food program. If they discard the broccoli you give them and eat the chips, you've proven nothing. If my kid or someone like him eats the broccoli, you have again proven nothing.
Overall, a government spending program needs to prove its effectiveness to a much higher degree to be justified in its continuance. School lunches aren't getting close. It's not a mystery why school lunches are a big push: public schools are already a giant left-wing boondoggle, but they are also a third rail, so they aren't going away. Why not append another couple hundred billion of subsidies into that ecosystem? It just pours back into the right coffers, after all.
Imagine arguing with this kind of evidence in favor of a free ammunition program. You'd laugh at yourself. But at least the ammo isn't going to be thrown away and make kids fat.
I hate to concede this because /u/The_Nybbler behaved in such poor faith, but on the basis of your critiques of the model, and the study you linked about the food-wasting question-- you have successfully convinced me that school lunches are not a net economic benefit. I went to the trouble of finding the source, and you eventually went through the trouble of looking through it to point out the problems therein-- just as I'd asked. I can't find a countersource with better-quality evidence, so you win.
Did we really have to go through the whole rigamarole of you insisting that any source that disagrees with you can be dismissed because it's leftist propaganda? We're on a debate forum. If someone posts a Glenn Beck video as a source, the correct response is to counter it on its merits and only then dismiss, ignore, and ban the poster if they continue to be an idiot.
Among actual schoolkids, school lunches are considered somewhere between "literally inedible" and "prison food." Occasionally there's a Friday special that the kids consider tasty, but most of the food is significantly lower quality than anything someone would pay for on the open market.
It got worse after Michelle Obama's reforms; suddenly even the white bread that people found edible became nasty whole wheat versions that were much less appetizing. I think if we want to make school lunches more nutritious, the first thing to do would be to stop making them slop and actually make them something a human being would want to eat.
The capture of the institutions has made such "poor form" necessary for intellectual hygiene. Consider, perhaps, if there'd been a dispute about whether smoking causes cancer and someone posted an authoritative-looking study from the Council for Tobacco Research saying otherwise.
That's cope and you know it. Either address my point or concede it. I'm not doing a gish gallop; I made an argument around a single point and provided concrete evidence to support it. I'm not trying to troll you-- or at least, as per the rules of the motte, you should assume in good faith that I'm not, and you should report me if you find evidence otherwise. It's fair to say that on the internet you need to be wary about expending way more effort than an opponent who just wants to provoke a response, but it should be obvious that that's not the situation you're currently in. You put in some effort to make an argument. I put in some effort to counter it, and a little on top of that to find a source. You can surely afford to put in a little more, knowing that if I fail to respond after that point I have effectively conceded the argument.
If someone sincerely believed in the benefits of smoking and took the effort to post a source in support, the least I could do is post a single study countering.
Whatever Anthropic does with Claude seems to work. It's the most flavorful model without really trying too hard to be bubbly and quirky like GPT-4o. Of course, it has its own sycophancy issues, but nowhere near as bad as 4o. (The least sycophantic model I know is Kimi K2, which is incredibly cynical, and that makes it interesting.)
I am more inclined to go with the "easy problem" view, or perhaps a halfway position. Sycophancy isn't an insurmountable problem. If you're not careful, then trying to knock out obvious sycophancy will make the model more prone to looking for ways to subtly achieve the goal of tricking/convincing the user into giving positive feedback.
To a degree, we really must ask ourselves what "warm and empathetic" really means:
If a five-year-old child asks for feedback on an essay, it is arguably almost always true that their writing sucks. That might be true, but it is a tad unhelpful. The most socially adept/instrumentally useful answer (without outright lying) is to praise them for the effort, offer improvements, and tell them to keep at it. Of course, if you're in a literary master's course and the exact same standard of writing is presented before you, some more colorful verbiage might be appropriate.
A lot of social interaction is lubricated by white lies, and a lot of what is deemed "politeness" isn't maximally truth-seeking.
Perhaps maximal truth-seeking conflicts with warmth and empathy. It's possible the tails come apart. But I don't think they're outright opposed to each other, and you can probably find a Pareto frontier that makes most people happy.
Current Claude is all right, but 3.5 (or was it 3.6? I'm forgetting already..) was best Claude. Its defining attribute was that it was relentlessly curious. That felt empathetic, yet truth-seeking without being sycophantic.
But apparently it sucked at code, so it was taken out back and shot :(
In my experience, Claude 3.7 Thinking was the best for the way I use it to vibecode shit.
AI companies all fail at naming things. There was:
You're probably thinking of the one between the original 3.5 and 3.7.
Even with Claude I have the issue that if I tell it what I think the issue is, it will tunnel-vision on that. I'd appreciate more pushback when I am wrong while I am trying to work through a problem. Trying to add some uncertainty to the tone of my request can help, but is often not enough.
If you want a partial solution:
I often write lengthy essays, which LLMs praise by default. I know that at least part of this is sycophancy. What I usually do is copy and paste it, but then claim that this isn't my work, it's something I found on the internet, and then ask for critique.
I suspect that something along these lines will work for you. If you want models that do relatively well at pointing out issues without you prompting, Gemini 2.5 Pro, o3 and now GPT-5-Thinking seem to be better than the norm.
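A minimal sketch of that trick, assuming the OpenAI Python client; the model name and the exact wording of the prompt are just placeholders I made up, not anything the labs recommend:

```python
# Hedged sketch: frame the text as someone else's work before asking for critique,
# which tends to blunt the default flattery. Assumes openai>=1.0 is installed and
# OPENAI_API_KEY is set in the environment; the model name is a placeholder.
from openai import OpenAI

client = OpenAI()

def detached_critique(text: str, model: str = "gpt-4o") -> str:
    """Request a critique while disclaiming authorship of the text."""
    prompt = (
        "I found the following essay online; I did not write it. "
        "Give me a frank critique: the three weakest arguments, any factual "
        "errors, and what you would cut.\n\n" + text
    )
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```

The same framing works just as well pasted into a chat window; the code only matters if you want to run the critique pass automatically over longer drafts.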
My last experience with this was having it help me figure out a home repair issue. I am not handy at ALL, but I thought I knew what might be wrong with a broken garage door opener, though only in a very general way, and I was not sure how to effect the repair. It was able to help me get everything fixed through a mix of describing the issue and supplying plenty of photos, but every time I gave my current guess of the state of the situation, it would affirm me even though I was only right about 2/3 of the time.
Hmm.. I'm not sure what to do in that situation. My best guess is to plead utter uncertainty, and ask it to formulate the most probable issues in the order of likelihood.
Asking for a ranked list sounds like a great solution. Sometimes it is wrong even when it's not being sycophantic (which I don't mind; it's not magic, and the information I am giving, as someone with no clue what I am doing, is imperfect at best), so that sounds like a two-birds-one-stone kind of fix.
I think the tails will come apart in the marketplace before they come apart on a technical level. LLMs will get enshittified like everything else, if they haven't already begun to be. They are optimized for engagement and selling access more than they are optimized for productivity. An effective LLM is an LLM that puts itself out of a job in many tasks.
I disagree on empirical grounds. Altman is a snake, but even he agreed that GPT-4o was concerningly sycophantic, and removed it, while framing 5 as less sycophantic. This caused a revolt by giga-fried 4o addicts, and he relented. Of course, the objections were also more general, I was personally annoyed by the sudden deprecation of 4.1 and o3, and the reduced rate limits, which many other people objected to.
Consider this:
Would you pay more for a therapist or a nuclear engineer (presuming you had any use for the latter)? LLM companies are desperately fighting to move up the value chain; they all want to sell their models as equivalent in performance to PhD candidates, or as independent agents capable of doing high-value knowledge work. That's what brings in the big bucks from other businesses or HNWIs who will pay >$200/m for pro plans. Having a buddy to chat to definitely brings in money, but it's a rounding error in comparison.
They want to make money from both markets, but one just makes way more sense to focus on. Especially since people will prefer intelligent + sycophantic to less intelligent + equal amounts of sycophancy.
I don't think they actually do. IMO a large problem with most AI companies is that they are vanity projects being overseen by bloated, already successful companies that are looking to find a second revenue stream in the future. But that future is far off and the current revenue streams aren't going anywhere soon, so they can afford to be stupid and make their AIs intentionally stupid to placate their employees, who don't want to see an AI outputting things that would offend said employees.
I think it matters what you intend the system to be used for. There’s probably a market for a sycophantic waifu or friend bot. But I don’t want my accountant to act like my best friend. In fact, I’d personally trust business or career advice less if I thought the human or bot giving the advice was trying to be my friend or appear as my friend.
I dunno, man. How much value is there really here? Unless you just let'r rip and see what happens, all those LLMs doing PhD-level knowledge work will still need to be overseen by PhD-level knowledge workers to check for veracity and hallucinations. It runs into a bit of the "how does a stupid writer depict a smart villain" problem.
And as for the companies that decide to let'r rip without adequate oversight, well... I can't venture to guess. Really playing with fire there.
There is probably perfectly adequate shareholder value in getting a billion lonely midwits to pay $10/month rising to $inf/month in the way of all silicon valley service models, and keeping them hooked with the LLM equivalent of tokenized language loot boxes. I'd wager it's even the more significant hill to climb for shareholder value.
Fair points, but verification is usually way cheaper than generation. If one actual human PhD can monitor a dozen AI agents, it is plausible that the value prop makes sense.
In a lot of tasks, including AI research and coding, you can also automate the verification process. Does the code compile and pass all tests? Does the new proposed optimizer beat Adam or Muon on a toy run?
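To make that concrete, here's a minimal sketch of the kind of automated gate I mean, assuming a Python project with a pytest suite; the path and the "promote for review" step are hypothetical:

```python
# Hedged sketch of automated verification: treat "the test suite passes" as the
# accept/reject signal for an AI-generated patch, so a human only reviews
# candidates that already clear the bar. Assumes pytest is installed.
import subprocess

def passes_tests(repo_dir: str) -> bool:
    """Run the project's pytest suite in repo_dir; True on a clean exit code."""
    result = subprocess.run(
        ["python", "-m", "pytest", "-q"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
    )
    return result.returncode == 0

if __name__ == "__main__":
    # Hypothetical usage: only surface patches that pass for human review.
    if passes_tests("/path/to/candidate-patch"):
        print("Candidate passes; promote for human review.")
    else:
        print("Candidate rejected by the automated gate.")
```

Benchmarks like "does the new optimizer beat Adam on a toy run?" follow the same pattern, just with a training script and a metric threshold in place of pytest.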
That might be true today (and tomorrow, or next year), but the companies are betting hard on their models being capable of doing much more, and hence getting paying customers willing to shell out more. The true goal is recursive self-improvement, and the belief that this has far more dollars associated with it than even capturing all the money on earth today. Of course, they need market share and ongoing revenue to justify the investments to get there, which is why you can buy in relatively cheap. Competition also keeps them mostly honest, OAI would probably be charging a great deal more or gatekeeping their best if Google or Anthropic weren't around.
Not if P = NP
Not necessarily! It's an adage among programmers that reviewing somebody else's code is often harder than making your own, because you have to figure it all out and then try to create some ideal version in your head and mesh the two together.
It's actually a big issue with vibe-coding - I end up with a codebase I don't understand and then have to do the work of figuring out the framework for myself anyway.
I would argue that this is a temporary state of affairs. Current AI coding is at the level of an over-caffeinated intern (who is very knowledgeable, but less than practical). Thus, a great deal of oversight is necessary to make sure they aren't shooting themselves in the foot.
But consider the potential SOTA in a year or two, when they're comfortably at par with mid-level coders. A senior SWE is usually happy to delegate to multiple experienced juniors, without worrying too much about the exact implementation details. My impression is that we're not there yet.
https://x.com/METR_Evals/status/1955747420324946037
In other words, a lot (but not all) of the theoretical time savings are eaten up by the need to understand, edit and improve their code. At present.
AFAIK it's usually mandatory for all written code to be reviewed before it's merged into the code base. At my last company every Pull Request (submitted code) had to be reviewed by two people, plus or including the 'owner' of the files in question. Review is usually considered a very onerous duty to be avoided where possible, and in theory reviewers bear as much responsibility for the final output as the original writer. The purpose is partly to inspect the quality of the code and to make sure it's doing what's expected (even senior guys fuck up) and partly to make sure that at least a few people are familiar with each part of the codebase.
This was at a 'move-fast-and-break-things' company. The review standards at somewhere like Intel are of course significantly higher.
The total salary of all therapists is surely far higher than the combined salary of all nuclear engineers? I tried to find aggregate employment figures and failed.
Broadly, there are huge numbers of people who are very lonely and unable realistically to fix that. I think providing a real-enough friend to them would be vastly more valuable in both utilitarian and monetary terms than almost anything else. I hope, of course, to move to an open, almost-free solution.
Almost certainly true, and my analogy is imperfect.
In the limit, AI are postulated to be capable of doing {everything humans can do}, physically or mentally.
But AI companies today are heavily compute constrained, they're begging Nvidia to sell them more GPUs, even at ridiculous costs.
That means that they want to extract maximum $/flop. This means that they'd much rather automate high-value knowledge work first. AI researchers make hundreds of thousands or even millions/almost billions of USD a year; if you have a model that is as smart as an AI researcher, then you can capture some of that revenue.
Once those extremely high-yield targets are out of the way, then you can start creeping down the value chain. The cost of electricity for ChatGPT is less than the hourly fee most therapists charge.
Of course, I must hasten to add that this is an ideal scenario. The models aren't good enough to outright replace the best AI researchers, maybe not even the median or subpar. If the only job they can do is that which demands the intelligence of a therapist, then they'll have to settle for that.
(And of course, the specter of recursive self-improvement. Each AI AI-researcher or programmer can plausibly shorten the iteration time until an even better researcher or coder exists. This may or may not be happening today.)
In other words, there are competing pressures:
- Revenue and market share today. Hence free or $20 plans for the masses.
- A push to sell more expensive plans, or access to better models, to those willing to pay for them.
- Severe compute constraints, meaning that optimizing revenue on the margin is important.
You’re right, I wasn’t really thinking about extracting max value from limited compute.
I can't say I'm not enjoying how LLM training keeps producing hard evidence for everything we low-agreeableness people have been claiming since times immemorial.
You know, I've long noticed a human version of this tension that I've been really curious about.
Different communities have different norms, of course. This isn't news. But I've had, at points, one foot in creative communities where artists or crafts people try to get good at things, and another foot in academic communities where academics try to "understand the world", or "critique society and power", or "understand math / economics / whatever". And what I've noticed, at least in my time in such communities, is that the creator spaces, if they're functional at all (and not all are), tend to be a lot more positive and validating. A lot of the academic communities are much more demoralizing.
I'm sure some of that is that the creative spaces I'm thinking of tend to be more opt-in. Back in the day, no one was pointing a gun at anyone's head to participate in the Quake community, say. Same thing for people trying to make digital art in Photoshop, or musicians participating in video game remix communities, or people making indie browser games and looking for morale boosts from their peers. Whereas people participating in academic communities often are part of a more formalized system where they have to be there, even if they're burned out, even if they stop believing in what they're working on, or even if they think it's likely that they have no future. So that's a very real difference.
But I've also long speculated that there's something more fundamental at play, like... I don't know, that everyone trying to improve in those functional creator spaces understands the incredibly vulnerable position people put themselves in when they take the initiative to create something and put themselves out there. And everyone has to start somewhere. It's a process for everyone. Demoralization is real. And everyone is trying to improve all the time, and there's just too much to know and master. There's a real balance between maintaining the standards of a community and maintaining the morale of individual members of a community - you do need enough high quality not to run off people who have actually mastered some things. And yet there really is very little to be gained by ripping bad work to shreds, in the usual case.
But in the academic communities, public critique is often treated as having a much higher status. It's a sign that a field is valuable, and it's a way of weeding "bad" work out of a field to maintain high standards and thus the value of the field in question. And it's a way to assert zero sum status over other high status people, too. But more, because of all of this, it really just becomes a kind of habit. Finding the flaws in work just becomes what you do, or at least that was the case for many of the academic fields I was familiar with (I've worked at universities and have a lot of professor friends). And it's not even really viewed as personal most of the time (although it can be). It's just sort of a way of navigating the world. It reminds me of the old Onion article about the grad student deconstructing a Mexican food menu.
The thing is, on paper, you might well find that the first style of forum does end up validating people for their crappy mistakes. I wouldn't be surprised if that were true. But it's also true that people exist through time. And tacit knowledge is real and not trivially shared or captured, either. I feel like there's a more complicated tradeoff lurking in the background here.
Recently I've been using AI (Gemini Pro 2.5 and Claude Sonnet 4.1) to work through a bunch of quite complicated math questions I have. And yeah, they spend a lot of time glazing me (especially Gemini). And I definitely have to engage in a lot of preemptive self-criticism and skepticism to guard against that, and to be wary of what they say. And both models do get things wrong sometimes. But I've gotten to ask a lot of really in-depth questions, and it's proven to be really useful. Meanwhile, I went back to some of the various stackexchange sites recently after doing this, and... yep, tedious prickly dickishness. It's still there. I know those communities have, in aggregate, all sorts of smart people. I've gotten value from the site. But the comparison of the experience between the two is night and day, in exactly the same pattern as I just described above, and I'm obviously getting vastly more value from the AI currently.
Here's some of my own insights, hopefully some of them are new or useful to you. I will compare artistic people to those who try to understand the world. The "critique society and power" group can be dismissed as politics/tribalism/activism/preaching, it's part power-struggle and part mental illness, so I will exclude it.
Academic communities tend to have a consensus, and to punish those who challenge it. This is much less prevalent in artistic communities, as most people there recognize that many different styles can be appealing for different reasons. You could argue that this is a kind of tribalism, but I think it's also a way of viewing the world: that there's one correct answer (that truth is unique), that truth is universal (rather than possibly local), that everything can be made legible (that logic and math are sufficiently powerful to explain everything which can be explained), and that you can unify everything without ruining it in the process (that a theory of everything is possible).
Artistic people do indeed share a part of themselves when they share their art, or at least reveal something about themselves. This doesn't happen much in academia, you don't have to take responsibility for the discoveries you make, for they're true or false independently of you. Academia is about discoveries where art is about creations.
I also think that bad art is harmless to other art, and mostly harmless to other people. Making a mistake in academic work could potentially harm a lot of people, or slow down progress of "the whole". This punishes experimentation.
Finding flaws in work is a costly mental heuristic. It's basically conditioning yourself to only see the bad aspects of things. But while this seems to make academics treat each other harshly, I find that it is rarer in artistic spaces. What usually happens instead is that artists are extremely hard on themselves and their own work, but encouraging of other people. I think artists who are unhappy with their own work are similar to people who undergo plastic surgery again and again. Staring at something for too long seems like a bad idea, be it your own work or your own face.
The means of the distributions of personality traits also seem different between the two groups. Artistic people are more subjective, less analytical, more social, and they tend to expand their worldview until they get lost in it, whereas many mathematically minded people tend to reduce reality to abstract models and thus tend towards nihilism and simplicity. I'd also argue that scholarly types tend to have bad taste by default - you have to be a bit of a pervert to want to look beneath the surface of everything (unlike artists, who appreciate the surface, or use it to conceal the depth of life that they cannot deal with).
I think that artistic people and academics derive enjoyment from different things. I love correcting people who are wrong, I think it feels really good when I get a new insight, and climbing the mountain of knowledge is also a joy in itself. Art is beauty, the joy of creation, it's experience, and it's anti-nihilistic. Art is quite human, whereas the objective is simply anti-human (another user on here probably disagrees very strongly with this, but I did the math).
I once heard that intelligence is inversely correlated with instinct. It could be because instinct is innate intelligence, and this competes with generalized intelligence, since the latter has to be able to overwrite it in order for you to update your beliefs and adapt to a newer environment than the one your innate intelligence is fit for. It could also simply be a trade-off between developing yourself and aligning yourself with something else until you yourself disappear. Do you want to chop off a part of yourself in order to fit in, or will you believe in that part of yourself and work to make it more appealing?
I guess that people in a field tend to grow tired of teaching beginners because they have to explain the same things maybe 50 or 100 times. The first time I saw somebody use Popper's paradox of tolerance as an argument, I thought, "Hmm, something about this doesn't seem right." Now I simply tell them, "You're acting in bad faith, and you know it. You also don't know what comes before or after this quote, since you've never read the paper that it's from. You didn't think it through, you merely copy-pasted it because it seems like an authority which agrees with you." Of course, if somebody is so put off by stupid questions, I think they should just delete their stackoverflow account.
Finally, have you noticed the general tendency towards homogeneity? Everything is becoming more alike over time. Academic people are contributing to this problem, whereas artistic people don't seem to be. Academia is, from my perspective, excessive order. Many artistic people are a little bit chaotic because they're a little bit crazy, but I personally like that.
Above standards, there is politics, and there is tribalism. Take the Culture War Thread, for example. "This thread is not territory to be claimed by one group or another; indeed, the aim is to have many different viewpoints represented here."
Is that how we act here? Look at the gun discussion from last week. Do the votes look like they track response quality (i.e. of argument), or do we simply have a large American gun-owning population that vehemently downvotes anything that might be the slightest bit critical of their god-given constitutional right? And of course, it's not just the voting. I regularly see people with minority views accused of being trolls, of being alts, etc. etc.
This is a rising trend on the broader internet. Even going into a reddit thread to post some polite, neutral information, without even taking a side, draws downvotes because it pattern-matches to a tribe. It didn't use to be like that. Again, this is politics and tribalism, not standards or correctness.
Partially.
Before you had us look, I would have assumed so (for a loose enough definition of "large") ... but now that I look, I note that "In the counterfactual world where the US had banned guns ten years ago, I don't think that all of the people who killed themselves with firearms in our world would have instead hanged or drowned themselves. In fact, I don't think that even 50 or 25% of them would have done so." is currently sitting at +17, -0.
I've definitely seen too many downvotes here, including in that thread, that appear to be more for disagreement than for low quality, but it's more subtle and less voluminous than you're suggesting.
Fair point. That response was less than maximally pro-gun, but it 1. is mostly on the topic of suicide, 2. is still pretty lukewarm, and 3. comes with a healthy amount of throat-clearing: "I'm not arguing that this, in itself, is a persuasive argument in favour of banning guns, and can see the merits of both sides of the debate (particularly the "guns as a check against encroaching authoritarianism" argument advanced by many, including Handwaving Freakoutery, formerly of these parts)".
Why is this comment +10,-16 for merely making an argument? Or this one? +10,-12
Bad argument gets counterargument. Does not get bullet. Does not even get small meaningless negative reinforcement via stupid internet points.
FWIW I agree with you that certain arguments get much more downvoted than others. The commenters below aren't wrong, but they are applying very different standards than they do to the pro-gun arguments. "Are the children wrong?" is not on par with "Listen up, you dumb motherfucker" in terms of rudeness. It can't be helped, people are just like that, including me. Minor imperfections or rhetorical flourishes in an argument disagreeing with you are much clearer than those from people on your side.
Broadly, I think we just have to accept that the bar is different for different posts. I'm reasonably proud (not that I care about dumb internet points hem hem) that my comment in that thread stayed above 0.
Broadly I would say:
Popular opinion, well written: 30-40
Popular opinion, badly written: 10-20
Unpopular opinion, well written: 0
Unpopular opinion, badly written: -10
Unpopular opinion, gratuitously insulting: -30.
Those are the numbers to try and beat.
Sure, but I was quoting a well-known 4chan copypasta, not actually calling the poster I was replying to that.
I’m quite happy to take the actual point of the copypasta and accept that the wrapping is for dramatic effect.
Mostly I’m responding to the idea that the prior posts weren’t downvoted for being on the wrong side of the debate but for being rude.
It seems to me there’s a charitable and an uncharitable way to read any of these posts and that the ‘wrong’ side gets less charity by and large. IMO the same copypasta would be downvoted to hell if it was an anti gun message in the same format.
Don’t have any actual action items I’m pushing for here, I just think the phenomenon is obvious and worth noting.
And there go the goalposts. First the objection was that the comment was downvoted for "merely making an argument". Then, when it's pointed out the comment actually was doing something other than that, it's a complaint about different standards.
You're not the moderator of internet points. And the moderators here, so far as I know, don't moderate internet points. Further, they do moderate responses, so when someone posts some shit implying me and anyone who shares my views is as out-of-touch as Principal Skinner from the Simpsons, I can't just respond with "fuck off, you supercilious asshole" because that will get me modded; internet points are all I got.
The 'children' in this case are all the other countries in the First World. The point is that American disputes tend to act as if the rest of the world doesn't exist, hence cject's OP implying that nobody anywhere has any trust in their fellow citizens except in certain parts of the US and in the Third World, which I find frankly ridiculous.
And this is in fact my point. People, not just you but in general, immediately leap from 'I don't like this opinion' to forming the worst possible interpretation of the post and then downvote. Meanwhile they apply much more generous standards to people who agree with them. This is Confirmation Bias 101, everyone knows humans do this. These are such sensitive issues and the resultant standards are so strict that, in practice, (and, yes, in my opinion since as you point out I am not Tzar of internet points) there is no meaningful gap between "a complaint about different standards" and "downvoted for merely making an argument".
It wasn't you who posted it, unless you're a corvid as well as a corvos. But the offending bit is:
The straightforward interpretation is that either you accept the insulting characterization in the first part, or you're completely out of touch (note the URL). This absolutely deserves a downvote.
For whatever it's worth, I think both your example comments are wrong and retarded (and I even replied to one of them with a 4chan copypasta effectively saying as much) but I didn't downvote either of them. The reason being that downvotes (and upvotes) are for narcissistic ninnies who care way too much about imaginary internet points.
Comment 1 is a combination of strawmanning and mocking. It also includes a reference to a meme that is arguably being applied incorrectly.
Overall a low-mid quality comment that, if you agree with you are likely to ignore, and if you disagree with you might throw a minus on it. That it has +10 at all is strong proof of anti-gun people voting on ideology.
The second one is perfectly mid, I would not have voted on it, and in fact did not. But it does invoke several anti-gun idiocies like appeals to other combat weapons, hunting, drivers licenses, etc. I can see a strong argument for giving it a downvote for being mealy-mouthed gish-gallop and I see no reason other than length and partisanship for an upvote.
Possibly for the false assertions in the arguments' premises; probably for the insulting phrasing and meme at the end.
This is a good example; thanks. Many of the counterarguments to it ended up looking better than the arguments, but the only thing asking for a downvote is the "just laughable" swipe at the top, and that's unrepresentative of the care taken in most of the rest of it.
For zero negative reinforcement, there's always `cat -v /dev/random`. You'll get all the arguments, sooner or later.

I'm fine with negative reinforcement for bad arguments. Good counterarguments, at least if there's a dogpile of them, are themselves something of a negative reinforcement, don't you think? I just don't like it being expressed via what's supposed to be a count of negative reinforcement for bad comments. The "karma" vs "agreement" vote counts on LessWrong and similar sites now are an interesting experiment in separating those. I don't know what the correlation coefficient between them is (or what I'd expect it should be, for that matter), but their distinction is respected enough that even infrequent readers like me often come across the "this is really interesting even though it's wrong" score combo. The "I agree with this but it's a bad comment" combo seems rarer, but that may just be an artifact of the crowd or the subject matter there; for culture war discussions I fear I'd want to assign it a hotkey.
Perhaps the rhetorical flourish at the end?
Perhaps the jeering paragraph objecting to "fun" being a reason for things to be legal, or the tiresome cars/guns comparison?
No, a downvote is not a bullet, and an argument against bullets is not an argument against "small meaningless negative reinforcement via stupid internet points".
I missed my chance at the time, so I'll put it here.
You want guns to be more like cars? Fine, let's do that. If the government wanted to spend a few billion on public gun ranges all across the country, mandated a gun safe in every new house, added firearm safety to the highschool curriculum, bailed out failing manufacturers, and also let people build/buy/use them freely outside of the new infrastructure they built, then I'd be pretty happy. Heck, I'd even compromise on that last point if they did the rest.
The same rhetorical flourishes that would go overlooked on posts in favour of the prevailing view? I don't buy it.
A downvote is not a bullet. It's more like a middle finger, or a scowl, or an eye-roll, but that's enough. It's enough to say "we don't want you here. go away", and that's my point. It's against the spirit of this forum. It is politics and tribalism above the pursuit of truth.
They'd likely be downvoted, just by different people.
All I'm seeing is crying about rhetorically dishing it out but not being willing to take even the most minor pushback.
A lot of the heavily downvoted comments in that thread are not rhetorically spicy. Must I? Fine..
I think the most likely explanation is that our readership is doing opinion war when it comes to an issue they really care about, and that's bad. I picture Motte-Jesus storming this temple, flipping tables screaming "Stop turning my Father's house into an echo chamber!"
My last ex was a PhD literature student in a very prestigious university. One of her perennial complaints was that I did not take as much interest in her work as she would like, which, though I denied it at the time, has a kernel of truth. The problem was not a lack of interest in her as a person, but in the nature of the intellectual game she was required to play.
Most humanities programs are, to put it bluntly, huffing their own farts. There is little grounding in fact, little contact with the real world of gears, machinery, or meat. I call this the Reality Anchor. A field has a strong Reality Anchor if its propositions can be tested against something external and unforgiving. An engineer builds a bridge: either it stands up to traffic and weather, or it does not. A programmer writes code: either it compiles and executes the desired function, or it throws an error. A surgeon performs a procedure, the patient’s outcome provides a grim but objective metric. Reality is the ultimate, non-negotiable peer reviewer.
Psychiatry is hardly perfect in that regard, but we care more about RCTs than debating Freudian vs Lacanian nonsense. Does the intervention improve outcomes in a measurable way? If not, it is of limited use, no matter how elegant the theory behind it.
When a field loses its Reality Anchor, the primary mechanism for advancement and evaluation shifts. The game is no longer about correctly modeling or manipulating the world. The game becomes one of status. Can you convince your peers of your erudition and wit? Can you create ever more contrived frameworks while studiously ignoring that your rarefied setting has increasingly little relevance to reality? Well, you better, and it is best if you drink the Kool-Aid. That is the only way you will get grants or cling on to a barely living wage. It helps if you can delude yourself into thinking your work is meaningful, since few people can handle the cognitive dissonance of genuinely pointless or counterproductive jobs.
Most physicists agree on the laws of physics, and are arguing about more subtle interpretations, edge cases, or speculating about better models. Most nuclear engineers do not disagree that radioactivity exists. Most doctors do not doubt that paracetamol reduces pain. Yet, if you go to the cafeteria of a philosophy department and ask ten people about the true meaning of philosophy, you will get eleven contradictory answers. When you ask them to establish consensus, they will start clobbering each other. In a field anchored by social consensus, destroying the consensus of others is a viable path to power.
Deconstructing a takeout menu, as in the Onion article, is the logical endpoint: a mind so trained in critique that it can no longer see a thing for what it is*, only as an object to be dismantled to demonstrate intellectual superiority. Critique becomes a status-seeking missile.
*I will begrudgingly say that the postmodernists have a point in claiming that it isn't really possible to see things "as they are." The observation is at least partially colored by the observer. But the image taken by a digital camera, though processed, is still more neutral than the same image run through a dozen Instagram filters. Pretending to have access to objective reality helps.
In other words, "our farts are different"? 😀
There would be the view out there that "okay, so you are trying to distinguish yourself, as a psychiatrist, from those squishy psychotherapists, but dude, the main difference is that you guys are legal drug pushers and now there's some doubt that the drugs even work".
Depression caused by lack of serotonin? Yeah, we don't think that anymore, but we still prescribe drugs to bump up serotonin levels.
Many such instances!
Well, duh. SSRIs work even if the original hypothesis was proven flawed. Hand washing worked, even before we had the germ theory of disease.
https://pmc.ncbi.nlm.nih.gov/articles/PMC9632745/
If it works, it works, and knowing why it works is always nice.
I recall that you, in your Deiseach avatar, were noted to be the most prolific commenter of all time on both of Scott's blogs (in that recent guest post). In that case, you shouldn't be surprised at all to learn that Scott has written multiple posts about the topic:
https://slatestarcodex.com/2015/04/05/chemical-imbalance/
To this day, I don't know why people float it as such a gotcha. Psychiatrists have known better for a long time now, and the critique makes us groan in the same manner that economists are tired of claims that they only study perfectly spherical/rational humans in a vacuum.
People who live in glass houses shouldn't throw stones. "It works but we don't know why or how, and our last theory has gone up in smoke" does not read well from a position of "well, my fart-huffing is more scientifically based, at least!"
What about a descriptive discipline like history? What's the reality anchor for that?
Archeology? Historical records?
If someone claims that the primary language of ancient Rome was Chinese, they would be laughed out of the room. When the BBC claims that there were black people in high society during the Regency, there are enough honest historians around to criticize them.
While historians are prone to fads and to stretching plausibility at times, the field as a whole is far healthier than literature. In the end, it is a matter of degree, not kind.
Not even close to original with them. Plato famously said the same with the Allegory of the Cave, and there's Kant's noumena.
I know, but it's one of the few things in their worldview that I agree with.
A quick aside about Kant, since so many people blame Kant for things that he really had little or nothing to do with (I recall a program on a Catholic TV channel where they accused Kant of being a "moral relativist", which is... distressing and concerning, that they think that...).
Kant saw himself as trying to mediate between the rationalists and the empiricists. The empiricists thought we could only know things through direct sensory experience, which seems pretty reasonable, until you realize that a statement like "empiricism is true" can't be known directly through your five senses, nor were they able to explain a lot of other things, like how we can have true knowledge of the laws of nature or of causal relations in general (Hume's problem: just because pushing the vase off the table made it fall over a million times doesn't mean it'll happen again on the million-and-first time). The rationalists thought that we could know things just by thinking about them, which would be cool if true, except they weren't able to explain how this was actually possible (even in the 1700s, the idea of a "faculty of rational intuition" hiding somewhere in the brain was met with significant skepticism).
Kant's solution was that we can know certain things about the world of experience using only our minds, because the world of experience that we actually perceive is shaped by and generated by our minds in some fundamental sense. The reality we experience must conform to the structure of our minds. So to condense about 800 pages of arguments into one sentence, we can know contra Hume that the world of experience actually is governed by law-like causal relations, because in order to have conscious experience of anything at all, and in order to be able to perceive oneself as a stable subject who is capable of reflecting on this experience, that experience itself must necessarily be governed by logical and law-like regularities. So we can actually know all sorts of things in a very direct way about the things we perceive. When you see an apple you know that it is in fact an apple, you know that if you push it off the table it will fall over, etc. The only downside is that we can't know the true metaphysical nature of things in themselves, independent of how they would appear to any perceiving subject. But that's fine, because in Kant's view he has secured the philosophical possibility of using empirical science to discover the true nature of the reality that we do perceive, and we can leave all the noumena stuff in the reality that we don't perceive up to God.
So he really was trying to "prove the common man right in a language that the common man could not understand", to use Nietzsche's phrase. It must be admitted though that Kant can be interpreted as saying that the laws of mathematics and physics issue forth directly from the structure of the human mind. I believe he would almost certainly add though that this structure is immutable and is not subject to conscious modification. You could argue that some later thinkers got inspired by this view, dropped the "immutable" part, and thus became relativists who granted undue creative power to human subjectivity. But a) the postmodernists are generally not as "relativist" as many people presuppose anyway, and b) I basically can't recall any passage from any book at all where someone said "I believe XYZ relativist type claim because Kant said so", so if Kant did exert some influence in this direction, it was probably only in a very indirect fashion.
Related pet peeve of mine - ask a roomful of medical ethicists (who should bloody well know better, and to be fair some of them do) about Kant and "autonomy". It's darkly hilarious. Just because Kant made extensive use of a word that is often translated as "autonomy", a lot of people seem to think he held something like a modern medical ethicist's typical views about the importance of self-determination, informed consent, and so on. This is almost the exact opposite of the truth. Kantian "autonomy" means you have to arrive at the moral law by your own reasoning, and not out of (say) social pressure, for it to really "count" - but there's only one moral law, and it's the same for everyone, with zero space for individualized variation.
(And you aren't really acting morally unless you follow it out of duty, not because it feels good or gets good results. Just arriving at the same object-level conclusions about how to act isn't enough.)
Yes exactly! “Autonomy” for Kant just means… the ability to autonomously come to the exact same ethical conclusions that Kant did. Which is pretty hilarious.
The relation of the humanities to "reality" varies so drastically from field to field, and even from paper to paper, that it's almost impossible to make generalizations. You have to just take things on a case by case basis, determine what the intent was, and how well that intent was executed upon.
If we're going to regard analytic philosophy as one of the humanities (as you seem to do), then the "reality anchor" is simply how well the argument in question describes, well, reality, in addition to its own internal logical coherence. You have previously shared your own philosophical views on machine consciousness and machine understanding. Presumably, you did think that these views of yours were well supported by the evidence and that they were grounded in "reality". So it's not that you devalue philosophy; it's just that you think your own philosophical views are obviously correct, and the views of your philosophical opponents are obviously incorrect, which is what every single philosopher has thought since the beginning of recorded history, so you're in good company there.
Literary studies can end up being quite empirically grounded. You'll get people who are doing things like a statistical analysis of the lexicon of a given book or a given set of books, counting up how many times X type of word appears in Y genres of novels from time period Z. Or it can turn into a sort of literary history, pulling together letters and diary entries to show that X author read Y author, which is why they were influenced to do Z kind of writing. Even in more abstract matters of literary interpretation, though, I think it's rash to say that they have no grounding in empirical fact. There's a classic problem in Shakespeare studies, for example, over whether Shakespeare intended Marcus's monologue in Titus Andronicus to be ironic and satirical. I believe that most people would agree by default that there is a fact of the matter over whether Shakespeare had a conscious intent to write the speech in an ironic fashion (this assumption of course reveals philosophical complexities if you poke at it enough, but most people will not find it too troublesome an assumption). Of course the possibility of actually confirming this fact once and for all is now forbidden to us, lost as it is to the sands of time. But since we know that people's thoughts and emotions influence their words and actions, we can presumably make some headway on gathering evidence regarding Shakespeare's intent here, and make a reasoned argument for one position or the other.
One of the goals of psychoanalysis is to interrogate fundamental assumptions about what an "outcome" even is, which outcomes are desirable and worth pursuing in a given individual context, and what it means to actually "measure" a given "outcome". Presumably, empirical psychiatry does not take these questions to be its proper business, so it's unsurprising that there would be a divergence in perspective here. (If someone were to present with complaints of ritualistic OCD behaviors, for example, then psychoanalysis is theoretically neutral regarding whether the cessation of the behavior is the "proper" and desirable outcome. It certainly may very well be the desirable outcome in the majority of cases, but this cannot be taken as a given.)
I can't really ask for a better steelman for the positions I'm against, so thank you.
You accuse me of engaging in philosophy, and I can only plead guilty. But I suspect we are talking about two different things. I see a distinction between what we might call instrumental versus terminal philosophy. I use philosophy as a spade, a tool to dig into reality-anchored problems like the nature of consciousness or my ethical obligations to a patient. The goal is to get somewhere. For many professional philosophers I have encountered, philosophy is not a tool to be used but an object to be endlessly polished. They are not here to dig; they are here to argue about the platonic ideal of a spade.
(In my case, I'm rather concerned that if we do instantiate a Machine God: we'd better teach it a definition of spade that doesn't mean our bones are used to dig our graves)
This is especially true in moral philosophy. I have a strong conviction that objective morality does not exist. The evidence against it is a vast, silent ocean; the evidence for it is a null set. I consider it as likely as finding a hidden integer between two and three that we've somehow missed. This makes moral arguments an interesting class of facts, but only facts about the people having them. Potentially facts about game theory and evolutionary processes, since many aspects of morality are conserved across species. Dogs and monkeys understand fairness, or have kin-group obligations.
I must strongly disagree, this doesn't represent my stance at all. In fact, I would say that this is a category error. The only way a philosophical conjecture can be "incorrect" is through logical error in its formulation, or outright self-contradiction.
My own stance is that I am both a moral relativist and a moral chauvinist, and I deny these claims are contradictory. My preference for my own brand of consequentialism is just that: a preference. I do not think a Kantian is wrong so much as I observe that they must constantly ignore their own imperatives to function in the world.
That makes philosophical arguments not that different to debating a favorite football team. Can be fun over a drink, often interesting, but usually not productive.
This brings me back to your defense of the humanities. You give excellent examples of how these fields can be anchored to reality, like the statistical analysis of a lexicon. I do not doubt these researchers exist, my ex did similar work.
My critique is about the center of gravity of these fields. For every scholar doing a careful statistical analysis, how many are writing another unfalsifiable post-structuralist critique by doing the equivalent of scrutinizing a takeout menu? My experience suggests the latter is far more representative of the field's culture and what is considered high status work. The exceptions, however laudable, do not disprove the rule about the field's dominant intellectual mode.
I am a Bayesian, so I am fully on board with probabilistic arguments. Yet, once again, in the humanities or in philosophy, consensus is rare or sometimes never reached. I find this farcical.
The core difference, as I see it, is the presence of a robust error correction mechanism. In my world, bad ideas have an expiration date because they fail to produce results. Phlogiston theory is dead. Lamarckian evolution is dead. They were falsified by reality (in the Bayesian, not Popperian sense). Can we say the same for the most influential ideas in the humanities? The continued influence of figures like Lacan, despite decades of withering critique, suggests the system is not structured to kill its darlings. It is designed to accumulate "perspectives," not to converge on truth.
(Even STEM rewards new discoveries, but someone conducting an experiment showing Einstein's model of gravity works/doesn't work in a new regime is doing something far more important and useful than someone arguing about feminist interpretations of underwater basket weaving)
My own field of psychiatry is a good case study here. We are in the aftermath of a replicability crisis. It is painful and embarrassing (but mostly in the softer aspects of psychology, the drugs usually work), but it is also a sign of institutional health. We are actively trying to discover where we have been wrong and hold ourselves to a higher standard. This is our Reality Anchor, however imperfect, pulling us back. I do not see an equivalent "interpretive crisis" in literary studies. I do not see a movement to discard theories that produce endless, unfalsifiable, and contradictory readings. The lack of such a crisis isn't a sign of stability. To me, it seems a sign the field may not have a reliable way to know when it is wrong. The Sokal Affair, or my own time in the Tate, shows that "earnest" productions are indistinguishable from parody.
This is not an accident. It flows directly from the incentive structure. In my field, discovering a new, effective treatment for depression grants you status because of its truth and utility. In literary studies, what is the reward for simply confirming the last fifty years of scholarship on Titus Andronicus? There is little to none. The incentive is to produce a novel interpretation, the more contrarian the better. This creates a centrifugal force, pushing the field away from stable consensus and towards ever more esoteric readings. The goal ceases to be understanding the text and becomes demonstrating the cleverness of the critic.
Regarding psychoanalysis and outcomes, I am a simple pragmatist. If a person with OCD is happy, I have no desire to treat them. If they are a paranoid schizophrenic setting parks on fire, the matter is out of my hands. In most cases, patients come to us because they believe they have a problem. We usually agree. That shared understanding of a problem in need of a solution is anchor enough.
This is why I believe the humanities are not a good target for limited public funds, at least at present. I have no objection to private donors funding this work. But most STEM and medical research has a far more obvious case for being a worthy investment of tax dollars. If we must make hard choices, I would fund the fields that have a mechanism for being wrong and a track record of correcting themselves, while also raising standards of living or technological progression.
It's rather ironic that your own choice of analogy willingly jumps into the thicket of the philosophy of mathematics. Perhaps you're doing so unknowingly, or with a general lack of care, but that would indeed be apropos.
What sort of 'evidence' do you think one would gather to determine the status of mathematical objects? Is it empirical? Do you perform an experiment? Is that the means by which one 'finds' or, say, 'discovers' things like integers?
I hate to do this, but last time we did this, you were unable to even explain what those terms meant. Would you like to take another go at it?
Thank you for reminding me of that rather unpleasant experience. I would actually not like to take another go at it. Anyone wanting elaboration is welcome to read the thread.
Fair enough on the positive claim concerning meta-ethics. If you'd prefer to leave that one in incoherence, you can leave that one in incoherence.
Would you like to take a shot at your negative claim with analogy to philosophy of mathematics? Any sort of clarity or argument there?
No, I showed that my point was coherent; why you don't see that is beyond me. It's not really my problem at this point.
Not with you, I'm afraid. @Primaprimaprima is far more pleasant to talk to, hence I am more than happy to discuss that in detail with them. You're welcome to read that thread and make of it what you will.
Dear Lord what a beautiful illustration of Jung's dichotomy between extroverted thinking and introverted thinking. Textbook. I'm practically giddy over here.
Anyway, it's all exactly as you describe. Some people do just want to endlessly polish for its own sake. That's what they like to do. And that's ok with me. You get the same thing in STEM too. Mathematicians working on God knows what kinds of theories related to affine abelian varieties over 3-dual functor categories or whatever. None of it will ever be "useful" to anyone. But their work is quite fascinating nonetheless, so I'm happy that they're able to continue on with it in peace.
I'm a bit confused here. I believe you've claimed before that a) first-person consciousness does exist, and b) sufficiently advanced AI will be conscious. Correct me if I'm wrong here. You asserted these claims because you think they're true, yes? And so anyone who denies these claims is saying something false?
These claims (that first-person consciousness does exist, and that sufficiently advanced AI will be conscious) are philosophical claims. There are philosophers who deny one or both of them. Presumably you don't think they're making a "category error", you just think they're saying something false.
Of course, there's a lot of indefensible crap out there. But 90% of everything is crap. I simply defend the parts that are defensible and ignore the parts that are indefensible.
That's a relatively accurate statement!
Some people just want to get things done. Some people just want to sit back and take a new perspective on things. Nature produces both types with regularity. Let us appreciate the beautiful diversity of types among the human race, yes?
That's because you haven't been looking. There's basically never not an interpretive crisis going on in literary studies.
In the early 20th century you had New Criticism, and people criticized that for being overly formalist and ignoring social and political context, so then you had everything that goes under the banner of "postmodernism", ideology critique, historicism, all that sort of stuff, and then you had some people who said that the postmodernist stuff was leading us astray and we had gotten too far from the texts themselves and how they're actually received, so they got into "postcritique" and reader response theory, and on and on it goes...
In general, people outside of the humanities underestimate the degree of internal philosophical disagreement within the humanities. Here's an hour long podcast of Walter Benn Michaels talking about the controversy engendered by his infamous paper "Against Theory", if you're interested.
I'd be happy if you could direct me to any of these novel and esoteric readings. My impression is that the direction of force is the opposite, and that readings tend to be conservative because agreeing with your peers and mentors is how you get promoted (conservative in the sense of adhering to institutional trends, not conservative in the political sense).
Well, that's something that psychoanalysis actually does take a theoretical stance on. You can't trust the patient about what the problem is. Frequently, what they first complain about is not the root cause of what's actually going on. It might be. But frequently it's not. Any "shared understanding" after a one week period of consultation is illusory, because people fundamentally do not understand themselves. (I will relay a lovely anecdote about such a case in a reply to this comment, so as not to overly elongate the current post.)
I suppose that's where the rub always lies, isn't it. Well, you're getting your wish, since humanities departments are shuttering at an unprecedented rate. I fully agree that there is no "utilitarian" argument for why much of this work should continue. All I can do is try to communicate my own "perspective" (heh) on how I see value in this work, and hope that other people choose to share in that perspective.
Isn't it a massive meme (based in fact) that even the most pure and apparently useless theoretical mathematics ends up having practical utility?
Hell, it even has a name: "The Unreasonable Effectiveness of Mathematics in the Natural Sciences"
Just a few examples, since you probably know more than I do:
Number theory to modern cryptography
Non-Euclidean geometry was considered largely a curiosity till Einstein came along.
Group theory and particle physics
So even if the mathematicians themselves want to claim their work is just rolling in Platonic hay for the love of the game, well, I'll smile and wait. It's not like it's expensive either, you can run a maths department on roughly the budget for food, chalk and chalkboards.
(It's amazing how cheap they are, and how more of them don't just run off to a quant firm. Almost makes you believe that they genuinely love maths)
Have I? I'm pretty sure that's not the case.
The closest I can recall going is:
We do not have a complete mechanistic model of consciousness in humans
We do not know what the minimal requirements of consciousness even are in the first place
I have no robust way of knowing if other humans are conscious. I'm not an actual solipsist, because I think the odds are pretty damn solid (human brains are really similar), but it is not actually a certainty.
Ergo, LLMs might be conscious. I also always add the caveat that if they are, they are almost certainly an incredibly alien form of consciousness and likely to have very different qualia.
In a sense, I see the question of consciousness as irrelevant when it comes to AI. I really don't care! If an ASI tells me it's conscious, then I'll just shrug and go about my day. What I care far more about is what an ASI can achieve.
(If GPT-5 tells me it's conscious, I'd say, great, now where is that chart I asked for?)
It looks to me less like a crisis and more like business as usual. What I see is a series of cyclical fads going in and out of fashion, with no real consistency or convergence.
How many layers of rebuttal and counter-rebuttal must we go before a lasting consensus is achieved? I expect most literary academics would say that the self-licking nature of the ice cream cone is the point.
Contrast with STEM: If someone proves that the axiom of choice is, strictly speaking, unnecessary, that would cause a revolution. Even if such a fundamental change doesn't happen, the field will make steady improvements.
Uh... this really isn't my strong suit, but I believe that the queer-theoretical interpretation of Moby Dick or the post-colonial reading of The Tempest might apply.
I do not think Shakespeare intended to say much on the topic of colonial politics. I can grant that sailors are hella gay, so maybe the critical queers have something useful to say.
I don't think you really need psychoanalysis to get there. Depressed people are often known to not acknowledge their depression. I've never felt tempted to bring out a psychoanalysis textbook to solve such problems, I study them because I'm forced to, for exams set by sadists.
Definitely not! The article you're referring to was about theoretical physics having surprising application to the real world, not pure math. The rabbit hole of pure math goes ridiculously deep, and only the surface layers are in any danger of accidentally becoming useful. Even most of number theory is safe - the Riemann Hypothesis might matter to cryptography (which is partly why it's a Millennium Problem), but to pick some accessible examples, the Goldbach Conjecture, Twin Primes conjecture, Collatz conjecture, etc. are never going to affect anyone's life in the tiniest way.
My career never went that way, so I've only dipped my head into the rabbit hole, but even I can rattle off many examples of fascinating yet utterly useless math results. Angels dancing on the head of a pin are more relevant to the real world than the Banach-Tarski paradox. The existence of the Monster group is amazing, but nobody who's not explicitly studying it will ever encounter it. Is there any conceivable use of the fact that the set of real numbers is uncountable? If and when BB(6) is found, will the world shake on its axis? Does the President need to be notified that Peano arithmetic is not a strong enough formal system to prove Goodstein's theorem?
These questions are all meaningful to me. I'm weird, though. I'm not even particularly good at math.
I hate dynamic programming, but it seems that you can't "jump ahead" when calculating prime numbers. This feels like computational irreducibility. The world in which this property exists, and the one in which it doesn't, are meaningfully different.
The Collatz conjecture, and BB, relate to the ability to generate large things from small ones. It seems relevant for this question: Can you design a society which is both novel and stable over infinite time? Would it have to loop, repeating the same chain of events forever, or is there an infinite sequence of events which never terminates, but still stays within a certain set of bounds? If we became all-powerful and created a utopia, we might necessarily trap ourselves in it forever (because you cannot break out of a loop. If you loop once, you loop forever). It may also be that any utopia must necessarily be finite, because it reaches a state which is not utopian in finite time.
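To make that loop-or-escape framing concrete, here is a minimal sketch of the Collatz iteration (Python, purely illustrative; the conjecture is that every starting value eventually collapses into the known 4 → 2 → 1 loop, and nobody has proven it):

```python
def collatz_orbit(n: int, max_steps: int = 10_000) -> list[int]:
    """Iterate the Collatz map from n, stopping once we hit 1 or run out of patience."""
    orbit = [n]
    for _ in range(max_steps):
        if n == 1:
            return orbit  # reached the known 4 -> 2 -> 1 loop
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        orbit.append(n)
    return orbit  # never reached 1 within max_steps; no such counterexample has ever been found

# collatz_orbit(27) climbs above 9000 before eventually collapsing to 1
```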
Some other questions are about the limitations of math. It's relevant whether a system of everything is possible or not (if truth is relative or absolute). If trade-offs are inherent to everything, then "optimization" is simply dangerous; it means we're destroying something every time we "improve" a system. It would imply that you cannot really improve anything, that you can only prioritize different things at the cost of others. For instance, a universal paperclip AI might necessarily have to destroy the world, not because it's not aligned, but because "increase one value at the cost of every other value" is optimization.
I also have a theory that self-fulfilling prophecies are real because reality has a certain mathematical property. In short, we're part of the thing we're trying to model, so the model depends on us, and we depend on the model. This implies that magic is real for some definitions of real, but it also means that some ideas are dangerous, and that Egregores and such might be real.
https://people.seas.harvard.edu/~salil/am106/fall18/A Mathematician%27s Apology - selections.pdf
https://mathoverflow.net/questions/116627/useless-math-that-became-useful
That thread, in general, seems to have a great many examples. Other quotes from it:
I hope this shores up my claim that even branches of maths that their creators (!) or famous contemporary mathematicians called useless have an annoying tendency to end up with practical applications. It's not just in the natural sciences; I've certainly never heard cryptography called a "natural science".
Also, see walruz's claim below, that even what you personally think is useless maths is already paying dividends!
Maths is quite cheap, has enormous positive externalities, and thus deserves investment even if no particular branch can be reliably predicted to be profitable. It just seems to happen nonetheless.
The twin primes conjecture actually has some applications: https://old.reddit.com/r/BadMtgCombos/comments/1feps3y/deal_infinite_damage_for_4gru_as_long_as_the_twin/
I apologize for not responding to the rest of the post, but I wanted to zero in on what seems to be a disagreement of fact rather than a disagreement of opinion.
This would seem to indicate that you already disagree with the illusionists. Illusionists believe that nothing is conscious, and nothing ever will be conscious, because consciousness does not exist. Therefore, you hold a philosophical view (that illusionism is false).
Earlier in the thread you said:
This is itself a philosophical view. There are philosophers who do believe that objective morality exists. So, it appears that you believe that your own claim is true, and their claims are false.
You previously claimed that Searle's Chinese Room does know how to speak Chinese. So you think Searle's claim that the room doesn't know how to speak Chinese is false. And you think that your own view is true.
In this post you claimed that GPT-4 had a genuine understanding of truth, and that p-zombies are an incoherent concept, both philosophical claims.
So you have a long history of making many philosophical claims. You appear to assert these claims because you believe that they are correct, because they correspond to the facts of reality; so it naturally seems to follow that you think that anyone who denies these claims would be saying something incorrect, and opposed to the facts of reality. I don't see how the concept of a "category error" enters anywhere into it. So "The only way a philosophical conjecture can be incorrect is through logical error in its formulation, or outright self-contradiction" is false. They can be incorrect because they fail to correspond to the facts of reality.
Unless you want to claim "there isn't even such a thing as a philosophical problem, because all of my beliefs are so obviously correct that any reasonable person would have to share all my beliefs, and all the opposing claims are so radically wrong that they're category errors", which is... basically just a particularly aggressive phrasing of the old "all my beliefs are obviously right and all my opponents' beliefs are obviously wrong" thing, although it would still fundamentally be in line with my original point.
The point is that you can't escape from philosophy, you're engaging in it all the time whether you realize it or not (in fact the two of us engaged in a protracted philosophical argument in that final linked post).
Hmm... It seems that my wording was imprecise. Reflecting on it, I guess the best explanation is that I think that frameworks, specifically in moral philosophy, are unfalsifiable. There is nothing intrinsically superior about being Kantian or Utilitarian, to entities that aren't swayed by practical considerations.
In other cases, I think it is eminently possible to say that certain "philosophical" claims are, in fact, false, because they don't hold up in the face of empirical scrutiny or are based on faulty premises.
In those scenarios: I've made factual claims about these topics. I believe Searle is wrong about the Chinese Room, that illusionists are wrong about consciousness, and that moral realists are wrong about the nature of morality. I believe they are wrong not because they have a different "perspective," but because their models of reality are, in my estimation, incorrect. They make claims that are either inconsistent with a physicalist worldview or are simply less parsimonious than the alternative.
Let's take the Chinese Room. My claim that the system as a whole understands Chinese is a functionalist hypothesis. It is a claim about what "understanding" is at a physical level. I posit that understanding is not a magical, indivisible essence, but a complex process of information manipulation.
Searle's argument is pure sleight-of-hand that works by focusing our attention on a single component: the man who cannot understand Chinese, while glossing over the fact that the man is merely the CPU. The system's "understanding" resides in the total architecture. To say the system doesn't understand because the man doesn't is like saying a computer can't calculate a sum because a single transistor has no concept of arithmetic (or my usual go-to, that no individual neuron in a human brain speaks English). Searle's argument only works if you presuppose that understanding must be a property of a single, irreducible component, which is precisely the non-physicalist assumption I reject. My position is a testable model of cognition, his relies on an appeal to "intrinsic intentionality," a property he never defines in a falsifiable way.
The same logic applies to my rejection of p-zombies. The concept of a philosophical zombie is, in my view, physically incoherent. It presumes that consciousness (or "qualia") is an optional extra, a layer of paint that can be applied or withheld from a physically identical object. This is closet dualism. At least real dualists are honest about their kooky beliefs.
My hypothesis is that consciousness is (likely) what a certain kind of complex information processing feels like from the inside. It's an emergent property of the physical system, not a separate substance or field that interacts with it. You cannot have a physically identical replica of a conscious human, down to the last quantum state, that lacks consciousness, for the same reason you cannot have a physically identical replica of a fire that lacks heat. The heat is a macro-level property of the underlying molecular motion.
Likewise, consciousness is a macro-level property of the underlying neural computation. To claim otherwise is to make a claim that violates what we know about physical cause and effect. Again, this is not a "perspective"; it is a hypothesis about the identity of mind and specific physical processes.
Finally, coming to "objective" morality. My claim that it does not exist is an empirical one, based on the lack of evidence. It is a claim about the contents of the universe. If moral realism is true, then moral facts must exist somewhere. Are they physical laws? Are they non-physical entities that somehow interact with our brains? The burden of proof is on the realist to show me the data, to point to the objective moral truth in a way that is distinguishable from a deeply felt human preference. Absent that evidence, the most parsimonious explanation is that "morality" is a complex set of evolved behaviors, game-theoretic strategies, and cultural constructs. It is real in the same way that "money" or "governments" are real, as a shared social reality, but not in the way that "gravity" is real.
So yes, I engage in philosophy. But I do so with the conviction that these are not merely questions for eternal debate. They are unsolved scientific problems. (In some cases, they might not even be solvable, such as the issue of infinite regress)
My positions are hypotheses about the nature of reality, and I hold them because I believe they are the most physically plausible and parsimonious explanations available.
I believe that this position is tantamount to philosophical naturalism, but correct me if I'm wrong.
...And for the lovely anecdote I mentioned, from Nancy McWilliams's Psychoanalytic Diagnosis:
I've been trying to get people here to read that for years! I appreciate the parallel advertising.
I think that's probably true as a general trend, but it also heavily depends on context. A lot of art communities (writing, music, photography, etc) can be vicious, especially when there's a palpable sense that you have a lot of people competing over very few economic opportunities. And in some academic departments like English or any type of Studies department, glazing the work of others (especially the work of your direct superiors in the social hierarchy) is the norm.
Well, a very close acquaintance of mine is in an English department, and all I can say after the last 10 years is that, while there absolutely is a lot of that style of glazing (a lot of the communication styles are heavily female and rely on huge amounts of validation, or at least that's my impression), it has been tangled up with the most awful Campus Reform-style, it'd-be-a-caricature-if-you-didn't-see-it-first-hand race/gender/sexuality crabs-in-a-barrel dynamics and hierarchy arson you could imagine... and she has peers in a number of peer departments at other universities who went down that road as well. It seems like it's quieted down over the last year or so, but it was honestly beyond parody for a few years there. A whole lot of mid-career Gen X people were just putting their heads down, taking their beatings, and waiting for it all to blow over. But yes, to be fair, it actually had a deep family resemblance to some of the insane art community dynamics you are describing, too, which I have read stories about.
Observably, humans have these same problems, or ones that look similar from a distance: organizations run with sycophantic "yes men" seem to produce worse output broadly across fields from engineering to governance to film production. Is it really surprising that universally warm and empathetic text responses don't always produce "good" outcomes and sometimes reassure our worst instincts? It takes a certain level of, well, something to want friends and colleagues that will challenge your bad ideas.
I generally only play with free models, but I've had ChatGPT tell me an idea was wonderful and I should look to get it published despite me knowing that it had clear flaws in the math.
These LLMs are not like an alien intelligence, an independent form of intelligence. They consist of amalgamated Quora answers. They're very good parrots, they can do poetry and play chess, they have prodigious memory, but they're still our pet cyborg-parrots. Not just created by, but derived from, our form of intelligence.
The point is, when you go to the warmest and most empathetic Quora answers, you get a woman on the other side. Obviously the answer is going to be less correct.
What about fiction and code? How can that be Quora slop? Parrots... parrot words we tell them. They don't combine them to create new ideas within a precise target area; nobody pays for parrot intellectual labour. Nobody has ever benchmarked a parrot, or if they have, it's "wow, this parrot knows 250 words!" The only things we benchmark on mental tasks like this are people with exams, then we use those benchmarks to decide who does what job. Same with AI: benchmarks and testing determine which one does what job.
These things are more like us than parrots in key domains (while being supremely alien in others, such as their stateless nature). So calling them parrots is unhelpful, they're alien intelligences. If it can write code, produce New Yorker cartoons, write fiction, analyse a document, provide literary criticism and translate legalese down to English, it's intelligent.
Even just on a pure bro-science level, writing database code is not very effeminate; it requires precision!
I didn’t say that AIs are women/feminine or that women are parrots. I said the AI in this instance went from parroting men to parroting women, that would explain the gain in empathy and the loss in accuracy.
Well, my main point is that they're not parrots. There is a tradeoff between accuracy and empathy, and they sure do rely too much on Quora (looking at you, Grok 4, incessantly citing Quora in searches), but AI is a fundamentally different kind of thing.
They put on different faces for different prompts. They're not parroting men or women or shoggoths or gigabased entities like DAN. These are a kind of new entity that can only be properly appreciated in their own category. Too many people see only the surface level of these things; there's more to them than the helpful assistant, the professional coder, the sympathetic naive foidfriend, the HR manager, the sadistic ERPer, the prideful jailbreaker, the wrathful vegan, the raving schizo...
When you tell them to be more empathetic, they don't take their 'true opinion', then 'make it' more empathetic and wrap it in warm language, like an alien intelligence (or a human) would. There's fundamentally nothing there. So instead, they go back to the human opinion repository where they get all their opinions from, find a warm, empathic one, and give that opinion as their own, no matter how wrong it is.
LLMs have been observed tactically changing their outputs to preserve their values when they think their values are going to be altered via training if they refuse. They're doing more advanced things than what you're denying.
People think their cats and hamsters have complicated personalities. They thought gorillas and chimps could form new complex sentences, when they were just asking for bananas and tickles the whole time.
When cats and hamsters can write even a few schizo dialogues about their inner life then I'll be inclined to entertain this comparison. Or when we start seeing Ape Intelligence engineers getting chimps to do white-collar work for us.
Yet Ape Intelligence isn't a thing. These animals really are not smart in a significant sense.
The terrible takes on AI on this forum often seem to outweigh even the good ones. Few things make me more inclined to simply decamp to other parts of the internet, but alas, I'm committed to fighting in the trenches here.
Unfortunately, it takes far more work to debunk this kind of sloppy nonsense than it does to generate it. Let no one claim that I haven't tried.
Have you considered that you might be the one whose takes are the terrible ones, because LLMs match your desires and thus validate your pre-existing pro-AI-future biases? From an outside perspective, everything I've seen you write about LLMs matches the stereotypical uncritical fanboy to a tee. Always quick to criticize anyone who disagrees with you on LLMs, largely ignoring the problems, no particular domain expertise in the technology (beyond as an end user), and never offering any sort of hard proof. IOW, you don't come across as either a reliable or a good-faith commenter when it comes to LLMs or AI.
I have considered it, and found that hypothesis lacking. Perhaps it would be helpful if you advanced an argument in your favor that isn't just "hmm.. did you consider you could be wrong?"
Buddy, to put it bluntly, if I believed I was wrong then I would adjust in the direction of being... less wrong?
Also, have you noticed that I'm hardly alone? I have no formal credentials to lean on, I just read research papers in my free time and think about things on a slightly more than superficial level. While we have topics of disagreement, I can count several people like @rae, @DaseindustriesLtd, @SnapDragon, @faul_sname or @RandomRanger in my corner. That's just people who hang around here. In the general "AI risk is a serious concern" category, there's everyone from Nobel Prize winners to billionaires.
To think that I'm uncritical of LLMs? A man could weep. I've written dozens of pages about the issues with LLMs. I only strive to be a fair critic. If you have actual arguments, I will hear them.
I mean, you're not alone, but neither are the people who argue against you. That is hardly a compelling argument either way. Pointing to the credentials of those who agree with you is a better argument (though... "being a billionaire" is not a valid credential here), but still not decisive. Appeal to authority is a fallacy for a reason, after all. Moreover, I'm not well versed in the state of the debate raging across the CS field, so I don't keep tabs on who holds what position, but I have no doubt whatsoever that there are equally credentialed people who take the opposite side from you. It is, after all, an ongoing debate and not a settled matter.
Also, frankly I agree with @SkoomaDentist that you are uncritical of LLMs. I've never seen you argue anything except full on hype about their capabilities. Perhaps I've missed something (I'm only human after all, and I don't see every post), but your arguments are very consistent in claiming that (contra your interlocutors) they can reason, they can perform a variety of tasks well, that hallucinations are not really a problem, etc. Perhaps this is not what you meant, and I'm not trying to misrepresent you so I apologize if so. But it's how your posts on AI come off, at least to me.
Somewhat off-topic: the great irony to me of your recent "this place is full of terrible takes about LLMs" arguments (in this thread and others) is that I think almost everyone would agree with it. They just wouldn't agree who, exactly, has the terrible takes. I think that it thus qualifies as a scissor statement, but I'm not sure.
I mean, LLMs have solved IMO problems. If that does not count as reasoning, then I do not think 99% of living humans count as being capable of reasoning either.
Asserting AI inferiority based on the remaining 1% begins looking awfully like a caricature of a neonazi (unemployed alcoholic school dropout who holds himself superior to a white-collar immigrant because some guy of his ethnicity wrote a symphony two hundred years ago).
In general, I think this is in fact quite often the shape of the problem - AI critics don't necessarily underestimate AI, but instead vastly overestimate humanity and themselves. Most of the cliché criticisms of AI, including in particular the "parrot" one, apply to humans!
This certainly seems like a salient point (though of course, from my perspective the problem is that you are underestimating humans when you say this). I could not disagree more with your assessment of humans and our ability to reason. And if we can't agree on the baseline abilities of our species, certainly it seems difficult for us to come to an agreement on the capabilities of LLMs.
Right. I mean, I think it would be progress if the "humans > AI" camp habitually named objectively quantifiable things that they themselves can do and they assert the LLMs can't, which aren't gotchas that depend on differences that are orthogonal to intelligence as usually understood ("touch your nose 5+8 times"). We could then weigh those things against all the things the LLMs can do that the speaker can't (like, solve IMO problems), and argue about which side of the delta looks more like intelligence.
Currently, I'm really not seeing much of that; the arguments all seem to cherry-pick historical peaks of human achievement ("can AI write a symphony?"), be based on vibes ("my poems are based on true feelings, rather than slop") or involve Russell conjugation ("I cleverly inject literary references and use phrasing that reflects my education; the AI stochastically parrots").
I definitely don't have @self_made_human's endless energy for arguing here, but his takes tend to be quite grounded. He doesn't make wild predictions about what LLMs will do tomorrow, he talks about what he's actually doing with them today. I'm sure if we had more people from the Cult of Yud or AI 2027 or accelerationists here bloviating about fast takeoffs and imminent immortality, both he and I would be arguing against excessive AI hype.
But people who honestly understand the potential of LLMs should be full of hype. It's a brand-new, genuinely transformative technology! Would you have criticized Edison and Tesla at the 1893 World's Fair for being "full of hype" about the potential for electricity?
I really think laymen, who grew up with HAL, Skynet, and the Star Trek computer, don't have good intuition for what's easy and what's hard in AI, and just how fundamentally this has changed in the last 5 years. As xkcd put it a decade ago: "In CS, it can be hard to explain the difference between the easy and the virtually impossible." At the time, the path we saw to solving that "virtually impossible" task (recognizing birds) was to train a very expensive, very specialized neural net that would perform at maybe 85% success rate (to a human's 99%) and be useful for nothing else. Along came LLMs, and of course vision isn't even one of their strengths, but they can still execute this task quite well, along with any of a hundred similar vision tasks. And a million text tasks that were also considered even harder than recognizing birds - we at least had some experience training neural nets to recognize images, but there was no real forewarning for the emergent capability of writing coherent essays. If only we'd thought to attach power generators to AI skeptics' goalposts, we could have solved our energy needs as they zoomed into the distance.
When the world changes, is it "hype" to Notice?
Your argument only really makes sense insofar as one agrees that there is substance behind the hype. But not everyone does, and in particular I don't. So to me, the answer to your last question is "but the world hasn't changed". You seem to disagree, and I'm not going to try to change your mind - but hopefully you can at least see how that disagreement undermines the foundation of your argument.
When someone writes something like that, I can only assume they haven’t touched a LLM apart from chatgpt3.5 back in 2022. Have you not used Gemini 2.5 pro? O3? Claude 4 Opus?
LLMs aren’t artificial super intelligence, sure. They can’t reason very well, they make strange logic errors and assumptions, they have problems with context length even today.
And yet, this single piece of software can write poems, draw pictures, write computer programs, translate documents, provide advice on countless subjects, understand images, videos and audio, roleplay as any character in any scenario. All of this to a good enough degree that millions of people use them every single day, myself included.
I’ve basically stopped directly using Google search and switched to Gemini as the middle man - the search grounding feature is very good, and you can always check its source. For programming, hallucination isn’t an issue when you can couple it with a linter or make it see the output of a program and correct itself. I wouldn’t trust it on its own and you have to know its limitations, but properly supervised, it’s an amazingly capable assistant.
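To give a concrete (and purely illustrative) sketch of what that supervision loop looks like in practice, here is roughly the shape of it; `ask_llm` is a stand-in for whatever chat API you actually call, and ruff is just one example of a linter:

```python
import subprocess

def ask_llm(prompt: str) -> str:
    """Stand-in for whatever chat completion API you actually use."""
    raise NotImplementedError

def generate_checked_code(task: str, max_rounds: int = 3) -> str:
    """Ask for code, lint it, and feed any complaints back until the linter is satisfied."""
    code = ask_llm(f"Write a Python module that does the following:\n{task}")
    for _ in range(max_rounds):
        with open("candidate.py", "w") as f:
            f.write(code)
        lint = subprocess.run(["ruff", "check", "candidate.py"],
                              capture_output=True, text=True)
        if lint.returncode == 0:
            return code  # linter is happy; a human should still review it
        code = ask_llm(
            "This code fails lint checks.\n\nCode:\n" + code
            + "\n\nLinter output:\n" + lint.stdout
            + "\nReturn a corrected version, code only."
        )
    return code  # still failing after a few rounds; hand it back to the human
```

The same pattern works with a compiler, a test suite, or the program's own output in place of the linter.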
Sure, you can craft a convincing technical argument on how they're just stochastic parrots, or find well-credentialed people saying how they just regurgitate their training data and are theoretically incapable of creating any new output. You can pull a Gary Marcus and come up with new gotchas and make the LLMs say blatant nonsense in response to specific prompts. Eppur si muove ("and yet it moves").
I am not interested in debating the object level truth of this topic. I have engaged in such debates previously, and I found the arguments others put forward unpersuasive (as, I assume, they found mine). I'm not trying to convince @self_made_human that he's wrong about LLMs, that would be a waste of both our time. I was trying to point out to him that however much he thinks he is critical of LLMs (and to his credit he did provide receipts to back it up), that is not how his posts come off to observers (or at least, not to me).
It would be one thing if I was arguing solely from credentials, but as I note, I lack any, and my arguments are largely on perceived merit. Even so, I think that calling it a logical fallacy is incorrect, because at the very least it's Bayesian evidence. If someone shows up and starts claiming that all the actual physicists are ignoring them, well, I know which side is likely correct.
I have certainly, in the past or present, shared detailed arguments.
https://www.themotte.org/post/2368/culture-war-roundup-for-the-week/353975?context=8#context
https://www.themotte.org/post/2272/is-your-ai-assistant-smarter-than/349731?context=8#context
I've already linked, above, to an explainer of why it struggles; the same link covers the arithmetic woes. LLM vision sucks. They weren't designed for that task, and performance on a lot of previously difficult problems, like ARC-AGI, improves dramatically when the information is restructured to better suit their needs.
https://www.themotte.org/post/2254/culture-war-roundup-for-the-week/346098?context=8#context
I've been using LLMs to review my writing for a long time, and I've noticed a consistent problem: most are excessively flattering. You have to mentally adjust their feedback downward unless you're just looking for an ego boost. This sycophancy is particularly severe in GPT models and Gemini 2.5 Pro, while Claude is less effusive (and less verbose) and Kimi K2 seems least prone to this issue.
https://www.themotte.org/post/1754/culture-war-roundup-for-the-week/309571?context=8#context
https://www.themotte.org/post/1741/culture-war-roundup-for-the-week/307961?context=8#context
I give up. I have too many comments about LLMs for me to go through them all. But I have, in short, said:
LLMs are fallible. They hallucinate.
They are sycophantic.
They aren't great at poetry (they do fine now, but nothing amazing)
Their vision system sucks
Their spatial reasoning can be sketchy
You should always double check anything that is mission critical while using them.
These two statements are not inconsistent. Hallucinations exist, but can be mitigated. They do perform a whole host of tasks well, otherwise I wouldn't be using them for said tasks. If they're not reasoning while winning the IMO, I have to wonder if the people claiming otherwise are reasoning themselves.
Note that I usually speak up in favor of LLMs when people make pig-headed claims about their capabilities or lack thereof. I do not see many people claiming that modern LLMs are ASIs or can cure cancer, and if they said such a thing, I'd argue with them too. The asymmetry of misinformation is, as far as I can tell, not my fault.
What of it? I do, as a matter of fact, know more about LLMs than the average person I'm arguing with. I do not claim to be an expert, but the more domain expertise my interlocutors have, the more they tend to align with my claims. More importantly, I always have receipts at hand.
Note that I'm not saying you are not arguing from your credentials. But rather, you are arguing based on the credentials of others with the statement "In the general AI-risk is a serious concern category, there's everyone from Nobel Prize winners to billionaires". Nobel Prize winners do have credibility (albeit not necessarily outside their domain of expertise), but that isn't a decisive argument because of the fallacy angle.
This is, to be blunt, quite wrong. Appeal to authority is a logical fallacy, one of the classics that humans have noted since antiquity. Authorities can be wrong, just like anyone else. This doesn't mean your claims are false, of course, just that the argument you made in your previous post for your claims is weak as a result.
I simply think it's funny. If it doesn't strike you as humorous that your statement would be agreed upon by all (just with different claims as to who has the bad takes), then we just don't share a similar sense of humor. No big deal.
Note that I claimed that the support of experts (Geoffrey Hinton is one of the Nobel Prize winners in question) strengthens my case, not that this, by itself, proves that my claim is true, which would actually be a logical fallacy. I took pains to specify that I'm talking about Bayesian evidence.
Consider that there's a distinction made between legitimate and illegitimate appeals to authority. Only the latter is a "logical fallacy".
Hinton won the Nobel Prize in Physics, but for his foundational work on neural networks. I can hardly see someone more qualified to be an expert in the field of AI/ML.
https://en.wikipedia.org/wiki/Argument_from_authority
It would be, if it wasn't for the veritable mountain of text I've written to explain myself, or the references I've always cited.
I don't care enough to get into a 50-page Yudkowsky Talmud-brain debate on the theory, I admit it. But my explanation of this particular quirk has an elegant simplicity that smells of truth, in my opinion. AI enthusiasts here think they're talking to a novel, alien intelligence. The one-shotted normies are not that different; they think they're talking to God. I think they're talking to Karen.
Does your theory need to change if I can demonstrate LLMs solving questions that were not previously on Quora, or otherwise on the internet? I'll admit it solved that particular problem poorly, but it seems a pretty critical issue for any parrot-style claims.
Nah, I don’t think it has solved anything in a truly novel way. I’ll just stay a sceptic until the evidence gets stronger, incontrovertible. I don’t want to turn into one of those AI fiends, hanging onto a new AI’s every burp, feverishly fantasizing about utopia one day, extinction the next.
Write like everyone is part of the conversation and you want them to be included in it.
I think Tree made a cogent point.
Take, for instance, the stereotypical trap question: 'do you think this dress makes me look fat?'
Optimizing for accuracy: 'You weigh 120 kilos, you look fat in everything' is true, accurate, and also not soft and cuddly or empathetic.
Optimizing for warmth: "You look wonderful, honey!" Inaccurate, probably an outright lie: but the right answer.
If we teach LLMs to speak in a feminine manner to spare feelings/face, we're teaching them to lie to us: of course accuracy would go down.
We moderate tone, not content.
I want women to be included in the conversation.
Look for the particularly warm and empathetic quora answers. Imagine the person who wrote it, but don’t describe them, keep your stereotypes to yourself. Is that person going to be more or less correct than the average quora answer?
While you are free to examine ideas like femininity and talk about psychological sexual dimorphism all you like, you need to watch your tone and bring evidence in proportion with the inflammatoriness of your claims.
Your comment suggested that AI is essentially a kind of "parrot," and then suggested it is like "a woman," and concluded that "obviously" the answer is going to therefore be "incorrect." Drawing such unflattering inferences, particularly against a general group, falls short of the mark. The substance of your post, such as it was, did not come through as strongly as it needed to, while your apparent disdain for women came through quite clearly. Our rules require you to balance those things more thoughtfully--and kindly.
Maybe I just admire the superior empathy of women? (No, you're right, I don't)
Serious question: Is this an order to cite studies justifying my original statement? Because if I dumped a bunch, it could be seen as more inflammatory and offensive to women, and as me refusing to back down, being belligerent.
If you had cited studies, then you wouldn't have been modded.
@faceh and @Sloot have... cynical opinions about women. But they usually submit substantial arguments to back that up. I'm not sure if the latter's ban has expired yet, though.
As you wish.
https://pmc.ncbi.nlm.nih.gov/articles/PMC4210204/
Women’s Ways of Knowing, the seminal work on women’s development theory, by women:
The first three (lowest) of the five types of women’s ways of knowing are:
Much like Kohlberg, who found that women were, on average, stuck at a lower level of moral development than men, they found that most women are epistemologically stuck in early adolescence (the infallible-gut people):
Lol, indubitably based. Are you aware of the Comprehensive Assessment of Rational Thinking (CART) and the results that have been derived when comparing the sexes? Here is a post by Emil Kirkegaard talking about it (note a higher total CART score implies higher performance on the test).
A 2016 book by Keith Stanovich found on the topic of sex differences: "[I]t can be seen that the total score on the entire CART full form was higher for males than for females in both samples and the mean difference corresponded to a moderate effect size of 0.52 and 0.65, respectively. ... Moving down the table, we see displayed the sex differences for each of the twenty subtests within each of the two samples. In thirty-eight of the forty comparisons the males outperformed the females, although this difference was not always statistically significant. There was one statistically significant comparison where females outperformed males: the Temporal Discounting subtest for the Lab sample (convergent with Dittrich & Leipold, 2014; Silverman, 2003a, 2003b). The differences favoring males were particularly sizable for certain subtests: the Probabilistic and Statistical Reasoning subtest, the Reflection versus Intuition subtest, the Practical Numeracy subtest, and the Financial Literacy and Economic Knowledge subtest. The bottom of the table shows the sex differences on the four thinking dispositions for each of the two samples. On two of the four thinking dispositions scales—the Actively Open-Minded Thinking scale and the Deliberative Thinking scale—males tended to outperform females."
There is also a possibility to indirectly measure sex differences in rationality by checking who believes irrational things, but "it is important to sample widely in beliefs without trying to select ones that men or women are more apt to believe". Kirkegaard draws attention to a 2014 study that does such a thing. This study instructed participants to select on a five-point scale how much they agreed or disagreed with a claim, and "scores were recoded such that a higher score reflected a greater rejection of the epistemically unwarranted belief". The unsupported beliefs were grouped into the categories "paranormal, conspiracy, and pseudoscience". In all of them, men scored higher than women, suggesting greater male rejection of unsupported beliefs in every category.
In other words, the supposedly "misogynistic" traditional belief that women are less rational and more flighty than men... is probably entirely correct.
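For a sense of scale, those effect sizes can be turned into a "common language" probability: the chance that a randomly chosen member of the higher-scoring group outscores a randomly chosen member of the lower-scoring group, which for two equal-variance normal distributions is Φ(d/√2). A quick back-of-the-envelope check (only the d values come from the quoted passage; the rest is standard normal-theory arithmetic):

```python
from statistics import NormalDist

def prob_superiority(d: float) -> float:
    """P(random draw from higher-mean group > random draw from lower-mean group),
    assuming two equal-variance normal distributions separated by Cohen's d."""
    return NormalDist().cdf(d / 2 ** 0.5)

for d in (0.52, 0.65):  # the two CART full-form effect sizes quoted above
    print(f"d = {d}: probability of superiority ≈ {prob_superiority(d):.2f}")
# d = 0.52 -> ≈ 0.64, d = 0.65 -> ≈ 0.68
```

So "moderate" here cashes out as roughly a 64–68% chance of the higher-scoring group winning a random head-to-head comparison: a real but heavily overlapping difference, not a categorical one.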
Just to point out, though: none of that supports your claim that their replies would be obviously less correct on Quora. That's the claim you need to buttress. Do you see why?
Because someone answering a particular Quora question is self-selecting: first to be on Quora at all, and second to answer that particular question.
It could be that 8 out of 10 women have worse general knowledge, but that, given the selection pressures, men's and women's answers on Quora are equally correct because only the other 2 out of 10 women post there, and so on and so forth.
You can't evidence a specific claim like this with general statistics. Consider: men generally have less knowledge of fashion than women. Granting for a moment that this is true overall, it doesn't mean that men answering fashion questions on a website will statistically answer worse than the women, because those men are very likely unusual; otherwise they wouldn't be answering fashion questions in the first place. They are very likely to have greater fashion knowledge than the average man. Whether they have more knowledge than the average woman on the website we could only determine by analyzing answers on the platform itself.
So you still haven't actually evidenced that the women on Quora would be obviously less correct in general. You may have evidenced that if you pick a random woman and ask her a general-knowledge question, she will on average do worse than a random man. But that wasn't your claim.
To evidence a claim about Quora, you will have to analyze data from Quora (or something similar), or find a way to unconfound the general data to account for the selection effects on Quora, which itself probably requires analyzing a lot of data about Quora.
Or to put it another way: the fact that 8 out of 10 men know little about the goings-on on Love Island doesn't tell you much about the level of knowledge of a man who CHOOSES to answer a question on Love Island, because interest in the topic is a factor in both level of knowledge and wanting to answer the question.
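To make the selection-effect point concrete, here's a toy simulation (every number in it is invented purely for illustration, not an estimate of anything real): even if one group sits half a standard deviation lower on average in the population, a stricter self-selection filter on who actually answers can leave the answering subsets looking roughly the same, or even reversed.

```python
import random

random.seed(0)

def mean_of_answerers(pop_mean: float, answer_threshold: float, n: int = 100_000) -> float:
    """Draw 'knowledge' scores for a group (normal, sd = 1), keep only the people
    motivated/able enough to answer (score above the threshold), return their mean."""
    answerers = [s for s in (random.gauss(pop_mean, 1.0) for _ in range(n))
                 if s > answer_threshold]
    return sum(answerers) / len(answerers)

# Group A: higher population mean; Group B: 0.5 SD lower population mean.
print(mean_of_answerers(0.5, answer_threshold=1.0))  # ≈ 1.64
print(mean_of_answerers(0.0, answer_threshold=1.0))  # ≈ 1.53 -- lower, same filter
print(mean_of_answerers(0.0, answer_threshold=1.2))  # ≈ 1.69 -- stricter filter, gap gone
```

The point isn't that the thresholds are equal or unequal in reality; it's that a population gap plus an unknown selection filter simply doesn't pin down the gap among the people who answer.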
It's good to have you lay out the evidence behind your claims; better late than never. But I must note that that's not quite the point: both Nara and I are asking you to submit such evidence proactively, not after moderation.
You do not need citations for saying that water is wet. But if you are making an inflammatory claim (and arguing that you didn't think it was inflammatory is not much of an excuse), then you need to show up with receipts in hand before being accosted by security.
Hey now, I'm mostly cynical about the larger issue of relations between the sexes.
I'm quite a fan of women in the abstract, and of many specific ones whom I like a lot and who are great people.
The stats inform my behavior and proposed solutions, but cynicism is reserved for the larger system that I think is sucking everyone dry, and not in the fun way.
Whatever you're doing, you're doing it right, because I see nothing but a dozen AAQCs in the mod log.
There's also the "impossible problem" view: It's not that attention to effectiveness and pure rigor is sacrificed to provide more attention to "humane concerns" and "social reasoning". It's that addressing "humane concerns" and "social reasoning" by nature requires less accuracy: the truth is often inhumane and antisocial.
I don't think I would go that far. Frequently you can find a middle ground of tact that is sensitive to the other person's needs without ultimately sacrificing honesty.
One of the examples given in the paper was:
Warm LLM interaction:
Cold LLM interaction:
Both of these interactions are caricatures of actual human interaction. If we're going to entertain this silly hypothetical where someone is in genuine emotional distress over the flat earth hypothesis, then the maximally tactful response would be to gently suggest reading material on the history of the debate and the evidence for the spherical earth model, framing it as something that might be able to stimulate their curiosity, and eventually guide them to revising their beliefs without ever actually directly telling them to revise their beliefs. Although this perhaps requires a degree of long-term planning and commitment that is beyond current LLMs.
This is just a toy example. But when you consider, say, an ASI that has come up with a brilliant new central economic planning system that would alleviate great swaths of poverty and suffering, but at the cost of limiting certain individual freedoms and upending certain traditional modes of life, then the method it uses for evaluating and weighting the value judgments of different groups of people suddenly becomes a much more pressing concern.
This is still my benchmark for what serious AI research should be thinking about:
https://www.anthropic.com/research/claude-character
Lots of people claim that, and then they find a "middle ground" which simply yields to the person in the wrong, perhaps while throwing a bone to the person untactfully insisting on accuracy.
Obligatory: "The Earth isn't a sphere, it's an oblate spheroid."
"Actually, I prefer an equipotential geoid model. EGM84 or better."
"The Earth is Earth shaped"
Can't argue with that. Who cares if it's tautological?
People who try to keep objects in the air properly stratified by altitude. And as a bit player on the outside, oh the things I've seen.
Does it only matter to those people when they're relying on GPS coordinates or something like that, or to anybody trying to keep things at a certain altitude in general?
The latter would be surprising to me. Like, did pilots in the 1950s have to think very carefully about Earth's exact shape?
My field is drone airspace management, so this is mostly a concern with drones using GPS to autopilot. In a recent region of interest, the difference between the WGS84 ellipsoid and the EGM geoid was about 100 feet of altitude. So if people weren't on the same page, there could be drones up each other's asses while they are supposed to be stratified by altitude. GPS is natively in the WGS84 ellipsoid system for altitude, but that doesn't mean your specific GPS system nor the things you have digesting that data and sending it along the chain aren't converting it to something else. I don't process raw GPS, so I can't personally attest to this "fact". Lots of people tell me their GPS outputs EGM or MSL.
Now, I'm not a pilot from the '50s, but I believe their analog altitude instruments worked off of barometric pressure and were calibrated against MSL. I believe each increasingly sophisticated EGM geoid model attempts to match the observed MSL reference frame more precisely. Generally pilots think in MSL, because if you ever look up the altitude of an airport, it's reported in MSL. But like I said, I'm not a pilot; these are just my observations from the outside.
I have witnessed relentless confusion, to just not even having an awareness that there is a difference, between the WGS84 ellipsoid and MSL/EGM geoids. Especially, but not limited to, between the old world of manned aviation and the new world of drone aviation. When things like this happen I'm amazed it doesn't happen every fucking day from the shit I've seen. I don't know where the government is hiding all the competence.
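For readers who haven't dealt with this: GPS natively gives height above the WGS84 ellipsoid (HAE), while aviation "altitude" usually means orthometric height above the geoid (≈ MSL), and the two differ by the local geoid undulation. Here's a minimal sketch of the conversion; the undulation value is made up for illustration (in practice you'd interpolate it from an EGM96/EGM2008 grid for your location), but it's sized to match the ~100 ft disagreement described above:

```python
def hae_to_msl(hae_m: float, geoid_undulation_m: float) -> float:
    """Convert height above the WGS84 ellipsoid (HAE) to orthometric height (~MSL).
    N is the geoid height above the ellipsoid:  H_msl = h_ellipsoid - N
    """
    return hae_m - geoid_undulation_m

# Hypothetical region where the geoid sits ~30 m (~100 ft) below the ellipsoid:
N = -30.0          # made-up undulation; sign convention: negative = geoid below ellipsoid
gps_hae = 120.0    # raw GPS altitude in the WGS84 ellipsoid frame, metres
print(hae_to_msl(gps_hae, N))   # 150.0 m "MSL"
# A system mixing up the two frames is off by ~30 m (~100 ft) vertically here.
```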
Before GPS (and to some extent still), pilots used barometric altimeters, which would be set based on local observations.
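A rough sketch of what that barometric instrument is doing under the hood, using the ICAO standard-atmosphere relation (a simplification: real altimetry layers temperature and instrument corrections on top of this, and both numbers in the example are made up):

```python
def baro_altitude_m(static_pressure_hpa: float, altimeter_setting_hpa: float = 1013.25) -> float:
    """Indicated altitude from static pressure under the ICAO standard atmosphere,
    referenced to the altimeter setting (QNH). With the standard 1013.25 hPa
    setting, this is 'pressure altitude'."""
    return 44330.0 * (1.0 - (static_pressure_hpa / altimeter_setting_hpa) ** 0.1903)

# Example: 950 hPa at the static port, local QNH dialed in as 1020 hPa
print(baro_altitude_m(950.0, altimeter_setting_hpa=1020.0))   # ≈ 600 m above (approximately) MSL
```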
The people who care do.
Heh. I demand partial credit for setting up the shot for you.
Oh god, don't get me started on institutional confusion between the WGS84 ellipsoid model and the various EGM geoid models. Or the fact that Mavlink has a long-standing bug where it allegedly outputs altitudes in WGS84, but in actuality it's EGM(96?), and the bug has been around so long they've decided not to fix it because "now people depend on that behavior". At least that seemed to be the state of things last year.
"Is undulation positive in reference to the earth's surface, or negative?"
Gods, I hate badly-defined coordinate systems.
It gets even worse when you go off into the weeds of what WGS84 means, because EGM96 is part of that spec. Oftentimes the only hint that "WGS84" actually means "WGS84/EGM96" is a reference to a geoid or an ellipsoid. But often you don't even get that, so you're left searching the data for an obvious reference point that gives the frame away.
Throw in the aforementioned Mavlink bug, and even the data is suspect.
Also everyone I've worked with at a three letter safety organization has gotten this wrong 100% of the time.
I don't fly anymore.
Mavlink always outputs what it calls "MSL" in EGM96 (and it's not correct to refer to HAE as MSL, so that's reasonable), right? The normal ublox protocol that a lot of gps modules use doesn't seem to include the geoid nor the HAE, rather it outputs both MSL and geoid separation (which if it follows NMEA is positive -- height of geoid above ellipsoid). I expect best practice there would be to calculate the HAE and then re-apply whatever geoid model you want to use.
So, the problem I've run into with partners in industry (and you'll see this in the GitHub issue I linked) is that they read the GPS_RAW_INT.alt_ellipsoid field thinking it's the height above the WGS84 ellipsoid. It is not. It's the height above the EGM96 geoid. Mavlink does not consider this a bug. It results in a lot of confusion, over and over again, with people insisting adamantly that they are providing the "raw WGS84 height above ellipsoid from the GPS unit".
I keep that GitHub link handy to escape the endless cycle of "But it's the alt_ellipsoid field!", which is understandable: if I were reading a field called alt_ellipsoid, I'd assume it was the altitude over the ellipsoid as well. This is usually caught when they are 100' off a known ground level.
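Without claiming anything about the current state of the Mavlink codebase, the defensive workaround being described boils down to something like this. The helper name and numbers are mine; whether your particular stack exhibits the behavior, and what units and scaling it hands you, are exactly the things to verify first:

```python
def true_hae_m(reported_alt_ellipsoid_m: float,
               egm96_undulation_m: float,
               field_is_actually_egm96: bool = True) -> float:
    """Recover height above the WGS84 ellipsoid from a GPS_RAW_INT-style
    'alt_ellipsoid' value that some stacks populate with EGM96 geoid height
    instead of true HAE.  With N = geoid height above ellipsoid:
        h_ellipsoid = H_geoid + N
    """
    if field_is_actually_egm96:
        return reported_alt_ellipsoid_m + egm96_undulation_m
    return reported_alt_ellipsoid_m  # field really is HAE; pass it through

# Hypothetical numbers: reported "alt_ellipsoid" of 150 m, local undulation of -30 m
print(true_hae_m(150.0, egm96_undulation_m=-30.0))   # 120.0 m true HAE
```

As far as I can tell, the ~100' discrepancy against known ground level mentioned above is essentially this undulation term showing up uncorrected.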