site banner

Culture War Roundup for the week of June 9, 2025

This weekly roundup thread is intended for all culture war posts. 'Culture war' is vaguely defined, but it basically means controversial issues that fall along set tribal lines. Arguments over culture war issues generate a lot of heat and little light, and few deeply entrenched people ever change their minds. This thread is for voicing opinions and analyzing the state of the discussion while trying to optimize for light over heat.

Optimistically, we think that engaging with people you disagree with is worth your time, and so is being nice! Pessimistically, there are many dynamics that can lead discussions on Culture War topics to become unproductive. There's a human tendency to divide along tribal lines, praising your ingroup and vilifying your outgroup - and if you think you find it easy to criticize your ingroup, then it may be that your outgroup is not who you think it is. Extremists with opposing positions can feed off each other, highlighting each other's worst points to justify their own angry rhetoric, which becomes in turn a new example of bad behavior for the other side to highlight.

We would like to avoid these negative dynamics. Accordingly, we ask that you do not use this thread for waging the Culture War. Examples of waging the Culture War:

  • Shaming.

  • Attempting to 'build consensus' or enforce ideological conformity.

  • Making sweeping generalizations to vilify a group you dislike.

  • Recruiting for a cause.

  • Posting links that could be summarized as 'Boo outgroup!' Basically, if your content is 'Can you believe what Those People did this week?' then you should either refrain from posting, or do some very patient work to contextualize and/or steel-man the relevant viewpoint.

In general, you should argue to understand, not to win. This thread is not territory to be claimed by one group or another; indeed, the aim is to have many different viewpoints represented here. Thus, we also ask that you follow some guidelines:

  • Speak plainly. Avoid sarcasm and mockery. When disagreeing with someone, state your objections explicitly.

  • Be as precise and charitable as you can. Don't paraphrase unflatteringly.

  • Don't imply that someone said something they did not say, even if you think it follows from what they said.

  • Write like everyone is reading and you want them to be included in the discussion.

On an ad hoc basis, the mods will try to compile a list of the best posts/comments from the previous week, posted in Quality Contribution threads and archived at /r/TheThread. You may nominate a comment for this list by clicking on 'report' at the bottom of the post and typing 'Actually a quality contribution' as the report reason.

4
Jump in the discussion.

No email address required.

I have seen the AGI, and it is Gemini 2.5 Pro

If you go and look at Metaculus you’ll see that regardless of the recent breakthroughs like VEO3 and OpenAI’s “Ghiblification” (probably the first AI image system where accusing the outputs of being “slop” makes the accuser look unreasonable rather than denigrates the picture itself) all the “when AGI?” benchmarks have been uncharacteristically stubborn. The question asking about “weak” AGI has gone nowhere for two weeks months while the median prediction on the question about full AGI has receded three years from 2031 to 2034.

It looks like Scott’s AGI 2027 push has failed to convince the markets. For the informed person, AGI is coming “soon” but isn’t imminent. However I think that actually AGI is already here, is freely available to anyone with an internet connection and is called Gemini 2.5 Pro.

For those of us not in the know, at the moment you can access Gemini 2.5 Pro for free with no limits on Google’s AI studio right here: https://aistudio.google.com/prompts/new_chat ; yep, you heard that right, the literal best text model in the world according to the lmarena.ai leaderboard is available for free with no limits and plenty of customisation options too. They’re planning on connecting AI studio access to an API key soon so go and try it out for free right now while you can. No need to overpay for ChatGPT pro when you can use AI studio, and it’s a lot lot better than the Gemini you get via the dedicated app/webpage.

Our story begins a few days ago when I was expecting delivery of a bunch of antique chinese hand scroll paintings I had purchased. Following standard Chinese tradition where collectors would add their own personal seal in red ink to the work and seeing as these scrolls already had a bunch of other seal glyphs I wanted to add my own mark too. The only issue was that I didn’t have one.

This led to a rabbit hole where I spent a good portion of my Saturday learning about the different types of Chinese writing all the way from oracle bone script to modern simplified script and the different types of stones from which seal were made. Eventually after hours of research I decided I wanted a seal made from Shoushan stone written in Zhuànshū script. That was the easy part.

The real difficulty came in translating my name into Chinese. I, with a distinctly non Chinese name, don’t have an easy way to translate the sounds of my name into Chinese characters, which is made all the harder by the fact that pretty much all Chinese syllables end in a vowel (learning this involved even more background reading) even though my name has non-vowel ending syllables. Furthermore, as a mere mortal and not a Son of Heaven with a grand imperial seal, decorum dictated that my personal mark be only 4 characters and around 2cm*2cm, enough to be present but not prominent on the scroll.

All this led to further constraints on the characters to be put on my seal, they couldn’t be so complex that carving them on a small seal would be impossible, and yet I needed to get my name and surname as accurately onto it as possible. Naturally this involved a lot of trial and error and I think I tried over 100 different combinations before coming up with something that sort of (but not completely) worked.

There was one syllable for which I could not find any good Chinese match and after trying and rejecting about a dozen different choices I threw my hands up and decided to consult Gemini. It thought for about 15 seconds and immediately gave me an answer that was superior to literally everything I had tried before phonetically, however unfortunately was too complex for a small seal (it wouldn’t render on the website I was buying the seal from).

I told Gemini about my problem and hey ho, 15 seconds later another character, this time graphically much simpler but sounding (to my non-Chinese ears) exactly the same was present and this actually rendered properly. The trial and error system I was using didn’t even have this particular character as an option so no wonder I hadn’t found it. It also of its own volition asked me whether I wanted to give it my full name so it could give me characters for that. I obliged and, yes, its output mostly matched what I had but was even better for one of the other syllables.

I was honestly very impressed. This was no mean feat because it wasn’t just translating my name into Chinese characters but rather translating it into precisely 4 characters that are typographically simple enough to carve onto a small seal, and with just a few seconds of thought it had managed to do something that had taken me many hours of research with external aids and its answer was better than what I had come up with myself.

All this had involved quite a bit of back and forth with the model so out of curiosity at seeing how good it was at complex multi step tasks given in a single instruction I opened up a fresh chat and gave it 2-3 lines explaining my situation (need seal for marking artworks in my collection). Now I’m an AI believer so I thought it would be good enough to solve the problem, which it absolutely did (as well as giving me lots of good unprompted advice on the type of script and stone to use, which matched my earlier research) but it also pointed out that by tradition only the artist themselves mark the work with their full name, while collectors usually include the letter 藏 meaning “collection”.

It told me that it would be a Faux Pas to mark the artworks with just my name as that might imply I was the creator. Instead it gave me a 4 letter seal ending in 藏 where the first three letters sounded like my name. This was something that I hadn’t clocked at all in my hours of background reading and the absolute last thing I would ever want is to look like an uncultured swine poseur when showing the scrolls to someone who could actually read Chinese.

In the end the simple high level instruction to the AI gave me better final results than either me on my own or even me trying to guide the AI… It also prevented a potential big faux pas that I could have gone my whole life without realizing.

It reminded me of the old maxim that when you’re stuck on a task and contacting a SysAdmin you should tell them what your overall goal is rather than asking for a solution to the exact thing you’re stuck on because often there’s a better way to solve your big problem you’ve overlooked. In much the same way, the AI of 2025 has become good enough that you should just tell it your problem rather than ask for help when you get stuck.

Now yes, impressive performance on a single task doesn’t make AGI, that requires a bit more. However its excellent performance on the multilingual constrained translation task and general versatility across the tasks I’ve been using it for for the last few weeks (It’s now my AI of choice) means I see it as a full peer to the computer in Star Trek etc. It’s also completely multimodal these days, meaning I can (and have) just input random PDFs etc. or give it links to Youtube videos and it’ll process them no different to how a human would (but much faster). Funny how of all the futuristic tech in the Star Trek world, this is what humanity actually develops first…

Just last week I’d been talking to a guy who was preparing to sit the Oxford All Souls fellowship exam. These are a highly gruelling set of exams that All Souls College Oxford uses to elect two fellows each year out of a field of around 150. The candidates are normally humanities students who are nearing the end of their PhD/recently graduated. You can see examples of the questions e.g. the History students get asked here.

However the most unique and storied part of the fellowship exam (now sadly gone) was the single word essay. For this, candidates were given a card with a single word on it and then they had three hours to write “not more than six sides of paper” in response to that prompt. What better way to try out Gemini than give it a single word and see how well it is able to respond to it? Besides, back in 2023 Nathan Robinson (or Current Affairs fame) tried doing something very similar with ChatGPT on the questions from the general paper and it gave basically the worst answers in the world so we have something to compare with and marvel at how much tech has advanced in two short years.

In a reply to this post I’m pasting the exact prompt I used and the exact, unedited answer Gemini gave. Other than cranking up the temperature to 2 no other changes from the default settings were made. This is a one-shot answer so it’s not like I’m getting it to write multiple answers and selecting the best one, it’s literally the first output. I don’t know whether the answer is good enough to get Gemini 2.5 Pro elected All Souls Fellow, but it most certainly is a damn sight better than the essay I would have written, which is not something that could be said about the 2023 ChatGPT answers in the link above. It also passes for human written across all the major “AI detectors”. You should see the words and judge for yourself. Perhaps even compare this post, written by me, with the output of the AI and honestly ask yourself which you prefer?

Overall Gemini 2.5 Pro is an amazing writer and able to handle input and output no different to how a human would. The only thing missing is a corporeal presence but other than that if you showed what we have out there today to someone in the year 2005 they would absolutely agree that it is an Artificial General Intelligence under any reasonable definition of AGI. It’s only because of all the goalpost moving over the last few years that people have slowly become desensitized to chatbots that pass the Turing test.

So what can’t these systems do today? Well, for one they can’t faithfully imitate the BurdensomeCount™ style. I fed Gemini 2.5 Pro a copy of every single comment I’ve ever made here and gave it the title of this post, then asked it to generate the rest of the text. I think I did this over 10 times and not a single one of those times did the result pass the rigorous QC process I apply to all writing published under the BurdensomeCount™ name (the highest standards are maintained and only the best output is deemed worthy for your eyes, dear reader). Once or twice there were some interesting rhetorical flourishes I might integrate into future posts but no paragraph (or larger) sized structures fit to print as is. I guess I am safe from the AI yet.

In a way all this reminds me of the difference between competition coding and real life coding. At the moment the top systems are all able to hit benchmarks like “30th best coder in the world” etc. without too much difficulty but they are still nowhere near autonomous for the sorts of tasks a typical programmer works with on a daily basis managing large codebases etc.. Sure, when it comes to bite sized chunks of writing the AI is hard to beat, but when you start talking about a voice and a style built up over years of experience and refinement, well, that is lacking…

In the end, this last limitation might be the most humanizing thing about it. While Gemini 2.5 Pro can operate as an expert Sinologist, a cultural advisor, and a budding humanities scholar, it cannot yet capture a soul. It can generate text, but not a persona forged from a lifetime of experience. But to hold this against its claim to AGI is to miss the forest for one unique tree. Its failure to be me does not detract from its staggering ability to be almost everything else I need it to be. The 'general' in AGI was never about encompassing every niche human talent, but about a broad, powerful capability to reason, learn, and solve novel problems across domains—a test it passed when it saved me from a cultural faux pas I didn't even know I was about to make. My style, for now, remains my own, but this feels less like a bastion of human exceptionalism and more like a quaint footnote in the story of the powerful, alien mind that is already here, waiting for the rest of the world to catch up.

At this point, I don't even know what an AGI is. The word has just been semantically saturated for me.

What I do know, based on having followed the field since before GPT-2 days, and personally fucked around since GPT-3, is that for at least a year or so, SOTA LLMs have been smarter and more useful than the average person. Perhaps one might consider even the ancient GPT 3.5 to have met this (low) bar.

They can't write? Have you seen the quality of the average /r/WritingPrompts post?

They can't code? Have you seen the average code monkey?

They can't do medicine/math/..? Have you tried?

The average human, when confronted with a problem outside their core domain of expertise, is dumb as rocks compared to an LLM.

I don't even know how I managed before LLMs were a thing. It hasn't been that long, I've spent the overwhelming majority of my life without them. If cheap and easy access to them were to magically vanish, my willingness to pay to get back access would be rather high.

Ah, it's all too easy to forget how goddamn useful it can be to have access to an alien intelligence in one's pocket. Even if it's a spiky, inhuman form of intelligence.

On the topic of them being cheap/free, it's a damn shame that AI Studio is moving to API access only. Google was very flustered by the rise of ChatGPT and the failure of Bard, it was practically begging people to give Gemini a try instead. I was pleasantly surprised and impressed since the 1.5 Pro days, and I'm annoyed that their gambit has paid off, that demand even among normies and casual /r/ChatGPT users increased to the point that even a niche website meant for powerusers got saturated.

Why do you consistently assume that people who don't share your views of LLM capabilities just haven't seen what they can do/what humans can do? For example:

They can't code? Have you seen the average code monkey?

Yes I have (and of course, I've used LLMs as well). That's why I say LLMs suck at code. I'm not some ignorant caricature like you seem to think, who is judging things without having proper frame of reference for them. I actually know what I'm talking about. I don't gainsay you when you say that an LLM is good at medical diagnoses, because that's not my field of expertise. But programming is, and they simply are not good at programming in my opinion. Obviously reasonable people can disagree on that evaluation, but it really irks me that you are writing like anyone who disagrees with your take is too inexperienced to give a proper evaluation.

I join the chorus of people who don't quite understand what your problem is with LLMs. What kind of code do you write? The tools are at the point where I can give them a picture of a screen I want along with some API endpoints and it reliably spits out immediately functioning react code. I can then ask it to write the middleware code for those endpoints and finally ask it to create a sproc for the database component. It's possible you hover high above us react monkeys and barely even consider that programming but surely you understand that's the level like at least half of all programmers operate on? I had copilot do all these things today, I know that it can do these things. So where is the disconnect? It's truly possible there is some higher plane of coding us uninspired 9-5 paycheck Andy's can only obliquely perceive and this is your standard for being able to program but it'd be nice if you could just say that to resolve the confusion.

As somone who's been working in the field of machine learning since 2012 and generally agrees with @SubstantialFrivolity's assesment, I think that what we are looking here is a bifurcation in opinion between people looking for "bouba" solutions and those looking for "kiki" solutions.

If you're a high-school student or literature major with zero background in computer science looking to build a website or develop baby's first mobile app LLM generated code is a complete game changer. Literally the best thing since sliced bread. (The OP, and @self_made_human's comments reflect this)

If you're a decently competent programmer at a big tech firm, LLMs are at best a mild productivity booster. (See @kky's comments below)

If you are decently competent programmer working in an industry where things like accuracy, precision, and security are core concerns, LLMs start to look anti-productive as in the time you spent messing around with prompts, checking the LLM's work, and correcting it's errors, you could've easily done the work yourself.

Finally if you're one of those dark wizards working in FORTRAN or some proprietary machine language because this is Sparta IBM/Nvidia/TMSC and the compute must flow, you're skeptical of the claim that an LLM can write code that would compile at all.

If you're a high-school student or literature major with zero background in computer science looking to build a website or develop baby's first mobile app LLM generated code is a complete game changer. Literally the best thing since sliced bread.

You have to contend with the fact that like 95+% of employed programmers are at this level for this whole thing to click into place. It can write full stack CRUD code easily and consistently. five years ago you could have walked into any bank in any of the top 20 major cities in the united states with the coding ability of o3 and some basic soft skills and be earning six figures within 5 years. I know this to be the case, I've trained and hired these people.

If you are decently competent programmer working in an industry where things like accuracy, precision, and security are core concerns, LLMs start to look anti-productive as in the time you spent messing around with prompts, checking the LLM's work, and correcting it's errors, you could've easily done the work yourself.

I did allude that there might be a level of programming where one needs to see through the matrix to do but in SF's post and in most situations I've heard the critique in it's not really the case. They're just using it for writing config files that are annoying because they pull together a bunch of confusing contexts and interface with proprietary systems that you need to basically learn from institutional knowledge. The thing LLMs are worst at. Infrastructure and configuration are the two things most programmers hate the most because it's not really the more fulfilling code parts. But AI is good at the fulfilling code parts for the same reason people like doing them.

In time LLMs will be baked into the infrastructure parts too because it really is just a matter of context and standardization. It's not a capabilities problem, just a situation where context is splined between different systems.

Finally if you're one of those dark wizards working in FORTRAN or some proprietary machine language because this is Sparta IBM/Nvidia/TMSC and the compute must flow, you're skeptical of the claim that an LLM can write code that would compile at all.

If anything this is reversed, it can write FORTRAN fine, it probably can't do it in the proprietary hacked together nonsense installations put together in the 80s by people working in a time where patterns came on printed paper and might collaborate on standards once a year at a conference if they were all stars. but that's not the bot's fault. This is the kind of thinking that is impressed by calculators because it doesn't properly understand what's hard about some things.

I feel like I'm taking crazy pills here. No one's examples about how it can't write code are about it writing code. It's all config files and vague evals. No one is talking about it's ability to write code. It's all devops stuff.

This is the kind of thinking that is impressed by calculators because it doesn't properly understand what's hard about some things.

Ironically I considered saying almost this exact thing in my above comment, but scratched it out as too antagonistic.

The high-school students and literature majors are impressed by LLMs ability to write code because they do not know enough about coding to know what parts are easy and what parts are hard.

Writing something that looks like netcode and maybe even compiles/runs is easy. (All you need is a socket, a for loop, a few if statements, a return case, and you're done) Writing netcode that is stable, functional, and secure enough to pass muster in the banking industry is hard. This is what i was gesturing towards with "Bouba" vs "Kiki" distinction. Banks are notoriously "prickly" about thier code because banking (unlike most of what Facebook, Amazon, and Google do) is one of those industries where the accuracy and security of information are core concerns.

Finally which LLM are you using to write FORTRAN? because after some brief experimentation niether Gemini nor Claude are anywhere close.

What do you imagine is the ratio just at banks between people writing performant net code and people writing crud apps? If you want to be an elitist about it then be my guest, but it's a completely insane standard. Honestly the people rolling out the internal llm tooling almost certainly outnumber the people doing the work you're describing.

I do not think that expecting basic competency is an "insane standard" or even that elitist. Stop making excuses for sub-par work and answer the question.

Which LLM are you using to write FORTRAN?

What sort of problem did you ask it to solve?

More comments