Culture War Roundup for the week of August 4, 2025

This weekly roundup thread is intended for all culture war posts. 'Culture war' is vaguely defined, but it basically means controversial issues that fall along set tribal lines. Arguments over culture war issues generate a lot of heat and little light, and few deeply entrenched people ever change their minds. This thread is for voicing opinions and analyzing the state of the discussion while trying to optimize for light over heat.

Optimistically, we think that engaging with people you disagree with is worth your time, and so is being nice! Pessimistically, there are many dynamics that can lead discussions on Culture War topics to become unproductive. There's a human tendency to divide along tribal lines, praising your ingroup and vilifying your outgroup - and if you think you find it easy to criticize your ingroup, then it may be that your outgroup is not who you think it is. Extremists with opposing positions can feed off each other, highlighting each other's worst points to justify their own angry rhetoric, which becomes in turn a new example of bad behavior for the other side to highlight.

We would like to avoid these negative dynamics. Accordingly, we ask that you do not use this thread for waging the Culture War. Examples of waging the Culture War:

  • Shaming.

  • Attempting to 'build consensus' or enforce ideological conformity.

  • Making sweeping generalizations to vilify a group you dislike.

  • Recruiting for a cause.

  • Posting links that could be summarized as 'Boo outgroup!' Basically, if your content is 'Can you believe what Those People did this week?' then you should either refrain from posting, or do some very patient work to contextualize and/or steel-man the relevant viewpoint.

In general, you should argue to understand, not to win. This thread is not territory to be claimed by one group or another; indeed, the aim is to have many different viewpoints represented here. Thus, we also ask that you follow some guidelines:

  • Speak plainly. Avoid sarcasm and mockery. When disagreeing with someone, state your objections explicitly.

  • Be as precise and charitable as you can. Don't paraphrase unflatteringly.

  • Don't imply that someone said something they did not say, even if you think it follows from what they said.

  • Write like everyone is reading and you want them to be included in the discussion.

On an ad hoc basis, the mods will try to compile a list of the best posts/comments from the previous week, posted in Quality Contribution threads and archived at /r/TheThread. You may nominate a comment for this list by clicking on 'report' at the bottom of the post and typing 'Actually a quality contribution' as the report reason.

With the increased usage of ChatGPT and other AI slop in everyday communication such as casual emails and Slack messages, AI apologists have increasingly tried to excuse this usage by non-native English speakers (citation needed, but beside the point). The excuse is that for non-native speakers, AI usage can save time, or even increase the quality of the resulting writing. I want to argue the opposite: that AI output is particularly and exceptionally corrosive when used by non-native English speakers.

I came across this section (plaintext transcription in a comment below) of a YT video, where an intermediate-level English learner is trying to use ChatGPT to improve a piece of writing, and also to learn from it (source video, not important). Here's the catch: ChatGPT's output is just plain bad.

Overall, my issues with ChatGPT for this use case can be broken down into three main problems:

  1. The ChatGPT output is just plain worse in many ways, and English learners won't be able to tell.
  2. Because it critiques things that aren't wrong, learners who follow it blindly will lose their voice.
  3. The meaning has changed, and the user will not easily recognize this. The original meaning can be teased out of a sentence in broken English, but it has been erased completely in the AI output.

As a result, I feel like people using ChatGPT in this way are completely kneecapping their learning.

Let's go over the main revisions point by point:

  • stunning -> absolutely mind-blowing - Stunning is already quite a strong adjective and ChatGPT is overdoing it. OK edit.

  • I commented -> I typed in the comments - Absolutely a bad edit. Wordier with no added meaning, and the original English is truer to the original Japanese.

  • Moreover -> Not only that - Moreover is perfect here. Bad edit.

  • Em dash - not called for here. AI tell.

  • reacted really disgusting me -> actually reacted - This seriously changes the meaning, taking away a major element of the storytelling. Bad edit.

  • I'm in a heaven right now -> I'm in heaven - The "right now" is there for emphasis; the minimal fix is "I'm in heaven right now". Bad edit.

  • It was a peaceful and amazing moment in my life -> That one moment was pure peace and bliss. Probably one of the best highlights of my life. - Deemphasized and wordified into two sentences. A better edit would simply be "It was the most peaceful and amazing moment in my life". Bad edit.

  • And also, the most excited thing is -> And the most exciting part is still ahead. - AI slop tell. Bad edit.

  • I could die there -> nothing - ChatGPT just took that out completely!!!! WTF!!!!

  • I really wanna support her live too. -> I really, truly want to support her with everything I’ve got. - “really, truly” came out of nowhere and the double emphasis with “with everything I’ve got” is odd. Bad edit.

  • Imagine that live I feel like drinking her bath water. -> Just thinking about that live … feels like I could drink her bathwater. - This one is totally lost. Basic context clues and cultural knowledge make it clear that the narrator already wants to drink gamer girl bathwater regardless of any live. The correct edit would be "When I imagine that live, I feel like I'm drinking her bathwater" or "Imagining that live feels like drinking her bathwater." The original English is closer to correct than ChatGPT's version, and the correct meaning can be inferred.

Of course ChatGPT can probably be made to produce better outputs with better prompting, or used differently, but this is just one of many examples where ChatGPT usage by a casual user has actually made things worse.

Now, what's the point of this post? First, I would like to urge everyone not to use GenAI output in the final work, even for edits. Using AI as a judge is probably fine, but the best way to maintain quality is to write all of the final text in your own words, even for people without perfect English. Secondly, with all levels of society using or even abusing AI tools, productivity may increase by some metrics, but the result will also be an enshittification of all written communication.
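
If you do want the judge workflow, here's roughly what I mean - a minimal sketch using the OpenAI Python SDK, where the model name, prompt, and draft sentence are all just illustrative:

    # Sketch: ask the model to point out errors, not to rewrite.
    # The final wording stays in the learner's own words.
    from openai import OpenAI  # pip install openai

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    draft = "Moreover, her singing was stunning, so I commented on the video."

    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": (
                "You are an English tutor. List genuine grammar errors in the "
                "user's text and briefly explain each one. Do NOT rewrite the "
                "text or suggest stylistic 'improvements' to correct sentences."
            )},
            {"role": "user", "content": draft},
        ],
    )
    print(response.choices[0].message.content)  # feedback only; no rewritten text to paste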

We've seen an increasing number of complaints enter the discourse about foreign immigrants with weak English skills just being annoying to deal with in everyday life. And I've had similar experiences, where dealing with a fresh-off-the-boat foreigner has been an annoyance when ordering food or asking a simple question - and also where hiring an American would have cost only a tiny bit more. Well, now AI slop is going to provide a double whammy: lazy or misguided native speakers are going to enshittify their own communication with slop, and foreigners will have their English learning impeded, so the English they do write will be worse.

What a sterling example of making the dream of perfection the sworn enemy of the merely better. As others have pointed out before, the most likely alternative, in the absence of ChatGPT, would have been this poor fellow resorting to Google Translate or other, far simpler ML solutions. There isn't an abundance of fluent English and Japanese speakers willing to proofread random YouTube comments.

I don't speak Japanese, but I see nothing particularly objectionable in the translation. It might not capture every nuance, but it gets the gist across. Learning a language takes time, probably years, and by the time this gentleman gets good enough that he needs or appreciates the nuance, LLMs will be even better at the job.

This paper: https://arxiv.org/html/2504.18221v1 grades GPT-4 against other translators with actual human grading (not BS like ROUGE, which is useless) and finds that GPT-4 doesn't seriously outperform DeepL, and that Google Translate, while worse, isn't even that far off.
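
To see why n-gram overlap metrics mislead here, consider this minimal sketch with the rouge-score package (the sentences are invented for illustration):

    from rouge_score import rouge_scorer  # pip install rouge-score

    scorer = rouge_scorer.RougeScorer(["rouge1"], use_stemmer=True)

    reference = "the cat is on the mat"
    paraphrase = "a feline sits upon the rug"  # same meaning, almost no shared words
    negation = "the cat is not on the mat"     # opposite meaning, near-total overlap

    # The faithful paraphrase scores ~0.17 F1 while the meaning-reversing
    # near-copy scores ~0.92: ROUGE rewards surface overlap, not meaning.
    print(scorer.score(reference, paraphrase)["rouge1"].fmeasure)
    print(scorer.score(reference, negation)["rouge1"].fmeasure)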

This test is actually also unfair in ChatGPT's favor: since the test text is a story by a famous author, ChatGPT has likely already taken a peek at human translations of the work during training.

I'm reading the paper, but initial issues that caught my eye:

  1. They're not evaluating GPT-4. They're using 4o. The exact implementation details of 4o are still opaque - it might be a distill of 4 - but they're not the same model. As far as I can tell, that's a point of confusion on your part, not the authors'.

  2. 4o, even at the time of publication, was not the best model available. Very far from it. It is a decent generalist model, but not SOTA. It isn't the best model, the second best model, or even the third... best model, on almost any metric one opts for.

I have, as far as I'm aware, never claimed that LLMs match or outperform professional human translators. My core contention was that even basic bitch LLMs are good enough, and an improvement over previous SOTA, including DeepL, which this paper supports.

This would hold true even if the authors had chosen to use something like o3, Opus 4, Gemini 2.5 Pro, etc. It is unclear to me whether they were aware that better options were available; there's little reason to use 4o if one wants to know what the best possible output is.

And even if it is true, it doesn't matter. The models are consistently getting better. We have a new SOTA every few months these days.

They're not evaluating GPT-4. They're using 4o.

4o vs GPT-4 is my mistake, but GPT-4 is generally considered obsolete and nobody uses it. It's true that 4o is a mixed bag and underperforms GPT-4 in some respects, but we have no reason to believe that it's significantly worse than GPT-4 at translation.

4o is also what powers chatgpt.com, so it's the model most casual users will get their output from.

4o, even at the time of publication, was not the best model available.

4o was released well before Gemini 2.0 or Claude 3.5, so it likely was the best model at the time, along with the original GPT-4. I agree that right now 4o is not good.

My core contention was that even basic bitch LLMs are good enough

My core contention is that DeepL is good enough, as it's within spitting distance of ChatGPT. But on the other hand, ChatGPT has given people ways to do much, much worse when they use it wrong.
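
For reference, the DeepL route is about this involved - a minimal sketch with the official deepl Python package (the auth key and Japanese sentence are placeholders, not from the paper or the video):

    import deepl  # pip install deepl

    translator = deepl.Translator("your-auth-key")  # placeholder API key
    result = translator.translate_text(
        "彼女の歌声は最高だった。",  # invented stand-in: "Her singing voice was the best."
        source_lang="JA",
        target_lang="EN-US",
    )
    print(result.text)  # the English rendering of the input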

The paper seems to have been published in April 2025.

Gemini 2.0 Pro and Claude 3.7 Sonnet came out in February 2025. Claude 3.5 Sonnet came out in June 2024 and was better than the version of 4o available then.

At the very least, the authors should have made a note that they weren't using the SOTA, or that the SOTA would have moved significantly by the time of publication. To do less is mild dishonesty. This isn't 2022; the pace of progress is evident.

4o is also what powers chatgpt.com so it's the model that most casual users will get the output from.

True, but that's OAI being cheap, and not an indictment of the utility of LLMs for translation. It's akin to claiming TVs suck, and then only using a cheap and cheerful $300 model from Walmart as the standard.

My criticisms stand, namely that LLMs only get better, they're "good enough", and that this is a net improvement over the status quo. It remains to be seen how much better the SOTA is over 4o or DeepL.

Oh oops, I misread your comment - I thought you said that 4o was not SOTA when it was released. Yes, it was obsolete when the paper came out.

LLMs only get better, they're "good enough", and that this is a net improvement over the status quo.

That won't change the fact that people who use them wrong will still do worse than not using an LLM at all.