Culture War Roundup for the week of February 17, 2025

This weekly roundup thread is intended for all culture war posts. 'Culture war' is vaguely defined, but it basically means controversial issues that fall along set tribal lines. Arguments over culture war issues generate a lot of heat and little light, and few deeply entrenched people ever change their minds. This thread is for voicing opinions and analyzing the state of the discussion while trying to optimize for light over heat.

Optimistically, we think that engaging with people you disagree with is worth your time, and so is being nice! Pessimistically, there are many dynamics that can lead discussions on Culture War topics to become unproductive. There's a human tendency to divide along tribal lines, praising your ingroup and vilifying your outgroup - and if you think you find it easy to criticize your ingroup, then it may be that your outgroup is not who you think it is. Extremists with opposing positions can feed off each other, highlighting each other's worst points to justify their own angry rhetoric, which becomes in turn a new example of bad behavior for the other side to highlight.

We would like to avoid these negative dynamics. Accordingly, we ask that you do not use this thread for waging the Culture War. Examples of waging the Culture War:

  • Shaming.

  • Attempting to 'build consensus' or enforce ideological conformity.

  • Making sweeping generalizations to vilify a group you dislike.

  • Recruiting for a cause.

  • Posting links that could be summarized as 'Boo outgroup!' Basically, if your content is 'Can you believe what Those People did this week?' then you should either refrain from posting, or do some very patient work to contextualize and/or steel-man the relevant viewpoint.

In general, you should argue to understand, not to win. This thread is not territory to be claimed by one group or another; indeed, the aim is to have many different viewpoints represented here. Thus, we also ask that you follow some guidelines:

  • Speak plainly. Avoid sarcasm and mockery. When disagreeing with someone, state your objections explicitly.

  • Be as precise and charitable as you can. Don't paraphrase unflatteringly.

  • Don't imply that someone said something they did not say, even if you think it follows from what they said.

  • Write like everyone is reading and you want them to be included in the discussion.

On an ad hoc basis, the mods will try to compile a list of the best posts/comments from the previous week, posted in Quality Contribution threads and archived at /r/TheThread. You may nominate a comment for this list by clicking on 'report' at the bottom of the post and typing 'Actually a quality contribution' as the report reason.

Grok 3 just came out, and early reports say it’s shattering rankings.

Now there is always hype around these sorts of releases, but my understanding of the architecture of the compute cluster for Grok 3 makes me think there may be something to these claims. One of the exciting and interesting revelations is that it tends to perform extremely well across a broad range of applications, seemingly showing that if we just throw more compute at an LLM, it will tend to get better in a general way. Not sure what this means for more specifically trained models.

One of the most exciting things to me is that Grok 3 voice allegedly understands tone, pacing, and intention in conversations. I loved OpenAI's voice assistant until it cut me off every time I paused for more than a second. If Grok 3 is truly the first properly conversational AI, it could be a game changer.

I'm also curious how it compares to DeepSeek, if anyone here knows more than I do.

Grok 3 is a whelming outcome. The only thing notable about it is how consistent it is with most predictions, including scaling laws.
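(For context, the "scaling laws" in question are empirical fits of the Chinchilla type, where pretraining loss falls off as a smooth power law in parameter count N and training tokens D, roughly

\[ L(N, D) \approx E + \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}} \]

with fitted constants E, A, B, α, β. Grok 3 landing about where those curves would predict for its compute budget is the "consistent with most predictions" part.)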

Unfortunately, its time as SOTA is going to be short, or nonexistent, because the full-fat o3 from OpenAI already ekes out a win in benchmarks. Of course, o3 is not technically available to the public, so among released models that are pay-to-play, Grok reigns.

I've only played around with it a little bit, through lmarena.ai because I'm not in dire need of any paid plan. It seemed fine? There's one particular question that I ask LLMs, courtesy of my maths PhD cousin: "Is the one-point compactification of a Hausdorff space itself Hausdorff?" The correct answer is yes, or so I'm told. Grok 2 fails, Grok 3 succeeds. But so do GPT-4o, Gemini 2.0 Pro and the like.*

(I've asked this so many times that my keyboard automatically suggests the entire question, talk about machine learning)

In short, Grok 3 is a mild bull signal for LLMs, and a slightly stronger one for xAI. It doesn't seem to be astonishingly good, or break ground other models haven't reached. It also hasn't been made retarded by too much Reinforcement Learning from Elon's Feedback. He shared an excerpt showing it lambasting an outlet called The Information, but actual users get a far more measured response. I cynically suspect that a few xAI engineers probably set the sycophancy setting to the maximum when he's using it.

*I'm probably misremembering the original explanation I was given, and my cousin likely said it was a no except with additional qualifiers. Mea culpa.

Edit: On a second try with Grok 3:

In all relevant cases, we can find disjoint open sets separating any two distinct points in X^*. Therefore, the one-point compactification X^* of a Hausdorff space X (assuming X is locally compact) is itself Hausdorff.

Which I believe is the correct answer.
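For reference, the precise statement (a standard point-set topology result, stated here from memory rather than quoted from Grok): for a Hausdorff space X, the one-point (Alexandroff) compactification X^* is Hausdorff if and only if X is locally compact, i.e.

\[ X \ \text{Hausdorff} \;\Longrightarrow\; \bigl( X^{*} \ \text{Hausdorff} \iff X \ \text{locally compact} \bigr). \]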

There's one particular question that I ask LLMs, courtesy of my maths PhD cousin: "Is the one-point compactification of a Hausdorff space itself Hausdorff?" The correct answer is yes, or so I'm told.

Are you just asking it as a yes/no question? This is a standard question that a first-year undergrad could be asked to check that they understood the definitions, and it's unlikely that the answer wouldn't be in the training set somewhere. For example, I quickly fed it to a Q4_K_M quantised Qwen2.5-3B (that's on the level that you could run on a smartphone nowadays), and it completed

Q: Is the one-point compactification of a Hausdorff space itself Hausdorff?

A:

with

Yes, the one-point compactification of a Hausdorff space is itself Hausdorff.<|endoftext|>

edit: See @wlxd's discussion for why the correct answer is actually "No". In fact, Qwen2.5-3B is almost perfectly on the edge: the log-odds of the first token of an alternative answer that goes "No, the one-point compactification of a Hausdorff space is not necessarily Hausdorff.<|endoftext|>" is only about 0.21 lower, so the probability of it being chosen is about e^-0.21 or 0.81 times the probability that it would pick the "Yes...". (Think 45% No to 55% Yes.)
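For anyone who wants to reproduce this kind of first-token probe, here is a rough sketch using the Hugging Face transformers API. Note the assumptions: it loads the unquantised Qwen/Qwen2.5-3B base model rather than the Q4_K_M GGUF quant described above, and it assumes the completion's first token is " Yes" or " No" with a leading space, so the exact numbers will differ from the ones quoted.

```python
# Sketch: compare the log-probabilities of "Yes..." vs "No..." as the first
# token of the answer. Assumes the unquantised Qwen/Qwen2.5-3B base model;
# a GGUF quant run through llama.cpp will give somewhat different numbers.
import math

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen2.5-3B"  # base model, not the Instruct variant
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)

prompt = (
    "Q: Is the one-point compactification of a Hausdorff space itself Hausdorff?\n"
    "A:"
)
inputs = tok(prompt, return_tensors="pt")

with torch.no_grad():
    next_token_logits = model(**inputs).logits[0, -1]  # logits for the token after "A:"
log_probs = torch.log_softmax(next_token_logits, dim=-1)

# First token of each candidate answer (note the leading space after "A:").
yes_id = tok(" Yes", add_special_tokens=False).input_ids[0]
no_id = tok(" No", add_special_tokens=False).input_ids[0]

gap = (log_probs[yes_id] - log_probs[no_id]).item()
print(f"log-odds gap (Yes - No): {gap:.2f}")
print(f"P(No) / P(Yes) ~= {math.exp(-gap):.2f}")
# A gap of ~0.21 gives a ratio of ~0.81, i.e. roughly 45% No vs 55% Yes
# between these two continuations, which is where the figures above come from.
```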

The point is that this answer is just incorrect. There are non-Hausdorff one-point compactifications of Hausdorff spaces. You need the additional assumption of local compactness for it to be true.
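To give one standard counterexample: take X = ℚ with the subspace topology inherited from ℝ. It is Hausdorff but not locally compact, because compact subsets of ℚ have empty interior, so no rational has a compact neighbourhood. In the one-point compactification X^* = ℚ ∪ {∞}, every neighbourhood of ∞ has the form {∞} ∪ (ℚ \ K) with K compact, so it meets every neighbourhood of every rational q; hence q and ∞ cannot be separated and X^* is not Hausdorff:

\[ X = \mathbb{Q} \subset \mathbb{R} \ \text{Hausdorff, not locally compact} \;\Longrightarrow\; X^{*} = \mathbb{Q} \cup \{\infty\} \ \text{not Hausdorff}. \]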

You are right, but I don't think that was "the point", given that @self_made_human apparently was led to believe that it is yes (and seemed to treat that answer being given as a success criterion).

(I was actually in the process of writing up another response, having realised it is not true: I had fed the question to DeepSeek-R1's Qwen7B distill to reason through and found that it choked trying to conjure up compact neighbourhoods that I saw no grounds for existing. I just hadn't gotten to the point of having a good counterexample yet.)

It's likely PEBKAC on my part. My cousin had explained the reasoning almost a year back (eons in LLM time) and I'd likely forgotten the finer qualifications and only remembered the yes-or-no bit.

I give a counter example in my other comment.

Ah, thanks, that works.