This weekly roundup thread is intended for all culture war posts. 'Culture war' is vaguely defined, but it basically means controversial issues that fall along set tribal lines. Arguments over culture war issues generate a lot of heat and little light, and few deeply entrenched people ever change their minds. This thread is for voicing opinions and analyzing the state of the discussion while trying to optimize for light over heat.
Optimistically, we think that engaging with people you disagree with is worth your time, and so is being nice! Pessimistically, there are many dynamics that can lead discussions on Culture War topics to become unproductive. There's a human tendency to divide along tribal lines, praising your ingroup and vilifying your outgroup - and if you think you find it easy to criticize your ingroup, then it may be that your outgroup is not who you think it is. Extremists with opposing positions can feed off each other, highlighting each other's worst points to justify their own angry rhetoric, which becomes in turn a new example of bad behavior for the other side to highlight.
We would like to avoid these negative dynamics. Accordingly, we ask that you do not use this thread for waging the Culture War. Examples of waging the Culture War:
- Shaming.
- Attempting to 'build consensus' or enforce ideological conformity.
- Making sweeping generalizations to vilify a group you dislike.
- Recruiting for a cause.
- Posting links that could be summarized as 'Boo outgroup!' Basically, if your content is 'Can you believe what Those People did this week?' then you should either refrain from posting, or do some very patient work to contextualize and/or steel-man the relevant viewpoint.
In general, you should argue to understand, not to win. This thread is not territory to be claimed by one group or another; indeed, the aim is to have many different viewpoints represented here. Thus, we also ask that you follow some guidelines:
- Speak plainly. Avoid sarcasm and mockery. When disagreeing with someone, state your objections explicitly.
- Be as precise and charitable as you can. Don't paraphrase unflatteringly.
- Don't imply that someone said something they did not say, even if you think it follows from what they said.
- Write like everyone is reading and you want them to be included in the discussion.
On an ad hoc basis, the mods will try to compile a list of the best posts/comments from the previous week, posted in Quality Contribution threads and archived at /r/TheThread. You may nominate a comment for this list by clicking on 'report' at the bottom of the post and typing 'Actually a quality contribution' as the report reason.

Computationally, maybe all we are is Markov chains. I'm not sold, but Markov chat bots have been around for a few decades now and used to fool people occasionally even at far smaller scales than today's models.
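For anyone who hasn't poked at one, a word-level Markov chat bot really is only a few lines. A minimal bigram sketch (the corpus here is a made-up toy, and real bots of that era used bigger n-grams and corpora):

```python
import random
from collections import defaultdict

def build_chain(text):
    """Map each word to the list of words observed to follow it (a bigram model)."""
    chain = defaultdict(list)
    words = text.split()
    for cur, nxt in zip(words, words[1:]):
        chain[cur].append(nxt)
    return chain

def babble(chain, seed, length=20):
    """Walk the chain: each next word depends only on the current one."""
    out = [seed]
    for _ in range(length - 1):
        followers = chain.get(out[-1])
        if not followers:
            break
        out.append(random.choice(followers))
    return " ".join(out)

corpus = "the cat sat on the mat and the dog sat on the rug"
print(babble(build_chain(corpus), "the"))
```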
LLMs can do pretty impressive things, but I haven't seen convincing evidence that any of them have stepped clearly outside the bounds of their training dataset. In part that's hard to evaluate because we've been training them on everything we can find. Can an LLM trained purely on pre-Einstein sources adequately discuss relativity? A human can be well versed in lots of things with substantially less training material.
I still don't think we have a good model for what intelligence is. Some have recently suggested "compression", which is interesting from an information theory perspective. But I won't be surprised to find that whatever it is, it's actually an NP-hard problem in the perfect case, and everything else is just heuristics and approximations trying to be close. In some ways it'd be amusing if it turns out to be a good application of quantum computing.
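For what it's worth, the "compression" angle has a concrete classical form: normalized compression distance, which uses an off-the-shelf compressor as a crude stand-in for (uncomputable) Kolmogorov complexity. A minimal sketch, with toy example strings of my own choosing:

```python
import gzip

def ncd(a: bytes, b: bytes) -> float:
    """Normalized compression distance: near 0 for very similar inputs,
    near 1 for unrelated ones. gzip approximates Kolmogorov complexity."""
    ca = len(gzip.compress(a))
    cb = len(gzip.compress(b))
    cab = len(gzip.compress(a + b))
    return (cab - min(ca, cb)) / max(ca, cb)

x = b"the quick brown fox jumps over the lazy dog"
y = b"the quick brown fox leaps over the lazy dog"
z = b"import gzip; print(gzip.compress(b'hello'))"
print(ncd(x, y))  # small: the two sentences share most of their structure
print(ncd(x, z))  # larger: prose and code compress poorly together
```

The perfect version (true Kolmogorov complexity) is uncomputable, which fits the intuition above that everything practical is heuristics and approximations.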
I don't want to speak on 'intelligence' or genuine reasoning or heuristics and approximations, but when it comes to going outside the bounds of their training data, it's pretty trivially possible to take an LLM and give it a problem related to a video game (or a mod for a video game) that falls well outside its knowledge cutoff or training date.
I can't test this right now, it's definitely not an optimal solution (see uploaded file for comparison), and I think it misinterpreted the Evanition operator, but it's a question that I'm pretty sure didn't have an equivalent on the public web anywhere until today. There's something damning in getting a trivial computer science problem either non-optimal or wrong, especially when given the full documentation, but there's also something interesting in getting one this close at all with so little information.
[screenshot: /images/17544296446888535.webp]
That is pretty impressive. Is it allowed to search the web? It looks like it might be. I think the canonical test I'm proposing would disallow that, but it is a useful step in general.
Huh.
Uploading just the Patterns section of the HexBook webpage and disabling web search looks better even on Grok 3, though that's just a quick glance and I won't be able to test it for a bit. EDIT: nope, several hallucinated patterns on Grok 3, including a number that break from the naming convention. And Grok 4 can't have web search turned off. Bah.
Have you tried simply asking it not to search the web? The models usually comply when asked. If they don't, it should be evident from the UI.
That's a fair point, and does seem to work with Grok, as does just giving it only one web page and requesting it to not use others. Still struggles, though.
That said, a lot of the 'thinking' steps in its logic are things like "The summary suggests list operations exist, but they're not fully listed due to cutoff.", getting confused by how Consideration/Introspection works (as start/end escape characters), or trying to recommend Concat Distillation, which doesn't exist but is a reasonable name (indeed, the one used in the code) for Speaker's Distillation. So it's possible I'm running into issues more with the way I'm asking the question, such that Grok's research tooling is preventing it from seeing the parts of the puzzle necessary to find the answer.
I tried using o3, but it correctly noted that the file you mentioned isn't available, and its web browsing tool failed when trying to use the website.
I can't do anything about the missing document, but I did manually copy and paste most of the website. This is its answer:
https://chatgpt.com/s/t_6892b68c0c3081919777d514df3ba8c2
That's a bit of a weird approach -- you're drawing 20 Hermes Gambits rather than having the code recurse, and the Gemini Decomposition → Reveal → Novice's Gambit could be simplified to just Reveal -- but it does work and fulfills the requirement. Can run it in this IDE if anyone wants, though you'll have to use the simplified version since Novice's Gambit (and Bookkeeper's Gambit) isn't supported there, but the exactly-as-ChatGPT'd version does work in-game (albeit an absolute pain to draw without a Focus).

That's kinda impressive. Both Rotation Gambit II and Retrospection are things I'd expect LLMs to struggle with.
Thanks for verifying the answer! If there's a takeaway here, it's that I'm not sure why you're paying for Grok 4. Grok 3 was genuinely impressive, and somewhat noticeably better than the competition at launch. Not the case here, I'm afraid.
That aside, I'm not sure how people can watch LLMs tackle problems of this complexity and still claim they're not reasoning. It bemuses me.