What is this place?

This website is a place for people who want to move past shady thinking and test their ideas in a court of people who don't all share the same biases. Our goal is to optimize for light, not heat; this is a group effort, and all commentators are asked to do their part.

The weekly Culture War threads host the most controversial topics and are the most visible aspect of The Motte. However, many other topics are appropriate here. We encourage people to post anything related to science, politics, or philosophy; if in doubt, post!

Check out The Vault for an archive of old quality posts. You are encouraged to crosspost these elsewhere.

Why are you called The Motte?

A motte is a stone keep on a raised earthwork common in early medieval fortifications. More pertinently, it's an element in a rhetorical move called a "Motte-and-Bailey", originally identified by philosopher Nicholas Shackel. It describes the tendency in discourse for people to move from a controversial but high value claim to a defensible but less exciting one upon any resistance to the former. He likens this to the medieval fortification, where a desirable land (the bailey) is abandoned when in danger for the more easily defended motte. In Shackel's words, "The Motte represents the defensible but undesired propositions to which one retreats when hard pressed."

On The Motte, always attempt to remain inside your defensible territory, even if you are not being pressed.

New post guidelines

If you're posting something that isn't related to the culture war, we encourage you to post a thread for it. A submission statement is highly appreciated, but isn't necessary for text posts or links to largely-text posts such as blogs or news articles; if we're unsure of the value of your post, we might remove it until you add a submission statement. A submission statement is required for non-text sources (videos, podcasts, images).

Culture war posts go in the culture war thread; all links must either include a submission statement or significant commentary. Bare links without those will be removed.

If in doubt, please post it!

Rules
Recommended Posts And Communities
Recommended Realtime Chats
- Astral Codex Ten Discord
- Quokka's Den Telegram

PaperclipPerfector 1d ago (text post) 583 thread views

Friday Fun Thread for July 3, 2026

Be advised: this thread is not for serious in-depth discussion of weighty topics (we have a link for that), and it is not for anything Culture War related. This thread is for Fun. You got jokes? Share 'em. You got silly questions? Ask 'em.

Shirayuki2 18hr ago · Edited 18hr ago

https://www.hyperstitionai.com/unslop-results

I came across the results of this contest to generate LLM-written stories; one of the participants was our very own @self_made_human. I'd be interested in any thoughts you have on participating in the contest.

My main takeaway after skimming the finalists was that while some of the concepts were interesting, even with a lot of prompting and harness effort, there's just not much difference from the sloppy prose and incoherent writing you get when one-shotting text out of an LLM.

It's hard to tell how much of this is just because writing isn't a focus at the LLM labs right now, but the fact that every LLM converges onto similar writing attractor states definitely makes me bearish on LLMs' ability to continue to generalize.

gattsuru Shirayuki2 11hr ago

Some of the weirdness is downstream of the contest parameters; I'd put some effort toward a short story using a recursive writing approach, but by the time it was remotely good prose, it wasn't clear it complied with the rules. Had similar issues with Phailyoor's challenge.

Some of it is definitely a toolset problem. The models themselves overwhelmingly aim toward <1k word 'chunks', and community efforts to bypass that, like WriteLonger, are very much band-aids. It's relatively easy to throw together a scaffold of scenes that interlace together, but you still get mini-climaxes at the end of each prompt, and that gets grating fast. Claude tries to do something under the hood to hide the problem, as do some agentic setups, but they seem to do so by just gluing segments together. Whether you do that automatically or manually, it still results in unpleasant crescendos, because the model thinks the scene or segment ends in places where the tension is supposed to be rising. Some prompt scaffolds meant to give better control over pacing only squeeze those crescendos closer together.

Like conventional art, there's a big problem where even most writers don't know the technical terms for what they're trying to do, and unlike conventional art with diffusers, writing in the Style of X doesn't work very well, and anything else takes a ton of input tokens. I've had some limited success by giving a handwritten example as input and then transferring the characters and background with them, but it's still not good, and it doesn't scale to longer works.

Context remains an issue. It's amazing that models can get into the 1-million-range, but few do well there. That's more of an issue with long-form writing -- this story isn't very good, sometimes in painful ways because it's close to it, but the prose quality goes from merely purple to outright blah by 30k tokens -- but it does still mean you can't just throw a hundred thousand words of setting and style bible into most models and get anything useful out.

Weirdly, the models are great at editing, both in broad strokes and at catching narrow typos, including smaller models and sometimes even for genres that should be really hard for a token predictor to understand. The outcomes aren't always in the form I like, but it's usually nothing awful, or even as bad as the original prose. But you can't just cycle the same text through an LLM with a prompt of "do it better": on top of the token costs, it tends to get stuck in cycles around a small subset of problems.

I keep hoping it'd be possible to get something better out of a more complicated script-based approach rather than a solely agentic one, but results have been mixed, and you end up with a thirty-page deep list of checkboxes for the models to review over and over.

Brainwavez Shirayuki2 14hr ago · Edited 13hr ago

My main takeaway: the entries have way too much purple prose that doesn't add anything positive and just makes them more convoluted. Some of it makes no sense, e.g. "he stands at her grave like a man patting his pockets on a platform". But even what does make sense evokes no emotion except irritation, e.g. "Not a bad taste. Worse than that. The underside of a clean plate. Water left out overnight. The dry corner of a stamp before you wet it." (maybe because it's "inauthentic", but also just bland. Why three metaphors? Why not save readers time and effort with "Not even bad, it just tasted like nothing."? And these awful metaphors are frequent: almost every sentence has or is part of some figurative expression).

At least most entries had a plot (except "The Bowl", which I couldn't figure out, sorry @self_made_human), but none that I found interesting. For example, the plot of the winner, "The June", is that a man has the ability to turn experiences into recipes/dishes, and the worse the experience, the better the dish tastes, so he breaks up with his wife to cook the tastiest dish, like a drug addict. Shoutout to N/A by elia.discourse, the only entry I felt was unique and not filled with useless metaphors, but maybe only because the LLM went full schizo.

Contrast that with LLMs' advancements and current ability in coding. I think the coding is even a step up from November 2025, while the creative writing hasn't improved even since GPT-4 in March 2023. Just last week I had an annoying build issue that probably would've taken hours to solve manually; in Claude Code I literally prompted "find and fix the build error" and it did that, using gdb on the command line (which is a PITA manually), in about 10 minutes. Later I asked Claude to implement a complex algorithm, and initially it wrote a bad implementation that overfit to the tests, but then I gave it a more detailed prompt with a high-level outline, and after a few small manual edits, it seems to work.

The market may be an influence, but I think the biggest reason for the difference is RLVR (reinforcement learning with verifiable rewards). Modern LLMs are trained on massive amounts of generated code whose output is objectively scored, hence they learn how to write code that passes tests. LLMs are also fed random internet text, but there's no scoring mechanism to distinguish between facts and falsehoods, or between subjective masterpieces and subjective garbage, except human feedback, which is relatively slow and inaccurate.
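
To make "objectively scored" a bit more concrete, here's a toy sketch of what a verifiable reward for code could look like: run a candidate against fixed test cases and return the fraction that pass. This is only an illustration under my own assumptions, not any lab's actual pipeline; `verifiable_reward`, the `add` candidates, and the test cases are all made up, and a real setup would sandbox the execution step.

```python
from typing import List, Tuple


def verifiable_reward(candidate_src: str, func_name: str,
                      cases: List[Tuple[tuple, object]]) -> float:
    """Return the fraction of test cases the candidate passes (0.0 to 1.0)."""
    namespace: dict = {}
    try:
        # Hypothetical shortcut: exec in-process. Untrusted code would need a sandbox.
        exec(candidate_src, namespace)
        func = namespace[func_name]
    except Exception:
        # Code that does not even load (syntax error, missing function) scores zero.
        return 0.0

    passed = 0
    for args, expected in cases:
        try:
            if func(*args) == expected:
                passed += 1
        except Exception:
            # A crash on a test case counts as a failure for that case.
            pass
    return passed / len(cases) if cases else 0.0


# Illustrative candidates a model might emit for "write add(a, b)".
good = "def add(a, b):\n    return a + b\n"
bad = "def add(a, b):\n    return a - b\n"
cases = [((1, 2), 3), ((0, 0), 0), ((-1, 1), 0)]

print(verifiable_reward(good, "add", cases))
print(verifiable_reward(bad, "add", cases))
```

The point is that the score needs no human judgment, so it can be computed millions of times. There is no equivalent function you can write for "is this metaphor any good".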

How do humans do it? Maybe because we have better physical training data through real-world interactions (we see, hear, touch, taste things; LLMs have words and comparatively few images and video). And because we directly experience emotions, maybe we learn the concepts associated when they hit, but LLMs can only learn associations between word groups, like if one of the groups has the emotion literally written (e.g. a human may experience XYZ followed by happiness, an LLM may be trained on "XYZ makes him happy" and associate XYZ with happiness; but humans don't know or write everything that makes them happy, so there are more instances of the former than the latter). Or maybe because quality is subjective, the average of many works that are high quality to different people is low quality to everyone.
