site banner

Friday Fun Thread for July 3, 2026

Be advised: this thread is not for serious in-depth discussion of weighty topics (we have a link for that), this thread is not for anything Culture War related. This thread is for Fun. You got jokes? Share 'em. You got silly questions? Ask 'em.

1
Jump in the discussion.

No email address required.

My main takeaway: the entries have way too much purple prose, that doesn't add anything positive, just makes them more convoluted. Some makes no sense, e.g. "he stands at her grave like a man patting his pockets on a platform". But even what does, evokes no emotion except irritation, e.g. "Not a bad taste. Worse than that. The underside of a clean plate. Water left out overnight. The dry corner of a stamp before you wet it." (maybe because it's "inauthentic", but also just bland. Why three metaphors? Why not save readers time and effort with "Not even bad, it just tasted like nothing." And these awful metaphors are frequent, almost every sentence has or is part of some figurative expression).

At least most entries had a plot (except "The Bowl" I couldn't figure out, sorry @self_made_human), but none I found interesting: for example, the winner "The June"'s plot is a man has the ability to turn experiences into recipes/dishes, the worse the experience the better the dish tastes, so he breaks up with his wife to cook the tastiest dish, like a drug addict. Shoutout to N/A by elia.discourse, the only entry I felt was unique and not filled with useless metaphors, but maybe only because the LLM went full schizo.


Contrast to LLM's advancements and current ability in coding. I think the coding is even a step up from November 2025, while the creative writing hasn't improved since even GPT-4 in March 2023. Just last week I had an annoying build issue that probably would've taken hours to solve manually; in Claude Code I literally prompted "find and fix the build error" and it did that, using gdb on the command-line (which is a PITA manually), in about 10 minutes. Later I asked Claude to implement a complex algorithm, and initially it wrote a bad implementation which overfit to the tests, but then I gave it a more detailed prompt with a high-level outline, and after a few small manual edits, it seems to work.

Market may be an influence, but I think the biggest reason for the difference is RLVR. Modern LLMs are fed massive amounts of generated code and their output objectively scored, hence they learn how to write code that passes tests. LLMs are fed on random internet text, but there's no scoring mechanism to distinguish between facts and falsehoods or subjective masterpieces and subjective garbage, except human feedback which is relatively slow and inaccurate.

How do humans do it? Maybe because we have better physical training data through real-world interactions (we see, hear, touch, taste things; LLMs have words and comparatively few images and video). And because we directly experience emotions, maybe we learn the concepts associated when they hit, but LLMs can only learn associations between word groups, like if one of the groups has the emotion literally written (e.g. a human may experience XYZ followed by happiness, an LLM may be trained on "XYZ makes him happy" and associate XYZ with happiness; but humans don't know or write everything that makes them happy, so there are more instances of the former than the latter). Or maybe because quality is subjective, the average of many works that are high quality to different people is low quality to everyone.