Do you have a dumb question that you're kind of embarrassed to ask in the main thread? Is there something you're just not sure about?
This is your opportunity to ask questions. No question too simple or too silly.
Culture war topics are accepted, and proposals for a better intro post are appreciated.

So, big week in models: in a matter of hours we got GPT-5.5, Deepseek V4, and Opus 4.7. What are your impressions so far? Mine: GPT-5.5 is the least underwhelming of the three, Deepseek is really capable at insane prices, and Opus 4.7, at least for me, is totally indistinguishable from 4.6.
Also, the limits are absurdly tight at the $20 tier.
Haven't really tried the other two much myself; the high costs scare me off, since I find it hard to believe the value is there for my use case. But I'm looking quite closely at Deepseek V4 Flash specifically, because its cost-to-performance seems pretty insane.
I mean, the official API price is $0.0028 per million input tokens (cache hit), $0.14 per million (cache miss), and $0.28 per million output tokens. It has a 1M context window, and since OpenRouter's stats show above a 90% cache hit rate, the effective weighted input price works out to only about $0.015 per million (quick sanity check of that math below). That's really crazy. I need to test a bit more to figure out where I'd place the intelligence exactly, but...
NONE of the current frontier 'cost-efficient' models come even remotely close to that. For comparison (input/output per 1M tokens): Gemini 3.1 Flash Lite is $0.25/$1.50, Gemini 3 Flash is $0.50/$3.00, GPT 5.4 Nano is $0.20/$1.25, GPT 5.4 Mini is $0.75/$4.50, and Claude Haiku 4.5 is $1.00/$5.00. Sure, allegedly it's not as good at coding as Gemini Flash, but also allegedly it's better at agentic workflows. Those are some pretty significant gaps, approaching an order of magnitude in some cases.
So yeah, Pro is also very cheap and that might make some waves, but contextually Flash is SUPER cheap. Like, obscenely so.
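The sanity check: the effective input price is just a weighted average of the cache-hit and cache-miss prices. A minimal sketch in Python (the 91% hit rate is my reading of "above 90%", not an exact OpenRouter number):

```python
# Blended input price per 1M tokens for Deepseek V4 Flash:
# a weighted average of the cache-hit and cache-miss prices.
CACHE_HIT_PRICE = 0.0028   # $ per 1M input tokens on a cache hit
CACHE_MISS_PRICE = 0.14    # $ per 1M input tokens on a cache miss

def blended_input_price(hit_rate: float) -> float:
    return hit_rate * CACHE_HIT_PRICE + (1 - hit_rate) * CACHE_MISS_PRICE

print(blended_input_price(0.90))  # 0.01652
print(blended_input_price(0.91))  # ~0.0151, i.e. the ~$0.015 figure
```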
This to me is a big deal, because part of what makes AI so compelling is the cost/benefit ratio. With a model like V4 Flash, especially for input-heavy workflows, there are plenty of scenarios where it's literally cheaper to throw five different approaches at the wall and pick the best than to make a single attempt with a model that's just a hair smarter (toy comparison below). We'll see how well it does against actual codebases and such, but it might enable a slightly different type and set of workflows than we're used to.
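To put numbers on the throw-five-at-the-wall point, here's a toy comparison using the blended V4 Flash input price from above and Claude Haiku 4.5 as a stand-in for the "hair smarter" model (the token counts are invented purely for illustration):

```python
# Hypothetical input-heavy job: 800k input tokens, 20k output tokens per attempt.
IN_TOKENS, OUT_TOKENS = 800_000, 20_000

def job_cost(input_price: float, output_price: float, attempts: int = 1) -> float:
    """Dollar cost, given per-1M-token prices and a number of attempts."""
    return attempts * (IN_TOKENS / 1e6 * input_price + OUT_TOKENS / 1e6 * output_price)

print(job_cost(0.015, 0.28, attempts=5))  # 5x V4 Flash: ~$0.09
print(job_cost(1.00, 5.00, attempts=1))   # 1x Haiku 4.5: ~$0.90
```

Best-of-five on the cheap model still comes in around a tenth of one pass on the pricier one, which is exactly what makes sampling-style workflows tempting.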
It's hard to say for sure these days, because especially with agentic coding the harnesses are so important (and what works for one setup often doesn't transfer well, including across generations of models). I'm curious whether someone will figure out a good way to leverage this new cost-benefit balance, because it potentially changes quite a bit how you might spin up subagents, for example. Although, as I mentioned, the model may just be too stupid to do a large enough range of useful work. We'll see; gotta figure out how much is benchmaxxing vs. inherent quality.
At the same time, Claude's tokenizer change probably helps efficiency and intelligence long-term, but short-term you're looking at a flat 10-30% increase in costs from higher token counts alone, before you even get into the token efficiency of the models themselves, at least per the numbers I was looking at initially.
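That short-term hit is purely multiplicative: same text, same per-token price, just more tokens billed. A trivial sketch (the 1.1-1.3x factors simply restate the 10-30% range; they're not measured numbers):

```python
def inflated_cost(base_cost: float, token_inflation: float) -> float:
    # Same prompt, same per-token price; the new tokenizer just emits more tokens.
    return base_cost * token_inflation

for factor in (1.1, 1.2, 1.3):
    print(f"{factor:.1f}x tokens: $100 of old usage now bills ${inflated_cost(100.0, factor):.0f}")
```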
Opus 4.7 seems to handle the stock ticker tests I do better than 4.6; I assume it's the new tokenizer. Otherwise, the only difference I notice is that it's more expensive to run at the same thinking level.
What tests are those?