Be advised: this thread is not for serious in-depth discussion of weighty topics (we have a link for that), this thread is not for anything Culture War related. This thread is for Fun. You got jokes? Share 'em. You got silly questions? Ask 'em.
- 121
- 1
What is this place?
This website is a place for people who want to move past shady thinking and test their ideas in a
court of people who don't all share the same biases. Our goal is to
optimize for light, not heat; this is a group effort, and all commentators are asked to do their part.
The weekly Culture War threads host the most
controversial topics and are the most visible aspect of The Motte. However, many other topics are
appropriate here. We encourage people to post anything related to science, politics, or philosophy;
if in doubt, post!
Check out The Vault for an archive of old quality posts.
You are encouraged to crosspost these elsewhere.
Why are you called The Motte?
A motte is a stone keep on a raised earthwork common in early medieval fortifications. More pertinently,
it's an element in a rhetorical move called a "Motte-and-Bailey",
originally identified by
philosopher Nicholas Shackel. It describes the tendency in discourse for people to move from a controversial
but high value claim to a defensible but less exciting one upon any resistance to the former. He likens
this to the medieval fortification, where a desirable land (the bailey) is abandoned when in danger for
the more easily defended motte. In Shackel's words, "The Motte represents the defensible but undesired
propositions to which one retreats when hard pressed."
On The Motte, always attempt to remain inside your defensible territory, even if you are not being pressed.
New post guidelines
If you're posting something that isn't related to the culture war, we encourage you to post a thread for it.
A submission statement is highly appreciated, but isn't necessary for text posts or links to largely-text posts
such as blogs or news articles; if we're unsure of the value of your post, we might remove it until you add a
submission statement. A submission statement is required for non-text sources (videos, podcasts, images).
Culture war posts go in the culture war thread; all links must either include a submission statement or
significant commentary. Bare links without those will be removed.
If in doubt, please post it!
Rules
- Courtesy
- Content
- Engagement
- When disagreeing with someone, state your objections explicitly.
- Proactively provide evidence in proportion to how partisan and inflammatory your claim might be.
- Accept temporary bans as a time-out, and don't attempt to rejoin the conversation until it's lifted.
- Don't attempt to build consensus or enforce ideological conformity.
- Write like everyone is reading and you want them to be included in the discussion.
- The Wildcard Rule
- The Metarule

Jump in the discussion.
No email address required.
Notes -
Your work is doing things in the most retarded way possible if they're forcing you to use local CPU inference only. I'm not one of the AI boosters around here but I do use models from the big US labs a fair bit at work. I can see the local models becoming more capable and even quite useful for local tasks, but there's no way I'm getting one to vomit up a project from nothing and expecting miracles. I have had asked for audits of old code and had it found bugs that I missed (as well as a lot of noise).
As far as I can tell,
llama.cppis a lot better thanollama. While it's much less of a turn-key experience, many models never seem to make it across toollama, and its inference is slower in my experience. It also seems to be slower at picking up newer developments, like multi-token prediction:For running local models, there are quite a few things to think about:
Q4_K_XLis usually a "sweet spot" most people go for, and here's someone claiming 80 tokens/sec withQwen3.6-35B-A3B-Q4_K_XLon a 12GB card: https://old.reddit.com/r/LocalLLaMA/comments/1t82zxv/80_toksec_and_128k_context_on_12gb_vram_with/ . Ignore the stuff about the MTP PR; that's since been merged tomaster.Then there's actually getting it running and doing something useful. For that I'll defer to the above guide on how to launch
llama-serverfor this model on a 12GB card. You'll then have to point OpenCode at your local server and see if it goes any better for you. No promises, my sense is that local stuff is on the edge of being "quite decent" and it's worth having a finger in the local model pie so when it does get genuinely good. I don't ever want to be locked into paying for subscription compute.Thanks for having an actionable and helpful reply. I'll give these things a try.
Don't get me wrong, the people who say these local models are "almost-Claude-tier" are still too starry-eyed. The model I recommended you -
Qwen3.6-A3B-Q4_K_XL- told me earlier that both the Linux kernel and Busybox used the autotools. It will confabulate as badly as a previous-gen frontier model but if you drop it in an established project where it can read stuff written by people with a clue, it can often be guided into doing useful things like push through refactors and updates where the compiler and tests can keep it on track.More options
Context Copy link
More options
Context Copy link
More options
Context Copy link