Contact Us
Sign In
Sign Up
Rules Admins Moderation Log Random Post Random User
What is this place?

This website is a place for people who want to move past shady thinking and test their ideas in a court of people who don't all share the same biases. Our goal is to optimize for light, not heat; this is a group effort, and all commentators are asked to do their part.

The weekly Culture War threads host the most controversial topics and are the most visible aspect of The Motte. However, many other topics are appropriate here. We encourage people to post anything related to science, politics, or philosophy; if in doubt, post!

Check out The Vault for an archive of old quality posts. You are encouraged to crosspost these elsewhere.

Why are you called The Motte?

A motte is a stone keep on a raised earthwork common in early medieval fortifications. More pertinently, it's an element in a rhetorical move called a "Motte-and-Bailey", originally identified by philosopher Nicholas Shackel. It describes the tendency in discourse for people to move from a controversial but high value claim to a defensible but less exciting one upon any resistance to the former. He likens this to the medieval fortification, where a desirable land (the bailey) is abandoned when in danger for the more easily defended motte. In Shackel's words, "The Motte represents the defensible but undesired propositions to which one retreats when hard pressed."

On The Motte, always attempt to remain inside your defensible territory, even if you are not being pressed.

New post guidelines

If you're posting something that isn't related to the culture war, we encourage you to post a thread for it. A submission statement is highly appreciated, but isn't necessary for text posts or links to largely-text posts such as blogs or news articles; if we're unsure of the value of your post, we might remove it until you add a submission statement. A submission statement is required for non-text sources (videos, podcasts, images).

Culture war posts go in the culture war thread; all links must either include a submission statement or significant commentary. Bare links without those will be removed.

If in doubt, please post it!

Rules
Recommended Posts And Communities
Recommended Realtime Chats
- Astral Codex Ten Discord
- Quokka's Den Telegram

ArjinFerman Tinfoil Gigachad 1mo ago (text post) 794 thread views

Tinker Tuesday for June 9th, 2026

This thread is for anyone working on personal projects to share their progress, and hold themselves somewhat accountable to a group of peers.

Post your project, your progress from last week, and what you hope to accomplish this week.

If you want to be pinged with a reminder asking about your project, let me know, and I'll harass you each week until you cancel the service.

Jump in the discussion.

No email address required.

KingOfTheBailey 1mo ago · Edited 1mo ago

Do we have many local LLM users here? I'm curious what people are doing: what models are people using? For what jobs? For what reason? On what hardware? With what runner?

As I mentioned to @WhiningCoil a few days ago, I mostly run a Qwen3.6-A3B-Q4_K_XL on llama.cpp's llama-server and connect to it from https://pi.dev/, using a Radeon 780M in my laptop. It's been decent for grinding through smaller coding jobs under close observation, though like any Chinese model it'll just give you the party line if you start asking it about Taiwan or Tiannemen Square. I've also been using a gemma4-26B-A4B for general questions about the world when I'm at session quotas. The other big reason I'm getting into this stuff is that I never want to be locked out by a subscription. Haven't looked at image or video generation at all.

Context

gattsuru KingOfTheBailey 1mo ago

I have a small ML server that I initially set up for some work stuff, and have since retrofitted for LLM and diffuser use. nVidia 3090, i5-14400, running between 128 GB RAM to 192 GB RAM depending on what else I've shut down. Squeaked in just before RAM prices spiked (and am kicking myself for not grabbing three or four more of the 64x2 kits), if you want to know the why on the weird RAM numbers.

I'll caveat that you just shouldn't expect Claude or even Grok-level outputs from local models on their own.

For LLMs runners, I've mostly stuck to llama-server (and forks) as well, after an initial and short-lived love-hate relationship with LMStudio. I have a few custom bits of code for sequencing larger grouped requests, but they're worse-than-vibe-code level stuff and basically just a UI and for loop. Toyed with SillyTavern, just in the hopes of getting better organization, but it's really heavily built for roleplay and I'm not that interested in it. I've looked at and played with some agentic-ish stuff in heavily sandboxed and airgapped environments, but when the best options are nanoclaw, hermes and odysseus, but when the least obnoxious one is powered by pewdiepie, there be dragons here.

Writing:

gemma4-26-A4B is a great editor, beta reader, and brainstorm sounding board. It's the closest to okay prose from a local model in its class, although beating the obvious AI tells out of it takes some effort and it's seldom very interesting. Also seems to have the best MTP assist (though I had the to build the atomic-turboquant variant llama variant to get MTP to work when it first came out; don't know if the situation has changed there).
[Cydonia](https://huggingface.co/TheDrummer/Cydonia-24B-v4) (24B, mistral-based) and ```Strawberry Limeade (70B, llama-based) are older models that were pretty useful and I'll still pull up on occasional to sanity-check stuff against.

Coding:

Qwen 3.6 is hard to beat for simple and fast work, especially things like bringing in an image sketch and converting it into XAML or webdev, or beating some simple file munging into shape. In addition to 35B-A3B, I'd also point to the 27B-MTP dense variant. It's not as fast as A3B, but the gap's smaller than you'd expect, and in some use cases it comes across as much smarter in my experience. I'd also recommend it any time you have a sketch or powerpoint art-level design you want converted into a GUI representation, and can't use a cloud model -- far from perfect, but easily saves hours of work.
GLM (ranging from 4.5-Air at 106B up) can be good for complex work, refactoring, and troubleshooting -- but even at moderate quants, it can be fifteen minutes per turn. Great where I've got a ton that needs to go into the hatch and can work on something different; terrible for anything where a fast OODA loop is important.

Standard vs uncensored/abliterated models is a hard question. Qwen is very refusal-prone, and not just on political topics. While Gemma4 is surprisingly willing to play along for a variety of topics, it still has some hard refusal points, some of which can come up surprisingly rapidly. And not just for weird smut, either. I've had .

For Image Generation, your two power user options are Automatic1111/Forge WebUI and ComfyUI. WebUI is the easier option to get started with, and still has a good level of support for things like img2img, swapping models out, or using various plugins or controlnets. ComfyUI's much more capable and eventually lets you do things like switch between models for different stages of a pipeline, but there's very much a 'who wants to drink from the firehose' moment every time you get started, and managing workflows sucks. On the other hand, if you want to run something like TRELLIS2 or Wan3d, ComfyUI's a lot easier (though not easy!) to set up.

In terms of models:

SDXL based models like the Illustrious and NoobAI family are good for producing general 'vibe'-ish scenes with one or two actors, so long as you don't need precision, and they're pretty fast.
The current new hotness popular options in the furry fandom are [Chroma](https://huggingface.co/lodestones/Chroma) (9B, FLUX.1-schnell-based) and [Anima](https://huggingface.co/circlestone-labs/Anima) (2B), which favor natural language over the SD-style "throw a bunch of words at it" approach. Much slower, though. Qwen Image 2512 variants also fall here, although their workflows can be a lot more annoying.
Qwen Image Edit and Flux2-Klein are the best edit models, especially for keeping consistency in a scene while tweaking it, or moving a character from one setting to another.

2D->3D Models:

TRELLIS(2) gives the nicest-looking outputs for a given input image, and supports(ish) transparency.
Wan3D gives more 'whole' models that require less post-processing to ship to a 3d printer, but tends to be a little fuzzy.

Animation:

WAN2 has the most support and has been out the longest.
LTX-2.3 is much faster and comparable or better quality.
SCAIL2 just came out, but I haven't even tried to set it up yet. Initial reports look good.

I had a hell of a time getting any non-trivial animation model working in Forge WebUI, and the ComfyUI workflows get nutty pretty fast. If you want to experiment with them and not go leaping into the deep end, WAN2GP gives a lot of workflow options, at the cost of sometimes serious performance costs and a bad tendency to automatically download a model without warning.

Context

What is this place?

This website is a place for people who want to move past shady thinking and test their ideas in a court of people who don't all share the same biases. Our goal is to optimize for light, not heat; this is a group effort, and all commentators are asked to do their part.

The weekly Culture War threads host the most controversial topics and are the most visible aspect of The Motte. However, many other topics are appropriate here. We encourage people to post anything related to science, politics, or philosophy; if in doubt, post!

Check out The Vault for an archive of old quality posts. You are encouraged to crosspost these elsewhere.

Why are you called The Motte?

A motte is a stone keep on a raised earthwork common in early medieval fortifications. More pertinently, it's an element in a rhetorical move called a "Motte-and-Bailey", originally identified by philosopher Nicholas Shackel. It describes the tendency in discourse for people to move from a controversial but high value claim to a defensible but less exciting one upon any resistance to the former. He likens this to the medieval fortification, where a desirable land (the bailey) is abandoned when in danger for the more easily defended motte. In Shackel's words, "The Motte represents the defensible but undesired propositions to which one retreats when hard pressed."

On The Motte, always attempt to remain inside your defensible territory, even if you are not being pressed.

New post guidelines

If you're posting something that isn't related to the culture war, we encourage you to post a thread for it. A submission statement is highly appreciated, but isn't necessary for text posts or links to largely-text posts such as blogs or news articles; if we're unsure of the value of your post, we might remove it until you add a submission statement. A submission statement is required for non-text sources (videos, podcasts, images).

Culture war posts go in the culture war thread; all links must either include a submission statement or significant commentary. Bare links without those will be removed.

If in doubt, please post it!

Rules

Recommended Realtime Chats

Link copied to clipboard

Action successful!

Error, please try again later.