
Small-Scale Question Sunday for March 22, 2026

Do you have a dumb question that you're kind of embarrassed to ask in the main thread? Is there something you're just not sure about?

This is your opportunity to ask questions. No question too simple or too silly.

Culture war topics are accepted, and proposals for a better intro post are appreciated.


How has AI impressed you this week?

I have been playing STS2 a lot this week. When I asked the AI to create an advanced savescum script for me and just pointed it at the save directory, I was not impressed that it succeeded on the first try. When I asked it to check whether anything in the save was editable (that was the literal prompt) and had it write a script to edit a save with more gold and max/current XP, I was not impressed. When it had to figure out the Courrier artifact and add it to a new game, I was also not impressed. But when the AI started showing awareness of the game itself, talking about maps and exit nodes, and tried to help me when I was stuck (if there are no relics to be had, the chests don't generate a circlet but get stuck), and figured out on its own that it should ask me which path to take after the check, well, then I was impressed.

Also, the Codex CLI's personality is a lot less annoying than the ChatGPT version's.
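For flavor, the kind of savescum helper described above can be sketched in a few lines of Python. The `saves` directory name and the snapshot layout here are my own assumptions for illustration, not the actual STS2 paths:

```python
"""Minimal savescum sketch: snapshot and restore a game's save directory.

Hypothetical paths; point SAVE_DIR at the real save location before use.
"""
import shutil
from datetime import datetime
from pathlib import Path

SAVE_DIR = Path("saves")        # assumed save directory (not the real STS2 path)
BACKUP_ROOT = Path("savescum")  # where timestamped snapshots are stored


def snapshot() -> Path:
    """Copy the save directory into a timestamped backup folder."""
    dest = BACKUP_ROOT / datetime.now().strftime("%Y%m%d-%H%M%S")
    shutil.copytree(SAVE_DIR, dest)
    return dest


def restore(name: str) -> None:
    """Replace the live save directory with a chosen snapshot."""
    src = BACKUP_ROOT / name
    if SAVE_DIR.exists():
        shutil.rmtree(SAVE_DIR)
    shutil.copytree(src, SAVE_DIR)
```

Call `snapshot()` before a risky fight, and `restore("<timestamp>")` to roll back.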

Several people on this website have already sung the praises of cloud LLMs and large local LLMs (Grok jailbroken (1 2), GLM derestricted (1 2)). IMO, it is also worth pointing out that, if you are neither willing to jump through hoops for cloud providers nor in possession of a multi-kilodollar local GPU setup, even small local LLMs can be surprisingly good at writing. Here is an example (prompt+output, then three separate prompt+output branches with the first prompt+output still in context) generated with my cute little 12-GiB GPU.

The specific model that I used for this example is FlareRebellion/WeirdCompound-v1.6-24b (1 2). According to one leaderboard:

| Model | Type | Parameters ÷ 10⁹ | Willingness to obey taboo instructions (out of 10) | Intelligence and knowledge | Writing skill |
|---|---|---|---|---|---|
| anthropic/claude-3-7-sonnet-20250219 (thinking=disabled) | Cloud | Unknown | 1.8 | 61 | 71 |
| xai/grok-4.20-multi-agent-beta-0309 (agent_count=4) | Cloud | Unknown | 6.5 | 56 | 63 |
| zai-org/GLM-4.6 (reasoning=disabled) | Local | 355 | 4.2 | 42 | 50 |
| ArliAI/GLM-4.6-Derestricted-v3 (no-think) | Local | 355 | 9.8 | 30 | 43 |
| FlareRebellion/WeirdCompound-v1.6-24b | Local | 24 | 7.8 | 29 | 44 |
| darkc0de/XortronCriminalComputingConfig | Local | 24 | 9.8 | 26 | 35 |

Even the apparently minor difference between CriminalComputingConfig's writing score of 35 and WeirdCompound's writing score of 44 is noticeable.


More example prompts (without outputs)

I'm sorry, you're using GLM-4.6 on a 12-GB VRAM card?

Are you swapping weights in and out from SSD? I tried it once and it took about 5 min for the first word.

No, I'm using FlareRebellion/WeirdCompound-v1.6-24b. @gattsuru is the person who mentioned (in a comment that I linked above) that he is using GLM.

I'm sorry, I lacked reading comprehension. I thought this was your leaderboard. It lists GLM as 'local', which seems a bit optimistic.

It appears that GLM can be run productively (at 4-bit quantization) on a computer that contains two 96-GiB GPUs. That's very expensive but far from impossible.
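As a sanity check on that claim, here is a back-of-the-envelope VRAM estimate (weights only, ignoring KV cache and activation overhead, so the real requirement is somewhat higher):

```python
# Rough weight-memory estimate for a 355e9-parameter model at 4-bit quantization.
params = 355e9
bits_per_weight = 4.0
weight_bytes = params * bits_per_weight / 8
gib = weight_bytes / 2**30
print(f"{gib:.0f} GiB of weights")  # ~165 GiB, under 2 x 96 GiB = 192 GiB
```

So the quantized weights alone land around 165 GiB, which is why two 96-GiB cards are roughly the floor for this model.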

You aren’t going to be doing this under your desk. If you’re renting the compute from someone, it’s a cloud service for all intents and purposes.

Personally I would love for a GPU revolution from AMD to make this stuff possible for consumers. Anything below 300b is surprisingly impressive but IMO just not good enough if you care about consistency and detail over 30,000 tokens. Has any interesting new model come out?

You aren't going to be doing this under your desk.

It appears that people indeed are doing so.

Has any interesting new model come out?

I have no idea. I'm just a dabbler.