Small-Scale Question Sunday for April 26, 2026

Do you have a dumb question that you're kind of embarrassed to ask in the main thread? Is there something you're just not sure about?

This is your opportunity to ask questions. No question too simple or too silly.

Culture war topics are accepted, and proposals for a better intro post are appreciated.


So, big week in models: in a matter of hours we got GPT-5.5, Deepseek V4, and Opus 4.7. What are your impressions so far? Mine: GPT-5.5 is the least underwhelming of the three. Deepseek is really capable at insane prices, while Opus 4.7, at least for me, is totally indistinguishable from 4.6.

Also, the usage limits are absurdly tight at the $20 tier.

Opus 4.7 seems to handle the stock-ticker tests I run better than 4.6 did. I assume it's the new tokenizer. Otherwise, the only difference I notice is that it's more expensive to run at the same thinking level.

What tests are those?

I'm not going to go into detail, but the basic gist of it is that exchange ticker symbols tend to be short strings with a lot of overlap. Despite the textual overlap, each one has a distinct identity and they are not interchangeable. While some LLMs deal with that kind of thing better than others, they all tend to have problems that get worse as the context window fills up, and compaction tends to cause problems as well.

As a toy example that isn't much of a problem anymore, BND and BNDW are not the same thing. They have different holdings, different rates of return, and different tax implications. That little W at the end means a lot, but next token prediction can have a hard time with it.
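To make the "little W" point concrete, here's a toy sketch of greedy longest-match subword tokenization. The vocabulary here is completely made up for illustration (no real model's tokenizer is being reproduced), but it shows how two tickers differing by one character can tokenize into overlapping pieces, which is one plausible reason next-token prediction blurs them together:

```python
# Hypothetical subword vocabulary -- NOT any real model's tokenizer.
VOCAB = {"BND", "BN", "B", "N", "D", "W"}

def tokenize(text: str) -> list[str]:
    """Greedy longest-match tokenization against VOCAB."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest remaining piece first, shrinking until a match.
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in VOCAB:
                tokens.append(piece)
                i = j
                break
        else:
            raise ValueError(f"no vocabulary entry covers {text[i]!r}")
    return tokens

print(tokenize("BND"))   # ['BND']
print(tokenize("BNDW"))  # ['BND', 'W']
```

Under this made-up vocabulary, BNDW is literally "BND plus one extra token," so a model that drops or down-weights that trailing token collapses two distinct funds into one.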