
Friday Fun Thread for February 20, 2026

Be advised: this thread is not for serious in-depth discussion of weighty topics (we have a link for that), this thread is not for anything Culture War related. This thread is for Fun. You got jokes? Share 'em. You got silly questions? Ask 'em.


The path to ubiquitous AI (17k tokens/sec): A company (Taalas) just announced a chip that runs LLMs very fast: according to their graph, 8.5x as fast as Cerebras, which is in turn 5.6x as fast as Nvidia. Try it for yourself. It's running Llama 3.1 8B, so it's rather dumb, but the answers are nearly instant. Allegedly it's also much cheaper (10x) than GPUs. A downside is that the model is hard-wired into the chip, allegedly two months from model to production.
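For a rough sense of scale, here is a back-of-the-envelope sketch that just chains the quoted ratios. It takes the 17k tokens/sec figure and both speedup factors at face value (they are the company's claims, not measurements), and the 500-token answer length is an arbitrary assumption for illustration. Under those assumptions the implied baselines come out to about 2,000 tokens/sec for Cerebras and about 357 tokens/sec for Nvidia.

```python
# All three figures are claims quoted in the post, not independent measurements.
CHIP_TOKENS_PER_SEC = 17_000
SPEEDUP_OVER_CEREBRAS = 8.5
CEREBRAS_SPEEDUP_OVER_NVIDIA = 5.6


def implied_throughputs(chip_tps, over_cerebras, cerebras_over_nvidia):
    """Work backwards from the chip's throughput to the implied baselines."""
    cerebras_tps = chip_tps / over_cerebras
    nvidia_tps = cerebras_tps / cerebras_over_nvidia
    return {"chip": chip_tps, "cerebras": cerebras_tps, "nvidia": nvidia_tps}


def seconds_for_answer(tokens, tokens_per_sec):
    """Time to stream an answer of the given length, ignoring startup latency."""
    return tokens / tokens_per_sec


throughputs = implied_throughputs(
    CHIP_TOKENS_PER_SEC, SPEEDUP_OVER_CEREBRAS, CEREBRAS_SPEEDUP_OVER_NVIDIA
)

# Chaining the two ratios gives the implied speedup over Nvidia.
combined_speedup = SPEEDUP_OVER_CEREBRAS * CEREBRAS_SPEEDUP_OVER_NVIDIA

ANSWER_TOKENS = 500  # hypothetical answer length, chosen only for illustration
for name, tps in throughputs.items():
    seconds = seconds_for_answer(ANSWER_TOKENS, tps)
    print(f"{name}: {tps:,.0f} tokens/sec, {seconds:.3f} s per {ANSWER_TOKENS} tokens")
print(f"implied speedup over Nvidia: {combined_speedup:.1f}x")
```

So the two ratios multiply out to roughly 48x over the Nvidia baseline, if the graph is to be believed.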

Any use cases that aren't possible with today's (relatively) slower and more expensive models? Perhaps you put this on a router to have a very smart firewall. Or have it repeatedly generate code and fix bugs until a test suite passes, which Opus and Codex already do, though they can take a while. Then again, it's not instant, and frontier models already generate text very fast, much faster than a human can write or even read.
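The generate-and-fix idea could look something like the sketch below. Everything here is hypothetical: `repair_until_passing`, `fake_generate`, and `run_tests` are made-up names, and `fake_generate` is a stub standing in for a call to a fast model, not any real API. The point is only that with very cheap, very fast generation you can afford many trips around this loop.

```python
from typing import Callable, Optional


def repair_until_passing(
    generate: Callable[[str], str],
    run_tests: Callable[[str], Optional[str]],
    task: str,
    max_attempts: int = 20,
) -> Optional[str]:
    """Ask for code, test it, and feed failures back until the tests pass."""
    prompt = task
    for _ in range(max_attempts):
        candidate = generate(prompt)
        failure = run_tests(candidate)
        if failure is None:
            return candidate
        # Include the failed attempt and the error so the next try can fix it.
        prompt = (
            f"{task}\n\nPrevious attempt:\n{candidate}\n\nTest failure:\n{failure}"
        )
    return None  # gave up after max_attempts


def fake_generate(prompt: str) -> str:
    """Stub for a model call: buggy on the first try, fixed once it sees a failure."""
    if "Test failure" in prompt:
        return "def add(a, b):\n    return a + b\n"
    return "def add(a, b):\n    return a - b\n"


def run_tests(source: str) -> Optional[str]:
    """Tiny stand-in for a test suite: returns None on success, else a message."""
    namespace: dict = {}
    try:
        exec(source, namespace)  # fine for a toy; sandbox real model output
        result = namespace["add"](2, 3)
    except Exception as exc:
        return f"error: {exc!r}"
    return None if result == 5 else f"add(2, 3) returned {result}, expected 5"


fixed = repair_until_passing(fake_generate, run_tests, "Write add(a, b).")
print(fixed)
```

A real version would want a sandbox around the generated code and a cap on total tokens, not just on attempts.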

"A downside is that the model is hard-wired into the chip, allegedly two months from model to production."

Two months is (almost) the gap between Claude Opus 4.1 and 4.6 (skipping over 4.5), or between GPT 5.1 and GPT 5.3 (kind of, since 5.3 is a restricted release).

There's probably a niche for it, but it likely won't become a core piece of the landscape, even discounting the cost.