Many harnesses now support subagent calling. I built my own from pi-agent plus a small set of packages, and recommend doing the same, because everything else feels bloated, excessively opinionated, and cache-inefficient. pi has also been shown to sometimes perform better with GPT 5.x than Codex itself, so it's not much of a compromise. OpenCode is supposedly good now, but I don't have the patience to check it out. If you want server-side execution without being locked into a specific lab's ecosystem, I've heard credible praise for Factory Droid.
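For reference, the subagent pattern itself is simple: the orchestrator model gets a tool that spawns a fresh worker call with an isolated context. Here's a minimal sketch in Python against a generic OpenAI-compatible endpoint; the model names and the `spawn_subagent` tool are placeholders I made up, not pi-agent's actual API:

```python
# Sketch of the subagent pattern: the orchestrator model gets a
# spawn_subagent tool; each call runs a cheap worker model on an
# isolated subtask, keeping the orchestrator's context small.
# Model names and the tool itself are placeholders, not pi-agent's API.
import json

from openai import OpenAI

client = OpenAI()  # any OpenAI-compatible endpoint works via base_url

SUBAGENT_TOOL = {
    "type": "function",
    "function": {
        "name": "spawn_subagent",
        "description": "Delegate a self-contained subtask to a worker model.",
        "parameters": {
            "type": "object",
            "properties": {"task": {"type": "string"}},
            "required": ["task"],
        },
    },
}

def run_subagent(task: str) -> str:
    # Fresh call, no shared history: the subtask never pollutes the
    # orchestrator's context (or its prompt cache).
    resp = client.chat.completions.create(
        model="worker-model",  # placeholder for the cheap workhorse
        messages=[{"role": "user", "content": task}],
    )
    return resp.choices[0].message.content

def orchestrate(goal: str) -> str:
    messages = [{"role": "user", "content": goal}]
    while True:
        resp = client.chat.completions.create(
            model="orchestrator-model",  # placeholder for the planner
            messages=messages,
            tools=[SUBAGENT_TOOL],
        )
        msg = resp.choices[0].message
        if not msg.tool_calls:
            return msg.content
        messages.append(msg)  # keep the tool-call turn in history
        for call in msg.tool_calls:
            task = json.loads(call.function.arguments)["task"]
            messages.append({
                "role": "tool",
                "tool_call_id": call.id,
                "content": run_subagent(task),
            })
```

Keeping each subtask out of the orchestrator's history is also what keeps its prompt cache small and stable.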
One note: all this is likely not viable for your scenario, because DeepSeek's cache pricing is limited to their own first-party API (Chinese). I am hopeful that in time Western providers, with their superior hardware, will also figure out effective cache compression and serving from disk, or be compelled to grow more generous if they have figured it out already. But this day is not yet here. (Claude Code plans treat cache as free, but they have obvious usage limits.)
I mean aggregated across multiple models. I don't use Claude much anymore (only occasionally via API through OpenRouter) and don't think their offering is economically viable. Right now, GPT 5.5 is the orchestrator and DeepSeek V4 Pro/Flash is the workhorse. Their 1M context and new prices, especially context caching, make long agentic projects basically free. The cache persists for a whole day too, so speed is not an issue if the top-level plan is reasonable.
While it is true that high AI performance, and the automation that makes certain busywork obsolete, will cause some demand destruction, there are so many other ways to use tokens.
It's been a slow day. I've "used" something like 75 million tokens. Of those, maybe 72 million were cache reads, true, but there were also about 2.5M input and 500K output tokens. If you look at modern benchmarks like Artificial Analysis or MathArena, you'll see that even the very best models spend tens of thousands of tokens per problem. We have enough problems. The cheaper intelligence gets, the more problems become economical to solve by throwing tokens at them.
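To make the arithmetic concrete, a back-of-the-envelope sketch. The per-million-token prices below are placeholders I picked for illustration, not anyone's real rate card; the point is that cache reads dominate the volume but not the bill:

```python
# Back-of-the-envelope cost for the day described above. Prices are
# hypothetical (USD per million tokens), not a real rate card.
PRICE_PER_M = {
    "cache_read": 0.015,  # assumed deep-discount cache-read rate
    "input": 0.25,        # assumed uncached-input rate
    "output": 1.00,       # assumed output rate
}

usage = {"cache_read": 72_000_000, "input": 2_500_000, "output": 500_000}

cost = sum(tokens / 1e6 * PRICE_PER_M[kind] for kind, tokens in usage.items())
print(f"~{sum(usage.values()) / 1e6:.0f}M tokens for about ${cost:.2f}")
# 72 * 0.015 + 2.5 * 0.25 + 0.5 * 1.00 ≈ $2.21 at these assumed rates
```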
Better yet, look here.

You can manually enable any third-party model in Codex if the model's endpoint offers a Responses API. DeepSeek and a few others don't, but I see there's a converter.
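The converter idea in miniature: accept a Responses-style request and forward it upstream as a plain chat completion. A toy sketch, assuming an OpenAI-compatible upstream; it handles only a plain-string `input` and non-streaming replies, where a real converter has to cover the full schema:

```python
# Toy Responses -> Chat Completions shim. Assumes the upstream
# (here DeepSeek) speaks the OpenAI chat format; handles only a
# plain string `input` and non-streaming replies.
import os

import httpx
from fastapi import FastAPI, Request

app = FastAPI()
UPSTREAM = "https://api.deepseek.com/chat/completions"

@app.post("/v1/responses")
async def responses(req: Request):
    body = await req.json()
    chat_req = {
        "model": body["model"],
        "messages": [{"role": "user", "content": body["input"]}],
    }
    async with httpx.AsyncClient() as client:
        r = await client.post(
            UPSTREAM,
            json=chat_req,
            headers={"Authorization": f"Bearer {os.environ['DEEPSEEK_API_KEY']}"},
        )
    text = r.json()["choices"][0]["message"]["content"]
    # Minimal Responses-shaped reply: one assistant message, one text part.
    return {
        "object": "response",
        "output": [{
            "type": "message",
            "role": "assistant",
            "content": [{"type": "output_text", "text": text}],
        }],
    }
```

Point a harness at this shim's base URL and it sees something Responses-shaped, which is the whole trick.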