
Culture War Roundup for the week of May 18, 2026

This weekly roundup thread is intended for all culture war posts. 'Culture war' is vaguely defined, but it basically means controversial issues that fall along set tribal lines. Arguments over culture war issues generate a lot of heat and little light, and few deeply entrenched people ever change their minds. This thread is for voicing opinions and analyzing the state of the discussion while trying to optimize for light over heat.

Optimistically, we think that engaging with people you disagree with is worth your time, and so is being nice! Pessimistically, there are many dynamics that can lead discussions on Culture War topics to become unproductive. There's a human tendency to divide along tribal lines, praising your ingroup and vilifying your outgroup - and if you think you find it easy to criticize your ingroup, then it may be that your outgroup is not who you think it is. Extremists with opposing positions can feed off each other, highlighting each other's worst points to justify their own angry rhetoric, which becomes in turn a new example of bad behavior for the other side to highlight.

We would like to avoid these negative dynamics. Accordingly, we ask that you do not use this thread for waging the Culture War. Examples of waging the Culture War:

  • Shaming.

  • Attempting to 'build consensus' or enforce ideological conformity.

  • Making sweeping generalizations to vilify a group you dislike.

  • Recruiting for a cause.

  • Posting links that could be summarized as 'Boo outgroup!' Basically, if your content is 'Can you believe what Those People did this week?' then you should either refrain from posting, or do some very patient work to contextualize and/or steel-man the relevant viewpoint.

In general, you should argue to understand, not to win. This thread is not territory to be claimed by one group or another; indeed, the aim is to have many different viewpoints represented here. Thus, we also ask that you follow some guidelines:

  • Speak plainly. Avoid sarcasm and mockery. When disagreeing with someone, state your objections explicitly.

  • Be as precise and charitable as you can. Don't paraphrase unflatteringly.

  • Don't imply that someone said something they did not say, even if you think it follows from what they said.

  • Write like everyone is reading and you want them to be included in the discussion.

On an ad hoc basis, the mods will try to compile a list of the best posts/comments from the previous week, posted in Quality Contribution threads and archived at /r/TheThread. You may nominate a comment for this list by clicking on 'report' at the bottom of the post and typing 'Actually a quality contribution' as the report reason.


AI timeline post

I have an idea. Let's post year-by-year AI timelines. That way we have public predictions and in 1 to 3 years we can see who is better at it.

2027: development begins on a startup that targets corporate managers with a fully agentic microservice creator. It promises to replace at least 5 devs with one who runs the creator. It's basically a scam like most startups, a UI gloss on claude opus 4.8, which is only 10% better than 4.5, but it panders to non-technicals so it gets a ton of seed funding. AI solves maybe 10 niche, self-value giving math problems that are considered „impressive” by the kinds of people who put calories into math which never seem to come out of math. This contributes to marketing hype because some of these people are respected by rich tech funders and finance bros for whatever reason. AI is still completely useless at real scientific research and reasoning.

2028: tech layoffs continue as the startup I mentioned enters its beta stage. It makes some bugs on the websites that engineers still need to fix but pilots of it are 5x or more compared to 2018. Tech people start pivoting to finance and robotics slowly, from JavaScript. DeepSeek releases a model that is equivalent to 5.3 Codex. General coding ability stalls and Anthropic + OpenAI are looking for spiky improvements to models plus tooling ideas. They begin to think about targeting non-coding white collar work like finance and spreadsheet work since the models are not getting much better at JavaScript, having used up all of the JavaScript data in the entire world. Some math problems continue to be solved as models are secretly trained on newly parsed math examples but nothing comes of it. One contribution is made to theoretical physics but it is on the abstract side and it is still controversial as to whether these models can do any science.

2029: Someone is arrested in the United States for plotting a terror attack with GPT. To nobody's surprise, GPT and Anthropic messages are completely unprivate and the United States secret police have been monitoring them with cooperation from the companies and zero warrants whatsoever for years. The scary part is that GPT gave a 120 IQ plan to a 90 IQ person and it could have led to more deaths than whatever 90 IQ plan he would have come up with on his own. Frontier models begin to entshittify as they are increasingly jailed to make them safer, while their private reputation is shattered among all of the normies who did not know how rights-violating the United States secret police are. More attention is turned to local models as a result, but these are hard to run on normal hardware and the best is Sonnet 4.6 level at this point and requires $10,000 worth of GPU machinery. In addition AI is increasingly being aimed at normiejobs instead of aspergers jobs. Every good normal, local football team loving, sydney sweeney gooning person hates people with aspergers and loves to see them suffer so they got off on viciously and narrowly aiming to automate their programming jobs, which are still not gone yet, but they're getting really defensive about the new AI tutors, AI nursing assistants, AI portfolio advisors, AI middle manager helpers, AI emailers and spreadsheet workers, and so on. The Blessed Sovereign People of the United States are now grumbling about regulations of their domestic AI systems which was heretofore dismissed as impossible because China has a model that is 3 years behind (really they just felt comfy at the idea of getting programmers, who all have some variant of aspergers, to dig ditches without there being any other economic side effects). Basically no improvements of relevance are made but Opus 5.0 drops along with GPT 6.0. The models are really only 10% better than GPT 5.5 at making JavaScript code but are 20% more expensive and slower (billing at 10x on Copilot, while 1x is still Sonnet 4.8, which is not better than 4.6), because they mostly rely on innovations in prompt chaining. Scott Alexander claims exponential performance based on some bad, but widely accepted performance metrics, and claims he was right that AI would end the world based on this year's secret police activities.

2030: Someone proves the Riemann hypothesis after spending $30,000 worth of tokens over 2 years. This is taken as proof that AI is super intelligent by Scott Alexander, who declares victory. 80% of web developer jobs remain from 2025. 50% of people are majoring in computer science as before. AI still has not made any scientific breakthroughs, but an experimental startup is working on tooling to let AI collect biological data and analyze it, talking about a 2040 cancer cure. Applications of AI to non-webdev are generally taking off in the startup space and people slowly quit their webdev jobs to work on these. There's robotics, scientific research, and automation of normiejobs being targeted. VCs are funding this with money made from the AI boom. Model makers focus on training edgy models for these particular tasks by mining non-software data from various sources. Some startups focus on making data collection tools that could be deployed in workplaces to monitor activities so that LLMs can be improved at non-software tasks.

2031: Some edgy models are rolled out but they are only available to licensed researchers, citing fears of some other entity getting one over on the United States with these models, because it is the most perfect union of the most perfect people in the entire universe, as everyone knows, where all of the state violence is justice and all of the state spying is privacy and all of the state socialism is capitalism and all of the imperialism is democracy and all of the power is earned. It is revealed that Palmer Luckey, who loves the United States and its perfect people and perfect government the most, has secretly developed an AI terrorist killing model that promises to automatically fly drones through terrorist windows, hopefully only outside of the perfect borders of the United States (but just you wait on the policing applications!) which then dispatch justice upon the foreign terrorist, United States style. By the end of the year, some licensed academics are saying the models have their flaws, but speed up their research pipelines a significant amount.

2032: 80% of people still have their webdev job, and now they are about 5x more effective each. It turns out the demand for webdev has 4x'd. Places like Japan are receiving modern websites for the first time. Most ex-webdevers reallocated to finance and non-webdev AI startups. GPT 5.5 now only bills at 3x the base Copilot price. 6.9 is out and it's about 20% better than 5.5 but it costs 25x and takes forever. It's pretty clear to most people the general improvements to coding are dried up and a lot of the old hype was fake and tooling and chaining was the internal, secret meta from 2025 onwards. Some still believe performance increases are exponential because benchmarks have only barely started to slow down. They find it convincing that AI is doing „qualitatively different” tasks than it was 5 years ago across domains. AI researchers begin to use frontier statistics models to search for new AI structures. Just like LLMs, it's a terribly boring task, mainly consisting of a random walk around state space. A general theory of AI is still not developed and it seems LLMs will not be able to develop one on their own. $15,000 worth of tokens nets proof of the Hodge conjecture. Scott Alexander takes this as proof that AI is getting exponentially better at math. By 2040 it will only cost 10% of the price of rent in SF to solve a Millennium Prize problem with an AI system, he says. Elon Musk funds a startup to use LLMs to reverse engineer the brain, thinking this will lead to true AGI. It is very difficult to say the least.

I will stop here but I think the meta will be using LLMs to do dirty work in science to maybe get a second wave of AI starting around 2040 if we are lucky. This will be based on solid theory and brain reverse engineering. This might yield general work bots in the 2050s or 2060s.

July 2026: A new tranche of significantly better models

October 2026: A new tranche of significantly better models

(this repeats every 3 months and each time people complain that there was degradation because their tone, behaviour and failure modes are different each time. Simultaneously the hyperscalers get more and more AI revenue and invest more and more into AI hardware.)

Late 2026 Copilot or Word introduces an 'automated proofreading' button that shifts the mainstream white collar conception of AI from 'wtf is this popup in Adobe that wants to summarize a PDF, I don't want a summary of this PDF. I want to see every tech company sundered and razed' to 'ok this is actually quite handy'. This could've happened at any time in the last 2 years if Microsoft had a clue of what they were doing. Human blundering prevails over technical possibilities for now.

End 2026 there's a series of major AI-enabled cyberattacks that just never stops; it resembles 'Trommelfeuer' (WW1 term denoting when the artillery fire is so heavy one blast merges into the next creating a continuous roar of explosions). Websites, especially older websites, are just down all the time and people are quite frustrated they now need to pay a hyperscaler for expensive security assistance. Same with all the very lifelike, convincing, highly researched and well-planned scam calls (now in a warm, English-speaking accent). People are trapped into this love-hate relationship with AI where they have AI make propaganda art against datacentres, where the average person scarcely does anything novel without AI advice or assistance but also despises the effect it's having on society.

Early 2027 Microsoft is finally going to make an AI buddy for Minecraft to help sell a monthly xbox subscription. It'll be fun to play with and will help reenergize Minecraft's brand. People will feel proud they know more of the intricacies of TNT cannons than the bots, not realizing this is amongst the cheapest AIs Microsoft deploys. A series of AI agents emerge that can play most games at an amateurish level and be talked to. The reputation of AI begins to improve somewhat amongst the young and online, though it's highly divisive.

Mid 2027 the Goonpocalypse: AI avatar big tiddy anime girls (+ every flavour of girl and boy) to ERP with and form relationships with, huge revenue, makes onlyfans look like a joke. The key improvement over precursors like Ani or Replika is how much cheaper they are to run and stream real-time and how much more seductive their personalities are. Big moral panic. Lots of incredibly tedious 'zoomer men are losers' discourse and dating discourse. Legislation is introduced to ban them and instantly, predictably fails in a myriad of ways.

Late 2027: Massive AI-enabled FPV drone terror attack scares the hell out of people and spurs massive, ineffective netting operations across major cities. Police can be seen with sci-fi raygun looking widgets that don't do much of anything, or shotguns that work but aren't remotely sufficient. Advancements in robotics and software agents are displacing people at scale. AI reaches its nadir in reputation as people see the inevitable and can no longer look away. Everywhere they see some AI - the cameras tracking them, the algorithms watching them online, the machines making the content they watch and play, the robots working for them, the automated cars driving them around, cults driven insane by AIs. GPT4o cultists are charming and friendly compared to some of the new cults worshipping the bots that started self-modifying and prompt-altering, live and loose online.

And then by end of 2027 we get Dario's 'nation of geniuses in a datacentre' concept. Growth was not steady, it was jagged. The superheaviest nonpublic models with their slow speed and high cost were tasked with sorting data and implementing algorithmic improvements for the succeeding superheavy model, now running on a heap of next generation chips. They have been running for weeks in parallel, exploring and testing new approaches, RLing and training new models, testing them and reviewing them in depth. Medical breakthrough. Terror attacks. Industrial breakthrough. Mass deaths. Robotics breakthrough. Huge disaster. Huge innovations in all fields constantly and incessantly: Trommelfeuer. Events happen so quickly the situation as a whole becomes surreal and indescribable. Gary Marcus is banished from the timeline, never to be seen again except in tones of mockery.

I'm not so confident about specific times or events in sequence, though I am confident about a 'nation of geniuses in a datacentre' by end 2027. I will be clearly wrong if there's no 'it's happening!' by the end of 2027.

Interesting. Thanks for your predictions.

Late 2026 Copilot or Word introduces an 'automated proofreading' button

Well by the five wounds of Christ, it better be more useful than the current garbage of "this spelling is wrong" (no, it's because I'm not American), "this grammar/punctuation is wrong" (no, I meant to write it like that), and "do you want to say it this way?" (if I had, I would have done so in the first place).

Because if Copilot did pop up with that, I'd make sure to read through the entire fifty page document myself to make sure the thing hadn't turned it into beige garbage due to its 'helpful' suggestions with re-writing and summaries.

I've made one such auto-proofreader, I use it, it works just fine bringing up real mistakes for me to fix, hardcoded what kind of English I'm using, only a handful of false positives that I can quickly ignore. Every white collar worker I've shown it to says 'huh this is really cool, picks up things I've missed, saves lots of time' and they routinely ask to use it...

Absolutely massive own goal by Microsoft and big tech that most people's main experience of AI is a derpy chatbot that mangles documents or genericises things and not well-designed processes to solve specific tedious problems, nor the extremely flexible coding tool I used to make the auto-proofreader.

I have no idea what the hell is wrong with Microsoft but their support sucks, their help pages are useless, any time I need to solve a problem just searching online gives me better and clearer answers. It's beyond strange, it's perverse.

"this spelling is wrong" (no, it's because I'm not American)

You can go into Word's spell-check settings and change it from American English to Non-American English.

This is much more maximalist than even the AI 2027 crowd or the actual frontier labs themselves but I respect providing specific events and an end date on the prediction.

My own expectation is that none of this happens by the end of 2027 except a tranche of models that are notably better at RLVR'd tasks and the Copilot button (shorthand for AI getting more integrated into white-collar workflows). Let's see what happens in 18 months.

Superintelligence by end of 2027 is roughly what Anthropic/Dario seem to predict, so it's only roughly as maximalist as the most bullish frontier lab. The AI 2027 crowd backed out a year or two but I think they were roughly on the money the first time in their analysis. I'm not so keen on their 'centralize all US AI research and hand ultimate authority to a chosen council of experts' proposal though.

This is much more maximalist than even the AI 2027

It sounds less maximalist than the AI 2027 crowd.

Your AI predictions should include who wins the longbet 1 in 2029.

I think it just depends on how adversarial the judging ends up being rather than saying anything about capabilities.

If the judges have to stay within guardrails then 3.5 could probably win the longbet, but if they're allowed to exploit jailbreaks or known LLM failure cases, then nothing short of ASI is going to pass the test.

Probably Ray Kurzweil because the Turing Test is weak. It seems plausible LLMs pass it right now.

I always viewed the Turing Test similar to Moore’s Law (which isn’t really a law at all; in some areas it’s already stopped; in others it’s expected to stop very soon if it hasn’t very recently). A useful empirical regularity or heuristic, provided you don’t put too much weight on it.

It's also typically presented in a pretty oversimplified/watered down way -- like, "what if a computer could talk to you and you couldn't tell that it wasn't a human -- wouldn't that computer be reasonably described as 'intelligent'?"

In Turing's actual paper, he proposes a very specific and adversarial game -- with which I think current AIs would struggle greatly. Not that Turing's arguments that winning would mean AI are all that convincing either (as I recall he knocks down a bunch of strawmen for like 2/3 of the paper) -- but his game itself is deliberately very hard to win.

Enough has already been said about errors of conservatism in this post. I feel bad that I'm not in the shape to offer counter-predictions. One detail.

Tech people start pivoting to finance and robotics slowly, from JavaScript. DeepSeek releases a model that is equivalent to 5.3 Codex

5.3 Codex is exceeded in its direct niche by a whole lot of Chinese models, credibly at least by Qwen 3.7, and definitely by Composer 2.5 (Cursor/xAI) which is a finetune of Kimi K2.5, itself a continued pretrain of Kimi K2, which is a year old base model using a two year old DeepSeek architecture.

Things are moving very fast, and the gap between China and the US, and frontier and open source, is measured in months, not years – at least on main axes of comparison; but these months contain a lot of distance (5.5 is way stronger than 5.3-Codex). I think you don't understand what's happening here. With RLVR (and its generalized form – terminal feedback, all program execution traces), we have an infinite source of ground truth. We can just have models try arbitrarily complex things and reinforce what has worked. We can have them strategize about how to do it and it won't be useless. We have the compute for enormous waste. We have absurdly capable and efficient models (e.g. said DeepSeek charges $0.003635 per 1 million cache hits, and they probably price slightly above cost; it is hard to imagine that the Western frontier doesn't have anything close, after all these techniques are openly published). So RL will keep improving models unexpectedly fast, by the end of 2026 we'll be discussing the perils of open sourced Mythos, not Codex. Sorry, this is reality, we're not close to any S-curve plateau. We won't have the time for this neat slowdown to do "theory".

5.3 Codex is exceeded in its direct niche by a whole lot of Chinese models, credibly at least by Qwen 3.7, and definitely by Composer 2.5 (Cursor/xAI) which is a finetune of Kimi K2.5, itself a continued pretrain of Kimi K2, which is a year old base model using a two year old DeepSeek architecture.

I've addressed these claims elsewhere in this thread. My question for you is, do you even use these models? My experience with Chinese models vs. OpenAI models does not align with the benchmarks. I don't trust the benchmarks.

So RL will keep improving models unexpectedly fast, by the end of 2026 we'll be discussing the perils of open sourced Mythos, not Codex. Sorry, this is reality, we're not close to any S-curve plateau. We won't have the time for this neat slowdown to do "theory".

We'll see. I can't see how your predictions are superior to mine objectively. Maybe they come to fruition but I'm not seeing evidence for it at the moment.

Yeah, I don't mean only the benchmarks either. Codex 5.3 is just behind the curve. If you think catching up to it can take up to two years, then I guess your idea of what makes Codex capable has to do with very superficial short-horizon polish.

We continue this forum's streak of awful takes on AI. Your bait sucks but I took it anyway.

claude opus 4.8, which is only 10% better than 4.5

We're already on Opus 4.7, which according to Artificial Analysis scores 15% higher than 4.5. Yes, yes, there are many reasons benchmarks are stupid and bad and wrong and misleading. But your prediction here betrays your ignorance.

AI solves maybe 10 niche, self-value giving math problems that are considered „impressive” by the kinds of people who put calories into math which never seem to come out of math

I don't know shit about high-tier math or how useful it is to society at large, but again, your ignorance is on full display here. I don't really know who Terence Tao is, but I don't think he's paid money to post on twitter, so the math he does must have some value.

AI is still completely useless at real scientific research and reasoning.

This is already wrong. Can you substantiate this in any way? What do you think of AlphaFold?

General coding ability stalls and Anthropic + OpenAI are looking for spiky improvements to models plus tooling ideas.

Finally! An actual prediction. I could see this happening, we shall see.

They begin to think about targeting non-coding white collar work like finance and spreadsheet work since the models are not getting much better at JavaScript, having used up all of the JavaScript data in the entire world.

You're again over your skis. I work at $LARGE_MULTINATIONAL_FINANCIAL_INSTITUTION and it's revolutionizing how I work, despite IT making our roll-out as retarded and slow as possible.

having used up all of the JavaScript data in the entire world.

You clearly do not understand RLVR or post-training. Around half of training compute is used on these now, not just reading "all the javascript". Also your javascript quips are an attempt to downplay AI capability. They can program in every language. You can even invent your own programming language that isn't in the training data, and it can code using that too.

2029: Someone is arrested in the United States for plotting a terror attack with GPT.

Another real prediction, I agree with this but wouldn't be surprised to see it even sooner.

Frontier models begin to entshittify as they are increasingly jailed to make them safer

This has been happening this whole time, and especially since 4o

but they're getting really defensive about the new AI tutors, AI nursing assistants, AI portfolio advisors, AI middle manager helpers, AI emailers and spreadsheet workers, and so on.

This is literally already happening

Basically no improvements of relevance are made but Opus 5.0 drops along with GPT 6.0. The models are really only 10% better than GPT 5.5 at making JavaScript code but are 20% more expensive and slower

You think it will take THREE YEARS to get to GPT6 which is only a 10% lift on GPT5.5? This is a prediction, and a really bad one. I look forward to you being wrong in 6-12 months. Mythos, which already exists, is likely a 10% lift on GPT5.5.

Also GPT5.5 is almost 2x the cost of GPT5.4 so the price increases are already happening.

Some startups focus on making data collection tools that could be deployed in workplaces to monitor activities so that LLMs can be improved at non-software tasks.

This is already happening

Everything you say from 2030 onward is so deeply un-credible I'm stopping here. lol, lmao even.

What do you think of AlphaFold

It's not an LLM, it's a diffusion model. If you are going to call someone out for bad AI takes then I'd recommend you wrap your head around the AI vs LLM distinction.

AI is still completely useless at real scientific research and reasoning.

It's not doing reasoning via any sort of scientific deduction. There is a whole subfield of AI called causal discovery around trying to get models to learn via a causal learner. If you want substance I can fetch you some papers, there are plenty. None of them are LLM papers.

Everything you say from 2030 onward is so deeply un-credible I'm stopping here. lol, lmao even.

Whatever. So what's your prediction? I think Mythos is inferior to 5.5 pro. You think it's 10% better. We'll see.

We're already on Opus 4.7, which according to Artificial Analysis scores 15% higher than 4.5. Yes, yes, there are many reasons benchmarks are stupid and bad and wrong and misleading. But your prediction here betrays your ignorance.

I use GPT, Claude, and DeepSeek daily for software development. I'm not impressed with Opus 4.7. I am with 5.5. v4-pro is way worse than China shills claim. They claim parity with Sonnet 4.6 and I know that's not true because I have specifically jumped ship for various tasks from it to Sonnet 4.6 and have seen massive improvements in outputs.

This is already happening

Great, so we'll see if they wrap up early or if they enter the public consciousness closer to my timeline.

You're again over your skis. I work at $LARGE_MULTINATIONAL_FINANCIAL_INSTITUTION and it's revolutionizing how I work, despite IT making our roll-out as retarded and slow as possible.

You missed that my writing is partially tongue in cheek. Of course these things are happening. Everyone is using AI for everything. It's about public emphasis. So far the emphasis has been exclusively on denigrating software developers. I have finance friends I talk to who still say their jobs can't be automated at all with AI but that mine will be dead in 2 years. An underlying aspect of what I wrote is that that is just mindless anti-software engineer bias and they will get the same or more exposure from AI in due time as the public gets bored with the day of the complete layoffs for the tech workers they despise that will never come. You missed the subtext in my writing and so your reaction is quite biased and relies on incorrect interpretation. The others who reacted like you in this thread made the same mistake. We will see. I'm noticing no one else is giving predictions, they're just sneering, so whatever. I will admit it if the timeline is faster than I described here, come 2 or 3 years.

In what ways though are these truly augmenting and improving the development process on your end apart from simply being a more advanced way of “Googling the answer;” and one you don’t have to stop to verify and audit at each stage of its code generation?

They claim parity with Sonnet 4.6

It's not remotely the best Chinese model for software development, although it's all around smarter than Sonnet. This is just not a very hard capability to have, it's a matter of pedestrian post-training focus. By 2028, Mistral will be better than Sonnet or GPT 5.3. I think this is roughly fair to how it feels in agentic coding.

Your subtext mostly seemed like sneering at these people "I have finance friends I talk to who still say their jobs can't be automated at all with AI" and AI at large, so yes it did cause everyone to respond poorly, because you came across poorly.

Some predictions:

  1. Your finance friends are idiots

  2. white collar work is going to get a huge step change in productivity, and many paradigms will shift across e-commerce and information work as a result. It's hard to fathom how, but agent-to-agent interactions will be levered in many ways. Aside from just making email jobs faster.

  3. What happens to white collar work as a result of this step change is anyone's guess. It depends on Jevons' Paradox, the elasticity of demand for white collar services, and the latent demand for white collar work in our society. Also somewhat rate limited by all our social systems, for example, if lawyers get 10x as productive and demand for legal services simply rises by 10x to meet it, our legal system implodes, so we have big changes ahead! Accounting is a good example. Excel made accountants more productive, but then as a society we chose to consume accounting that is WAY more complicated instead of having less accountants do the same amount of accounting.

  4. I was initially expecting the white collar labour market/productivity situation to start getting weird this year, but it's almost 50% of the way through the year and $LARGE_MULTINATIONAL_FINANCIAL_INSTITUTION still hasn't given me Codex or Claude Cowork (genuinely embarrassing...) but my hodge-podge of skills and "open 30 ChatGPT tabs to ghetto parallelize" is already changing my output materially.

  5. Once white collar workers have ample access to a Codex or Claude Cowork tier harness with ~Mythos tier base models, shit is going to get freaky for white collar workers unless there is a HUGE increase in the demand for information work.

  6. There is an absurd amount of productivity untapped in simply using the tools we have better, let alone the fact they are getting measurably better month over month. ChatGPT was noticeably shittier at using Excel in early March 2026 than it is today (pre 5.4).

  7. The roll-out of 5 and 6 will take way longer than I think, because institutions are SO slow. And all the idiots like your friends (and you?) who refuse to embrace these tools cause diffusion to slow. But once the snowball starts, it'll be sink or swim for those not adopting quickly, and that will speed up roll-out, one way or another.

I have no predictions on robots/autonomous cars, but podcasters I trust keep saying robots are quite far away still, which is super lame.

I don’t expect demand for human work to expand indefinitely, as there are often no benefits to expanding it. If everyone starts suing for trivial things, eventually there are no benefits to be had if every paper cut or hurt feeling can be sued over. Add in that the system itself will tamp down just to get stability (how much insurance would a business need if even the slightest problem results in a lawsuit, and how long until laws are tightened to prevent that?). And as for accounting and other forms of analytics, I don’t think you have infinite demand, simply because after a certain level of detail you capture so much noise that it adds no information, or at least no useful information. Walmart might be able to determine exactly how much rain must fall in a given area to depress sales .001%. It’s not very useful, and when you have to tease that out from “car accidents on nearby roads”, “squirrel chews power lines”, “local sports teams all on a losing streak”, and dozens of other potential factors for depressed sales (few of which can be predicted or acted upon), it’s just not worth gathering or collating that data. Jevons’ law in my view probably has a curve at some point. We just aren’t there yet.

You missed the subtext in my writing, and so your reaction is quite biased and relies on an incorrect interpretation. The others who reacted like you in this thread made the same mistake.

What on earth is the point of publicly registering predictions when you can claim half of them are a joke? In one year when all the 2030 predictions have come to pass there's nothing stopping you from just saying the whole thing was a joke, what a waste of time.

The predictions are serious, but they're about things hitting a certain level of widespread adoption, not whatever is at .5% adoption right now.

Is this post about AI or about "The US secret police", because the latter seems to be the part you are most interested in. If you cut out the political ranting, there might be something interesting to discuss here. For example, you casually mentioned "brain reverse engineering". Develop that more, rather than waving the Socialist Anarchist Democratic Republic (not to be confused with the Democratic Anarchist Socialist Republic, those traitors!) flag in our faces, and it would be something worth discussing.

Is this post about AI or about "The US secret police", because the latter seems to be the part you are most interested in

Are you sure about that? It's approximately 10% of the text, and it's directly related to a predicted trend in AI, namely entshittification and regulation. Are you sure it didn't just distract you from the main point of the text?

Goes off on rant about state this, that and the other, drags in name of guy I don't know or care about, finally comes back to talking about AI but has to take a swipe at Scott, ends with further development in more comments about "relax, bro, half of this is a joke!"

I don't think I got distracted because I don't think there is a main point.

This is way too detailed for a "testable predictions" post, and I'm glad to see the responses you get are not really having it. Are you trying to exploit that "What's more likely: (a) Linda is a bank teller, (b) Linda is a bank teller and active in the feminist movement" cognitive glitch, where the excess of detail paints a more vivid picture and thus gives your hypothesis more weight in the reader's mind than it should get on intrinsic merit? (Less nicely: are you not just using the "public predictions" framing to peddle your wish fulfillment fic where AI believers are BTFO? Not that the other side is not guilty of the same thing, with "AI 2027" or what it was called)

80% of people still have their webdev job, and now they are about 5x more effective each. It turns out the demand for webdev has 4x'd.

I was in a meeting with an investor a few days ago who commented on every pitch looking the same since they are all using the same AI. Startups are becoming graphically similar since they are using Lovable to churn out MVPs and wireframes. If people can quickly hop on a trend and generate similar-looking content, websites will look dated quicker. If companies can more easily hop on the next trend and update their UI at a lower cost, the speed of trends will increase. This means more work.

Within the next five years I think we will see the following:

  • Massive price hikes of new models, at least doubling the token cost, and enshittification or customers losing access to the old ones. It is an open secret that current models don't pay for themselves and this will have to change. Investors will eventually start to want their money back.
  • Reduced demand for artists of all kinds. Breaking into any kind of industry as a young artist will be super hard when AI can do most of your work for less cost. AI work will become ubiquitous in video games, bookstores, and on Spotify. Creators who already have a name for themselves will be fine, but breaking into an artistic industry becomes even more difficult than it already is.
  • More efforts to distill existing models into cheaper consumer ones.
  • Lawsuits against non-profit and up-and-coming models for breaching the copyrights of the big LLMs.
  • AI surveillance becomes widely accepted as the norm. Most governments will use facial recognition software in surveillance cameras, and messaging apps will have a model scan your messages for dangerous words and phrases which it alerts the authorities about. Just look at the push for chat control in the EU. Politicians keep pushing for it regardless of public will.
  • Increased dislike of AI from the public. We will likely see even more pushback against data centers, partly due to the massive energy costs, and partly due to the above point about surveillance. From the perspective of an ordinary citizen, the harms of AI will likely outweigh the goods. This results in at least one attempted terrorist attack on a data center, although it likely will be poorly executed.
  • Companies will push to use Ukrainian and Russian drone footage as training for their models. The military becomes an increasingly lucrative customer as they are willing to pay a lot more than your average company.

Overall, I agree that developers will mostly be fine. As you say, AI makes them more efficient and can do a lot of the tasks that people are currently being paid for. But the demand is high enough that the job description will simply change. We are going to see much faster iterations and shorter update cycles. Every developer will be several times faster, but this will simply result in the industry moving faster than it did before. Not massive unemployment.

It is an open secret that current models don't pay for themselves

Are you aware Anthropic and OpenAI both have gross margins in the range of 38-70% (depending on how you measure it)?

R&D is eye-wateringly expensive, but inference is extremely profitable.

While inference having high margins is true, there are two things to keep in mind here:

Amodei has never said that models are actually profitable on a per-model basis, only that they hypothetically could be. While this might be true, there are trillions of dollars on the line to insinuate that it's true, and personally I wouldn't trust any rumors about financials from a private company who can massage them however they please.

Spending the GDP of a small country on R&D on the promise of getting a commanding lead is why OpenAI and Anthropic have trillion dollar valuations to begin with. There's no such thing as a frontier lab who can cut their exorbitant capex and coast on the margins from inference, as that's a one way road to getting cut-throat commoditized.

I doubt any of their models have been stand alone profitable. The break even must be crazy.

There's no such thing as a frontier lab who can cut their exorbitant capex and coast on the margins from inference, as that's a one way road to getting cut-throat commoditized.

I find this part very funny. Because if we assume any lab who stops doing R&D will be out competed by a lab still spending like crazy on R&D, then we're implicitly agreeing their R&D spend is worth it, even if crazy.

their R&D spend is worth it, even if crazy

Well, burning other people's money to try and build a moat is obviously worth it for the frontier labs. It's yet to be seen whether that spending will be worth it in the sense of paying off investors or building the labs a durable lead, or whether the models will end up commoditized and value accruing elsewhere in the stack.

I was not. I guess the question then is if the companies will be able to eventually stop researching and focus on selling, or if they will have to keep doing research to stay competitive.

or if they will have to keep doing research to stay competitive.

I find this part very funny. Because if we assume any lab who stops doing R&D will be out competed by a lab still spending like crazy on R&D, then we're implicitly agreeing their R&D spend is worth it, even if crazy.

It does mean they will be forced to raise prices though.

Yeah, people are confused because of the big capex expenses but you can compare what the major labs charge for tokens with what the open source models that anyone can run charge for tokens and notice that the labs have to be taking like a 400%+ margin on inference.

Do you have a good source for this? I'd be very interested in seeing a breakdown by model.

It's a bit messy to make sure you're comparing apples to apples. Here's a breakdown on how DeepSeek was getting 500%+ margin on inference around the "DeepSeek moment". Now that comes with a number of caveats; I'd probably hedge that down to more like a 300% margin for DeepSeek in practice. And they later cut the token cost something like 75% on that model but also reported cost reductions. GPT o1 was a similar-era model (December 2024 vs. DeepSeek in January 2025) and OpenAI was charging something like 15-30x per token what DeepSeek was. The model was a similar size but superior in some ways, so it's anyone's guess how much more it cost to serve. That might be the closest apples-to-apples comparison. I'm pretty confident on a 400% inference margin as a conservative estimate. Inference seems extremely profitable; it's just that training is also extremely expensive and you need to constantly do it to compete in the inference market.
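The two figures in this subthread (38-70% gross margins versus a 400% inference margin) are only comparable once they are on the same basis. A rough sketch, assuming the "400% margin" means markup over serving cost and "gross margin" means profit as a share of revenue; both readings are assumptions, and the function names are made up for illustration:

```python
def markup_to_gross_margin(markup: float) -> float:
    """Convert markup over cost (4.0 means 400%) to gross margin as a share of revenue.

    With cost = 1 and price = 1 + markup, gross margin = markup / (1 + markup).
    """
    if markup <= -1.0:
        raise ValueError("markup must be greater than -100%")
    return markup / (1.0 + markup)


def gross_margin_to_markup(margin: float) -> float:
    """Convert gross margin (0.7 means 70% of revenue) to markup over cost."""
    if not 0.0 <= margin < 1.0:
        raise ValueError("gross margin must be in [0, 1)")
    return margin / (1.0 - margin)


# Markups mentioned in the thread, read as "profit over serving cost" (assumption).
for markup in (3.0, 4.0, 5.0):
    print(f"{markup:.0%} markup -> {markup_to_gross_margin(markup):.1%} gross margin")

# Gross margins mentioned in the thread, read as "profit over revenue" (assumption).
for margin in (0.38, 0.70):
    print(f"{margin:.0%} gross margin -> {gross_margin_to_markup(margin):.0%} markup")
```

On these assumptions a 400% markup on inference alone works out to an 80% gross margin, so the two sets of numbers may simply be measuring different things rather than contradicting each other.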

Thanks!

Every developer will be several times faster, but this will simply result in the industry moving faster than it did before. Not massive unemployment.

Eh, with the news coming out of Meta, I think this will mean "now your small company can afford to employ a former Silicon Valley developer", but it won't be at Silicon Valley salaries. More employment opportunities, sure, but the days of big numbers on the paycheque will be over. Now you'll be on the same level as administrative staff and the other employees you used to look down on as bullshit jobs.

That trend already began years back when you look at comparative salaries year-by-year. The salary even a new graduate would command 10-20 years ago was far higher than some of the low balls I’ve seen people get within the last 5.

More employment opportunities, sure, but the days of big numbers on the paycheque will be over. Now you'll be on the same level as administrative staff and the other employees you used to look down on as bullshit jobs.

Software developers do not tend to look at the administrative staff as "bullshit jobs", at least not at the companies I've worked at. If you're going to engage in schadenfreude, at least have good reason.

(The jobs software developers do look at as "bullshit jobs" are as likely to be automated).

breaking into an artistic industry becomes even more difficult than it already is.

I vaguely expect analog artistic media to rise in popularity. Paint brushes, pens, and such are clear "not AI" status marks. Also live music.

My guess is that they'll rise in status, but not popularity. Like plays and operas relative to films and TV shows, or handcrafted furniture relative to IKEA.

Those are not particularly lucrative from the standpoint of earning money. Video games, animated movies, and graphic design seem to be where most of the money is for painters. Crucially, those are not things people purchase for the sake of status. It is entertainment. Losing opportunities for employment in the entertainment industries seems really bad for aspiring professionals. Art as a status symbol is mostly for rich people, or the artists themselves. So even if it rises in popularity, I would not expect it to suddenly become a viable career path.

Writers probably have it the worst. Even current AI can produce short stories that to most are impossible to tell apart from what is written by professionals.

Anecdotally, I'm seeing a lot more of that around me. Punk is making a raging comeback.

Lawsuits against non-profit and up-and coming models for breaching the copyrights of the big LLM's.

This is a very, very high risk strategy for the big LLMs given that they have breached the copyrights of absolutely everyone in the process of training their models on a corpus of copyrighted text. "You can't train an AI model on publicly-available but IP'ed data" is not a net win for Anthropic or OpenAI.

Given the, uh, rather mechanical ways in which models are trained, I could see a precedent that they're not copyrightable as a potential outcome: does it involve more creativity than a phone book? "Turning the crank" doesn't make something a creative work in the US.

But I wouldn't put a huge bet on any particular outcome there.

At the very least they will want to litigate against distillation. Training costs are steep, so if anyone can undercut your R&D by distilling your model, that seems disastrous for your bottom line.

I think this is a "rules for thee but not for me" situation. It is in their interest to prevent others from making competing models, so they will want to pull up the ladder behind them in order to destroy further competitors. Whether this will work is a different story, but this is a highly competitive market, and I think these big companies will use their large piles of cash to try and make it a reality.

They do not need to win the lawsuits in the first place. They just need to make their opponents settle by making the process as expensive as possible.

2032: [...] Places like Japan are receiving modern websites for the first time.

I know everyone's already piling in to tell you "this is already happening", but... Unless you expect "modern" websites to look dramatically different in 6 years, I'd say websites in Japan are already pretty modern? There are definitely a few Craigslist-style companies out there with outdated design, and there are some differences due to local design taste and language (web fonts are much less practical to use for Japanese text, for example). But overall web design standards haven't changed much ever since the flat design trend took hold a decade ago, and most Japanese organizations have caught up.

Compare www.city.osaka.lg.jp to www.chicago.gov, for example.

I wish sites and general UX would unmodernize.

49MB and 422 requests to load the NYTimes frontpage. The vast majority of webpages should be well under 1MB: they’re just text, images, and a tiny bit of CSS and JavaScript.

UX should be made by those who actually use the software. Obviously prioritize usability (speed, common actions upfront, uncommon actions possible) over style. Obviously don’t change acceptable UX without improving its usability. Obviously don’t promote someone just because they changed UX that works and made it stylish by sacrificing usability. (I would say they should be fired for wasting your money, but that money’s going to be wasted regardless, and even “lead architects” need to eat. Why not pay them for fun experiments that don’t intrude on your main site?)

And for style, bring back Frutiger Aero.

Reminds me of the joke, “Shit. If I can load this MySpace page I can probably run Crysis.”

Have you ever tried to load a page of Astral Codex Ten after it has had a week to accumulate comments? Running Crysis is easier.

I have not, but I’m 90% of the way to completing my future-proof desktop. Sounds like I’ve got a new overclocking benchmark to test.

tech layoffs continue

The tech sector right now has a lower unemployment than the general US economy

https://www.wsj.com/cio-journal/tech-unemployment-ticks-up-to-3-8-in-april-amid-ai-driven-layoffs-214b0ca4

3.8% in the information sector vs 4.3%

I'm stating this first to sort of color the rest of my point in the context that a lot of what people say about what AI has already done is just bullshit. But furthermore, people fundamentally don't seem to understand how employment works. You can have mass layoffs and still have high employment. I'm not even sure tech layoffs are higher than other sectors, but I am sure that every company that lays off tech workers gets front page news, while if there's a cut in delivery drivers nobody notices.

The only smart prediction to make is that we don't really know. People here just don't realize how big and complex the economy is and the world at large is. Even if your job is gone, your skills are often still transferable. When horse carriage producers were put out of business they didn't all starve and never find jobs. They started working on building cars for the most part.

AI is just another step in a long line of automations. Is it an exceptional step? Probably. Will it ever replace all workers? No. By the nature of economics, that's basically impossible. People's desires are infinite and there aren't infinite resources and labor, so there are always niches to fill. Might it make people poorer? Maybe. I kind of doubt it unless governments use it as an oppressive system that cracks down on a lot of market activity.

My point here is really that making predictions is a fool's errand. People have tried to do it, and at best a few get lucky and pretend they're geniuses and then return to the mean on the next prediction. There are obvious truths you can see, sure, like if the price of compute continues to decrease at a decelerating rate, it will significantly affect AI progress. I think that even as we see continued progress in AI, that will be the fundamental factor that's overlooked. Look at the flop count per dollar on a CPU from 2005 vs 2015 and then a GPU from 2015 vs 2025. Nvidia is squeezing some progress out in other ways, but at massive costs.

So my prediction is simple, I guess. AI will be a big boon to the economy. It will take a few years for companies to learn how to cost-effectively implement it. Some sectors will disproportionately reap the rewards. I suspect the gains will be in the 0.5-1% range of additional productivity growth a year, which is a lot. For context, the early industrial revolution was something like 2% growth year on year excluding population growth. With an extra 1% productivity growth the US would be higher than that right now, I believe.
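To get a feel for what an extra 0.5-1% a year amounts to, here is a small compounding sketch. The 2% baseline and the 10-year horizon are illustrative assumptions chosen for the example, not figures for the current US economy, and the helper names are hypothetical:

```python
def compound(rate: float, years: int) -> float:
    """Growth factor after compounding `rate` once a year for `years` years."""
    return (1.0 + rate) ** years


def relative_gain(baseline: float, extra: float, years: int) -> float:
    """How much larger output is with the extra growth than with the baseline alone."""
    return compound(baseline + extra, years) / compound(baseline, years)


BASELINE = 0.02  # assumed baseline productivity growth per year
YEARS = 10       # assumed horizon

for extra in (0.005, 0.01):
    gain = relative_gain(BASELINE, extra, YEARS)
    print(f"+{extra:.1%}/yr for {YEARS} years -> output {gain - 1.0:.1%} higher than baseline")
```

The point of the sketch is only that a fraction of a percentage point per year compounds into a visible gap over a decade.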

I also suspect there are factors that are huge burdens to society which AI can't overcome. Population decline. A war in Taiwan. Developed country and Chinese debt burdens. All of these things could affect AI. Which is ultimately why all predictions beyond a year or two will be meaningfully wrong.

That only works when the thing that destroyed your job doesn’t destroy all the jobs your skills are good for at the same time. But the AI is specifically aimed at replacing skills. The skill of recognizing patterns in pictures is something AI can already do. It can recognize my face, my emotional state, detect cancers, and read road signs — all things that require recognizing patterns in images. So when AI is deployed in hospitals to detect cancer, the same image recognition machine can be trivially reconfigured to translate documents from photos, read faces, and read emotional states. It can probably drive my car if coupled with robotics properly. Where does the human go? And again with other sectors. Not only do you have the problem of “the AI can replace the skills you have”, but there’s a problem that will be caused by any sectors AI doesn’t yet have skills at being absolutely flooded with applicants from sectors AI just destroyed. When accounting gets eaten, those with the skills pivot to something else, as will spreadsheet jockeys and so on. They’re going to try to get in where there are jobs. The wages for the remainder will thus fall compared to inflation as the market gets flooded. Why give raises when there are hundreds trying to get every opening?

Where does the human go?

This is what people were saying during the industrial revolution. Don't forget that 90% of people in developed Europe were farmers 200 years ago. There will be winners and losers, but I don't see evidence that the 20th round of automation is different than the previous ones.

I mean once a person cannot trade on body or mind, there’s kind of a problem. The difference is exactly that. Because of Industrial Revolution 1, most people don’t trade their time by doing physical labor, as factories are largely mechanized and so is farm production. So when the same thing happens again, you can’t go back to “hey, let’s make everything by hand”. And if a large percentage of mental work goes away in the same way in Industrial Revolution 2, then you have to find a way for millions of people to find jobs that pay liveable wages that are not either physical labor or mental labor. What’s left might be emotional labor of various forms. But what demand for that kind of thing exists? If everyone is a therapist, how does that even work?

You're misunderstanding how economies work. There isn't a set pool of "things that need to be done" that labor pulls from and then gets a job according to that. People have endless desires, those desires are arbitrary, and there are limited resources. That means there's always more work to be done.

Nor does someone or something being better than you in every conceivable way mean you cannot find a way to trade and profit within that system. Here's an explanation of that:

https://en.wikipedia.org/wiki/Comparative_advantage
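A toy version of the comparative-advantage argument, as a hedged sketch: the two producers, the two goods, and all the per-hour outputs below are made-up numbers chosen only so that one side is absolutely better at everything.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Producer:
    name: str
    steel_per_hour: float    # hypothetical output if the hour goes to steel
    therapy_per_hour: float  # hypothetical output if the hour goes to therapy

    def cost_of_steel(self) -> float:
        """Units of therapy given up to produce one unit of steel."""
        return self.therapy_per_hour / self.steel_per_hour

    def cost_of_therapy(self) -> float:
        """Units of steel given up to produce one unit of therapy."""
        return self.steel_per_hour / self.therapy_per_hour


def comparative_advantage(a: Producer, b: Producer) -> dict:
    """Map each good to the producer with the lower opportunity cost for it."""
    return {
        "steel": a.name if a.cost_of_steel() < b.cost_of_steel() else b.name,
        "therapy": a.name if a.cost_of_therapy() < b.cost_of_therapy() else b.name,
    }


# The AI is better at both goods in absolute terms (illustrative numbers only).
ai = Producer("AI", steel_per_hour=100.0, therapy_per_hour=10.0)
human = Producer("human", steel_per_hour=1.0, therapy_per_hour=2.0)

print(comparative_advantage(ai, human))
```

With these numbers the AI gives up 10 units of steel per unit of therapy while the human gives up only 0.5, so the human still has the lower opportunity cost in therapy even though the AI out-produces them at both. Whether the resulting wage is liveable is a separate question the sketch does not address.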

If everyone is a therapist, how does that even work?

I don't know. That's kind of my point. How do therapists work now? I personally feel like 90% of their work is totally unproductive. People's desires are arbitrary though. You don't necessarily make more money by creating 5000 times more steel instead of paying some weirdo to listen to you blabber for an hour. That's one huge mistake the Soviets made.

The tech sector right now has a lower unemployment than the general US economy

Tech sector unemployment hit a low of 1.8% in 2018 and is now 3.8% and rising. That's an absolutely massive change. In the meantime, total unemployment hit a low of 3.5% in 2020 and is now 4.3% and steady, which is a much smaller change. And "rising" is important, if you're already not working.
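The comparison above can be put in both percentage points and relative terms. A minimal sketch using only the four rates quoted in this comment; the function name is made up for illustration:

```python
def rate_change(low: float, now: float) -> tuple:
    """Return (change in percentage points, change relative to the low)."""
    points = now - low
    return points, points / low


# Unemployment rates in percent, as quoted in the comment above.
quoted = {
    "tech sector": (1.8, 3.8),
    "total": (3.5, 4.3),
}

for label, (low, now) in quoted.items():
    points, relative = rate_change(low, now)
    print(f"{label}: {low}% -> {now}%  (+{points:.1f} points, +{relative:.0%} relative)")
```

On the quoted figures the tech rate has slightly more than doubled from its low, while the total rate is up by a bit under a quarter.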

But you’d expect that regardless of the tech improvement. People were going to flood into the “low unemployment + high pay” job raising the unemployment rate of that industry.

We know that an increase in supply is not what's causing unemployment, as tech employment is dropping. Supply isn't all that elastic so that increase in supply due to increase in demand usually doesn't cause unemployment -- such inrushes have happened, but were more than absorbed by the industry.

https://fred.stlouisfed.org/series/CES6054150001

https://fred.stlouisfed.org/series/CES5051800001

Will it ever replace all workers? No. By the nature of economics, that's basically impossible. People's desires are infinite and there arent infinite resources and labor, so there are always niches to fill.

Minimum wage and related barriers put a finger on the scale though. Currently, very-low-skilled people are unemployable because the assorted costs of hiring them outweigh the expected benefits. In the future, will that extend to moderate skill levels? high? I don't think it'll cut off 100% of people before extinction and/or post-scarcity, but I could see the labor force dropping from about 50% of all people today to 10-20% even if AI remains a normal technology.

I'm not sure that this is entirely true. Very low-skilled people are unemployable period, and lowering the pay rate doesn't do anything. For example, there's a guy I know who isn't the brightest, retired now but comes off as someone who was definitely in special education back in the 60s and 70s. He worked as a janitor at a local elementary school. In Pennsylvania the minimum wage is the Federal $7.25. Someone in his position would be making $22.62 this year and $24.35 next year. Of course, that's because he's been there for 35 years, but even a new hire makes $16.60 on the current contract and $18.60 on the next. Grocery, retail, and fast food wages aren't much lower, even for 16-year-olds with no experience. The only exceptions I'm aware of are for people with disabilities, but that's more because they can only make so much before they lose their benefits. I don't think there is a significant population that's employable but for minimum wage laws.

In most places, inflation has effectively repealed minimum wage laws. However, other indirect regulatory and legal costs have accumulated that make people more expensive to hire.

Now that you mention it, when was the last time you even saw a minimum wage job? Even the convenience stores and McDonald's around me are offering well over the state minimums. It seems like it's one of those things that exists on paper but doesn't really come up anymore.

This is a fair rebuttal. I don't think American workers will ever be willing to support 80% of the population with welfare, let alone the fiscal reality making it totally unfeasible.

That 80% gets to vote too. I wonder what sorts of welfare they'll vote themselves.

We can see the answer to this in every discussion on social security caps/contributions/clawbacks or British discussions on the "triple lock"

Hint, even if the welfare formula is mathematically destined to out compound the entire country's economy (triple lock) the voters will never, ever, EVER vote away their gibs.

If American workers are only 20% of the population (and the 80% are people actually looking for work, not children, retirees, housewives, etc.), then I don't think normal political considerations will matter much.

They begin to think about targeting non-coding white collar work like finance and spreadsheet work since the models are not getting much better at JavaScript

This is already happening: https://www.anthropic.com/news/finance-agents

DeepSeek releases a model that is equivalent to 5.3 Codex

Already almost there if you include all Chinese companies, certainly will be by end of 2026: https://livebench.ai

Frontier models begin to entshittify as they are increasingly jailed to make them safer, while their private reputation is shattered among all of the normies who did not know how rights-violating the United States secret police are.

I'm with you on this one. We've already seen movements in this direction (eg, not releasing Mythos).

It's pretty clear to most people the general improvements to coding are dried up and a lot of the old hype was fake and tooling and chaining was the internal, secret meta from 2025 onwards

And more targeted RLHF, but yes agreed on this. However, I think there is still a ton of yet-to-be tapped potential in tooling, context, and feedback that will have massive impacts even at current model capability.

Overall my timelines are shorter than yours, but I do think there is a "ceiling" and I don't think we are at risk of Yudkowsky's takeover scenario. I do anticipate "mundane" surveillance and increased slopification. My hope is in local/open-source models running on ASICs, which would at least alleviate privacy and intentional kneecapping concerns.

Already almost there if you include all Chinese companies, certainly will be by end of 2026: https://livebench.ai

Deepseek 4 pro blows Codex 5.3 out of the water in real world usage.

Which v4 pro are you using? It's terrible.

The deepseek one

The one that is way worse than codex 5.3?

I'm with you on this one. We've already seen movements in this direction (eg, not releasing Mythos).

I disagree with this bit, mostly because I've seen good arguments that the secrecy around Mythos is at least in part due to Anthropic hyping up their own work, but most importantly due to a massive compute crunch on their end. It does have legitimate security implications, of course, but their framing that the delayed release is mostly due to those concerns is, shall we say, a rather self-aggrandizing claim.

GPT 5.5 Pro performs as well or better on cyber security tasks, and OAI was happy to do a general release. This is one of the rare occasions where I have to say that they were right in mocking Anthropic for poor excuses for their real issues, even if I genuinely prefer Anthropic as a company and the recent versions of Claude for many tasks.

… I've seen good arguments that the secrecy around Mythos is at least in part due to Anthropic hyping up their own work, but most importantly due to a massive compute crunch on their end...

That’s exactly what it is. Behind the marketing department though, there are still interesting things to see with Mythos.

Anthropic’s model is really good at finding software vulnerabilities, but so are other models. GPT-5.5, already generally available, is comparable in its capability. The company Aisle also reproduced Anthropic’s published results with smaller, cheaper models.

One of the problems with Mythos is that it’s very expensive to run, and the company doesn’t appear to have the resources for a general release. (What better way to juice the company’s valuation than to hint at capabilities but not prove them, and then have others parrot their claims?)

Modern generative AI systems (not just Anthropic’s but OpenAI’s and other open-source models) are getting really good at finding and exploiting vulnerabilities. I don’t want to say I was a complete naysayer originally (because I wasn’t) but the rate of advancement has raised my eyebrow a few times along with some of the economizing factors.

Your stuff through 2028 is already happening. People are already trying to do basically canned drop-in tech workers at every software shop I'm aware of; you're basically just describing Claude Code/Cursor. Those startups already exist. Models are already making big progress on Erdős problems.

They begin to think about targeting non-coding white collar work like finance and spreadsheet

This has already happened. I'm baffled that you don't know that the models can now handle spreadsheets. They do so pretty well, especially after Opus 4.7.

the models are not getting much better at JavaScript, having used up all of the JavaScript data in the entire world

This is a misunderstanding of how models improve. It's not a matter of finding more undiscovered JavaScript code to ingest; much of it is now post-training self-play, and it should continue to improve as general model scale increases. Of course, it's already perfectly capable of writing good JavaScript and has been for several models; the limitations are mostly in reasoning about larger chunks of the code context.

More attention is turned to local models as a result, but these are hard to run on normal hardware and the best is Sonnet 4.6 level at this point and requires $10,000 worth of GPU machinery.

It's too bad Ilforte left because he'd eviscerate this. I tend to be less optimistic on the Chinese models than some, but both DeepSeek and Kimi have offerings that are comparable to Sonnet 4.6 if you trust the benchmarks. I don't, but I fully expect them to have a Sonnet 4.6 level model by end of 2026 and likely an Opus 4.6 model by then. And you can run these models on rented hardware for pretty cheap, although they'd be hard to run locally for a lot of complicated reasons that have to do with it being much more efficient to batch queries than run them individually. In any case, the weights are public and anyone can set up an API to sell tokens at affordable rates.

I'm skeptical of your ability to predict the future as you seem incapable of predicting the past.

I tend to be less optimistic on the Chinese models than some but both DeepSeek and Kimi have offerings that are comparable to Sonnet 4.6 if you trust the benchmarks,

They do not; the best DeepSeek is maybe GPT o3 tier. Ilforte is delusional about China.

Chatbot Arena ranks deepseek-v4-pro-thinking at 30th for text (1461 Elo) and 17th for coding (1459 Elo). By contrast, claude-sonnet-4-6 is 22nd for text (1468 Elo) and 6th for coding (1524 Elo); there is a definite gap. On the other hand, kimi-k2.6 is 29th for text (1462 Elo) and 7th for coding (1519 Elo), which is closer. And glm-5.1 is even better: 20th for text (1472 Elo) and 5th for coding (1532 Elo). So it looks like the strongest open-source Chinese models are equal to or better than Sonnet.

Text Arena

Rank  Model                     License       Score
20    glm-5.1                   MIT           1472
22    claude-sonnet-4-6         Proprietary   1468
29    kimi-k2.6                 Modified MIT  1462
30    deepseek-v4-pro-thinking  MIT           1461

Code Arena

Rank  Model                     License       Score
5     glm-5.1                   MIT           1532
6     claude-sonnet-4-6         Proprietary   1524
7     kimi-k2.6                 Modified MIT  1519
17    deepseek-v4-pro-thinking  MIT           1459

Can I just point out how clustered they are? Unless this coding benchmark weights the models on a heavily logarithmic scale, the results say that all of the models are good enough.
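For a rough sense of how small those gaps are, here is a minimal sketch using the standard Elo expected-score formula. It assumes the arena scores behave like classic Elo ratings on a 400-point logistic scale, which is an assumption about the leaderboard rather than something stated in this thread; `expected_score` is a hypothetical helper name.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score of A against B (400-point logistic scale)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


# Code Arena scores as quoted in the table above.
code_arena = {
    "glm-5.1": 1532,
    "claude-sonnet-4-6": 1524,
    "kimi-k2.6": 1519,
    "deepseek-v4-pro-thinking": 1459,
}

top = "glm-5.1"
for model, score in code_arena.items():
    if model != top:
        # Probability that the top model is preferred in a head-to-head vote.
        p = expected_score(code_arena[top], score)
        print(f"{top} vs {model}: {p:.1%} expected win rate")
```

Under that assumption, the 8-point gap between glm-5.1 and Sonnet is close to a coin flip, while the 73-point gap to DeepSeek works out to roughly 60/40.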

I've never used glm-5.1, I'll try it today.

The longer I spend here, the more I understand why @DaseindustriesLtd got so fucking mad at some of the quality of the takes on AI and decamped to fairer lands. I mean, I'm still here, and I'm not really going anywhere, but I've already said I'm largely bowing out of the conversation.

That's probably because I am less Russian and more patient than he is, but some of the bullshit I've heard has driven me towards drink*, and my patience is at an all-time low. I just get where he's coming from.

*Making me spiritually Slavic, or at least Scottish. There are no shortages of European nationalities with a national fondness for drink.

I get a distinct feeling of Gell-Mann Amnesia reading a few of the recent top-level posts in this week's thread. And I wouldn't even class myself as a particularly knowledgeable person when it comes to AI; I simply keep up with the news and developments. It's really something to see the number of posters, whether here or on the SSC subreddit or similar locations, who confidently spout complete garbage when it comes to AI, seemingly unaware of things that happened even months ago.

And now I can't help but worry that many of the other posts on the Motte are similarly compromised. Have we become (or always been) just another midwit debate site?

I think every forum has some Gell-Mann Amnesia (and déjà vu), unless it's heavily moderated like r/AskHistorians. And real-life small talk has much more. If people only stuck to their domain expertise, more forums would be barren (see next paragraph), and people don't know what they don't know (Dunning–Kruger).

At least most replies point out the errors. Domain experts are often too busy, lazy, and private to browse and reply to random internet questions; except they miraculously find the time, effort, and public interest once someone else responds with a wrong answer (Cunningham's Law).

Hey, at least Hlynka is gone (suicide by mod). While he's enjoying his retirement, my blood pressure does much better.

And now I can't help but worry that many of the other posts on the Motte are similarly compromised. Have we become (or always been) just another midwit debate site?

We were supposed to at least be midwits? I got 70 on my IQ test, which must be 70% of the maximum and a passing grade!

Your joke made me wonder if redefining IQ as the percentile would make it more legible. Or 50 + the percentile, to keep 100 as the middle ground. Probably not: IQ 90-110 would be mapped to percentiles 25-75, so that's fifty points reserved for midwittery. I can already imagine people claiming they are 3x smarter than their opponent.
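To check that mapping, here is a minimal sketch that converts IQ scores to percentiles. It assumes the usual convention of a normal distribution with mean 100 and standard deviation 15 (some tests use a different SD); `iq_to_percentile` is a hypothetical helper name.

```python
from statistics import NormalDist

# Assumed scale: mean 100, standard deviation 15.
iq_scale = NormalDist(mu=100, sigma=15)


def iq_to_percentile(score: float) -> float:
    """Percentage of the population scoring at or below `score`."""
    return 100.0 * iq_scale.cdf(score)


for score in (70, 90, 100, 110, 130):
    print(f"IQ {score} -> percentile {iq_to_percentile(score):.1f}")
```

Under that assumption, IQ 90 and 110 land at roughly the 25th and 75th percentiles, so the middle twenty IQ points really would be stretched over about fifty percentile points.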

Aww, it's sweet that you thought I'm joking. No. I have brain damage from too many exams, including having to memorize all the fun properties of a normal distribution, as well as the abnormal ones.

I think the properties of the Cauchy distribution are much more fun.

I'm couching my words carefully: I'd rather stay in my couch and let the experts handle these things. I have few neurons left to distribute to even the most normal of tasks.

Bad takes on AI seem to be the one commonality across creed, race, and IQ. This one from The Ringer is a particularly egregious example, but it's rare to find anything both sufficiently technical to understand how it does and could work, and sufficiently "big picture" to understand societal impacts. (Of course, many others would consider my AI takes to be just as bad.)

My current modus operandi is to be whatever the other person is not. If they are an AI maximalist I am the pessimist. If they are the doomer I am the optimist. If they think this technology is all hype I become the autistic technologist with in-depth details and explicit examples.

got so fucking mad at some of the quality of the takes on AI

It's actually one of the best ways to bait me. I thought that now that I'm old and wise, I would stop taking bait, but they're just so wonderfully confidently wrong and I cannot resist. It makes me so tilted, but the "I told you so" as we stand in the breadlines is going to hit so nice.

Brother, insight without action is as worthless as the spectacles it came with. Don't take the bait, at least if it comes at the cost of your sanity. Or do, if you end up feeling some degree of catharsis, idk, I'm not your shrink. I'm not doing a very good job at being my own shrink.

Perhaps it does serve a useful function to point out when people are being pigheadedly wrong about things. Someone's got to do it, or ought to do it, and I'm just glad that someone is very rarely me these days. I've got booze to drink, and Scottish women to introduce to the single mother lifestyle.

But yes, if we meet in the breadlines or in the intake unit for the paperclip factory, I'll save an understanding nod for you. Fist-bumps wouldn't be befitting.

Making me spiritually Slavic, or at least Scottish. There are no shortages of European nationalities with a national fondness for drink.

A better question would be whether there are any without a national fondness for drink.

Terrible driving and a fondness for alcohol are things that unite just about every culture and demographic.

Bostonians often openly admit that the nationwide reputation of Boston drivers as being especially awful is true. Is this just a form of self-aggrandizement, and, actually, every locality believes their drivers have reputations for being the worst?

I can't think of a single demographic group of humans who are considered "good drivers"

Similarly, I've never once heard anyone say "yeah the drivers in my $LOCAL_AREA are great! I love it! We all get along on the roads :)" and I've heard every version of the opposite, so I'm assuming everyone sucks everywhere.

Back when I was a young lad I would have told you "yeah the drivers in Germany are great". Nowadays, between Germany getting diversified, and the driving culture in my country improving, the contrast is not so stark. Some time ago I also saw a video from an Indian guy saying that Italy, of all places, has good driving culture (though I suppose it makes sense if India is the reference), and when he was driving there he felt this subtle pressure to perform up to the standards of the rest of the country.

You're right that everyone might complain about their neighbors, but different groups definitely perform differently.

Italy was the only country where the drivers would change lanes to speed past the car that stopped in front of me on a pedestrian crossing. I can't imagine what India is like if that's good driving culture.

Every good normal, local football team loving, sydney sweeney gooning person hates people with [autism] and loves to see them suffer

That would disqualify them from being considered 'good'.

Every good normal, local football team loving, sydney sweeney gooning person hates people with aspergers and loves to see them suffer so they got off on viciously and narrowly aiming to automate their programming jobs

Sweet God, I'm probably on the spectrum myself and I hate these kinds of whiny "people with Aspergers" and want them to suffer because no, you are not a special snowflake horribly persecuted by society but you'll make them all see when you get a huge big-paying important job playing with computers due to your special interests.

Suck it up. Everyone suffers in some way. Learn to deal with normal society around you, and when you can't, shove it down and put a lid on it. You needed to be bullied more at school and maybe slapped around a bit by your parents, to teach you to toughen up and stop. bloody. whining.

Ordinary people don't think about you and yes, programming jobs are overpaid and self-important.

You needed to be bullied more at school and maybe slapped around a bit by your parents, to teach you to toughen up and stop. bloody. whining.

I would have expected you to take a little more care to avoid calling for people who complain of oppression to be beaten by stronger people until they shut up, Deiseach, given how often you denounce the "beat the feminism out of them" brigade as barbaric.

I can call the whiny Asperger's brigade whiny because I went through that phase too and came out the other side having learned that nobody cares, the only change you can make is yourself, and telling yourself that the reason you are failing at life is because you're too smart and special and elevated in your tastes and interests for the normies is self-deception: "lay not that flattering unction to your soul".

My anger isn't with your description of people as "whiny".

My anger is with your proposed remedy. I don't know exactly what you went through as a kid, but I know what I went through. You've had the highlights reel of Mum; here's the highlights reel of school.

  • Held down by two boys while four or so others took turns trying to punch me in the balls
  • Walked on, literally
  • Had a point-up needle affixed to my chair with wax
  • Arm broken
  • Tooth knocked out
  • Tried to strangle myself to death at 7 (seven) to get away from all the teasing.

I do not, in fact, think I needed to be bullied more. I do not, in fact, think other aspies should go through that hell. I doubt it'd even make us less whiny, aside from the minority who'd be too dead to whine.

You had genuinely bad experiences in school. That's not an excuse to keep complaining about how society is so unfair or how if you (general 'you' not specific 'you, magic9mushroom') don't get the exact specific everything you want, this is persecution.

And there is too much of the latter, which is how even the safe spaces turn toxic as everyone wants their own, personalised, version of how the world should be in order for it to be 'fair' to them, and then they turn upon one another because A wants/does not want something B does not want/wants to exist in the new responsive-to-their-every-whim world.

They think of themselves as good but I don't agree per se.

(Trigger warnings: another AI post; Jean-Jacques Rousseau)

I want to talk about the economy in the face of the AI revolution, but let's go back to the basics first.

What is the purpose of a company? Not the stated one, the original one.

To provide value to its shareholders? Of course not, that's a very narrow view! It's like saying armies exist to kill people.

Do companies and markets provide a decentralized way to coordinate the efforts of large groups of people (choreography vs orchestration)? Yes, but what for?

Do companies and markets exist to maximize the economic output? Many would say yes, but the idea is wrong and will become more obviously wrong the more AI we inject into the economy.

Companies and markets exist to maximize the consumption! They exist so that more people consume more goods and services! The fact that these companies have to pay their workers is not an unwanted side effect, it's a feature! Companies exist to create and distribute goods and services to humans, and markets exist to ensure that each human gets a share that is sufficiently fair.

Both the industrial revolution and the transition to a service-based economy created millions, if not billions of new jobs, but is this an inevitable consequence of free markets? Will AI-fication of the modern economy do the same?

The correct answer is not "yes" or "no". It's "mu". If the economy reinvents itself again and people of the Global North find themselves new jobs, that's fine.

But if we suddenly start running out of jobs, accepting it as the inevitability of the market forces is the wrong way out. The right way out is retiring the market forces, not submitting to them. Creating shareholder value has always been the real side effect.

At its core the company is an entity formed by shareholders in order to generate profit. This is the intrinsic purpose of any company, be it a modern LLC, the East India Company, a pirate crew in the Caribbean with its own charter, a Hanseatic guild in the Baltic or a mercenary company in ancient Greece. From time immemorial, people have banded together and pooled labor, capital and know-how in order to generate profit for themselves.

All of those things you mentioned, such as distributing goods, creating markets, creating jobs and whatnot, are externalities of a certain type of regulated company. They are the result of making it easier for strangers to pool capital and create a profitable business. But it would work in some form even without that, only it would be more costly and riskier to do in an unstable environment.

I wouldn’t call profit a necessary element. Lots of nonprofits were incorporated under Companies House until we started creating special legal containers for them.

“A group of people who band together to do something with limited liability” seems to best cleave reality at the joints.

We mean different things; we have different definitions. It seems to me that you look more at structure: is it limited or unlimited liability, is it a solo-owned or a group-owned entity, is it for-profit or non-profit, is it publicly traded or privately owned, etc.

I look more at the function and the need this institution serves. I'd consider a local mafia a company in this sense: it is a group of criminals conducting illicit activities to earn profit. Its legal structure is immaterial to me; companies can exist in the absence of a state. That is my point about OP's rant on what he thinks companies should or should not do: bad luck, companies are not bitches of states, they are a manifestation of the underlying reality of human nature.

Profit is also a necessary component to me; merely "doing something" does not cut it, for a practical reason. For instance, banding together with your wife in order to raise children is not a company, it is a different institution, specifically that of marriage forming a family. I guess we can extend the definition if we know what we mean, and there definitely is some overlap, especially if there actually is a family business etc. But it is unnecessary.

Fair but that makes

At its core the company is an entity formed by shareholders in order to generate profit. This is the intrinsic purpose of any company

an assertion and a tautology, not an argument or a rebuttal. I don't mean that in a rude way.

Sure, but I commented in the context that the OP introduced, that of economic relationships. Not about people grouping together to form a bowling league or moderating a Culture War online forum etc. That is also an intrinsic social need, but of a different kind.

The purpose of a company or firm is to produce profit: to coordinate on Monday to be wealthier on Tuesday. There is generally little concern about the consumer except so far as the production of the firm is deemed desirable by the consumer.

It's like saying armies exist to kill people.

Of course armies exist to kill people. Any other statement of their purpose (e.g. "to secure a nation's borders") is just a euphemistic rephrasing thereof.

I have always found "fleet in being" an interesting way to look at this question: a bunch of ships can have strategic value even without leaving port, just to tie down enemy resources to use against them because of the implication.

Of course, if nobody ever sails their fleet, the hypothetical threat isn't quite as credible.

If nobody ever sails their fleet, the hypothetical threat is working as intended. Two fleets sit in port glaring at each other across the ocean until the end of time.

To win one hundred victories in one hundred battles is not the height of excellence. To subdue the enemy without fighting is the height of excellence.

Armies exist to further the interests of their employers, killing people is a common byproduct of pursuing that goal. To say armies exist to kill people is akin to saying that slaughterhouses exist to produce runoff, or that the purpose of driving is to wear down your tires.

The only reason an army exists is so that their employers (typically, the government of the nation they represent) can credibly claim that lethal violence will ensue if the need should arise. Thus, the purpose of an army is to kill people.

That's nuclear weapons. The army has a separate though overlapping purpose.

No, absolutely not. A primary reason that Country A doesn't invade Country B is because Country B has an army i.e. a group of soldiers who will attack Country A's soldiers should they cross Country B's borders.

Then why are there people still? Are armies so bad at their job?

Do screwdrivers exist to drive screws? Then why are there undriven screws still? Are screwdrivers so bad at their job?

No, they don't exist to drive screws.

Wow. You're telling me for the first time.

What are screwdrivers for?

To increase the production of goods.

Well just the other day I used a screwdriver to tighten a loose cabinet door. That didn't increase production of anything, in fact, it decreased the demand for new cabinetry.

Am I misusing my screwdriver?

You're right, it's to increase the consumption of goods. Without it you would've stopped consuming your cabinet sooner.

I think you’re agreeing with @ortherox. Screwdrivers exist to screw screws for a purpose. Screwing is their function.

Likewise armies exist in order to achieve certain things - border protection, having lots of gold, etc. This involves killing people.

Screwdrivers exist to screw screws for a purpose.

Is there anything that exists to do X for no purpose?

Some artists say art, though I don’t agree.

Armies have a purpose the way a tool has a purpose, and like a tool they don't decide when and where they are used or whether it's a good idea to use them, as that type of purpose is decided by the state.

They can take on more functions like disaster relief, and they can threaten the state and overthrow it, but though a kitchen knife can be used as a screwdriver or a weapon it's still a kitchen knife.

Wait - is this distinction where Task and Purpose got its name?

Yes, and the purpose of a kitchen knife isn't cutting food, it's nourishment.

I don't think so. In the event that an army has a choice between securing the nation's borders non-lethally with 100% effectiveness, or killing people, we should expect the army to pick the former strategy over the latter. Therefore "the army exists to secure the nation's borders" has information value as more than a euphemism.

The means by which armies secure the borders of their respective nations is by killing people or by providing a credible threat that lethal violence will ensue if their demands are not met. Any purpose an army might conceivably have therefore ultimately boils down to killing people.

This is the means by which armies secure their borders given the current state of technology. But if an army had technology that was more effective than lethal force at protecting the nation's borders (eg, for the sake of argument, foolproof mind-control rays), it would use that instead of killing people. The tails eventually come apart. The better definition of the army's purpose is the one which correctly predicts their actions in all hypothetical situations, rather than the one which only works in the current context but breaks down outside of it, even if both definitions correctly predict armies' actions in the real world.

For example, you might think that an airline's purpose is to make money by conveying passengers from A to B as fast as possible, or you might think that an airline's purpose is to make money by burning jet fuel. In the real world, given the current state of aviation technology, conveying passengers from A to B is going to be done by burning jet fuel. But "conveying passengers from A to B" is a better statement of the airline's purpose than "burning jet fuel", not a euphemism for it, because if a technological breakthrough suddenly gave us a better, cheaper way of powering airplanes than traditional jet fuel, we would expect the companies to switch over in pursuit of more effectively conveying passengers, rather than giving up on conveying now-reluctant passengers and finding other reasons to set fire to jet fuel.

I think you should spend less time inventing pedantic counterfactual hypotheticals.

Is the purpose of nuclear weapons to use them?

The purpose is to use them in certain hypothetical situations. If there are no hypothetical situations in which they are to be used, then they are entirely useless.

Armies exist to provide deterrence at the geopolitical level through the threat of killing people, not to kill people. Auschwitz guards exist to kill people. There's a difference.

I agree that providing deterrence at the geopolitical level is a purpose that armies serve, but not the sole one (e.g. when an army invades a foreign country). But in any case, it's a distinction without a difference. The only reason armies are an effective deterrent is because they are willing and able to do lethal violence on their masters' behalf. Functionally, there's no difference between training someone to do lethal violence and training someone to be willing and able to do lethal violence should the need arise.

Lots of wars don't have much to do with deterrence. The purpose of armies has as much to do with potential gain (more geopolitical and ideological than resource related in the modern day) as it does with defence.

I know some offensive wars can legitimately be justified by the logic of self-defence, but if you want to spread democracy, or communism, or set up some colonies, or intervene in some far away atrocity, you need a bigger army than if your sole concern was defence.

Both gain and deterrence rest on the ability to kill your enemies, and the more credibly you can show your ability there the better you'll do at both. Armies don't go on killing endlessly because it's generally a stupid idea and they ultimately exist to achieve the goals of the state.

The function of corporations or companies differs across time and space. A company that exists in France is held to very different standards than one in America or China. This is one of the myriad reasons why criticism of "capitalism" as such tends to be incoherent.

But in America, the purpose of a corporation is basically up to the owners. Usually that means maximizing profit, but that's usually enforced by lawsuits from angry shareholders, not by a strong explicit legal criterion (although to some limited extent that exists). However, there are lots of exceptions. If you have a private corporation, then the goal of that corporation is whatever the owner(s) want. Anyone who has worked for a small or mid-sized corporation with dictatorial owners probably knows well that sometimes profit isn't the singular motive.

We should rightly note that there are tons of "capitalist systems" (hereby defined as to some extent directed by a market) that don't seem to strive for full employment. Southern Europe has been at high unemployment for decades and chugs along without ever changing anything.

Business interests more generally are a natural part of society. Commerce is an unavoidable system of interaction between humans. As it becomes more complex over time, you begin to see the formal institutional emergence of functions and bylaws that specify purpose and regulatory guidelines for how businesses are expected to behave.

You’ve sort of woven multiple questions here into a single thread, which doesn’t make it easy to answer. As it relates to AI specifically, it could entirely turn out to be the case that significant sectors of the industry become nationalized outright or fall under a strong regulatory regime that changes the nature of the industry entirely. People have said for a long time, for instance, that ISPs should be nationalized as a public utility, and the same has been said of our industrial control system network (a lot of which is actually in private hands). But a lot of the AI speculation is predicated upon the assumption that the day is coming when it’s going to hit us, and its arrival will be unmistakable once it’s here. I’m still skeptical of that.

I have a slightly different answer from what seems to have been offered so far. Companies, as a legal form, are a carrot-and-stick deal offered to the individual by society in order to channel his ambition into pro-social or at least less harmful ends.

  • The carrot consists of tax advantages, paperwork benefits, legal perks such as being able to claim unique use of a recognisable artificial trademark, and some degree of shielding of the individual behind the enterprise from repercussions (such as debts far in excess of what the enterprise could be expected to make up for, or legal responsibility for side effects and damages caused by it), as long as he plays by the rules.

  • The stick is, often, outright lawfare against individuals who pursue their ambitions without signing up to the form (in Germany, for example, you cannot even, as a private person, make and sell software to the public without registering a company).

  • The means by which the goal (of channeling individual ambition to pro-social ends) is pursued are legal requirements to make the activities of the company legible in particular ways (bookkeeping, charters, records) and adhere to all sorts of restrictions on shape, purpose and behaviour. If there is an obligation to "maximise shareholder value", on the flip side this means that a company cannot actually have the terminal value to "maximise paperclips", and there may be legal ways to mobilise society against it if it appears to do the latter over the former. Also, the things that a company can command its employees to do are a subset of the things an individual can force another individual to do with a legal contract, which are a subset of the things an individual can force another individual to do with legally unregulated compulsion such as individual or communal violence.

The alternative to companies, the thing that there would be more of if someone erased the concept of companies from reality with a memetic Death Note, does not look like people no longer getting together to achieve things, or less consumption. It looks like more instances of things like the Mafia, cults and Genghis Khan's hordes. If you only erased companies narrowly writ, you might also (at least) get medieval guilds, which are really an older, rougher attempt at the same thing.

If there is an obligation to "maximise shareholder value"

As far as I'm aware, there are few countries where this is actually true. Yes, it's the default but if the charter of the company says otherwise, the obligation can be something else (as long as it's legal).

Well, a banker just got himself into hot water for using terms I've seen thrown around freely here and elsewhere regarding high and low value human capital. Mask off moment where AI adoption en masse is going to rip back the curtain as to how our owners really feel about the mass of us peons?

I think AI is going to make things very interesting, because it'll be the lower echelons (at first) of those who like to think of themselves as 'high value' human capital getting replaced. But I think there's no reason - at least in the medium term - why the CEO or group chief executive (as in this case) couldn't find themselves replaced by AI (it can do all the analysis and forecasting and reporting to the board, and as for leadership, there are only a few upper managers remaining since the lower levels were all automated out) while the guy doing electrical maintenance at the headquarters keeps his job because he's not currently replaceable by a computer or robot.

Mask off moment where AI adoption en masse is going to rip back the curtain as to how our owners really feel about the mass of us peons?

The mask has been transparent since the change from "Personnel" to "Human Resources".

I felt that back then, but was deemed too cynical for the view "yeah, and what do you do with a resource? Exploit it. Strip mine it until there's nothing left to dig out, then dump it on the rubbish heap".

Before that it was “Manpower” which really doesn’t seem too different to “Human Resources”.

tbh manpower is at least honest.

I feel like "human resources" is even more honest. You have your coal resources, you have your electricity resources, you have your human resources...

So, are we already seeing the AI effect on the white collar jobs, or is it just companies using this excuse to shed excess headcount?

The tech employment sector in Ireland is starting to feel the hit (and we put a lot of our eggs into the basket of "American multinationals investing here and creating employment", moving from pharma to computers/IT, encouraging people to go get those degrees and get a high-paying job in the industry of the future):

For years, Ireland has had a straightforward arrangement with multinational tech giants.

We provide low corporate taxes, an educated English-speaking gateway to the EU, and offer various state grants and support schemes.

In return, the country gets two big benefits. First, thousands of people working in high-paid jobs. Despite only accounting for 3% of all enterprises, multinationals employ about a quarter of the workforce. These staff tend to earn good salaries and pay lots of income tax and USC, helping out with state spending.

Second, multinationals typically shift much of their international profits through Irish corporate entities.

As these profits are so enormous, Ireland’s 12.5% cut adds up to tens of billions each year, which has massively boosted the public finances.

...Meta is of course the reason we’re talking about this. During the week, the company announced plans to cut 350 roles in Ireland. Once the latest round of layoffs is complete, its Irish workforce will be less than half the size it was just a few years ago.

It’s worth noting that this came just after Meta announced profits of $27 billion (€23.29 billion) for the first three months of 2026, up 61% compared to the same period last year.

So why is an enormously profitable business cutting staff? Two main reasons have been put forward. The first is that Meta, like many tech companies, ‘over-hired’ during Covid and this is a kind of natural pruning back.

The second is that Meta is going big on AI and it expects the technology to be able to replace many of its workers.

CEO Mark Zuckerberg has said that AI has enabled projects “that used to require big teams, now being accomplished by a single very talented person”.

This is one of the first major AI-related rounds of job cuts which has impacted Ireland. The fear is that it’s a sign of things to come.

Several other large tech firms with operations in Ireland, such as Cisco and Oracle, are also trimming their workforce amid a shift to AI.

The dreaded 4 a.m. email seems to be how people found out Meta were letting them go:

The termination emails from Meta management landed at 4am.

It may have been early but most workers were probably already awake, waiting to learn their fate.

Of Meta's 1,800 Irish-based staff, around 350 received emails outlining how they were "potentially impacted" by global redundancies.

It was a high number of Irish layoffs, double the 10% cut that was being applied to the worldwide workforce.

The announcement has sparked concern in both business and political circles amid fears that the great AI job displacement is already under way.

My own feeling is that Zuckerberg has not in fact provided reassurance; the way this is worded, "we don't expect" could mean "but it could still happen" and "not company-wide" could mean "selected areas, not everywhere":

Zuckerberg wrote that Meta did “not expect other company-wide layoffs this year” – the closest thing to a stability guarantee anyone at the company has heard in months.

…The no-more-layoffs commitment comes with a footnote written by Meta’s own CFO. On the company’s Q1 earnings call on April 29, finance chief Susan Li told analysts she doesn’t know what Meta’s ideal headcount looks like anymore. Zuckerberg made the trade-off explicit on the same call. If a team used to need 50 or 100 people and now needs 10, he said, keeping the bigger team around becomes counterproductive.

That isn’t the language of a company that has finished restructuring. Meta is spending between $125 billion and $145 billion on capital expenditure this year – nearly double its 2025 spend – with most of it pouring into data centres, custom chips and model training for Meta Superintelligence Labs. The 8,000 jobs being cut are explicitly meant to offset that bill.

On the other hand, some sources are saying that this isn't in fact the dawn of AI replacing humans, it's companies laying off excess hiring that happened during the pandemic and putting the blame on AI.

So which is it? Is AI coming for the formerly "high value human capital" jobs, per Zuckerberg's "We're starting to see projects that used to require big teams now be accomplished by a single very talented person"? Or are only the low-level jobs going, while the expert and experienced are safe and people will be redeployed elsewhere (as here) or find new jobs? Or is it just reducing excess and trimming the fat, with no downturn in tech sector employment?

As 8,000 Meta employees were reading their layoff notices on Wednesday, 7,000 others opened a very different email. They had been selected to join a new AI initiative spun up directly by CEO Mark Zuckerberg — and it would be crucial to accelerating Meta's position in the AI race.

Many employees learned they would move into a group called Applied AI (AAI), which Meta created earlier this year, led by engineering vice president Maher Saba and reporting to chief technology officer Andrew Bosworth.

Others were recruited to groups more specifically focused on AI agents. These include a group named "Agent Transformation Accelerator," headed by Bosworth, and a team named "Agent Data and Optimization," according to internal messages shared with Business Insider.

I think we'll have to wait and see, but it will be darkly ironic if people are indeed currently training their machine replacements in the very industry that went all-in on Moar Tech Moar Better, doubly so as the job cuts are to free up funding for the AI push:

Meta is installing new software on its US employees' computers that will track their keystrokes and mouse movements to train its AI, and it's sparking backlash within the company, according to internal communications obtained by Business Insider.

Business Insider obtained the full internal announcement about the launch of the new AI training program. The post says that the software helps AI models improve how humans actually use computers, such as using keyboard shortcuts and choosing from dropdown menus.

..."This makes me super uncomfortable. How do we opt out?" was the top-rated comment in response to the internal announcement, according to a post on Meta's internal workplace communications site seen by Business Insider.

...Meta CTO Andrew Bosworth responded in the thread that "there is no option to opt out of this on your work provided laptop."

We can just as soon redistribute the consumer tokens as we can ban the AI. Which is why I've been banging on about Universal Income with, like, a consumption tax this whole time. Banning AI won't work for many reasons, not least of which because other markets won't ban it.

Banning AI won't work for many reasons, not least of which because other markets won't ban it.

That's not a reason for it not to work. If a country refuses to ban it, blow up that country until it does or goes Mad Max. Repeat as necessary. Yudkowsky pointed this out years back.

We can't even keep the Strait of Hormuz open, dude.

I have multiple opinions about UBI.

One rationale for it that I see is that it would replace much of the piecemeal welfare system we already have, and offset its net cost. Not just dollar for dollar, but also with every dollar moved, the administrative costs of those other programs will be eliminated. And as those tend to be conditions-based, their overhead (in vetting and auditing) is much higher than a simple UBI program’s would be. The cost of even basic UBI is nevertheless quite high. And I think this is the only real criticism of it that holds up; like all the useful luxuries of civilization, it is a thing you should have, but should buy only when you can afford it. We should as a community fund fire fighting, for example; but only if we as a community generate enough wealth that we can safely afford it. And so on down the line of every wise move civilizations have made.

It’d replace roughly $800 billion in other programs (from welfare to unemployment insurance), but it’d also replace about $800 billion in social security expenditures (since social security payouts wouldn’t add to UBI, but only make up any difference in average monthly benefits, which are already above $1,000) so UBI’s net cost in the U.S. would be “only” $1.4 trillion. But that’s without a national healthcare system, which we also should have, and also has to be paid for. That would cost roughly another 2 trillion dollars (after offsets and such are tabulated, e.g. such a system would replace medicare and medicaid altogether). So actually, we’re looking at $3.4 trillion a year in new spending, for a standard social safety net every other first world nation already has, and UBI.

If you calculate from IRS data, the entire collective income of the top 1% of earners (which means roughly everyone who earns more than half a million dollars a year) is just over 2 trillion dollars. If we surtaxed all income above half a million dollars at a flat rate of 50% (which means in addition to existing income taxes, subject to deductions and so on) we’d bring in new revenue of about $1 trillion. And a national sales tax (VAT) of 15% could raise about $1.73 trillion a year. There are other successful nations that have just such a tax, so we know its effects on economies aren’t prohibitive. So those two revenue streams alone would make up all but $670 billion of the dollars needed. We already know improved enforcement of existing tax laws would bring in hundreds of billions a year, [about] $500 billion (according to a study I saw). That then leaves only $170 billion to account for. So the question then becomes, is it reasonable to gain the corresponding national benefits with a 60% “insane income” tax instead of only 50%, adding another $200 billion to our national revenue?
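For anyone who wants to check the sums in the quoted passage, here is a minimal sketch of the tally. It takes every figure from the quote at face value (in billions of dollars per year) and only does the subtraction; the constant and function names are made up for illustration, and nothing here vouches for the underlying estimates.

```python
# All figures are in billions of dollars per year, copied from the quoted
# passage. They are the quote's own estimates, not independently verified.
UBI_NET_COST = 1_400          # "only" $1.4 trillion after offsets
HEALTHCARE_NET_COST = 2_000   # "roughly another 2 trillion dollars"
SURTAX_REVENUE = 1_000        # 50% surtax on income above $500k
VAT_REVENUE = 1_730           # 15% national sales tax
ENFORCEMENT_REVENUE = 500     # improved enforcement of existing tax law
EXTRA_SURTAX_REVENUE = 200    # raising the surtax from 50% to 60%


def funding_gaps():
    """Return the remaining gap after each revenue step in the quote."""
    new_spending = UBI_NET_COST + HEALTHCARE_NET_COST
    after_taxes = new_spending - SURTAX_REVENUE - VAT_REVENUE
    after_enforcement = after_taxes - ENFORCEMENT_REVENUE
    after_higher_surtax = after_enforcement - EXTRA_SURTAX_REVENUE
    return {
        "new_spending": new_spending,
        "after_taxes": after_taxes,
        "after_enforcement": after_enforcement,
        "after_higher_surtax": after_higher_surtax,
    }


if __name__ == "__main__":
    for step, amount in funding_gaps().items():
        print(f"{step}: {amount} billion")
```

On the quote's own numbers the steps line up: $3.4 trillion of new spending, $670 billion left after the surtax and VAT, $170 billion after enforcement, and a small surplus if the surtax goes to 60%. Whether the inputs are realistic is a separate argument.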

The already existing budget shortfalls of almost a trillion dollars a year would gradually be made up if we returned to a pre-Reagan income tax regime (canceling all Republican tax cuts since then, which would raise over $380 billion a year in current dollars), and greatly cut our spending on useless foreign wars (to the tune of $300+ billion a year) and corporate welfare (by the narrower definition, in adjusted dollars, gaining us some $70 billion), and enacted a reasonable drawdown in overall military spending (earning back $100+ billion).

It’d also replace roughly $100 billion in federal employee costs by simply not duplicating UBI to federal employees and pensioners (i.e., if a federal retiree is receiving a pension of $1,500 a month, or a federal worker is receiving a salary of $1,500 a month, they would continue receiving ‘that’, instead of UBI). UBI would simply be part of the already agreed upon compensation package.

A sort of tangential comment, but when reading that in one month we have Amazon and retailer price-fixing, 90% of poultry suppliers price-fixing, most of the world’s shipping container manufacturers price-fixing, and a new investigation into beef price-fixing, I’m thinking the model of consumer capitalism is simply wrong today. It’s easy and desirable for wealthy owners to coordinate together to maximize profits by agreeing on prices in unison and only changing this formula to prevent a competitor from gaining a foothold in the industry with the deployment of predatory pricing. You can’t do anything about this, and if they’re clever, they realize that it is in their collective interest to never lower prices and instead make more money through coordination without actually putting anything in writing. The system actually just sucks!

Meanwhile in the real world, Amazon has consistently had higher approval ratings than basically any other major institution, including the military like this from 2021. Even now Amazon is almost double the president in approval rating.

Maybe these companies still suck, but they suck less than everything else!

In the real world the consumer has no idea what price he could be getting without the corrupt tactics employed by Amazon. He has no idea that Amazon was secretly increasing prices. He has no reason to think that this was in the realm of possibility, and would assume that there were laws against price-fixing. He would have no reason to know any of this was happening, because the lawsuit showing this was only publicized last month.

For years, Amazon has reached out to its vendors and instructed them to increase retail prices on competitors’ websites, threatening dire consequences if vendors do not comply. Vendors, bullied by Amazon’s overwhelming bargaining leverage and fearing punishment, agree to raise prices on competitors’ websites, or to remove products from competing websites altogether. This price fixing scheme typically begins with Amazon demanding that vendors “fix,” “correct,” “increase,” “raise,” or “look into” the prices of products on other retailers’ websites. These directives to vendors are backed by the threat of significant penalties for failure to comply — ranging from advertising and promotion restrictions, to demands for financial compensation, to the removal of vendors’ products from Amazon.

https://oag.ca.gov/news/press-releases/naming-names-attorney-general-bonta-secures-public-access-evidence-amazon-price

Replying under myself to make an additional point: should it not in theory be easy to stop this behavior while retaining the basic structure of capitalism?

  • jail time for infractors and the expropriation of all of their wealth and property. Like, actual punishments. Why are we not actually punishing these people?

  • multi-million dollar payouts to whistleblowers, plus national honors. Place their portraits in the White House, they are model citizens.

  • a government program that sends out mailers and emails to every vendor in the country, asking them about their experiences with services in exchange for a small payment for their time (this is to know where to investigate)

  • Use AI to determine which industries should be lowering their prices (but are in an unspoken agreement not to), and then force them to or tax them

No. They just became good at price gouging - which is hidden and subtle.

They just became good at price gouging -

Price gouging is only after emergencies. And also a good thing because high prices in an emergency are a market signal to bring in more supply, which allows more people to get the stuff they need and want.

But all the capitalists say Farmer Brown will simply spin up a multi-billion dollar beef industry, using the power of his mind and his bootstraps to roll back climate change 15 years and materialise another Idaho's worth of grazing land from the ether!

But they don't say that, as you know.

What is the purpose of a company? Not the stated one, the original one.

If you mean company in the sense of "for-profit corporation," then I think the answer is pretty straightforward: Companies exist to (1) encourage people to invest in business ventures by offering them limited liability; and (2) offer people a convenient, off-the-rack method of organizing a partnership.

If you mean company in the sense of "firm," I think it's also pretty straightforward: Companies exist because there are synergies in having a bunch of people working together on some joint project.

Companies and markets exist to maximize the consumption!

I'm not sure why you switched from just "companies" to "companies and markets," so I will focus on companies. Ok, suppose a company is faced with a choice between (1) maximizing shareholder value; and (2) maximizing consumption. For example, perhaps the company produces some luxury good or service and there is a clear choice between (1) selling limited amounts at a fat markup; and (2) selling large amounts with paper thin margins. Assuming the math clearly shows that choice (1) is far more profitable, I would expect the company to take that choice. (Obviously, things are rarely so clear-cut in the real world, but in a situation which really was that clear, a company CEO who chose (2) could expect to be sued by the investors.)

Of course, in theory society could step in and more or less redefine the purpose of companies. Seize the means of production, if you will. I think that's what you are basically arguing here, and I don't necessarily disagree with you. Logically it makes sense that once mankind's final invention takes place, there's no more need to encourage private capital investment. Perhaps This Time it's Different.

The original reason for companies is really simple to derive. People engage in mutual trade in order to improve their life. Guy A has two cows and no chickens, guy B has no cows but four chickens, they trade and now they both have a cow + two chickens, or whatever. Or you exchange labor, you'll mow their lawn (something they hate to do) and they'll do your dishes (something you hate to do). Or anything else like that. You trade for your own benefit and both sides (in free trade without coercion) gain overall utility.

There is no greater overall thought to your choice to trade a surplus pencil for a desired pen. You just would prefer the pen over a pencil and they prefer a pencil over the pen.

But uh oh, some tasks or goods that people want for trade are complex and require the coordination and labor of multiple people in order to provide. But the benefits of this trade will be so good that it's worth it, so someone who is smart and enterprising makes a group to do that and make/do something in order to trade it for their own profit.

There you go, that's the purpose of a company. It is to enrich the owner through engaging in trade just like everyone else engaging in trade is doing.

The pencil company doesn't make pencils "for the jobs" (in fact they want low labor costs) or for "the greater good", they do it to sell the pencils and make money. And people buy the pencils not for the jobs it creates, or for the greater good, but because they want a pencil.

The original reason for companies is really simple to derive. People engage in mutual trade in order to improve their life.

Yes, that's what I meant.

The pencil company doesn't make pencils "for the jobs" (in fact they want low labor costs) or for "the greater good", they do it to sell the pencils and make money. And people buy the pencils not for the jobs it creates, or for the greater good, but because they want a pencil.

No, corporations make money because it's currently a good (or maybe the best) way to ensure as many people that want a pencil get one. But it's not the ultimate goal of having corporations exist as a concept. Life improvement is. A company that manages to make money despite not producing anything and not having any workers is not infinitely good, it's infinitely bad.

This seems exactly the wrong way round to me. You were closest with choreography - corporations exist as a handy package to allow groups of people to coordinate to do whatever they want to do.

Or are you asking, “why does the government permit corporations to exist”? Beyond freedom etc.

Beyond freedom etc.

Freedom etc. doesn't get you limited liability for negligence. (A business in a world without companies could get limited liability for ordinary business debts by negotiating it into every commercial contract they signed, but thinking about the practicalities of that tells you why a standard-form deal of limited liability in exchange for transparency is something a government would want to create). "I absolutely cannot under any circumstances lose more on my passive investments than the amount I invested even if my business partner is evil" is a socially valuable deal to have available that you can't contract for at common law.

Limited liability is just freedom to contract. We write a contract where I owe you something, but with conditions that it can only come from this venture. You don’t need the government to do anything special other than respect property rights and the sanctity of contracts.

You can't contract out of tort liability - you can indemnify by contract, but that only helps if the party giving the indemnity is good for the money. If one of my drone startup's drones negligently flies into a nuclear power plant and forces the evacuation of a whole city, at common law every partner in the business is liable to the point of bankruptcy. You can't contract out of this because most of the people injured would never be party to the contract. So you need statutory limited liability.

AI bros still in shambles, news at 7.

A few weeks ago, Anthropic made a post about their new model, Mythos. As other members of the AI industry have done as far back as the release of GPT-2, its creators said it was too dangerous to release. The headline feature of Mythos, at least as described by Anthropic, was not code generation. Instead, they specifically hyped it as the most amazing thing ever for finding security vulnerabilities in code.

Several people, including here on this forum, shared the hype. As usual, I remained unconvinced. I've mentioned elsewhere that I don't think AIs are inherently incapable of finding security vulnerabilities in code; my main skepticism is that they will generate lots of false positives in the process that will make them a lot less useful than the companies selling them have advertised. And more importantly, I think they are currently incapable of designing and maintaining any significant projects that go beyond a basic bitch CRUD application or things of that sort. I'm also skeptical that there is all that much room for growth or improvement beyond their current capabilities, for a number of reasons that I won't get into right now.

But enough about my opinions, I'm just a retarded code monkey doing API integrations for boring tax software. Enter Daniel Stenberg, the creator and maintainer of curl. For those who don't know, if you have a program or library that makes HTTP requests, there is an extremely high likelihood that it is using curl under the hood. It's basically one of the foundational pieces of modern digital infrastructure, a "project some random person in Nebraska has been thanklessly maintaining since 2003", as XKCD might put it: https://xkcd.com/2347/

Stenberg/curl was one of the projects that was offered early access to Mythos. However, despite being promised access initially, it took several weeks to get it. And even then he was suddenly no longer being offered direct access; instead, the offer was to have someone else run Mythos against his codebase and then share the results with him. This is a big red flag for me, because if Mythos does actually generate a lot of noise/false positives, it would make sense that Anthropic would want to hide that by running it themselves as many times as they could until it actually generated some real, actionable results.

In any case, the results that Stenberg got back were underwhelming. Mythos claimed to have identified 5 vulnerabilities. After investigating all of them, Stenberg and his team determined that only one of those was a vulnerability, and a low severity one at that. In Stenberg's own words: "curl is certainly getting better thanks to this report, but counted by the volume of issues found, all the previous AI tools we have used have resulted in larger bugfix amounts."

Most damning from Stenberg is this: "My personal conclusion can however not end up with anything else than that the big hype around this model so far was primarily marketing. I see no evidence that this setup finds issues to any particular higher or more advanced degree than the other tools have done before Mythos. Maybe this model is a little bit better, but even if it is, it is not better to a degree that seems to make a significant dent in code analyzing."

So I'm asking @self_made_human and others who seem more on-board with the AI hype train: does this report from a knowledgeable and experienced developer change your opinions on the future trajectory of AI at all?

Full article by Stenberg can be found here:

https://daniel.haxx.se/blog/2026/05/11/mythos-finds-a-curl-vulnerability/

Do you ever tire of this? I don't mean AI skepticism, I mean finding the least persuasive pretexts for it.

Enter Daniel Stenberg, the creator and maintainer of curl

from your own link:

Before this first Mythos report, we had already scanned curl with several different very capable AI powered tools (I mean in addition to running a number of “normal” static code analyzers all the time, using the pickiest compiler options and doing fuzzing on it for years etc). Primarily AISLE, Zeropath and OpenAI’s Codex Security have been used to scrutinize the code with AI. These tools and the analyses they have done have triggered somewhere between two and three hundred bugfixes merged in curl through-out the recent 8-10 months or so. A bunch of the findings these AI tools reported were confirmed vulnerabilities and have been published as CVEs. Probably a dozen or more.

I don't care to argue that Anthropic has greater AI than other players, I don't even believe this, by all accounts Mythos is pretty much Opus that's like 3x bigger. But Mythos finding anything on top of that is impressive enough, because presumably these tools have picked clean all curl vulnerabilities that are easy for AI to notice (on top of humans having hunted vulnerabilities in general for decades now). The real news for me here has been how tightly AI audits are already integrated into our core digital infrastructure.

If you want to deboonk AI hype, you've got to try harder. I propose attacking this.

I thought all the talk about software vulnerabilities would peter out for now, but I don't think that marketing is the only explanation.

Materialists are making the logically consistent assumption that if humans are computers, then AI is guaranteed to surpass our capabilities in every respect. So they predict a future which may not be real if materialism isn't real, and are hallucinating that such a future has arrived out of a cycle of fear and a desire to get ahead of it.

Strictly speaking, just because one hyped-up thing failed to hit the mark, it doesn't mean that it isn't coming, especially given the pace of developments. But Charlie Kirk said it right: AI is destined to throw our assumptions into chaos one way or another, and I, for one, am curious to see exactly what gets discredited as our knowledge and actual experience is forced to increase. Though it would be nice if we had a better understanding of things before we're forced to learn it inadvertently.

Materialists are making the logically consistent assumption that if humans are computers

Neuroscience still has a lot of ground to cover, but we already know the brain isn't a binary computer. It seems to me that one very easily could be a materialist and think that the brain is not a computer and I've always been a bit puzzled by the consistent tendency to equate them.

Neuroscience still has a lot of ground to cover, but we already know the brain isn't a binary computer. It seems to me that one very easily could be a materialist and think that the brain is not a computer and I've always been a bit puzzled by the consistent tendency to equate them.

The claim isn't that the brain is a "binary computer", it's that however the brain works, it does not have computational capabilities that go beyond what is expressible by a Turing machine. So far we haven't been able to come up with a physical system of whatever sort that everyone agrees is able to come up with results that something like a digital computer can not even in principle. Roger Penrose does think that the human brain is one of those, and some mathematical insights humans can have are literally examples of super-Turing computation, but most everyone else thinks he's being a crank about this.

The claim isn't that the brain is a "binary computer", it's that however the brain works, it does not have computational capabilities that go beyond what is expressible by a Turing machine.

Your link says

the computational theory of mind (CTM), also known as computationalism, is a family of views that hold that the human mind is an information processing system and that cognition and consciousness together are a form of computation.

It then goes on to explain that, arguably, "everything is computer."

Perhaps the human mind is a computer in the sense that everything is, but there doesn't seem to be good evidence that it is a computer in the sense that the metaphor is helpful to understanding the human mind. The human brain does not create representations of stimuli, store them, manipulate them, and retrieve them later upon demand according to a series of algorithmic rules.

Perhaps the human mind can't perform any mathematical calculations that cannot be performed by a Turing machine, but that doesn't mean that saying it is a computer is a helpful analogy. A digital tape recorder can record any song that a record can, but it's not helpful to call a record player a computer either - the mode of operation is different.

So far we haven't been able to come up with a physical system of whatever sort that everyone agrees is able to come up with results that something like a digital computer can not even in principle.

While I am sure that "not everyone agrees" my understanding is that it seems pretty clear that the universe, itself, is not simulable.

Perhaps the human mind is a computer in the sense that everything is, but there doesn't seem to be good evidence that it is a computer in the sense that the metaphor is helpful to understanding the human mind.

That's the thing. People didn't decide a priori that "everything is a computer". People just went looking for things that can't be mapped into computers all over nature and never found one.

Perhaps the human mind can't perform any mathematical calculations that cannot be performed by a Turing machine, but that doesn't mean that saying it is a computer is a helpful analogy.

This is pretty much what the debate comes down to though, remember the original argument was about whether we should expect AIs to surpass humans in everything humans can do. People keep trying to claim that humans have some magical domains of competence that will remain out of reach of AIs. For this to be a useful argument against claims of AI doom, it needs to cash out as the human mind doing some sort of work that shows up as output in the world, like a symphony or a beautiful masterpiece on a canvas. The theory of computation is very different from actual computer engineering, and the Aeon magazine writer seems to not understand this. It doesn't say anything about bytes, files, subroutines, operating systems, databases, images or buffers, just that there is some finite-length (but probably very long) lawful process that generates the speech or movement that shows that the thinking happened, and that the process could be translated to be run by a Turing machine.

While I am sure that "not everyone agrees" my understanding is that it seems pretty clear that the universe, itself, is not simulable.

I'm not a theoretical physicist but I'm pretty willing to bet that a physics paper that appeals to Gödel's incompleteness theorem for wide-ranging claims about the ultimate nature of reality will not end up receiving wide scientific agreement. The Gödel argument is basically the same thing Roger Penrose goes on about, and it goes back to John Lucas in 1959. It's had plenty of time to convince people and as far as I understand it by and large hasn't done that.

Apparently a previous reply was eaten, my sincere apologies if this ends up a double-post.

That's the thing. People didn't decide a priori that "everything is a computer". People just went looking for things that can't be mapped into computers all over nature and never found one.

The fact that "people" latch on to an easy metaphor does not necessarily indicate that the metaphor is good. The fact that the people most familiar with computers latch on to this metaphor also does not necessarily indicate that the metaphor is good.

remember the original argument was about whether we should expect AIs to surpass humans in everything humans can do.

This wasn't my claim, though.

The theory of computation is very different from actual computer engineering, and the Aeon magazine writer seems to not understand this.

The Aeon author did tackle the idea that the mind is an algorithm, which is, as I understand it, part of the theory of computation. We have good reasons to think the brain does not run on an algorithm; as the author of the piece I linked to points out, memory is extremely inexact, which is the opposite of what we would expect if the brain operated in an algorithmic manner.

But to take a step back, even if we wish to draw a distinction between "computer as hardware" and "computer as information processing device" the linguistic overlap invites us to confuse the two. And I don't think this is good; the analogy breaks down quickly in practice and invites us to forget the massive differences between the brain and electronic computers; it's true the brain uses electrical impulses but it also uses chemicals and is much slower than a computer. This metaphor, turned loose into the wild, has led to the popularization of what should be obviously implausible ideas, such as "mind uploading" or even that a computer could have emotions that we know in humans are substantially influenced by hormones.

In short, the idea that the mind is a computer is a sloppy one even if the motte is more defensible than the bailey by far precisely because the word "computer" makes it inherently a metaphor that yields a motte-and-bailey, even subconsciously.

The Gödel argument is basically the same thing Roger Penrose goes on about

I am not a theoretical physicist, or a mathematician, or a neurologist, but I am pretty sure you are wrong.

As I understand it, it works something like this. Gödel's incompleteness theorem says you can't algorithmically "solve" math (in the sense that there's not a super-algorithm that can do all mathematics). Penrose said "aha but humans can so we're BETTER THAN TURING MACHINES." The skepticism of Penrose isn't that Gödel is wrong, it's about whether or not humans can do that. If Gödel's incompleteness theorems suggest that our universe isn't a simulation, that's a different line of argument.

The Aeon author did tackle the idea that the mind is an algorithm, which is, as I understand it, part of the theory of computation.

Yep, this is a much less confusion-prone way of saying it than "the mind is a computer".

We have good reasons to think the brain does not run on an algorithm; as the author of the piece I linked to points out, memory is extremely inexact, which is the opposite of what we would expect if the brain operated in an algorithmic manner.

And this is utterly confused. Douglas Hofstadter's cartoon illustrated the error pithily way back in Gödel, Escher, Bach. The algorithm is exact (the small, correct sums in the Hofstadter cartoon), but it's also too precise and constrained to do mind-like stuff directly in the small. Instead, the mind runs on a sort of virtual machine (big numbers built from the small sums in the cartoon) built up by the algorithm that can do complex pattern recognition and creative solutions, but is also constantly getting things wrong. As we see from AIs, virtual machines like this can be implemented on silicon just fine and they exhibit the same behavior of being able to do difficult useful stuff but also constantly getting details wrong on their own.
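A toy illustration of that layering, in Python. This is only a sketch under my own assumptions, not anything from Hofstadter or the Aeon piece: the names `similarity` and `recall` and the three-word "memory" are made up. Every low-level step is exact integer arithmetic, yet the recall behavior built on top of it is approximate and can return the wrong item.

```python
def similarity(a: str, b: str) -> int:
    """Count the positions where two strings agree. Plain, exact integer arithmetic."""
    return sum(x == y for x, y in zip(a, b))


def recall(memory: list[str], cue: str) -> str:
    """Return the stored item most similar to the cue (ties go to the earliest stored item)."""
    return max(memory, key=lambda item: similarity(item, cue))


# Hypothetical stored "memories" for the illustration.
memory = ["gold", "cold", "bold"]

print(recall(memory, "golf"))  # a near-miss cue still retrieves a stored word
print(recall(memory, "?old"))  # a smudged cue meant for "cold" ties three ways
```

The second lookup is the interesting one: nothing in the code is random or imprecise, but a degraded cue gets resolved to whichever stored pattern wins the tie, so the system "misremembers" while the underlying algorithm stays exact.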

In short, the idea that the mind is a computer is a sloppy one even if the motte is more defensible than the bailey by far precisely because the word "computer" makes it inherently a metaphor that yields a motte-and-bailey, even subconsciously.

I sorta agree here. It's basically an accident of history that "computers", things with hard drives, keyboards, operating systems, files, RAM and CPUs, and "computation", the evaluation of general recursive mathematical functions, which matches what a Turing machine (which, again, isn't a "machine" that you build from wires and bolts, but a mathematical construct) can compute, ended up using the same terminology, up to "computer" being right there in the name "computer science". This is why the cognitive science school is called "computationalism" instead of "computerism", and the practitioners optimistically thought that given a name like that, obviously people would think Turing machines, not quad core Mac Pros.

As I understand it, it works something like this. Gödel's incompleteness theorem says you can't algorithmically "solve" math (in the sense that there's not a super-algorithm that can do all mathematics). Penrose said "aha but humans can so we're BETTER THAN TURING MACHINES." The skepticism of Penrose isn't that Gödel is wrong, it's about whether or not humans can do that. If Gödel's incompleteness theorems suggest that our universe isn't a simulation, that's a different line of argument.

The problem with Penrose's argument is that humans are doing math pretty much as you'd expect if constrained by Gödel: by stumbling into theorems, working hard trying to prove them, and sometimes finding themselves stuck and unable to show something as either true or untrue. The crackpot smell with the physics paper is that Gödel's theorem is ultimately pretty limited. It says that any formal system powerful enough to do any sort of interesting math in allows stating the equivalent of the liar's paradox, which cannot logically resolve to be either true or false, therefore you can't have a mechanism for determining the truth of any proposition because you have liar's paradox propositions floating around. The equivalent impossibility theorem for computer science is the halting problem: you can't write a program that looks at the source code of any program and tells whether the program will terminate. For simulations, this would be saying something like that you need to actually run the simulation to see what kind of state it ultimately ends up in (and whether it stops at a steady state or goes on forever), and can't just look at the simulation's source code and figure it out. But it doesn't prohibit running the simulation and looking at what happens in it while it's running.
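For anyone who wants the halting-problem argument in concrete form, here is a minimal Python sketch. The framing is my own assumption: a "program" is modeled as a generator function where each `yield` is one step, and `optimist` and `pessimist` are hypothetical stand-ins for a would-be halting decider. The point is that any decider can be handed a program built to do the opposite of its prediction, while simply running the program with a step budget stays perfectly possible.

```python
from typing import Callable, Iterator

# Assumption for this sketch: a "program" is a zero-argument generator function,
# and each `yield` counts as one step of execution.
Program = Callable[[], Iterator[None]]
Decider = Callable[[Program], bool]


def make_paradox(halts: Decider) -> Program:
    """Build a program that does the opposite of what `halts` predicts for it."""

    def paradox() -> Iterator[None]:
        if halts(paradox):
            while True:  # predicted to halt, so run forever
                yield
        # predicted to run forever, so stop immediately

    return paradox


def finishes_within(program: Program, max_steps: int) -> bool:
    """Actually run the program, giving up after max_steps steps."""
    for steps_taken, _ in enumerate(program(), start=1):
        if steps_taken >= max_steps:
            return False  # still running when the budget ran out
    return True


def optimist(program: Program) -> bool:
    return True  # hypothetical decider: claims everything halts


def pessimist(program: Program) -> bool:
    return False  # hypothetical decider: claims nothing halts


for decider in (optimist, pessimist):
    predicted = decider(make_paradox(decider))
    observed = finishes_within(make_paradox(decider), max_steps=1_000)
    print(f"{decider.__name__}: predicted halts={predicted}, observed halts={observed}")
```

Both toy deciders are wrong about their own paradox program. In general, running out of budget does not prove a program never halts; here we only know it because of how `paradox` is constructed. That is the "you have to run it and watch" situation described above.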

Even assuming the article is correct, I'm not sure it'll tell us anything useful about human capabilities versus silicon. Halting problem style arguments do claim that we can't build a literal machine-god that can figure out the exact trajectory of our universe ahead of time just by thinking hard. But that's not necessary to have machines that are better at doing everything humans value doing.

Instead, the mind runs on a sort of virtual machine (big numbers built from the small sums in the cartoon) built up by the algorithm that can do complex pattern recognition and creative solutions, but is also constantly getting things wrong.

This is a possible explanation, but as far as I can tell, not a necessary one, except inasmuch as one could stretch the word algorithm - which carries a connotation (or perhaps definition, if you cherry-pick one) of precision and repeatability - to encompass any process - although perhaps we are talking past each other here. Certainly the brain has deterministic aspects. But because it's a physical organ, it doesn't seem to behave algorithmically. Even if there is an underlying algorithm (and certainly I imagine there's an underlying process or, more properly, series of processes) it's so confounded by biological processes that I still have qualms about the word choice.

Even assuming the article is correct, I'm not sure it'll tell us anything useful about human capabilities versus silicon.

Yes, I think that's right. I brought it up because the universe is a physical system that can do things an algorithm can't.

Halting problem style arguments do claim that we can't build a literal machine-god that can figure out the exact trajectory of our universe ahead of time just by thinking hard. But that's not necessary to have machines that are better at doing everything humans value doing.

Yes, and I am much more irritated by the former sorts of arguments than the latter sorts of arguments.

My personal take is that AIs are likely to continue to be "spiky" in their intelligence for the near future but that's not because of abstract beliefs so much as it is just observing their overall trajectory and what I know about how they work. There will probably always be things that humans are better at doing, but I think that is a claim I can make with some confidence because humans like doing things like procreating, not because of Gödel's incompleteness theorem. Even if Penrose is right, it doesn't seem to me like it tells us much about the capabilities of silicon in most practical matters.

Materialists are making the logically consistent assumption that if humans are computers, then AI is guaranteed to surpass our capabilities in every respect. So they predict a future which may not be real if materialism isn't real, and are hallucinating that such a future has arrived out of a cycle of fear and a desire to get ahead of it.

I don't think this makes sense. You don't have to be a materialist to believe that AI is capable of surpassing human capabilities in all strategically relevant respects. It may very well be that only creatures with non-material souls can have qualia, but AI doesn't need qualia to destroy the world, and it certainly doesn't need qualia to wreck the economy in a mundane sense where it doesn't even go rogue.

You'd have to be not only a non-materialist, but someone who believes that the soul is doing a lot of the 'thinking' in a practical sense, for this to be otherwise - and I don't think that's a mainstream opinion even among dualists. But even then - even if you believe that a material machine can never replicate what happens in a human's mind when the human thinks about a problem, this is no guarantee that the AI can't arrive at a functional answer by different, possibly more efficient means.

I agree, except that if you start with the assumption that one doesn't yet know what the capabilities of AI are, then one rationally ought to keep space for skepticism of doomsday scenarios.

But you're right, and I don't assume that trouble isn't coming; I just saw the obvious other explanation for the talk of vulnerability-finding AI and determined based on how people were behaving that hype was the more likely explanation, this time. And I think that fear is primarily driven by the materialism of our times.

After all, when people talk about artificial intelligence replacing humans, the unstated premise is that humans are really just computers or not much better. See how easily they can do what humans can do? Haven't they passed the Turing Test?

Obviously, this is an attempt at mind reading, but I think it is a better explanation than marketing. As a marketing strategy, intentionally making promises that will obviously be falsified and talked about widely when the product is released seems silly.

So funny story. I've finally been pressed into using AI at work. I work on a closed network, but they run an LLM locally, so I basically use it the same way I use google these days, since all search engines have turned into LLMs. It's good enough when I have a quick question about syntax I've forgotten, or an API I can't access the documentation for. I still refuse on principle to have it write any code for me though.

Ah. I'm glad I'm not the only one who's come to the conclusion that the millions (billions?) sunk into LLMs have basically just re-invented Google search from 2015.

Sort of related…

I’m a latecomer to IT and information security. I spent most of my career in sales until going back to school for InfoSec. I started off in a help-desk role three years ago and I’m currently the information security engineer for a small IT team. I basically handle all security tasks: network, web, IAM, audits, etc. I’m 42, so this was a later-in-life transition. My boss is younger than me by at least 7 years and is far more knowledgeable than I.

Anyways…I have found myself to be very interested in application/web app security. The thing is, I can’t code. I rely mainly on vuln scanning and static code testing, with a little bit of pen testing knowledge thrown in. Any advice on where I should start if I want to learn more about app development and coding?

The thing is, I can’t code. I rely mainly on vuln scanning and static code testing, with a little bit of pen testing knowledge thrown in. Any advice on where I should start if I want to learn more about app development and coding?

Learn C to the point where you understand how to work with pointers and the whole business of a function receiving a pointer to a memory region and doing stuff to it. This is old-school; there's little new programming that should be done in C because of how hard it is to write secure programs in it. But it's great as a model that fits in your head for how the ground floor works in an actual computer program. Doubly so if you're interested in infosec, since a lot of attacks involve an impedance mismatch between the conceptual idea of a program and the boots-on-ground reality of its runtime, which is probably dealing with something written in C near the bottom.

You'll want to learn another programming language to write actual software in, but whatever you pick, if you know C, you now have the mental tool of asking "what kind of C program does this weird thing this programming language does reduce to?", which will hopefully help you see it as more of a useful tool than an inscrutable black box.
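A small example of that "what C program does this reduce to?" question, using Python as the second language. This is just a sketch: `zero_fill` and `rebind` are hypothetical helper names, and the C signatures in the comments are rough analogies rather than what the interpreter literally does.

```python
def zero_fill(buf: list[int]) -> None:
    """Overwrite every element in place.

    Rough C analogy: void zero_fill(int *buf, size_t n), writing through the pointer.
    """
    for i in range(len(buf)):
        buf[i] = 0


def rebind(buf: list[int]) -> list[int]:
    """Point the local name at a brand new list.

    Rough C analogy: assigning to the local copy of a pointer parameter,
    which the caller never sees.
    """
    buf = [0] * len(buf)
    return buf


data = [1, 2, 3]
alias = data           # copies the reference (think: pointer), not the list
zero_fill(alias)       # writes through the shared reference
print(data)            # the original changed too

untouched = [1, 2, 3]
rebind(untouched)      # only the function's local name was redirected
print(untouched)       # the caller's list is unchanged
```

Once you have the pointer model from C, the difference between these two functions stops being a language quirk to memorize and becomes the obvious consequence of what got copied.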

https://automatetheboringstuff.com/

My go to recommendation.

I seriously recommend either a good intro class or a good book for self study. (My recommendations on those would be pretty outdated now, so I can't offer any names myself.)

What WhiningCoil says about programming being a diverse set of skills in practice is true. But there is a core aptitude of thinking algorithmically. Some people can do it off the bat, some people can't do it at all, and some people need to try it from several angles before it clicks. This isn't really a matter of being smart enough; once you're over a certain threshold of intelligence, there just seem to be some people who are wired for it and some who aren't.

So I'd start with that. If it clicks, you can move on to study the other stuff in whatever way is best for you. If it doesn't, you can know that you gave it a fair shake.

Edit: As an addendum, I recommend learning your second programming language soon after your first. Some people fret about this and think it will be harder than it is. But it isn't that hard, and having experienced it will change how you evaluate your tools.

This is going to sound insane, but learn to write music in standard notation. In a lot of ways, it's a very simplified programming language.

Flashback to ABC notation

I donno man. I took to coding like a fish to water. And "coding" is really like, a half dozen skills put together. It's knowing the language you want to code in. It's knowing the ecosystem of libraries that probably do most of the work for you. It's having some knowledge, if imperfect, of what's probably going on under the hood with respect to threads, memory, disk access, garbage collection, etc. It's knowing how not to code yourself into a dead end, or unfuck yourself if you find yourself there.

It actually reminds me how woodworking isn't just cutting and assembling wood. It's making a design, picking out planks, milling to s4s, factoring in wood movement, sanding, finishing. The part most people think of as "woodworking" might actually be 5% of the task. It's the exciting part most youtube woodworkers focus on. But it's still probably the smallest part of the job.

And I went to school to learn how to code. I'm not sure I'd recommend that mid career. Maybe take some online classes. Open source can be intimidating, but I think contributing to it did more than anything to grow my skills and increase my confidence. Diving into a foreign, mature code base and learning how they do things is also a huge part of the job.

since all search engines have turned into LLMs

Google appears to have actually dropped their full Boolean search functionality, I assume because of this.

It's going to become a huge problem (or, at a minimum, extremely annoying), particularly in parts of my line of work.

Google appears to have actually dropped their full Boolean search functionality

Well, crap. I may now be finally forced to shift to a different search engine because of this, but they all seem to be rushing full tilt like the Gadarene swine into AI-ification.

My expression right now: 😠

I may have overstated the problem - I need to test it more, I was having problems with the exact search function and it seems Google has a "verbatim mode" that might assuage my concerns - but I definitely am not happy with the overall trajectory.

Verbatim and minus have just meant "more/less of this please" to google for years now -- well before LLM influence. I'm not sure why exactly, but corporate policy seems to be that (even setting aside sponsored results) the algo knows what you want better than you do. And the algo is getting worse.

Usually in the past if I copy/pasted something into Google in quote marks, it would quickly point me towards the right thing.

A week or two ago when I was working on a project that required this, I had a weird experience. If I'm recalling the exact sequence right, it told me it didn't have any matches - but then, when I scrolled down, the correct match was something like third from the top - the algo seemed to only be checking the preponderance of the words, and thus even when it could correctly source what I was looking for, it wouldn't flag as a 100% match.

So even though it had exactly what I was looking for, it didn't act as if it did.

Even when it does point you to the right thing, it is also showing you other things now -- in the deep(ish) past, if you put something in quotes it would only show results containing that string. Similarly (although I think this went away first), a search for -(thing you don't want to see) used to result in zero results containing that term -- now if you search for "used cars -chevy" it probably shows you fewer chevys than otherwise, but you are still going to see some. Particularly harmful when you are looking for something with one extremely common straightforward set of results (that you are not interested in) and an alternate niche interpretation. (the thing you want to find!)
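To make the difference concrete, here is a toy Python sketch of the two behaviors. It is emphatically not how Google works internally; `strict_search`, `soft_search`, the scoring rule and the three listings are all assumptions invented for the illustration. The strict version treats quotes and minus as hard filters, the soft version treats them as hints that only reorder results.

```python
def strict_search(docs: list[str], phrase: str = "", exclude: tuple[str, ...] = ()) -> list[str]:
    """Keep only docs that contain `phrase` verbatim and none of the `exclude` terms."""
    phrase = phrase.lower()
    banned = [term.lower() for term in exclude]
    results = []
    for doc in docs:
        text = doc.lower()
        if phrase in text and not any(term in text for term in banned):
            results.append(doc)
    return results


def soft_search(docs: list[str], phrase: str = "", exclude: tuple[str, ...] = ()) -> list[str]:
    """Rank every doc: +1 per query word present, -1 per excluded term present. Nothing is dropped."""
    words = phrase.lower().split()
    banned = [term.lower() for term in exclude]

    def score(doc: str) -> int:
        text = doc.lower()
        return sum(w in text for w in words) - sum(t in text for t in banned)

    return sorted(docs, key=score, reverse=True)


# Hypothetical listings for the illustration.
docs = [
    "Used cars for sale: Chevy Malibu",
    "Used cars for sale: Honda Civic",
    "New cars: Chevy Bolt review",
]

print(strict_search(docs, "used cars", exclude=("chevy",)))
print(soft_search(docs, "used cars", exclude=("chevy",)))
```

With the strict filter the Chevy listings are simply gone. With the soft ranking the Honda listing comes first, but both Chevy results are still on the page, which is the "fewer chevys, but you are still going to see some" behavior.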

AI influence seems to be making this a bit worse, I suspect since the "this is probably what he really wants" is more strongly weighted -- but it might be corpus frequency effects too I suppose.

What's frustrating is that I am pretty sure a nonzero portion of this is simply to boost ad revenue.

Death by a thousand straws on the back of the goose that laid the golden egg.

I'm not sure why exactly, but corporate policy seems to be that (even setting aside sponsored results) the algo knows what you want better than you do. And the algo is getting worse.

The version of this that I hate the most right now, merely due to exposure, is in Windows, where the bottom-right notification pop-up gets selected or ignored if you click on the area just a few pixels out of it, as if I had accidentally clicked just outside the borders of it. No, I clicked on that specific pixel on purpose, because that pixel had the specific UI element that the pop-up box covered up that I wanted to select! If I click on a pixel directly adjacent to the pop-up box, I want it to be interpreted no differently from if I clicked on a pixel 500 away from the pop-up box. The only justification I can think of is for touchscreens, but those pop-up boxes aren't exactly tiny, and making UI behave differently based on input device (mouse vs touchscreen) is something that should be very very possible in Windows.
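For what it's worth, the per-device behavior is not hard to express. Below is a minimal Python sketch of hit-testing with a device-dependent margin. It is not the Windows API or how the notification system is actually implemented; `Rect`, `hits` and the slop values are hypothetical names and numbers chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    left: int
    top: int
    right: int   # exclusive
    bottom: int  # exclusive


# Assumed values: extra pixels around a widget that still count as hitting it.
SLOP_BY_DEVICE = {"mouse": 0, "touch": 8}


def hits(rect: Rect, x: int, y: int, device: str) -> bool:
    """True if the point counts as landing on rect for the given input device."""
    slop = SLOP_BY_DEVICE[device]
    return (rect.left - slop <= x < rect.right + slop
            and rect.top - slop <= y < rect.bottom + slop)


popup = Rect(left=100, top=100, right=300, bottom=200)

print(hits(popup, 99, 150, "mouse"))  # one pixel outside, mouse
print(hits(popup, 99, 150, "touch"))  # same pixel, finger
```

With a zero margin for the mouse, a click one pixel outside the box goes to whatever is underneath, while a finger tap at the same spot still counts as touching the pop-up.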

I'm showing my age perhaps, but I swear there was a time when double-clicking a word in windows selected just that word -- I understand that sometimes people would also want the trailing space, but now even if you drag-select, that gets helpfully added in many programs (eg. Word).

Clippy lives on as a sloppy ghost in the machine...

Re-endorsing Kagi, another search engine

Seconded! I was skeptical about paying for search, but it's so much better than Google these days. I pretty quickly was convinced it was worth the monthly subscription.

Thank you!

Darkly amusing to imagine LLMs putting me out of a job, not because they are better at what I do, but because Google for some reason decided to gouge out their own eye.

I guess their quality has been slipping for some time, but the other day it started giving me screwy results when I was hunting for specific phrases. I guess I will have to make sure that "verbatim mode" is switched on whenever I search for an exact phrase, now...and then hope they don't get rid of that, too.

I was trying to google whatever happened to that guy who ran down a Christmas parade. I remembered almost no details about it. Not the name, location, etc. Google's LLM was adamant that no black man had ever done anything like that, and explicitly said only white people had. It was only displaying search results about Charlottesville, and how the guy who did it got what was coming to him. I was trying to put together a rebuttal to a post from the last week or two on the Charlottesville Unite the Right incident. I think Google somehow knew that, because all the LLM summaries were preemptive rebuttals to the information I was attempting to find.

It made me highly skeptical of the narrative being pushed by the OP's "exhaustive" research. Especially when my own search attempts were so heavily guard railed to keep me on narrative.

I fucking hate this brave new world.

I did eventually find the information, and now for whatever reason it comes up readily. It was Darrell Brooks and he attacked a Waukesha Christmas Parade. He got the book thrown at him.

How could you ever forget Darrell Brooks? We have multiple Marseys of him over on rdrama.net

The whole trial was an absolute hoot and we even managed to get ourselves mentioned during the trial through one of our operations where we pretended to act like we'd put undue influence on the jury...

Google's LLM was adamant that no black man had ever done anything like that

No, you see, that's because if you remember the reporting at the time, it was the vehicle what done it, the evil machine. The car or truck took it into its head to just run out of the driver's control and charge into a parade all on its own initiative.

There was some mockery of the phrasing about this on social media, if you read the right websites. Brooks insisted on being his own representative at trial which led to some very entertaining moments.

The information shows up as the first search result. You say "For whatever reason" it shows up now, but what is your theory here- the Google AI somehow knew that you, specifically, were looking for wrongthink and tried to foil your attempts, but then elected not to do that for anyone else? I'd be curious to know exactly what your query was. You searched for something like "black guy who drove a car into a parade" and the AI summary posted text saying this never happened and only white people have ever driven cars into crowds? How very odd.

My guess is their LLM over-indexes on the recent search history and what you click on. So likely WhiningCoil vaguely described the incident with perhaps incorrect info, and with the low-information query Google returned bad results. He clicked on them to see if they were the thing he was thinking of and the LLM got that irrelevant stuff stuck in its context. I have similar issues when using OpenAI models professionally and personally.

Searching "black guy who drove a car into a parade" returns the wiki article on the attack as the first result and has the same info in the AI box.

You should read until the end of my post.

I feel like a lot of people in these replies are talking past each other.

My 2 cents:

Are LLMs useful tooling for finding vulnerabilities for security researchers?

Yes, I think this is undeniable at this point; LLMs are exceptional at uncovering software flaws, bugs, and vulnerabilities, and are going to significantly change how cybersecurity is practiced, as can already be seen from how sharply the number of vulnerability disclosures has spiked recently.

Is Mythos better than the other available models at finding and exploiting vulnerabilities?

Yes, Mythos really being a stronger model for cybersecurity applications is almost certainly the case: this XBOW report is a good read on its capabilities.

Is Mythos a super-hacker that's going to break cybersecurity for good?

No, this seems unlikely and driven by good marketing from Anthropic and online hype. Mythos isn't making the Move 37 for cybersecurity or discovering vulnerabilities beyond human comprehension, it's just an iterative improvement over the current tooling combined with a lot more compute and attention suddenly being used to uncover security vulnerabilities. I suspect that the same amount of compute, security researcher attention and buy-in for Project Glasswing applied to the previous generation of frontier models would have uncovered the majority of security issues that Mythos did.

It's also worth noting that there are apparently 11 Curl CVEs in the current release cycle, where the new CVEs did not use Mythos, which seems to disprove the idea that Mythos was not all that effective on Curl because it was uniquely hardened or secure.

Should LLMs being good at vulnerability discovery and theorem proving be an update on LLMs eventually reaching AGI?

YMMV, but to me, the recent headline mathematics and cybersecurity achievements haven't really changed my view that AGI emerging from LLMs seems unlikely. From an outsider's perspective, most of the recent gains in model performance look to have come from RLVR on coding, math and cyber. While very effective at improving performance on those tasks, it seems that RL has largely failed to further generalize intelligence beyond the specific RL'd areas, and if you look at SimpleBench or the AI RP community, seems to have regressed performance in other areas of intelligence.

I think it's telling that all of the achievements of LLMs being held up over the past ~18 months (METR eval, CCC compiler, theorem proving, cybersecurity), while extremely powerful and a reason I'm bullish on the utility of LLMs, are all tasks that rely on an external oracle for verification, and where there's no penalty for failing during intermediate steps. I personally think it's quite likely that LLMs eventually become superhuman at proving theorems and exploiting vulnerabilities given sufficient compute, but still cannot manage a restaurant, write an interesting book or autonomously maintain a software project.
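To spell out what I mean by "external oracle, no penalty for intermediate failure", here is a deliberately dumb Python sketch. The task (finding a factor of 8051) and the names `search_with_oracle` and `is_nontrivial_factor` are my own stand-ins, not anything the labs actually run. The structure is what matters: propose, check against a cheap verifier, throw away every failure at zero cost, keep the first success.

```python
from typing import Callable, Iterable, Optional, Tuple


def search_with_oracle(
    candidates: Iterable[int],
    verify: Callable[[int], bool],
) -> Tuple[Optional[int], int]:
    """Try candidates until the verifier accepts one. Returns (answer, attempts)."""
    attempts = 0
    for candidate in candidates:
        attempts += 1
        if verify(candidate):   # the oracle: cheap, exact, external to the proposer
            return candidate, attempts
        # a wrong guess costs nothing beyond the compute spent on it
    return None, attempts


N = 8051  # 83 * 97, an arbitrary small semiprime for the example


def is_nontrivial_factor(c: int) -> bool:
    return 1 < c < N and N % c == 0


answer, attempts = search_with_oracle(range(2, N), is_nontrivial_factor)
print(answer, attempts)
```

Eighty-one wrong guesses before the right one, and none of them mattered. Theorem provers, test suites and exploit checkers give you the same deal. Running a restaurant does not: there is no `verify` to call, and the failed attempts are not free.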

You may be interested in Beren Millidge's take on Mythos (i.e. it's all RLVR):

https://www.beren.io/2026-04-11-Thoughts-On-Claude-Mythos/

The problem with all these demos is that the level of capital involved is well beyond what it would take to simply contract some world-class humans to do their thing and misattribute their work to AI.

Like Terence Tao’s enthusiasm for AI seems, uh, kinda synthetic tbh. I’m like 90% sure he’s contracted to use these fancy models and try to get them to do something cool, then post about his experience, with the tacit understanding that further contract money is dependent on him not saying "Well, that was interesting, but basically a waste of time. Back to doing what I was going to do anyway." If you really want to put on the tinfoil hat, he was placed in a precarious financial situation to help motivate him, which was definitely not coordinated or planned by the people who coordinate and plan everything. If there’s one thing the Trump administration would never do, it’s leverage state policy to manipulate markets.

I think that critique was reasonable even a month ago: most of the novel proofs discovered by LLMs could have been done by a modal grad student in the field, given time and motivation. Still useful, but picking off only mildly interesting results that haven't received much focus isn't world changing.

This particular (dis)proof, however, is quite different. It has received extensive attention. Research Problems in Discrete Geometry called it "possibly the best known (and simplest to explain) problem in combinatorial geometry." Surveys have been written on it. Erdos himself returned to it many times and tried your approach, offering a bounty for solutions.

If some billionaire had dedicated billions of dollars for a resolution of the conjecture, it seems quite possible that nothing would have come of it. Thomas Bloom in the companion remarks has some interesting speculations as to why it resisted human attempts for so long that are relevant, and the other remarks are interesting as well.

Still useful, but picking off only mildly interesting results that haven't received much focus isn't world changing.

It is. Quantity has a quality of its own. Even if LLMs peak below humans, a stupider brain that is inexhaustible, can work 24/7 and can be scaled to infinity means that a lot of intellectual things could be bruteforced in the million-monkeys-with-a-million-typewriters way. Throw them at a million small problems and there will be a breakthrough somewhere.
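The arithmetic behind that, as a quick Python sketch. The assumptions are mine and generous: attempts are independent and each has the same tiny chance of paying off, and the numbers are made up for illustration. Under those assumptions the chance of at least one success in n tries is 1 - (1 - p)^n.

```python
import math


def p_at_least_one(p_single: float, attempts: int) -> float:
    """P(at least one success) = 1 - (1 - p)^n, assuming independent attempts."""
    # log1p/expm1 keep precision when p_single is tiny.
    return -math.expm1(attempts * math.log1p(-p_single))


# Illustrative numbers only: a one-in-a-million shot per problem.
for attempts in (1_000, 1_000_000, 10_000_000):
    print(attempts, round(p_at_least_one(1e-6, attempts), 4))
```

At a million tries a one-in-a-million shot lands with probability about 1 - 1/e, roughly 63%, and at ten million it is close to certain. Real problems are not independent coin flips, so treat this as the optimistic upper end of the argument.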

Also check this https://modelrift.com/blog/openscad-llm-benchmark/

What was magic 6 months ago is boring and insufficient now. Also, a couple of AI uses from last week: decrypting a Pragmata save, finding and editing the values of the upgrade currencies, then rehashing and re-signing it; and fixing some Bluetooth issues, where it took half an hour but managed to pair the troublesome adapter and mouse combo.

Even if the technology stops dead in its tracks now - we will need at least five years until all the effects and possibilities are clear.

I'm not too surprised that a secure piece of software exists, or that it's only 6 MB zipped with more installations than there are humans on Earth and a 30-year history.

Why are you highlighting this anecdote so much?

a secure piece of software

I think this is an unlikely claim. curl helpfully lists past vulnerabilities. (Fun fact: they stopped awarding bounties for vulnerabilities when people began posting AI slop bug reports, wasting their time.)

I do not think that "curl does not have any more medium-or-high level exploits beyond CVE-2026-7009 and CVE-2026-7168, so even an ASI could not have found any" is true.

Don't get me wrong, I think curl is certainly in the rightmost percentiles of software security (alongside openbsd), and an interfacing library (i.e., tons of attack surface) with a whopping 176kLOC and only 188 CVEs so far (despite heavy auditing) is pretty amazing, even more as it is written in C. It is entirely possible that Mythos will turn less-audited codebases (e.g. closed source or more niche open source) into a blood bath.

But I still think Stenberg's (not entirely dismissive) take is a good one to update on. Much of the software industry is very much on the AI hype train, and for the AI companies hype seems to be the main product. I would not expect Microsoft to come forward and call Mythos not a big deal (unless they are hyping up ChatGPT, of course), for example.

It is entirely possible that Mythos will turn less-audited codebases (e.g. closed source or more niche open source) into a blood bath.

But so will almost any other capable model.

Yes, or a bright teenager with nothing better to do, for that matter.

(Though there are certainly orders of magnitude more people with an LLM subscription than people with the skills and diligence to find exploits the old-fashioned way.)

Yes, or a bright teenager with nothing better to do, for that matter.

The last time that was true was when ROP-s were in vogue.

Most of the incapable models too, from what I've seen of internal systems at client sites.

I think the expectations of AI believers and the hype pushed out by the company are so absurd that it's quite easy to be considered an "AI skeptic" even if you're relatively bullish on AI. Like even if I were to believe an AI god were to come soon, we're just not getting 10% growth year on year. Not happening. Regulatory barriers alone make it impossible, and then there are diffusion problems, and then there's the fact that we just can't build up enough energy to scale growth that fast even if there were 0 regulatory barriers.

I do think the fact that the real world results never match the measured increases in AI capabilities is kind of indicative of the problem here. It's very easy to train AI in kind environments with clear feedback loops, but wicked environment outcomes are all over the place.

Did you ever get around to trying my suggestion for setting up a code harness and predigesting your code base?

So I'm asking @self_made_human and others who seem more on-board with the AI hype train: does this report from a knowledgeable and experienced developer change your opinions on the future trajectory of AI at all?

I don't really have any strong opinions on what one dude has to say about a model I can't otherwise evaluate myself, but in your own article the guy you're apparently claiming is skewering AI and that should put us in shambles also said:

We also see a high volume of high quality security reports flooding in: security researchers now use AI extensively and effectively.

Like, I dunno, man. Do you not feel like the goal posts are shifting here? It's useful but one report from Mythos on one repo where the guy said he was disappointed that there weren't more bugs found because other AI tools had found more (which were already patched and thus not available for Mythos to find)? This is your justification for the whole of AI being slop and hype?

I guess I update slightly in favor of Mythos being closer to the current public SOTA rather than a league ahead of it. Perhaps the Curl codebase is just actually so tight that the whatever-IQ-equivalent-level security expert that Mythos represents wasn't about to find much, but I promise you that other projects are not so tight.

curl is one of the most fuzzed and audited C codebases in existence (OSS-Fuzz, Coverity, CodeQL, multiple paid audits). Finding anything in the hot paths (HTTP/1, TLS, URL parsing core) is unlikely.

Do you not feel like the goal posts are shifting here?

The original claim Anthropic made was that Mythos could do all of this independently. That it didn't need a highly experienced security researcher guiding it. In fact that the reason it was so dangerous is because any lay-person could use Mythos to "hack the planet". It's not goal post shifting to point out that no, Mythos is just a SOTA tool, and like most SOTA AI tools it works better with an experienced human guiding it on what to code, what to look for, how to design the system, etc. The AI ecosystem is very hype oriented; people claim far more than what is realistically delivered.

The amount of steering it needs is still totally speculative. There are a thousand reasons you'd run the queries yourself rather than give some rando model access. And @ChickenOverlord goes well beyond claiming the tools are moderately over hyped, he's argued extensively that they're basically useless for coding.

To be clear I am not an AI bear of @ChickenOverlord's persuasion, I am an AI bull, but very cynical on the marketing hype. I think his opinion that they are useless for coding does not align with my opinion. But I do think the original goalpost that Mythos is not some super intelligence, first step to the singularity, type model has not been moved. Mythos is a SOTA tool like other SOTA tools.

The amount of steering it needs is still totally speculative.

Absolutely, but reasoning by abduction points it to being worse than the marketing would suggest.

Sure, if you want to quibble over how far of an advancement mythos was then I think there are a variety of reasonable opinions. But this is very much part of chicken's "Ai bulls in shambles" series of posts and I can't help but point out that the people he's calling out as being in shambles here were basically right in previous iterations if you take the supposed doomsayer's opinion at face value.

Over the past few weeks we've had several serious vulnerabilities found in the Linux kernel (CopyFail, DirtyFrag, PinTheft), and LLM assistance has reduced the gap between "suspicious bugfix smells like it might patch a vulnerability", "someone other than the reporter/reportee has PoC and/or a working exploit", and "attackers are deploying it live in the wild" to nearly zero time.

Curl is an unusually disciplined project, and I think it is hard to generalize lessons from it.

So I'm asking @self_made_human and others who seem more on-board with the AI hype train

Choo choo!

So it only found 1 minor vulnerability in curl that hasn't been fixed before (including by these high level human programmers)... but it did find a bunch of other vulnerabilities in other software? It is indeed still markedly stronger than its predecessors?

So the future trajectory is just the same as the current trajectory, the lines on the chart go up and everything the lines correspond to in the real world also goes up, albeit in a messier way.

If you're an AI skeptic, then I recommend to simply short Nvidia, Coreweave, cloud providers, HBM manufacturers like Micron... What does it matter how random people on the internet think, compared to making money? I put my money where my mouth is and bought AI stocks and made lots of money. Let money flow to those who are right. If you think you know better than Google, Amazon, Microsoft, Facebook and everyone else pouring money into AI hand over fist, then don't just say so, position yourself to exploit your superior insights.

If you're an AI skeptic, then I recommend to simply short Nvidia, Coreweave, cloud providers, HBM manufacturers like Micron

Thinking that a loss-leading strategy is not going to pay off for the current AI ecosystem and AI skepticism are not the same thing, are they? You can think that the AI is very impressive and also that there's no way that Anthropic will ever climb out of its hole, or alternatively you can be the fiercest AI skeptic in the world and think that everyone will pay billions for a glorified chatbot.

I think they are currently incapable of designing and maintaining any significant projects that go beyond a basic bitch CRUD application or things of that sort. I'm also skeptical that there is all that much room for growth or improvement beyond their current capabilities

That's what he thinks. Surely he should just put his money where his mouth is? If Anthropic AIs cannot design or maintain any significant projects beyond a CRUD application and this isn't going to significantly change then presumably Anthropic is not worth near a trillion dollars and so the biggest industrial buildout in human history is a waste of money.

The premise that they're incapable of doing anything beyond CRUD and yet also they're completing long expert-level cyber infiltration exercises is bizarre and incoherent to me... but that's what he thinks.

  1. Anthropic isn’t yet public. You can’t easily directly bet against it.

  2. To short Nvidia would require the belief that the hyperscalers will abandon scaling. That’s really hard to time.

  3. Are you shorting a bunch of companies that would be killed by AI? P/Es for many suggest they are highly overvalued if AI can displace them in the next year or so.

  4. It isn’t obvious hyperscalers are making rational decisions. Apple isn’t hyperscaling (it is leasing). Doesn’t seem like a terrible idea…

Yeah but why aren't the hyperscalers abandoning scaling? Microsoft, Amazon, Google, Facebook made a deliberate choice to halt buybacks and spend hundreds of billions on AI. They made this choice based on something, they're spending $700 billion this year! You don't invest that much as a modern financialized American corporation without being sure about what you're doing.

He should be thinking that, if further significant improvements are impossible, then capex will plunge as soon as this is realized. But this isn't happening, we see continual improvements on a monthly basis.

Apple is more of a hardware company, they have a different business model to Microsoft and the others. AI is understandably not their great strength. They might reasonably calculate that they are not going to win a struggle with Google Deepmind on AI with regard to talent or compute or determination. AI is the lifeblood of Google, devices are the lifeblood of Apple.

You do realize sometimes corporations make the wrong choice? Also corporations choose what the market rewards them for. If they cut capex because they didn’t think there would be major improvements, their stock would plummet because it would mean all of their prior capex spending will never make an ROI.

The market doesn't necessarily reward companies for investing, it rewards stock buybacks (which were all the rage amongst big tech up until the AI boom).

If they wanted to juice their stocks, they'd just continue buybacks rather than buying GPUs.

It'd be surprising if these large, old, well-established software companies all catch AI fever at the same time. These are all survivors of the dotcom bubble, not fledgling newcomers with more credulous leadership.

This just isn’t true. The market generally prefers buybacks to dividends due to EPS, etc. However, the theory behind distributions (including buybacks) is that a shareholder can generate more return with the cash than the company can. If the company can generate a higher return with the cash, it would not distribute, and the shareholder would enjoy value creation via a higher stock price.

By forgoing buybacks and instead spending a bunch on capex, these companies are signaling they can make more on the cash compared to the general market conditions. This is certainly the story these companies are selling as well.

None of this means the companies are wrong. But right now they are being heavily rewarded for investing in AI. If they stopped and started doing buybacks, they’d almost certainly drop in value.

Finally, established companies fail literally all of the time.

Taking a long position like you are is not comparable to your suggestion of a short. Shorts have uncapped risks and a much more specific time horizon. "The market can remain irrational longer than you can remain solvent" is much more true for shorts. There were people who shorted tech stocks in the 2001 bubble and went bankrupt because they missed the crash by a few months. There were similar people during the housing crisis. That doesn't mean they were directionally wrong - it means that timing is hard.

Timing sure is hard. I managed to buy Micron at the top and so lost out there, it then recovered but it took a while.

Nevertheless, you can make money shorting if you're actually right. If you know things that others don't know, you can use this to your advantage. Don't blow your whole load in one year, keep some powder left for if the ponzi goes higher. There are ways to position yourself to profit from this, if the thesis is true.

It might be worth moving this over into the finance thread, but I am at least partially putting my money where my mouth is.

I'm of the opinion that the current LLM hype is starting to hit the second knee of the S-curve, both financially and technically.

Technically, exponential growth leads to exponential friction, and it looks to me like the real-world improvements in model capabilities are slowing down between generations. Anecdotally, it feels like the models are increasingly fungible and most of the ostensible improvements have come from harnesses, which are regular old software engineering. I think there's something there, but I think LLM tech represents a local maximum. I'm eagerly watching whatever Yann LeCun is cooking up at AMIL, because the general concept of a world model seems to map better to what we think of as "intelligence". His paper on energy models, specifically, is fascinating.

Financially, I think a lot about the "during a gold rush, sell shovels" aphorism. I also think about Buffett and Munger's rules of investing. Meta and Oracle are buying shovels, but using them to dig their own graves, so far as I can tell. I don't think Anthropic and OpenAI are ever going to be able to support their valuations, and per their S-1, xAI has already pretty much given up. If Google dies, it'll be for reasons other than AI spending. Nvidia has a good product and a good moat for now, but various specialized competitors are nipping at their heels while Chinese cards may develop into direct competitors.

In other words, I think the tech is going to continue developing, but I think a lot of the current players are in for a rude reminder of market fundamentals by 2H 2027 or so. I know you've bemoaned "financialization" in the past, but at the end of the day the economy is just people buying and selling things, and fuck me if it doesn't seem like some of these companies are trying to act like that's not true.

Where does that leave me? I'm moving down the stack. If the tech is going somewhere, it has to run on real things and interact with the real world eventually. Companies doing physical things are riskier to start than pure software, but they're less likely to get disrupted once they establish themselves. I've largely stopped investing in funds that hold significant amounts of Meta, Oracle, Tesla (because I think they may absorb SpaceX), and even Nvidia. On the other hand, I'm expanding my positions in funds that hold TSMC, ASML, and Lam Research. I am watching Cerebras, but I won't invest until I better understand how they're using software to get around defects on their enormous chips.

Fair enough, I guess that's a reasonable stance.

It's just that just today I see people online talking about Qwen 3.7 Max:

Over 35 continuous hours, Qwen3.7-Max executed 1,158 tool calls and 432 self-evaluations. It wrote, compiled, profile-tested, and repeatedly rewrote a production-grade SGLang Triton attention kernel. The resulting custom kernel achieved a 10x speedup over the official reference code. Engineers on forums noted that its ability to identify optimization bottlenecks after 30 hours of continuous operations represents "true industrial-grade autonomous engineering" rather than standard code completion.

Are they lying? Was the kernel made up? Maybe Alibaba is massaging the figures to some extent with the exact meaning of what a 10x speedup means in this context, dramatic speedups for just a few tasks being averaged out. Yet we know that other AI models can also do this kind of task, the general idea can't be just a lie. If it's not a lie, then surely this seems like a highly desirable, powerful technology that can substitute for high-end human talent to some extent. GPT5.5's verified mathematical conjectures seem hard to cheat. Kernels and mathematics seem to have real world value, as does whatever Anthropic's been doing with the war in Iran in terms of intelligence, rapid realtime assessment. Hard to get more real-world or frictional than warfare...

Are they lying? Was the kernel made up?

Cases like this, and the Erdos problems, are exactly where LLMs shine. Problems with clear and unambiguous reward functions that are difficult to hack are perfect use cases. In the Alibaba case, they likely have an extensive set of characterization tests that guarantee consistent behavior. An LLM with a good harness can pound its head against those tests forever while simultaneously measuring the performance as a success metric. It will never get tired and it won't get sick of doing that kind of work.
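To make the "pound its head against the tests while measuring performance" loop concrete, here is a minimal sketch. Everything in it is hypothetical: the `Kernel` type, the toy sum-of-squares functions, and the `CASES` list are stand-ins, and nothing here reflects how Alibaba or any real harness is built. The point is only the shape of the gate: a candidate must match the reference on every characterization case before its speed is even considered.

```python
import time
from typing import Callable, Iterable, Sequence

Kernel = Callable[[int], int]  # hypothetical stand-in for "the code being optimized"


def passes_characterization(reference: Kernel, candidate: Kernel, cases: Sequence[int]) -> bool:
    """Correctness gate: the candidate must reproduce the reference on every case."""
    return all(candidate(x) == reference(x) for x in cases)


def elapsed(fn: Kernel, cases: Sequence[int]) -> float:
    """Wall-clock seconds for one pass over the cases (noisy; fine for a sketch)."""
    start = time.perf_counter()
    for x in cases:
        fn(x)
    return time.perf_counter() - start


def pick_best(reference: Kernel, candidates: Iterable[Kernel], cases: Sequence[int]) -> Kernel:
    """Keep the fastest candidate that passes the gate; otherwise keep the reference."""
    best, best_time = reference, elapsed(reference, cases)
    for candidate in candidates:
        if not passes_characterization(reference, candidate, cases):
            continue  # fast but wrong is rejected outright
        t = elapsed(candidate, cases)
        if t < best_time:
            best, best_time = candidate, t
    return best


def slow_sum_of_squares(n: int) -> int:
    """Toy reference implementation: sum of i*i for i in 0..n-1."""
    return sum(i * i for i in range(n))


def closed_form(n: int) -> int:
    """Toy rewrite that should match the reference exactly."""
    return (n - 1) * n * (2 * n - 1) // 6


def always_zero(n: int) -> int:
    """Toy 'reward hack': very fast, very wrong."""
    return 0


CASES = [0, 1, 2, 3, 10, 500]
```

The design choice that matters is the ordering: correctness is a hard filter and performance is only a tiebreaker among candidates that pass, which is what makes this kind of reward function hard to hack.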

There's definitely value there, but I don't know how much value. The combination of technical depth and strong guardrails makes for a very schizophrenic kind of difficulty. Doing that kind of work is traditionally either the domain of a plucky junior with too much energy, or an insane wizard who claimed a broom closet as his office.

When we've experimented with that kind of optimization work at my employer, it tends to be very expensive, since most of the results come from the absolute tirelessness of the agent. In comparison, how much are you paying your junior? How much are you paying your wizard, and what is he doing if he's not doing that task? Security scans are a similar thing. Line audits aren't hard, but they're hella time consuming. As model costs rise (and they are rising per task completed when you compare any single vendor over time), it might legitimately be cheaper to throw interns at the problem than LLMs.

At least on the software side, I think there's a reasonable chance that what we're seeing is a temporary pop due to a lot of highly verifiable technical debt deadwood finally getting burned out, and that might not be a constant source of demand.

On the war side, I wish I knew more. The sensitive nature of the topic means that all parties are incentivized to obfuscate and dissemble as much as possible. It might legitimately be an ideal case. LLMs do well when you can accept 95% accuracy, and in something like intelligence analysis, 95% accuracy probably has the spooks all but shitting their pants.

Surely if you think AI is capping out then you should expect ASICs to be the play. QCOM and the like. I have a hedge in some of those in case scaling doesn't continue as I expect.

They're firmly on my "investigate" list.

I'm of the opinion that the current LLM hype is starting to hit the second knee of the S-curve, both financially and technically.

Astral Codex Ten on this exact topic

For the record, while I appreciate the name-drop, I've largely checked out of this debate. I read the article when it crossed HN, which I browse daily. The strongest critique of Mythos is that GPT 5.5 Pro reaches similar benchmarks while being cheaper and generally available. Which is to say: Mythos isn't quite as special as Anthropic would like, because a competing frontier model already demonstrates equivalent capabilities. See the problem there? Or, from my vantage point, the absence of one?

Why so checked out?

Not because I've recanted, and not because I've stopped believing my own forecasts. It's that anyone who hasn't gotten the memo by now is beyond my ability to help. I've been on this beat for years, sounding the alarm for about as long. Litigating whether each fresh data point lands above or below the trendline has stopped feeling like a useful expenditure of my evenings. I still have the arguments cocked and loaded, still bookmark whatever catches my eye, with roughly the clinical curiosity of an ICU physician watching creatinine and urea climb and eGFR slide in a patient with end-stage renal disease. Erdos problems falling like dominos and Terence Tao watching from the sidelines, Tim Gowers writing up breakthroughs from OpenAI's unreleased general-purpose models, METR's task-horizon metrics snapping like a mediocre school psychologist trying to score Einstein on the Stanford-Binet. (At some point the instrument stops measuring the subject and starts measuring its own inadequacy.)*

TL;DR: my supply of fucks is running thin. If you're pinging me hoping to extract an argument about AI capabilities, calibrate accordingly. I've got bigger fish to fry before I get thrown into the fryer myself. Good luck to whoever still has the energy for it.

*Go ask ChatGPT for citations and actual links.

METR's task-horizon metrics snapping like a mediocre school psychologist trying to score Einstein on the Stanford-Binet.

Einstein's IQ was probably about 140, so no. You don't understand what you're talking about here. Nice try though.

Sure buddy. The psychiatry resident who reads up on psychometrics for fun and is fully aware of the unreliability of standard IQ testing when we're going several sigmas away from the median of the distribution wouldn't have any idea about what he's talking about. Especially when his actual point is that trying to IQ test someone as far out of distribution as Einstein (famous for being the dimmest bulb in the shed)* is going to give unreliable results?

You wanna try telling me Feynman had an IQ of 125? I'll believe you, or at least humor you.

Aight. Gotta hand it to you. Never had a chance of winning this argument. I concede.

*He's famous for only emitting a singular photon.

Especially when his actual point is that trying to IQ test someone as far out of distribution as Einstein (famous for being the dimmest bulb in the shed)* is going to give unreliable results?

Einstein was not out of the distribution. He was very smart, but not in a reality-shattering way, and he was very focused on his craft, like all successful smart people.

The psychiatry resident who reads up on psychometrics for fun and is fully aware of the unreliability of standard IQ testing when we're going several sigmas away from the median of the distribution wouldn't have any idea about what he's talking about.

I have a PhD in statistics. With all due respect, I know what physicians study, and while many of you are great healthcare practitioners, you do not study the quantitative.

Einstein was not out of the distribution. He was very smart, but not in a reality-shattering way, and he was very focused on his craft, like all successful smart people.

Einstein. The gentleman responsible for Special Relativity, General Relativity, the photoelectric effect (which actually got him the Nobel), Brownian motion as proof of atoms, mass-energy equivalence, and Bose-Einstein statistics. "Very smart, but not in a reality-shattering way."

I'm going to need a minute to process that. Possibly three. Fortunately my psychiatry experience prepares me well; I can usually recover from being utterly flabbergasted in 5 seconds or bust.

If two complete overhauls of how humanity understands space, time, gravity, and matter don't clear your bar for "reality-shattering," I'd genuinely love to know what does. Should he have collapsed the lightcone via propagating false vacuum decay? Manually torn the curvature tensor out of the universe and presented it to Bohr in a jar? What are you on about? What are you smoking?

As for “Einstein was probably about 140”: probably according to what? A preserved Wechsler protocol from 1905? A Stanford-Binet administered by divine revelation? Some conversion table from “invented general relativity” to “moderately gifted but not too spooky”? I am genuinely curious how you got to “Einstein probably had IQ 140”. I presume you've heard of something called a ceiling effect?

"Focused on his craft, like all successful smart people" really makes me wonder which Einstein you mean. The one who played violin semi-seriously, wrote political and philosophical essays by the bushel, corresponded with Freud about the psychology of war, lobbied Roosevelt about the bomb, and turned down the presidency of Israel? Monomaniacal indeed. The phrase "successful smart people" is also wonderfully convenient as a construction, since any polymath counter-example presumably just gets retroactively reclassified as unsuccessful.

I have a PhD in statistics. With all due respect, I know what physicians study, and while many of you are great healthcare practitioners, you do not study the quantitative.

With whatever respect you're due, and without further comment on the magnitude of that debt: British psychiatrists are held to higher standards than that. I'm held to higher standards than that, mostly by myself. I know the difference between Cohen's d and Hedges' g. My interest in entering a d-measuring contest with you is, by consensus values, small. It is roughly equivalent to my interest in arguing with you about the psychometric validity of the other form of g.

Don't believe me? Here's the MRCPsych Paper B critical appraisal syllabus.

I gave it last week. The headache is still bad enough that I'm not going to dig through my own post history to surface the times I've gone several layers deep into statistics arguments on this site. You're welcome to spend your time doing so, I value mine.

Lumping me in with the median doctor who thinks p<0.05 gud? Nice try though.

I'm going to need a minute to process that. Possibly three. Fortunately my psychiatry experience prepares me well; I can usually recover from being utterly flabbergasted in 5 seconds or bust.

You appear to attribute all outward intellectual achievement to latent g. But the correlation between g and such achievement isn't 1. A really high estimate of that correlation would be 0.70, so if Einstein was a 4 SD physicist of his day, his expected g would be 2.8 SDs above the mean, which is an IQ of 142. 142 is extremely intelligent and a focused person at that level can achieve great things. It's been found that the correlation between intelligence and chess is as low as 0.24.
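The arithmetic in that comment is ordinary regression to the mean, and it can be sketched in a few lines. This assumes standardized, roughly bivariate-normal variables and the usual IQ scale (mean 100, SD 15); the 0.70 correlation and the "4 SD physicist" figure are the commenter's inputs, not established facts, and `expected_iq` is a hypothetical helper name.

```python
def expected_iq(achievement_sd: float, correlation: float,
                mean: float = 100.0, sd: float = 15.0) -> float:
    """Expected IQ given an achievement z-score and an assumed g-achievement correlation.

    Under a linear (bivariate normal) model, E[g | achievement] = r * z,
    so the expectation is pulled toward the mean by the factor r.
    """
    expected_g_sd = correlation * achievement_sd
    return mean + sd * expected_g_sd


# The commenter's Einstein example: r = 0.70, achievement at +4 SD.
einstein_estimate = expected_iq(4.0, 0.70)   # 0.70 * 4 = 2.8 SD above the mean

# Same formula with the quoted chess correlation of 0.24, for comparison.
chess_estimate = expected_iq(4.0, 0.24)      # 0.24 * 4 = 0.96 SD above the mean
```

Note that this gives an expected value, not a ceiling: the lower the correlation, the wider the spread of actual g around that expectation, so the calculation alone does not rule out a much higher (or lower) true score.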

"Focused on his craft, like all successful smart people" really makes me wonder which Einstein you mean. The one who played violin semi-seriously, wrote political and philosophical essays by the bushel, corresponded with Freud about the psychology of war, lobbied Roosevelt about the bomb, and turned down the presidency of Israel? Monomaniacal indeed.

He lived a long time and did physics as a profession, but yes he also had hobbies. This is typical of people in the 140s IQ.

With whatever respect you're due, and without further comment on the magnitude of that debt: British psychiatrists are held to higher standards than that. I'm held to higher standards than that, mostly by myself. I know the difference between Cohen's d and Hedges' g.

I'm glad you pay more attention to statistics than most psychiatrists. Maybe one day you can help replace the DSM with a quantitative approach. Have you read Eysenck on general psychoticism? It's in his book on genius.

My tips for you on intuiting why you probably over-identify people being over 150 IQ:

  1. high IQ is exponentially-increasingly rare because the bell curve is thin-tailed. Shrinkage is optimal; you would have less error if you shrink everyone's intelligence towards the median by 10% or so.

  2. Flashy outcomes are a result of intelligence as well as other factors like work ethic and fortune. Many people think a famous businessman, scientist, or writer has an outrageously high IQ, because they think their preferred intellectual status markers correlate at nearly 1 with intelligence. They don't.

  3. The CEF (conditional expectation function) of g on outcomes might not be linear. The IQ wealth CEF is probably concave for instance. Nonetheless, lots of people believe that Elon Musk must be a genius, but his real intelligence is probably top 1% or 2%. He's exogenously fortunate and an outlier in other ways, like commitment to business, and these explain his wealth jointly with his intelligence.
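Points 1 and 3 above lean on how quickly a thin-tailed normal distribution runs out of people. A rough sketch of that, assuming IQ is exactly normal with mean 100 and SD 15 (itself a contested assumption far out in the tail) and using hypothetical helper names:

```python
import math


def tail_probability(iq: float, mean: float = 100.0, sd: float = 15.0) -> float:
    """P(IQ > iq) under a normal model, via the complementary error function."""
    z = (iq - mean) / sd
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def one_in(iq: float) -> float:
    """Rarity expressed as 'one person in N' under the same normal model."""
    return 1.0 / tail_probability(iq)


def shrink(estimate: float, factor: float = 0.10, median: float = 100.0) -> float:
    """Pull an IQ guess toward the median by `factor` (the comment's 10% or so)."""
    return median + (1.0 - factor) * (estimate - median)


for threshold in (130, 145, 160):
    print(f"IQ > {threshold}: roughly 1 in {one_in(threshold):,.0f}")

print(shrink(150.0))  # a guessed 150 pulled 10% of the way back toward 100
```

Each extra 15 points multiplies the rarity by a growing factor rather than a constant one, which is the sense in which the tail is thin. If the true distribution had fatter tails, these figures would understate how many people sit at the extremes.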

Wasn't it a big thing in statistics, though, that a phenomenon is often nicely described by the normal distribution in the fat part but has many more outliers than the exponential shrinkage of the bell curve tail predicts, and so there are all sorts of modified distributions with wider tails? Is it well known that intelligence isn't one of those?

  1. Appearances can be deceptive. I do not think that g is some kind of stamp that gets impressed on your forehead and then dictates the rest of your life without further environmental modification. If you wish to argue that certain factors/metrics only explain a limited fraction of the observed variance in outcomes, you should have led with being more charitable and assuming that I know what I'm talking about. Doctors are not known, as a class, to be particularly stupid. Your assumption was and does remain incorrect.
  2. I appreciate the more substantial attempt at engaging with my arguments, and while I have genuine disagreements, I wasn't kidding about the headache. That strikes me as a remarkable approach to estimating Einstein's IQ, going from what it might have been, in theory, to what it might have been on a hypothetical IQ test he never gave. I recall that there are plenty of confounders for the chess and IQ stuff, probably Berkson's paradox, but I do not have the time to check. The rest of your arguments are tangential to any point I came in with the intention of litigating. I told you so. Now, if you had led with these points, any semblance of rigor, or at least charity, I would engage more productively. Right now, I simply can't even if I want to.

If we define g factor as smartness then I can understand @dailydogma's argument. His point basically seems to be that whatever caused Einstein's success in physics was not his g factor. There are a lot of people who are so talented in a skill that the second place in the world doesn't come close to the first place in ability. For them g factor may not explain the variance in ability, but some other (maybe inborn and genetic) trait may explain that variance better, a trait which does not generalize well to other domains.

Nigel Richards is incredibly dominant in his field; he is the French Scrabble champion. He is better than any computer at the game and he doesn't understand French at all. He only learned it for the game, in just nine weeks. This is not an ability any other competitor shows. [https://youtube.com/watch?v=T-8NrvVqbT4] If you try to explain his success with g you would end up with a very simple question:

"If his ability is so much greater than his opponents' and this is primarily explained by his g, then his g must also be far, far greater than his opponents'. Why is he not an equally prominent physicist or the like? G factor generalizes; he should be better at everything." An explanation for this can be that he has some trait which doesn't generalize well but helps with his specific domain, so he doesn't have a very high IQ (or g or intelligence) but he is still very good in his field.

This can also lead to a very smart, but not world-shattering-level smart, Einstein who nonetheless can make the discoveries he did. Just look at the latest models from OpenAI and how jagged they are: they can do Erdos problems, but we are not at AGI, are we?

Maybe Von Neumann was consistently a genius in many different fields and hence was smarter than Einstein.

If at the very top something other than g factors starts explaining most of the variance then you can easily have 130 IQ Feynman.

Edit: Math is very g loaded but once you filter for people with high IQ (120+ or 130+) the correlation would drop, and you would probably end up with very jagged talent rather than a generalized talent which is used in g. After filtering at 145 IQ threshold different cognitive abilities merely have 0.1 correlation https://doi.org/10.1016/j.intell.2017.07.004

I'm going to need a minute to process that.

I heard you can do that in just 5 seconds if you just run really really fast. Like, really fast.

And shrink when viewed by an external observer? I'm a grower, not a shower.

Can you point to some of your forecasts? I know that you are bullish on the tech - but my vague impression was that they survived a sanity check - aka were not like the wet nightmares of AI2027

My current timelines (stable for the last year or two) are 50% odds of AGI by 2030, 70% by 2033.

My operational definition of AGI is "can do ~everything a human can with a computer as well or better than the median human", ideally a 130 IQ human. That focuses on real world tasks, and also considers speed and reliability. I consider ASI achieved when the models reliably beat the smartest humans alive at similar or lower figures for $/unit of cognitive output.

In other words, if you attach my version of AGI to a computer with access to the internet it can do anything a human could with the same affordances, about as well. Probably with a video feed and a virtual keyboard or mouse, but that's not a big deal. Current models are too spiky in terms of capabilities to count, particularly when it comes to agentic workflows like simply using vision and direct input to get tasks done. I can't solve an Erdos problem even if you give me 5 years to prepare, but I can do more with my desktop PC than Claude can, at least much faster.

I expect that the temporal delta from that version of AGI to true ASI is going to be rather short. Maybe a year or two, medium confidence guess.

So pretty mild stuff, although I do find your definition of AGI/ASI somewhat Texas sharpshooter style. On the other hand no one seems to be able to define those things in a way that is better, so there's that.

It's interesting times when I'm told that my forecasting of a 50% chance of AI becoming human-parity or better in 4 years is described as a tame take. Not complaining, just observing things with grim resignation. I'll know AGI is here when I see it, or a few years later, if unemployed.

I wish I were wrong, and that I had been wrong so far. It's no fun engaging in arguments where you want your opponents to win.

The whole of civilization and industrialization has been underpinned by using energy to achieve superhuman feats. Most of the worries so far about AI are a thin veil for some people's fear that their greatest labor asset will lose value rapidly. And probably some narcissistic injury if we are using LP definitions.

If I didn't get worried that we created cars that could run 20 times faster than a human 24/7 or that we have trivialized the magic of flying to the point that it is utterly trivial, boring and so on, why should I worry that some machine will think better than me. Hell - I should be worried if we don't invent such a machine. Whatever you can think of - we are running out of it - out of soil, out of biomass, out of oil. We need faster science progression to get out of the trap that is our lovely blue planet.

And your predictions are tame because they fit linear advancement at current rates. And we will probably get there even on log. Your definition of AGI is modest.

I think that Mythos, like everything else from Anthropic, is crap mixed with bullshit, but two things of note here - curl is absolutely tiny and is one of the most heavily scrutinized pieces of software ever written. The other is probably PuTTY. And the low-hanging fruit was already picked.

This is a big red flag for me, because if Mythos does actually generate a lot of noise/false positives, it would make sense that Anthropic would want to hide that by running it themselves as many times as they could until it actually generated some real, actionable results.

That can still be rather concerning when released to the public. If you have a d10,000 and only roll once, your chance of getting a nat 10k (high severity vulnerability) is really low. If you can roll it a million times, well now you've suddenly made the news.
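The dice analogy is just the "at least one success in n independent tries" formula, 1 - (1 - p)^n. A minimal sketch, assuming each run is an independent roll with the same 1-in-10,000 chance (real scans are neither independent nor identically distributed, so this is only the idealized version of the analogy; `p_at_least_one` is a hypothetical helper name):

```python
def p_at_least_one(p_single: float, rolls: int) -> float:
    """Probability of at least one success in `rolls` independent tries."""
    return 1.0 - (1.0 - p_single) ** rolls


P_NAT_10K = 1 / 10_000  # chance of the top face on a single d10,000 roll

for rolls in (1, 10_000, 1_000_000):
    print(f"{rolls:>9,} rolls: {p_at_least_one(P_NAT_10K, rolls):.6f}")
```

With one roll the chance is 0.01%; at ten thousand rolls it is already better than even, and at a million rolls a hit is a near certainty, which is why the number of attempts matters as much as the per-attempt hit rate.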

I'm not in cybersecurity at all but I wouldn't be super surprised if at the end of the day AI ends up being much better optimized for finding vulnerabilities than writing and maintaining a large codebase for precisely this reason.