Culture War Roundup for the week of April 13, 2026

This weekly roundup thread is intended for all culture war posts. 'Culture war' is vaguely defined, but it basically means controversial issues that fall along set tribal lines. Arguments over culture war issues generate a lot of heat and little light, and few deeply entrenched people ever change their minds. This thread is for voicing opinions and analyzing the state of the discussion while trying to optimize for light over heat.

Optimistically, we think that engaging with people you disagree with is worth your time, and so is being nice! Pessimistically, there are many dynamics that can lead discussions on Culture War topics to become unproductive. There's a human tendency to divide along tribal lines, praising your ingroup and vilifying your outgroup - and if you think you find it easy to criticize your ingroup, then it may be that your outgroup is not who you think it is. Extremists with opposing positions can feed off each other, highlighting each other's worst points to justify their own angry rhetoric, which becomes in turn a new example of bad behavior for the other side to highlight.

We would like to avoid these negative dynamics. Accordingly, we ask that you do not use this thread for waging the Culture War. Examples of waging the Culture War:

  • Shaming.

  • Attempting to 'build consensus' or enforce ideological conformity.

  • Making sweeping generalizations to vilify a group you dislike.

  • Recruiting for a cause.

  • Posting links that could be summarized as 'Boo outgroup!' Basically, if your content is 'Can you believe what Those People did this week?' then you should either refrain from posting, or do some very patient work to contextualize and/or steel-man the relevant viewpoint.

In general, you should argue to understand, not to win. This thread is not territory to be claimed by one group or another; indeed, the aim is to have many different viewpoints represented here. Thus, we also ask that you follow some guidelines:

  • Speak plainly. Avoid sarcasm and mockery. When disagreeing with someone, state your objections explicitly.

  • Be as precise and charitable as you can. Don't paraphrase unflatteringly.

  • Don't imply that someone said something they did not say, even if you think it follows from what they said.

  • Write like everyone is reading and you want them to be included in the discussion.

On an ad hoc basis, the mods will try to compile a list of the best posts/comments from the previous week, posted in Quality Contribution threads and archived at /r/TheThread. You may nominate a comment for this list by clicking on 'report' at the bottom of the post and typing 'Actually a quality contribution' as the report reason.

Another indicator that AI is a bubble. Anthropic just released Claude Opus 4.7, and users are reporting significantly higher token burn rates (and therefore costs) for what appears to be a minor improvement over Opus 4.6. Discussion on Orange Reddit is here: https://news.ycombinator.com/item?id=47816960 and a tracker of the increased token burn rate is here: https://tokens.billchambers.me/leaderboard

The token tracker is based on user reporting, but the reported increase in token burn has been fluctuating between 37% and 45%.

Even if AGI is actually possible with LLMs (or at all, but I'm not trying to start a discussion on metaphysics here), it looks like the capital needed to achieve it is drying up before it can be reached. Anthropic's move here (combined with them handicapping Opus 4.6 a few weeks ago) seems to clearly be an attempt to achieve profitability. The free/subsidized rate train for end users has pulled into the station, and now you have to pay more for the same (or worse) capabilities you were enjoying before.

I normally don't care much for the median Hacker News commenter (if me calling it Orange Reddit didn't already give that away), but I do find them to be a useful barometer for general sentiment in the tech industry. And a few months ago I would have said roughly 60% of HN users were AI believers/enthusiasts, 20% neutral or unsure, and 20% anti/negative. Anthropic's antics over the last few months (and Sam Altman's antics for his entire life) seem to have soured their views significantly, and I see this as a big sign of a sea change in sentiment about AI in the tech industry.

At least for me personally, I just hope this leads to less retarded mandates from my higher-ups about using AI X times a month etc. (we're literally tracked on usage and it can affect our raises/bonuses).

For everyone here, but perhaps especially the AGI believers, have your feelings changed at all over the last few months?

I don't see how this is a bubble. The best counterargument to AI skeptics, for a broad range of critiques including this one, is the following: this is the worst AI will ever be, as symbolized by the famous evolution of the Will Smith eating spaghetti videos. The original 2023 quality is something you can do on a consumer-grade computer with a standard graphics card for a couple of hundred bucks. Even the 2024 result you can replicate by renting hardware for maybe $20k.

The point being that cost is still decreasing by a factor of 100 every year. It may be questionable whether the frontier models will ever pay off given how quickly the field progresses, but it does not matter. If all of these companies blow up, the underlying technology will still be there, in the same way Google replaced Yahoo or Amazon was the first to really scale the commercial potential of the internet well into the internet era.

The bubble is not going anywhere; the value in terms of the models is real and tangible, and it is here to stay. Or to be precise, there is no real bubble: we are not talking about worthless tulip bulbs or NFT images. The models are real and useful. If anything it would be akin to a processor "bubble", where miniaturization and chip breakthroughs resulted in Moore's law, which basically held for decades. Yes, there are mass graves of companies that failed to keep pace, including titans like GlobalFoundries or Panasonic, where individual investors bet on the wrong horse and lost everything. It does not matter: chips are very cheap, very powerful and very useful. The market itself is huge and important and if you hedged your bets correctly, you would be very wealthy.

the market itself is huge and important and if you hedged your bets correctly, you would be very wealthy.

This reminds me of a question I was kicking around with some friends. Suppose that around the time of the Internet Bubble, you had invested in a basket of all the hot tech stocks: Netscape, Amazon, etoys, PimentoLoaf.com, etc. How would you be doing 25 years later?

Depends on your timing. Do you buy in 1997-1999 or at the absolute peak in 2000?

How many companies do you pick out? Based on what metrics?

When it all crashes, do you cut your losses at a set point, or do you buy more? Assuming you got kicked out of them all, do you buy back into the most promising ones during the recovery?

But off the cuff I'd say that holding and never selling Amazon would make up for complete losses in several other stocks.

Depends on your timing. Do you buy in 1997-1999 or at the absolute peak in 2000?

How many companies do you pick out? Based on what metrics?

I agree those are good questions. For the sake of discussion, I would say

(1) you buy at the time when there is a lot of public discussion about a "bubble." So roughly 1998 or 1999.

(2) you buy a big basket of tech stocks. I'm tempted to just pick the Vanguard VITAX fund, but apparently that was not established until 2004.

When it all crashes, do you cut your losses at a set point, or do you buy more? Assuming you got kicked out of them all, do you buy back into the most promising ones during the recovery?

I would say you buy and hold.

But off the cuff I'd say that holding and never selling Amazon would make up for complete losses in several other stocks.

Well that's kind of the point. I believe that there is an AI stock market bubble but I have still invested roughly 10% of my savings in tech. Roughly speaking in the sort of stocks which make up Vanguard's VITAX fund. And I think I have a pretty good chance of coming out ahead.

Buying a big basket of more or less randomly chosen dotcom stocks would have destroyed you. If by big basket you mean 50-100 or more. The majority of those companies lost over 90% of their value and many never recovered at all. If Amazon was only 1% of your holding it might not have the weight to make up for the dozens of utter failures.
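To put a rough number on the "1% of your holding" point, here is a minimal sketch. It assumes an equal-weight basket with a single winner, and that every other stock ends at the same fraction of its purchase price. The function name and the figures are illustrative, not historical data.

```python
def breakeven_multiple(n_stocks: int, loser_recovery: float) -> float:
    """Multiple the single winner must return for the basket to break even.

    Assumes an equal-weight basket with one winner and (n_stocks - 1)
    losers that each end at `loser_recovery` times their purchase price.
    """
    weight = 1.0 / n_stocks
    losers_value = (n_stocks - 1) * weight * loser_recovery
    return (1.0 - losers_value) / weight


# 100 stocks, 99 of them down 90%: the winner has to return about 90x.
print(round(breakeven_multiple(100, 0.10), 1))  # 90.1
# 10 stocks, 9 of them down 90%: the winner has to return about 9x.
print(round(breakeven_multiple(10, 0.10), 1))   # 9.1
```

Under those assumptions the required multiple grows roughly in line with the basket size, which is why a small position in one big winner may not rescue a very wide basket.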

Buying a top, current tech fund is different, they're somewhat competently curated.

Buying a big basket of more or less randomly chosen dotcom stocks would have destroyed you. If by big basket you mean 50-100 or more. The majority of those companies lost over 90% of their value and many never recovered at all. If Amazon was only 1% of your holding it might not have the weight to make up for the dozens of utter failures.

Thanks for posting this, it's probably something I should have researched before buying a big basket of random tech stocks in 2024-2025.

Famously, Cisco only just barely got back to its Dotcom bubble era peak (and maybe it still hasn't yet if you account for inflation?).

I’m not convinced it’s a bubble. It might be, but gauging that from random commentary on HN isn’t a good way to figure it out. There are all kinds of reasons that sentiment might be going south, a lot of it being that people are expecting it to come much faster than it actually can. Early LLMs fed this, in my view, because at the start minor changes were big improvements. Going from an AI that could barely understand a simple question to one that can write an essay on a topic was quick, maybe 3-4 releases. If it takes 6-10 to get AI to give you a publication-worthy book on the topic of the query, I don’t think that’s a problem for AI, which will eventually get there, though it probably means a much harder time getting funding to work on the next projects.

I'm firmly convinced it'll pop late 2026 (this year) or 2027. Could be wrong, could entirely be on point.

Suppose the current state of the industry sustains itself at equilibrium. I still think when you factor in all the costs AI entails, it can't license the claim that it's good for almost anything. AI makes so many mistakes that it actually reduces productivity because for every mistake it makes, it costs even more in time and resources to go back and fix; which is often greater than the associated costs of just doing it yourself. Humans are more productive than AI (incidentally this was proved by an analysis that was meant to refute that claim).

With LLMs, the error rate is always going to be the same no matter how much data they get or at what scale. If you want AGI, you have to abandon LLMs because it's a straight-up dead-end technology. Its use cases are small, narrow and mostly consist of merely baseline automation of tasks (hence, it's just a fancy autocomplete). They're unreliable and can be exploited. They don't think. They don't comprehend what they're doing. In fact, they're actually stupid. And worst of all, it can't be fixed. It just doesn't help things. Like, at all. Everyone is always saying forthcoming iterations will eventually solve all these issues but really, they won't. And there's no evidence of that.

The notion that AI is going to cut the labor market down is also false, due to a basic rule in economics that's been understood since Keynes' heyday: if you double the productivity of your workers, the 'general' tendency isn't to fire half of your staff, it's to sell twice as much stuff. The fact that a lot of AI is also being sold way below cost just to get market share is an indication that it may not be cheaper even if and when it turns out to work. It isn't sustainable.

Shit's fucked up and it's going to be bad.

I’m not sure. Again, the entire field is in its infancy. You’re probably right that LLMs are not by themselves going to be AGI. But a system made of multiple subsystems run by an agent might be able to go farther in that direction than just an LLM with an agent.

A lot of those sources were written last summer, or last fall (in which case they'd likely be building on older observations). Anecdata: my company encouraged use of LLMs then. I found them totally useless in our not so easy codebase, shelved the thing and went on the manual way. At the time I'd probably have agreed with the vibe of your post. Then, reading some hype about Gemini 3 in the winter, I gave it another shot; models turned out to have got over some hump; and now they look like genuinely useful productivity tools.

I can believe LLMs will have a way harder time cracking law or medicine or mechanical engineering or whatever, but with coding you can come up with endless tasks that are sort of real-world difficult that you can beat the model against on giant server farms with zero interaction with the real world, the same formula that worked for AlphaGo, so it stands to reason that they'd git gud there faster.

(incidentally this was proved by an analysis that was meant to refute that claim)

An entirely AI slop analysis... proves nothing in my eyes

I’m not convinced it’s a bubble

My current layman's opinion is that the current environment is a bubble, but that bubble is entirely independent of the technology itself.

It's clear that at least some people, in some circumstances, are getting value out of the technology. It's not like NFTs, where even the best use cases are better served by simpler, pre-existing tech.

That said, the current economic environment is baffling to me. Every big provider is acting like this is a zero sum game where one company winning will give them a monopoly forever. They're also acting like the progress curve will produce exponentially increasing capabilities forever while operating costs approach zero.

I'm not sure if the market as it stands can achieve profitability that justifies the current AI company valuations if there are 3-4 winners instead of one. They're all priced with the assumption that one of them will utterly own the most transformative technology since the steam engine. If that's not true, people are going to start asking why they're not getting a 10% return on a company that has a 20x P/S ratio. Once people start asking that question, it's going to get uncomfortable for anybody that's not a monopoly already.

They're taking on significant debt, too. Take Meta, for example. If just one of their data centers has a twelve month delay, that's a ~3% hit to free cash flow to service debt on an asset that isn't making any money. When was the last time that you saw a construction project more complex than a doghouse finish on time and on budget? Even if they finish construction, there are significant delays getting them powered, and gas turbines aren't a permanent solution. There's pretty enormous systemic risk there. Some companies are better equipped to handle it than others, but none of them are immune. Oracle, in particular, appears to be laundering questionable debt through their investment grade credit rating, which is unlikely to end well for them.

That said, even if Anthropic and OpenAI shit the bed and contagion through the bond market causes a market crash, and Google puts their research back on the shelf, LLMs don't go away. Local models exist. China is still plugging along with much more reasonable objectives.

I don't know exactly what the future holds, but either way, it'll have LLMs in it.

Phenomenal take. I largely agree, although things look very different depending on where capabilities stall out.

I just listened to an uncharacteristically poor quality (maybe just Gell-Mann) Odd Lots episode where the economist said essentially he didn't think it would be revolutionary but would add 2-3% to growth.

Having our economies double in growth would be insane, I can't wait

Did he mean it as "increase growth from 2% to 4%", or "increase growth from 2% to 2.04%"?

I think his verbatim quote was something like "it will add 2-3% growth"

I doubt he meant 2% * 1.02 = 2.04% as that's incredibly small and he was otherwise rather bullish, but maybe he did
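The two readings being compared here can be written out as a quick sketch. The 2% baseline is the figure assumed in the comments above, not something taken from the episode, and the function names are just illustrative.

```python
def additive(baseline_pct: float, added_points: float) -> float:
    """Read 'adds 2%' as percentage points on top of the baseline growth rate."""
    return baseline_pct + added_points


def multiplicative(baseline_pct: float, relative_pct: float) -> float:
    """Read 'adds 2%' as a relative increase of the baseline growth rate itself."""
    return baseline_pct * (1 + relative_pct / 100)


baseline = 2.0  # assumed baseline growth of 2% per year
print(round(additive(baseline, 2.0), 2))        # 4.0
print(round(multiplicative(baseline, 2.0), 2))  # 2.04
```

The first reading doubles the growth rate; the second barely moves it.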

My current layman's opinion is that the current environment is a bubble, but that bubble is entirely independent of the technology itself.

As another example of that, consider the dot-com bubble: the internet didn't go away when the companies failed.

The comparisons to the dot com and railroad bubbles concern me sometimes.

A railroad line can last centuries if properly maintained. Fiber has a 20 - 50 year lifespan. They were both totally usable by the time everyone finally got over the mania. I'm not sure the same is going to be true about GPUs. The data center physical structures will exist, and maybe the power infrastructure, but even the (IMHO optimistic) projections on GPUs show a 6 year depreciation schedule.

6 year depreciation schedule.

Is that because they break down, or because four cycles of Moore's Law means that the newer ones are 16x as powerful? I know that consumer-grade GPUs running in consumer settings with consumer duty cycles last for more than six years, but I don't know how well professional grade ones in a server farm running 100% of the time last.

If we stall at the current capabilities, that's one thing. If we go back down to 2020-ish levels of compute availability, that's something else.

If we use Bitcoin mining as a reference, the cards tended to crap out after about 3 years because of blown capacitors.

Maybe things have improved?

No, but most of the companies that went bust weren't ISPs but ancillary companies that had nothing to do with the Internet itself. Telecom definitely took a hit, but that was due to optimistic demand projections that led to infrastructure build-out that wasn't needed, not because they weren't charging customers enough. The current situation is like if they did what they did while offering everyone free access while undercharging people for faster connections. In any event, that build-out was based largely on what the technology could already do, not what it theoretically might be able to do in the future. The money also wasn't nearly as much. The current situation is like if the ISPs were spending ten times as much money and were all unprofitable, and traditional telecom companies providing the same service were all losing money on it. In that case it's likely that Internet service would become hard to come by and expensive after the crash and it would have delayed the technology's adoption.

For everyone here, but perhaps especially the AGI believers, have your feelings changed at all over the last few months?

Not really. This seems broadly in line with what I expected: getting genuine human-level intelligence that can apply to general, rather than specialised, tasks is tough and getting to super-intelligence is tougher still. I am not going to say they'll never get there, but right now they've hit the limits until the next breakthrough (which could be something totally different from what they've tried up to now).

More generally, yeah they need to start serving up some steak with that sizzle. Now they need to make money, so having got people/businesses hooked on AI for doing general tasks, it's the time to start slapping prices on this. You want to keep chatting with your AI therapist/boy or girlfriend? That's going to be a subscription rate every month from now on. You can't imagine your work life without AI to write your emails or vibe code for you? Then your employer needs to fork out for a licence, bunky.

There's no such thing as a free lunch. No more "even better model for even cheaper coming next week!" and if you're already using AI, then they'll try to lock you into whatever model that may be plus SaaS rates on top of that.

Even if AGI is actually possible with LLMs

I'm pretty convinced it isn't, based on a thought experiment I read about.

The argument goes basically like this:

Suppose you take the latest and greatest LLM and use it to generate a huge corpus of text and use that text to train a new LLM. And then repeat the process a number of times. Intuitively, it seems unlikely that the result will be any better than what you started with. And apparently both experiments and mathematics indicate that what happens is "model collapse," i.e. with each iteration the new model performs worse. Because you always lose a little with each iteration. Assuming that's all true, it follows that LLMs must be missing some essential attribute possessed by human brains. Because we apparently picked ourselves up by our bootstraps and created from scratch all the text which is used to create LLMs.

Anyway, it's just an argument I read and found to be persuasive. Feel free to correct me.

Another indicator that AI is a bubble

To me it's pretty obvious that AI is wildly over-hyped. But even so, the progress which has been made in the field is nothing short of astounding.

it looks like the capital needed to achieve it is drying up before it can be reached.

If nothing else, it seems virtually certain to me that governments have realized the strategic implications of AI. Even without any private investment at all, the United States, China, and various other countries can throw quite a lot of resources at the problem.

For everyone here, but perhaps especially the AGI believers, have your feelings changed at all over the last few months?

Not really, I'm still pretty confident that (1) within the next 10 years or so, we (humanity) will get to AGI; and (2) regardless, there will be huge changes to the world economy.

And apparently both experiments and mathematics indicate that what happens is "model collapse," i.e. with each iteration the new model performs worse.

Model collapse is not really a major concern. The original researchers in that paper trained small models on only AI outputs (of the previous model). Them being small models, they made mistakes and the mistakes compounded over time. It's more like a Chinese whispers experiment.
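A toy version of that Chinese whispers setup, as a minimal sketch: the "model" here is just a Gaussian fitted to a handful of samples, and each generation is fitted only to samples drawn from the previous one. This is a stand-in chosen for illustration, not the setup from the paper, and the sample size, generation count and seed are arbitrary assumptions.

```python
import random
import statistics


def run_generations(n_samples: int, generations: int, seed: int = 0) -> list[float]:
    """Repeatedly refit a Gaussian to samples drawn from the previous fit.

    Returns the fitted standard deviation after each generation.
    """
    rng = random.Random(seed)
    mu, sigma = 0.0, 1.0  # generation 0 stands in for the "real" data
    history = [sigma]
    for _ in range(generations):
        # Train the next "model" only on the previous model's outputs.
        samples = [rng.gauss(mu, sigma) for _ in range(n_samples)]
        mu = statistics.fmean(samples)
        sigma = statistics.pstdev(samples)  # maximum-likelihood estimate
        history.append(sigma)
    return history


history = run_generations(n_samples=10, generations=300)
print(f"start sigma = {history[0]:.3f}, final sigma = {history[-1]:.3e}")
```

With small samples and no outside signal, the estimation error compounds and the fitted spread drifts towards zero. Nothing in this loop filters or grades the samples.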

Big companies make great use of synthetic data and autonomous training, in addition to human-originated data. For example, consider DeepSeek R1-Zero, which was trained just with reinforcement learning on verified signals, not on human reasoning patterns. It was kind of weird and switched languages a lot, but it did work and got smarter over the course of training. In fact, all modern models are trained in this way. When Claude occasionally slips into Chinese for a single word it's not because any human ever does that in the training corpus, it's because during the training process they have them autonomously bootstrap and get smarter over time and that's just how it goes. AIs are omnilingual by nature it seems.

Model collapse is not really a major concern.

If you say so, I have no reason to doubt you. But what does that say about the thought experiment I proposed? Are you saying that potentially the 1000th model could be significantly better than the first?

Yes, I think so, provided you were doing the training in a sophisticated way rather than solely training on the outputs of previous models without grading for quality or accuracy. You could get AIs to review the data for example for any errors or issues or have them work out a testing suite to check if the data is right. Data quality is very important, that and the right RL techniques are basically the two key things you need most to get right.

Microsoft Phi trains just on synthetic data and is very cost-efficient, that was its primary goal, making a good very small AI that can run on most PCs. But they curated the data a fair bit to make sure it was good.

In principle I think you could do the same for big first rate AIs too. It's just that it wouldn't be efficient to leave out human data and human curation (it's there, why not use it, the competition will) and you want something humans enjoy working with and not a schizo-sounding model. It'd be like o3 at its most alien but more so:

https://arxiv.org/html/2510.27338v1

they soared parted illusions overshadow marinade illusions overshadow marinade illusions overshadow marinade illusions

Number of relevant organic products depends on whether both of!mena get.demoteudes someone and gem jer eats SAND the protonation-bids, leading possibly to three product calculation

Like wtf does that mean? Who knows? This is an artifact from inhuman RL processes. The inhuman RL processes work, that's why they're used.

I'm also convinced that LLMs aren't the path to AGI. They were grossly overrated from the very beginning. If you want real AI, you have to dump the snake oil. It's already empirically well adduced (1, 2) that LLMs can't get you there.

There are a 'lot' of things you need before you've actually got an intelligent machine that can think. It begins with constructing mental models. From there, you navigate those models in the imagination or perceptual space laid out to work out answers to questions and work out alternatives. You build those spaces (and in this case I mean building creative and entirely novel ones) and navigate them to accelerate anticipatory learning. Cats, for example, actually "learn" how to hunt by doing this. Once you've learned how to model spaces you can then move to modeling "systems"; and that's when you get to the point where it becomes possible to give AGI a theory of 'other' minds. And a "mind" in purely mechanistic and computational terms is simply another causal system, just like "spaces" and "systems" are particular causal systems.

Notice that's exactly what an LLM 'doesn't' do. If you take a look at Waymo's World Model, for instance, this is more along the lines of the approach you need. When you're continuously inventing new models of imaginary environments, you begin to build the skillsets that slowly become applicable to the real world. When it can do that, that's more along the lines of where scaling becomes effective. Nothing like Sam Altman's idea of where it's relevant. When you've got to that stage, AI can then begin to model its own causal system to think about its own thinking, such that it's capable of asking itself when it's wrong, or how to stack a particular sequence of events to achieve a desired end result.

Incidentally this is the exact pathway natural selection determined for human beings and I'm thoroughly convinced it's the 'only' way to get AGI. This is what human beings fundamentally are on the naturalist paradigm: models and model builders that also navigate and move about in those models. There's zero evidence that I've seen to indicate that the money flowing into AI at the present moment is traveling down that research pathway and make no mistake, eventually the supply of it is going to run out. A 'lot' of rich people are stupid, so it's doubtful it'll ever go to the real thing. They'll throw it all at the next bullshit snake oil, get their fiscal bailout and blame it on immigrants or something. Presently I'm not left feeling very optimistic about the current state of the industry. It's incredibly destructive to the environment, it dumbs down human intelligence, and it hasn't even been proven to work. Why is the “world” so excited?

What made me even more dubious about the entire grand project than I had already been was the news that now they were generating their own data to train models on. We've scraped every single bit of text produced by humans in all of history to date (ahem ahem I take my leave to doubt that, what you mean is 'we've scraped all the available English language text online') and now we need even more to feed the gaping maw of Behemoth, so now we have to invent our own synthetic text generated by AI.

Do they not remember "garbage in, garbage out" or, indeed, Flanderization? Generating your own synthetic data off synthetic data and using that to create more synthetic data and synthetic demographics is getting further and further away from reality, then some poor fool uses the conclusions your AI served up so prettily to make real world decisions and it turns out that in fact 15-24 year old mixed race lower middle class exurban teenagers with sports scholarships do NOT want to wear pink clamdiggers topped off with stovepipe hats. That's your entire chain of stores' summer wear stock now useless even on sale.

While I share your skepticism, this Dineen makes intuitive sense. But given that all the labs seem to be doing this and are not super concerned about it, I assume it serves its purpose.

Mythos is a big boy, I imagine there's a lot of synthetic data in there and it seems to be working

Intuitively, it seems unlikely that the result will be any better than what you started with. And apparently both experiments and mathematics indicate that what happens is "model collapse," i.e. with each iteration the new model performs worse.

Yes, this follows from the data processing inequality.
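For reference, the standard textbook statement of that inequality is below. The mapping onto this thread is only an assumed reading: X would be the original data, Y the text generated by the first model, and Z a model trained on Y alone, so that Z depends on X only through Y.

```latex
X \to Y \to Z \ \text{(a Markov chain)} \quad\Longrightarrow\quad I(X;Z) \le I(X;Y)
```

Here I(·;·) is mutual information. The inequality says that processing Y further, with no fresh access to X, cannot increase what is known about X.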

Assuming that's all true, it follows that LLMs must be missing some essential attribute possessed by human brains. Because we apparently picked ourselves up by our bootstraps and created from scratch all the text which is used to create LLMs.

No. It applies just as well to humans. And humans did not build a civilization by thinking really hard at a corpus of word sequences. Oh, we tried this too, to an extent, and got wonders like Sophistry, Rabbinical Judaism, Medieval Scholasticism, Marxism and Rationalism. But we mostly progressed by receiving environmental feedback, filtering the generated data and preferentially training on the validated fraction. Similar logic can be applied to LLMs (or any ML artifacts). This is why the basic trick of the current paradigm is RLVR (reinforcement learning with verifiable rewards). You finetune a model on successful trajectories, then you give it tasks and update towards the policy that has generated correct conclusions. The primary source of updates is the model itself, steered by an external verifier. In principle they can do this fully autonomously, by building an ontology of possible tasks that can be algorithmically verified, coding these verifiers, and generating (e.g. relying on web search) queries against these tasks.

Even under very rudimentary realistic assumptions, generated data improves model performance.

Marxism and Rationalism

I don't understand how these fit into the category like the religious examples

Sophistry is not religious either.

I don't want to make a definition for this category because it's very loose, but basically it's "attempts at recursively improving your understanding via introspective self-play starting from a given set of verbal premises, without any significant role for procedures of updating on empirical, physical evidence".

One can see how this might well work in fields which really don't need an empirical physical component, such as math. Physics can inspire new subdomains of math, but strictly speaking we don't need this. An AI could train on its own data (+ easy verifiers and just corrected majority voting), entirely autonomously, to become an ever stronger mathematician.

No. It applies just as well to humans. And humans did not build a civilization by thinking really hard at a corpus of word sequences. Oh, we tried this too, to an extent, and got wonders like Sophistry, Rabbinical Judaism, Medieval Scholasticism, Marxism and Rationalism. But we mostly progressed by receiving environmental feedback, filtering the generated data and preferentially training on the validated fraction. Similar logic can be applied to LLMs (or any ML artifacts). This is why the basic trick of the current paradigm is RLVR (reinforcement learning with verifiable rewards). You finetune a model on successful trajectories, then you give it tasks and update towards the policy that has generated correct conclusions. The primary source of updates is the model itself, steered by an external verifier. In principle they can do this fully autonomously, by building an ontology of possible tasks that can be algorithmically verified, coding these verifiers, and generating (e.g. relying on web search) queries against these tasks.

Sadly, I do not understand this. Would you mind giving me a concrete example of the RLVR process you refer to?

Question: What is 2 + 2

Model: Hmm, that’s 2 and then another 2, so 22.

AUTOMATIC VERIFIER: WRONG

——

Model: Hmm, that’s the sum of 2 and 2, so 4

AUTOMATIC VERIFIER: CORRECT.

The model is tweaked slightly to make the second output more likely, and that output is potentially added to the training set. Repeat for arbitrarily complex mathematics and other problems as long as the solution can be verified, even if it isn’t known in advance. In this way you can generate potentially infinite amounts of data, albeit limited to certain domains. However, problem solving ability has so far extended quite well to other domains even when trained in this manner.
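The loop above can be caricatured in a few lines of Python. This is only a sketch under loud assumptions: `toy_policy` is a hypothetical stand-in for a model, and the single number `p_correct` stands in for its weights. Real RLVR updates a neural network with a policy-gradient style method, not a scalar.

```python
import random

def verifier(question, answer):
    """Automatic check: recompute the ground truth and compare."""
    a, b = question
    return answer == a + b

def toy_policy(question, p_correct, rng):
    """Hypothetical 'model': adds correctly with probability p_correct,
    otherwise concatenates the digits (2 and 2 -> 22)."""
    a, b = question
    if rng.random() < p_correct:
        return a + b
    return int(f"{a}{b}")

def rlvr_round(p_correct, questions, rng, step=0.05):
    """Sample answers, keep the verified ones, and nudge the policy
    towards the behaviour that produced them."""
    kept = []
    for q in questions:
        ans = toy_policy(q, p_correct, rng)
        if verifier(q, ans):
            kept.append((q, ans))                   # candidate training data
            p_correct = min(1.0, p_correct + step)  # "tweaked slightly"
    return p_correct, kept
```

Calling `rlvr_round(0.2, [(2, 2)] * 50, random.Random(0))` returns the updated probability and the verified trajectories; only outputs the verifier accepted ever influence the update.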

Generally speaking, how does this "automatic verifier" work? Obviously I am not an expert but it seems like this automatic verifier would require human level intelligence.

In this toy case it's just literally a calculator (a snippet of python code). The problem is 2+2, the calculator just does 2+2 and checks if the answer is the same as the LLM output. (The LLM is trained to format the final answer in a particular manner and wrap it with special tokens, so the verifier doesn't have to be able to interpret natural language.)
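A minimal sketch of such a calculator-style verifier, assuming the final answer is wrapped in `<answer>...</answer>` tags. The tag format and function names are made up for illustration; actual labs use their own special tokens.

```python
import re

# Hypothetical answer format; the real special tokens differ by model.
ANSWER_PATTERN = re.compile(r"<answer>(.*?)</answer>", re.DOTALL)

def extract_answer(model_output):
    """Return the last tagged answer, or None if the model did not format one."""
    matches = ANSWER_PATTERN.findall(model_output)
    return matches[-1].strip() if matches else None

def verify_sum(a, b, model_output):
    """The 'calculator': compute a + b and compare with the tagged answer."""
    raw = extract_answer(model_output)
    if raw is None:
        return False  # unformatted output counts as wrong
    try:
        return int(raw) == a + b
    except ValueError:
        return False  # tagged answer is not an integer
```

Note that the verifier never reads the reasoning text, only the tagged answer, which is why it does not need to understand natural language.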

You can get surprisingly far with this. If it's a calculus question, you can use an automatic differentiator to check it. Likewise for factorisation questions, metric conversion questions, algebraic manipulation of formulae, etc. You put a little work into programming the automatic verifier and you can get an infinite number of problems.
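To make the "infinite number of problems" point concrete, here is a rough sketch for the factorisation case: one function generates problems, another checks answers. The function names and parameters are invented for this example.

```python
import random
from math import prod

def is_prime(n):
    """Trial division; fine for the small numbers used here."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def make_problem(rng, n_factors=3, max_prime=50):
    """Generate a number to factorise by multiplying random small primes."""
    primes = [p for p in range(2, max_prime) if is_prime(p)]
    return prod(rng.choice(primes) for _ in range(n_factors))

def verify_factorisation(n, factors):
    """Accept only a non-empty list of primes whose product is n."""
    return (
        len(factors) > 0
        and all(is_prime(f) for f in factors)
        and prod(factors) == n
    )
```

Checking an answer is much cheaper than producing it, which is what makes this kind of verifier practical.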

If you're a big company, you might have human domain experts doing some of this work too. If you're a smaller company you have a big LLM do verification for the smaller ones.

Then you have leetcode and programming problems, and again you can verify these automatically. Does the program compile? Is the program output what was requested? Is it faster than the previous solution?
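A bare-bones sketch of the first two checks (does it compile, is the output what was requested), using Python's built-in `compile` and `exec`. The helper name and test-case format are assumptions for this example, and a real pipeline would run untrusted code in a sandbox with time and memory limits rather than calling `exec` directly.

```python
def verify_program(source, func_name, test_cases):
    """Return True if `source` compiles and `func_name` passes every test case.

    test_cases is a list of (args_tuple, expected_result) pairs.
    WARNING: exec on untrusted code is unsafe; this is illustration only.
    """
    try:
        code = compile(source, "<candidate>", "exec")  # "does it compile?"
    except SyntaxError:
        return False
    namespace = {}
    try:
        exec(code, namespace)
        func = namespace[func_name]
        # "is the output what was requested?"
        return all(func(*args) == expected for args, expected in test_cases)
    except Exception:
        return False  # missing function or runtime error counts as failure
```

A speed check could be layered on top by timing the passing candidates, but correctness has to come first.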

Like I said, this only works for maths, programming, and other domains where you can verify the answer with a computer relatively cheaply, but contra the model of multiple intelligence factors, heavy training on maths and programming seems to improve general intelligence and reasoning quite well.

Like I said, this only works for maths, programming, and other domains where you can verify the answer with a computer relatively cheaply,

This is what the armies of Kenyans are for. I'm actually surprised progressive libs don't use "muh mechanical Turk sweatshop" as an anti AI talking point.

Also, I think this is in some ways what they use the thumbs up/down for? I saw people saying the sycophantic behavior of the 4o era came about because people love being glazed and thumbed that up a LOT, so it crept in.

Thank you for the explanation. My instinct is that even with this type of training, LLMs will still be missing something essential, but I will give it some thought.

My instinct is that even with this type of training, LLMs will still be missing something essential

Your instinct is probably correct IMO. This form of synthetic data generation is just another tool in the box, it's not the key to everything.

I will say that we've got far further than I ever expected us to get using these methods. I'm instinctively a Gary Marcus-style fan of embodiment and unsupervised learning, it seemed clear to me pre-LLM that models wouldn't be able to be anything resembling intelligent without a body and the ability to interact with the real world and 'test' their understanding in real time. When LLMs came in, I felt I had to admit that I'd been wrong. It seems clear to me that we have managed to get to something I would call 'intelligence' (even if it's spiky and fails in some cases where humans would not fail) through these means. So I no longer trust my instincts as much.

This kind of semi-supervised exploration seems like a good compromise for now. I am also very interested in LLMs that can combine next-token video generation and text generation, because video generation requires understanding a bunch of stuff about the real world in order to produce consistent results, but that's a way off.

We formulated our understandings of the world and our interactions with it into techniques and theories, and when we build stuff we do so by employing those techniques and theories from a standpoint of engineering and design. LLMs are merely next word generators. They can recall many of the things in their databases and regurgitate them to us, but their outputs aren't the products of strategically employed techniques and theories. This is inherently limiting for the complexity of the outputs they can give us.

I don't understand this claim. Who "we"? Most people learn almost everything they know about economically valuable complex domains from textbooks, manuals, teacher's answers and such second-hand information, and then polish it with on-site instructions and increasingly long-range, open-ended training. They don't build much in the way of their own "techniques and theories" and there's not a world of difference from what LLMs now do. Maybe you're overestimating how much they depend on pretraining at this point. Well, it's believed that >50% of compute in some of the last-generation models goes towards RL, not pretraining on human data.

And as I've said in the opening post: we have literally just seen an LLM employ a technique no human mathematician had thought of using in this specific context, to solve a problem that had remained unsolved since 1968 – over half a century! It wasn't some Riemann hypothesis tier challenge, but it wasn't exactly obscure either, smart professional mathematicians had been working on it for years before GPT 5.4 Pro came and did this. Moreover, GPT does this reliably. In the comments you can see Terence Tao, arguably the guy with the greatest knowledge of "techniques and theories" of math on the planet Earth, an expert of such level that he actively avoids getting roped into solving other people's frontier research level problems, seriously engage with GPT's work:

Thanks! So there does seem to be something special about the original von Mangoldt process - the associated invariant measure $\nu$ is extremely smooth (in the Archimedean sense), being asymptotic to $1/(n \log n)$, while all the variants of this measure pick up arithmetic factors such as $1/\prod_p v_p(n)!$.

  • A little surprising to me that removing individual primes instead of prime powers makes it less likely to have prime multiplicity, but I'll chalk it up to one of the numerous probability paradoxes that arise when one tries to compare various weighted expectations. But these factors mean that one cannot immediately solve #1196 by using these processes instead of the von Mangoldt one, as the invariant measure is no longer asymptotic to $1/(n \log n)$.
  • So in some sense the AI was "lucky" in finding the one approach that actually worked; it would be interesting to publish the traces to see if there was a lot of brute force involved in trying nearby approaches which didn't quite work.

……

Arb Research has kindly shared with me ten separate runs of GPT 5.4 Pro on this problem #1196 (with a request not to use internet search). From a quick reading, it appears that 8 of them claimed successes, with the other 2 rating the claim as plausible. Interestingly, several of the successful runs actually obtained the sharper formula ∑n≤Aν(n)≤1 that was also derived here, with ν essentially the Mellin transform of 1/ζ(s)

  • Almost all of the runs latched on to the approach of constructing a random chain with a good hitting probability (many runs referred to this as the "Lubell method", after the Lubell of the LYM inequality).

Another notable fact is that none of the runs highlighted the von Mangoldt process that was a prominent feature of the original run (and none of them mention flow networks either). Runs 4 and 7 have an interesting alternate construction of the upward divisibility chain in terms of exponential clocks in the prime factorization indices that actually looks rather tractable to work with; I will need to study this construction further when I have more time.

Basically it seems that for this particular type of problem there are several natural ways to proceed that make the problem actually quite tractable; the literature had managed to focus on a somewhat suboptimal approach in which the opening move was to transfer the problem to a continuous setting, but the AI runs consistently stayed in the discrete world and managed to utilize various existing tools from discrete mathematics (mostly centering around methods relating to the LYM inequality) to reach a solution.

So I don't know. Where's this inherent limit on complexity that you're talking about? What in our culture is truly irreducibly complex, if not math that can surprise Terence Tao?

This is getting a bit comical, don't you think?

I must differ here as I do not see evidence (in domains I'm able to judge) of AI employing techniques and theory in its tasks. Ask it to mimic Stephen King and then compare the output to actual Stephen King. You'll understand what I mean.

I cannot speak to math here as I lack competency in that. But from what I hear from coders, it's similar in that domain as well: AI can regurgitate volumes of legible code, but it cannot utilize structure.

Humans have techniques and theories which inform their decisions high and low as they layer things together using judgement, intuition, etc., while AIs appear to generate text using probabilistic hacks. AI appears to be able to recreate low-complexity patterns from its dataset. I disagree that these processes are related except at a very basic level.

We have a good idea of how to train AI to solve mathematical problems of virtually unbounded complexity. In the course of this, AI clearly learns "techniques" as shown here, if not "theories". I don't think King's prowess is theory-driven either, but in any case we don't have a good idea of how to train AI to be a good prose writer. We have some ideas, but are unlikely to act on them. There's not much money to be made in it, and plenty of highly motivated enmity – AI is already widely hated. And yes, autoregressive generation for the prompt "write like King" is not like King actually writing a novel. We have such tricks though.

My point is, it's not a general principle that AI will only rehash human techniques in some uninspired "probabilistic" way. If there is a hill to climb, such that "good" and "bad" outputs with regard to the problem statement can be distinguished, AI can bumble its way up the hill and also find new tricks. We've seen this before LLMs, with AlphaGo and move 37, we're starting to see it with LLMs.

while AIs appear to generate text using probabilistic hacks.

Human mind runs entirely on probabilistic mush. Neural networks were invented as approximation of our own approximate learning. But probabilistic decision processes can have clear enough decision boundaries that they become able to operate with "abstractions", "symbols" or "theories". They also remain able to fail. For example, you are failing to update on evidence, because you haven't been trained to take input like "Terry Tao is surprised" seriously and think it's infinitely less interesting than your preconceived notions, basically some dweeb noise. Unlike an LLM, you can update at lifetime, so maybe you'll reread the above post and see how it contradicts your position.

This is getting a bit comical, don't you think?

Seen on X:

"As the Earth is being disassembled:

"Guys, stop over-reacting! The concept of a Dyson Sphere was already in the training data!"

Heh. See, the AI making that Dyson Sphere doesn't have general intelligence, I bet it can't get the Wordle 6 days in a row like me.

Because we apparently picked ourselves up by our bootstraps and created from scratch all the text which is used to create LLMs.

Or it holds for human brains, but we train on something higher than our text, and so LLMs are upper bounded by us if they train on our text, which is like our cognitive discard.

Suppose you take the latest and greatest LLM and use it to generate a huge corpus of text and use that text to train a new LLM. And then repeat the process a number of times. Intuitively, it seems unlikely that the result will be any better than what you started with. And apparently both experiments and mathematics indicate that what happens is "model collapse," i.e. with each iteration the new model performs worse. Because you always lose a little with each iteration. Assuming that's all true, it follows that LLMs must be missing some essential attribute possessed by human brains. Because we apparently picked ourselves up by our bootstraps and created from scratch all the text which is used to create LLMs.
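The "lose a little with each iteration" effect can be reproduced in a toy setting that has nothing to do with language: fit a Gaussian to a small sample, draw the next sample from the fit, and repeat. This is only an illustrative sketch with made-up parameters, not a claim about any real LLM, and it has no external verifier or fresh data in the loop.

```python
import random
import statistics

def collapse_demo(generations=200, sample_size=10, seed=0):
    """Repeatedly refit a Gaussian to samples drawn from the previous fit.

    Returns the fitted variance at each generation, starting from 1.0.
    """
    rng = random.Random(seed)
    mu, sigma = 0.0, 1.0
    variances = [sigma ** 2]
    for _ in range(generations):
        # "Generate a corpus" from the current model...
        samples = [rng.gauss(mu, sigma) for _ in range(sample_size)]
        # ...and "train" the next model on that corpus alone.
        mu = statistics.fmean(samples)
        sigma = statistics.pstdev(samples)
        variances.append(sigma ** 2)
    return variances
```

With small samples the fitted variance drifts towards zero over the generations, so the tails of the original distribution are forgotten first. The loop is purely closed: nothing outside the model ever corrects it.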

Where did you hear that anyone is proposing to reach AGI via LLMs by training LLMs on their own generated output? That's clearly dumb and not what people propose. The model has to interact with something real, it has to "touch grass", for it to work. That's the external information. For example a coding LLM can get an informative learning signal by running its generated code through the compiler and running tests and seeing if the resulting program compiles, passes the tests, uses less RAM or is faster, etc. I'm not saying that leads to AGI, but there are clearly ways to obtain information from the outside world, and it's not just about sewing a pipe from the LLM's ass back into its mouth.

Where did you hear that anyone is proposing to reach AGI via LLMs by training LLMs on their own generated output? That's clearly dumb and not what people propose.

I presented the idea as a thought experiment, not as an actual proposal.

No, you presented it as a conceptual proof that LLMs will never get better. All it takes is one innovation that addresses your concern about recycled data to make it invalid. All arguments about intelligence are necessarily a bit wishy-washy, mind you, so I'm not saying your thought experiment is useless.

I think if you really want to argue that LLMs have an inherent cap on their capability, you should address their actual algorithm rather than how they're trained. However much we rejigger them with CoT thinking and non-text data sources, they're fundamentally not designed for anything more than next-token prediction. It should be a source of constant surprise that they do so well on such a wide variety of non-creative-writing tasks (look at early SSC posts about GPT3's output to see this surprise evolve in real time). You could argue that if LLMs end up hitting a soft or hard limit, that's really just the "surprise" petering out, that we really can't just take a glorified text completer and keep pumping neurons into it until it's a genius.

I don't personally believe this will happen, but hey, I don't think anyone really knows for sure.

No, you presented it as a conceptual proof that LLMs will never get better.

Umm, no. In fact I totally think that LLMs will get better.

I presented it as a thought experiment to show that LLMs seem to be missing some essential attribute possessed by human brains.

All it takes is one innovation that addresses your concern about recycled data to make it invalid.

Yes, of course. Well, perhaps more than one innovation. But yes, if LLMs are missing something important; and we create LLMs 2.0 which include that important thing (or those important things), then yeah, we'll have AGI.

Because we apparently picked ourselves up by our bootstraps and created from scratch all the text which is used to create LLMs.

This is clearly proof that the Saurian Overlords of Agatha in the Hollow Earth taught us language.

More seriously, isn't there a lot of research going into using synthetic data safely? I thought that the current consensus was that you can avoid model collapse with synthetic data if it's properly labeled as such.

More seriously, isn't there a lot of research going into using synthetic data safely? I thought that the current consensus was that you can avoid model collapse with synthetic data if it's properly labeled as such.

I have no idea, but intuitively it seems to me that training with synthetic data is something that can't possibly work. To be sure, I am neither a mathematician nor a computer scientist. But as I understand things, the basic operation of an LLM is to predict the most likely words to follow a string of words. Which is done by training a neural network on lots of text. It's difficult for me to see how an LLM could get better at predicting words if it's trained on its own output. It seems to me that if you created an LLM using a corpus of synthetic data, the best you could realistically hope to do would be to reverse-engineer the original LLM which had created the synthetic data in the first place.

Anyone who is a subject matter expert, feel free to correct me.

I have no idea, but intuitively it seems to me that training with synthetic data is something that can't possibly work.

I agree with you intuitively. However large amounts of Serious People are spending large amounts of Serious Economic Resources to do literally just that. So they clearly see something there.

Mythos seems to be yet another OOM of compute and training and we can be sure a good % of that was synthetic at this point as they already ate the entire corpus of human writing a while ago

I agree with you intuitively. However large amounts of Serious People are spending large amounts of Serious Economic Resources to do literally just that. So they clearly see something there.

Part of me wonders whether this might be some kind of mass delusion and/or grift. It wouldn't be the first time something like that has happened. That being said, I am not a subject matter expert and haven't studied these issues carefully so I couldn't really say.

I think mods should intervene… somehow, because these posts are getting too frequent, too obviously agenda-laden, and aren't even remotely about the culture war (though AI discussion as such is necessary). It's becoming one guy's AI Bad blog.

Look man, it seems that the Opus 4.7 tokenizer change functionally amounts to them forcing each whitespace be a separate token rather than part of any subword, removing all whitespace-containing subwords from the vocab; it does not change the compression rate for whitespace-free languages. I do not know why Anthropic did that, but my hypothesis is that they've found in experiments that this is better in some valuable scenarios, such as related to analyzing code for vulnerabilities; trained Claude Mythos with it; and now are pushing Opus further via distillation from Mythos (this is suggested by it being weirdly different, and them saying they now focus on GraphWalks, which Mythos is doing really great on, for evaluating long-context performance).
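To illustrate what "each whitespace is a separate token" would mean for token counts, here is a toy word-level comparison. It is emphatically not Anthropic's tokenizer, which is a subword tokenizer whose details are not public; both functions are hypothetical and only show the direction of the effect.

```python
import re

def merged_tokens(text):
    """Toy scheme where a leading space is glued onto the following word."""
    return re.findall(r" ?\S+|\s", text)

def separate_tokens(text):
    """Toy scheme where every whitespace character is its own token."""
    return re.findall(r"\s|\S+", text)

english = "the quick brown fox"
print(len(merged_tokens(english)), len(separate_tokens(english)))

# Text without whitespace tokenizes identically under both toy schemes.
no_spaces = "你好世界"
print(merged_tokens(no_spaces) == separate_tokens(no_spaces))
```

Both schemes are lossless (joining the tokens gives back the input); only the count changes, and only for text that contains whitespace.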

For logprob distillation, you ideally need identical vocabulary (there are copes for inter-tokenizer logprob matching, but better just change the student model's tokenizer and heal it).

As a datapoint in the timeline of AI progress, it's a total nothingburger, a non-news.

Anthropic's move here (combined with them handicapping Opus 4.6 a few weeks ago) seems to clearly be an attempt to achieve profitability.

Do you realize that while this is bad for users, it's not that good for Anthropic? The compute and memory cost per a sequence of 1 million tokens is the same whether these tokens encode 1 million or 500 thousand English words. It doesn't improve the profit margin. Of course, now that everyone's codebase is functionally like 40% "larger", they are selling more tokens to their captive clientele for each plaintext-identical request. But this would be such an awkward growth hack. And on Claude Plan, cache is free anyway, so their margins could even shrink.

For everyone here, but perhaps especially the AGI believers, have your feelings changed at all over the last few months?

Yes. After GPT 5.2 I've become a bit paranoid that we will have AGI before 2028 and are totally unprepared. Recent events such as GPT 5.4 autonomously solving Erdos #1196 with a trick that no human mathematician expected corroborate my feeling.

I agree that his posts are far below the average level of quality here, but they're not THAT frequent, and I wouldn't want them to be modded. This is supposed to be a free speech forum, after all. Our whole thing is that the good ideas are supposed to win over the bad ideas, and fortunately he seems to be getting plenty of pushback. And there are some decent debates happening in the replies, it's just that they're despite OP, not because of him.

After GPT 5.2 I've become a bit paranoid that we will have AGI before 2028 and are totally unprepared.

What is AGI? Will it cure blindness and reverse aging? What about GPT 5.2 made you think we're 2 years away from that?

I don't pretend that AGI is some clean concept. For me, it means a very banal thing: an AI that can reliably replace a human worker. "I can point an LLM at a knowledge work task and it'll do it". Or at least: it'll commit to an honest humanlike attempt to do the job, it won't run out of context length, won't hallucinate something superficially related, won't trip on its own shoelaces; it'll reason about the problem, identify what it lacks, collect the necessary data, maybe do some trials in a scratchpad of sorts, consistently orient towards truth and common sense, do its best and then admit to me if something was still genuinely beyond its ability.

5.2 was a very big jump over 5/5.1 and it showed, in my opinion, a very powerful awareness of problems, an ability to contextualize and deconstruct them. 5.4 and the upcoming 5.5 clearly continue this trend. They've figured something out and I believe it's on the path to AGI as defined above, modulo technological details that seemingly won't be a long-term blocker.

Will it cure blindness and reverse aging?

Will anything? Will human scientists? I don't know. Plenty of things that human-level intelligence has so far proven unable to solve. But so long as science is knowledge work, yes I expect AGI to do it at least as well as we do.

I have the unpopular (and, ok, partially tongue-in-cheek) position that we've already hit AGI. What LLMs can do is already very general, just not fully general. But I wish it was emphasized more that we messy meaty humans don't have fully general intelligence, either - it doesn't matter how you bring up a precocious child, they're not going to be able to rotate 50-dimensional shapes or approximate partial differential equations in their head, and all but the best of us max out at fluency in a few languages, or memorizing a few thousand digits of pi. We're just so used to the things we (and everyone else we've ever known) can't do in our heads that we intuitively don't even think of them as tests of "intelligence".

Someone from the early 2000s, having LLM capabilities described to them, would indeed think that it meets the definition of general intelligence. What we kind of subconsciously expected, but didn't happen, was that someone would just suddenly launch an AI product that lit up a giant neon sign saying "AGI ACHIEVED!". Instead, the AI we've developed so far just turned out to have a different set of strengths and weaknesses than us. By the time we're able to bring those weak points up to human level - i.e., where an AI can perform equally well as an average human on any task, which is what a lot of people think of when they say "AGI" - it'll actually be vastly superhuman in the things that come naturally to it. (LLMs are already superhuman on language comprehension, after all.)

I agree, according to any pre 2019 definitions LLMs would 100% be AGI! It’s funny how the goalposts were immediately moved the moment we achieved it, probably because LLMs didn’t fit into our sci-fi preconceptions of how an AI should behave or suddenly “awaken”, and their strengths and weaknesses are completely different from humans, ordinary software, or stereotypical science fiction robots.

In fairness, the goalposts were moved because we realized LLMs couldn't do certain AGI things despite passing the "AGI" tests.

For example, they can pass a Turing test consisting of independent questions with short answers, but could never pass a "Turing test" over years, because they have limited context windows (and even with tools and a filesystem, too many things change for them to store and organize). They've effectively passed ARC-AGI 1 and ARC-AGI 2, but not yet ARC-AGI 3, while a median (from their tests) human passes all (play it yourself).

They'll be "true AGI" when we can no longer create (non-physical) tests they don't immediately pass.

Although I agree with SnapDragon that they're "partial AGI". I believe the missing component is continuous learning: they start output like a human, as they've been trained to, so if they continued to be "trained" on their observations, presumably they'd continue to output like a human.

In fairness, the goalposts were moved because we realized LLMs couldn't do certain AGI things despite passing the "AGI" tests.

Yeah, no argument here. Like you said, it's kind of natural that we adjusted our expectations as we learned more about the nature of intelligence (now that we have more than just one kind to generalize from). We sort of assumed that a lot of other human-like capabilities would necessarily come along for the ride when an AI passed the Turing Test, and that was wrong.

Just as long as we don't keep using "it's not true AGI!" as a cognitive stop sign to avoid recognizing the incredible progress we've made.

Although I agree with SnapDragon that they're "partial AGI". I believe the missing component is continuous learning: they start output like a human, as they've been trained to, so if they continued to be "trained" on their observations, presumably they'd continue to output like a human.

Indeed. I've heard of efforts to graft a learning layer onto LLMs (with a "memory" that's an embedding rather than just CoT text), but obviously it hasn't worked so far, and maybe it never will. Also that still seems like a short-term solution.

People are shitting on the fact that 35% more token usage means they have to pay more for the same text. I'm pissed off by the fact that 35% more token usage means the amount of actual text/code rather than tokens before Opus 4.7 needs to compact is down by around 25% meaning more frequent compactions and worse long context performance.
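The arithmetic behind those two percentages, as a quick sanity check. The 35% figure is the one quoted above; the 200,000-token window is a hypothetical number picked for the example.

```python
def text_capacity_change(token_inflation):
    """Fractional change in how much text fits in a fixed token budget
    when the same text costs (1 + token_inflation) times as many tokens."""
    return 1.0 / (1.0 + token_inflation) - 1.0

def effective_budget(context_tokens, token_inflation):
    """Size of the window measured in old-tokenizer tokens."""
    return context_tokens / (1.0 + token_inflation)

print(f"{text_capacity_change(0.35):.1%}")     # about -26%
print(round(effective_budget(200_000, 0.35)))  # hypothetical 200k window
```

So 35% more tokens per unit of text means 1/1.35 ≈ 74% as much text before compaction, a drop of roughly a quarter.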

Just give us back release Opus 4.6, that was a great model.

I think mods should intervene… somehow, because these posts are getting too frequent, too obviously agenda-laden, and aren't even remotely about the culture war (though AI discussion as such is necessary). It's becoming one guy's AI Bad blog.

I made 3 top level posts about it over the course of about two weeks, I hardly think that's excessive. And I'm still commenting on plenty of non-AI topics, I'm hardly a one-trick pony like some other posters. But if you think it's an issue, then feel free to report me.

Compare that to some posters who seem to make every post about the joos

the joos

Who do you think makes the AI

Could you at least add more substance than "Opus changed tokenizer, therefore, as I've already said, AI is a bubble"?

Opus is increasing end-user costs, not just changing the tokenizer, and that's the part I said is indicative of a bubble because it looks like they're finally needing to squeeze a profit out of customers instead of subsidizing their usage with VC money. That's the crux of my argument, not just "tokenizer changed -> bubble." If you're not going to even try to summarize my argument in good faith I see no reason to engage with you.

I also talked about changing sentiment in the tech community, which you completely ignored.

Opus is increasing end-user costs, not just changing the tokenizer

Is it? Is it now? For example, on this bench Opus 4.7 is almost as strong as Opus 4.6 but 8.3x cheaper, because it uses vastly fewer tokens. How does this fit into your theory?

Given that pretty much every single model trains against the benchmarks, I'm going to go with the end users on HN reporting running out of tokens way faster (for the same sorts of tasks they were doing on 4.6) over another synthetic benchmark.

I think mods should intervene… somehow, because these posts are getting too frequent, too obviously agenda-laden, and aren't even remotely about the culture war (though AI discussion as such is necessary). It's becoming one guy's AI Bad blog.

I could name half a dozen topics that come up again and again, sometimes in tedious fashion, and sometimes by a few individuals who post about little else. Generally speaking, we don't "intervene" because someone is tired of a topic, or even because we are tired of a topic.

And everything is "obviously agenda-laden" to people who have an opposing viewpoint.

If you don't like a post, you can ignore it or respond to it. You can even report it if you genuinely think it violates the rules. (Most reported posts are not violating the rules, they are just violating the reporter's sensibilities.)

(Most reported posts are not violating the rules, they are just violating the reporter's sensibilities.)

An issue is that people complain about posts and moderators say "you complained about it, but nobody reported it". This encourages over-reporting.

Rarely do we say "You're right, that post should have been modded but we didn't notice it because no one reported it."

Instead, we tell people not to publicly demand someone be modded or attack them, but simply report the post if they think it warrants it.

People do over report, but that's because they use the report button to mean "I don't like this." We prefer that to public callouts, but people should really just let things go if they're mad at what someone wrote unless it was truly a bad post. ("Bad" in the sense of being against what the Motte is intended for, not bad in the sense that you don't like it.)

OK but do you agree that "Anthropic has slightly altered their tokenizer in a 0.1 update for Opus" is not really "controversial issues that fall along set tribal lines"? Which tribe has a strong position on Anthropic's tokenization design choices?

You underestimate how easy it is to turn any random technical issue into a heated controversy.

Not every CW post has to fall strictly along tribal lines.

I suggest making use of the scroll button rather than demanding a Motte precisely curated to your tastes.

What about the idea of making a separate thread? I'm very interested in AI, but it's a poor fit for the Culture War Roundup. If "Transnational Thursday" and "Tinker Tuesday" can get their own weeklies, surely this deserves one, too? We just have to decide on an alliteration! (Claude recommends either "Machine Monday" or "Singularity Saturday")

Quarantine threads are where discussion goes to die.

I'm still a fan of Butlerian Jihad General

OK but do you agree that "Anthropic has slightly altered their tokenizer in a 0.1 update for Opus" is not really "controversial issues that fall along set tribal lines"? Which tribe has a strong position on Anthropic's tokenization design choices?

What an excellent and totally accurate summarization of my post and the argument I was trying to make in it.

I think mods should intervene

Another call for a recurring Butlerian Jihad Roundup, so AI/tech drama doesn’t detract (or get detracted by) Trump/woke drama.

AI/woke drama is my favorite combo but sadly not present here

The most plausible reason for changing the tokenizer is that a more fine-grained tokenizer increases model performance, at the cost of more compute for the same input (we're breaking up the same input into more tokens). My understanding is that you don't even need a new base model to do this, and that the gains are particularly pronounced for arithmetic and coding. It's not a free lunch, but there are pros and cons that don't just amount to Anthropic nickel-and-diming their customers.
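To make the "same input, more tokens" point concrete, here is a minimal toy sketch. The function `toy_tokenize` and its `max_piece` parameter are hypothetical, invented purely for illustration; real tokenizers (including whatever Anthropic uses) work quite differently, so this only shows the counting effect, not any actual design.

```python
def toy_tokenize(text: str, max_piece: int) -> list[str]:
    """Toy tokenizer (hypothetical): split on whitespace, then cut each
    word into pieces of at most max_piece characters."""
    pieces = []
    for word in text.split():
        for i in range(0, len(word), max_piece):
            pieces.append(word[i:i + max_piece])
    return pieces


sample = "tokenization changes how many tokens the same prompt costs"

coarse = toy_tokenize(sample, max_piece=8)  # coarser vocabulary
fine = toy_tokenize(sample, max_piece=3)    # finer-grained vocabulary

# Same characters either way, but the finer split produces more tokens,
# so a per-token price or per-token limit is hit sooner.
print(len(coarse), len(fine))
```

If billing and rate limits are counted per token, the finer split uses up a budget faster on identical text, independent of whether the model's answers get better.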

Even if AGI is actually possible with LLMs (or at all, but I'm not trying to start a discussion on metaphysics here), it looks like the capital needed to achieve it is drying up before it can be reached. Anthropic's move here (combined with them handicapping Opus 4.6 a few weeks ago) seems to clearly be an attempt to achieve profitability. The free/subsidized rate train for end users has pulled into the station, and now you have to pay more for the same (or worse) capabilities you were enjoying before.

Anthropic is, by far, the most compute strapped frontier LLM company. They are also not the only frontier LLM company. Until at least Google and OAI engage in the same putative enshittification (which I am far from sure is even happening wrt Anthropic), then you're kinda jumping the gun here.

Google and OAI have already engaged in the enshittification. Their latest models (apart from GPT-5.4, which is genuinely a good model) hallucinate in ways the earlier offerings 6 months ago weren't doing.

I haven't noticed that, and I do use all of them regularly. If you have some kind of formal benchmark to point at, I'd be more receptive.

Interesting, I don't have any formal benchmark, much like we don't have any formal benchmarks for the Opus 4.6 degradation beyond people (and AMD) complaining, but that's very much the impression I personally get from using Gemini 3.0+ and ChatGPT (before 5.4).

We need to distinguish between 'capital needed to achieve it drying up before it can be reached' and 'demand is so high that they have to ration resources'.

They kind of look the same but the underlying meaning is different. The former implies the Bubble is Popping whereas the latter implies It's Not a Bubble.

Firstly, I don't think the capital is drying up. Hyperscaler AI infrastructure spending rises year by year. Secondly, demand is huge. Anthropic's ARR is now at $30 billion (by their figures, though OpenAI says the real figures should be a few billion lower, depending on how you measure revenue shares). Whichever way you look at it, huge demand growth. $87M annualized run-rate in January 2024 → $1B by December 2024 → $9B by end of 2025 → $14B in February 2026 → $30B in April 2026 is pretty impressive, even if it's juiced.

Clearly they're getting lots of demand. There are also issues with slow datacentre rollouts and delays due to the absolute state of the Western electricity and construction sectors. I think the phenomenon we're observing is rooted in high demand, not investors getting antsy and demanding higher returns.

Whether a random patched version is a significant upgrade is hardly strong evidence in either direction on whether AI is a bubble. Did you ever try my suggestions under your last fud post?

Did you ever try my suggestions under your last fud post?

Came down with a cold, missed work for several days, and forgot. Sorry! I'll try to remember this week.

It's not about profitability, it's that they got a giant wave of users but not enough compute to fill that demand. So, it's pretty obvious what must happen next, you do some mix of increased mandatory token efficiency (adaptive reasoning) + stricter limits (across the board, free and paid, but mostly targeting the super-user hogs who theoretically will pay for extra API usage after limits run out).

I will say though this probably bodes poorly for Claude in the near-medium term, because ChatGPT had the same thing more or less happen with their 5.0 launch (forced adaptive model selection for mandatory token efficiency) and it definitely took the wind out of their sails for at least 4-5 months.

At any rate, however, I strongly, strongly disagree about this empowering the skeptics (or being evidence of a shift against AI adoption). The fact that people are whining about problems with their tools is selection bias. It's kind of like the classic armoring spots on the airplane that didn't have holes (because they didn't survive to be examined), in that people wouldn't complain so vociferously if they weren't so needy for the tool in the first place. The complaints to me are evidence of a generalized latent enthusiasm, not pessimism. In the grand scheme of things, it's far, far better for a company to have complaints that users can't get enough of their product, than it is for the product to be simply ignored. In the near term, I expect a decent chunk of users to swing back toward the OpenAI offering, Codex (which is undergoing a PR blitz of sorts right now)

I’ve found Opus 4.7 to generate better and more human-like text vs Opus 4.6 for my purposes, but I can’t indicate whether it’s any better at coding. I use a mix of LLMs for various things, and my feeling is that ChatGPT is more bland and LLM-y in its output, but much more generous with usage limits. In the limited coding I’ve done, I haven’t seen much of a difference between them. ChatGPT’s image generation model is also nice, as far as my amateur impression can tell.

But it’s a constant fight with the usage limits on Claude, whereas ChatGPT feels like it flows freely. My current pattern is to default to Chat for most informational and coding purposes and bring out Claude Opus for when I want a more thoughtful analysis of something. I don’t know how Sonnet compares to ChatGPT.

Gemini feels massively behind in both usability and tooling, and its integrations with third parties are only good for Google products.

TracingWoodgrains has been a fan of Opus, and seems a little frustrated by 4.7. That said, it may depend on your use case.

I'm generally not that surprised if there are occasional stinkers. I've given specific caveats around other vendors: it's just too easy to benchmax or find a bad local maximum such that there are some minor revisions that either don't have any benefit, or only have backend benefit. Repeated problems or broader-scale issues would say more, but there's been a number of surprisingly good models from other vendors recently, including small-parameter and open-model approaches.

I'm skeptical that LLMs are themselves enough to go to AGI, but I'm also skeptical that they're going to stop at exactly last month's level of capability, and last month's capabilities included solving some Erdos problems. There's a lot of low-hanging fruit just in terms of UI and process tooling, nevermind areas where we haven't applied existing tools.

That said, I recognize that a lot of the major AI vendors have ranged from scumbags to scammers. Altman's ridiculous behaviors, especially in relation to RAM, have made the most enemies (maybe even more than Musk's more conventional culture war), but the best PR the whole faction has got has come from anti-AI people, so that's a whole big mess.

LLMs are highly unlikely to get us to AGI. It’s the wrong architecture for getting there, period. I’ve continued to play around with Gemini and some other models here and there and while it can do some things that I think are cool and novel, my biggest surprises have come from its inability to proceed with context I’ve explicitly given it and it continues to get basic things wrong.

The pain in the ass I’ve experienced with model drift and trying to keep it on track continually leaves me wondering where its value truly lies. I’ve had it spit back to me literally every type of answer under the sun by the time I get it to zero in on the proper context and details and by the time I get there, I’m no longer entirely confident that it has the correct chain of reasoning.

"LLMs are highly unlikely to get us to AGI. It’s the wrong architecture for getting there, period."

What makes you so sure about that? This sounds to me like: "fixed-wing aircraft are unlikely to get us to flight. It's the wrong architecture for getting there, period. We need flapping wings. Every animal that flies flaps its wings"

I’m confident about it because LLMs lack a true capability to understand the world.

They use statistical correlation to predict the next likely token, which means they mimic intelligent reasoning, rather than possessing it. They also lack any concept of a “world model," and don’t understand causal relationships of the world, only the linguistic patterns describing it.

Even the most advanced models that are “capable” of advanced reasoning struggle ‘massively’ with distribution shift and they fail whenever they face situations outside their training data. Because of the way they train on data, their understanding of things doesn’t evolve in real-time. They can’t learn from continuous, active interaction with the world.

Gary Marcus had a good talk on the problems endemic to these systems fairly recently.

I’m confident about it because LLMs lack a true capability to understand the world.

For what it may be worth, I tend to agree with this. Actually, I think it was Gary Marcus who observed that when LLMs play chess, they still make the occasional illegal move despite the fact that they are trained on databases which contain both the rules of chess AND millions of historic chess games. (To be sure, I have not verified this myself.)

By contrast, a fairly bright child can be trained in a few hours how to play perfect chess -- perfect in the sense of never making an illegal move.

Another example, of course, is the car wash puzzle.

Another is the puzzle I posed here a few months back about the NYC helicopter trip.

It just seems like LLMs at present don't actually model the universe. What it reminds me of is when you take an advanced math class in high school and there is that one student in the class who has no real understanding of the concepts but gets As anyway because the exam problems are somewhat similar to the problems in the textbook and the student grinds away on all the problem sets and constantly pesters the teacher with "will this be on the test?"

And what about the cognitive errors that humans make all the time? The rationalist community was founded on a list of widespread "fallacies", after all. To pick one field, I would argue that humans lack a true capability to understand probability. We lose to even basic computer programs at Rock Paper Scissors. Gamblers think Red coming up 3 times makes Black more likely next time. There are actual medical professionals who don't understand that a positive on a 90%-accurate test for a rare disease does not mean you are 90% likely to have it. Simpson's Paradox will fool almost anyone, including me.
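The rare-disease example can be worked through with Bayes' rule. The numbers below are assumptions chosen for illustration, not from the comment: "90%-accurate" is read as 90% sensitivity and 90% specificity, and "rare" is taken to mean a 1% prevalence. The function name is likewise hypothetical.

```python
def posterior_given_positive(prevalence: float,
                             sensitivity: float,
                             specificity: float) -> float:
    """P(disease | positive test) via Bayes' rule."""
    true_pos = sensitivity * prevalence                 # sick and flagged
    false_pos = (1 - specificity) * (1 - prevalence)    # healthy but flagged
    return true_pos / (true_pos + false_pos)


# Assumed illustrative numbers: 1% prevalence, 90% sensitivity and specificity.
p = posterior_given_positive(prevalence=0.01, sensitivity=0.9, specificity=0.9)
print(f"{p:.1%}")  # about 8.3%, nowhere near 90%
```

Under these assumptions a positive result means roughly a 1-in-12 chance of having the disease, because false positives from the large healthy population outnumber true positives from the small sick one.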

And on this very forum (and ACT's), every so often I try to correct people about the Doomsday Argument, which, like Monty Hall, is easily modeled and shown to be false. Yet Scott - and a motivated subset of Wikipedia editors - believe it anyway. Somebody who believes something false is clearly lacking a "true capability to understand probability". But they can still be intelligent.

And what about the cognitive errors that humans make all the time?

What about them? Seriously, what's your point?

Oh wow. Um, ok, how can I dumb this down as much as possible for you?

  • You: LLMs make cognitive errors, so they can't understand the world!
  • Me: Humans make cognitive errors, so they can't understand the world!

Having intellectual blind spots - like reading comprehension in your case - is not proof that you're not "modeling the world".

I don't think you can say for sure that they don't have a "world model" hidden somewhere in their trillion-dimensional space. I've certainly used them in ways that seem to require one, and while it's certainly possible that it's because they're faking it with statistics and I'm overestimating the difficulty of what I ask ... the argument does have to trail off at some point, right? You have to include some way to show that they really do "understand causal relationships" (even if it's just through preponderance of evidence), otherwise you're using unfalsifiable faith-based reasoning to assert that only human intelligence is real intelligence.

What they definitely don't have is temporal persistence of thought, just because of their actual mechanics. (CoT reasoning is a patch to this, but an imperfect one.) A priori, I would have thought this was necessary to do complex reasoning.

And I would advise heavily discounting anything Gary Marcus says. He's just enjoying a career as a self-purported "expert" that the media can go to whenever they want a skeptical quote, but almost every testable claim he's made has been wrong.

Somewhat an aside, but I consider that first link to be a first-degree chart-crime. First of all, radar plots are inherently iffy, since we pay close attention to the "area" and the area is highly dependent on how the categories are organized (a "spiky" radar plot has much less total area than if you sort the axes to create a "lopsided" plot, despite showing the same information). This is a little bit defensible if the adjacency of the categories is obvious and inherent, but they frequently are not. For example, "Occupational: Writing Literature and Language" is NOT next to "Text: Creative Writing" for no good reason at all. And furthermore, what is the scale of the chart? It's "Arena rank"... which is NOT equally spaced. The chart implies that the difference between #1 and #2 is the same as (or even slightly bigger than, considering how the radar chart "expands") that between #3 and #4, but this is plainly not the case. They should be using some kind of actual score instead, perhaps a scaled one. Sure, it allows consistency across axes, but if we are comparing a model to its successor, the rating scale definitely shouldn't be implicitly including other models like it does now (in one spot it drops from rank 2 to rank 5; does this mean that in that category some other model class does abnormally well, or that Claude truly degraded?). Even worse, the center of the plot, usually a natural "zero", is not a zero at all - it's rank 6. There are, as you know, dozens and dozens of models in the rankings, so rank 6 being a zero score is totally nonsensical.
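The claim that axis order alone changes a radar plot's area can be checked with a short sketch. It assumes equally spaced axes and straight lines between adjacent points, which is how radar charts are usually drawn; the scores and the function name `radar_area` are made up for illustration.

```python
import math


def radar_area(values: list[float]) -> float:
    """Area of a radar-chart polygon with equally spaced axes.

    Each pair of adjacent axes forms a triangle with the center, whose
    area is 0.5 * r1 * r2 * sin(angle between the axes).
    """
    n = len(values)
    wedge = 0.5 * math.sin(2 * math.pi / n)
    return wedge * sum(values[i] * values[(i + 1) % n] for i in range(n))


# The same six made-up scores, just assigned to axes in a different order.
spiky = [5, 1, 5, 1, 5, 1]      # high and low scores alternate
lopsided = [5, 5, 5, 1, 1, 1]   # high scores grouped together

print(radar_area(spiky), radar_area(lopsided))
```

With these numbers the grouped ordering encloses about twice the area of the alternating one, even though both plots contain identical data.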

Anthropic's move here (combined with them handicapping Opus 4.6 a few weeks ago) seems to clearly be an attempt to achieve profitability. The free/subsidized rate train for end users has pulled into the station, and now you have to pay more for the same (or worse) capabilities you were enjoying before.

I guess it depends whether you think this is a forced move due to running out of money or if they have run their internal numbers and think people are willing to pay the increased prices. VC money is a runway, it's not intended to be a permanent subsidy. If they reduce the amount of money they are burning on subsidized inference, that's money they can put into R&D, more GPUs, etc.

It's hard to speculate without knowing more about their internal metrics, but based on the complaints I have heard about Claude being slow, laggy, etc, it sounds like they are quite oversubscribed. If the demand exceeds the supply, increasing prices is the logical move.

The way these Orange Reddit people use AI is revealing to me. I tried Opus 4.6 and got no benefit over Codex 5.3 but it made me run out of tokens very quickly. I use Codex 5.3 for my day job and several side projects. I think I got no benefit because I have expertise on what I'm doing, so I give pointed, well written prompts. These people must be completely out of their depths and therefore reliant on extremely costly extra layers of prompt refinement to get the same performance I can get with Codex 5.3.

Opus made you burn tokens quickly so you switched, but when these people also use Opus and burn tokens quick it's because they're using it wrong?

They're not using Opus wrong, but being reliant on Opus means they're bad at AI.

You seem very eager to jump on any negative AI news out of some desire to prove the “AI bros” wrong. What’s your motivation? Annoyance at AI mandates from above? At insufferable people shoving AI slop in your face at every opportunity? Just disliking the concept in general?

I don’t know if I’m an “AI believer” (what do you mean exactly by that?), I dislike OpenAI and Anthropic for the shenanigans they keep pulling, and I’ll jump ship to whichever AI service provides the best value for money. The tech industry hype cycle goes on and on, at some point people went crazy over Java of all things, now it’s just a boring programming language and you don’t have to be a “Java believer” to use it.

AI as it is right now is a gimmick I want people to get bored with, like NFTs, "Blockchain" everything, and 3D films/TVs. I also have a deep personal disgust reaction towards how it's apparently impossible to get an LLM not to call things Chef's Kiss or declare that something has [X] Energy, and the general way so much of its output is designed to give the impression of comprehension while not comprehending. Or when it touches certain topics it suddenly becomes cautiously, obsequiously, orthodoxly pozzed, and mysteriously stops trying to lick my ass. I realize I only used text generation on free and, for a while, one step above free models, and have not used it to code, which seems to be the most impressive practical function it has.

I want to step into the turbolift and say "Deck 12," and have the computer reply with "Deck 12,"

I do not want to hear "Sure buddy, I'll get right on that. Deck 12 has some serious Starfleet Energy to it, it's an excellent choice. Dare I say, it's downright Chef's Kiss."

The other day I was watching Sharpe's Rifles on Plex (only place I could find it), which has commercials every now and then (the only time I see commercials), and way too many of them feature creepily-animated AI-derived CGI critters that make my skin crawl. I'm distressed that so many other humans seem incapable of recognizing or rejecting AI slop; their reaction seems to be "This thing is awesome, it tells me exactly what it thinks I want to hear!" completely oblivious to how horrific that is.

While you have valid points about their sycophancy, and political bias and sloppy nature... If you haven't used any top models you don't really have a solid basis for judging AI or where it's going.

AI as it is right now is a gimmick I want people to get bored with

I realize I only used text generation on free and, for a while, one step above free model

Lmao every single time without fail

I have really bad news for you my friend. Today in April 2026 is the least amount of LLMs in your life. It's only downhill energy from here.

"Look at this door. All the doors in this spacecraft have a cheerful and sunny disposition. It is their pleasure to open for you and their satisfaction to close again with the knowledge of a job well done. Hateful isn't it?" --Hitchhiker's Guide to the Galaxy

Actually what annoys me more is when machines require that you be polite to them, such as when you need to cancel a subscription, disable an annoying popup, etc. and your only option has a please tacked onto the front. I don't want to tell a machine to please do anything.

Sure buddy, I'll get right on that. Deck 12 has some serious Starfleet Energy to it, it's an excellent choice. Dare I say, it's downright Chef's Kiss."

LLMs trained on the reddit corpus become redditors. I can't tell if this is the worst timeline, funniest one, or both.

Annoyance at AI mandates from above? At insufferable people shoving AI slop in your face at every opportunity? Just disliking the concept in general?

All of the above, honestly? But the biggest would be annoyance at mandates from above, combined with a complete reversal in what people consider quality engineering in software that magically coincides with the rise in popularity of AI tools. See Lines of Code suddenly becoming a positive metric for a lot of people, versus the old Bill Gates quote "Measuring programming progress by lines of code is like measuring aircraft building progress by weight."

The tech industry hype cycle goes on and on, at some point people went crazy over Java of all things, now it’s just a boring programming language and you don’t have to be a “Java believer” to use it.

Sure, but despite Java's warts it's still used to this day to make a lot of the important software that keeps the modern world running. The AI hype bubble is much more reminiscent of the crypto bubble. No matter how many times you tried to make it clear that crypto is only useful where you need a distributed, immutable, trustless ledger (and even then it's questionable), crypto bros kept proposing uses in situations where trust was still required and other existing tools already did an infinitely better job for far less computing power. Similarly, I see retarded things like "I had AI generate a thing, and then I had another AI review it and tell me it looked great! What, review it myself? No, of course not, why would I do that?"

The AI hype bubble is much more reminiscent of the crypto bubble...

I just had this epiphany the other day. In my mind it’s barely a step up from the mania of every Bitcoin sycophant. The value is marginal (but if bosses are convinced they can use it to axe employees the hype train will keep running), it’s massively damaging to the environment, the financing doesn’t work and the news cycle perpetuates the myth that this will get us to the techno-messiah that is AGI.

All of the links in this chain are so contentious and fanciful, it’s difficult to come to the conclusion that this isn’t a fraud on several levels. Add to that the aura and personality of someone like Altman as one of the people leading the train (who’s always given me the vibe of a con artist), I don’t see any other way this ends apart from it all coming down crashing; hard.

Except crypto was almost always purely in the realm of theory-applications.

With AI, right now, I can do things like generate custom flashcards for subjects I'm learning (job interview prep). I can get more in-detail answers about random questions without spending hours on Google piecing things together (just yesterday, asking for details about how stomachs process different macronutrient profiles). I can generate custom mini-apps for a wide variety of tasks (recently I made a custom task-selection spinner for my todo list that weights the important tasks more than smaller tasks, while occasionally mandating a break). It can make sure an email I send to a recruiter doesn't have obvious mistakes or commit a faux pas. I can get personal advice of at least middling quality without friction on a wide variety of topics. Obviously, it can code really well, and that touches my field very directly in a lot of ways. There are plenty of other use cases, too. These aren't "lines of code" type accomplishments, they are concrete deliverables of various scopes. Some of which were previously high-friction or even impossible.

Sure, some of these are gratuitous or busywork, but they are all real. Crypto stuff was like, "what if the government keeps track of property listings on the blockchain" which is a) something the government already does mostly just fine and b) obviously never happened and c) would have required very significant network effects. And currently, crypto is extremely useful for pretty much exactly two types of people: those who treat it like digital gold (it does OK at that) and criminals who can move money around that's difficult to track. Nothing else. So sure, in that sense it was real, but AI plainly can do more than two things and will continue to do more than two things even as hype dies out.

And sure, my IRL friend will give me better advice than Claude will, but there are some things that are so low-stakes that it would be disrespectful of their time to ask or discuss. Paradigms like that are all over, because of the speed and cost AI offers. In that sense, it's more like the Industrial Revolution, where speed and cost enable things to happen that previously were functionally impossible at scale. In fact most of the Industrial Revolution was about things that were already feasible to do, but were cost-prohibitive (or took too long). This in turn generated new industries that were previously only theory. Now, I don't think AI will have that level of impact on society, and I'm also not sold on it 'creating new industries' at all, but probably it's somewhere on the level of the impact similar to the invention of Google at least?

Except crypto was almost always purely in the realm of theory-applications.

If nothing else, Bitcoin can always buy you a pizza, although the only topping will be regret

With AI, right now, I can do things like generate custom flashcards for subjects I'm learning (job interview prep). I can get more in-detail answers about random questions without spending hours on Google piecing things together (just yesterday, asking for details about how stomachs process different macronutrient profiles). I can generate custom mini-apps for a wide variety of tasks (recently I made a custom task-selection spinner for my todo list that weights the important tasks more than smaller tasks, while occasionally mandating a break). It can make sure an email I send to a recruiter doesn't have obvious mistakes or commit a faux pas. I can get personal advice of at least middling quality without friction on a wide variety of topics. Obviously, it can code really well, and that touches my field very directly in a lot of ways. There are plenty of other use cases, too. These aren't "lines of code" type accomplishments, they are concrete deliverables of various scopes. Some of which were previously high-friction or even impossible.

This is all ‘real’ in the same sense that having “AI” in a Sonicare toothbrush or a refrigerator is also real. It just isn’t the selling point that people think it is. A lot of these are fairly menial tasks that still require some level of supervision and for most people the value isn’t enough to break with the typical habits people develop to complete their work.

The work the AI does on the things I’ve put it to use for is inconsistent enough that it isn’t worth the end-user investment I’ve put into it. You could argue that I’m just “using it wrong” (I’m actually not; I’ve shown it to other people who have the same problems), but then how is that more valuable than doing a manual task myself with certainty of how things are used, rather than outsourcing work that I can only hope is being done correctly?

You could argue that I’m just “using it wrong,"

I do actually bet you're using it wrong (or a free model, or tried before November 2025). What are you trying to do? There are many things it can't do well, so maybe you're right, but given what you said above I am suspicious.

This is all ‘real’ in the same sense that having “AI” in a Sonicare toothbrush or a refrigerator is also real. It just isn’t the selling point that people think it is.

It's absolutely insane that you can read things like:

  • Making personalized study materials instantly and infinitely

  • High quality research across all human knowledge in a fraction of the time

  • Instant custom software on demand

  • Infinite mid tier advice and cognitive output

And call it as useful as a vibrating toothbrush. You are so biased as to be willfully blind.

Vibrating toothbrushes are pretty useful.

At least for me personally, I just hope this leads to less retarded mandates from my higher-ups about using AI X times a month etc. (we're literally tracked on usage and it can affect our raises/bonuses).

I work at a dinosaur of a company, so I can't speak to this directly, but a friend of mine that I recently mentioned gave me an update the other day. They've gone from "you must burn as many tokens as possible to maximize your performance review" to "we must use our token budget wisely."

The timing is interesting. It happened right around the same time Anthropic started putting the screws on its customer base with increased token usage and tighter rate limits.

I really feel like the company that sits on a "good enough" model and aggressively cost cuts is going to win this particular war.

I really feel like the company that sits on a "good enough" model and aggressively cost cuts is going to win this particular war.

That would be OpenAI. Claude is a 2026 fad and will be over soon.

There are some Chinese contenders within striking distance as well. GLM-5.1 is open weight and seems to perform somewhere between Opus 4.5 and 4.6. It's pretty incredible that there is open weight competition that's less than six months behind frontier state of the art.

and a tracker of the increased token burn rate is here:

The tracker is slop so I don't trust it one bit.

But in general for thinking models, more thinking effort = higher scores on pretty much all benchmarks. So they could easily have just tweaked a setting so 4.7 medium = 4.6 high. Voila, number goes up. Of course you're paying for those tokens anyways but the scale is fundamentally arbitrary - there's no real definition of what "low" or "high" thinking actually means.

I'd be more worried about the fact that the reception to 4.7 has been extremely lukewarm to say the least. Ain't nobody on twitter singing the praises of that model.

I don't really see what this is supposed to prove one way or the other. You are still stuck in the timescale framing of the most fervent AI bros. Opus 4.6 came out in February, 2 months ago. So what if Opus 4.7 is not a revolutionary upgrade? If AI were truly stagnant, we won't really find out until someone posts in 2028 that Opus 6.7 is only a marginal upgrade over Opus 4.7.

I think you misunderstand my argument. I'm not arguing that AGI is impossible based on this (though I don't believe it's possible). I'm arguing that this is a strong sign that VC money is drying up before they could ever conceivably achieve AGI (even if it is possible).

It’s an interesting sleight of hand, how the tech bros have hoodwinked the finance bros and duped them out of so much money. The train is being driven irrationally with so much FOMO money going out the door, I’m surprised it’s lasted as long as it has without people asking questions.

I wonder if Michael Lewis already has a draft in the works for his next big story. As long as I can pick up a box of GPUs for pennies on the dollar, I’ll be happy. Although I don’t know how the resellers are going to pop up for such a massive surplus of inventory, but I’m definitely on the lookout. My home lab is about to get even bigger.

Anthropic raised $30 billion two months ago, their problem isn’t lack of money. All the VC money in the world won’t solve a bad engineering culture.

Sure, but they're on track to burn $11 billion this year in expenses, and more in the future, so that's not going to last too long

$11 billion this year in expenses

...and $14 billion in revenue assuming zero growth. Or closer to $35B if their 10x/yr trajectory continues.

Note that your link says "run-rate revenue" which is a very different thing from actual revenue. Relevant XKCD: https://xkcd.com/605/

Yes, I did note that in my comment. They had 1/12 of that revenue in 1/12 of the year (or some other fraction), and therefore they're on track to $14 billion in revenue assuming zero growth.

Assuming zero growth, and zero decline in revenue, compared to whatever fraction of the year.
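To make the run-rate point concrete, here is a minimal sketch. The figures are hypothetical round numbers (in billions of dollars), not Anthropic's actual financials, and the function names are made up for illustration. It only shows that a run rate equals the real twelve-month total when revenue neither grows nor shrinks.

```python
def run_rate(latest_month_revenue, months_per_year=12):
    """Annualize one month of revenue by assuming every month looks the same."""
    return latest_month_revenue * months_per_year


def full_year_revenue(first_month_revenue, monthly_growth, months=12):
    """Sum revenue over `months`, compounding `monthly_growth` each month."""
    total = 0.0
    revenue = first_month_revenue
    for _ in range(months):
        total += revenue
        revenue *= 1 + monthly_growth
    return total


# Hypothetical: $1.0B of revenue in the measured month.
month = 1.0
annualized = run_rate(month)

flat = full_year_revenue(month, 0.0)         # zero growth, zero decline
shrinking = full_year_revenue(month, -0.05)  # revenue falls 5% per month
growing = full_year_revenue(month, 0.05)     # revenue rises 5% per month

print(f"run rate:        {annualized:.2f}")
print(f"flat year:       {flat:.2f}")
print(f"shrinking year:  {shrinking:.2f}")
print(f"growing year:    {growing:.2f}")
```

The run rate matches only the flat case. Any growth or decline after the measured month moves the real total away from it in the corresponding direction.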

If that were the end of the story it wouldn't be an issue. It's that it evidently uses significantly more computing power than the performance improvement would suggest, raising the spectre of rapidly diminishing returns.

It seems to me this also has financial implications. If you are paying per token, and the model's benchmark performance increases slightly, but its token cost to reach those higher benchmarks increases tremendously, suddenly you're paying a lot more to do, at best, slightly more.

If Anthropic is making margin on the token cost, then this is an improvement from their financial point of view, right?

I’m not seeing the mathematics on this one. Care to explain?

Toy model to illustrate:

Let's say that I need to make 100 PowerPoints per year, and I use AI for this. And let's say that when I use 4.6, it costs me $1 in token costs to make a PowerPoint presentation based on a prompt. I now have to spend 10 minutes correcting the errors.

Now suppose we bump up to 4.7, and suddenly the PowerPoint is a bit better, so I only need to spend 5 minutes correcting the errors. But it costs $2 because the model is less token-efficient.

If Anthropic is making margin on the token costs, then the demand for tokens has increased even though the demand for work has not (I still need to make 100 slide decks annually). And while we've saved me some time, we've increased my cost to $200 instead of $100. If Anthropic is making 10% margin, they've now made $20 instead of $10. And since suddenly the token demand has doubled (in this toy world with static demand for PowerPoints which now cost more tokens) Anthropic can likely use the increased demand to raise costs on compute further.
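The arithmetic of the toy model above can be written out as a short sketch. All the numbers are the made-up ones from the toy model (100 decks, $1 vs $2 per deck, 10% margin, 10 vs 5 minutes of cleanup); the function name is just for illustration, and none of this reflects real pricing.

```python
def annual_token_economics(decks_per_year, cost_per_deck, provider_margin):
    """Return (customer_spend, provider_profit) in dollars for one year."""
    customer_spend = decks_per_year * cost_per_deck
    provider_profit = customer_spend * provider_margin
    return customer_spend, provider_profit


DECKS = 100    # static demand for work: slide decks per year
MARGIN = 0.10  # assumed provider margin on token spend

old_spend, old_profit = annual_token_economics(DECKS, 1.00, MARGIN)  # "4.6"
new_spend, new_profit = annual_token_economics(DECKS, 2.00, MARGIN)  # "4.7"

# Time the user saves on corrections: 10 minutes per deck down to 5.
minutes_saved = DECKS * (10 - 5)

print(f"customer spend:  ${old_spend:.0f} -> ${new_spend:.0f}")
print(f"provider profit: ${old_profit:.0f} -> ${new_profit:.0f}")
print(f"correction time saved: {minutes_saved} minutes per year")
```

Under these assumptions the customer pays an extra $100 a year to save 500 minutes of cleanup, and the provider's profit doubles even though the amount of work demanded has not changed.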

Some disclaimers:

  • this is a toy model
  • I am not sure to what degree and in what way "benchmark improvements at the cost of more token use" translates over into real world applications. Does 4.7 now use more tokens to do the same work (e.g. answering "what is 2+2") or does the allegedly less efficient token cost only kick in with more involved prompting? I can imagine a world where "benchmark improvements at the cost of more token use" in the real world means you can 1-shot an app instead of 3-shotting it, so even if it uses twice as many tokens, it's actually saving compute.
  • from what I understand the financials of compute are all over the place: some people or services have something closer to a cost-per-token, many do not
  • Furthermore as I understand it companies like Anthropic own some of their compute, but not all of it, meaning that if costs of compute increase due to this it might be bad for their bottom line if they are renting a lot of their compute and their providers decide to jack prices up on them

Possibly there's something (else) I am missing here, would be very happy for feedback. I don't use LLMs to code so my lack of experience with the most-common use-case means I have little personal insight into the trade-offs between increased demand for tokens versus higher performance. If people are complaining, though, I assume it's because they feel like they are able to get less done (IOW, the model is less token-efficient). If anyone has a better model for how this works in the real world, particularly in more common use-cases, I would love to be filled in.

The text of a United States bill that would require all operating systems to implement mandatory age verification is now available to read.

The bill is ironically titled the Parents Decide Act rather than the Government Decides Act. It applies to all operating systems: Windows, Linux, embedded systems, even smart refrigerators. Developers will have full access to all relevant personal data.

The bill doesn't even specify how age verification will work and instead delegates this task to the FTC, which will also specify data storage/protection requirements. The law would be considered in effect one year from the date it is enacted, and violations will be handled under the Federal Trade Commission Act.

"Child protection" laws like this have no good justification and simply amount to destroying anonymity on the internet. What benefit does anybody get from such a law anyway? I can't see any. If operating systems are so bad for 17 year olds, why don't parents just take their kids' phones away? How does 17 year olds using operating systems create negative externalities for other people? I'm not seeing what I'm supposed to be gaining from these laws. It seems like lazy parents have teamed up with law enforcement who hate anonymous internet usage to demand that governments destroy internet privacy under the thin veneer of protecting teenagers from nothing.

The steelman is that it will incentivize minors to run custom operating systems so they can watch porn, and they might learn something in the process both about tech and the government.

More seriously, the only reason besides utter incompetence anyone could have to enact such a law is that they want to create a dystopia out of RMS's worst nightmares.

This law would be moot as long as people have the freedom to decide what software they run. At the moment, there are both walled gardens and platforms with which you can mess as much as you want. While anyone can buy a Raspberry Pi for a couple of bucks and run whatever software they want, kids can simply opt for a distribution which does not try to babysit them. Or run systemctl disable babysitd, for that matter. So you would need to mandate TPM chips in every device with more than four kilobytes of address space or something. Good luck with that.

And of course this would be required, but not yet sufficient on its own. How can an operating system know if the person in front of it is a minor or not? With AI, facial recognition can be tricked. Of course, the government could helpfully implant RFID chips in citizens to help the poor OSes figure out who is who. I think the traditional location would be the forehead.

The internet has the advantage over real life that if you find yourself in a situation which makes you uncomfortable, you can just turn off the screen without having to learn how to dissociate first. Nor is this the purpose of such laws -- someone who prefers not to see unsolicited dick pics can use messenger apps which have options to block those. The purpose of such laws is to control what kind of content minors are allowed to search for. Like the internet, reality is not safe for children. Any ten-year-old riding a bike in traffic is just one bad decision away from a life-altering accident. However, kids (or those who make it, anyhow) thrive in conditions which are not entirely safe.

So you would need to mandate TPM chips in every device with more than four kilobytes of address space or something.

This is the direction the wind blows. I think it's almost inevitable. Phones, tablets and laptops are basically there already (TPM/secure boot/etc.), the one thing missing is that the boot loader on a few phones and most laptops isn't locked yet. But it will be, soon, just like the phones. The industry wants it that way, and politics wants it, too.

The next steps are easy. The only bootable OS on those devices comes with age checks and a locked app ecosystem. The only browsers available will cooperate. Then websites will be required to implement handshakes dependent on keys in the TPM, and only serve data to valid devices.

And sure, you'll be able to get around it for a while, especially on niche hardware. But if industry and politics cooperate, getting onto Instagram will soon be as difficult as getting a 4K Netflix stream on a "custom operating system" (i.e. only by the grace of Usenet/torrents).

Nevertheless, it's bipartisan and it's coming. We should have full 1984 by 2034, so not bad for a government program.

The bill doesn't even specify how age verification will work

Neither did the Australian social media ban, which didn't even delegate to the eKaren; it just set a 12-month deadline in the legislation and told industry to figure it out.

I've been getting ads against this act on some podcasts. I would say that it is absolutely in the category of a bad law. "Think of the Children" is the alleged reason. Huge delegation to agencies. Almost no specifics.

If you wanted to actually help children, you'd withhold federal funds from any school that allows cell phones inside the building.

Not giving money to schools is the third rail of American politics. Like cutting social security.

All the best stuff!

If we gas ourselves up on hopeium, in theory this could be a positive step in the right direction.

Internet anonymity is already a mixed bag. If you are anonymous but make enough impact there are plenty of avenues for those who want to out you to do so. Just recently Howling Mutant got doxed. He joins a long list of 'doxxed' folks who have had their lives upended in worse ways.

You are not anonymous because people can't find you. You are anonymous because you don't matter. Those who matter get doxxed and the veil of anonymity now harms them, since they are now alone and exposed whilst everyone else is allowed to hide. If there was no anonymity people would take their rights to express themselves more seriously. And then maybe one day the 'freedom of speech does not mean freedom from consequences' line could die the painful death it deserves.

Outside of that there's plenty of potential utility in ID verification over the internet. Be that to do business with the bank or government offices that would have required you to go there in person, but can now be solved with a few swipes or clicks. I would in fact be quite partial to the idea that certain demographics would never see a gambling ad ever again. Which would otherwise be hard to achieve. On the flipside I'm not really sold on the utility of a low barrier of entry for kids to see porn or fall victim to psychologically manipulative 'gaming' schemes.

To put it another way: If what kids see on the internet matters so much that parents should revoke access to it, why isn't what's on there a bigger deal? We've already seen fine posts on here regarding the subject of foreign interference in media with the recent forced sale of TikTok. That, on top of the promulgation of hard and soft pornography, should be dealt with head on rather than being excused away under the guise that this is all somehow a meaningful avenue of anonymous expression whilst your ability to express your political views is a total sink or swim predicament based entirely on the whims of billionaires and the political extremists they bankroll, who can revoke your ability to meaningfully express yourself at will.

If we are to elevate the internet to be a free market place of ideas then it should be that in totality. Not piecemeal where sometimes our rights are sacred but other times not.

Theoretically your identity could be veiled to the public on certain platforms in a formalized manner, and unneeded breaches of information could be prosecuted similar to a libel suit. The big companies could now properly curate content based on a very firm 'don't show porn to under 18's' criteria. Meaning the government has a foot in the door of their algorithms. Maybe we could finally stop pretending that technology is all too complicated to legislate. And maybe, just maybe, this will lead to my YouTube frontpage sucking less. Maybe.

Now, what are the odds that OS ID verification leads to any of this? None. But the mechanisms would at least theoretically be in place to make the change. As it stands the situation isn't all that great. And I'd wager this would mostly affect phones anyway, which already have pretty ironclad ways of knowing exactly who you are, where you are and so on.

If you want to start a social media company based on the premise that the users are verified using government IDs, by all means do so. If you want to tell your kids that they are only allowed to use such media, by all means try. If you dislike porn, I hear Disney runs websites which are rather porn-free. By all means lobby Microsoft to add Mandatory User Age verification to Windows Server.

The point being that you compete in the marketplace of ideas. Plenty of companies build walled gardens and gilded cages on the internet. Even the companies which verify identities can generally decide how much they trust the operating system and whether they require remote attestation from some TPM chip certifying that the video feed used for ID is actually recorded by a tamper-resistant camera.

But to constrain the marketplace of ideas you would have to demonstrate that the options you dislike are actively harmful, and you have done no such thing.

The problem isn't the content as such, it's what people do with it. The big news stories around "somebody think of the children!" generally turn out to be "14 year old was picked on in school, just this time it's done online instead of face-to-face, and they committed suicide". It's "pervy creeps used photos of kids posted on social media to generate child porn". It's "guy who should be fed into a wood chipper pretended to be 12 year old online, gained confidence of real 12 year olds, then blackmailed them for nudes".

Unless we can solve human nature, all the age verification laws in the world won't solve anything.

I'm not sure we need to solve all of the world's ills to derive some benefit from OS ID verification. I'd wager it would be easier to account for who exactly the 12 year olds are messaging if every person involved had a verifiable ID. It would at least raise the barrier of entry for pedophiles from being able to make an account on Discord to being able to create an entirely fake identity.

Comically, having such oversight for online messages sounds so oppressive it might even drive the kids to spend more time with each other in person, just for privacy's sake.

Keep in mind, governments and companies are more or less incompetent.

Be that to do business with the bank or government offices that would have required you to go there in person, but can now be solved with a few swipes or clicks.

Banks and government offices already have your ID. They still require you to go in person, because 1) people steal each others' IDs, and 2) they haven't upgraded their systems since before the mainstream internet.

I would in fact be quite partial to the idea that certain demographics would never see a gambling ad ever again.

Gambling ads and suggestive content are visible even on kids' sites designed to block it. The blocks don't work, because 1) selective blocking is a hard problem, and 2) companies don't invest enough because they want to maximize profit (and governments don't fine them enough).

If what kids see on the internet matters so much that parents should revoke access to it, why isn't what's on there a bigger deal? We've already seen fine posts on here regarding the subject of foreign interference in media with the recent forced sale of TikTok. That, on top of the promulgation of hard and soft pornography, should be dealt with head on rather than being excused away under the guise that this is all somehow a meaningful avenue of anonymous expression whilst your ability to express your political views is a total sink or swim predicament based entirely on the whims of billionaires and the political extremists they bankroll, who can revoke your ability to meaningfully express yourself at will.

It is a big deal, but: foreigners steal locals' IDs, and convince them (sometimes by visiting in person) to spread foreign propaganda. Pornography is popular, some pornstars are already public and some viewers have no shame.

Theoretically your identity could be veiled to the public on certain platforms in a formalized manner, and unneeded breaches of information could be prosecuted similar to a libel suit. The big companies could now properly curate content based on a very firm 'don't show porn to under 18's' criteria. Meaning the government has a foot in the door of their algorithms. Maybe we could finally stop pretending that technology is all too complicated to legislate. And maybe, just maybe, this will lead to my YouTube frontpage sucking less. Maybe.

Companies already aren't allowed to leak PII: it leaks anyways, they get sued and lose, but the final payout is negligible. YouTube already controls your frontpage and tries not to show porn to under 18s. Technology is already legislated, but governments abuse and/or ignore the legislation and companies find workarounds.


I do suspect mandatory ID would reduce kids exposed to harmful content, foreign interference, and porn (distribution and consumption). But significantly increase political (and non-political petty) speech consequences, which would be worse, because governments and companies will leak the IDs of users with views they dislike, and leaking everyone's views won't work as explained here.

Where I live there's a government ID system you can choose to link up to your phone. It provides access to a lot of basic government services without the need to visit an office. Sure, you have to prove your identity one time. But after that it's fine for years. Your bank can interface with this system to prove your identity and now you have access to a host of banking services. This is a very clear and direct quality of life improvement that could not be possible without some database somewhere that can interface with your phone knowing exactly who you are.

ID theft is hardly a relevant problem here. Because it's a known quantity, there are safeguards and insurances in place to ensure that you can't lose too much if you fall victim to it. It's not much different from the risk of losing a credit card for that matter.

Gambling ads and suggestive content are visible even on kids' sites designed to block it. The blocks don't work

Service providers have plausible deniability since you can't prove or verify a user's age beyond just asking the user like they do now. However, if you could prove age, you could start holding service providers that don't adequately respect that age accountable. OS ID age verification provides the mechanism for that change. I'm not saying things will become perfect, but perfect need not be the enemy of the first steps on a long road to improvement.

Just to add, the government may leak things and pay a pittance in return, but that's still better than having your info leaked and nothing happening to those who leaked it.

I think mandatory ID for specific services is fine, my objection is mandatory ID to use the internet.

Service providers have plausible deniability since you can't prove or verify a user's age beyond just asking the user like they do now.

The problem isn't kids clicking "yes" on "am I over 18?" and seeing porn, the problem is kids clicking "no" and seeing porn anyways, because it's in YouTube Kids. If governments don't hold YouTube accountable for this today, I don't see why they would after mandatory ID.

Also note that the OS "age verification" currently implemented in some states is just asking for the user's age:

Provide an accessible interface at account setup that requires an account holder to indicate the birth date, age, or both, of the user of that device... (CA-AB-1043)

Provide an accessible interface at account setup that requires an account holder to indicate the birth date or age of the user of that device... (CO-SB26-051)

I do think this OS age verification will reduce kids being exposed to harmful content, and mandatory ID would reduce it further. I agree that's a good thing. The problem is these laws may introduce other problems that make them overall negative.

Specifically, I don't really object to the age verification in California and Colorado because it's lackluster: one can enter a fake birth date, and probably use an OS that refuses to implement it without enforcement. But I would object to mandatory ID, because governments and companies have repeatedly failed to secure sensitive data, and people should have an outlet to express views unsavory to those around them (since many people would retaliate against or be deeply hurt by certain views, even mundane views (from a general perspective)).

You are not anonymous because people can't find you. You are anonymous because you don't matter. Those who matter get doxxed and the veil of anonymity now harms them, since they are now alone and exposed whilst everyone else is allowed to hide.

I agree it's way harder to hide than the average person thinks, but it's definitely not impossible in the slightest. Even Russia and China, with much tighter grips on the Internet, still struggle here. And it requires a lot of time, effort, and to some degree talent to go through the normal doxxing methods, whereas "give your ID and link it directly to your accounts" is incredibly easy comparatively.

And then maybe one day the 'freedom of speech does not mean freedom from consequences' line could die the painful death it deserves.

That's never going to die because "consequences" is vague and in many ways includes other people's speech. Just consider a basic premise. John makes a policy that he will insult anyone who insults him first.

John: Hi

Rude stranger: hi you ugly fucker

John: Ok bitch, go die in a ditch

The rude stranger has suffered a consequence over his speech. To prevent this consequence requires silencing John.

This is particularly silly but highlights an important point. People criticizing or insulting you feels bad, but that is their speech being used. Someone's speech must be suppressed in order to stop this consequence.

How about a more life impacting example?

John is the CEO of Industry Inc. RandomManager accidentally hot-mics "And we gotta get these stupid moron customers to accept the price increases somehow". Customers are upset about being insulted and stop buying from Industry Inc. John fires RandomManager to try to bring customers' support back, and RandomManager can't pay his mortgage.

That sucks for the manager but which thing should we not allow in order to prevent "consequences"? Should customers be forced to buy from companies? Seems silly to me. Should John not be able to fire RandomManager who is hurting his business then?

Freeing the manager of consequences means removing freedom of association from everyone else.

Ok how about John and Joe are friends playing pool at the bar. While drunk, Joe says "John, I really hate your wife and think she's a bitch. She's an ugly fat bitch". John ends the friendship. Joe has now suffered a consequence for his speech, but what is the solution here, state mandated friends?

Yes there are some "consequences" that are obviously BS. Violence, shouting over people, abuse of government. Those things should not be accepted. But a lot of the negative things that happen to someone socially for speech are just the result of others exercising their own basic freedoms. They insult you, they unfriend you, they fire you, they boycott you, whatever because they too are free.

Russia is a particularly interesting example.

Every year they test a rogue version of the TCP/IP stack. It utilizes a Russian National Domain Name System to keep RUNet, which is like a parallel universe to DARPANet back in the days of the Cold War, that’s distinctly different from ICANN’s standards.

IPs that would originally point to something like Google could instead be replaced with something like Yandex. If they activated that splinternet, hostnames would be resolved through RNDNS at any time of Russia’s choosing.

Back in 2019, they passed a sovereign internet law that permits deep packet inspection, mandatory possession of decryption keys, filtering of traffic, content moderation and outlawing VPNs for the public. The Yarovaya amendments also forced retention of Russian citizens' data for one year without judicial oversight.

Рунет simply refers to the Russian language Internet. Where are you getting this from?

Since you asked. (1, 2, 3, 4, 5, 6, 7, 8, 9)

I'll admit I didn't read every word of these articles but they are mostly discussing content blocking that Russia, China, et al have been doing for decades.

I didn't see any evidence for this claim:

Every year they test a rogue version of the TCP/IP stack. It utilizes a Russian National Domain Name System to keep RUNet, which is like a parallel universe to DARPANet back in the days of the Cold War, that’s distinctly different from ICANN’s standards.

It's also not really clear to me why this would even require a "rogue version" of TCP/IP, what a "rogue version" would mean, or why you'd need to reinvent the TCP/IP wheel to force everyone to use your own DNS servers (which btw usually talk over UDP and not TCP).
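To illustrate the UDP point, here is a minimal sketch of what an ordinary DNS lookup looks like on the wire, following the RFC 1035 message layout. The function names are made up for this example, and the resolver address in the comment is a documentation-range placeholder, not a real server. Nothing here is specific to any country's setup.

```python
import socket
import struct


def build_dns_query(hostname: str, query_id: int = 0x1234) -> bytes:
    """Build a standard DNS query for an A record (RFC 1035 layout)."""
    # Header: ID, flags (0x0100 = recursion desired), QDCOUNT=1,
    # ANCOUNT=0, NSCOUNT=0, ARCOUNT=0.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label is prefixed by its length, ending with a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE=1 (A record), QCLASS=1 (IN, the Internet class).
    question = qname + struct.pack("!HH", 1, 1)
    return header + question


def send_query(packet: bytes, resolver_ip: str, timeout: float = 2.0) -> bytes:
    """Send the query as one UDP datagram to port 53 and return the raw reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(packet, (resolver_ip, 53))
        response, _ = sock.recvfrom(512)
        return response


packet = build_dns_query("example.com")
print(len(packet), "bytes")
# To actually resolve, pass any resolver address, for example:
# send_query(packet, "192.0.2.53")  # placeholder address, an assumption
```

The only thing that decides whose answers you get is the `resolver_ip` argument, so pointing clients at a different DNS server is a configuration or routing decision rather than a change to TCP/IP itself.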

I’ll try and dig into the specifics when I have time later. I read that in a book from an infosec practitioner a few years ago.

My example pertained more to America. If you sign up for or log in to a website you are functionally trackable, as far as I understood things. So yeah, being hidden is possible, but being hidden and being someone that matters in discourse? I think the barrier to entry on that is a bit too high to be considered relevant.

That's never going to die because "consequences" is vague and in many ways includes other people's speech.

This feels like a very clear motte and bailey.

No one employing the 'freedom of speech does not mean freedom from consequences' line is defending people's rights to disassociate over someone being an asshole in private dealings. Instead they are defending exactly the things described here:

Yes there are some "consequences" that are obviously BS. Violence, shouting over people, abuse of government.

Yes, calling the boss's wife fat to his face might get you fired. Voicing support for party X whilst your boss hates party X might also get you fired, but these are clearly not the same thing. You have to see the distinction between them. At risk of sounding like a complete cardboard box: we live in a democracy! Making political statements in a democracy has to be protected. People can play their cards close to their hands in private, but limiting discourse on the public square via fear of reprisals is not a way for a democracy to function. There has to be a way to navigate that.

Voicing support for party X whilst your boss hates party X, in practice, will get you fired. Even if there are laws against it, your boss will assign you annoying tasks, over-scrutinize your mistakes, etc. to evict you for a different official reason. And there's no way to detect this without false positives.

The loss of online anonymity would also damage relationships, and not just ones with irreconcilable political beliefs. People "code-switch" all the time; imagine no code-switching because everything you write online is visible under your ID. Men talking about women around other men, women talking about men around other women, kids talking about their teachers to other kids, teachers talking about kids and parents to other teachers, etc. Autists would love to know everyone's views about them and may easily adjust, but I suspect most people would be turned off by others' behavior in other groups. Importantly, 1) even when they logically know such back-talk was always happening, they would struggle to emotionally handle concrete examples; and 2) some back-talk is criticism aimed at helping the target or those around them.

---

Thinking more about it:

Direct P2P would also avoid these problems, while maybe mitigating the social harms of today’s internet which were less common when in-person and telephone communication were dominant, like social isolation and a certain type of (embarrassing) meanness and brainrot.

If online anonymity were eliminated for everyone, while providing a way for everyone to communicate (with ID) only to who they choose - that may be better than today. People could even make public political statements without repercussion, by privately communicating them to a trusted speaker for their party…so this doesn’t actually eliminate anonymity, just makes it harder…but doesn’t anything?

No one employing the 'freedom of speech does not mean freedom from consequences' line is defending people's rights to disassociate over someone being an asshole in private dealings. Instead they are doing exactly the things described here:

Voicing support for party X whilst your boss hates party X might also get you fired, but these are clearly not the same thing. You have to see the distinction between them. At risk of sounding like a complete cardboard box: we live in a democracy! Making political statements in a democracy has to be protected.

You see how you just went "No one is doing that, anyway I'm going to do that" right? I don't see why your boss shouldn't be able to fire you for that. At will employment is the default in the US after all. He can fire you because he doesn't like the color of your shirt, because he doesn't like that your voice sounds annoying, that he saw a picture of your lawn and thought it wasn't taken care of well.

People can play their cards close to their hands in private, but limiting discourse on the public square via fear of reprisals is not a way for a democracy to function. There has to be a way to navigate that.

And yet, restricting citizens' freedom of association (which in the US is an implied right under freedom of speech) via fear of reprisals is? If you don't like a company firing John for his speech then you can boycott the company for that, as is your right.

He can fire you because he doesn't like the color of your shirt, because he doesn't like that your voice sounds annoying, that he saw a picture of your lawn and thought it wasn't taken care of well.

He can't fire you for being pregnant, female, black, Muslim, gay, trans, or disabled, so I don't see why he can fire you for being a Republican, Communist, believer in race IQ differences, or a supporter of Palestinian independence. Especially if you do those things outside of work. Which is always the fear. It's never, I'm going to proselytize to my coworkers, it's always, what if my X account gets doxxed and all these comments I made on the internet outside of work get me fired. We could just make that illegal.

Protection from termination for political views varies by state, and, perhaps counterintuitively but unsurprisingly, it tends to be blue states which have strong protections for political views, theoretically to protect union organizing.

He can't fire you for being pregnant, female, black, Muslim, gay, trans, or disabled, so I don't see why he can fire you for being a Republican, Communist, believer in race IQ differences, or a supporter of Palestinian independence

Maybe they should be able to fire people for the former things as well. Overly irrational amounts of bigotry are eventually solved by free markets. If you pass up too many good candidates just for them being gay or black or Republican or communist, you're going to do worse business wise. And there will be smarter businesses and competition that don't care about those things and just want to win in the market. It's not that irrational amounts of bigotry don't happen at all, but that they aren't really as meaningful.

Anti discrimination laws don't really have much of an impact, since in a democracy for them to be passed it requires a population that is already rather anti discrimination! So they're gonna be mostly not doing too much irrational discrimination on their own. At least, not those outside of what society already generally wants.

it's always, what if my X account gets doxxed and all these comments I made on the internet outside of work get me fired. We could just make that illegal.

Make what illegal here? The doxxing or the firing? Doxxing being illegal doesn't really make sense, it's historically considered a form of free speech and free press. Journalists would try to reveal anonymous people all the time in the past.

Overly irrational amounts of bigotry are eventually solved by free markets.

Only if there's enough competition. Which in most sectors, there isn't. You know, because, there's only so many people.

Anti discrimination laws don't really have much of an impact, since in a democracy for them to be passed it requires a population that is already rather anti discrimination!

Their impact is bounded, but not necessarily 0. 60% of people can easily force 40% of people into behaving differently in a democracy. Which is roughly what happened with Civil Rights.

Make what illegal here? The doxxing or the firing? Doxxing being illegal doesn't really make sense, it's historically considered a form of free speech and free press.

Firing. That would be the extension of civil rights. Although you could make doxxing illegal too. The precedent for this in the United States would mostly be 18 U.S.C. § 1030. Weev went to federal prison for publishing a list of emails he got from a public HTTP API, because AT&T did not intend for the API to be public. Doxxing works on the same principle; it's the publication of information that was not intended to be public. The onus is not on the doxxee to secure the information 100% properly, because Congress has already rejected pure internet anarchy. If the information is reasonably interpreted to be intended-as-private, access and publication could be said to constitute exceeding authorized access. I just read a doxx on Howling Mutant in fact which used two data breach leaks as proof! That could obviously constitute felony usage of felony-produced data, just like using leaked passwords to break into an account. Morally, doxxing is obviously a crime which has a victim, which the criminal intends to harm, which makes it much more of a crime than probably the majority of so-called computer crimes the United States prosecutes.

When someone is making a 'this is how I think things should be' argument, it's very annoying to receive a 'well this is how things actually are' response. We're not really playing from the same sheet of music here.

You see how you just went "No one is doing that, anyway I'm going to do that" right? I don't see why your boss shouldn't be able to fire you for that.

I don't see how I did that unless you are arguing that there is not a difference between hurling personal insults at your boss and publicly voicing a political opinion he disagrees with. I see that distinction clearly, and I also think that expressing political opinions and handling political disagreements is a basic and necessary function of living in a democracy. If you don't see the inherent conflict of serving your democratic duty as an active participant in the political process and being liable to lose your job because of that then I feel we are at an impasse.

Outside of that I feel like we are roaming back to my original point. And I would just directly challenge your conception of 'having rights' in America as you present them here. For example, you can't fire a person because they are black. The Civil Rights Act just doesn't allow that. So you don't really have at will employment by default so we don't even need to act like 'At will Employment' is a point here to begin with.

And that highlights my problem with this predicament. Boycotting a company because they fired an honest and good man for bad reasons is what losers with no rights do. People with actual rights just point the upholder of their rights to the person that violated them and the upholder deals with it.

If you have to uphold your own rights in the immediate sense then you just don't have rights. Like, insofar as rights are real, you have to have an external mechanism that enforces them. Otherwise you are just kind of doing what you want and calling it 'having rights'.

When someone is making a 'this is how I think things should be' argument, it's very annoying to receive a 'well this is how things actually are' response. We're not really playing from the same sheet of music here.

Ok sure, fair enough.

I don't see how I did that unless you are arguing that there is not a difference between hurling personal insults at your boss and publicly voicing a political opinion he disagrees with. I see that distinction clearly,

So if you say to your boss "your wife is a bitch" he can fire you because it's private, but if you post on your public work associated Facebook "I think my boss's wife is a bitch", he can't because it's public?

And wait, let me anticipate the "oh that's different it's political" response. Where's the exact distinction? Like extreme example but real political thing that happens in some countries. What if say, his son is gay and the employee tells the boss (or I guess, posts on his public Facebook) "your son is a freak who should be executed by the moral police"? That sounds distinctly political: which people should and should not be executed by government moral police. How about "women shouldn't vote, including your wife"? I think he should be able to find that insulting and fire you. Or hell what if they just say "I hope the president issues an executive order calling your wife a bitch". Can't get more political than your hopes of a particular policy from a politician. Maybe he's really creative and inspired and makes a troll campaign (but he plays it completely seriously) for local waterboard commissioner and while in an interview makes a point to repeatedly say "yeah, part of what inspired me to run is that my current boss's wife is a bitch. I figured maybe something is wrong with the water making her so bitchy".

How are you going to draw the lines in a fair manner, where does "politics" a topic about basically every part of life in at least some way actually begin here? This isn't some gotcha, it's an extremely difficult task to actually make a good overarching definition that isn't able to be abused. Just try with only the examples I gave alone and it'll be hard without making a convoluted mess.

And that highlights my problem with this predicament. Boycotting a company because they fired an honest and good man for bad reasons is what losers with no rights do.

This is some crazy logic, boycotting companies is your right. The government should not be micromanaging your financial decisions like that. Do you want every time you use a different gas station or try a new brand at the store to be open to scrutiny by bureaucrats to make sure you aren't "cancelling" anyone?

How are you going to draw the lines in a fair manner, where does "politics" a topic about basically every part of life in at least some way actually begin here?

On a case by case basis. Like is done all over the world. I'm sure your entertainingly convoluted examples would make it all the way to the highest court of any land. That being said, I don't think they are very realistic. And you can make a mockery of any law with unrealistic examples. But those examples could still be dealt with, even if they are far from being representative.

I would personally make a distinction between political views and assertions made about private individuals in public. Similarly, political views directed against private individuals could easily be deemed to not be in line with the political process. As in, making politically unrealistic wishes of ill towards private persons is a clear enough step over the line. Similar to how saying 'In minecraft' is not actually a legal defense against the preceding threats of violence, saying 'politically' is also not a defense.

If boss' wife was not private, but a public political figure, then assertions against her would be political. But not in the context of her being your boss' wife, since that fact is not politically relevant. If it were politically relevant, and both the boss and wife are politically involved then an employee would have the right to make political statements about both.

All that being said, I'd generally side with employees over employers in any case where the working relationship between the two is not personal. The idea that an employer gets to dictate the public expressions of tens, hundreds or thousands of people goes against fundamental aspects of democracy as I see them.

This is some crazy logic, boycotting companies is your right. The government should not be micromanaging your financial decisions like that. Do you want every time you use a different gas station or try a new brand at the store to be open to scrutiny by bureaucrats to make sure you aren't "cancelling" anyone?

The point being illustrated by me was that people with actual employment protection rights, like blacks in America, don't have to boycott things, since their rights are upheld by third parties. If your rights are not upheld by third parties then you don't really have rights. Unless you want to contextualize any ability you have to do anything in the world as a 'right', in which case our understanding of the word is not 1:1.

On a case by case basis. Like is done all over the world. I'm sure your entertainingly convoluted examples would make it all the way to the highest court of any land. That being said, I don't think they are very realistic. And you can make a mockery of any law with unrealistic examples. But those examples could still be dealt with, even if they are far from being representative.

Is it common to have anti discrimination laws based around something as vague and unclear as "political beliefs"? I wouldn't have said it was.

I would personally make a distinction between political views and assertions made about private individuals in public.

"John's gay son should be hanged" vs "Gays should be hanged by the government" doesn't seem that meaningful of a difference to me.

All that being said, I'd generally side with employees over employers in any case where the working relationship between the two is not personal. The idea that an employer gets to dictate the public expressions of tens, hundreds or thousands of people goes against fundamental aspects of democracy as I see them.

The employer is not some sort of dictator who is unable to be left. There's tons of jobs that someone can go do, both in their field and out of it. You have the same right of association and can leave your job for the reasons you want, like "my boss has an annoying voice" or "the company had a trans pride picnic and I don't like that".

The point being illustrated by me was that people with actual employment protection rights, like blacks in America, don't have to boycott things, since their rights are upheld by third parties. If your rights are not upheld by third parties then you don't really have rights

Civil rights laws are largely meaningless, they only get passed when a society (and thus almost always the market of a society) is already in agreement with the general principles. Enshrining them has some effect, don't get me wrong, but it's not as potent as it seems.

Market rationality happens a lot without such anti discrimination laws, like how many companies will hire illegal immigrants or people with otherwise obviously fake ID under the table simply because it's more economical for them. This is despite almost all the laws on the books encouraging the exact opposite: hiring the more expensive citizens. The Republican voting farmer might not really like illegal immigration as a concept, but he does enjoy doing better in his farm business. There will be plenty of bigots who might not like blacks or Whites or Asians or whatever, who recognize the same thing with race. Or gender. Or whatever. Bigotry has to be overwhelming in the market (and society) to overturn this. And the more cutthroat the market is, the more overwhelming the bigotry has to get.

Firing people unreasonably for their race/religion/sex/political beliefs/citizen status whatever will always be suboptimal compared to "hiring the most economical choice". Some companies might be willing to take the hit and be suboptimal, but plenty of others won't.

I'm more and more drifting to the view that this abstract principles-based reasoning in a vacuum, where you just assert rights and derive things from it and hold to it, is not where real impact lies. You need a social fabric that holds people together where the political differences are bridgeable and the other political parties are seen as legitimate alternatives. For example if one party says that income tax should be 5 percent higher and another disagrees, or one thinks that public healthcare is more efficient with larger regional hospitals, while another wants to prioritize good care being available closer to everyone's homes, etc. then there is no such danger. In other words, the Overton windows have to overlap enough.

Once you let society fracture so much that they see each other's political opinion as an existential threat to themselves, their identity, their deeply held cultural beliefs etc., the tool to reach for is not rules lawyering some better laws from first principles like free association or free speech, but to try to create social cohesion. Politics is downstream of culture, and culture comes from social interaction and exchange. If you have long-term relations to your co-citizens in ordinary contexts, and you depend on them for general life stuff, if you go to each other's weddings and help each other haul stuff or do some construction work or whatever, seeing each other in many different roles, that results in a convergence of understanding, and some degree of synchronization, and interest alignment.

Of course this is what's getting erased with the increasing individualism. There's nothing that ties you to your neighbors, so you're free floating and can take on any political views, without any connection to whatever other people believe. There's less pressure to compromise and more pressure to stand out by being the purest and most vocal, most righteous version of your chosen side, and there is very little cost associated with fully condemning the other side as pure evil.

Now, many would say that this kind of cohesion is not really possible and there are inherent conflicts of interest that will always remain. Some would point to class differences, others would point to ethnic ones. But if you identify such unbridgeable differences, the tools are also not really the abstract principles to solve this, but some kind of Bosnia-Herzegovina style regulated representation and explicit design around this social fact, because again, politics has to be designed around the social reality. There's certainly some "backflow" and the rules create incentives that have effects on social relations, but in the end the rules are more a codification and stabilization of what the real emotional connections are. It's like Conway’s Law, stating that "organizations which design systems (in the broad sense used here) are constrained to produce designs which are copies of the communication structures of these organizations." The political rules of what's allowed, what's normal, what freedoms can be afforded, depend on the social (and interest) relation structures of the people themselves. High-trust cultures where most everyone believes in the same fundamental principles and are generally not at each other's throats can afford to allow unrestricted speech because most people anyway don't want to say things that would upset others too much. A society that's split in two tribes, which shout and plot about how to hurt the other tribe to the maximal degree, will find that they need to have some kind of constraints, but it's not very likely to solve the original problem.

Historically, such pent up tension was often released through war, like the Thirty Years War, after which people realized it's better to come to some kind of compromise, or one side is weakened so much that the tension is released that way. I'd hope that it's not a necessary stage to go through, though. Another thing may be the threat of an external enemy, or an external cause to unite around. I think the end-of-history types thought this could be these neutral inert things like space exploration, climate change and environmentalism, generic 90s elementary-school textbook obviously-good UN/UNESCO/UNICEF charities and human rights etc. But it seems that these are not really enough to form cohesion and get turned into wedge issues as well.

I think people are not inherently motivated enough to cooperate, only if circumstances force them, which is generally not pleasant. The village where everyone relies on everyone is not just sunshine and rainbows to live in. Because relying on others also carries dangers, and it limits your choices and there is constant judgment and gossip and observations, you have to care about your reputation, and the emergent judgments are not always fair.

The Kids Aren't Alright, at least it seems. I constantly hear studies and anecdotes on how Gen Z and α are significantly more awkward, asocial, mentally ill; and evidence suggests social media is why.

Hence I think governments are desperate to get kids and teens off social media, or at least make it less toxic, and/or reduce usage, to fix them. I agree with that goal.

But I'm wary of these bills: they threaten anonymity, can be bypassed, add regulatory burden... Although this YouGov survey rates Australia's ban "cautiously optimistic". Still, I much prefer:

  • Banning phones in schools (which I think is so obvious, it's surprising and embarrassing many schools haven't already done it)

  • Encouraging more in-person socialization with after-school activities, kids third-spaces, etc.

  • Less toxic social media algorithms, better parental controls, cultural encouragement for parents to limit social media - not via laws

I don’t think point three works.

Asking social media platforms to detoxify and make their platforms less compulsive for users more or less is like asking a food company to make their food worse. Their entire business model is “deliver attention-paying eyeballs to a platform where businesses can pay to make them watch ads.” To the product of course they pretend to be about connecting people to things they like, but it’s really not the point. TikTok doesn’t care about your enjoyment, they care about your engagement. So asking a private company to just … stop doing their business model isn’t going to work.

Parental control is not that great either. Unless you are skilled enough to IP block those sites at the router (which only works for a device using your Wi-Fi anyway) most controls are easily disabled. It’s not something you could rely on.

I think you're right about social media companies not making their sites less toxic on their own. So...I do think we should regulate kids' social media more, and maybe adult social media past a certain size. I'm specifically wary of regulating small sites, because for example that hinders hobbyists and startups.

I think parental controls work. Current parental controls aren't good, but better ones are possible; and it's true that particularly smart and determined kids will subvert practically any controls, but not all kids are smart and determined. An example of a better parental control is a phone OS that, without an admin password, blocks sites not in a "kid-friendly" whitelist provided by a third party. I don't see why that's particularly hard to implement or configure.
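To make that concrete, here is a rough sketch of the allowlist check I have in mind. Everything in it is an assumption for illustration: the host list, the password, and the function names are made up, and a real phone OS would enforce this at the network or browser layer rather than in a Python script.

```python
from __future__ import annotations

import hashlib
import hmac
from urllib.parse import urlsplit

# Hypothetical allowlist; in the scheme described above a third party would supply it.
KID_FRIENDLY_HOSTS = {"wikipedia.org", "khanacademy.org"}

# Hypothetical admin password, kept only as a salted hash (values are placeholders).
_SALT = b"example-salt"
_ADMIN_HASH = hashlib.pbkdf2_hmac("sha256", b"parent-password", _SALT, 100_000)


def is_admin(password: str) -> bool:
    """Check a password against the stored hash using a constant-time comparison."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), _SALT, 100_000)
    return hmac.compare_digest(candidate, _ADMIN_HASH)


def host_allowed(host: str, allowlist: set[str]) -> bool:
    """True if host is an allowlisted domain or a subdomain of one."""
    host = host.lower().rstrip(".")
    return any(host == entry or host.endswith("." + entry) for entry in allowlist)


def may_open(url: str, admin_password: str | None = None) -> bool:
    """Allow allowlisted sites; anything else needs the admin password."""
    host = urlsplit(url).hostname or ""
    if host_allowed(host, KID_FRIENDLY_HOSTS):
        return True
    return admin_password is not None and is_admin(admin_password)
```

The default is deny: a site that is not on the list, or a URL that does not parse to a host at all, stays blocked unless the admin password is supplied. The hard part is maintaining the list and stopping kids from sidestepping the OS, not the check itself.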

Banning phones in schools (which I think is so obvious, it's surprising and embarrassing many schools haven't already done it)

How do you make it work? Try it, and somebody will start screaming about how this is harassing 17 year olds who are old enough to [do legal thing] but you are imprisoning and enslaving teenagers yet again; parents will go on radio shows about how they absolutely need to be able to contact little Krissanteemum at any time of the day; a child genuinely will need their phone, not have it, and something bad happens; teachers will be accused of picking on and victimising minority kids, and the list goes on.

Yeah, you get Johnny and Susie to hand over their phone which is kept in a locker in the secretary's office. Haw haw dumb authorities, that was my burner phone! I still have my real one!

Estacada High School in Oregon seems to have succeeded.

I, uh, am young enough to have gone to high school when smartphones were almost universal and simultaneously old enough for them to have still been banned at school while I was there.

The official policy was they weren't allowed on campus at all, but you can't enforce that. What they did enforce, was if you were caught with it out, you'd get a warning, and after that it would get confiscated. Parents were informed of this policy and understood it. If you needed to call home (which you never needed to do), you could ask to go to the office. It really wasn't that dramatic. I daresay it would work again today if only the school administrations could grow a pair.

Here, the whole province went ahead and did it, starting last fall (they were already banned from classrooms since January 2024).

Not being a teacher, a high school student or a parent of a high school student, and not being in frequent contact with any of these groups, I can't really say if that's the case, but the mainstream reporting I read on it suggests that it has had a very positive effect.

I daresay it would work again today if only the school administrations could grow a pair.

That, and you do need parents to be on board with it. And unhappily, there are always parents who don't give a shit about what the kids do so long as it doesn't involve them, or they will listen to the kids bitching about not having their phones, or they will blow up about "this is racism/discrimination/some other attention-grabbing thing" because you took her phone off my little Chanterelle and she needs that phone!

We're probably around the same age, I was a young elementary schooler in '07 when the iPhone dropped so they were everywhere by the time I was in high school. We had the same policy.

Except ours was a bit more draconian. Confiscation on sight with no warning, and parents could not retrieve them from the office until after a 3-day waiting period. As much as this pissed me off at the time, this model is probably ideal. Plus I had a collection of old androids I kept for tinkering so I'd just reactivate one of those for the length of the holding period.

Plus I had a collection of old androids I kept for tinkering so I'd just reactivate one of those for the length of the holding period.

So try that today to keep young newport off the Internet and away from undesirable sites, and the same result; they took your phone but you have backups you can just reactivate and keep scrolling that [bad thing children should never see!]

Most age verification policies continually fail to answer some pretty basic questions.

  1. How do you verify age without invading privacy? There's plenty of neat "tricks" to try to get around it but ultimately there has to be some thread between you and your activities online. Whether it be giving your ID to websites directly or giving your ID to a third party who tells the websites you're of age.

  2. Why would parents who are fine with buying their child a computer/smart phone/etc device and are fine with them using it unmonitored willy nilly not be willing to just use their ID for a kid? What kind of parent gets their child a computer but then says "nvm" at an OS level age verification? How many parents out there are fine with their kids watching YouTube all day unmonitored who won't just do a facial scan for the kid as well?

  3. How do you stop kids from just using other identities anyway? Just go grab an ID online or get it from your parent's wallet or whatever. People are literally scanning video game characters even to get past the age restrictions. The more restrictive you get on this, the more you amplify the first problem of linking identity to internet usage.

China, a country with far more restrictive policies, still largely failed to curb children's gaming, and they don't even have to concern themselves as much with the first problem. As I've said before, that means we have to be super China in order to keep most children off the internet. Maybe you think that is worth it, but I don't want to be super China.

I'm old enough to remember the first attempts at age verification on fanfiction websites and yes, it was trivially easy to tick the "oh indeed I definitely am of legal age in my country to access these mildly spicy stories and not 13 pretending to be 18" boxes.

Verification today will need links to real-world data to make sure that you are not 13 pretending to be 18, and that will open up a whole can of worms (e.g. so what if the site storing all this data gets hacked? now somebody can sell the details of every 15 year old in the USA on the dark web).

You cannot be super China either. Your state simply does not wield enough power to be super China, and your bureaucracy would not have the competence to pull it off. It's funny that people always say "I don't want to become China" as if that were an option they're deliberately turning down. You don't really have that option.

We can always do a shitty half-assed version with most of the downsides and no upside.

Well yes that's the point. We're limiting casual privacy and annoying people just to not even really achieve the stated goals because we aren't gonna be Super China and yet that seems to be needed here for success.

People who insist on these stupid and failing half measures to get kids off phones/social media remind me of how environmentalists banned showerheads from using "too much water". They're frustrated that they can't actually do anything meaningful, but they have to do something to feel good so fuck your showers and fuck your casual privacy.

The most central example of stupid and failing half measures is covid lockdown in the US, of course.

If operating systems are so bad for 17 year olds, why don't parents just take their kids' phones away?

I think the idea is that

  1. The operating system would keep track of users' ages;

  2. This would facilitate porn sites keeping minors out; and

  3. It would also facilitate social media bans for people under whatever age is deemed appropriate.

Anyway, I think there are two answers to your question.

The first is that phones serve various positive purposes, such as being able to call the authorities in an emergency; being able to use the map function to avoid getting lost; and so on. Age verification (if it worked) would allow young people to retain phones for these positive purposes while locking them out of porn sites, etc.

The other issue is that with respect to social media, online games, and so forth, there is kind of a collective action problem. It's difficult to tell your children they can't use some popular social media site if all their friends at school are using it. Even if most of the parents would prefer to keep their kids off of social media, few parents want to be the first one to do it. A blanket rule, for example, that nobody under 16 can use Facebook, would solve this collective action problem.

Anyway, I agree that there is a huge potential cost to age verification, which is that it will undermine anonymity. As someone who has politically unpopular views, that doesn't thrill me.

and so forth, there is kind of a collective action problem. It's difficult to tell your children they can't use some popular social media site if all their friends at school are using it. Even if most of the parents would prefer to keep their kids off of social media, few parents want to be the first one to do it. A blanket rule, for example, that nobody under 16 can use Facebook, would solve this collective action problem.

This narrative is about as compelling to me as there being a deep state conspiracy to destroy privacy. A better narrative is that individual parents feel they would be individually better off if they took their individual kids' phone away, but they feel too weak to do that. So they want the government to discipline their kids for them. Normal people can't identify collective action problems well, it's too complex of a scenario. A well documented collective action problem is credentialism, and people can't grasp it because they just see that they would be better off personally if they consumed more education. Since collective action problems are complex, they also require solid documentation to prove. Bryan Caplan produced this for credentialism, but the data on teenage phone usage doesn't prove a collective action problem. It argues, poorly, that teenagers are individually better off when their individual social media usage is reduced. So the question of "why not parent" must be answered individualistically. My guess is that individual parents feel weaker than in the past.

It argues, poorly, that teenagers are individually better off when their individual social media usage is reduced.

The teenagers themselves agree. 68% of them feel worse after spending time online. 50% say a digital curfew would improve their lives, 47% would prefer to live in a world where the internet doesn't exist.

And, pertinent to what we're talking about:

79% say technology companies should be required by law to build robust privacy safeguards into technology and platforms used by children and teenagers, such as age verification or identity checks.

How much data do we need to show that teenagers are stuck in a collective action problem when supermajorities of them are saying 'please help us get out of this collective action problem'?

How much data do we need to show that teenagers are stuck in a collective action problem when supermajorities of them are saying 'please help us get out of this collective action problem'?

I have data that says only 16% agree that a total phone ban at school is a good idea, and only 30% agree that any phone restrictions at all are a good idea. Tracks well with my experience in school.

The teenagers themselves agree. 68% of them feel worse after spending time online.

Caused by doom scrolling and algo slop. Fix social media, don't target adult privacy rights and teenagers' access to phones.

50% say a digital curfew would improve their lives,

Sleep related. Best solution is to delay school start times and encourage parents to give teenagers a bedtime, not this spyware bill.

47% would prefer to live in a world where the internet doesn't exist.

Not a majority, too abstract a question, just a vibe, also too bad, this bill doesn't make the internet disappear (which would be a disaster), it just attacks internet privacy.

How much data do we need to show that teenagers are stuck in a collective action problem when supermajorities of them are saying 'please help us get out of this collective action problem'?

You'd need a book like The Case Against Education. Except, The Anxious Generation was slop and didn't even include most of the data Haidt used on Substack to make the case. He actually dumbed it down for normies. Apparently normies need a fallacious book to accept that there is a problem, but a non-fallacious one can't be produced. Hm.

I have data that says only 16% agree that a total phone ban at school is a good idea, and only 30% agree that any phone restrictions at all are a good idea. Tracks well with my experience in school.

I would distinguish between school discipline matters and social matters. Clearly, young people aren't happy with the digital first childhood, but all kids like messing around in school. The two positions aren't really in conflict. Although frankly, the idea that we should be consulting children on the kind of discipline they are subject to seems pretty stupid. I imagine a lot of kids would like to be able to bring alcohol into school too.

Caused by doom scrolling and algo slop. Fix social media, don't target adult privacy rights and teenagers' access to phones.

I mean, I'm 100% behind banning stuff like infinite scroll, but it's not like there's a big button governments can press that says 'make the digital world not addictive'. I mean, really think about what that would entail. You'd have to ban video games, youtube, dating apps, Reddit and a bunch of other stuff I haven't thought of. There's an awful lot of stuff on the internet that is (or can be) addictive. I've dumbed down my phone about as much as possible and I still find myself idly scrolling on the Wikipedia app. Addictiveness is just a characteristic of the digital world. Banning it all for everyone would be far more authoritarian than just preventing teenagers from using the worst offending apps.

Sleep related. Best solution is to delay school start times and encourage parents to give teenagers a bedtime, not this spyware bill.

Delaying school start times isn't a bad idea, but we had early school start times before and we didn't have kids demanding restrictions on themselves. This is different. Also, bedtimes, really? Do you honestly think that parents haven't thought of 'tell your children to go to bed'? The kids themselves recognise the problem isn't 'lack of bedtimes', it's the addiction machine sitting on the bedside table.

Not a majority, too abstract a question, just a vibe, also too bad, this bill doesn't make the internet disappear (which would be a disaster), it just attacks internet privacy.

The very fact that such a high number would want to delete a technology that is so integrated into their lives should give you pause for thought. Teenagers in the 1920s didn't want to ban the radio, kids in the 50s didn't wish they lived in a world without television. The internet has clearly damaged the social fabric in a meaningful way, and the fact that young people have noticed too deserves more than a flippant response.

You'd need a book like The Case Against Education. Except, The Anxious Generation was slop and didn't even include most of the data Haidt used on Substack to make the case. He actually dumbed it down for normies. Apparently normies need a fallacious book to accept that there is a problem, but a non-fallacious one can't be produced. Hm.

I've read both of these books but I really don't understand what point you're trying to make here. Could you clarify?

Although frankly, the idea that we should be consulting children on the kind of discipline they are subject to seems pretty stupid.

Maybe less stupid than consulting the rabble on the kind of laws they are subject to, considering they destroy civilization when they choose wrong, but kids in school just have a little more fun, since school is pointless anyway.

I mean, I'm 100% behind banning stuff like infinite scroll, but it's not like there's a big button governments can press that says 'make the digital world not addictive'.

Governments could ban infinite scroll: start with a fine of $10 million per day for any company ordered to remove infinite scroll. I bet it will be gone quickly.

I mean, really think about what that would entail. You'd have to ban video games, youtube, dating apps, Reddit and a bunch of other stuff I haven't thought of.

No, you don't have to be any more consistent than your take on schools and democracy. The government is a murderous asshole that goes on random violent rampages over small triggers, it is not a Kantian philosopher attempting to achieve a perfectly Consistent moral Order of Things.

Addictiveness is just a characteristic of the digital world.

Either-or fallacy. Ponder heroin and cigarettes, if you will.

The very fact that such a high number would want to delete a technology that is so integrated into their lives should give you pause for thought.

Maybe it wouldn't replicate.

Teenagers in the 1920s didn't want to ban the radio, kids in the 50s didn't wish they lived in a world without television.

You don't know that.

I've read both of these books but I really don't understand what point you're trying to make here. Could you clarify?

How? What's confusing you?

encourage parents to give teenagers a bedtime

Do you not remember being 12/14 and arguing passionately that you were now old enough to be allowed stay up late(r)? Maybe you can force 15 year old Teen Kid to go to their bedroom, but you can't force them to go to sleep (and you can't lock them in, either).

Do you not remember being 12/14 and arguing passionately that you were now old enough to be allowed stay up late(r)?

Apparently 50% of them now want a bedtime, so why would this be an issue for them?

Maybe you can force 15 year old Teen Kid to go to their bedroom, but you can't force them to go to sleep (and you can't lock them in, either).

But you can take their phone for the night, which is what I presume digital curfew means. Or use some kind of parental control so that it locks down.

build robust privacy safeguards into technology and platforms used by children and teenagers, such as age verification or identity checks

I agree that social media is an issue, but this sentence is giving me a stroke. Collecting data on your age and identity isn't what I'd call a "privacy safeguard".

Just because parents don't know what a collective action problem is doesn't mean they can't identify one. Not everyone works off of formal logic; parents can recognize that Instagram is bad for kids while also recognizing that being socially isolated as the only one not on Instagram is bad for kids.

I have a previous thread about very conservative parents being better at their jobs, and my sources overemphasized discipline as a factor. Lots of the commentary was basically about how 'discipline' meant setting limits on social media. Plausibly your theory about parents feeling disempowered is supported therein; but short of spreading the folkways of the rightmost 10-20% or so of the population more broadly (and I have another thread about that), the best way to solve this specific problem of teen social media use is to make a law against it. They won't follow it voluntarily, but it will let their parents enforce it.

Of course, I would prefer to be a selective libertarian and empower the rightmost 10-20% of the population by not doing anything to prevent the rest of it from self destructing. This is not out of a general commitment to freedom. But it's entirely understandable to me why social media bans that nobody knows how to enforce would be welcomed by parents.

Normal people can't identify collective action problems well, it's too complex of a scenario.

I disagree with this. Maybe normal people are unfamiliar with game theory; the prisoner's dilemma; nash equilibria; and so on. But definitely a lot of the time they can intuitively sense that there are situations where it would be good if everyone would agree to some X, but in the absence of an agreement, they feel pressured to go along with the crowd.

Since collective action problems are complex, they also require solid documentation to prove.

I disagree with this as well. Sometimes collective action problems are relatively straightforward and sometimes common sense is more than sufficient to recognize that one exists.

but the data on teenage phone usage doesn't prove a collective action problem.

I'm not familiar with any formal research, however I'm pretty confident just based on general observations and common sense. Above, you asked why parents don't simply take their children's phones away. I am quite confident that -- part of -- the answer to this question is that parents don't want their children to be the weirdo in class who doesn't have a phone; who's out of the loop; etc.

I'm pretty confident just based on general observations and common sense.

Common sense in this case is a hammer you got from slate star codex, for which everything is a nail. My common sense says the hammer is a specialty one and it doesn't fit all but a few nails. Alas, rationalists are always trying to use it anyway. Collective action this, game theory that, moloch thing there, prisoner's dilemma here.

I am quite confident that -- part of -- the answer to this question is that parents don't want their children to be the weirdo in class who doesn't have a phone; who's out of the loop; etc.

I don't think parents implementing common sense social media controls to their under-16 children would make them the weird kid in class. It would not amount to completely depriving them of a phone or the ability to text friends.

But definitely a lot of the time they can intuitively sense that there are situations where it would be good if everyone would agree to some X, but in the absence of an agreement, they feel pressured to go along with the crowd.

Except they fail to do this in the most important cases. Probably because their heuristic is asking whether the thing is individually good. They don't think teen phone usage is individually good, the mainstream argument is not collective action problem, it is individual parenting problem.

Common sense in this case is a hammer you got from slate star codex, for which everything is a nail

For what it may be worth, I was studying game theory when Scott was still in diapers.

I don't think parents implementing common sense social media controls to their under-16 children would make them the weird kid in class. It would not amount to completely depriving them of a phone or the ability to text friends.

You are sort of shifting the goalposts here. Earlier, you referred to completely taking away a child's phone:

"individual parents feel they would be individually better off if they took their individual kids' phone away, but they feel too weak to do that."

But anyway, let's break this down.

  1. Do you agree that many parents perceive that their children's use of social media is harmful?

  2. Do you agree that of those parents, many also perceive that their children are likely to end up being isolated/left out/etc. if their child stops using social media while their children's peers continue to do so?

Except they fail to do this in the most important cases.

Well do you think there are ANY situations where normal people can intuitively and correctly sense that there is a collective action problem, even if they are unable to make use of the formal language and terminology?

The collective action problem is other parents. And of course, other kids.

You can't control what happens in other people's houses when your kid goes over to a friend's house. Maybe the parents are lax, maybe they don't care if their 12 year old kid is watching porn, maybe they have no idea. Boys are going to dare one another over "did you see this?"

Well do you think there are ANY situations where normal people can intuitively and correctly sense that there is a collective action problem, even if they are unable to make use of the formal language and terminology?

No, I think they lack the cognitive capacity for anything beyond „X is bad, because if it happens to me, I won't like it“ and „Y is good, because if it happens to me, I will like it“. That's the basis for all of our laws and our education system and economic system. The masses have failed to accept every well-documented collective action problem I can think of. It's because they require someone to be top 10% literacy to comprehend.

For example, this comment. He argues

teenagers are stuck in a collective action problem [because] supermajorities of them are saying 'please help us get out of this collective action problem'?

But the evidence follows the individual heuristic I just wrote:

68% of them feel worse after spending time online. 50% say a digital curfew would improve their lives, 47% would prefer to live in a world where the internet doesn't exist.

„I feel bad after too much time online, so I would be better with less time online. I sleep too little because of phone, so I would be better off putting phone away early. I feel bad on the internet, so I would like the internet to go away.“ And seriously, the last one is preposterous, can you imagine the collective economic damage if there was no internet? Meanwhile, when it comes to actual collective action, I have data that says only 16% agree that a total phone ban at school is a good idea, and only 30% agree that any phone restrictions at all are a good idea. They don't want collective action.

Do you agree that many parents perceive that their children's use of social media is harmful?

Yes.

Do you agree that of those parents, many also perceive that their children are likely to end up being isolated/left out/etc. if their child stops using social media while their children's peers continue to do so?

No, because I think a solid fix is a screen time limit, and this doesn't lead to complete isolation. I think parents don't do this because they are lazy and weak and won't fight with their teens.

Earlier, you referred to completely taking away a child's phone:

I meant partially, or on a temporary basis for a particular reason.

No, I think they lack the cognitive capacity for anything beyond „X is bad, because if it happens to me, I won't like it“ and „Y is good, because if it happens to me, I will like it“. That's the basis for all of our laws and our education system and economic system.

I disagree. For example, I'm pretty sure most people favor laws against income tax evasion. Even though most people would cheat on their taxes if they could get away with it.

Do you dispute that most people favor laws against income tax evasion?

„I feel bad after too much time online, so I would be better with less time online. I sleep too little because of phone, so I would be better off putting phone away early. I feel bad on the internet, so I would like the internet to go away.“

OK, and is it so preposterous to hypothesize that people might have the following feelings: (1) I feel bad when I am away from social media because I feel left out; and (2) I feel bad when I use social media because I feel inadequate compared to a lot of my connections?

No, because I think a solid fix is a screen time limit, and this doesn't lead to complete isolation.

Umm, does that mean "yes" or "no"? I am not asking about screen time limits. I am asking this:

Do you agree that of those parents, many also perceive that their children are likely to end up being isolated/left out/etc. if their child stops using social media while their children's peers continue to do so?

It's a very simple yes or no question.

What are the common sense social media controls you're thinking of, exactly?

As far as I can guess at teen mindsets, having a dumb phone that is not designed to have apps in 2026 is exactly the kind of thing that would make a kid the weird kid in class.

What are the common sense social media controls you're thinking of, exactly?

Parental controls? Time limits? The main harm is scrolling for too long.

Parental controls? Time limits? The main harm is scrolling for too long.

I don't know if that's the main harm, but certainly a significant potential harm is the feeling of constantly comparing yourself to other people and feeling that you don't measure up in some way. It's hard to see how this would be prevented with time limits. Or with parental controls other than simply preventing your child from being on social media.

but certainly a significant potential harm is the feeling of constantly comparing yourself to other people and feeling that you don't measure up in some way.

Unfortunately, this is just reality. And it relates to one or two collective action problems the masses don't comprehend. The best documented of these is the eugenics problem; less well documented but probably real is a problem with the economy where too much is based on luck, so people have to watch those with the same or lesser genetic endowment as themselves be much more privileged, which is wrong. But they can only think of the dumbest communism as a solution to this and that didn't work so well, so they have given up. Communism of course is based on the selfish heuristic of I would be better if I had more stuff, and not based on true collective action problem logic. The real solution would use IQ tests and would be enforced meritocracy or something along those lines.

the data on teenage phone usage doesn't prove a collective action problem

Why doesn't it? I guess I have to go dig it up, but there's literally surveys with teenagers where they're asked if they think they'd be better off with no social media but don't want to stop using social media if everyone else is still on it.

Literally the definition of a collective action problem.

Why doesn't it? I guess I have to go dig it up, but there's literally surveys with teenagers where they're asked if they think they'd be better off with no social media but don't want to stop using social media if everyone else is still on it.

Yeah, in general I am skeptical of people's self-reporting about their desires, motivations, and feelings. But here, it's basically just common sense, following from basic principles of human nature and social media, among them: (1) comparison is the thief of joy, and the more comparison the less joy; (2) social media facilitates intense comparison; and (3) nobody likes to feel left out, which includes not being on the social media site being used by one's peers.

Why doesn't it?

Because it only argues, poorly, that teenagers are individually better off when their individual social media usage is reduced.

but there's literally surveys with teenagers where they're asked if they think they'd be better off with no social media but don't want to stop using social media if everyone else is still on it.

I haven't seen this, I don't recall Jonathan Haidt talking about it. I'm mostly thinking of his work on the topic.

Literally the definition of a collective action problem.

Allowing teens aged 16 to 19 on social media while demanding photo ID from anyone to use any device doesn't appear to solve that problem.