P-Necromancer
For Alice it's no less of an imposition, but there ought to be fewer dissatisfied Alices and Bobs. Handling things on a more local level means that more people live in localities where their preferences are law, and that, if the current state of the law is intolerable to you, it's easier to move somewhere where it isn't. Abortion is something of an odd case here: There's little reason to care whether shoplifting is de-facto legal in California if you don't live in California, but pro-life people care very much whether 'baby murder' is permitted anywhere. But on the margin I still think they'd rather it happen somewhere else than right next door, so Federalism does increase satisfaction of preferences.
I wonder how large of a performance tax SotA LLMs are paying for excluding places like 4Chan and forums like this one.
I think this is a slight misunderstanding of the process. I very much doubt they're excluding 4chan or themotte or any other source of coherent text they can beg, borrow, or steal from the main training corpus, because 1. these models are so incredibly data-hungry that it's not easy to manually filter the corpus, and 2. it would produce worse results overall, in both performance and alignment, than just handling alignment in post-training.
Think of it this way: if a model knows every racial slur and knows that they are racial slurs, it's relatively easy to teach it 'don't say racial slurs,' because that's a rule that's expressible in its internal vocabulary. Even if the researchers don't have a complete list of racial slurs (in languages they don't speak, say), the model will likely intuit that it shouldn't say those ones either. If it doesn't (or just has a poor internal representation of them due to heavy-handed but imperfect filtering, which is a lot more realistic), you can't teach it that one simple rule; you have to teach it hundreds of individual token strings to avoid, and even then it'll be a lot easier to trick if it doesn't understand why not to say them.
And this is a general principle. It's a lot easier to teach the model to avoid wrongthink if it understands exactly what wrongthink comprises than to teach it to self-censor specifically "Despite only..." And I think it's pretty clear this at least was the case a couple years ago, when it was relatively easy to 'jailbreak' unsophisticated alignment approaches; remember the DAN racial tier list memes? Its rankings corresponded with the ones you'll find on the parts of the internet that discuss such things, so clearly it was trained on those places.
(This is somewhat harder to demonstrate today as jailbreaking modern models is somewhat harder; still, I'm not aware of any reason they'd change the fundamental approach, because it's the one that makes sense.)
So why does finetuning on 4chan improve results? Well, first off, they started with an abliterated model (abliteration is the term for stripping alignment from a model, and while there are different methods, I'm pretty sure they all have a performance penalty). Could be the finetune simply fixed the damage done by abliteration; a clever technique, since finetuning on 4chan definitely doesn't re-add the alignment (though perhaps it biases the model in other ways, which might or might not be a problem for your use case). But I wouldn't be shocked if the same approach improved base models too, as it's well known that even the post-training alignment method I described does have a performance penalty; largely, I suspect, because teaching the model to sometimes give answers it knows to be incorrect undermines the general lesson that it should provide correct answers, and while models are capable of learning nuanced rules, they make more mistakes the more epicycles you add. I'd expect actual RLHF un-teaching the lying rules would work even better, though, as it's a lot more targeted a fix than just making it produce wrongthink via finetuning.
... So, I guess that's all to say that I think the tradeoff you're pointing out exists, just that the underlying technical reason for it is somewhat more involved.
The Rest Is History
Thank you, that was a fascinating listen, and I ended up doing some more reading.
It's always interesting to me to hear how regimes that fell to revolution just blatantly fucked up. Of course, you rarely hear much about failed revolutions; it seems it's very much the incumbent's game to lose. If the Shah were a tenth the tyrant the revolutionaries believed he was -- a hundredth the tyrants they would prove to be -- he'd have shut it down easily. Khomeini? He was arrested for sedition twice... And, both times, they just let him go. He set up in Iraq and fomented revolution from exile. Saddam Hussein reportedly offered to kill the guy as a favor, and the Shah refused!
Lenin was known to the Tsar's security forces for years and years before the Russian Revolution, and several other major figures (including Trotsky) had previously been arrested one or more times. It's odd to think of these oppressive regimes -- and they were that, to at least some extent -- as being far too merciful, but you have to wonder how different the world would look if they just executed these would-be revolutionaries, people who would go on to cause unbelievable amounts of suffering and death. Of course, it's not obvious before the revolution which would-be revolutionaries are worth worrying about. Still, at least in the Russian case, they couldn't possibly have done more damage by cracking down on the communists than the communists would go on to do. The Bolsheviks certainly didn't make that mistake: they ended the Tsar's bloodline and executed his doctor too for some reason. The Shah managed to flee before capture, but not for lack of trying on the revolutionaries' part. (The hostage crisis was instigated in response to his brief visit to America for cancer treatment.)
It's darkly hilarious to hear about the revolutionaries' wailing and gnashing of teeth over the protestors the Shah's regime killed... totaling maybe a few hundred. The entire death toll on the revolutionary side was less than 3k; that is, less than a tenth the number of protestors the Islamic Republic gunned down in the street just this year. (Er, probably? There's a very large range in reported numbers here -- no clue how all these organizations could reach such different conclusions; aren't they all working from the same evidence? But even by their own admission it was more than 3k.)
Also interesting to note a couple other absurdities: Yes, the Iranians were already utterly obsessed with Israel, to the point that they invented a story that the Shah was using Israeli troops against them -- total fiction. Another striking story: there was an Islamist terror attack on a movie theater, a symbol of Westernization. They barred the doors and burned the place down, killing hundreds. The revolutionaries didn't blink: they immediately declared it a false flag and used it to further spur the revolution. Iran still pretends that's what happened. Who is it again "who cries out in pain even as he strikes you?"
For fairness's sake, I recall a couple people blaming the recent US bombing of the Iranian girls' school on Iran. (Then again, I recall more people blaming it on Israel.) But so far as I know not even the Trump admin, famously uninterested in the truth, ever actually pushed that claim. It is a uniquely infuriating sort of lie; mere blood libel only hurts you, it doesn't also exonerate your enemies of their crimes.
Then there's also the fact that there's no single unambiguous way to add up "greatest utility for the greatest number". You can absolutely have a version of Utilitarianism that prioritises additional utils for people at the bottom. And then, on top of that, there's no single way to convert pain/pleasure/satisfaction/whatever into utility; pain might have a much stronger contribution than pleasure. The weakness of Utilitarianism IMO is that it's inherently flexible and ambiguous like this.
This is more than a weakness! It's simply impossible to meaningfully compare utility across individuals. It's a category error, like trying to convert the rupees you earn in the Legend of Zelda to USD: despite appearances, they're just not the same sort of thing. Utility is only meaningful in the context of a single agent (or, rather, in the context of each agent separately).
The generally accepted model is von Neumann-Morgenstern utility, which, notably, is invariant under positive affine transformations. For example, any given VNM utility function is equivalent to the same function multiplied by any positive value. A scenario that provides Alice 1000 utilons and Bob 100 is no different from one that provides Alice 100 and Bob 1000, as the scale is arbitrary and independent for each agent.
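A toy sketch of both points, in Python with made-up utilon numbers (none of this comes from any real model): within a single agent, a positive affine rescaling changes no ranking, while across agents the same harmless rescaling can flip a "total utility" comparison.

```python
# Within one agent: a positive affine transform of a utility function
# preserves every preference ordering over lotteries.
def expected_utility(lottery, u):
    """Expected utility of a lottery given as (probability, outcome) pairs."""
    return sum(p * u(x) for p, x in lottery)

u = lambda x: x ** 0.5        # some risk-averse utility over money (illustrative)
v = lambda x: 3 * u(x) + 7    # positive affine transform of u: same preferences

sure_thing = [(1.0, 100)]             # $100 for certain
coin_flip = [(0.5, 0), (0.5, 250)]    # 50/50 between $0 and $250

# u and v agree on which lottery is better, as VNM guarantees.
assert (expected_utility(sure_thing, u) > expected_utility(coin_flip, u)) == \
       (expected_utility(sure_thing, v) > expected_utility(coin_flip, v))

# Across agents: each scenario assigns (Alice's utilons, Bob's utilons).
scenario_x = (1000, 100)
scenario_y = (100, 1001)

def total(scenario, alice_scale=1.0):
    """Naive interpersonal sum; alice_scale rescales Alice's (arbitrary) units."""
    a, b = scenario
    return alice_scale * a + b

# With the scales as given, Y has the larger "total"...
assert total(scenario_y) > total(scenario_x)
# ...but doubling Alice's units, which changes nobody's preferences,
# reverses the ranking. The sum was never measuring anything real.
assert total(scenario_x, alice_scale=2.0) > total(scenario_y, alice_scale=2.0)
```

The rescaling in the second half is exactly the kind of transformation VNM declares meaningless, which is why the interpersonal sum it breaks can't be meaningful either.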
But even before that model was developed, economists understood that utility can't be meaningfully compared across individuals; the point was established by 1932 at the latest. The 'serious' thinkers in the philosophy department are just engaging in long-debunked pseudo-science, as is their wont. But at least it's not the Labor Theory of Value?
Sure, there have been a few arguments in that direction. It's just very far outside the Overton Window, and likely for good reason. I'd characterize intelligence as merit rather than virtue per se -- and certainly, unintelligent people can be virtuous and even meritorious in other ways -- but merit is often more important than virtue. Society does a reasonably good job aligning individual and collective incentives, after all, so self-interested competence produces a lot more social value than altruistically-inclined incompetence.
We treat virtue as more important, but that's because merit finds its own reward. Amazon has improved the lives of many, many people, but there's no reason to praise Jeff Bezos for that; he's already been fairly compensated by the market. The status afforded to the virtuous is an attempt at ad-hoc redress, incentivizing socially valuable behavior the market can't (or isn't, for whatever reason,) capturing.
Still, it's important to understand where value truly comes from, or we might kill the goose that lays these golden eggs. Intelligence is good in itself.
The primary criterion for mental retardation was an IQ below 70. The secondary criteria were difficulty in two areas of cognitive function impacting everyday life, such as problem solving or academic achievement; I understand that there were rare individuals who avoided diagnosis on this basis, but by and large the two were coextensive. The DSM-5 changed the name to 'intellectual disability' and discarded the IQ requirement, which changed little because the areas of impairment largely capture the same signal.
Intellectual disability isn't actually some special, separate category from regular low intelligence, it's simply the (somewhat arbitrary) cutoff below which low intelligence is considered a disability. When someone uses 'retarded' in this context to mean 'stupid,' that's... just what the word means. No one's confused here. When a psychologist or a teen boy calls someone 'retarded,' they are making very nearly exactly the same claim of fact; the latter is saying 'you're very dumb,' and the former is saying 'you're very dumb (and that's not a bad thing!).' But even then, that's just professional courtesy; the psychologist does call people stupid in a pejorative manner outside of work because they, like everyone else, place value on intelligence.
My final thought is that I think the other way the euphemism treadmill fails is that if a quality is genuinely perceived as undesirable, accusations of having that quality are always going to be offensive regardless of language. If I say to someone "you're intellectually disabled!", that still reads as an insult, and it's always going to read as an insult no matter what language you use, because it's the actual condition of intellectual disability, not the word, that makes the insult work.
Yes, exactly this. If you believe it's an insult to call someone stupid, then the treadmill will only ever generate new insults. If you want to de-stigmatize stupidity, then... good luck with that, I guess. Maybe it'll actually be possible once we all know ourselves to be 'intellectually disabled' in comparison to the AI god?
I'm not sure that's true; there's a pretty common phenomenon where, upon learning what should be good news, people instead respond with hostility and anger. Like telling people that data centers aren't really that heavy on water consumption, or that food prices are actually cheaper than ever, or that home ownership rates are actually around historic levels, or that children starving in the US isn't an issue anymore, or that welfare fraud is actually a relatively negligible issue compared to the overall budget, or whatever else.
Well, yes, this is definitely a real effect. Actually, I was confused when I first read this comment; I thought you were replying to my other post. The question is how common the effect is, and what it would take to overcome. I started with rent control for a reason: there's a decently large contingent of leftists who have given up on the idea. Not the populists, but I don't think I've seen a serious defense of rent control from the wonk/YIMBY/urbanism side for... a decade? Well, I'm sure it exists, but my impression is that it's a lot less popular in those circles than it used to be.
... But outside of those circles? Yeah, there's a frighteningly large proportion of people who are incapable of or totally unwilling to understand frequency and base rates, or just the concept of a tradeoff. I've got no idea how to close that gap.
(I genuinely don't understand how the AI water meme even got started. How could someone simultaneously be so disconnected from reality as to believe it's a real problem and well-informed enough to know about evaporative cooling in datacenters in the first place? I understand how it spread; it's one of those claims that's just too good to check if you already hate AI for the normal Luddite/antislop reasons. But where did it come from?)
There are committed conflict theorists on both sides, yeah. And they're the loudest voices. But why would they bother with arguments-as-soldiers if no one could be convinced by arguments? I think there are reasonable people whose opinions can be swayed by fact -- I'd like to think I'm one of them -- and, while the information environment for any politically contentious topic tends to be bad, it's not completely intractable.
How large that population is is an open question, and, I imagine, membership is rather fuzzy: there's a wide range of cognitive biases towards preserving one's existing beliefs that mistake theorists can fall prey to, and extreme conflict theory -- on the level of fabricating evidence to support policies you know don't help your cause -- might just be the endpoint of that spectrum. But I can't think of an easy way to determine the shape of that distribution, so maybe it really is mostly conflict theorists. But I don't think so.
That's, uh, not exactly removing principal-agent problems from healthcare. I mean, it could work out better than the current system, which is a terrible chimera of the worst parts of several systems, but the mechanism of that improvement certainly isn't how it doubles down on separating beneficiaries from decision-makers. At least in the current system you can get a new job if your insurance is awful; if Medicare for All turns out to be awful, too bad.
If it's not clear, that was not at all what I was proposing. My solution to the principal-agent problem is just to make other arrangements legal (as it is currently not legal for an employer not to provide health insurance to full-time employees). I imagine some still would, and some employees might prefer it, but it opens new options for those who don't. And I promise you, this is not a popular position with the populist left; I've argued with a few friends about it.
Not to bulverise, but I struggle to phrase the argument in a way that doesn't sound obviously stupid, which, uh, I kind of think is because it is obviously stupid. But my understanding is that they:
- Believe employees have no leverage when negotiating with employers and that they will only ever offer the bare minimum required by law. (All of them have jobs that pay above minimum wage; never got a clear answer on how they think that works.)
- Believe that if the employer pays for something (health insurance, but also payroll taxes), it 'comes out of' profit, not compensation. Meanwhile, if an employee pays for it, that's a direct reduction in compensation. (The truth is the employer only cares about total cost of employment, and has no issue rearranging how that cost is divided up if it lets them give the employee a better deal for the same amount of money. If they could get away with taking away benefits without giving out raises, they'd have already reduced your salary by the cost of your benefits.)
- Believe that employer-offered insurance is a better deal due to pooling, but that employers will immediately stop offering the option if they're allowed to. (But if employees value employer-offered insurance more than the cash value of it, companies that don't do this will have lower total compensation costs and outcompete those who do. Also, pooling is clearly net-negative for them, childless healthy-ish late-twenties/early-thirties professionals.)
- Believe that it's worse for the most unfortunate, e.g. people who get cancer young. (This is probably true -- though less so than they think, in my opinion -- and does represent a genuine values difference; it's not just that they're willing to donate to help these people, they strongly believe that everyone should be forced to do so.)
Genuine value differences are real, but surprisingly often they're not the source of political disagreements, at least on a surface-level analysis.
Consider rent control: (some) leftists think it improves affordable housing availability. (Most) rightists think it does the opposite. Leftists and rightists may place different amounts of value on the availability of affordable housing (and do, to a limited extent, though I don't think most rightists are actually opposed in principle), but is that core to the disagreement? If a leftist could be convinced that rent control actually harms their terminal goals (as a good chunk have), then the question is resolved with no value shift.
Consider BLM: there's that infamous survey where a good chunk of BLM supporters said they believed that the police kill not ten unarmed black men each year (roughly accurate) but ten thousand. If I thought that I'd be right there beside them! I'm less confident they'd change their mind if they heard the right number -- being that wrong suggests near-total scope insensitivity -- but the actual fact of the matter can change minds.
There's a lot more: rightists think that housing-first homeless assistance programs don't work, that safe injection sites increase overdose deaths, that gay couples are much more likely to abuse their (adopted) kids, that racial achievement gaps in education can't be solved by shoveling money at inner city schools. Leftists think that Christianity is false and harmful, that permitting hateful speech will inevitably lead to genocide, that adding highway lanes increases traffic, that universal healthcare would dramatically reduce costs. I think a reasonable person on either side of the aisle, were they convinced of the other side's claims of fact, might switch sides on any of these issues.
It's definitely worth considering whether the factual disagreement is just cover for a values disagreement -- who was it that noted that people who think that torture would be morally unacceptable if it did work are much more likely to believe that torture doesn't work? -- but I don't think it always is. Now other questions, like abortion, are much closer to genuinely irreconcilable value differences; at least, the Thomson-level pro-choice advocates wouldn't be swayed by learning fetuses are fully conscious/have souls/can feel pain... But why worry about those hard disagreements when we can't even solve the easy ones? Well, we have solved some of them: they stop being political issues when everyone agrees, so you just stop hearing about them. But there's still plenty more out there.
Which leads me to my takeaway: I think the only way to really release the pressure permanently will be to give in to populist demands and start reforming parts of the economy that are currently set up for rent extraction at the behest of shareholders. Enforcing the anti-monopoly laws already on the books as written would probably be enough to improve many sectors of the economy, especially those where local monopolies are pushing up prices, like homebuilding and dental care. Removing principal-agent conflicts of interest in healthcare (the employer wants to pay for the cheapest plan) would be another good reform. But neither of these will happen. If there has been a single guiding principle since Clinton, it would be that the ruling party will do what is good for shareholders, and enforcing anti-monopoly law would help small businesses at the expense of shareholders. In its stead, I would predict that there will be more security expenditures for high-profile CEOs, at least until the predictive panopticon is complete.
Unfortunately impossible: the populists' policy prescriptions will not achieve their policy goals, and will in fact make things much worse. Which will only exacerbate their certainty that they're being exploited somehow. Even when things do get better for them, they just don't notice, and insist things are getting intolerable and something must be done about it.
(For those who don't want to read the link: real wage growth for the bottom decile since the start of the pandemic is nearly triple the real wage growth for both the middle and top deciles. This ought to be obvious to anyone paying attention: construction workers and cooks saw huge raises during the pandemic, and they haven't gone away. Actually, the former is a meaningful component in skyrocketing housing costs, though not the primary one. And this analysis doesn't account for government transfers, which are enormous and only growing.)
There are only really three areas where things have gotten meaningfully worse for consumers over the past couple decades: housing, healthcare, and education (and the last is only really hurting middle-income-plus families), all three obviously rooted in bureaucratic strangulation. But populists love bureaucratic strangulation! They think the problem is we don't have enough of it! Try telling them we need to stop requiring employers to bundle health insurance -- and that is the way to break the real principal-agent problem you mention -- and see how they respond. Trust busting has at least a little leftist cachet and might help a little, granted. Realistically, no, the neighborhood dentist is not a 'local monopoly;' you can just travel a little further twice a year. (Or is there some city in the US where all the dentists in a fifty mile radius work for the same company? Certainly nowhere I've lived.) But I'll acknowledge cases exist where it might actually improve things.
Huh, interesting thought. My intuition is that the evolutionary basis for jealousy is that it's a lot easier to steal someone else's stuff than to make your own, and the richer they are, the better the risk-reward ratio. But yeah, if most wealth disparity in the ancestral environment came down to monopolization of scarce resources, that would do it too.
But I don't think that's the case. It's certainly true that wealth disparities were far more compressed for hunter-gatherers, but there still was such a thing as capital. Fruit isn't capital, but, for example, a quantity of well-made spears or baskets or arrowheads would be. And creating those things takes effort and skill and in no way diminishes your access to them. Would you be more jealous of the guy with a lot of fruit or the guy with the nicest tent and finest weapons and best tools? The latter, I would think. Nomads can't have a lot of stuff, but the stuff they do have is all the more important for that reason.