I mean, for Scott I empathize that he’s dealing with toddlers right now. They’re the perfect mix of capacities and incapacities for demanding hands-on intervention. They climb on things, get into things, scream for attention… obviously they spend some time playing quietly by themselves or napping, and you can get some things done with them around, but it’s a sharp curve when they graduate from immobility to crawling to walking to climbing!
Overall it’s not bad, but it would be worse without other people to help around. I don’t really know what his full circumstances are like, but caring for a toddler more or less solo for a full day, no friends or family around to hang out with, is pretty rough. It’s always best when you can be communal, and for deracinated Bay Area sorts, that’s what you get all the time.
So I have a little more pity. There are things money can’t buy.
Extreme emotional lability, or rather, all over the place. Consider the extreme positive affect and the sudden cold feet as expressions of the same flaw. Best to stay away if she can’t even keep herself together.
You did the right thing by staying firm and stable. Don’t let that kind of woman suck you into her aberrant emotionality. And, like many things, dating sucks until it doesn’t.
Why ask the internet about generalities instead of going to an orientation for the actual and specific schools in your area? Even if the information you get here is absolutely statistically true, it means jack all if your schools buck the trend in some fashion.
Yes, this is racist - about schools, not people. You’re in a position to judge them on the contents of their characters, and instead are asking strangers to judge them by the colors you believe will be on their students. Morality, who cares; but why are you hamstringing yourself?
Presumably because of demand. As I figure, the HOA is basically a way of forcibly excluding people who can’t “fit in” with the community or follow the rules. Reading in between the lines, that was what was going on in this case, as in, classic bullying. Probably the people trying to force defendant to fit in were a real mess of busybodies, obviously, as they brought a dumb case to court, but this is the function here.
In a more sympathetic case, imagine a family moved in who left rusting cars on the lawn and other obnoxious but not quite illegal things that nobody of your class or background would do. How do you make them stop? I know a lot of the people here are libertarians, principled or otherwise, but the average Joe ain’t and would rather keep those families out, or else coloring within the lines. Personally I don’t empathize and enjoy my freedom more, but I get that’s a rarity overall.
And apparently HOAs are overall popular. People like em. Or at least, they aren’t the kind of radioactive that would stop people buying these properties, even with the very obvious downsides, and encourage developers to not enforce them. I know revealed preferences is a meme, but it seems to apply here.
Relating this out. I’ve seen a lot of people on this forum arguing pretty directly for a shared US culture. Well, the HOA feels exactly like what’s being asked for here - an association that punishes deviance with process, and upholds normalcy. Japan is a pretty culturally centralized place, and from what I hear from my friends there, pretty much every little village and neighborhood has its own little HOA (micro-local government). They organize things like who goes to sweep out the graveyard, sure, but also make certain nobody gets too far out of line, in that distinctive passive-aggressive but unmistakably Japanese way. And I think of that, and of the fuck-you American spirit, and it makes me laugh a little. Conformists are allowed their little liberties here, but why think they’re remotely popular? An American will only subject himself to banding together once he’s exhausted the alternatives for keeping the undesirables out.
(This is ignoring the little associations that are just about funding shared resources, like an HOA that pays for the community pool. Those have a straightforward reason to be.)
I guess your point on romantic signaling is probably true, though I hate signaling and reflexively oppose the position on principle.
But I actively disagree that the most important thing is to push moneymaking degrees, on a couple of points.
First, the whole degree-to-job pipeline is overrated. The degree is a proxy for, roughly, intelligence, and as long as you have the real meat you will be able to leverage the actual work. (This is my life story. Started in humanities and trivially switched to work in STEM. I’ll admit software makes this easy.)
Second, while cash obviously matters, I think the most important thing is to learn wisdom and be a good and broad-based parent to your children. This is what my parents were to me. And while I decry the sorry shape of the liberal arts in universities, the actual subject I consider paramount. So rather than just add work training for women, I think bringing refinement and rigor back to the degrees would be better. (And helping people who have no business being in college get out. That’s another topic.)
You’re right about divorce as a path for extremely cynical women. If I were writing about the man’s perspective, this would come front and center. He’s devoting so much of his life to her! What if she just takes it from him, with the blessing of the courts? It’s genuinely unsettling. But, in that other hypothetical post, I wouldn’t be talking about cads. I don’t think (or hope) my audience is cads, or people interested in cads, and the same goes for the female equivalent.
Divorce is honestly another point of risk for an honest woman, just like it is for an honest man. Risk hitting your mid-thirties with no loyal man, and either no children or worse - children? It’s kind of awful to think about. But the post was already meandering a little for my tastes.
Yes, of course I agree a man needs standards. I have standards, and I insisted my wife meet them (kindly and firmly in the dating stage - and no, not about petty things like how I wanted my breakfast cooked).
But that doesn’t undercut the fact that what underwrites those standards is a man’s reliability and character. I’ve been performing a little personal ethnography on this forum, and in my own life, and the men who are happily married tend to be extraordinarily solid and secure in their opinions, thoughtful and caring about women’s perspectives (NOT a dogwhistle for mainstream feminism), and with a great focus on their own ability to be trusted. And this is something that good women, women who clearly enjoy the high opinions of their husbands and of me (should I meet them), deeply desire.
Anyway. I don’t think women have greater risks in dating, or that men do, for that matter. I tend to agree that the risks are mostly around discerning good from bad, and that’s hairy both ways. But learn good from bad one must, or at least learn the methods of getting wiser friends to help, if one wishes to make anything of oneself. But I’m sympathetic to your worries, and hope you find a woman who allows you to lay them aside.
What kind of woman does that? Would you consider her in your league? In the college league?
Besides, this was advice for reproducing, not dating. Dating advice is a different kettle of fish.
If you have a woman who you’re dating who is a good candidate, learning to trust her goes a long way, and trusting yourself the rest. “Learning to trust” is not an abstract journey of the soul. Select things to trust her, and yourself, on, and see how they go when things get hairy. Stressful situations are effective here!
Buyer beware: I’m not recommending a good time, here.
Baby boom was in a sense a last gasp. Huge wealth changes the equation. But it was the specific experiences of the baby boom that sparked feminism; when second-wave feminists deride the life of the housewife, they are and can only be specifically talking about the baby boom housewife. Daughters saw what life was like for their mothers, and they wanted out. You can’t declare feminism as a premise; feminism was, like any social movement, a reaction to prevailing conditions. Those conditions were, first, the Victorian era and second, the baby boom.
The advice is distilled from my own life and my successful friends and coworkers, who are by and large married and with or currently having children. It’s not advice on how to get laid, or how to attract women initially (I have opinions but consider it beside the point), but how to convert a relationship into a companionable and loving marriage with children, which is what I consider valuable. Take it or leave it, I guess.
I’m interested in your view on how the quality of men and women has gone down, and as a treat, why. If I were to give a description, I’d say that the lowered quality was literally that they weren’t interested in making things work, rather than separate elements. That sort of intentional, serious attitude towards life is basically what you want out of a partner as table stakes, right? That they’ll have the hard fights with you and want to get through them instead of taking them out on you, that they’ll commit materially sooner rather than later, that they’ll stick with you if things aren’t breezy. Obviously material concerns matter too, but people (in my circle, maybe unrepresentative) make plenty if they’re even slightly dedicated. What’s your take?
Downthread, in the discussion on cheating in college and the decay of institutions, @hydroacetylene brought up a frequent topic: is the college-to-work pipeline good for society and for women? Rather than the high-level moral or strategic view, I wanted to look more at the countervailing forces here. Even assuming that early family formation is good, desirable, and pleasant for women compared to schooling, why would they choose college? Not to bury the lede: I think it’s risk mitigation.
A woman’s life is, not to an infinite extent but nevertheless to a great extent based around vulnerability. She is especially vulnerable to men, who are stronger than her and yet want something from her. A man who wants something from her more than he cares about her is not a curiosity but an active threat. Even if no such threat manifests, her very nature makes her vulnerable. A pregnant woman, or a new mother, is incredibly dependent on those around her. If any part of that support should go away, she could be in serious trouble. Women’s life strategies, unsurprisingly, center around mitigating these risks.
These strategies fall into two major camps: finding a center for her protection and support, and making damn certain that she has excellent control over that center. (For men this is simple: he is his own center of protection and support, always. Everything else is just a fallback for extenuating circumstances, or part of his larger ambitions.)
For her center, a woman can choose, in essence, a man, an institution, or herself. For herself, she will obviously be unable to reproduce. This is a fallback, the spinster’s last resort. No more needs be said. An institution is impersonal and uncharitable, but (say) a widow will find it tolerable, and she has some modicum of control. If she follows the rules, support will not be retracted. So what is preventing her choosing a man? Her lack of control over him.
Men are famously fickle. A man will sing a woman’s praises to the moon, and maybe even believe himself, and vanish as soon as he gets some. He will spend the family’s money on dice or drinks. He will say that whatever he earns is his by right, and ignore the duty he has towards the flower he plucked in the prime of her life in an explicit contract to care for her forever (till death do us part). Even if he is one of the rare, dutiful ones, his simple preferences become domineering imperatives, and you have to think on every one: is this worth fighting over, if he might just leave? To say all men are cads is to go too far. But there are cads out there, and their attentions are disastrous.
(I know women who have had their men: get fired and refuse to work, get addicted to painkillers and refuse to work, allow their mother to browbeat their wife, and support an entire separate family in another country, off the top of my head. I also know women who have had loving husbands with no problems who are in old age. But would you want to simply gamble on the outcome here?)
So what women need is leverage. Historically this was twofold: the highly salient and important labor they performed, and their tight bonds with their (and their man’s) immediate community. For reference, before modern textile production, a woman would quite literally make the clothes on her husband’s back and the food he ate. Were he to get them elsewhere, they would be much more expensive and less tailored to him. This makes any argument inherently easier for the wife to win. He depends on her, too. Meanwhile, if he were to stray, her connections to the local wives, perhaps including her own parents and his, or moral leaders like a priest, would allow her to bring wide-ranging pressures down upon him. Or, say, if he were to romance her but fall short of his duty to propose to her, a brief word between their fathers would end in a joyous wedding officiated by shotgun. I’m not trying to imply the distant past was a glorious feminist utopia, but these were to the best of my knowledge the mechanisms of women’s power back then.
Woman’s work was eviscerated by the Industrial Revolution, and her community was shattered by the car. Bluntly, there is nothing coarse and material that a housewife can offer a man in this day and age which he cannot get for an acceptable amount of his own money. Food and cleaning are trivial, and the only real limitation on sex is whether porn is sufficient (it generally is). The only things she can offer are on a more sophisticated or higher plane, like the abstract of a continued legacy through childcare or loving intimacy and affection. These are important, but have a lower valence than the material, meaning that the man’s opinion is dramatically privileged. And in a postwar suburb of friendly acquaintances, in and out of the house on errands and excursions, there’s nobody to drop in on and talk to and organize with - and even if there were, why would the man not simply get in his own car and leave to find those who “understand” him better? As the last nail in the coffin, the pill and the Sexual Revolution deny women even their power over sex. If it’s pleasurable and has no risk, what right does she have to demand that her man do something in exchange - except pay as her john? With pregnancy on the table, it’s obvious: he risks what she does, together with her. But without, it’s harder to argue the obvious truth that she is risking time, because he does not have the same pressure to make the most of the flower of youth.
This is the foundation of our current moment, and given the premises women choose independence. They do not perceive a reasonable alternative by which they can have a marriage where they are respected and equal. The life plan changes accordingly, and becomes: go to college (to protect you in your most vulnerable and desirable period and increase your status and the treatment you can demand), take a job with a good healthcare plan (including maternity leave), find a man who sticks with you for several years (while you are on the pill, and proving he is not a cad), and finally, around 30, get married to a man you TRUST to support you and your children. Of course, this costs a huge amount of time and money, but it’s more palatable than taking a dive for the first schmuck on the street with no good way out. (And even if he is a good man, get stuck in a suburban home near HIS job with an infant or two and an absolute dearth of friends to see during working hours and little sense of what you’re really bringing to the table. At that point, why not just get a job working alongside other ladies and stick the kids in daycare?)
So that’s my analysis. College is just a means here; if it were not available, women would go for anything else that could protect them, probably an employer. The problem for women is that they feel like the whole deal is raw, that they’re going to struggle to get a man who works for them and supports them and who they can influence. Unless they feel their own power in their own relationships, they will scrabble for every edge they can get. If you want to fix this on a personal level, as a man, be trustworthy and the whole reproduction thing will come pretty easily. As a woman - can’t comment with quite so much authority, but valuing men for their private (i.e. directed at you) virtue over their public (i.e. abstract and status-seeking) virtue might help. On the societal level, focus less on pushing women into childrearing and more on pulling. What are the advantages? How do they mitigate risk? And what’s in it for them, in a practical, day-to-day sense?
Long-term I feel this will shake out. Men and women who figure out how to bond and partner quickly and effectively will be aspirational and fruitful, and they will be the new model. But for those of us alive now, I think it helps to be intentional about our own lives.
Interested in the opinions of married mothers on this (I think we have a few). I’m a happily married father, so I have some insight, but it’s all third person to me.
Interesting - I do think there's a pretty major intermediate step of using static analysis-type processing to control the excesses of AI, and worldinfo is a plausible first step. But that lack of real memory just keeps coming back and kicking any in-depth efforts right in the teeth.
Probably a good place to stop this particular conversation, but not before Claude updates. It's still at Mt Moon, but there have been some interesting developments. Timeline:
- It almost got out of Mt Moon! Unfortunately, then it died.
- After dying, it managed to convince the oversight layer ("Critique Claude") that there's a secret passage to Cerulean City that skips going through Mt Moon, and all it needs to do is find it. Now both AIs are convinced that this is the one way forward.
- But the only way to get to that secret passage is by dying again. So it's trying to black out on wild Zubats in Mt Moon to reach Terabithia or something.
- Now that it's converted to Gnosticism, it doesn't need any knowledge of the world of the Demiurge. So it's been scrubbing data out of its memory banks in favor of the holy book "black_out_strategy".
- At some point it managed to crash its internal tooling. This hasn't helped matters.
This is absolute peak comedy. Somehow the AI has managed to go completely nuts, seduce its parole officer, and start a death cult. We're not even past the linear part of the game yet!
I've seen things... cough you people wouldn't believe...
LOL
fun exercises in tard wrangling
Is that a full TTRPG campaign set up for an LLM to execute on? How well does that work, and how extensive can it get? Is there some kind of external scaffolding for selecting things like random events, or does it have the capacity to toss all the events together in memory and then select? How long does it go before it totally loses the plot? (Maybe not an appropriate Culture War Roundup topic, but w/e.)
I considered playing around with some of that stuff a while back but I just couldn't justify the costs to myself. It's interesting, but so are a lot of other things that are WAY cheaper (and I'm at this point morally opposed to interfacing with large companies if I can at all help it). If cost-to-performance comes massively down over the next decade, maybe I'll try a local model off a reasonably priced GPU. Otherwise, idk, it's cool hearing stories.
The primary question was whether conflicts in Japan can be classified as ethnic. If you want a definition, here you are: coethnics recognize themselves as the same "kind" of people. An ethnic conflict is a struggle between mutually recognized "kinds," where the direct competition between the "kinds" is driving everything involved. The groups in conflict will directly reference the underlying cultural or genetic differences (especially material) in identifying the group they oppose. Think slurs here.
The modal ethnic conflict is Israel/Palestine: two self-identified groups competing over specific territory and resources. When one wins, they move the other off the territory entirely. When they win they enforce their cultural habits and obliterate the practices of the losers in any ways they care about.
I'd go so far as to say that NO internal Japanese conflict maps to that, except the conflicts with the barbarians, which the Japanese very explicitly labeled as a conflict between their "kind" and the barbarian "kinds." (Maybe the stuff with the Christians could be labeled as an abortive ethnogenesis.) Japanese conflicts are typically one of the following: jockeying for position under an accepted sovereign power; attempting to overthrow the sovereign power; attempting to create an independent hierarchy parallel to the sovereign power (this never worked outside of the Sengoku period; they all got cleaned up and subdued by the start of the Edo period). One group of elite warriors fights another, vassalage agreements are reordered, anyone who doesn't fit in gets killed, and the village headman starts paying taxes to someone new.
You know what doesn't happen? The people of Satsuma expelling farmers from the outskirts of Kumamoto and settling the territory, destroying the local art and buildings and replacing it with their own. The Japanese do that to the barbarians, sure, but not to each other. Therefore, not an ethnic conflict.
What I would argue, though, is that regardless of whether we think the word 'ethnicity' is appropriate or not, historically Japan has been often divided, and people from different parts of Japan understood themselves to be meaningfully different to one another - certainly to the point of fiercely conflicting with one another.
Only somewhat true. Let's start from prehistory and round dates aggressively:
- 300 AD - 500 AD: probably interfamilial conflicts; largest one is plausibly between followers of Amaterasu and Susanoo (roughly corresponding to the people who followed the coast of Honshu to the south and north respectively out of Kyuushuu). Result of that conflict was that both sides apparently agreed to live with one another, and the winners badmouthed Susanoo in their myths.
- 500 - 650: no notable internal wars.
- 650 - 675: coups, major government reform.
- 700 - 1150: no notable internal wars. Samurai emerge in this period; alternately fight barbarians and one another (for stewardship of outlying farmland, e.g. Tokyo area, in the name of Kyoto nobles). You may not believe it, but Japan is not especially martial up to this point. Their manpower generation is feeble; their political elite doesn't know how to fight; they have a huge problem with half-trained thugs working for Buddhist monasteries extorting the capital (until someone figures out that samurai have been invented and bring a couple dozen home to clean house).
- 1150 - 1200: major civil war between samurai over who gets to take the government from the nobles.
- 1200 - 1300: no notable internal wars. Government gets its legitimacy from fairly judging disagreements between samurai and precluding violence.
- 1300 - 1400: comedy of errors. Starts with an imperial succession crisis; in the middle of that, a notable general decides he wants to become shogun. He succeeds, but totally loses control of the country. Succession crisis continues for some fifty years in the meantime. Finally the grandson of the shogun gets the country mostly together, but now Japan is more like the Holy Roman Empire than it was before: lots of petty princes.
- 1400 - 1450: intermission.
- 1450 - 1600: the show continues. Warlords get mad at one another and decide to cage match in Kyoto, burning it down in the process. (Shogun lives there.) Rest of the country falls to pieces. 21st-century crews descend to film the bulk of the country's historical dramas. Finally a warlord manages to reunite Japan, then gets assassinated when he really would rather not have. His lieutenants have a cold war, one of them dies of natural causes first, the other wins the following hot war, and installs himself as shogun. Most lords are his direct vassals, and get reorganized into being more like corporate salarymen (with mandatory relocations!), and the rest are kept on a tight leash. Christians are exterminated.
- 1600 - 1850: no notable internal wars. Country mostly closed for renovations.
- 1850 - 1875: foreign influence forces country to open. Ambitious retainers of the independent lords decide that this is their chance. They swiftly take the country over and industrialize.
- 1900 - present: no notable internal wars.
So, adding that up, when was it divided? Maybe in prehistory, but if we start from the appearance of writing, we have around 600 years of general unity with a single period of civil war oriented around who gets to lead the government. Following the appearance of samurai, things get a lot more spotty, but there's a couple of unified governments, and even in the rough times nobody is arguing that one cultural subcategory of Japan should exterminate another. Still, from 1150-1600, you have about 150 years of unity and 300 years of disunity. Following that, you have one (1) more internal war (which I will overestimate as 25 years of serious internal instability) in the 400 years leading to the present and otherwise total unity.
Across this time period, although I have no idea what is sufficient in your eyes to be "meaningfully different" - perhaps it's the Edo-period complaint that the Kantou or Kansai eat their noodles like fucking animals, perhaps not - no people in Japan felt their "meaningful differences" were good reason to start a war. Directly competing ambitious elites certainly had a reason to start wars with one another, and did so frequently, but just as frequently took vassals and intermarried and felt no particular need to enforce one way of producing miso over another. That was the concern of peasants, after all.
The thing that irks me about your initial comment isn't that it implies Japan was ever violent. Certainly it was violent! Certainly there was great discord and strife! Coethnicity is no panacea against human conflict. The second story in Genesis is about someone killing his very brother. What irks me is that it seems to be based on a definition of "ethnic" that has no meaningful subject, or else is based on a representation of Japanese history which is not reflected in reality. The reality of Japanese history, and Japanese conflict, is something I've found deeply interesting, and it has its roots in petty court intrigues and the powerful and chaotic dynamics of feudal vassalage. But there is no ethnic side to these conflicts, and they do not need an ethnic side to be interesting. Trying to color them as ethnic loses the real hue of that history, which is what changes as conflicts cease to be feudal and begin to be ethnic - which, incidentally, is a good description of what happened over the course of the Napoleonic Wars.
Outfits: (For each character their current clothing and underwear.)
and underwear
This site needs emojis for shit like this. Text doesn't do it justice.
Hobbyists have no shame.
Yeah, I hope nobody tells them about worldinfo or something. I'm still convinced the median /g/oon still has the median researcher's ass handily beat wrt "prompt engineering". Arguably this is a testament to how powerful a tool SillyTavern is, but afaik every feature has been initially conceived and pitched by the community anyway.
It looks like they actually implemented something similar to what I was talking about earlier - I watched Claude sit and churn for a while after it left Pewter, moving all information about that city into long-term memory (with explicit tags!) and clearing up local information. It's now back in Mt Moon, so we'll see whether this has made it more effective at navigation. What it's definitely doing is taking meaningful and extended "clock cycles" to manage - so this kind of improvement is definitely not free or cheap at present implementation/with present models.
Very cool tool from WorldInfo. I like the idea of bringing word definitions into context transparently based on the prompt.
I expect that wouldn't change much; arguably it'd make it get lost even more. At least now it seems to have a fairly clear objective in mind (beat children, defeat gyms), which it can even translate into lower-level "tasks" like navigating routes.
Besides, the minimal prompting seems to be the point; from my understanding the dev is unwilling to hold Claude's hand any more than necessary and he wishes to see how it holds up on its own, even if it takes it days to get out of every stupid loop he gets stuck in. I ~~wish I had unlimited credit~~ think it's dumb, even with crutches to streamline progression and break loops this would still be pretty interesting to watch, but oh well.
Yeah, watching the money burn is a little eye-watering, but I appreciate how seriously the guy seems to take it. He seems to have known from the start that it wasn't going to be a magical success, but wants to see what it takes to get it working. I'm here for that. My only complaints are: there's no summary of where it's been/what it's done (so I can't track progress easily) and there's no export of the knowledge base over time to show what it's learned. Getting to read the knowledge base would be incredibly interesting.
In contrast to the previous comment, I DO disagree. Japan's only ethnic groups are the Yamato and the pre-Yamato "barbarians" (and the Ryuukyuuans, although those were annexed much later and are not in the main archipelago).
The Yamato did historically understand themselves to be one people organized under the priesthood of the Imperial family, which performed a yearly ritual to ensure good rice harvests for all. They used one language, with various dialects - similar to the way most languages work, like English. They shared an overwhelming proportion of their material culture and religion (local cults and the abortive Christian movement notwithstanding). For multiple extended periods of Japanese history they were united under central rulership, although in earlier centuries this was pretty distant rulership.
Modeling Japanese conflict as regional is nonsensical - the better model would be family (or clan) conflict, with only a few interesting exceptions like the militant Buddhists around Osaka during the Sengoku period (or the rising of the farmer-samurai, same period). The closest thing I can think of to a strictly regional conflict was the east-versus-west conflict of the Genpei war - which is, once again, even named after the two families in conflict. The regions in question are mostly important as the places where the warring parties have their farms.
If you want the clearest evidence, consider that every group that succeeded in WINNING one of these conflicts sought out the SAME goal: entitlement to lead the Japanese people, typically as Shogun but in one memorable case as Emperor. (On the small scale, it was the right to rule over a local group of Japanese in a pretty typical Japanese fashion, which is to say with high taxes.)
Your requirements for a given people being "one ethnicity" appear utterly unattainable anywhere. What standard could possibly be met? If there's ever a conflict between two groups, isn't that - from the argument as you have stated it - sufficient proof that these were not coethnics in the first place?
the different states of the Holy Roman Empire were all German
But didn't the people in those states agree that they were German? Or else what was the pan-Germanism movement that arose in response to Napoleon's invasions?
but when spatial navigation is not prompted directly because it is presumed to be implicit in the task
Is this an artifact of the LLM having no side-effects while processing outside of the explicit textual output? e.g. if you tell them to process it explicitly but include that in a sidebar like the <thinking> block, would they have an easier time keeping the anime chicks where they oughta be? Human communication assumes that there's subtext in every conversation, and the deepest part of the subtext is that the other party is thinking and remembering certain things. But there's no equivalent for an LLM.
Actually yeah I believe this is exactly the problem, my experience with purely chat-based MUD-adjacent scenarios has shown that it can barely keep track of even that. Some kind of consistent external state of the world, or at least of the self, seems sorely missing, and the 'knowledge base' doesn't seem to successfully emulate that.
Memory, in other words. And all the hairiness that entails. I wonder why the knowledge base approach seems to have fallen flat. It's a very plausible idea on the surface! If there's too much for me to keep track of, or I'm worried I'll forget the details, the correct solution is to write it down and refer to the notes.
Actually, re-reading the design, it looks like the knowledge base isn't so much like a binder of notes as it is a single post-it note stuck to the screen - Claude doesn't query it deliberately, it apparently gets the entire contents of it shoved into the prompt. Wild! That would explain part of why it's so useless. It's hard to fit anything very detailed in there and means that Claude can't get a new set of "notes" for whatever area/task it's currently attempting to handle.
I'd guess it was given an explicit task - beat the game, which requires completing the objectives, which constrains its focus to the general idea of the game's progression it has from training (see its obsession with Route 5 during the tard yard arc). Exploration is basically you the player exercising agency in ways permitted by the game structure, agency of which Claude has none. Actually I wonder if explicitly prompting something like "beneficial items found in out of the way areas can help in beating trainers by making your mons stronger" would make it get lost even more actually explore.
Yeah, on a strict level Claude can't possibly be agentic, but it could definitely be given a richer set of goals. What if you gave it something open-ended like "Pokemon is a game that children play to explore, befriend Pokemon, and win tough battles. Play this game the way it was meant to be played"? Or, if it needs more hand-holding, "explore the world of Pokemon and defeat the Elite Four"? Although this would only be helpful if it learned from exploring. Otherwise it would find every corner of MOMS_HOUSE as magical as the first time it explored it.
OTOH it's interesting how it doesn't seem to take a step back here and define a meta-strategy, an approach that makes pursuing future goals easier. That comes naturally to humans as a function of learning. Whenever you try doing something new, you play around with it a little first rather than directly attempt to achieve a goal, right? I suspect one reason that this AI doesn't do it is that it's not trained to learn, as it is incapable of learning.
Compute can be spent in many different ways. We're moving from a paradigm of scaling up the size of a model, in terms of parameter count, in favor of scaling run-time compute (time spent thinking) and reinforcement learning.
Very interesting aside! However, it doesn't address the question of diminishing returns.
There isn't much of a market for AI playing Pokémon. There is immense demand for them to be good at coding and maths. We've seen stunning progress in that regard, as you acknowledge. You attempt to back-chain your argument, saying that they're said to be good at maths but look, they're shit at Pokémon, which apparently invalidates the former. It really doesn't.
I've used AI for coding, which you mention further down as a crowning triumph. It is... not particularly good. It struggles at anything past a very general form of the problem. It was very useful for copy-pasting similar pieces of code! Not very useful for building new features. It had a distinct habit of waiting until the interesting or important part of the problem and leaving a comment saying "Implement a function to do X!" Hmm, very interesting, if I tried that I'd get fired. So no, I think this is a valid argument. AI can be taught to the test, and indeed appears to have been, but the actual world involves far more de novo work than the test includes. That's why school-trained pre-professionals tend to need a pretty hefty ramp-up to start being really useful - they've only been working on tests so far. Pokemon is interesting precisely because it has not been trained for. You should expect more, not less, untrained situations for AI to do anything meaningful in the job market - and you should weight untrained situations in your analysis several orders of magnitude higher than trained situations.
Do you use AI to augment your work? Is it going to take your job? On what kind of a timescale? Do you think you'd be able to substitute yourself for an unmonitored AI without issue on any tasks? If not, what errors do you think it would make, and why? Honestly interested in your answers here, if nothing else. I would greatly respect you for putting your money where your mouth is on this one and bringing receipts.
I had never given it any thought before the demonstration. But plenty of people have speculated that LLMs would never be any good at video games, and now that they're not good but not terrible, it's only a matter of time before they're great. And that time can be very short.
Hmm... you think getting stuck in what appears to be a permanent loop is not terrible? Is this the behavior you'd accept from anyone working for you?
The thing that keeps puzzling me about your comments is that you seem to simultaneously view ANY capacity in a task as an impressive accomplishment at the same time as you assert that AI has overwhelming general ability. Those two don't go hand in hand, except maybe by this little quote. Any capacity seems to be, for you, an indisputable sign of unlimited future capacity - as though the only question to be answered is total disability versus infinite ability. There's no clear reason that this has to be the limit of the answer space. Line go up... forever? Like with bitcoin? There's also the rather bizarre fixation on LLMs - even though something like, say, an octopus is very obviously not an LLM and still has meaningful if primitive intelligence. The sheer gnostic power of your position is hard to argue against, and unfortunately I don't find it very convincing based on my own experience. It takes rather a lot on faith.
Glad you like it!
By "think spacially/temporally," do you mean "produce valid outputs for spacial/temporal problems" or "model space and time as first-order constructs"? I definitely believe the former, but I'm skeptical of the latter. Claude's adventures in the "tard yard" showed a real difficulty in grasping that, if the back of the house is a closed-off yard, maybe you should exit through the front. Looping is a problem, but I don't think any of us would consider this to be a particularly information-dense problem. The only way it could be is if the AI's ability to recognize the problem is hamstringed by its need to encode the state as a totally different sort of resource (linguistic tokens) - which brings us around to the top.
Battles are, of course, way easier, because they can be cast as a narrative (and I'm pretty sure every AI is trained on Smogon's ample fora).
Another interesting thing, not sure what to think of it. When I play a game like this, my default behavior on entering a new area is to explore it thoroughly and learn what there is to learn about it before seeking out objectives. Claude seems to prioritize specific objectives over general exploration, to its detriment. Wonder why that is?
You are literally erasing my existence, mods???
My culture is NOT your costume, fake-normie.
Well, if that's what you want to call an Anthropic researcher who decided to make their experiment public.
"Claude Plays Pokémon continues on as a researcher's personal project."
Ha, wow! Was not aware of that. I guess that makes sense w/r/t the funding.
You've written a lot. I think it's best to focus. (As much as I'm tempted to talk about concepts.)
What I understand to be your main point is (my words because you did not state it in concrete terms):
AI has rapidly improved in the recent past. We should expect it to continue improving at a similar rate. So if you see any success in a given metric now, you should expect to see much more success in the near future.
Which is a fair point! The only counterargument to that is on the specifics: why is it improving and what do we expect future improvements to look like? Almost all of the improvement thus far is based on throwing more compute at the problem - so if we're going to see improvements of the same kind, we should see them based on more compute. However, improvements in models are logarithmic - steps up in capacity tend to require 10x compute (by appearances you're pretty educated about AI, so I suspect that is not news to you). So although improvements in efficiency can effectively allow for somewhat more compute, like with Deepseek, we should expect that throwing more compute at the problem will get prohibitively expensive. I believe this has already happened. So while under hypothetical conditions of infinite compute we could have an LLM that infinitely approximates an AGI, similar to the implausible premise of Searle's Chinese Room (a book that allows one to construct a correct response to any input), we are unlikely to see that in practice.
So, how are we to get to AGI, in my opinion? By improving AI on completely different parameters from what currently exists - a revolution in thought about how AI should function. And tests like Claude Plays Pokemon are a fun way of showing us where the gaps in our thinking are.
For my own point of view:
This AI can strategize in battle, understand complex instructions, and process information, BUT it struggles with spatial reasoning in a poorly-rendered 2D GameBoy game, therefore it's not intelligent.
That's not the argument. The argument is: this AI is struggling in a VERY non-human way with what we would consider a pretty trivial task. This reveals that its operational parameters are not like those of a human, and that we should figure out where else it is going to perform at sub-human levels. The fact that we're seeing this at the same time as it performs at SUPERhuman levels in other tasks shows that this is not AGI, or even in the direction of AGI, but rather is tool AI. (I assume you think humans are, at the very least, general intelligence - right?)
I don't think you've addressed this point, except here:
It wasn't designed to play Pokémon. It still does a stunningly good job when you understand what an incredibly difficult task that is for a multimodal LLM.
Why should I care? AGI is supposed to be GENERAL. This is the stuff that's supposed to be taking people's jobs in a few years! And yet it gets lost in Cerulean City? As a tech demo, this is very cool - it's remarkable that someone was able to pipe these pieces together, and the knowledge base idea is very cool and is a plausible direction to take new LLMs into. A hypothetical Claude 3.8 that is explicitly trained to make knowledge base manipulation a central feature of the model could potentially perform miles better on some of these tasks. But all you've told me is that I should expect AI to struggle with these tasks. In which case: doesn't it sound like we agree? We both agree that there was no reason to expect Claude to succeed with Pokemon at the level of an eight-year-old. So, from the perspective of an uncommitted third party, given that an AI skeptic and an AI optimist have both agreed that an LLM can't play Pokemon like an eight-year-old... well, it feels pretty clear to me.
Obviously, if this becomes a big selling point for the next generation of LLMs, then we'll see them all benchmarked on Pokemon Red speedruns and you can I-told-you-so about AI being able to beat Pokemon. I don't doubt the ability of motivated corporations to "teach to the test" - it's what we've been seeing with "reasoning" AIs. It's just one of the problems with setting up real tests of ability for some of these AIs, because they get so much data that it's all but impossible to ensure you have a pure test like what the IQ test aspires to.
In other news: a streamer with deep pockets and a love of AI has decided to have Claude play Pokemon.
To get this working, ClaudeFan (as I'll be calling the anonymous streamer) set up some fairly sophisticated architecture: in addition to the basic I/O shims required to allow an LLM to interface with a GameBoy emulator and a trivial pathfinder tool, Claude gets access to memory in the form of a "knowledge base" which it can update as it desires and (presumably) keep track of what's happening throughout the game. All this gets wrapped up into prompts and sent to Claude 3.7 for analysis and decision. Claude then analyzes this data using a <thinking>reasoning model</thinking>, decides on its next move, and then starts the process over again. Finally, while ClaudeFan claims that "Claude has no special training for Pokemon," it's obvious by the goal-setting that the AI has some external knowledge of where it's supposed to go - it mentions places that it has not yet reached by name and attempts to navigate towards them. Presumably part of Claude's training data came from GameFaqs. (Check out the description on the Twitch page for more detail on the model.)
So, how has this experiment gone?
In a word: poorly. In the first week of playing, it managed to spend about two days wandering in circles around Mt Moon, an early-game area not intended to be especially challenging to navigate. It managed to leave after making a new decision for unexplained reasons. Since then, it has been struggling to navigate Cerulean City, the next town over. One of its greatest challenges has been a house with a yard behind it. It spent some number of hours entering the house, talking to the NPC inside, exhausting all dialogue options, going out the back door into the yard, exploring the yard thoroughly (there are no outlets), re-entering the house, and starting from the top. It is plausible, though obviously not possible to confirm, that ClaudeFan has updated the model some to attempt to handle these failures. It's unclear whether these updates are general bugfixes
How should we interpret this? On the simplest level, Claude is struggling with spacial modeling and memory. It deeply struggles to interpret anything it's seeing as existing in 2D space, and has a very hard time remembering where it has been and what it has tried. The result is that navigation is much, much harder than we would anticipate. Goal-setting, reading and understanding dialogue, and navigating battles have proven trivial, but moving around the world is a major challenge.
The current moment is heady for AI, specifically LLMs, buoyed up by claims by Sam Altman types of imminent AGI. Claude Plays Pokemon should sober us a little to that. Claude is a top performer on things like "math problem-solving" and "graduate-level reasoning", and yet it is performing at what appears to me below the first percentile at completing a video game designed for elementary schoolchildren. This is a sign that what Claude, and similar tools, are doing is not in fact very analogous to what humans do. LLM vendors want the average consumer to believe that their models are reasoning. Perhaps they are not doing that after all?
It's a bit of a tired point, but LLMs are known to be "next likely text" generators. Given textual input, they predict the most likely desired output and return it. Their power at doing this is quite frankly superhuman. They can generate text astonishingly quickly and with unparalleled flexibility in style and capacity for word use. It appears that they are so good at handling this that they are able to pass tests as if they were actually reasoning. The easiest way to trip them up, on the other hand, is to give them a question that is very much like a very common question in their training data but with an obvious difference that makes the default answer inappropriate. The AI will struggle to get past its training and see the question de novo, as a human would be able to. (In case anyone remembers - this is the standard complaint that AI does not have a referent for any of the words it uses. There is no model outside of the language.)
So, as you might guess, I'm pretty firmly on the AI-skeptic side as far as LLMs are concerned. This is usually where these conversations end, as the AI-skeptics believe they've proven their case and (as I understand it) the AI-optimists don't believe that the skeptics have any kind of provable, or even meaningful, model for what intelligence is. But I do actually believe that AGI (meaning: AI that can reason generally, like a human - not godlike Singularity intelligence) is possible, and I want to give an account of what that would entail.
First, and most obviously, an actual AGI must be able to learn. All our existing AI models have totally separate learning and output phases. This is not how any living creature works. An actual intelligence must be able to learn as it attempts to apply its knowledge. This is, I believe, the most natural answer for what memory is. Our LLMs certainly appear to "remember" things that they encountered during their training phase - the fault is in our design that prevents them from ever learning again. However, this creates new problems in how to "sanitize" memory to ensure that you don't learn the wrong things. While the obvious argument around Tay was whether it was racist or dangerously based, a more serious concern is: should an intelligence allow itself to get swayed so easily by obviously biased input? The users trying to "corrupt" Tay were not representative and were not trying to be representative - they were screwing with a chatbot as a joke. Shouldn't an intelligence be able to recognize that kind of bad input and discard it? Goodness knows we all do that from time to time. But I'm not sure we have any model for how to do that with AI yet.
Second, AI needs more than one capacity. LLMs are very cool, but they only do one thing - manipulate language. This is a core behavior for humans, but there are many other things we do - we think spacially and temporally, we model the minds of other people, we have artistic and other sensibilities, we reason... and so on. We've seen early success in integrating separate AI components, like visual recognition technology with LLMs (Claude Play Pokemon uses this! I can't in good faith say "to good effect," but it does open meaningful doors for the AI). This is the direction that AGI must go in.
Last, and most controversial: AI needs abstract "concepts." When humans reason, we often use words - but I think everyone's had the experience of reasoning without words. There are people without internal monologues, and there are the overwhelming numbers of nonverbal animals in the world. All of these think, albeit the animals think much less ably than do humans. Why, on first principles, would it make sense for an LLM to think when it is built on a human capability absent in other animals? Surely the foundation comes first? This is, to my knowledge, completely unexplored outside of philosophy (Plato's Forms, Kant's Concepts, to name a couple), and it's not obvious how we could even begin training an AI in this dimension. But I believe that this is necessary to create AGI.
Anyway, highly recommend the stream. There's powerful memery in the chat, and it is VERY funny to see the AI go in and out of the Pokemon center saying "Hm, I intended to go north, but now I'm in the Pokemon center. Maybe I should leave and try again?" And maybe it can help unveil what LLMs are, and aren't - no matter how much Sam Altman might wish otherwise!
- Prev
- Next
If you have time for an addendum, I’d be interested in hearing what your wife has to say, since it sounds like she was doing the lion’s (lioness’s?) share of the in-person raising. Or not! Might be wrong on my read.
For instance, mine gets a lot of mileage out of the library, playgrounds, stroller walks with friends, pretty much anything where she can chat with other women and let children be children.
More options
Context Copy link