My immediate social circle and the benefit of social media allowing me to keep some distant tabs on people from high school and college. Seeing a good number of women I thought had good heads on their shoulders go off some deep end and regress to behaviors I recognize from when they were younger.
I'm not trying to dismantle your argument, as I think you made it well. But I do want to point out that, at least in my circles, there's a strong correlation between "actively using social media" and "not having your shit together". In other words, if your sample is just social media, then you're missing out on all the well-adjusted individuals who are keeping to themselves.
And partially through my job where I interact with people of many ages, and one of the more common and frustrating genres of people I encounter is "neurotic woman in her 40s or 50s who still has the demeanor of a teenager."
Do you work in inside sales? Or maybe healthcare or aviation? Mostly making a lighthearted joke here.
makes a good case for where Israel has gone wrong
Could you quickly summarize that part? There's no way I am going to read this book, but I am curious enough to hear the summary.
Huh. I was confident that I had a better writeup about why "stochastic parrots" are a laughable idea, at least as a description for LLMs. But no, after getting a minor headache figuring out the search operators here, it turns out that's all I've written on the topic.
I guess I never bothered because it's a Gary Marcus-tier critique, and anyone using it loses about 20 IQ points in my estimation.
But I guess now is as good a time as any? In short, it is a pithy, evocative critique that makes no sense.
LLMs are not inherently stochastic. They have a setting called temperature (not usually exposed to the end user except via the API). Without going into how that works, suffice it to say that setting the value to zero makes their output deterministic: the exact same prompt gives the exact same output.
The reason temperature isn't just set to zero all the time is that the ability to choose something other than the single most likely token has benefits for creativity. At the very least, it saves you from getting stuck with the same subpar result every time.
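A minimal sketch of what that temperature knob does at decode time, assuming a toy three-token vocabulary (this illustrates the sampling math only, not any vendor's actual decoding code; `next_token` and the toy logits are made up for the example):

```python
import numpy as np

def next_token(logits: np.ndarray, temperature: float, rng: np.random.Generator) -> int:
    """Pick the next token id from raw scores, greedily or by sampling."""
    if temperature == 0.0:
        return int(np.argmax(logits))                 # deterministic: always the top-scoring token
    scaled = logits / temperature                     # <1 sharpens, >1 flattens the distribution
    probs = np.exp(scaled - scaled.max())             # softmax (shifted for numerical stability)
    probs /= probs.sum()
    return int(rng.choice(len(logits), p=probs))      # stochastic: sample in proportion to probs

rng = np.random.default_rng(0)
logits = np.array([2.0, 1.5, 0.3])                    # toy scores for a 3-token vocabulary
print([next_token(logits, 0.0, rng) for _ in range(5)])   # [0, 0, 0, 0, 0] every time
print([next_token(logits, 1.0, rng) for _ in range(5)])   # a mix of 0s and 1s, occasionally a 2
```

Same prompt at temperature zero, same output; turn the temperature up and the model is free to wander off the single most likely continuation.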
So much for the "stochastic" half of the label. Minus the stochasticity, are they just "parrots"? Anyone thinking this is on crack, since Polly won't debug your Python no matter how many crackers you feed her.
If LLMs were merely interpolating between memorized n-grams or "stitching together" text, their performance would be bounded by the literal contents of their training data. They would excel at retrieving facts and mimicking styles present in the corpus, but would fail catastrophically at any task requiring genuine abstraction or generalization to novel domains. This is not what we observe.
Let’s get specific. The “parrot” model implies the following:
- LLMs can only repeat (paraphrase, interpolate, or permute) what they have seen.
- They lack generalization, abstraction, or true reasoning.
- They are, in essence, Markov chains on steroids.
To disprove any of those claims, just (gestures angrily) look at the things they can do. If winning gold at the latest IMO is something a "stochastic parrot" can pull off, then, well, the only valid takeaway is that the damn parrot is smarter than we thought. Definitely smarter than the people who use the phrase unironically.
Bender, co-inventor of the phrase, and Koller gave two toy "gotchas" that they claimed no pure language model could ever solve: (1) a short vignette about a bear chasing a hiker, and (2) the spelled-out arithmetic prompt "Three plus five equals". GPT-3 solved both within a year. The response? Crickets, followed by goal-post shifting: "Well, it must have memorized those exact patterns." But the bear prompt isn't in any training set at scale, and GPT-3 could generalize the schema to new animals, new hazards, and new resolutions. Memorization is a finite resource; generalization is not.
(I hope everyone here recalls that GPT-3 is ancient now)
On point 2: Consider the IMO example. Or better yet, come up with a rigorous definition of reasoning by which we can differentiate a human from an LLM. It's all word games, or word salad.
On 3: Just a few weeks back, I was trying to better understand the actual difference between a Markov chain and an LLM, and I asked o3 whether it wasn't possible to approximate the latter with the former. After all, I wondered, if MCs only condition on the previous unit (usually a word, or a few words/an n-gram), couldn't we just train the MC to output the next word conditioned on every word that came before? The answer was yes in principle, but that this is completely computationally intractable: an explicit transition table over full contexts blows up exponentially. The reason we can run LLMs on something smaller than a Matrioshka brain is that the transformer architecture/attention mechanism replaces that gargantuan lookup table with a compact, learned function of the whole context.
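For a rough sense of what "completely computationally intractable" means here, a back-of-the-envelope sketch (illustrative numbers of my own choosing, not anything from o3 or a paper):

```python
# Rough sketch: an exact Markov chain conditioned on the full context needs one
# transition-table row per possible context, i.e. V**n rows for vocabulary size V
# and context length n. (Numbers below are illustrative assumptions, not measurements.)
VOCAB = 50_000          # ballpark LLM vocabulary size
CONTEXT = 2_048         # a modest context window

table_rows = VOCAB ** CONTEXT                     # rows the lookup table would need
print(f"~10^{len(str(table_rows)) - 1} rows")     # ~10^9623, vs ~10^80 atoms in the observable universe

# A transformer replaces that table with a fixed, learned function of the whole
# context (attention over the tokens), so the cost is billions of parameters, not 10^9623 rows.
TRANSFORMER_PARAMS = 7_000_000_000                # assumed size of a smallish modern LLM
print(f"vs ~{TRANSFORMER_PARAMS:.0e} parameters")
```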
Overall, even the steelman interpretation of the parrot analogy is only as helpful as this meme, which I have helpfully appended below. It is a bankrupt notion, a thought-terminating cliché at best, and I wouldn't cry if anyone using it met a tiger outside the confines of a cage.
I was kinda put off by the villainy in Player of Games. It would have been nicer if their "extreme meritocracy through McGuffin" concept had been addressed on its merits; instead Banks just goes for "but akshually they are all liars who don't do what they profess at all, and instead just do evil things and hypocritically hide it". This is easy - of course people who use plausible-sounding concepts to hide being bad are actually bad, especially if the author demonstrates to us that they are bad and then asks "aren't the people I just showed you being bad actually bad?!" Of course they are, you wrote them that way, what do you expect! It just feels lazy to me. I like my villains to be a bit more chewy, to require at least some work to figure out why their position - in which they see themselves as righteous - is untenable, or at least unacceptable to me. Even the Ferengi got fairer treatment than that (remember, they reached a pretty high level of societal development without any wars or atrocities like slavery; for an obvious caricature, that's a pretty decent achievement).
Usually I unhelpfully reply "do a lit review!!!" to these sorts of questions, but after a quick look myself I don't think it would be that easy - "become an expert in therapy" is probably more accurate, but about as unhelpful as it is predictable.
The challenging bit is that therapy (especially CBT) is "indicated" for just about everything, but that doesn't tell you which types of patients will benefit most from which modalities.
I'm not an expert in this by any means.
It is worth noting that "real" therapy (or many popular types of therapy) is often less ooey-gooey emotional exploration and more closely resembles Socratic questioning or an outright class (in the case of CBT, which is driven by "homework").
I do have a family member - not in psychology, psychiatry, or medicine - who listens to psychiatric podcasts, and a few of them dig into this explicitly; you could probably do that if you really wanted to develop a knowledge base.
Some modalities are more specific, however: DBT is for Borderline Personality Disorder and for people struggling with cluster-B coping mechanisms as part of their pathology. It can work quite well for this.
Classically (especially for any U.S. medical students reading!), the answer to any board question at the Shelf or Step 1-3 level that offers CBT as an option is going to be CBT, unless it's DBT for Borderline.
Thought #1: Incredible machine translation from Claude. 4o interpolates a little that's not in the actual text ("sexy kind of heaven") and does an iffy literal translation for "peaceful moment"; "blissful moment" is a better fit.
Thought #2: Ban LLMs. They will allow comments like this to be translated to English.
Late to the party I started, but spending money to incentivize a change in outcomes is, in my opinion, categorically different than legally enforcing those outcomes, and the latter is what I interpret to be the modern form of "DEI" that most people (especially on this forum) rail against.
i.e. if you want more women in leadership roles (regardless of motivation):
- Spend $1,000,000 on, e.g., scholarships for women to help them acquire credentials whose absence is seen as a barrier to leadership roles
- Pass a law that every board needs to have at least one woman
the former is not strictly DEI imo, whereas the latter is.
If your position is that we do not as a society need to incentivize any change in outcomes (e.g. because you already believe we're perfectly egalitarian), then fine. But to paint the former as DEI is imo aggressively retroactive, because the West has a century of history of programs that attempt to bring about positive social change through funding, while the phrase DEI only recently came into the lexicon.
What's your sample?
My immediate social circle and the benefit of social media allowing me to keep some distant tabs on people from high school and college. Seeing a good number of women I thought had good heads on their shoulders go off some deep end and regress to behaviors I recognize from when they were younger. Also including my Ex.
And partially through my job where I interact with people of many ages, and one of the more common and frustrating genres of people I encounter is "neurotic woman in her 40s or 50s who still has the demeanor of a teenager."
Dating has not done a lot to change that perception. I get the sense that women either mature quite fast (usually when they have good parental examples) and are generally self-sufficient by age 22-24... or they hit 25 and, if they haven't gotten their mental house in order by then, it just isn't likely to improve from there. There's not likely to be a 'flash of realization' where they renounce their earlier behavior and suddenly start 'acting their age.'
I keep making this point, but so many of the people who end up on Caleb Hammer's show are women who are absolutely, GOBSMACKINGLY bad with finances. Which is a decent metric for maturity if you ask me. Oh, I'm sure tons of men are in dire straits too, but ain't nobody validating their choices.
This sounds like a character problem, not an estrogen problem. I've met plenty of bitter men who never learn from their bad experiences.
Yes, I cede the point that many men never reach actual maturity. But 'character problem' can indeed be an estrogen problem.
I would suggest that a combination of hormones (keeping in mind that both too-high and too-low estrogen levels can have a huge impact on mood) and a general lack of restraint/correction of maladaptive behaviors in women results in 'stunted' maturity even as they approach thirty. And there's a nontrivial number of young women taking hormonal birth control in their teens and twenties, which can exacerbate the hormone thing.
Then add in that mental disorders, especially anxiety and depression, have spiked particularly badly among young women. And as a result, young women are increasingly prescribed antidepressants.
This probably exacerbates the hormone issue above. I am highly suspicious of what happens to brain development when the brain is awash in a combination of exogenous hormones (birth control) and SSRIs and similar drugs for the entirety of one's young adulthood.
I dunno man, I get the sense that women are having an increasing amount of trouble coping with the world-as-it-is. That is, they have bad experiences, and rather than process and learn from them... they use pharma drugs to cope. And they become bitter.
I think men will have issues like this too, but they don't tend to go to social media and scream it from the rooftops, so it is harder to see. If it gets bad enough, they tend to kill themselves. Less seriously, they may withdraw from society (or society discards them as useless), go to prison if they lash out, or become an Andrew Tate acolyte or something.
I am prepared to believe that this will be less prevalent among higher SES demographics.
I have some friends in this category. They’re miserable.
Same, all the more so because if and when they do manage to find someone who ostensibly wants a strait-laced monogamous relationship, the dominant gay culture is constantly shoving extra-relationship sex in their faces, leading to rampant cheating and drama and relationship blow-ups/divorces.
With high intelligence being one of the key ingredients to make for better leadership of groups and societies
I would like to challenge this. While obviously we don't want leaders to be idiots, I am not sure I would prefer a 150 IQ psychopath to a 120 IQ kind and moral person as a leader. To me, the main role of a leader is to set goals, make choices, and keep the group from descending into chaos, and I am not sure sky-high personal IQ is the best way to get that. Maybe some other qualities - which I am not ready to define, but could tentatively call "not being an evil asshole" - are at least as important? I do not doubt we need the leader to be smarter than average, but I think there's a point of diminishing returns where raw IQ stops being the thing that matters. I don't know where this point is, but I think the premise "the more IQ, the better the leader, no exceptions" at least needs to be seriously questioned.
This of course is complicated by the fact that a lot of our current leaders are, to be blunt, psychopaths or borderline psychopaths. Some of them aren't that high IQ either, to be honest. So we're not doing great in this area, and we only have ourselves, as a society, to blame for that. I'm not going to name names here because everybody would have their own examples depending on their political and personal proclivities, but there are enough examples for everybody. But if we want to do better, sometime in the future, I am not sure "higher IQ score" is the metric we need to concentrate all our efforts on. I have seen enough high-IQ assholes to make this questionable for me.
They're not evaluating GPT-4. They're using 4o.
4o vs GPT-4 is my mistake, but GPT-4 is generally considered obsolete and nobody uses it. It's true that 4o is a mixed bag and underperforms GPT-4 in some aspects, but we have no reason to believe that it's significantly worse than GPT-4 at translation.
4o is also what powers chatgpt.com so it's the model that most casual users will get the output from.
4o, even at the time of publication, was not the best model available.
4o was released well before Gemini 2.0 or Claude 3.5, so it likely was the best model at the time, along with the original GPT-4. I agree that right now 4o is not good.
My core contention was that even basic bitch LLMs are good enough
My core contention is that DeepL is good enough, as it's within spitting distance of ChatGPT. But on the other hand, ChatGPT has given people ways to do much, much worse when they use it wrong.
100% onboard with this
The word you're looking for is "war".
If you're stuck in a permanent war against an enemy you profoundly outclass militarily, economically, culturally, and politically, at a certain point you are responsible for the ongoing outcomes.
Israel has done other things, up to and including
I should be clear: I am no defender of the Palestinians; they are absolutely awful, insane, irrational neighbors. I am deeply thankful I live nowhere near a Muslim theocracy. While I rate the Israeli reconciliation attempts as "mediocre at best", they have tried; both sides are profoundly irrational at this point.
the best the Israelis can do
I think the answer to this is some flavor of Marshall Plan + perhaps a rather invasive CCP-style police state, to give young Palestinians a taste of (and a goal of) a better life while ensuring that the smallest possible % of GDP is turned into ballistic rockets. Also, things like "not constantly encroaching on the West Bank with settler communities" would probably help, as that rather calls into question the good-faith nature of one of the sides.
Thanks, I hate it.
I finally took the plunge and joined an art discord a couple months back, and VR chat is a big part of their social activity. I actually have an old VR rig I've never bothered setting up, and briefly considered joining in, but increasingly think it's better to leave it on the shelf.
Black hat SEO would have a mandatory death penalty.
- Depends on what you mean by "tech companies"; technically, unless you use a full-time VPN, at least your ISP has the full list of all websites you visit. Given that we have confirmed reports of dragnet surveillance installed at at least some major ISPs, you can assume the NSA (and whatever TLAs they share with) has the full log of these (storage paid for with our tax money, thank you very much!), though they probably don't check it unless you become a focus of their attention somehow.
- Google/Apple most definitely have this data, and likely they sell some of it and hand some of it over on a search warrant. The government can request it; the legality is somewhat debated, but it's legal at least in some cases, so you can assume that if the government wants it, it will have it. I don't think we have any info about the Feds keeping independent logs, but they wouldn't need to.
- Not likely, as it would be a direct violation of wiretapping laws AFAIK. Unless, of course, you got into enough trouble for The Law to get a wiretapping warrant on you. Though really, with all the rest of the NSA shenanigans, I wouldn't be totally surprised if they started doing it, but I haven't heard any indications of that happening yet.
- Not likely, since the traffic needed to record it all would be large enough for people to notice and start talking about it. It is plausible that there could be "keyword triggers" that record specific conversations and clandestinely ship them back to the phone/OS company (where the previous items apply), but full transcripts of every word would be hard to do without people noticing, and since we don't have, AFAIK, any good evidence of this right now, I'll tend to say no, at least in the form presented. They definitely could listen and update e.g. your advertising profile - that'd be very hard to catch without having enough access - though the longer we go without somebody Snowden-ing it out, the lower the probability that it is actually happening. If the NSA couldn't keep their secrets secret, why would Google or Apple be able to?
- In general, it all depends on a) what your threat model is and b) how interested the government is in you. For most normal people, the government is not interested in them unless they become the target of an investigation - which means they did something to trigger it, or somebody else pointed at them as somebody to watch. If that happens, pretty much any contact with modern society means you're screwed. Bank account? Forget about it. Driving? You'd better wear a mask and steal a car from somebody who doesn't mind their car being stolen. Communication? Burner phones would probably get you somewhere, but don't stay in the same place too long or use the same burner for too long. It's possible to live under the radar, but it's not convenient, and usually the people who do that have their own infrastructure (like drug traffickers); if you're going it alone, it will be tough for you. OTOH, if you're just a normie feeling icky about your data being stored in vast data silos, you can use some tools - VPNs, privacy-OS phones, etc. - with relatively minor inconvenience, and avoid being data-harvested. But it won't protect you if The Law becomes seriously interested in you.
The word you're looking for is "war".
The definition of insanity is doing the same thing over and over again expecting different results.
Israel has done other things, up to and including withdrawing both military and civilian populations from the area. Result: Gaza Palestinians elect Hamas to represent them and step up the attacks. The Palestinians are going to hate them and attack them no matter what; the best the Israelis can do (short of actual genocide) is degrade their ability to do so.
If the "ping" in "ping-ponging" is ethnic cleansing, would the "pong" be "ethnic dirtying"?
I just meant ping-ponging as having them shuffle back and forth across the strip, not a metaphor.
getting someone out temporarily because it's an active war zone and then bringing them back when it's safe is just good manners, not ethnic cleansing.
I agree, except with 2 caveats:
- I think shuffling a bunch of humans around an area as you bomb it into gravel, in an effort to wipe out an organization whose primary recruiting tool is the anger generated in humans being shuffled around an area as it's bombed into gravel, is equal parts evil and stupid. The definition of insanity is doing the same thing over and over again expecting different results.
- Given this area will never stop being an active war zone (only a more or less intense one), and frankly given neither side has any interest in making it not that, I feel like "being wildly unwilling to create a lasting peace" somewhat offsets the good-manners part of moving people around to keep them safe at the micro level (i.e. day to day or month to month), when at the macro level you have absolutely zero interest in them ever actually being safe.
Were the latter just suckers, to take such risks only to have critics ignore their existence?
Yes
Amusingly, the Chen Sheng story strikes me as the perfect summary of how the Gazans feel. It's not quite as extreme, but this is literally how Hamas has a never-ending string of young men signing up to get blown up.
"The moral of the story is that if you are maximally mean to innocent people, then eventually bad things will happen to you. First, because you have no room to punish people any more for actually hurting you. Second, because people will figure if they’re doomed anyway, they can at least get the consolation of feeling like they’re doing you some damage on their way down."
I'd say it's entirely applicable. "Without getting in too much trouble" is one of the two main throttling mechanisms on the tribal hatred engine, and it's tightly linked to the other one: the fact that the search is "distributed". The search being distributed reduces how much trouble individuals get in, and reduces the efficiency of the search because it's conducted in a less-conscious fashion. It's the part people miss when they naively predict the outbreak of civil war over the atrocity du jour.
Here's a gradient:
"X are bad" > "X shouldn't be tolerated" > "It's pretty cool when an X gets set on fire" > "you should set an X on fire" > "I'm going to set an X on fire" > actually going out and setting an X on fire.
You can graph the gradient in terms of actual harm inflicted on the outgroup against the danger of getting in trouble, or against the amount of trouble you'll get in. There's a sweet spot on the graph where you find the greatest harm inflicted for the least cost incurred. The Culture War consists of people, with various degrees of consciousness, searching both for that sweet spot and for changes to social conditions that make the sweet spot larger and sweeter. Increasing consciousness of the nature of the search increases search efficiency greatly. Being unaware of the mechanics of getting in trouble likewise increases the efficiency of the search, since even if you get in trouble, you still provide valuable data to the rest of the search nodes. Various coordinated actions, changes in social norms, or changes in formal policies likewise increase the efficiency of the search by asymmetrically reducing the threat of trouble being gotten into. Affirmative consent policies, DOGE, "who will kill Elon", and "are those level-4 plates?" are all variations on a theme.
Blue hostility toward the Church and Red hostility toward Academia are the same thing: coordinated meanness against an enemy tribal stronghold, moderated by the need to not individually get in too much trouble. The tribes successfully purge each other from their institutions, and then are shocked when the other side no longer values the institutions they've been purged from and begins reducing them with metaphorical bombardment.
...And for those who've read this far, this is your reminder that this process is not your friend. Our capacity to maintain flowing electricity and running water relies on the sweet spot staying quite small and the search remaining quite limited and stable over time.
Does he? Wouldn't surprise me, but I think we need weeb subject-matter experts to disambiguate on our behalf.
I'm reading the paper, but initial issues that caught my eye:
- They're not evaluating GPT-4. They're using 4o. The exact implementation details of 4o are still opaque - it might be a distill of 4 - but they're not the same model. As far as I can tell, that's a point of confusion on your part, not the authors'.
- 4o, even at the time of publication, was not the best model available. Very far from it. It is a decent generalist model, but not SOTA. It isn't the best model, the second-best model, or even the third-best model, on almost any metric one opts for.
I have, as far as I'm aware, never claimed that LLMs match or outperform professional human translators. My core contention was that even basic bitch LLMs are good enough, and an improvement over previous SOTA, including DeepL, which this paper supports.
This would hold true even if the authors had chosen to use something like o3, Opus 4, Gemini 2.5 Pro, etc. It is unclear to me whether they were aware that better options were available; there's little reason to use 4o if one wants to know what the best possible output is.
And even if it is true, it doesn't matter. The models are consistently getting better. We have a new SOTA every few months these days.
Amusingly, we got sucked into bickering about definitions, which I had actually hoped to avoid by using "ethnic cleansing" over the much more volatile "genocide". Admittedly, I opened up pretty flippantly, so maybe that was the wrong word, although it felt amusing and volatile in the moment.
My core thesis is that shuffling a bunch of humans around an area as you bomb it into gravel, in an effort to wipe out an organization whose primary recruiting tool is the anger generated in humans being shuffled around an area as it's bombed into gravel, is equal parts evil and stupid. I don't really care what we call it at the end of the day; there is a metric shitload of human suffering happening, much of which is being deliberately and callously applied.
Gaza is one of the most efficient generators of human suffering I've ever been made aware of. The definition of insanity is doing the same thing over and over again expecting different results.
CBT and DBT have excellent evidence bases for instance and are meant to be highly structured with clear end points. We also have a pretty good understanding of what patients and situations should use each of those therapy modalities.
What is a good way to learn more about our understanding of best practices of when to apply which flavor of therapy?
VRChat (and most other social virtual-reality worlds) lets people choose an avatar. At the novice level, these avatars just track the camera position and orientation, provide a walking animation, and have a limited number of preset emotes, but there's a small but growing industry for extending that connection. Multiple IMUs and/or camera tricks can track limbs, and more dedicated users have tools for face, eye, and hand tracking. These can allow the avatar's general pose (sometimes down to finger motions) to match that of the real-world person driving it, sometimes with complex modeling going on where an avatar has to represent body parts that the person driving it doesn't have.
While you can go into third-person mode to evaluate how well these pose estimates are working in some circumstances, that's impractical for a lot of in-game use, both for motion-sickness reasons and because it's often disruptive. So most VRChat social worlds will have at least one virtual mirror, usually at least an eight-foot-tall-by-sixteen-foot-wide surface, very prominently placed, to check things like IMU drift.
Some people like these mirrors. Really like them. Like, spend-hours-in-front-of-them-and-then-go-to-sleep-while-still-in-VR levels of like them. This can sometimes be a social thing, where groups will sit in front of a mirror and have discussions together, or sometimes one person will constantly watch the mirror while everyone else is doing their own goofy stuff. But they're the mirror dwellers.
I'm not absolutely sure whatever's going on with them is bad, but it's definitely a break in behavior that was not really available ten years ago.
I had considered adding the caveat of "I am happy to do my own research, but if you have any pointers of where to start that would be much appreciated" but then got distracted and just clicked "Comment".
I appreciate your thoughtful answer