
Computationally, maybe all we are is Markov chains. I'm not sold, but Markov chat bots have been around for a few decades now and used to fool people occasionally even at smaller scales.
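For anyone who hasn't seen one up close, the classic word-level Markov chat bot fits in a couple dozen lines. A minimal sketch, where the corpus, the order-1 chain, and the `babble` helper are all made up for illustration:

```python
import random
from collections import defaultdict

# Toy corpus; a real bot would ingest chat logs.
corpus = "the cat sat on the mat and the dog sat on the rug".split()

# Order-1 chain: for each word, remember every word observed right after it.
chain = defaultdict(list)
for prev, nxt in zip(corpus, corpus[1:]):
    chain[prev].append(nxt)

def babble(start="the", length=10):
    word, out = start, [start]
    for _ in range(length - 1):
        followers = chain.get(word)
        if not followers:              # dead end: no observed successor
            break
        word = random.choice(followers)  # sample the next word from observed successors
        out.append(word)
    return " ".join(out)

print(babble())  # e.g. "the dog sat on the mat and the cat sat"
```

Scale the corpus up enough and output like this starts to pass for small talk, which is the whole trick.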

LLMs can do pretty impressive things, but I haven't seen convincing evidence that any of them have stepped clearly outside the bounds of their training dataset. In part that's hard to evaluate because we've been training them on everything we can find. Can a LLM trained on purely pre-Einstein sources adequately discuss relativity? A human can be well versed in lots of things with substantially less training material.

I still don't think we have a good model for what intelligence is. Some have recently suggested "compression", which is interesting from an information theory perspective. But I won't be surprised to find that whatever it is, it's actually an NP-hard problem in the perfect case, and everything else is just heuristics and approximations trying to be close. In some ways it'd be amusing if it turns out to be a good application of quantum computing.

I suspect if you know a PsyD or other actual psychotherapist they might have more helpful advice, but my quick lit review didn't turn up anything useful.

I do generally suggest that everyone in medicine read Nancy McWilliams' Psychoanalytic Diagnosis for an understanding of personality structure, since it has broad application to life, and general medicine still needs to know how to deal with personality dysfunction.

Some of the chapters are still fun to read with zero background (ex: Anti-Social).

It won't answer your specific question directly but will provide a lot of context and peck at it a bit.

Oh oops, I misread your comment; I thought you said that 4o was not SOTA when it was released. Yes, it was obsolete when the paper came out.

LLMs only get better, they're "good enough", and that this is a net improvement over the status quo.

Won't change the fact that people who use them wrong will still do worse than not using an LLM at all.

Louise! Louise! Louise! Louiseeeee!!! Aaaah... aah... ah, ahh! Aaaaaah!!! Louise Louise Louiseeeee!!! Aah, sniff sniff! Sniff sniff! Hff hff! Hff hff! What a lovely smell... sniff sniff. Nhaa! I want to sniff Louise Françoise-tan's pink-blonde hair! Sniff sniff! Aaah!! No wait, I got that wrong! I want to fluff it! Fluff fluff! Fluff fluff! Her hair, fluff fluff! Scritch scritch, fluff fluff... kyun kyun kyui!! Louise-tan in volume 10 of the novels was so cute!! Aaaah... aaah... ah, aaaah!! Fwaaaahn!! I'm so glad you got a second anime season, Louise-tan! Aaaaah! So cute! Louise-tan! So cute! Ah, aaah! And volume 1 of the comic is out, I'm so happ... noooooo!!! Nyaaaaaan!!! Gyaaaaaah!!! Guaaaaaaah!!! A comic isn't reality!!!! Ah... and come to think of it, neither are the novels or the anime... L o u i s e - c h a n   i s n ' t   r e a l ? Nyaaaaaaaan!!! Uwaaaaaaah!!! Noooooo!!! Iyaaaaaaaah!!! Haaaaaahn!!! Halkeginiaaaaa!!! Damn it all! I quit! I'm done with reali... ty... huh!? She's... looking? The Louise-chan on the cover art is looking at me? The Louise-chan on the cover art is looking at me! Louise-chan is looking at me! The Louise-chan in the illustrations is looking at me!! The Louise-chan in the anime is talking to me!!! Thank goodness... the world isn't such a bad place after all! Yahooooo!!! I have Louise-chan!! I did it, Ketty!! I can do it all by myself!!! Ah, comic Louise-chaaaaaan!! Iyaaaaaaaaaah!!!! Ahh, ahn, aahn, An-samaaa!! S-Saber!! Shanaaaaaa!!! Wilhelminaaaaa!!! Uu, uuuu!! My feelings, reach Louise!! Reach Louise in Halkeginia! The truth is, I'm actually a Saito fan!!

It is trivially an army: it was originally the Army Air Forces and was only separated from the Army for bureaucratic convenience.

The paper seems to have been published in April 2025.

Gemini 2.0 Pro and Claude 3.7 Sonnet came out in February 2025. Claude 3.5 Sonnet came out in June 2024 and was better than the version of 4o available then.

At the very least, the authors should have made a note that they weren't using the SOTA, or that the SOTA would have moved significantly by the time of publication. To do less is mild dishonesty. This isn't 2022; the pace of progress is evident.

4o is also what powers chatgpt.com, so it's the model most casual users will get output from.

True, but that's OAI being cheap, and not an indictment of the utility of LLMs for translation. It's akin to claiming TVs suck, and then only using a cheap and cheerful $300 model from Walmart as the standard.

My criticisms stand, namely that LLMs only get better, they're "good enough", and that this is a net improvement over the status quo. It remains to be seen how much better the SOTA is than 4o or DeepL.

It's always possible OpenPhil is actually bad at its stated mission for whatever reason, including design flaws.

OpenPhil might be the 800-pound gorilla funding EA, but it is useful to remember that OpenPhil is not particularly EA.

Scott has addressed this kind of thing--how much altruism is mandated or what is sufficiently pure--multiple times.

While in the past Scott has written about the yoke being easy and the burden light, he went on to donate a kidney and wrote that one should keep climbing the tower. I am skeptical that his past writings addressing the questions of purity are, uh, pure.

I actually opt into a service with Google where they track where I am at pretty much all times through my phone. I can go to a dashboard and follow myself through the past going back to when I first opted in. I assume they do this for everyone and I'm only opting into the tools to see the data myself. My wife can also see where I am at any given time, which is also intentional. I have issues with my health and get holes in my memory; I've needed others to be able to locate me before when I'm not well.

Usually I unhelpfully reply "do a lit review!!!" to these sorts of questions

I had considered adding the caveat of "I am happy to do my own research, but if you have any pointers of where to start that would be much appreciated" but then got distracted and just clicked "Comment".

I appreciate your thoughtful answer

My immediate social circle and the benefit of social media allowing me to keep some distant tabs on people from high school and college. Seeing a good number of women I thought had good heads on their shoulders go off some deep end and regress to behaviors I recognize from when they were younger.

I'm not trying to dismantle your argument, as I think you made it well. But I do want to point out that, at least in my circles, there's a strong correlation between "actively using social media" and "not having your shit together". In other words, if your sample is just social media, then you're missing out on all the well-adjusted individuals who are keeping to themselves.

And partially through my job where I interact with people of many ages, and one of the more common and frustrating genres of people I encounter is "neurotic woman in her 40s or 50s who still has the demeanor of a teenager."

Do you work in inside sales? Mostly making a lighthearted joke here. Or maybe healthcare or aviation?

makes a good case for where Israel has gone wrong

Could you quickly summarize that part? There's no way I am going to read this book, but I am curious enough to hear the summary.

Huh. I was confident that I had a better writeup about why "stochastic parrots" are a laughable idea, at least as a description for LLMs. But no, after getting a minor headache figuring out the search operators here, it turns out that's all I've written on the topic.

I guess I never bothered because it's a Gary Marcus-tier critique, and anyone using it loses about 20 IQ points in my estimation.

But I guess now is as good a time as any? In short, it is a pithy, evocative critique that makes no sense.

LLMs are not inherently stochastic. They have a setting called temperature (not usually exposed to the end user except via API). Without going into how that works, suffice it to say that by setting the value to zero, their output becomes deterministic: the exact same prompt gives the exact same output.

The reason temperature isn't just set to zero all the time is that the ability to choose something other than the single most likely next token has benefits for creativity. At the very least, it saves you from getting stuck with the same subpar result every time.
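For the curious, here's a minimal sketch of what temperature does at the sampling step, with toy logits rather than any particular vendor's API: dividing the logits by T before the softmax sharpens or flattens the distribution, and at T = 0 you just take the argmax, which is why the output becomes deterministic.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_token(logits, temperature=1.0):
    """Pick a token index from raw logits at a given temperature."""
    logits = np.asarray(logits, dtype=float)
    if temperature == 0.0:
        return int(np.argmax(logits))          # greedy: fully deterministic
    scaled = logits / temperature              # T < 1 sharpens, T > 1 flattens
    probs = np.exp(scaled - scaled.max())      # subtract max for numerical stability
    probs /= probs.sum()
    return int(rng.choice(len(probs), p=probs))

logits = [2.0, 1.0, 0.5]                       # toy scores for a 3-token vocab
print(sample_token(logits, temperature=0.0))   # always 0
print(sample_token(logits, temperature=1.0))   # usually 0, sometimes 1 or 2
```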

So, strictly speaking, LLMs aren't stochastic parrots. Minus the stochasticity, are they just "parrots"? Anyone thinking this is on crack, since Polly won't debug your Python no matter how many crackers you feed her.

If LLMs were merely interpolating between memorized n-grams or "stitching together" text, their performance would be bounded by the literal contents of their training data. They would excel at retrieving facts and mimicking styles present in the corpus, but would fail catastrophically at any task requiring genuine abstraction or generalization to novel domains. This is not what we observe.

Let’s get specific. The “parrot” model implies the following:

  1. LLMs can only repeat (paraphrase, interpolate, or permute) what they have seen.

  2. They lack generalization, abstraction, or true reasoning.

  3. They are, in essence, Markov chains on steroids.

To disprove any of those claims, just *gestures angrily* look at the things they can do. If winning gold in the latest IMO is something a "stochastic parrot" can pull off, then, well, the only valid takeaway is that the damn parrot is smarter than we thought. Definitely smarter than the people who use the phrase unironically.

The inventors of the phrase, Bender & Koller gave two toy “gotchas” that they claimed no pure language model could ever solve: (1) a short vignette about a bear chasing a hiker, and (2) the spelled-out arithmetic prompt “Three plus five equals”. GPT-3 solved both within a year. The response? Crickets, followed by goal-post shifting: “Well, it must have memorized those exact patterns.” But the bear prompt isn’t in any training set at scale, and GPT-3 could generalize the schema to new animals, new hazards, and new resolutions. Memorization is a finite resource but generalization is not.

(I hope everyone here recalls that GPT-3 is ancient now)

On point 2: Consider the IMO example. Or better yet, come up with a rigorous definition of reasoning by which we can differentiate a human from an LLM. It's all word games, or word salad.

On 3: Just a few weeks back, I was trying to better understand the actual difference between a Markov chain and an LLM, and I asked o3 whether it wasn't possible to approximate the latter with the former. After all, I wondered, if MCs only consider the previous unit (usually a word, or a few words/an n-gram), couldn't we just train the MC to output the next word conditioned on every word that came before? The answer was yes, but that this would be completely computationally intractable. The fact that we can run LLMs on something smaller than a Matrioshka brain is due to their autoregressive nature and the brilliance of the transformer architecture/attention mechanism.
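The intractability is easy to see with napkin math; the vocabulary size and context length below are illustrative round numbers, not any specific model's:

```python
# A lookup-table "Markov chain" conditioning on the full context needs one row
# per possible context: vocab_size ** context_length of them.
vocab_size = 50_000       # illustrative, roughly GPT-3's token vocabulary
context_length = 2_048    # illustrative, a modest context window

rows = vocab_size ** context_length             # Python big ints handle this fine
print(f"table rows ~ 10^{len(str(rows)) - 1}")  # ~10^9623; the universe has ~10^80 atoms

# A transformer instead *computes* the conditional distribution with a fixed
# parameter budget shared across all contexts (e.g. ~1.75e11 weights for GPT-3).
```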

Overall, even the steelman interpretation of the parrot analogy is only as helpful as this meme, which I have helpfully appended below. It is a bankrupt notion, a thought-terminating cliché at best, and I wouldn't cry if anyone using it met a tiger outside the confines of a cage.

/images/17544215520465958.webp

I was kinda put off by the villainy in Player of Games. It would be nicer if their "extreme meritocracy through McGuffin" concept had been addressed on its merits; instead Banks just goes for "but akshually they are all liars and don't do what they profess at all, and instead just do evil things and hypocritically hide it". This is easy: of course people who use plausible-sounding concepts to hide being bad are actually bad, especially if the author demonstrates to us that they are bad and then asks "aren't the people I just showed you being bad actually bad?!" Of course they are, you wrote them this way, what did you expect! This just feels lazy to me. I like my villains to be a bit more chewy, to require at least some work to figure out why their position - in which they see themselves as righteous - is untenable, or at least unacceptable to me. Even the Ferengi got fairer treatment than that (remember, they reached a pretty high level of societal development without any wars or atrocities like slavery; for an obvious caricature, that's a pretty decent achievement).

Usually I unhelpfully reply "do a lit review!!!" to these sorts of questions, but after a quick look myself I don't think it would be that easy - "become an expert in therapy" is probably more accurate, but is about as unhelpful as it is predictable.

The challenging bit is that therapy (especially CBT) is "indicated" for about everything but that doesn't tell you which types of patients will benefit most from which types.

I'm not an expert in this by any means.

It is worth noting that "real" therapy (or many types of popular therapy) is often less ooey-gooey emotional exploration and more resembles Socratic questioning or an outright class (in the case of CBT, which is driven by "homework").

I do have a family member who is not in psychology or psychiatry (or medicine) who listens to psychiatric podcasts, and a few of them dig into this explicitly; you could probably do that if you really wanted to develop a knowledge base.

Some modalities are more specific, however: DBT is for Borderline Personality Disorder and people struggling with cluster-B coping mechanisms as part of their pathology. It can work quite well for this.

Classically (especially for any U.S. medical students reading!), the answer to any board question at the Shelf or Step 1-3 level that includes CBT as an option is going to be CBT, unless it's DBT for Borderline.

Thought #1: Incredible machine translation from Claude. 4o interpolates a little that's not in the actual text ("sexy kind of heaven") and does an iffy literal translation for "peaceful moment"; "blissful moment" is a better fit.

Thought #2: Ban LLMs. They will allow comments like this to be translated to English.

Late to the party I started, but spending money to incentivize a change in outcomes is, in my opinion, categorically different from legally enforcing those outcomes, and the latter is what I interpret to be the modern form of "DEI" that most people (especially on this forum) rail against.

i.e. if you want more women in leadership roles (regardless of motivation):

  • Spend $1,000,000 as e.g. scholarships to women to help them acquire credentials seen as barriers to leadership roles
  • Pass a law that every board needs to have at least one woman

the former is not strictly DEI imo, whereas the latter is.

If your position is that we do not as a society need to incentivize any change in outcomes (e.g. because you believe we're already perfectly egalitarian), then fine. But to paint it as DEI is imo aggressively retroactive, because the West has a century of history of programs that attempt to bring about positive social change through funding, while the phrase DEI only recently came into the lexicon.

What's your sample?

My immediate social circle and the benefit of social media allowing me to keep some distant tabs on people from high school and college. Seeing a good number of women I thought had good heads on their shoulders go off some deep end and regress to behaviors I recognize from when they were younger. Also including my Ex.

And partially through my job where I interact with people of many ages, and one of the more common and frustrating genres of people I encounter is "neurotic woman in her 40s or 50s who still has the demeanor of a teenager."

Dating has not done a lot to change that perception. I get the sense that women either mature quite fast (usually when they have good parental examples) and are generally self-sufficient by age 22-24... or they hit 25 and, if they haven't gotten their mental house in order by then, it just isn't likely to improve from there. There's not likely to be a 'flash of realization' where they renounce their prior behavior and suddenly start 'acting their age.'

I keep making this point, but so many of the people who end up on Caleb Hammer's show are women who are absolutely, GOBSMACKINGLY bad with finances. Which is a decent metric for maturity if you ask me. Oh, I'm sure tons of men are in dire straits too, but ain't nobody validating their choices.

This sounds like a character problem, not an estrogen problem. I've met plenty of bitter men who never learn from their bad experiences.

Yes, I concede the point that many men never reach actual maturity. But a 'character problem' can indeed be an estrogen problem.

I would suggest that a combination of hormones (keeping in mind that estrogen levels both too high and too low can have a huge impact on mood) and a general lack of restraint/correction of maladaptive behaviors in women results in 'stunted' maturity even as they approach thirty. And there's a nontrivial number of young women taking hormonal birth control in their teens and twenties, which can exacerbate the hormone thing.

Then add in that mental disorders, especially anxiety/depression, have spiked particularly badly among young women. And as a result young women are increasingly prescribed antidepressants.

This probably exacerbates the hormone issue above. I am highly suspicious of what happens to brain development when the brain is awash in a combination of exogenous hormones (birth control) and SSRIs and similar drugs for the entirety of one's young adulthood.

I dunno man, I get the sense that women are having an increasing amount of trouble coping with the world-as-it-is. That is, they have bad experiences, and rather than process and learn from them... they use pharma drugs to cope. And they become bitter.

I think men have issues like this too, but they don't tend to go on social media and scream it from the rooftops, so it is harder to see. If it gets bad enough, they tend to kill themselves. Less seriously, they may withdraw from society (or society discards them as useless), go to prison if they lash out, or become an Andrew Tate acolyte or something.

I am prepared to believe that this will be less prevalent among higher SES demographics.

I have some friends in this category. They’re miserable.

Same, all the more so because if and when they do manage to find someone who ostensibly wants a strait-laced monogamous relationship, the dominant gay culture is constantly shoving extra-relationship sex in their faces, leading to rampant cheating and drama and relationship blow-ups/divorces.

With high intelligence being one of the key ingredients to make for better leadership of groups and societies

I would like to challenge this. While obviously we don't want the leaders to be idiots, I am not sure I would prefer a 150 IQ psychopath to a 120 IQ kind and moral person as a leader. To me, the main role of the leader is to set goals, make choices, and keep the group from descending into chaos, and I am not sure sky-high personal IQ is the best way to get that. Maybe some other qualities - which I am not ready to define, but could tentatively call "not being an evil asshole"? - are at least as important? I do not doubt we need to require that the leader be smarter than average - but I think there's a point of diminishing returns where pure IQ power stops being the thing that matters. I don't know where this point is, but I think the premise "the more IQ, the better the leader, no exceptions" needs at least to be seriously questioned.

This of course is complicated by the fact that a lot of our current leaders are, to be blunt, psychopaths or borderline psychopaths. Some of them aren't that high IQ either, to be honest. So we're not doing great in this area, and we only have ourselves, as a society, to blame for that. I'm not going to name names here because everybody would have their own examples depending on their political and personal proclivities, but there are enough examples for everybody. But if we want to do better, sometime in the future, I am not sure "higher IQ score" is the metric we need to concentrate all our efforts on. I have seen enough high-IQ assholes to make this questionable for me.

They're not evaluating GPT-4. They're using 4o.

4o vs GPT-4 is my mistake, but GPT-4 is generally considered obsolete and nobody uses it. It's true that 4o is a mixed bag and underperforms GPT-4 in some respects, but we have no reason to believe that it's significantly worse than GPT-4 at translation.

4o is also what powers chatgpt.com, so it's the model most casual users will get output from.

4o, even at the time of publication, was not the best model available.

4o was released well before gemini 2.0 or claude 3.5, so it likely was the best model at the time, along with the original gpt-4. I agree that right now 4o is not good.

My core contention was that even basic bitch LLMs are good enough

My core contention is that DeepL is good enough, as it's within spitting distance of ChatGPT. But on the other hand, ChatGPT has given people ways to do much, much worse when they use it wrong.

100% onboard with this

The word you're looking for is "war".

If you're stuck in a permanent war against an enemy you profoundly outclass militarily, economically, culturally, and politically, at a certain point you are responsible for the ongoing outcomes.

Israel has done other things, up to and including

I should be clear: I am no defender of the Palestinians, they are absolutely awful, insane, irrational neighbors. I am deeply thankful I live nowhere near a Muslim theocracy. While I rate the Israeli reconciliation attempts as "mediocre at best", they have tried; both sides are profoundly irrational at this point.

the best the Israelis can do

I think the answer to this is some flavor of Marshall Plan + perhaps a rather invasive CCP-style police state, to give young Palestinians a taste of (and goal of) a better life while ensuring that the smallest possible % of GDP is turned into ballistic rockets. Also, things like "not constantly encroaching on the West Bank with settler communities" would probably help, as that rather calls into question the good-faith nature of one of the sides.

Thanks, I hate it.

I finally took the plunge and joined an art Discord a couple months back, and VRChat is a big part of their social activity. I actually have an old VR rig I've never bothered setting up, and briefly considered joining in, but increasingly think it's better to leave it on the shelf.

Black hat SEO would have a mandatory death penalty.

  1. Depends on what you mean by "tech companies"; technically, unless you use a full-time VPN, at least your ISP has the full list of all websites you visit. Given that we have confirmed reports of dragnet surveillance installed at at least some major ISPs, you can assume the NSA (and whatever TLAs they share with) has the full log of these (storage paid for with our tax money, thank you very much!), though they probably don't check it unless you become a focus of their attention somehow.

  2. Google/Apple most definitely have this data, and likely they sell some of it and hand some of it over on a search warrant. The government can request it; the legality is somewhat debated, but it's legal at least in some cases, so you can assume that if the government wants it, it will have it. I don't think we have any info about the Feds keeping independent logs, but they wouldn't need to.

  3. Not likely, as it would be a direct violation of wiretapping laws AFAIK. Unless, of course, you got into enough trouble for The Law to be able to get a wiretapping warrant on you. Though really, with all the rest of the NSA shenanigans, I wouldn't be totally surprised if they started doing it, but I haven't heard any indications of that happening yet.

  4. Not likely, since the traffic needed to record it all would be large enough for people to notice and start talking about it. It is plausible that there could be "keyword triggers" that record specific conversations and clandestinely ship them back to the phone/OS company (where the previous items apply), but full transcripts of every word would be hard to pull off without people noticing, and since we don't have, AFAIK, any good evidence of this right now, I'll tend to say no, at least in the form presented. They definitely could listen and update e.g. your advertisement profile - that'd be very hard to catch without having enough access - though the longer we go without somebody Snowden-ing it out, the lower the probability that it's actually happening. If the NSA couldn't keep their secrets secret, why would Google or Apple be able to?

  5. In general, it all depends on a) what is your threat model and b) how interested the government is in you. For most normal people, the government is not interested in them unless they become a target of the investigation - which means they did something to trigger it, or somebody else pointed at them as somebody to watch. If that happened, pretty much any contact with modern society means you're screwed. Bank account? Forget about it. Driving? You better wear a mask and steal a car from somebody that doesn't mind they car being stolen. Communication? Burner phones probably would get you somewhere but don't stay in the same place too long or use the same burner for too long. It's possible to live under the radar, but it's not convenient and usually people that do that have their own infrastructure (like drug traffickers) and if you're going into it alone, it will be tough for you. OTOH, if you're just a normie feeling icky about your data being stored at the vast data silos, you can use some tools - like VPNs, privacy OS phones, etc. - with relatively minor inconvenience, and avoid being data-harvested. But it wouldn't protect you if The Law becomes seriously interested in you.