
This weekly roundup thread is intended for all culture war posts. 'Culture war' is vaguely defined, but it basically means controversial issues that fall along set tribal lines. Arguments over culture war issues generate a lot of heat and little light, and few deeply entrenched people ever change their minds. This thread is for voicing opinions and analyzing the state of the discussion while trying to optimize for light over heat.

Optimistically, we think that engaging with people you disagree with is worth your time, and so is being nice! Pessimistically, there are many dynamics that can lead discussions on Culture War topics to become unproductive. There's a human tendency to divide along tribal lines, praising your ingroup and vilifying your outgroup - and if you think you find it easy to criticize your ingroup, then it may be that your outgroup is not who you think it is. Extremists with opposing positions can feed off each other, highlighting each other's worst points to justify their own angry rhetoric, which becomes in turn a new example of bad behavior for the other side to highlight.

We would like to avoid these negative dynamics. Accordingly, we ask that you do not use this thread for waging the Culture War. Examples of waging the Culture War:

  • Shaming.

  • Attempting to 'build consensus' or enforce ideological conformity.

  • Making sweeping generalizations to vilify a group you dislike.

  • Recruiting for a cause.

  • Posting links that could be summarized as 'Boo outgroup!' Basically, if your content is 'Can you believe what Those People did this week?' then you should either refrain from posting, or do some very patient work to contextualize and/or steel-man the relevant viewpoint.

In general, you should argue to understand, not to win. This thread is not territory to be claimed by one group or another; indeed, the aim is to have many different viewpoints represented here. Thus, we also ask that you follow some guidelines:

  • Speak plainly. Avoid sarcasm and mockery. When disagreeing with someone, state your objections explicitly.

  • Be as precise and charitable as you can. Don't paraphrase unflatteringly.

  • Don't imply that someone said something they did not say, even if you think it follows from what they said.

  • Write like everyone is reading and you want them to be included in the discussion.

On an ad hoc basis, the mods will try to compile a list of the best posts/comments from the previous week, posted in Quality Contribution threads and archived at /r/TheThread. You may nominate a comment for this list by clicking on 'report' at the bottom of the post and typing 'Actually a quality contribution' as the report reason.

The Wednesday Wellness threads are meant to encourage users to ask for and provide advice and motivation to improve their lives. It isn't intended as a 'containment thread' and any content which could go here could instead be posted in its own thread. You could post:

  • Requests for advice and/or encouragement, on basically any topic and for any scale of problem.

  • Updates to let us know how you are doing. This provides valuable feedback on past advice / encouragement and will hopefully make people feel a little more motivated to follow through. If you want to be reminded to post your update, see the post titled 'update reminders', below.

  • Advice. This can be in response to a request for advice or just something that you think could be generally useful for many people here.

  • Encouragement. Probably best directed at specific users, but if you feel like just encouraging people in general, I don't think anyone is going to object. I don't think I really need to say this, but just to be clear: encouragement should have a generally positive tone and not shame people (if people feel that shame might be an effective tool for motivating people, please discuss this so we can form a group consensus on how to use it rather than just trying it).

3

This thread is for anyone working on personal projects to share their progress, and hold themselves somewhat accountable to a group of peers.

Post your project, your progress from last week, and what you hope to accomplish this week.

If you want to be pinged with a reminder asking about your project, let me know, and I'll harass you each week until you cancel the service.

21

Hey folks, there's a space on X where people are doing live reactions for the Starship launch this morning. Come join if you're curious.

-3

Let's chat about the National Football League. This week's schedule (all times Eastern):

Thu 2024-10-24 8:15PM Minnesota Vikings @ Los Angeles Rams
Sun 2024-10-27 1:00PM Atlanta Falcons @ Tampa Bay Buccaneers
Sun 2024-10-27 1:00PM Chicago Bears @ Washington Commanders
Sun 2024-10-27 1:00PM Indianapolis Colts @ Houston Texans
Sun 2024-10-27 1:00PM Arizona Cardinals @ Miami Dolphins
Sun 2024-10-27 1:00PM Green Bay Packers @ Jacksonville Jaguars
Sun 2024-10-27 1:00PM New York Jets @ New England Patriots
Sun 2024-10-27 1:00PM Tennessee Titans @ Detroit Lions
Sun 2024-10-27 1:00PM Baltimore Ravens @ Cleveland Browns
Sun 2024-10-27 4:05PM Buffalo Bills @ Seattle Seahawks
Sun 2024-10-27 4:05PM New Orleans Saints @ Los Angeles Chargers
Sun 2024-10-27 4:25PM Carolina Panthers @ Denver Broncos
Sun 2024-10-27 4:25PM Kansas City Chiefs @ Las Vegas Raiders
Sun 2024-10-27 4:25PM Philadelphia Eagles @ Cincinnati Bengals
Sun 2024-10-27 8:20PM Dallas Cowboys @ San Francisco 49ers
Mon 2024-10-28 8:15PM New York Giants @ Pittsburgh Steelers

Do you have a dumb question that you're kind of embarrassed to ask in the main thread? Is there something you're just not sure about?

This is your opportunity to ask questions. No question too simple or too silly.

Culture war topics are accepted, and proposals for a better intro post are appreciated.

31

I've been thinking about conflict vs mistake theory lately, especially since the events of October 2023 in Israel.

I've been particularly trying to understand where support for Palestine (and, implicitly or not, Hamas) comes from. Much has already been written about this, of course, whether it's the bigotry of small differences, the trap of "oppressor/oppressed" thinking, the hierarchy of oppression, and so on.

What I found striking, and want to discuss here, is the strain of thought that responds to "how can LGBT+ people support Palestine?" with declarations like this one from Reddit:

It's easier to focus on getting gay rights when you're not being genocided.

Or from a longer piece:

The interviewer asks him, “What’s your response to people who say that you’re not safe in Palestine as a queer person?” Dabbagh responded, “First and foremost, I would go to Palestine in a heartbeat. I have no fear. I love my people and my people love me. And I want to be there and be part of the movement that ends up leading to queer liberation for liberated Palestinian people. If you feel that such violence exists for queer people in the Middle East, what are you doing to change that for that community? The first step is the liberation of Palestine.”

I don't claim it's the most common strain of thinking, but to me this largely cashes out as "they are homophobic because of oppression/imperialism/Jews." As an aside, contrast with the way "economic anxiety" plays out in the US.

The part I want to focus on is this kind of blend of mistake and conflict theory -- there's conflict, yes, but it has a cause which can be addressed and then we'll all be on the same side. I'm skeptical of this blend, which seems to essentially just be false consciousness: if not for an external force you would see our interests align.

I think this mode of thinking is becoming increasingly popular, however, and I want to point to the two most recent video games I put serious time into (but didn't finish) as examples: Baldur's Gate 3 and Unicorn Overlord (minorish spoilers ahead).


[Again, minorish spoilers for Unicorn Overlord and Baldur's Gate 3 ahead]

Baldur's Gate 3 was part of a larger "vibe shift" in DnD which I won't get into here except to say I think a lot of it is misguided. Nevertheless, there are two major examples of the above:

The githyanki are a martial, fascist-seeming society of generally aggressive, powerful assholes. A major character arc for one of your githyanki team members, however, is learning that she had been brainwashed and fed lies not just about the leader of the society and her goals, but also about the basic functioning of the society. For instance, a much-discussed cure for a serious medical condition turns out to be glorious euthanasia.

The Gith have been impressed with a false consciousness, you see, and your conflict with them is largely based on a misunderstanding of the facts.

More egregious is the character Omeluum, whom you meet early in the adventure. Omeluum is a "mind flayer" or "illithid":

Mind flayers are psionic aberrations with a humanoid-like figure and a tentacled head that communicate using telepathy. They feast on the brains of intelligent beings and can enthrall other creatures to their will.

But you see, even these creatures turn out to be victims of false consciousness--Omeluum is a mind flayer who has escaped the mind control of the "Elder Brain." After fleeing, he happily "joined the good guys." You might think it's an issue that his biology requires he consume conscious brains, but fortunately he only feeds

on the brains of creatures of the Underdark 'that oppose the Society's goals', and wishes to help others of his kind by discovering a brain-free diet.

In the world of DnD (which has consciously been made to increasingly mimic our own world with mixed results), it seems that but for a few bad actors we could all get along in harmony.

Anecdotally, the last time I ran a DnD campaign it eventually devolved into the party trying to "get to the root" of every conflict, whether that was insisting on finding a way to get goblins to stop killing travelers by negotiating a protection deal with the nearby village which served both, or trying to talk every single cultist out of being a cult member. I'm all for creative solutions, but I found it got pretty tedious after a while.


The other game, Unicorn Overlord, is even more striking, albeit a little simpler. Unicorn Overlord is a (very enjoyable) strategy game where you slowly build up an army to overthrow the evil overlord. What you quickly discover, however, is that almost without exception every follower of the evil overlord is literally mind-controlled. The main gameplay cycle involves fighting a lieutenant's army, then using your magical ring to undo the mind control. Afterward, the lieutenant is invariably horrified and joins your righteous cause.

I should note this is far from unusual in this genre, which requires fights but also wants team-ups. It's a lot like Marvel movies, which come up with reasons for heroes to fight each other and then team up: a misunderstanding, say, or even mind control. Wargroove was especially bad at this: you would encounter a new friendly and say something like "Hello, a fine field for cattle, no?" but the wind is strong or something, so they hear "Hello, a fine field for battle, no?" and then you fight. Even so, in Unicorn Overlord mind control is almost the only explanation ever offered.


Funnily enough, I think in these and other examples this is seen as "adding nuance," but I find it ultimately as childish as a mustache-twirling cartoon villain. The villain is still needed, in fact (Imperialists, the Evil Overlord, the Elder Brain, the Queen of the Gith), but it's easier to explain away one Evil person who controls everything than to try to account for conflict at scale.

Taken altogether, I can't help but think these are all symptoms of the same thing: struggling to explain conflict. The "false consciousness" explanation is powerful, but it seems able to explain anything about people's behavior -- which is precisely what should make us suspicious of it.

My suspicion is that mistakes and genuine conflict can both occur, but this blended approach leaves something to be desired. I had an idea a while ago for a potential plot twist in Unicorn Overlord: it's revealed that you aren't freeing anyone -- you're simply bringing them under your own control without noticing. That feels a bit like the fantasy all of this is getting at: I have my views because of Reasons or Ethics or Whatever, and you would agree with me if not for the Factor I'm Immune To.

Transnational Thursday is a thread for people to discuss international news, foreign policy or international relations history. Feel free as well to drop in with coverage of countries you’re interested in, talk about ongoing dynamics like the wars in Israel or Ukraine, or even just whatever you’re reading.

25

For much of my life, people who heard bits and pieces of my biography would say, "You should write a book!" So perhaps, finally, I begin to.

Here's the elevator pitch:

I'm an American who came of age outside America, a soldier from a pacifist family, an atheist from a faith-healing cult in Indiana. An intellectually pretentious infantry sergeant. A middle-class dilettante among rough soldiers, a semi-retired middle-aged house-husband with a phone full of cat pictures. A pot-smoking gamer and master-class pistol shot. Hunter, fisherman, amateur home cook. Good with kids and animals, bad with women.

As a short and non-exhaustive list: I've been a missionary, translator, manual laborer, martial artist, drug mule, camp counselor, soldier, punk guitarist, research assistant, firearms trainer.

Debated theologians, imams and feminists, drank and sauna'd with Russians, smoked weed and chicken with Kurds, hunted deer and trouble with Native Americans. Built orphanages in Ukraine and blew them up in Iraq. I speak bits and pieces of ten or so languages, have been on every continent but Australia and Antarctica (Africa and South America are technicalities, but those count), and have seen all forty-eight contiguous states.

At the same time, I'm a skinny nerd who grew up on the internet, cut his teeth in the chans and treats online politics like bros treat fantasy football. Had an erratic but broad education, presented professional research at APA conferences, published history monographs and main-tanked a guild through BWL. Can calculate bullet drop, p-values and THAC0.

I've performed musically in front of thousands of people, academically to hundreds and athletically for dozens. Conducted military funerals, psychological research and church worship teams. Attended the foundings of PAX, the first non-orthodox church in Novocheboksarsk, MOPH 180, Sniper Platoon 2/11, and the Michigan branch of the Proud Boys. I've sat behind a sniper rifle in the ruins of what was once Babylon, behind a Telecaster on the stage of a megachurch, and behind a conference table in the main hall of Palmer House.

For food, eaten everything from live dragonfly larvae to scrambled pig's brains. I've had pizza with mayo for sauce, kitty kabobs and roasted horse, twenty-year-old MREs and raw deer heart, straight out of the ribcage. Drunk everything from prison wine to Romanian ration vodka, Hofbräu Oktoberfest to Busch Light, Macallan 25 to Dr. McGillicuddy's Cherry Schnapps. Kefir, kvass, Tiger.

For work I've trained green-broke mustangs and worse-broke cops, power-washed semi-trucks, sold legal guns and illegal hooch, shingled roofs, tied steel, smuggled dope into an embassy, fabricated windows and pallets with the Amish, driven diabetics to dialysis, and located underground utilities. Planted crops with illegal aliens, detasseled corn with Midwest hicks, worked on climbing walls with hippies, washed shit off dairy cows. I don't put any of that on my CV.

Along the way, conflict was inevitable. Fought trailer park kids in Indiana, gopniki in Moscow, Marines in Vegas, reform school kids on a soccer field, Mortar Platoon in the quad, a cafeteria full of home-schoolers at Bob Jones University, drunks behind a bar in Flint, MI, and the Al-Janabis in central Iraq.

Stranger, perhaps, were the ladies involved. Fighter, not a lover, but they have their charms! Italo-Hispanic painters, semi-pro Russian hookers, a mohawk on long walks with Amish girls, a scrawny white boy at an all-black dance with a borderline little person, suicidal lesbians, a leather jacket with a married chick at an Ani DiFranco concert, and a guild-destroying hookup with main heals at a gaming convention. Just a selection of the awkwardness that has been romance.

My name is Sgt. Scott. I remember some of this shit and I'm writing it down. That's the pitch.

Ever since Covid, I've been writing through some of my past experiences. Much of it is half-baked digression, mostly to get memories down, but even so. Over the coming year I will be writing steadily on biographical material and doing interviews with family members and old friends. I don't know if this will ever be a book, but it's a start. I'll be posting some of those projects here. Feedback is appreciated.

If you read this far and want to help, LMK which of the above sound the most/least intriguing.

-3

Let's chat about the National Football League. This week's schedule (all times Eastern):

Thu 2024-10-17 8:15PM Denver Broncos @ New Orleans Saints
Sun 2024-10-20 9:30AM New England Patriots @ Jacksonville Jaguars
Sun 2024-10-20 1:00PM Cincinnati Bengals @ Cleveland Browns
Sun 2024-10-20 1:00PM Detroit Lions @ Minnesota Vikings
Sun 2024-10-20 1:00PM Houston Texans @ Green Bay Packers
Sun 2024-10-20 1:00PM Miami Dolphins @ Indianapolis Colts
Sun 2024-10-20 1:00PM Tennessee Titans @ Buffalo Bills
Sun 2024-10-20 1:00PM Philadelphia Eagles @ New York Giants
Sun 2024-10-20 1:00PM Seattle Seahawks @ Atlanta Falcons
Sun 2024-10-20 4:05PM Carolina Panthers @ Washington Commanders
Sun 2024-10-20 4:05PM Las Vegas Raiders @ Los Angeles Rams
Sun 2024-10-20 4:25PM Kansas City Chiefs @ San Francisco 49ers
Sun 2024-10-20 8:20PM New York Jets @ Pittsburgh Steelers
Mon 2024-10-21 8:15PM Baltimore Ravens @ Tampa Bay Buccaneers
Mon 2024-10-21 9:00PM Los Angeles Chargers @ Arizona Cardinals

Week 8 thread: https://www.themotte.org/post/1216/weekly-nfl-thread-week-8

Be advised: this thread is not for serious in-depth discussion of weighty topics (we have a link for that), and it is not for anything Culture War related. This thread is for Fun. You got jokes? Share 'em. You got silly questions? Ask 'em.

9

This is the Quality Contributions Roundup. It showcases interesting and well-written comments and posts from the period covered. If you want to get an idea of what this community is about or how we want you to participate, look no further (except the rules maybe--those might be important too).

As a reminder, you can nominate Quality Contributions by hitting the report button and selecting the "Actually A Quality Contribution!" option. Additionally, links to all of the roundups can be found in the wiki of /r/theThread which can be found here. For a list of other great community content, see here.

These are mostly chronologically ordered, but I have in some cases tried to cluster comments by topic, so if there is something you are looking for (or trying to avoid), this might be helpful.


Quality Contributions in the Main Subreddit

@naraburns:

@Highpopalorum:

@2D3D:

Contributions for the week of September 2, 2024

@Dean:

@faceh:

@KolmogorovComplicity:

@ControlsFreak:

@RenOS:

Special Issue: Babies Everywhere!

@Hoffmeister25:

@ProfQuirrell:

@Tractatus:

@doglatine:

@urquan:

@satirizedoor:

Contributions for the week of September 9, 2024

@CrispyFriedBarnacles:

@FiveHourMarathon:

@ControlsFreak:

@gorge:

@Rov_Scam:

Contributions for the week of September 16, 2024

@Dean:

@naraburns:

@100ProofTollBooth:

@Walterodim:

@CrispyFriedBarnacles:

@MaiqTheTrue:

On An Ideology With No Name

@MadMonzer:

@Hoffmeister25:

@FCfromSSC:

@Supah_Schmendrick:

Contributions for the week of September 23, 2024

@teleoplexy:

@wemptronics:

@FiveHourMarathon:

@Hoffmeister25:

@LotsRegret:

You're a Villain All Right

@Baila:

@DirtyWaterHotDog:

@faceh:

Contributions for the week of September 30, 2024

@self_made_human:

Happy birthday to us! Hard to believe we've been the Internet's leading (possibly only?) independent user-funded (ad-free!) open political speech forum for two whole years. @ZorbaTHut, what a remarkable framework you've constructed; everyone else, what remarkable things you've constructed on that framework. Thanks for hanging out with us and getting into very kind, respectful, politely-worded fistfights about anything and everything under the sun.

As iron sharpens iron, so one person sharpens another.

(Proverbs 27:17)


Quality Contributions to the Main Motte

@FCfromSSC:

@Glassnoser:

@cjet79:

@WhiningCoil:

@Dean:

@100ProofTollBooth:

Contributions for the week of July 29, 2024

@wqnm:

@Astranagant:

@Tophattingson:

@Felagund:

Contributions for the week of August 5, 2024

@RandomRanger:

@Felagund:

@naraburns:

@SSCReader:

@DradisPing:

@gattsuru:

@100ProofTollBooth:

@urquan:

@faceh:

@felis-parenthesis:

Contributions for the week of August 12, 2024

@CrispyFriedBarnacles:

@Dean:

@MonkeyWithAMachinegun:

@DirtyWaterHotDog:

@blooblyblobl:

Contributions for the week of August 19, 2024

@YE_GUILTY:

@Grant_us_eyes:

@Dean:

@2D3D:

@wemptronics:

@Throwaway05:

Contributions for the week of August 26, 2024

@RandomRanger:

@Goodguy:

@MartianNight:

@Tanista:

@gattsuru:

@Shiro:

@Amadan:

58

Here some people have expressed interest in my take on AI broadly, and then there's the DeepSeek-Coder release, but I've been very busy, and the field is moving very fast again; it felt like a thankless job to do what Zvi does, and without his doomer agenda too (seeing the frenetic feed on Twitter, one can be forgiven for just losing the will; and, well, I suppose Twitter explains a lot about our condition in general). At times I envy Iconochasm, who tapped out. Also, this is a very niche technical discussion, and folks here prefer policy.

But, in short: open source AI, in its most significant aspects, which I deem to be code generation and general verifiable reasoning (you can bootstrap most everything else from it), is now propped up by a single Chinese hedge fund (created in the spirit of Renaissance Technologies) which supports a small, ignored (except by scientists and a few crackpots on Twitter) research division staffed with some nonames, who are quietly churning out extraordinarily good models with the explicit aim of creating AGI in the open. These models happen to be (relatively) innocent of benchmark-gaming, but somewhat aligned to Chinese values. The modus operandi of DeepSeek is starkly different from that of either other Chinese or Western competitors. In effect this is the only known group both meaningfully pursuing frontier capabilities and actively teaching others how to do so. I think this is interesting and a modest cause for optimism. I am also somewhat reluctant to write about this publicly because there exist lovers of Freedom here, and it would be quite a shame if my writing contributed to targeted sanctions and even more disempowerment of the small man by the state machinery in the final accounting.

But the cat's probably out of the bag. The first progress prize of the AI Mathematical Olympiad has just been taken by a team using their DeepSeekMath-7B model, solving 29 out of 50 private test questions «less challenging than those in the IMO but at the level of IMO preselection»; Terence Tao finds this «somewhat higher than expected» (he is on the AIMO Advisory Committee, along with his fellow Fields medalist Timothy Gowers).

The next three teams entered with this model as well.

I. The shape of the game board

To provide some context, here's an opinionated recap of AI trends since last year. I will be focusing exclusively on LLMs, as that's what matters (image gen, music gen, TTS etc. are largely trivial conveniences, and other serious paradigms seem to be in their embryonic stage or in deep stealth).

  • We have barely advanced in true out-of-distribution reasoning/understanding relative to the original «Sparks of AGI» GPT-4 (TheDag, me); GPT-4-04-29 and Sonnet 3.5 were the only real steps forward, and both were minor; Gemini was a catch-up effort, and nobody else has yet credibly reached the same tier. We have also made scant progress towards consensus on whether that-which-LLMs-do is «truly» reasoning or understanding; sensible people have settled on something like «it's its own kind of mind, and hella useful».
  • Meanwhile there's been a great deal of progress in scaffolding (no more babyAGI/AutoGPT gimmickry, now agents are climbing up the genuinely hard SWE-bench), code and math skills, inherent robustness in multi-turn interactions and responsiveness to nuanced feedback (to the point that LLMs can iteratively improve sizable codebases – as pair programmers, not just fancy-autocomplete «copilots»), factuality, respect of prioritized system instructions, patching badly covered parts of the world-knowledge/common sense manifold, unironic «alignment» and ironing out Sydney-like kinks in deployment, integrating non-textual modalities, managing long contexts (merely usable 32K "memory" was almost sci-fi back then, now 1M+ with strong recall is table stakes at the frontier; with 128K mastered on a deeper level by many groups) and a fairly insane jump in cost-effectiveness – marginally driven by better hardware, and mostly by distilling from raw pretrained models, better dataset curation, low-level inference optimizations, eliminating architectural redundancies and discovering many "good enough" if weaker techniques (for example, DPO instead of PPO). 15 months ago, "$0.002/1000 tokens" for gpt-3.5-turbo seemed incredible; now we always count tokens by the million, and Gemini-Flash blows 3.5-turbo out of the water for half that, so hard it's not funny; and we have reason to believe it's still raking in >50% margins whereas OpenAI probably subsidized their first offerings (though in light of distilling and possibly other methods of compute reuse, it's hard to rigorously account for a model's capital costs now).
  • AI doom discourse has continued to develop roughly as I've predicted, but with MIRI pivoting to evidence-free advocacy, orthodox doomerism getting routed as a scientific paradigm, more extreme holdovers from it («emergent mesaoptimizers! tendrils of agency in inscrutable matrices!») being wearily dropped by players who matter, and misuse (SB 1047 etc) + geopolitical angle (you've probably seen young Leopold) gaining prominence.
  • The gap in scientific and engineering understanding of AI between the broader community and "the frontier" has shrunk since the debut of GPT-4 or 3.5, because there's too much money to be made in AI and only so much lead you can get out of having assembled the most driven AGI company. Back then, only a small pool of external researchers could claim to understand what the hell they did above the level of shrugging "well, scale is all you need" (wrong answer) or speculating about some simple methods like "train on copyrighted textbooks" (spiritually true); people chased rumors, leaks… Now it takes weeks at most to trace yet another jaw-dropping magical demo to papers, to cook up a proof of concept, or even to deem the direction suboptimal; the other two leading labs no longer seem desperate, and we're in the second episode of Anthropic's comfortable lead.
  • Actual, downloadable open AI sucks way less than I lamented last July. But it still sucks. And that's really bad, since it sucks most in the dimension that matters: delivering value, in the basest sense of helping do work that gets paid. And the one company built on the promise of «decentralizing intelligence», which I had hope for, had proven unstable.

To be more specific, open source (or as some say now, given the secretiveness of full recipes and opacity of datasets, «open weights») AI has mostly caught up in «creativity» and «personality», «knowledge» and some measure of «common sense», and can be used for petty consumer pleasures or simple labor automation, but it's far behind corporate products in «STEM»-type skills that are in short supply among human employees too: «hard» causal reasoning, information integration, coding, math. (Ironically, I agree here with whining artists that we're solving domains of competence in the wrong order. Also, it's funny how by default coding seems to be what LLMs are most suited for, as the sequence of code is more constrained by preceding context than natural language is).

To wit, Western and Eastern corporations alike generously feed us – while smothering startups – fancy baubles to tinker with, charismatic talking toys; as they rev up self-improvement engines for full cycle R&D, the way imagined by science fiction authors all these decades ago, monopolizing this bright new world. Toys are getting prohibitively expensive to replicate, with reported pretraining costs up to ≈$12 million and counting now. Mistral's Mixtral/Codestral, Musk's Grok-0, 01.Ai's Yi-1.5, Databricks' DBRX-132B, Alibaba's Qwens, Meta's fantastic Llama 3 (barring the not-yet-released 405B version), Google's even better Gemma 2, Nvidia's massive Nemotron-340B – they're all neat. But they don't even pass for prototypes of engines you can hop on and hope to ride up the exponential curve. They're too… soft. And not economical for their merits.

Going through our archive, I find this year-old analysis strikingly relevant:

I think successful development of a trusted open model rivaling chatgpt in capability is likely in the span of a year, if people like you, who care about long-term consequences of lacking access to it, play their cards reasonably well. […] Companies whose existence depends on the defensibility of the moat around their LM-derived product will tend to structure the discourse around their product and technology to avoid even the fleeting perception of being a feasibly reproducible commodity.

That's about how it went. While the original ChatGPT, that fascinating demo, is commodified now, competitive product-grade AI systems are not, and companies big and small still work hard to maintain the impression that it takes

  • some secret sauce (OpenAI, Anthropic)
  • work of hundreds of Ph.Ds (Deepmind)
  • vast capital and compute (Meta)
  • "frontier experience" (Reka)

– and even then, none of them have felt secure enough yet to release a serious threat to the others' proprietary offers.

I don't think it's a big exaggeration to say that the only genuine pattern breaker – presciently mentioned by me here – is DeepSeek, the company that has single-handedly changed – a bit – my maximally skeptical spring 2023 position on the fate of China in the AGI race.

II. Deep seek what?

AGI, I guess. Their Twitter bio states only: «Unravel the mystery of AGI with curiosity. Answer the essential question with long-termism». It is claimed by the Financial Times that they have a recruitment pitch «We believe AGI is the violent beauty of model x data x computing power. Embark on a ‘deep quest’ with us on the journey towards AGI!» but other than that nobody I know of has seen any advertisement or self-promotion from them (except for like 70 tweets in total, all announcing some new capability or responding to basic user questions about license), so it's implausible that they're looking for attention or subsidies. Their researchers maintain near-perfect silence online. Their – now stronger and cheaper – models tend to be ignored in comparisons by Chinese AI businesses and users. As mentioned before, one well-informed Western ML researcher has joked that they're the bellwether for «the number of foreign spies embedded in the top labs».

FT also says the following of their parent company:

Its funds have returned 151 per cent, or 13 per cent annualised, since 2017, and were achieved in China’s battered domestic stock market. The country’s benchmark CSI 300 index, which tracks China’s top 300 stocks, has risen 8 per cent over the same time period, according to research provider Simu Paipai.
In February, Beijing cracked down on quant funds, blaming a stock market sell-off at the start of the year on their high-speed algorithmic trading. Since then, High-Flyer’s funds have trailed the CSI 300 by four percentage points.
[…] By 2021, all of High-Flyer’s strategies were using AI, according to manager Cai Liyu, employing strategies similar to those pioneered by hugely profitable hedge fund Renaissance Technologies. “AI helps to extract valuable data from massive data sets which can be useful for predicting stock prices and making investment decisions,” …
Cai said the company’s first computing cluster had cost nearly Rmb200mn and that High Flyer was investing about Rmb1bn to build a second supercomputing cluster, which would stretch across a roughly football pitch-sized area. Most of their profits went back into their AI infrastructure, he added. […] The group acquired the Nvidia A100 chips before Washington restricted their delivery to China in mid-2022.
“We always wanted to carry out larger-scale experiments, so we’ve always aimed to deploy as much computational power as possible,” founder Liang told Chinese tech site 36Kr last year. “We wanted to find a paradigm that can fully describe the entire financial market.”

In a less eclectic Socialist nation this would've been sold as Project Cybersyn or OGAS. Anyway, my guess is they're not getting subsidies from the Party any time soon.

They made a minor splash in the ML community eight months ago, in late October, releasing an unreasonably strong DeepSeek-Coder. Yes, in practice an awkward replacement for GPT-3.5; yes, contaminated with test sets, which prompted most observers to discard it as yet another Chinese fraud. But it proved to strictly dominate hyped-up things like Meta's CodeLlama and Mistral's Mixtral 8x7B in real-world performance, and time and again proved to be the strongest open baseline in research papers. On privately designed, new benchmarks like this fresh one from Cohere, it's clear that they did get to parity with OpenAI's workhorse model, right on the first public attempt – as far as coding is concerned.

On top of that, they shared a great deal of information about how they did it: constructing the dataset from GitHub, pretraining, finetuning. The paper was an absolute joy to read, sharing details even on unsuccessful experiments. It didn't offer much in the way of novelty; I evaluate it as a masterful, no-unforced-errors integration of fresh (by that point) known best practices. Think about your own field and you'll probably agree that even this is a high bar. And in AI, it is generally the case that either you get a great model with a «we trained it on some text… probably» tech report (Mistral, Google), or a mediocre one accompanied by a fake-ass novel full of jargon (every second Chinese group). Still, few cared.

Coder was trained, it seems, using lessons from the less impressive DeepSeek-LLM-67B (even so, it was roughly a peer of Meta's LLaMA-2-70B that could also code; a remarkable result for a literally-who new team), which somehow came out a month after. Its paper (released later still) was subtitled «Scaling Open-Source Language Models with Longtermism». I am not sure if this was some kind of joke at the expense of effective altruists. What they meant concretely was the following:

Over the past few years, LLMs … have increasingly become the cornerstone and pathway to achieving Artificial General Intelligence (AGI). … Guided by the scaling laws, we introduce DeepSeek LLM, a project dedicated to advancing open-source LMs with a long-term perspective.

  • …Soon, we will release our technique reports in code intelligence and Mixture-of-Experts(MoE), respectively. They show how we create high-quality code data for pre-training, and design a sparse model to achieve dense model performance.
  • At present, we are constructing a larger and improved dataset for the upcoming version of DeepSeek LLM. We hope the reasoning, Chinese knowledge, math, and code capabilities will be significantly improved in the next version.
  • Our alignment team is dedicated to studying ways to deliver a model that is helpful, honest, and safe to the public. Our initial experiments prove that reinforcement learning could boost model complex reasoning capability.

…I apologize for geeking out. All that might seem normal enough. But, a) they've fulfilled every one of those objectives since then. And b) I've read a great deal of research papers and tech reports, entire series from many groups, and I don't remember this feeling of cheerful formidability. It's more like contemplating the dynamism of SpaceX or Tesla than wading through a boastful yet obscurantist press release. It is especially abnormal for a Mainland Chinese paper to be written like this – with friendly confidence, admitting weaknesses, pointing out errors you might repeat, not hiding disappointments behind academese word salad; and so assured of having a shot in an honest fight with the champion.

In the Coder paper, they conclude:

…This advancement underscores our belief that the most effective code-focused Large Language Models (LLMs) are those built upon robust general LLMs. The reason is evident: to effectively interpret and execute coding tasks, these models must also possess a deep understanding of human instructions, which often come in various forms of natural language. Looking ahead, our commitment is to develop and openly share even more powerful code-focused LLMs based on larger-scale general LLMs.

In the Mixture-of-Experts paper (8th January), they've shown themselves capable of novel architectural research too, introducing a pretty ingenious «fine-grained MoE with shared experts» design with the objective of «Ultimate Expert Specialization» and economical inference: «DeepSeekMoE 145B significantly outperforms Gshard, matching DeepSeek 67B with 28.5% (maybe even 14.6%) computation». For those few who noticed it, this seemed a minor curiosity, or just bullshit.
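If «fine-grained MoE with shared experts» is opaque jargon: here's a minimal PyTorch sketch of the idea as their papers describe it. This is my toy reconstruction, not DeepSeek's code; all sizes are illustrative, and the per-token routing loop is deliberately naive.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class FFN(nn.Module):
        # One "expert": a small feed-forward block.
        def __init__(self, d_model, d_hidden):
            super().__init__()
            self.up = nn.Linear(d_model, d_hidden)
            self.down = nn.Linear(d_hidden, d_model)
        def forward(self, x):
            return self.down(F.silu(self.up(x)))

    class FineGrainedMoE(nn.Module):
        # Many small routed experts instead of a few big ones, plus a couple of
        # always-active shared experts that absorb common knowledge, leaving
        # the routed experts free to specialize.
        def __init__(self, d_model=512, n_routed=64, n_shared=2, top_k=6, d_expert=128):
            super().__init__()
            self.routed = nn.ModuleList(FFN(d_model, d_expert) for _ in range(n_routed))
            self.shared = nn.ModuleList(FFN(d_model, d_expert) for _ in range(n_shared))
            self.router = nn.Linear(d_model, n_routed, bias=False)
            self.top_k = top_k

        def forward(self, x):                     # x: (n_tokens, d_model)
            out = sum(e(x) for e in self.shared)  # shared experts see every token
            weights, idx = self.router(x).softmax(dim=-1).topk(self.top_k, dim=-1)
            for t in range(x.size(0)):            # per-token loop, for clarity only
                for w, i in zip(weights[t], idx[t].tolist()):
                    out[t] = out[t] + w * self.routed[i](x[t])
            return out

The ratio is the point: each token pays for only top_k routed experts plus the always-on shared ones, so total parameter count (and stored knowledge) can grow by adding experts without inflating per-token compute.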

On 5th February, they dropped DeepSeekMath, of which I've already spoken: «Approaching Mathematical Reasoning Capability of GPT-4 with a 7B Model». Contra the usual Chinese pattern, it wasn't a lie; no, you couldn't in normal use get remotely as good results from it, but in some constrained regimes… The project itself was a mix of most of the previous steps: a sophisticated (and well-explained) data-harvesting pipeline, scaling-laws experiments, further «longtermist» continued pretraining from Coder-7B-1.5 (itself a repurposed LLM-7B), and the teased reinforcement learning approach. Numina, winners of AIMO, say «We also experimented with applying our SFT recipe to larger models like InternLM-20B, CodeLama-33B, and Mixtral-8x7B but found that (a) the DeepSeek 7B model is very hard to beat due to its continued pretraining on math…».

In early March they released DeepSeek-VL: Towards Real-World Vision-Language Understanding, reporting some decent results and research on building multimodal systems, and again announcing new plans: «to scale up DeepSeek-VL to larger sizes, incorporating Mixture of Experts technology».

III. Frontier minor league

Thus far, it's all been preparatory R&D, shared openly and explained eagerly yet barely noticed by anyone (except that the trusty Coder still served as a base for labs like Microsoft Research to experiment on): utterly overshadowed in discussions by Alibaba, Meta, Mistral, to say nothing of the frontier labs.

But on May 6th, 2024, the pieces began to fall into place. They released «DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model», which subsumed all aforementioned works (except VL).

It's… unlike any other open model, to the point you could believe it was actually made by some high-IQ finance bros from first principles. Its design choices are exquisite; just copying minor details can substantially improve on typical non-frontier efforts. It pushes further their already unorthodox MoE and tops it off with a deep, still poorly understood modification to the attention mechanism (Multi-head Latent Attention, or MLA). It deviates from industry-standard rotary position embeddings to accommodate the latter (a fruit of collaboration with RoPE's inventor). It's still so unconventional that we are only beginning to figure out how to run it properly (they don't share their internal pipeline, which is optimized for hardware they can access given American sanctions). But in retrospect, it's the obvious culmination of the vision announced with those first model releases and goofy tweets, probably a vision not one year old, and yet astonishingly far-sighted – especially given how young their star researchers are. But probably it's mundane in the landscape of AI that's actually used; I suspect it's close to how Sonnet 3.5 or Gemini 1.5 Pro work on the inside. It's just that the open-source peasants are still mucking around with stone age dense models on their tiny consumer GPUs.

I understand I might already be boring you out of your mind, but just to give you an idea of how impressive this whole sequence is, here's a 3rd April paper for context:

Recent developments, such as Mixtral (Jiang et al., 2024), DeepSeek-MoE (Dai et al., 2024), spotlight Mixture-of-Experts (MoE) models as a superior alternative to Dense Transformers. An MoE layer works by routing each input token to a selected group of experts for processing. Remarkably, increasing the number of experts in an MoE model (almost) does not raise the computational cost, enabling the model to incorporate more knowledge through extra parameters without inflating pre-training expenses… Although our findings suggest a loss-optimal configuration with Emax experts, such a setup is not practical for actual deployment. The main reason is that an excessive number of experts makes the model impractical for inference. In contrast to pretraining, LLM inference is notably memory-intensive, as it requires storing intermediate states (KV-cache) of all tokens. With more experts, the available memory for storing KV caches is squeezed. As a result, the batch size – hence throughput – decreases, leading to increased cost per query. … We found that MoE models with 4 or 8 experts exhibit more efficient inference and higher performance compared to MoE models with more experts. However, they necessitate 2.4x-4.3x more training budgets to reach the same performance with models with more experts, making them impractical from the training side.

This is basically where Mistral.AI, the undisputed European champion with Meta and Google pedigree (valuation $6.2B), the darling of the opensource community, stands.

And yet, apparently DeepSeek have found a way to get out of the bind. «4 or 8»? They scale to 162 experts, reducing active parameters to 21B, cutting down pretraining costs by 42.5% and increasing peak generation speed by 5.76x; and they scale up the batch size via compressing the KV cache by like 15 times with a bizarre application of low-rank projections and dot attention; and while doing so they cram in 3x more attention heads than any model this size has any business having (because their new attention decouples number of heads from cache size), and so kick the effective «thinking intensity» up a notch, beating the gold standard «Multihead attention» everyone has been lousily approximating; and they use a bunch of auxiliary losses to make the whole thing maximally cheap to use on their specific node configuration.

But the cache trick is pretty insane; for me, the hardest-to-believe part of the whole thing. Now, 2 months later, we know that certain Western groups ought to have reached the same Pareto frontier, just with different (maybe worse, maybe better) tradeoffs. But those are literally the inventors and/or godfathers of the Transformer – Noam Shazeer's CharacterAI, Google DeepMind's Gemini line… This was done by folks like this serious-looking 5th-year Ph.D student, in under a year!
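For the curious, a toy Python illustration of the caching idea as I understand it; the shapes are made up, and the real MLA has more moving parts (notably the decoupled handling of rotary position embeddings mentioned above), so treat this as a sketch of the principle only.

    import torch
    import torch.nn as nn

    d_model, n_heads, d_head, d_latent = 1024, 16, 64, 128

    W_dkv = nn.Linear(d_model, d_latent, bias=False)          # shared down-projection
    W_uk = nn.Linear(d_latent, n_heads * d_head, bias=False)  # up-projection to keys
    W_uv = nn.Linear(d_latent, n_heads * d_head, bias=False)  # up-projection to values

    h = torch.randn(10, d_model)  # hidden states of 10 already-processed tokens

    # Vanilla multi-head attention caches K and V per token:
    # 2 * n_heads * d_head = 2048 floats. Caching only the latent costs
    # d_latent = 128 floats per token: a 16x reduction with these toy numbers.
    c = W_dkv(h)                           # (10, 128)  <- the entire "KV cache"
    k = W_uk(c).view(-1, n_heads, d_head)  # (10, 16, 64), rebuilt at attention time
    v = W_uv(c).view(-1, n_heads, d_head)

A smaller cache is what frees memory for bigger batches (the throughput lever the MoE-scaling paper above says everyone is squeezed on), and because per-head keys and values are re-expanded from the latent, head count is decoupled from cache size; hence those 3x more attention heads.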

As a result, they:

  • use about as much compute on pretraining as Meta did on Llama-3-8B, an utter toy in comparison (maybe worth $2.5 million for them); 1/20th of GPT-4.
  • Get a 236B model that's about as good across the board as Meta's Llama-3-70B (≈4x more compute), which has the capacity – if not the capability – of mid-range frontier models (previous Claude 3 Sonnet; GPT-4 on a bad day).
  • Can serve it at around the price of 8B, $0.14 for processing 1 million tokens of input and $0.28 for generating 1 million tokens of output (1 and 2 Yuan), on previous-gen hardware too.
  • …and still take up to 70%+ gross margins, because «On a single node with 8 H800 GPUs, DeepSeek-V2 achieves a generation throughput exceeding 50K tokens per second… In addition, the prompt input throughput of DeepSeek-V2 exceeds 100K tokens per second», and the going price for such nodes is ≤$15/hr. That's $50 in revenue, for clarity (the arithmetic is sketched just after this list). They aren't doing a marketing stunt.
  • …and so they force every deep-pocketed mediocre Chinese LLM vendor – Alibaba, Zhipu and all – to drop prices overnight, now likely serving at a loss.
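The margin claim checks out on the quoted numbers alone; here's the back-of-envelope arithmetic in Python, naively assuming a node saturated with output tokens and ignoring input-side revenue and batching subtleties:

    out_tokens_per_s = 50_000  # quoted single-node generation throughput
    price_per_m_out = 0.28     # USD per million output tokens
    node_cost_per_hr = 15.0    # quoted going rate for an 8xH800 node

    revenue_per_hr = out_tokens_per_s * 3600 / 1e6 * price_per_m_out
    print(revenue_per_hr)                         # 50.4 -> the "$50 in revenue"
    print(1 - node_cost_per_hr / revenue_per_hr)  # ~0.70 -> the "70%+ gross margins"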

Now, I am less sure about some parts of this story; but mostly it's verifiable.

I can see why an American, or a young German like Leopold, would freak out about espionage. The thing is, their papers are just too damn good and too damn consistent over the entire period if you look back (as I did), so «that's it, lock the labs» or «haha, no more tokens 4 u» is most likely little more than racist cope for the time being. The appropriate reaction would be more akin to «holy shit Japanese cars are in fact good».

Smart people (Jack Clark from Anthropic, Dylan Patel of Semianalysis) immediately take note. Very Rational people clamoring for an AI pause (TheZvi) sneer and downplay: «This is who we are worried about?» (as he did before, and before). But it is still good fun. Nothing extreme. Efforts at adoption slowly begin: say, Salesforce uses V2-Chat to create synthetic data to finetune small DeepSeek-Coder V1s to outperform GPT-4 on narrow tasks. Mostly, nobody cares.

The paper ends in the usual manner of cryptic comments and commitments:

We thank all those who have contributed to DeepSeek-V2 but are not mentioned in the paper. DeepSeek believes that innovation, novelty, and curiosity are essential in the path to AGI.

DeepSeek will continuously invest in open-source large models with longtermism, aiming to progressively approach the goal of artificial general intelligence.

• In our ongoing exploration, we are dedicated to devising methods that enable further scaling up MoE models while maintaining economical training and inference costs. The goal of our next step is to achieve performance on par with GPT-4 in our upcoming release.

In the Appendix, you can find a lot of curious info, such as:

During pre-training data preparation, we identify and filter out contentious content, such as values influenced by regional cultures, to avoid our model exhibiting unnecessary subjective biases on these controversial topics. Consequently, we observe that DeepSeek-V2 performs slightly worse on the test sets that are closely associated with specific regional cultures. For example, when evaluated on MMLU, although DeepSeek-V2 achieves comparable or superior performance on the majority of testsets compared with its competitors like Mixtral 8x22B, it still lags behind on the Humanity-Moral subset, which is mainly associated with American values.

Prejudices of specific regional cultures aside, though, it does have values – true, Middle Kingdom ones, such as uncritically supporting the Party line and adherence to Core Values Of Socialism (h/t @RandomRanger). The web version will also delete the last message if you ask something too clever about Xi or Tiananmen or… well, nearly the entirety of usual things Americans want to talk to Chinese coding-oriented LLMs about.

And a bit earlier, this funny guy from the team presented the product for the general case at Nvidia's GTC24 – «culturally sensitive», customizable alignment-on-demand: «legality of rifle» for the imperialists, illegality of Tibet separatism for the civilized folk. Refreshingly frank.

But again, even that was just preparatory.

IV. Coming at the king

Roughly 40 days later they release DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence, where they return to the strategy announced at the very start: they take an intermediate checkpoint of V2 and push it harder and further on a dataset enriched with code and math (which they've continued to expand and refine), for 10.2 trillion tokens total. Now this training run is 60% more expensive than Llama-3-8B (still a pittance by modern standards). It also misses out on some trivia knowledge and somehow becomes even less charismatic. It's also not a pleasant experience because the API runs very slowly, probably from congestion (I guess Chinese businesses are stingy… or perhaps DeepSeek is generating a lot of synthetic data for the next iterations). Anons on 4chan joke that it's «perfect for roleplaying with smart, hard-to-get characters».

More importantly, though, it demolishes Llama-3-70B on every task that takes nontrivial intelligence; bests Claude 3 Opus on coding and math throughout, and Gemini 1.5 Pro on most coding assistance, and trades blows with the strongest GPT-4 variants. Of course, it's the same shape and the same price, which is to say, up to 100 times cheaper than its peers… more than 100 times, in the case of Opus. Still a bitch to run, but it turns out they're selling turnkey servers. In China, of course. To boot, they rapidly shipped running code in the browser (a very simple feature, but going most of the way to the Claude Artifacts that wowed people so much), quadrupled context length without price changes (32K to 128K), and now intend to add the context caching that Google boasts of as some tremendous Gemini breakthrough. They have... impressive execution.

Benchmarks, from the most sophisticated and hard to hack to the most bespoke and obscure, confirm that it's «up there».

Etc etc, and crucially, users report similar impressions:

So I have pegged deepseek v2 coder against sonnet 3.5 and gpt4o in my coding tasks and it seems to be better than gpt4o (What is happening at OpenAI) and very similar to Sonnet 3.5. The only downside is the speed, it's kinda slow. Very good model and the price is unbeatable.

I had the same experience, this is a very good model for serious tasks. Sadly the chat version is very dry and uncreative for writing. Maybe skill issue, I do not know. It doesn't feel slopped, it's just.. very dry. It doesn't come up with things.

There are some frustrating weak points, but they know about them, and conclude:

Although DeepSeek-Coder-V2 achieves impressive performance on standard benchmarks, we find that there is still a significant gap in instruction-following capabilities compared to current state-of-the-art models like GPT-4 Turbo. This gap leads to poor performance in complex scenarios and tasks such as those in SWEbench. […] In the future, we will focus more on improving the model’s instruction-following capabilities…

Followed by the list of 338 supported languages.

Well-read researchers say stuff like

DeepSeek-Coder-V2 is by far the best open-source math (+ coding) model, performing on par with GPT4o w/o process RM or MCTS and w/ >20x less training compute. Data contamination doesn't seem to be a concern here. Imagine what this model could achieve with PRM, MCTS, and other yet-to-be-released agentic exploration methods. Unlike GPT4o, you can train this model further. It has the potential to solve Olympiad, PhD and maybe even research level problems, like the internal model a Microsoft exec said to be able to solve PhD qualifying exam questions.

Among the Rational, there is some cautious realization («This is one of the best signs so far that China can do something competitive in the space, if this benchmark turns out to be good»), in short order giving way to more cope: «Arena is less kind to DeepSeek, giving it an 1179, good for 21st and behind open model Gemma-2-9B».

And one more detail: A couple weeks ago, they released code and paper on Expert-Specialized Fine-Tuning, «which tunes the experts most relevant to downstream tasks while freezing the other experts and modules; experimental results demonstrate that our method not only improves the efficiency, but also matches or even surpasses the performance of full-parameter fine-tuning … by showing less performance degradation [in general tasks]». It seems to require that «ultimate expert specialization» design of theirs, with its supporting beam of generalist modules surrounded by meaningfully task-specific shards, to automatically select only the parts pertaining to some target domain; and this isn't doable with traditional dense or MoE designs. Once again: confident vision, bearing fruit months later. I would like to know who's charting their course, because they're single-handedly redeeming my opinion of the Chinese AI ecosystem and frankly Chinese culture.
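To make the idea concrete, here is a rough sketch of «tune only the task-relevant experts» on a generic MoE model: score experts by how often the router picks them on task data, then freeze everything else. The `return_router_indices` hook and the flat `model.experts` layout are my illustrative assumptions; their released ESFT code differs in detail:

```python
# Rough sketch of Expert-Specialized Fine-Tuning on a hypothetical MoE model.
# The model interface (return_router_indices, model.experts) is assumed for
# illustration and is not DeepSeek's actual API.
import torch

@torch.no_grad()
def score_experts(model, task_batches, num_experts):
    """Estimate each expert's task relevance: how often the router selects it."""
    counts = torch.zeros(num_experts)
    for batch in task_batches:
        # Assumed hook: returns logits plus the router's chosen expert indices.
        _, expert_indices = model(batch, return_router_indices=True)
        counts += torch.bincount(
            expert_indices.flatten(), minlength=num_experts
        ).float()
    return counts / counts.sum()

def freeze_all_but_relevant(model, scores, threshold=0.10):
    """Freeze every parameter, then unfreeze only experts above the threshold."""
    for p in model.parameters():
        p.requires_grad = False
    relevant = (scores >= threshold).nonzero().flatten().tolist()
    for i in relevant:
        for p in model.experts[i].parameters():  # assumed flat expert list
            p.requires_grad = True
    return relevant  # then fine-tune as usual; generalist modules stay frozen
```

The payoff, as I read the paper, is that you update only a small fraction of parameters, which is presumably why degradation on general tasks stays small.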

V. Where does this leave us?

This might not change much. The Western closed-AI compute moat continues to deepen, DeepSeek/High-Flyer don't have any apparent privileged access to domestic chips, and other Chinese groups have friends in the Standing Committee and in the industry, so realistically this will be a blip on the radar of history. A month ago they precluded a certain level of safetyist excess and corporate lock-in that still seemed possible in late 2023, when the argument that public availability of ≈GPT-4 level weights (with the main imaginary threat vectors being coding/reasoning-bottlenecked) could present intolerable risks was discussed in earnest. One or two more such leaps and we're… there, for the vague libertarian intuition of «there» I won't elucidate now. But they're already not sharing the silently updated DeepSeek-V2-Chat (which somewhat improved its reasoning, getting closer to the Coder), nor the promised materials on DeepSeek-Prover (a quiet further development of their mathematical models line). Maybe it's temporary. Maybe they've arrived where they wanted to be, and will turtle up like Stability and Mistral, and then likely wither away.

Mostly, I honestly just think it's remarkable that we're getting an excellent, practically useful free model with low-key socialist sensibilities. Sadly, I do not foresee this inspiring Western groups to accelerate open source and leave them in the dust. As Google says in the Gemma-2 report:

Despite advancements in capabilities, we believe that given the number of larger and more powerful open models, this release will have a negligible effect on the overall risk landscape.

Less charitably, Google is not interested in releasing anything you might use to enhance your capabilities and become less dependent on Google or another «frontier company», and will only release it if you are well able to get better stuff elsewhere. In my view, this is closer to the core value of Socialism than withholding info about Xinjiang reeducation camps.

I remain agnostic about the motivations and game plan of DeepSeek, but I do hope they'll maintain this policy of releasing models «with longtermism», as it were. We don't have many others to rely on.

Edits: minor fixes


Let's chat about the National Football League. This week's schedule (all times Eastern):

Thu 2024-10-10 8:15PM San Francisco 49ers @ Seattle Seahawks
Sun 2024-10-13 9:30AM Jacksonville Jaguars @ Chicago Bears
Sun 2024-10-13 1:00PM Cleveland Browns @ Philadelphia Eagles
Sun 2024-10-13 1:00PM Indianapolis Colts @ Tennessee Titans
Sun 2024-10-13 1:00PM Arizona Cardinals @ Green Bay Packers
Sun 2024-10-13 1:00PM Houston Texans @ New England Patriots
Sun 2024-10-13 1:00PM Tampa Bay Buccaneers @ New Orleans Saints
Sun 2024-10-13 1:00PM Washington Commanders @ Baltimore Ravens
Sun 2024-10-13 4:05PM Pittsburgh Steelers @ Las Vegas Raiders
Sun 2024-10-13 4:05PM Los Angeles Chargers @ Denver Broncos
Sun 2024-10-13 4:25PM Atlanta Falcons @ Carolina Panthers
Sun 2024-10-13 4:25PM Detroit Lions @ Dallas Cowboys
Sun 2024-10-13 8:20PM Cincinnati Bengals @ New York Giants
Mon 2024-10-14 8:15PM Buffalo Bills @ New York Jets

Week 7 thread is live: https://www.themotte.org/post/1209/weekly-nfl-thread-week-7


Transnational Thursday is a thread for people to discuss international news, foreign policy or international relations history. Feel free as well to drop in with coverage of countries you’re interested in, talk about ongoing dynamics like the wars in Israel or Ukraine, or even just whatever you’re reading.


This thread is for anyone working on personal projects to share their progress, and hold themselves somewhat accountable to a group of peers.

Post your project, your progress from last week, and what you hope to accomplish this week.

If you want to be pinged with a reminder asking about your project, let me know, and I'll harass you each week until you cancel the service.


Trying out a new weekly thread idea.

This would be a thread for anyone working on personal projects to share their progress, and hold themselves somewhat accountable to a group of peers. We can coordinate weekly standup-type meetings if there is interest.

@ArjinFerman, @Turniper, and I all had some initial interest.

Post your project, your progress from last week, and what you hope to accomplish this week.

Hi guys, I wrote part II of the first story I posted 3 weeks ago.

I think moving together is an interesting practice. Basically, what it means is that you are roped to your partner on perhaps easy but massively exposed terrain ('easy' here being very relative to the ability level of the pair). Because there is no protection between you, if one of the pair falls, you both fall. So you have to have ultimate trust in your partner's ability. Indeed, just on Sunday I was talking to a mountain guide who told me about a fall on the Aiguille du Peigne while a rope team was moving together.

You rarely see ultimate trust invested in a comrade in the modern world, perhaps outside actual warfare. This is part of the reason why I think Alpine climbing is basically a substitute for it for the modern man. You go into dangerous places to do risky things; you don't bring back cattle or women or anything useful for that matter. But perhaps you show that you could, if the times were different. You seek the same valor and the status that comes with it, which is why a lot of climbers look down on people using guides: they see it as stolen valor. The Englishman Charles Hudson, who went up Mont Blanc without guides, wrote a book in 1856 titled 'Where There's a Will There's a Way: An Ascent of Mont Blanc by a New Route and Without Guides', which was the first written articulation of this sentiment.

I think the psychology of this sort of adventure-seeking has much more to explore, but I haven't delved into it too much yet.


Many of you may remember a great old alternative to this place known as /r/CultureWarRoundup on Reddit. Thanks to its superior (which is by no means to say perfect), more "hands-off" moderation philosophy, it peeled off a good portion of the most intelligent people from here at one point (some of whom, like me, have returned a bit at least for the moment; many probably did not) and was quickly gaining momentum. At its peak it drew around 500-700 replies per week to its main CWR thread (IIRC; I'm writing all of this off the top of my head), while this place, back when it was on Reddit, at least once fell to around or below 1K replies to the same. That's less than the activity now for the most part, as the moderation here has somewhat toned down its scattershot executions since then; not enough, and not half as much as at the beginning of this site when they were trying to attract people by pretending to have reformed more comprehensively, but still somewhat since those dark days of Hlynka (though a certain Amadan seems to relish the idea of bringing them back at least in part). The people were speaking, and it wasn't looking good for this place, with the tides firmly turning in CWR's favor. But unfortunately, as jannies often do, CWR's own jannies started making their own very dumb decisions.

Reddit was cracking down harder on wrongthink: stiffening new-account requirements, shadowbanning more accounts without even pretending to respond to appeals from users that they aren't spambots (since of course that's never been the actual only or even main purpose of shadowbanning anyway), and in general making the site more of a pain to use. As a result, many of CWR's regular users kept getting banned from Reddit as a whole (mostly if they posted outside of /r/CWR, which most people did sometimes, yet occasionally even if not, as /r/CWR itself did suffer a few Reddit admin removals despite being mostly too small a fish to worry about), exhausting their alt accounts and finding it harder to make new ones or to use the site effectively at all. CWR's jannies still doubled down on sticking with the platform, even raising new-account age/"karma" requirements to post on the sub (and refusing to waive them for the new accounts of obvious regulars, which I know because I asked multiple times), perhaps out of paranoia about /r/SneerClub-type infiltrators intentionally posting vastly incendiary/ToS-violating things to try to get them banned from Reddit (a concern I don't think ever actually materialized, whereas cutting off the earnest posters who actually wanted to post there but literally couldn't certainly did). The great momentum they'd shown in overtaking this place began to crater as regular users like me simply disappeared, unable to post because of the restrictions they foolishly chose to enforce in tandem with the Reddit admins in hopes of avoiding the subreddit ban guillotine (which they've dodged so far... at the cost of the sub's actual life anyway).

Though an eventual half-move to Saidit restored some activity, it couldn't reignite the flame; the move suffered from confusion as the Reddit sub stayed open as well, splitting the community (a mistake I do have to credit the mods here for not repeating). Now both the still-open Reddit and Saidit subs are in zombie mode. There was also a Matrix chatroom at some point, but the invite link to it seems to be entirely dead now (and of course nobody sensible wanted to move to a chatroom anyway, which is not even close to the same medium).

So does anybody know what happened to these people? Where's the current venue for the continuation of SSC's CWR threads, but with less of the crappy, overbearing moderation you'll increasingly find here (much like /r/TheSchism is an alternative, but with even worse, crappier/more overbearing moderation)? Where can actual men engage in unrestricted intellectual discussion in a properly masculine fashion, without effeminate finger-wagging jannies from California all too frequently interfering to whine about "antagonism" (the very essence of the competition of ideas, and therefore impossible to ban from it) or whatever, as they do here (again, not as bad as in the past, but still too much)? Any and all info is appreciated. Thank you in advance.


Do you have a dumb question that you're kind of embarrassed to ask in the main thread? Is there something you're just not sure about?

This is your opportunity to ask questions. No question too simple or too silly.

Culture war topics are accepted, and proposals for a better intro post are appreciated.