Culture War Roundup for the week of May 25, 2026

This weekly roundup thread is intended for all culture war posts. 'Culture war' is vaguely defined, but it basically means controversial issues that fall along set tribal lines. Arguments over culture war issues generate a lot of heat and little light, and few deeply entrenched people ever change their minds. This thread is for voicing opinions and analyzing the state of the discussion while trying to optimize for light over heat.

Optimistically, we think that engaging with people you disagree with is worth your time, and so is being nice! Pessimistically, there are many dynamics that can lead discussions on Culture War topics to become unproductive. There's a human tendency to divide along tribal lines, praising your ingroup and vilifying your outgroup - and if you think you find it easy to criticize your ingroup, then it may be that your outgroup is not who you think it is. Extremists with opposing positions can feed off each other, highlighting each other's worst points to justify their own angry rhetoric, which becomes in turn a new example of bad behavior for the other side to highlight.

We would like to avoid these negative dynamics. Accordingly, we ask that you do not use this thread for waging the Culture War. Examples of waging the Culture War:

  • Shaming.

  • Attempting to 'build consensus' or enforce ideological conformity.

  • Making sweeping generalizations to vilify a group you dislike.

  • Recruiting for a cause.

  • Posting links that could be summarized as 'Boo outgroup!' Basically, if your content is 'Can you believe what Those People did this week?' then you should either refrain from posting, or do some very patient work to contextualize and/or steel-man the relevant viewpoint.

In general, you should argue to understand, not to win. This thread is not territory to be claimed by one group or another; indeed, the aim is to have many different viewpoints represented here. Thus, we also ask that you follow some guidelines:

  • Speak plainly. Avoid sarcasm and mockery. When disagreeing with someone, state your objections explicitly.

  • Be as precise and charitable as you can. Don't paraphrase unflatteringly.

  • Don't imply that someone said something they did not say, even if you think it follows from what they said.

  • Write like everyone is reading and you want them to be included in the discussion.

On an ad hoc basis, the mods will try to compile a list of the best posts/comments from the previous week, posted in Quality Contribution threads and archived at /r/TheThread. You may nominate a comment for this list by clicking on 'report' at the bottom of the post and typing 'Actually a quality contribution' as the report reason.

Science should have nothing to do with law. If the science says something, let Congress reflect that in the law, but Judges do not have the authority to simply declare this kind of thing. Warren and his court have been a disaster for this country.

I've started listening to SCOTUS oral arguments as a podcast, and some of my consistent complaints are the lack of statistical literacy, and some justices really wanting to lean on "scientists" as an unelected fourth branch of government.

I think your suggestion here is the right one: let Congress interpret the science in writing laws; don't have the Judicial branch try to do scientific literature reviews.

It's not like we don't have lots of evidence of negligent or even outright fraudulent publications in even reputable journals.

It's not like we don't have lots of evidence of negligent or even outright fraudulent publications in even reputable journals.

There was a meta-analysis done once that showed about 50% of peer-reviewed papers turn out to be false. If you look at something like the epidemiological literature on PubMed, a lot of it is riddled with multicollinearity that severely degrades the precision of the regression estimates. This was actually acknowledged in the findings of their own studies. But it generalizes across disciplines. You find it in neoclassical economics, where models have been repeatedly subjected to different datasets and the existing structure falls completely apart.
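To make the multicollinearity point concrete, here is a minimal sketch on synthetic data. It is not taken from any of the studies mentioned. It computes the variance inflation factor (VIF) for a two-predictor linear regression, where VIF = 1 / (1 - r^2) and r is the correlation between the two predictors. The function names and the noise level of 0.1 are illustrative assumptions.

```python
import math
import random


def pearson_r(xs, ys):
    """Sample Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """VIF shared by both predictors in a two-predictor regression."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)


# Synthetic predictors (assumption: standard normal, 500 observations).
rng = random.Random(0)
x1 = [rng.gauss(0, 1) for _ in range(500)]
independent = [rng.gauss(0, 1) for _ in range(500)]
# Nearly a copy of x1, plus a little noise: strongly collinear.
collinear = [a + 0.1 * rng.gauss(0, 1) for a in x1]

vif_independent = vif_two_predictors(x1, independent)
vif_collinear = vif_two_predictors(x1, collinear)

# The standard error of a coefficient scales with sqrt(VIF).
print("independent:", vif_independent, "SE factor:", math.sqrt(vif_independent))
print("collinear:  ", vif_collinear, "SE factor:", math.sqrt(vif_collinear))
```

With nearly independent predictors the VIF stays close to 1. With a near-duplicate predictor it grows large, which means the coefficient's standard error is inflated by the square root of that factor even though the model may still fit well overall.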

It's extremely bad in ML literature, where pressure to publish and get your citation count and h-index up means that people publish all kinds of non-replicable junk. Nobody wants another incremental advancement paper, so every paper is revolutionary. I cannot tell you the number of times I've taken a paper that looked interesting, either used their code or re-implemented it approximately, used their datasets, and gotten far worse results. The new trick, or really an old trick making a resurgence, is to include all kinds of arcane math in the paper and not provide any code, so it's impossible to replicate it without a math PhD in that area.

Research papers are written for phds, and if you don't have a phd then you are not the target audience. Unreproducibility and over-mathiness of ML research is a common meme among the online ML-adjacent communities, but it's just not true. The ML community has done far more than any other community to encourage reproducibility and they've had a lot of success in doing so.

Source: I am an ML researcher with only a mediocre publication record. I've got my own gripes with the system that have led to my pub-record being mediocre, but reproducibility is not one of them.

I am an ML researcher in industry, without a PhD. The papers are absolutely for me. (And if they aren't, then that's a major clique/circle-jerking issue, as I'm the one actually trying to apply what is being done.)

https://www.nature.com/articles/s41598-025-07087-2 I recently tried to replicate this paper for research on IoT cuffless BP, and it absolutely fails to replicate. Not only that, but it also suffers from massive subject leakage in how it splits the data. It builds signal windows with a 75% overlap and then shuffles those windows between train and val, so it's pretty much overfit. Even copying its splitting approach, I failed to get better than an SBP MAE of 6.07 and a DBP MAE of 4.3. The paper claims sub-2.0 for both.
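For readers unfamiliar with this kind of leakage, here is a minimal sketch of the general problem. It is not the paper's code, and the window length, subject count, and function names are all made up for illustration. It cuts each subject's recording into windows with 75% overlap, then compares shuffling windows across train/val against holding out whole subjects, counting how many validation windows share raw samples with a training window.

```python
import random


def make_windows(n_samples, win=8, overlap=0.75):
    """Return (start, end) index pairs for sliding windows over one recording."""
    step = max(1, int(win * (1 - overlap)))
    return [(s, s + win) for s in range(0, n_samples - win + 1, step)]


def build_dataset(n_subjects=10, n_samples=64):
    """Hypothetical dataset: a list of (subject_id, start, end) windows."""
    return [(subj, s, e)
            for subj in range(n_subjects)
            for s, e in make_windows(n_samples)]


def window_shuffle_split(windows, val_frac=0.2, seed=0):
    """Leaky split: shuffle individual windows, ignoring subject identity."""
    rng = random.Random(seed)
    shuffled = windows[:]
    rng.shuffle(shuffled)
    n_val = int(len(shuffled) * val_frac)
    return shuffled[n_val:], shuffled[:n_val]


def subject_split(windows, val_frac=0.2, seed=0):
    """Subject-wise split: every window of a held-out subject goes to val."""
    rng = random.Random(seed)
    subjects = sorted({w[0] for w in windows})
    rng.shuffle(subjects)
    n_val = max(1, int(len(subjects) * val_frac))
    val_subjects = set(subjects[:n_val])
    train = [w for w in windows if w[0] not in val_subjects]
    val = [w for w in windows if w[0] in val_subjects]
    return train, val


def leaked_fraction(train, val):
    """Fraction of val windows sharing raw samples with some train window."""
    def overlaps(a, b):
        # Same subject and intersecting [start, end) sample ranges.
        return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]
    leaked = sum(any(overlaps(v, t) for t in train) for v in val)
    return leaked / len(val)


windows = build_dataset()
leaky = leaked_fraction(*window_shuffle_split(windows))
clean = leaked_fraction(*subject_split(windows))
print("window shuffle leak fraction:", leaky)
print("subject split leak fraction: ", clean)
```

With the window shuffle, most validation windows have a near-duplicate sitting in the training set, so validation error mostly measures memorization. The subject-wise split has no such overlap by construction, which is why it usually reports worse but more honest numbers.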

Then there's this: https://arxiv.org/pdf/2512.19428. Maybe you know Grassmann flows and manifolds but I definitely did not learn this naturally. I pretty much need a background tutorial on this.

I actually enjoyed this paper's concept: https://arxiv.org/pdf/2602.14972 But needing to read 2-3 additional papers, one of which was a super mathy proof of the intuition, was a lot of work. It still takes me a bit to conceptualize this because it is DEEP in the Bayesian world.

Maybe you are in a different subfield than I am, but for the last 4-5 years I have repeatedly run into papers whose results I could not replicate. It happens, it's a thing. If I say that to other industry researchers, they pretty much agree. It's one of the reasons we think poorly of academics.

Nature paper ...

Without looking at this paper I agree it is shit. This paper is not a machine learning paper (and basically nothing in Nature is). The failure to replicate is a problem of the culture of medical science and not ML.

Just because a researcher uses a compiler in their research does not make them a "compiler researcher", and similarly, just because someone uses machine learning in their research does not make them a "machine learning researcher". Papers at PLDI are not targeted at people who are "trying to apply compilers" and papers at NeurIPS/ICML are not targeting people who are "trying to apply ML". (If you actually want to see a "mathy" paper, BTW, you should take a look at the papers at COLT... these are definitely not for you and these are definitely hard-core proper machine learning papers.)

Grassmann flows paper

This paper is definitely an ML paper, and honestly is pretty reasonable. It's not earth shattering, but it's exactly the kind of work that I would expect from a decent phd student (which the author is). It's pretty bread-and-butter ML to take a model and explore ways to reduce the representational complexity of the model. Grassmann manifolds are outside of standard ML math, but the explanation in 2.2 was easy to follow. The math here is no harder than the math in standard graduate textbooks.

Causal Foundation Models paper...

Again, this doesn't seem very mathy to me. The notation all looks like standard stuff from the Pearl textbook (admittedly not standard ML, but definitely standard for anything causal), and anyone who has worked through Bishop (which should be literally everyone with an ML phd of a certain age) should have no problem.

Having to look up 3 references to read and understand a paper seems absolutely reasonable to me.

We seem to have a different definition of what constitutes "ML Research". I'd break it down into two forms: Basic Research and Applied Research. Basic is probably not the precise word because a lot of core research is non-basic, but core is also an imprecise word, as is making a boundary around theoretical.

But Applied Research is pretty straightforward. It is the application of ML theory and algorithms/models to real-world practical problems. The "Basic" Research is generally more about developing the ML theory of what can work or is possible. You seem to think Applied Research is not actual "ML Research". I'm not sure the ML community agrees with you because there are prevalent conferences like CVPR or the NLP one I am blanking on. These are considered ML conferences, focused on a particular practical field. Industry research is almost always Applied. Not all of us have the luxury of working on grants; businesses want returns, and the research is around applying ML theory to real problems. Like the Cuffless BP Nature paper. I think your definition is overly purity-focused, though I imagine our tension is one as old as time between Academic PhDs and Industry Researchers.

The last two are definitely on the "core/theoretical/basic" side of research because they aren't actually applying it to real problems. One's just a theory on Causal Modeling à la Pearl or Schölkopf. The pipeline is that someone like me takes these more theoretical models and implements them in the real world.

Grassmann manifolds are outside of standard ML math, but the explanation in 2.2 was easy to follow.

Maybe I suck at math (a real possibility) or maybe you are just good at math (also a possibility), but I am still very shaky on what a Grassmann manifold is. I don't think the paper is earth shattering in itself. I've seen several papers about kernelizing attention, or linearizing it, or anything to make it non-quadratic.

Again, this doesn't seem very mathy to me

I don't think this one is mathy, but it is arcane in how it applies meta-learning as Bayesian priors to let a model generalize to out-of-distribution problems at inference time. Claiming it can do zero-shot inference on unrelated tasks because it learns how to formulate problems as an approximation of Bayesian inference in a practical amount of time is a wild idea. It's making a very complicated claim that takes a long time to wrap your head around.

Having to look up 3 references to read and understand a paper seems absolutely reasonable to me.

Unfortunately this is a constraint in industry: I have a job, and there is work to get done. Spending 8+ hours to digest a theory paper is a large impact on my time, even if it leads to something useful.

I'm not sure the ML community agrees with you because there are prevalent conferences like CVPR or the NLP one I am blanking on. These are considered ML conferences, focused on a particular practical field.

No. People who publish in these conferences do not consider these ML conferences. Historically computer vision and NLP started out as fully distinct communities with almost no overlap with the ML community. Since about 2014 and the deep learning revolution, the lines have been blurred a bit, but they are still very distinct communities.

NeurIPS/ICML are basically considered the same conference, and any paper that could be accepted at one could also be accepted at the other without modification (beyond styling); the only meaningful difference is the submission deadline. Similarly, CVPR/ICCV/ECCV are all basically the same conference with different submission deadlines, as are ACL/EMNLP/NAACL. You cannot, for example, take a paper designed for NeurIPS/ICML and get it published at CVPR/ICCV/ECCV without major structural changes, and that's how we know they are part of different communities.

The division here is not academic/industry like you suggest. Bishop---who again is the prototypical author for probabilistic ML---works at Microsoft and you can find the textbook info at: https://www.microsoft.com/en-us/research/people/cmbishop/prml-book/. The division is based on the conference communities and who publishes/reviews where.

Unfortunately this is a constraint in industry: I have a job, and there is work to get done. Spending 8+ hours to digest a theory paper is a large impact on my time, even if it leads to something useful.

Honestly, those papers shouldn't take 8 hours for a researcher to read. I had a pretty solid idea of what they were doing in <5 minutes, and I'd guess in <1hr I could fully understand everything about each paper.

The difference is that I am the target audience. Having done an ML PhD, I've read >20 graduate level textbooks cover-to-cover and >1000 papers in great depth. If you haven't done this background work (which is fine---it's not for everybody, and I actively recommend my students not pursue this path) then these papers are not designed for you. You should accept this rather than complain that they are too hard or gatekeeping.