My favorite part of OPOL is having a secret language to use out in public with the kids.
One time I was at Costco with my 3 year old boy. There was a dwarf walking around the store, and my son was asking reasonable questions about why an adult was small like a child... but it was in Spanish and so the dwarf didn't hear himself being talked about... but then after seeing the dwarf for the 4th or 5th time, my boy had a sudden flash of inspiration. He shouts with all the wild-eyed enthusiasm of a toddler: "Me pregunto si mi pene es mas grande que lo suyo?" (I wonder if my penis is bigger than his?) Let me tell you I've never been more happy that my child spoke Spanish. The free sample lady we were talking to doubled over laughing, but no one else in the store registered the outburst.
But I'm not super strict about OPOL. I also speak a handful of other languages with the kids that I'm not fully proficient in like Korean. Unfortunately, Korean has a number of words that sound the same as English words but mean very different things. Once when we were out hiking, my then 2 year old (different kid) saw a pair of men walking a dog. He shouts at the top of his lungs "Gay!!! Daddy I see a gay!!!" You see, "Gay" means dog in Korean, and my 2yo was doing the natural 2yo thing of being excited about a dog. Of course the men were horrified, and my brief explanation that my son was talking Korean did nothing to mollify them. Fortunately for us, my son never addressed me in the 2nd person around any black people (you is pronounced "nigga" in Korean).
Anyways, the point of my stories is that language is a chance for you to bond and have fun with your kids. If OPOL doesn't sound fun to you, then don't do it. For me, speaking weird languages to my kids has been tons of fun for us and for mom. Mom doesn't speak any of the languages I do with the kids, except that she now knows lots of different words for poop. And it's fun for all of us to talk about oonga and caca and pedos and bangui.
Groups like the Amish do not have this tension, and I think the ideal Christian behavior is far closer to the Amish than it is to the modal American/European Christian.
I have always been fascinated by Christian political parties. The idea seems repulsive to me both as an American (due to separation of church and state) and as a Christian (I'm from a tradition that sees even voting as borderline sinful).
This is similar to what Trump did with North Korea... they are now working within normal diplomatic relations that fall short of war.
The claim that North Korea has fewer military provocations now than before Trump's 2017-2018 negotiations is false. We've previously discussed this. I'm reposting my response below for the benefit of other readers:
My semi-insider understanding [of North Korea's provocations is that they] are far more in number and severity than before. For example:
- There continued to be major missile tests yearly until 2023, and in 2022 they flew a missile over Japan.
- In 2022, a North Korean drone got within 2 miles of the Blue House (where the South Korean president lives). This type of drone is more like a cruise missile than a quadcopter.
- In 2024, the North officially abandoned its policy of reunification with the South, and there have been all sorts of major border skirmishes. In 2024, the North launched artillery into the South.
- The North has been sending troops to fight in Ukraine and sending supplies to the Russians.
If you are seeing fewer provocations in the news, I think that's just your media diet.
Typical GDP growth in the US is about 2%/year. That means just waiting 4 years doing normal stuff gets you ~10% productivity improvement. ChatGPT was released just about 4 years ago.
There are a lot of subtleties in the economic figures. But my back-of-the-napkin math above argues that we would have had this 10% permanent productivity gain without any investment in AI.
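As a quick sanity check on the napkin math above, here is a minimal sketch of the compounding arithmetic. The function names and the alternative rates (2.5% and 3%) are my own illustrative assumptions, not official figures. At exactly 2%/year, four years compounds to a bit over 8%, which is in the same ballpark as the ~10% figure; hitting 10% in four years takes roughly 2.4%/year.

```python
def cumulative_growth(annual_rate: float, years: int) -> float:
    """Total fractional growth after compounding `annual_rate` for `years` years."""
    return (1 + annual_rate) ** years - 1


def required_annual_rate(total_growth: float, years: int) -> float:
    """Constant annual rate needed to reach `total_growth` after `years` years."""
    return (1 + total_growth) ** (1 / years) - 1


if __name__ == "__main__":
    # Hypothetical growth rates; 2% is the "typical" figure used in the comment.
    for rate in (0.02, 0.025, 0.03):
        total = cumulative_growth(rate, 4)
        print(f"{rate:.1%}/year over 4 years -> {total:.1%} cumulative")

    # Annual rate that would give exactly 10% over 4 years.
    print(f"rate needed for 10% in 4 years: {required_annual_rate(0.10, 4):.2%}")
```

The point of the argument doesn't change much either way: a few years of ordinary growth is the same order of magnitude as the claimed gain.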
They don't; it's all informal. AAAI is the closest thing and has a lot of overlap. Basically no one is a member of IEEE or ACM.
If you seriously feel that the ML community is gatekeeping, then I invite you to come join the community and propose ways to remove these gates. There are regular workshops hosted to address these issues and improve them. In just 4 days, there will be a workshop on "The Future of Machine Learning Publishing" https://inverseprobability.com/sorrento2026/future-ml-publishing.html.
There are also more-or-less annual workshops at NeurIPS/ICML on improving the publishing process in ML. Here is an (incomplete) ChatGPT-generated list:
(2010) : https://mloss-static.ml.tu-berlin.de/workshop/icml10/
(2018) : https://ml-critique-correct.github.io/
(2019) : https://ml-retrospectives.github.io/
(2020) : https://ml-retrospectives.github.io/neurips2020/
(2021) : https://neurips.cc/virtual/2021/workshop/21885
(2022) : https://ml-eval.github.io/
(2023) : https://sites.google.com/view/reconsidering-peer-review
I don't know of any academic communities that are remotely as open and accessible as the NeurIPS/ICML community. The NLP and CV communities have made some progress in these directions (due to the overlap of their members and the ML community), but even other branches of CS are way behind.
To me that is indicative of a level of quality, skill on the author, or even writing to a wider audience (pretty much a skill) instead of writing to the clique (poor intent).
I don't see anything wrong with writing papers for a "clique" when you are actively trying to help people come into the clique who want to. The ML community has pioneered open access to papers via JMLR/ICML/NeurIPS breaking away from the older venues in the 1980s that refused open access, and basically every graduate level textbook is available for free online.
I basically agree with everything in your last paragraph. Except that AlexNet was published at NeurIPS, not Nature: https://papers.nips.cc/paper_files/paper/2012/hash/c399862d3b9d6b76c8436e924a68c45b-Abstract.html
I'm not sure the ML community agrees with you because there are prevalent conferences like CVPR or the NLP one I am blanking on. These are considered ML conferences, focused on a particular practical field.
No. People who publish in these conferences do not consider these ML conferences. Historically computer vision and NLP started out as fully distinct communities with almost no overlap with the ML community. Since about 2014 and the deep learning revolution, the lines have been blurred a bit, but they are still very distinct communities.
NeurIPS/ICML are basically considered the same conference, and any paper that could be accepted at one could also be accepted at the other without modification (beyond styling); the only meaningful difference is the submission deadline. Similarly, CVPR/ICCV/ECCV are all basically the same conference with different submission deadlines, as are ACL/EMNLP/NAACL. You cannot, for example, take a paper designed for NeurIPS/ICML and get it published at CVPR/ICCV/ECCV without major structural changes, and that's how we know they are part of different communities.
The division here is not academic/industry like you suggest. Bishop---who again is the prototypical author for probabilistic ML---works at Microsoft and you can find the textbook info at: https://www.microsoft.com/en-us/research/people/cmbishop/prml-book/. The division is based on the conference communities and who publishes/reviews where.
Unfortunately this is a constraint in industry, I have a job, there is work to get done. spending 8+ hours to digest a theory paper is a large impact on my time. Even if it leads to something useful.
Honestly, those papers shouldn't take 8 hours for a researcher to read. I had a pretty solid idea of what they were doing in <5 minutes, and I'd guess in <1hr I could fully understand everything about each paper.
The difference is that I am the target audience. Having done a ML phd, I've read >20 graduate level textbooks cover-to-cover and >1000 papers in great depth. If you haven't done this background work (which is fine---it's not for everybody, and I actively recommend my students not pursue this path) then these papers are not designed for you. You should accept this rather than complain that they are too hard or gatekeeping.
Nature paper ...
Without looking at this paper I agree it is shit. This paper is not a machine learning paper (and basically nothing in Nature is). The failure to replicate is a problem of the culture of medical science and not ML.
Just because a researcher uses a compiler in their research does not make them a "compiler researcher", and similarly, just because someone uses machine learning in their research does not make them a "machine learning researcher". Papers at PLDI are not targeted at people who are "trying to apply compilers" and papers at NeurIPS/ICML are not targeting people who are "trying to apply ML". (If you actually want to see a "mathy" paper, BTW, you should take a look at the papers at COLT... these are definitely not for you and these are definitely hard-core proper machine learning papers.)
Grassmann flows paper
This paper is definitely an ML paper, and honestly is pretty reasonable. It's not earth shattering, but it's exactly the kind of work that I would expect from a decent phd student (which the author is). It's pretty bread-and-butter ML to take a model and explore ways to reduce the representational complexity of the model. Grassmann manifolds are outside of standard ML math, but the explanation in 2.2 was easy to follow. The math here is no harder than the math in standard graduate textbooks.
Causal Foundation Models paper...
Again, this doesn't seem very mathy to me. The notation all looks like standard stuff from the Pearl textbook (admittedly not standard ML, but definitely standard for anything causal), and anyone who has worked through Bishop (which should be literally everyone with an ML phd of a certain age) should have no problem.
Having to look up 3 references to read and understand a paper seems absolutely reasonable to me.
Research papers are written for phds, and if you don't have a phd then you are not the target audience. Unreproducibility and over-mathiness of ML research is a common meme among the online ML-adjacent communities, but it's just not true. The ML community has done far more than any other community to encourage reproducibility and they've had a lot of success in doing so.
Source: I am an ML researcher with only a mediocre publication record. I've got my own gripes with the system that have led to my pub-record being mediocre, but reproducibility is not one of them.
Search amazon for "duplo marble run" and you get all sorts of cool knock-off duplo sets that are so much better than anything lego makes: https://www.amazon.com/s?k=duplo+marble+run
I'm wondering how the Chinese "lego-compatible" ecosystem fits in here? It seems like an easy win for FIRST and some up-and-coming brand.
My oldest is 8 and has been playing around with knock-off technic legos for about 2 years. They're about 30% the price of name-brand, and the pieces themselves seem basically just as good. The instructions and designs are definitely not as good as lego, but this seems like a place where FIRST could put in Western-quality work for cheap.
Some of the knock-off stuff is legitimately better than lego too. We have ~5 sets of fake-duplo marble runs that are legitimately much more fun to play with than anything lego makes in the duplo age range (both for kids and adults).
Thanks for sharing this. The project is fascinating to me from a technical perspective.
I'm currently working on a make-like build system for automating LLM workflows like yours. I've only been using it for internal projects so far, but I might try putting together an example that outputs material compatible with your system. So I looked into some of the technical details, and I have a few questions for you.
Q1
It looks like each novel is stored in its own git repo. I dug through your https://github.com/JohnQPulp/CupOfGold repo and I think I understand how all the info is stored. My first question is: is the annotation format you use in pulp.txt standard for visual novels or something you invented? Specifically, in the lines
All afternoon the wind sifted out of the black Welsh glens, crying notice that Winter was come sliding down over the world from the Pole; and riverward there was the faint moaning of new ice. It was a sad day, a day of gray unrest, of discontent.<e>"Winter... of discontent" opens Steinbeck's first novel. That's some neat, Shakespearean <book>The Winter of Our Discontent|career bookending</book>.</e>
b=wales
The gently moving air seemed to be celebrating the loss of some gay thing with a soft, tender elegy.
n:r=Robert Morgan; n:m=Mother Morgan; n:g=Gwenliana; n:h=Henry Morgan
I'm wondering if the html-like tags and the b=wales metadata stuff are formally documented anywhere?
Q2
These two repos look like how you're generating the actual HTML from a book repo:
But what are you using to automate the actual git repos of the books? Could you walk me through that workflow a bit? (This is the part that I might try automating with my own tool.)
For example, I don't see anything in the book repos that looks like it is designed to enforce consistency (like a character sheet). All the material in the repo looks more like a final product than intermediate developer/artist "documentation". Do you generate any intermediate files like this?
Q3
What's the approximate cost for the full conversion? How much time does it take? (both manual and API/compute)
I'll second that I also appreciate these posts :)
I always thought mission was equally required for both men and women, and only adult converts "get out" of mission.
In my reply to @clo above I mention having just read an easy-greek reader of The Iliad... but since you mention Sherlock Holmes... I feel obligated now to mention that I just received an attic greek translation of Sherlock Holmes from amazon this week. I've been really enjoying reading these "modern ancient greek" stories recently.
I just read Ho epi Troian Polemos. It's an easy-greek reader that tells the story of the Iliad using only ~400 greek words. It's designed for someone who has had about 1 semester of greek studies.
If you're actually interested enough in the books to re-read a translation, then I recommend starting to just go to the original language!
I teach computer science and so I look at a lot of people's hands as I watch them type on the keyboard. I'd guess that about 1/3 of female students have nails long enough that they cannot type comfortably on a keyboard, and this meaningfully impacts their performance in my classes. (Foreign-born women do not have this problem; only American-born women.) I don't see any painted nails though, just grotesquely large nails.
The median parental income at this school is $500k/year, so these are pretty upper class women.
It's time for the daily Two Minutes Hate against translators/localizers/paraphrasers who take unjustified liberties with the source material. "Said" rather than "had said"? "Old gentleman" rather than "gentleman"? Commas rather than em dashes? No repetition of "my son"?
This week I was reading my bible in greek and noticed that Matt 15:17 uses the word ἀφεδρῶνα (toilet). Sadly, none of the popular translations like NIV/ESV actually include this word in their translation :(
Gell-Mann Amnesia is related. But the effect I'm thinking about / worried about is different.
This is the sort of high-quality motte post about random topics I've never thought about that I love to see. Thank you.
I have a meta-comment, however, about themotte: I wonder how dangerous high quality posts like this are that are outside my area of expertise?
What I mean by that is that I don't have enough background knowledge to fully judge the accuracy of this post. I will probably make future decisions based on this information though. And I suspect that there are enough well written posts like this that contain enough not-quite-perfectly-accurate statements that I will make suboptimal decisions in the future based on motte comments that I thought were correct but turned out not to be. And I wonder what the cumulative negative effects of this will be on my life.
"Physical strength required for jobs in different occupations."
According to this link, carrying 1 lb of weight at all times qualifies for "medium work"... my clothing weighs more than that...
Foreign governments have long been trying to monetize this, as well, paying handsomely for information provided by insiders.
Actually, American spies are famously recruited for very cheap prices. For example Ronald Pelton sold the secrets of the multi-billion dollar Operation Ivy Bells to the Soviet KGB for only $35k.
They always call me obba (Korean for dad), which is why this never came up.