
Is the Future (Somewhat) Predictable? A Case for Treating Forecasting as a Skill

One view I hold - one I know many people here will be skeptical of - is that the future is partially predictable in a systematic way. Not in a deterministic or oracular sense, but in the limited, Tetlock-style sense of assigning calibrated probabilities to uncertain events and doing so better than baseline forecasters over time.

I’ve spent roughly the last 15 years trying to formalize and stress-test my own forecasting process. During that period, I’ve made public, timestamped predictions about events such as COVID, the Ukraine war, and various market movements. Some of these forecasts were wrong, some were directionally correct, and many were correct with meaningful lead time. Taken together, I think they at least suggest that forecasting can be treated as a learnable, improvable skill rather than an exercise in narrative hindsight.

When I’ve raised versions of this argument in the past (including in The Motte’s earlier Reddit incarnation), I’ve consistently encountered a few objections. I think these objections reflect reasonable priors, so I want to address them explicitly.

1 - “If prediction is possible, why aren’t the experts already doing it?”

My claim is not that expertise is useless, but that many expert institutions are poorly optimized for predictive accuracy. Incentives matter. Academia, media, and policy organizations tend to reward coherence, confidence, and alignment with prevailing narratives more than calibration or long-term scoring.

One reason I became interested in forecasting is that I appear to have unusually strong priors and pattern-recognition ability by objective measures. I’ve scored in the top 1% on multiple standardized exams (SAT, SHSAT, GMAT) on first attempts, which at least suggests above-average ability to reason under uncertainty and time pressure. That doesn’t make me infallible, but it does affect my prior that this might be a domain where individual skill differences matter.

Tetlock’s work also suggests that elite forecasting performance correlates less with formal credentials and more with specific cognitive habits: base-rate awareness, decomposition, active updating, and comfort expressing uncertainty numerically. These traits are not especially rewarded by most expert pipelines, which may explain why high-status experts often underperform trained forecasters.
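
To make a couple of those habits concrete, here is a minimal sketch (my own illustration, not anything taken from Tetlock's studies) of what "base-rate awareness" plus "active updating" look like numerically: start from a base rate and revise it with Bayes' rule in odds form as evidence arrives.

```python
def bayes_update(prior_prob: float, likelihood_ratio: float) -> float:
    """Update a probability with one piece of evidence via Bayes' rule in odds form.

    likelihood_ratio = P(evidence | event) / P(evidence | no event).
    Illustrative helper only; the numbers below are made up.
    """
    prior_odds = prior_prob / (1.0 - prior_prob)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)

# Start from a 10% base rate, then observe evidence 3x more likely if the event is true.
p = 0.10
p = bayes_update(p, 3.0)
print(round(p, 3))  # 0.25 -- the base rate anchors the estimate; the evidence moves it
```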

My suspicion - very much a hypothesis, not a conclusion - is that many people in communities like this one are already better forecasters than credentialed experts, even if they don’t label what they’re doing as forecasting.

2 - “If you can forecast, why not just make money in markets?”

This is a fair question, since markets are one of the few environments where forecasts are continuously scored.

I have used forecasting methods in investing. Over the past five years, my average annual return has been approximately 40%, substantially outperforming major indices and comparable to or better than many elite hedge funds over the same period. This is net of mistakes, drawdowns, and revisions—not a cherry-picked subset.

That said, markets are noisy, capital-constrained, and adversarial. Forecasting ability helps, but translating probabilistic beliefs into portfolio construction, position sizing, and risk management is its own discipline. Forecasting is a necessary input, not a sufficient condition for success.
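
As a generic illustration of that gap (a textbook sketch, not a description of my actual portfolio process), even a well-calibrated probability still has to be converted into a position size. For a simple binary bet, the Kelly criterion is the standard starting point, and most practitioners size well below full Kelly to control drawdowns.

```python
def kelly_fraction(p_win: float, payout_ratio: float) -> float:
    """Kelly-optimal fraction of bankroll for a binary bet.

    p_win: your probability that the bet pays off.
    payout_ratio: net gain per unit staked if it pays off (the b in b-to-1 odds).
    Illustrative only -- real position sizing also has to handle correlation,
    liquidity, and estimation error in p_win itself.
    """
    edge = payout_ratio * p_win - (1.0 - p_win)
    return max(edge / payout_ratio, 0.0)  # never size a negative-edge bet

# A 60% forecast on an even-money payoff implies a 20% full-Kelly stake;
# many practitioners use half-Kelly or less to reduce the risk of ruin.
full = kelly_fraction(0.60, 1.0)
print(round(full, 3), round(full / 2, 3))  # 0.2 0.1
```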

More importantly, I don’t think markets are the only - or even the most interesting - application. Forecasting is at least as relevant to geopolitics, institutional risk, public health, and personal decision-making, where feedback is slower but the stakes are often higher.

3 - “Where are the receipts?”

That’s a reasonable demand. I’ve tried to make at least some predictions public and timestamped so they can be evaluated ex ante rather than reconstructed after the fact.

Here are a few examples where I laid out forecasts and reasoning in advance:

https://questioner.substack.com/p/more-stock-advice

https://questioner.substack.com/p/superforecasting-for-dummies-9a5

I don’t claim these constitute definitive proof. At best, they are auditable data points that can be examined, criticized, or falsified.

What I’m Actually Interested in Discussing

I’m not asking anyone to defer to my forecasts, and I’m not claiming prediction is easy or universally applicable. What I am interested in is whether superforecasting should be treated as a legitimate applied discipline—and, if so:

Where does it work reliably, and where does it fail?

How should forecasting skill be evaluated outside of markets?

What selection effects or survivorship biases should we worry about?

Can forecasting methods be exploited or weaponized?

What institutional designs would actually reward calibration over narrative?

If your view is that forecasting success is mostly an artifact of hindsight bias or selective memory, I’d be genuinely interested in stress-testing that claim. Likewise, if you think forecasting works only in narrow domains, I’d like to understand where you’d draw those boundaries and why.

I’m less interested in persuading anyone than in subjecting the model itself to adversarial scrutiny. Looking forward to hearing your thoughts.


I will answer your questions with two questions of my own. (The questions are semi-rhetorical in that I think they shed light on the answers to yours, but I would also genuinely like answers, and I haven't seen any good ones.)


Theoretical Q

I overall like the forecasting trend in the rationalist community. I find the idea of quantifying bias and uncertainty to be a valuable exercise that I have benefited from personally. I have a theory-level concern, however, that I've never seen properly addressed.

I internally model forecasting as: there exists a probability distribution over all possible futures, and the job of the forecaster is to approximate this distribution. In practice, forecasters do this by assigning probabilities to a bunch of events and then scoring themselves based on what actually happens (like you describe in your OP).

So here's my question: How confident can we actually be that your scoring algorithms are stable and consistent? I'm using these words in the technical sense from statistics. To see how things can go wrong, suppose you're trying to predict the number of people who die in 2026. If the true distribution of deaths per year is Gaussian, you can use standard formulas to compute the mean and get a good estimate with error bars. But if the true distribution is Cauchy, the mean is undefined, and there is provably no way to accurately estimate it, because it doesn't exist. A Cauchy distribution looks essentially identical to a Gaussian near its center, and in practice it is extremely difficult to determine which one you are actually sampling from. People who work under the Gaussian assumption will look like they're doing very well by the metrics superforecasters use, right up until they have a disaster (see e.g. the 2008 financial collapse). Similarly, a 40% annualized return over 5 years is "trivial" to achieve if you allow yourself a very high risk of ruin: just invest in the S&P 500 with 5x leverage.
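
To make that failure mode concrete, here's a small simulation (my own sketch, with simulated data) showing that the running sample mean settles down for Gaussian draws but never stabilizes for Cauchy draws, because the mean it is chasing does not exist.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

gaussian = rng.normal(loc=0.0, scale=1.0, size=n)
cauchy = rng.standard_cauchy(size=n)

# Running means: the law of large numbers applies to the Gaussian sample,
# but not to the Cauchy sample, whose mean is undefined.
counts = np.arange(1, n + 1)
gauss_running = np.cumsum(gaussian) / counts
cauchy_running = np.cumsum(cauchy) / counts

for k in (1_000, 10_000, 100_000):
    print(k, round(gauss_running[k - 1], 3), round(cauchy_running[k - 1], 3))
# The Gaussian column drifts toward 0 as k grows; the Cauchy column keeps
# jumping around no matter how many samples you add.
```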

So what are the actual, philosophical and statistical assumptions about the universe that superforecasters are relying on?


Practical Q

I work professionally with North Korea. I put a lot of time into studying their culture, geopolitics, language, etc. in order to make my professional work more effective. I've long thought about how to quantify this work, both to make my work even more effective and to convince other people that I am an expert on this topic. How do I, as a practical matter, go about starting to forecast on a very niche topic like this?

My impression is that most forecasters work very generally and basically try to eke out an edge over the general populace by (like you mention) not being fooled by basic statistical fallacies. This lets forecasters make more level-headed judgements about a wide range of topics, most of which are well-established questions that normies also think about (who will win the election? will an epidemic cause a downturn in the economy? etc.).

But I am interested only in a very narrow domain where there are basically no established questions to ask. With regard to North Korea, the basic questions might be:

  • Will Kim Jong Un die this year? (Almost certainly no; without looking it up, I'd guess the actuarial tables put him at a <5% chance of death.)
  • Will the North and South declare war? (Also almost certainly no; I'd put it at <1%.)
  • Will the North and South have a military skirmish? (This happens 1-2 times per decade, so let's say 20%.)

But these are all super basic questions that anyone moderately politically aware could reasonably answer. There's no opportunity for me to develop my skill with questions like this, and there's not a "large enough n" for me to meaningfully test my skill. So I need to develop more detailed questions if I want to really improve my forecasting ability. But how? Some more detailed questions could be:

  • Will the North develop a new fully domestic cell phone in 2026? (I'd say 75% probability since they've been developing them the past few years. But then what exactly counts as "new" and what exactly counts as "fully domestic"?)
  • What will the price of rice be in Jan 2027? (It's currently about 1,800 won per kilo. I predict it will be <2,200 won in a year with 75% probability. Either a bad crop this year or more economic sanctions from the US could increase the price substantially, and I'd say the union of those two events is about 25% probable.)

But how do I go about actually creating good questions like this? You especially want the questions to be correlated with the "basic"/important questions above, but it's not at all clear to me that the ability to predict food prices is related to the ability to predict whether, and how large, a military conflict will be.
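
However the questions get generated, the standard way to score a small homemade set is a proper scoring rule like the Brier score. A minimal sketch (with made-up forecasts and outcomes, keyed to the example questions above) is below, though with only a handful of questions the score will be dominated by noise.

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between probabilistic forecasts and 0/1 outcomes.

    0.0 is perfect; 0.25 is what always guessing 50% earns.
    """
    assert len(forecasts) == len(outcomes)
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)

# Hypothetical North Korea questions resolved at year end (illustrative numbers only):
forecasts = [0.05, 0.01, 0.20, 0.75]   # death, war, skirmish, new phone
outcomes  = [0,    0,    1,    1]      # what actually happened (invented here)
print(round(brier_score(forecasts, outcomes), 3))  # ~0.176
```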


One last aside: you don't mention the intelligence community at all. This is where calibrated predictions are rewarded more than narrative, and it's where people who actually want to work as superforecasters end up. Some "fun" reading if you haven't already seen them are:

  1. "Psychology of Intelligence Analysis"

  2. "A Tradecraft Primer: Structured Analytic Techniques for Improving Intelligence Analysis"

  3. "Analytic Culture in the U.S. Intelligence Community"

These are all declassified CIA publications you can get from cia.gov. Most of my questions/frustrations expressed above are things that I've thought about from reading these works and talking to the people who use them professionally.

Some "fun" reading if you haven't already seen them are

I've been working in intelligence for 15 years and have read all of these books, and others from the canon.

I can say that none of these should be recommended now that Tetlock's work has been published. Some techniques from the structured analytic techniques toolbox don't work, and some definitely do more harm than good. And the continued teaching of these techniques in place of, e.g., reference class forecasting is so baffling to me that I can't express my frustration.

Heuer seems like a good guy who was doing his best to fix the terrible problems in the CIA at the time, but he's been superseded as an authority on this subject since Kahneman came along. Kahneman did experiments and knew the research; Heuer was going off his personal observations from his career, and suggested analytical techniques that were a little better than the gut feel of the Ivy League scotch-swizzling guys in the agency through the Cold War. He has been very accepting of new research and basically says, "I did my best to formalise analysis; if others can come along and do better, that's fantastic."

Since that time, we've come a long way.

R Pherson, on the other hand, is just scum. He refuses to acknowledge that the techniques in his books don't have any scientific basis. They've been directly measured across various studies, and he essentially argues that they do, in fact, work, despite them not working on any meaningful metric. He's made a career on the lecture circuit and knows that backing down would undermine his financial basis.

The real indicator that these techniques don't work is that nobody on the planet really uses them. If they improved outcomes, people would. Instead, everybody goes to these week-long courses, learns how to do a bullshit mind map or analysis of competing hypotheses, gets signed off as certified, and never looks at the techniques again.