This book is a deep dive into the science and art of forecasting. The author published academic research on forecasting (e.g. the results of his Good Judgment Project) and felt it was often misunderstood, especially by journalists.
The Good Judgment project fed many forecasting questions over a multiyear time-frame to a large group of volunteers, often amateur forecasters. The book draws insights from the best ones in the group.
As usual, I will summarize the book and comment on how it relates to value investing.
Chapter 1 – Optimistic Skeptic
The mainstream media caters to pundits that make vague statements about the future. Often, there’s a symbiotic relationship between attention-grabbing vague forecasts and the media.
- while bold statements create more buzz hence revenues for mainstream media channels, adding vagueness guarantees that this revenue can be repeated indefinitely
- sometimes the publisher encourages these types of forecasts: sell-side analysts might compete for the most attention-grabbing predictions like “Dow 10 000” or “Oil 30$”, which are intended to impress a large number of clients
Even serious public personalities sometimes make non-falsifiable statements, hedging themselves for the outcome:
- vague statements such as “may happen”, “is a serious possibility” or “significant/serious probability” can be interpreted ex-post as any probability between 0% and 100%
- a deadline is often not specified. One example was the 2010 open letter to Bernanke about the risks of “future inflation” (the signatories included Seth Klarman, Jim Chanos, James Grant, Niall Ferguson, AQR Capital and Elliott’s Paul Singer). Are the forecasters wrong because inflation has not materialized yet? They don’t specify which type of inflation (price inflation, asset inflation, or monetary inflation which is, well, true by definition), the deadline, or the odds. As a result, the statement below from the letter cannot be falsified:
The planned asset purchases risk currency debasement and inflation, and we do not think they will achieve the Fed’s objective of promoting employment.
The chapter also discusses how forecastable the future is, or could become with advances. The butterfly effect in complex non-linear systems guarantees that foresight of even Superforecasters is very limited when more than five years out. Even “big data” cannot push the temporal boundaries of forecasting.
- the last bit about the time boundary to forecasting complex systems reminded me of the empirical finance books by Robert Haugen on quantitative value investing
- Haugen found that market multiples correlate well with profit growth over the subsequent five years. In other words, the market predicts the fundamental performance of companies in the near future well. However, Haugen also correlated market multiples with profit growth between the fifth and tenth subsequent years, and that correlation was almost zero. (For Haugen’s empirical results see The New Finance. Haugen’s other book, which treats the low volatility anomaly, is The Inefficient Stock Market. Ironically, although the low volatility anomaly has worked well ever since CAPM came into existence, it is currently in vogue because of the smart beta ETF hype. The Warren Buffett quote “What the wise do in the beginning, the fools do in the end” might well apply here. Another author I can recommend on the low volatility anomaly is Eric Falkenstein, through his book and blog.)
- I agree on the big data limitations for forecasting. For other big data skepticism, I can recommend this Wired article by Nassim Taleb
Chapter 2 – Illusions of Knowledge
There are many reasons why humans are bound to kid themselves.
- Being scientific means practicing experiment and self-doubt, but typically this doesn’t reflect well on the expert if he is catering to a broad public. Examples:
- the way medicine was practiced for centuries was very unscientific, physicians killing many patients in the process because of arrogance of (illusory) knowledge. See also Antifragile’s chapter on medicine by Nassim Taleb
- Adopting a scientific way of thinking takes a lot of mental energy, as prioritizing intuitive snap judgment or fast thinking is in our genes to protect us from caveman dangers (see Thinking, Fast and Slow by Kahneman)
Advice to force scientific thinking is asking questions like “What could convince me I’m wrong?”, which is the equivalent of science setting up experiments to prove hypotheses wrong.
Also, closely monitor your mind so that it does not turn hard questions into simple ones, e.g. “is some animal lurking in the grass?” simplifies to “has an animal ambushed me in the past?”, or likewise the question “do I have cancer?” simplifies into “does the expert say I have cancer?”. This form of system 1 thinking is called bait-and-switch: subconsciously converting hard questions into easy ones.
- Other bait-and-switch examples in investing
- “is this a stock with solid future returns?” simplifying into “is this a company with great fundamentals?” (irrespective of valuation). See the nifty-fifty bubble and Ben Graham’s discussion on this bait-and-switch behavior of some retail investors that equate good companies with good stocks
- “will electric cars / the 3D printing technology break through?” simplifying into “will this growing industry offer attractive returns to me as a shareholder” (“Don’t equate potential for social progress with shareholder returns, just look at airlines” – Warren Buffett)
- Note that this is even more dangerous in assets or collectibles without a real anchor, e.g. bitcoin and alt-coins. The question “will bitcoin rise?” simplifies into “will the blockchain have a bright future?”. Likewise, for gold the question “will gold rise?” simplifies to “will there be war?” or “will the monetary system crumble?”, with no regard as to how much of this is already priced in. Indeed, the question of how much is already priced in is so hard to answer for these no-anchor assets that bait-and-switch has an exceptional appeal
Chapter 3 – Keeping Score
Forecasts with vague probability statements and no time-frames are driven by tip-of-the-nose intuition. A big drawback is that feedback is impossible as hindsight bias will almost ensure forecasters interpret their vague statements favorably after the fact. This is what Mr. Tetlock calls the wrong-side-of-maybe fallacy, stretching “maybe” to the ex-post correct outcome in the forecaster’s mind.
The conclusion is that, to be scientific and get real feedback after the fact, we cannot get around the necessity of using rigorous definitions of what we are forecasting and numerical estimates of probabilities. The other advantage is that specific statements force the mind to think more, whereas vague statements lull the mind into warm fuzzy thoughts. Indeed, Tetlock showed in a randomized trial that even within the domain of numerical estimates, forcing subjects to use more decimals in their probabilities made their forecasts more accurate!
Diving deeper in the hard science of forecasting we have several key concepts that can be quantified:
- Calibration – percentage correct versus percentage forecasted. If we group forecasts into buckets by the forecaster’s estimated probability and plot the percentage of positive outcomes in each bucket against that estimated probability, a perfectly calibrated forecaster produces a straight 45° line
- Resolution – a measure that captures whether a forecaster dares to make forecasts outside the “maybe” domain of 60%-40% to 40%-60% and ventures into the more extreme and more difficult domain of predicting 90%-10% when warranted: “well calibrated but cowardly” = poor resolution, “well calibrated and brave” = good resolution
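The calibration idea can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function name `calibration_table`, the ten equal-width buckets and the sample forecasts are all made up and do not come from the book.

```python
from collections import defaultdict


def calibration_table(forecasts, outcomes, n_buckets=10):
    """Group forecasts into probability buckets and compare, per bucket,
    the average forecast with the observed frequency of positive outcomes.

    forecasts: probabilities between 0 and 1
    outcomes:  1 if the event happened, 0 if it did not
    Returns a list of (mean forecast, observed frequency, count) tuples.
    """
    buckets = defaultdict(list)
    for p, o in zip(forecasts, outcomes):
        # A forecast of exactly 1.0 goes into the last bucket.
        idx = min(int(p * n_buckets), n_buckets - 1)
        buckets[idx].append((p, o))

    table = []
    for idx in sorted(buckets):
        ps = [p for p, _ in buckets[idx]]
        os_ = [o for _, o in buckets[idx]]
        table.append((sum(ps) / len(ps), sum(os_) / len(os_), len(ps)))
    return table


# Hypothetical forecaster: ten 20% calls (2 came true), ten 80% calls (8 came true).
forecasts = [0.2] * 10 + [0.8] * 10
outcomes = [1, 1] + [0] * 8 + [1] * 8 + [0, 0]
for mean_p, freq, n in calibration_table(forecasts, outcomes):
    print(f"forecast ~{mean_p:.2f}  observed {freq:.2f}  (n={n})")
```

In this toy example both buckets sit on the 45° line, so the forecaster is well calibrated. A forecaster who only ever said 50% could be equally well calibrated, which is why resolution has to be looked at separately.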
The Brier score (Wikipedia) aggregates these forecasting qualities into one number for a set of forecasts. The Brier score captures the mean squared difference of actual outcomes versus the forecaster’s estimated probability of those outcomes occurring. Hence it is a cost function that ideally is as close to zero as possible. In short, it captures forecasting ability well.
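As a minimal sketch, the mean-squared-difference form of the Brier score described above could be computed as follows. The function name and the sample forecasts are my own; note that some sources use Brier’s original formulation, which sums over all outcome categories and therefore ranges from 0 to 2 for yes/no questions instead of 0 to 1.

```python
def brier_score(forecasts, outcomes):
    """Mean squared difference between forecast probabilities and outcomes.

    forecasts: probabilities (0..1) that the event happens
    outcomes:  1 if the event happened, 0 otherwise
    0.0 is a perfect score; lower is better.
    """
    if not forecasts or len(forecasts) != len(outcomes):
        raise ValueError("need equally long, non-empty lists")
    squared_errors = [(p - o) ** 2 for p, o in zip(forecasts, outcomes)]
    return sum(squared_errors) / len(squared_errors)


outcomes = [1, 0, 1]                               # hypothetical results
hedger = brier_score([0.5, 0.5, 0.5], outcomes)    # always says "maybe"
brave = brier_score([0.9, 0.1, 0.9], outcomes)     # confident and right
print(hedger, brave)
```

The permanent “50-50” forecaster scores 0.25 no matter what happens, while the confident and correct forecaster gets close to zero. Confident and wrong forecasts are punished heavily because the error is squared.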
Chapter 4 – Superforecasters
This is where the book starts drawing lessons from common characteristics of best performing forecasters in Tetlock’s Good Judgment Project.
The main feature of Superforecasters is being open minded to different ideas, acknowledging that useful info is widely dispersed
- ideological forecasters use colored lenses to look at the world and because of confirmation bias see everything as confirmation to their “big idea”
- a good example of the use of looking at problems from different perspectives is the “Guess 2/3 of the average guess between 0 and 100” problem, the logical and psycho-logical both count to guess well
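To make the 2/3 game concrete, here is a small sketch of the purely logical line of reasoning. It assumes (my assumption, not the book’s) that completely naive players guess 50 on average and that each further step of reasoning multiplies the previous guess by 2/3.

```python
def level_k_guess(k, naive_average=50.0, factor=2 / 3):
    """Guess of a player who assumes everyone else reasons k-1 steps deep.

    k = 0 is the naive player; each extra level takes 2/3 of the level below.
    """
    return naive_average * factor ** k


for k in range(5):
    print(f"level {k}: guess {level_k_guess(k):.1f}")
```

Pure logic keeps iterating until the guess reaches zero, but the winning guess depends on how many steps the other players actually take. That is the psycho-logical part, and a forecaster needs both perspectives.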
“To the man with the hammer, every problem looks like a nail.” – Charlie Munger (or the dangers of ideology in this context)
As for the lucky-coin-tossers argument that would invalidate Tetlock’s Superforecasters, Tetlock cites Michael Mauboussin, who says:
To see how much luck plays a role in a skill/luck game, see how much mean reversion occurs in players with good results.
The author tested whether his forecaster group mean reverted fast over multiple years. He concludes that the best forecasters from the group, Superforecasters, could not be reproduced by coin tossers.
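The mean reversion test can be illustrated with a toy simulation. Everything here is an assumption for illustration only: the model (score = weighted skill plus weighted luck), the weights and the number of players are invented and are not Tetlock’s data.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation between two equally long lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


def year_to_year_correlation(skill_weight, n_players=2000, seed=0):
    """Correlate the scores of the same players in two simulated years.

    skill_weight = 0 means pure coin tossing, 1 means pure skill.
    """
    rng = random.Random(seed)
    skills = [rng.gauss(0, 1) for _ in range(n_players)]

    def one_year():
        # Skill persists across years, luck is drawn fresh every year.
        return [skill_weight * s + (1 - skill_weight) * rng.gauss(0, 1)
                for s in skills]

    return pearson(one_year(), one_year())


pure_luck = year_to_year_correlation(skill_weight=0.0)
mostly_skill = year_to_year_correlation(skill_weight=0.7)
print(f"pure luck: {pure_luck:.2f}, mostly skill: {mostly_skill:.2f}")
```

With pure luck, last year’s winners are no better than anyone else this year, so the year-to-year correlation is close to zero (complete mean reversion). The more skill matters, the more rankings persist.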
- a very big lesson here is that ideology and investing do not mix well at all
- one should be suspicious of investors making ideological arguments for certain macro views / stocks.
- in this we should be resolute and distrust even superinvestors
- Seth Klarman: this reminds me of the post by Bronte Capital’s John Hempton about how Klarman went long a gold miner that Hempton soon found out was operated by tricksters and that soon went to zero (Hempton went through Soviet documents about the deposits and knew they were worth approximately zero). Mr. Hempton made a very convincing case on how Klarman got blindsided: the ideology of being an Austerian (Klarman was also a signatory of the Bernanke letter) and wanting to believe that Keynesian policy would lead to disaster.
- David Einhorn’s 2011 investment in miners and gold was arguably outside his circle of competence as a stock picker and did not do well. He justified his buys with a lot of ideology at some conferences (see Buttonwood)
- if I invert this, I’d say that investors investing “against” their own ideology deserve some attention (e.g. Hugh Hendry)
Klarman became a victim of not only ideology (Chapter 4) as precious metals didn’t do well, but also bait-and-switch “precious metals will do well, hence miners will do well”
Synthesis of Hempton’s post on how Klarman got blindsided, combined with Superforecasting
Chapter 5 – Supersmart
Besides the main trait of open-mindedness, Tetlock found that Superforecasters were of above-average intelligence (but not necessarily with IQs above 130), had a growth mindset (the viewpoint that intelligence is something you can develop, not something genetically fixed), and decomposed problems.
1. The Fermi Technique
To answer a difficult question like “how many piano tuners are active in Chicago”, the physicist Fermi decomposed this into sub-problems to get to a surprisingly accurate answer. Central questions are:
- What would have to be true for this to happen?
- What information would allow me to answer this question?
It is very important to focus on these questions first to prevent system 1 thinking from taking over.
- guess the number of pianos in Chicago
- how many people live in Chicago
- what percentage own a piano
- how many pianos are at institutions
- how many pianos need tuning each year
- how long does tuning take per piano
- how many hours a week does a tuner work
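The decomposition above can be written out as a short calculation. The input numbers below are rough illustrative guesses of my own (hypothetical, not measured data); the point is the structure of the estimate, where each sub-question gets its own explicit number that can be challenged separately.

```python
def fermi_piano_tuners(population, owner_share, institutional_pianos,
                       tunings_per_piano_per_year, hours_per_tuning,
                       tuner_hours_per_year):
    """Estimate the number of piano tuners a city can keep busy."""
    pianos = population * owner_share + institutional_pianos
    tuning_hours_needed = pianos * tunings_per_piano_per_year * hours_per_tuning
    return tuning_hours_needed / tuner_hours_per_year


estimate = fermi_piano_tuners(
    population=2_500_000,            # guess: people living in Chicago
    owner_share=0.01,                # guess: 1 in 100 people owns a piano
    institutional_pianos=25_000,     # guess: schools, bars, concert halls
    tunings_per_piano_per_year=1.0,  # guess: tuned once a year
    hours_per_tuning=2.0,            # guess: two hours per piano
    tuner_hours_per_year=1_600,      # guess: full-time job minus travel time
)
print(f"roughly {estimate:.0f} piano tuners")
```

Individual guesses will be off, but errors in the sub-estimates partly cancel out, and writing them down shows exactly which assumption to revisit when the answer looks wrong.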
Another example of why it is so important to approach the problem rationally from the start is the Arafat question. Following Arafat’s death an investigation into polonium poisoning was launched. The forecasting question was: will the investigators find polonium in Arafat’s body?
A bait-and-switch pitfall would be to conflate the question with “did Israel poison Arafat?”, because many more possibilities exist (a Palestinian rival, the CIA, polonium planted after the fact, and even natural amounts of polonium in the body). Therefore always ask “what would it take for this to happen?”. Also note that the question is not “was Arafat murdered with polonium?” but “will polonium be found?”. What would it take for this not to happen? One significant possibility: the investigators messing up the investigation.
What would have to be true for this to happen? What would it take for this not to happen?
Summarizing, we always have to ask ourselves these serious questions before our system 1 thinking takes over.
2. The outside view
What we have to explore first is what Kahneman calls The Outside view or the base rate in statistics. Ask what the probabilities are for certain things to happen as viewed from an outsider perspective.
For example, “Tom likes racing on circuits and often speeds in traffic. He commutes to work every day. What is the probability he has a car crash in the next five years?” A gut-feeling answer would be swayed by story elements like “racer” and would guess a probability from this emotional element. The base rate approach would be to find the rate of crashes per km driven for the sub-group that Tom belongs to (e.g. male, 20-30 years old, American), using all quantifiable objective elements that are available.
The reason why the base rate is important is that people are suckers for stories and have been shown to overestimate particulars of a story and underestimate the weight of the benchmark.
The reason why starting with the base rate is paramount is that this order makes anchoring bias work to our advantage (instead of to our disadvantage, as when we start with our gut-feeling probability), given the previous point that we tend to underestimate the weight of the base rate. Only after knowing the base rate should we carefully tilt it by taking into account the qualitative elements of the story.
The reason why starting with base rates is so important is that we make anchoring bias work to our advantage.
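One possible way to formalize “anchor on the base rate, then tilt” is to work in odds and multiply by a likelihood ratio, as in Bayes’ rule. This is my own sketch rather than a procedure from the book, and both the 10% base rate and the likelihood ratio of 2 below are invented numbers.

```python
def tilt_base_rate(base_rate, likelihood_ratio):
    """Adjust a base rate for case-specific evidence.

    base_rate:        outside-view probability (strictly between 0 and 1)
    likelihood_ratio: how much more likely the specifics of the story are
                      if the event happens than if it does not
                      (1.0 = the story adds no information)
    """
    prior_odds = base_rate / (1 - base_rate)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)


# Hypothetical: 10% of comparable drivers crash within five years, and we
# judge Tom's speeding habit to be twice as typical for drivers who crash.
print(f"{tilt_base_rate(0.10, 2.0):.1%}")
```

The outcome is about 18%, a clear tilt away from the anchor but far from the near certainty that the “racer” story suggests to the gut.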
3. Variant views, what are experts saying, pre-mortems, forget estimation and re-estimate
After having incorporated a tilt to the base rate, next we want to assume we are wrong. When researchers told forecasters they were wrong and had to find a better estimate, their subsequent new estimate was more accurate.
Assuming one is wrong in advance and making an analysis of what went wrong is called a pre-mortem and can provide good insights.
Finding variant opinions might also improve one’s grasp of the situation.
Another technique is to first forget one’s estimation process after letting several weeks pass and do another estimate. It turns out that these second estimates are on average more accurate.
Lastly, turning a question on its head can reveal new questions: for example, “will South Africa allow the Dalai Lama?” becomes “will South Africa deny the Dalai Lama?” and then “why would South Africa deny the Dalai Lama?”
All these techniques together provide the Superforecaster with a dragonfly-eye that has approached the problem from many perspectives to form an accurate view.
- finding variant perceptions is a classic in investing. I believe Michael Steinhardt was the first to coin the term in the ’80s (I got this from his book, which I don’t really recommend). Personally I think this is extra important in investing as you want to understand why the opportunity exists (what is the consensus view, or who is selling/buying). David Einhorn is also known for saying he typically looks at situations for which he already knows reasons why they could be mispriced. I try to do the same because
- not doing so will lead one to research many more mediocre opportunities. Because investing is a decision making game that is ultimately won with limited resources (i.e. research time and information), great filters are needed. See the concept of Bounded Rationality by Herbert Simon.
- if I do find an opportunity that looks great, but I can’t formulate an answer to why this “great” opportunity exists, I am more likely to be missing something myself! This ties in with Warren Buffett, who says
If you do not know who the patsy is at the table, you’re the patsy.
Chapter 6 – Superquants
Numeracy is often misunderstood as an advantage for forecasters. Superforecasters were found to be highly numerate people, but weren’t explicitly using many numbers. Rather they were “thinking in” numbers instead of words. Tetlock:
Superior numeracy does help superforecasters, but not because it lets them tap into arcane math models that divine the future. The truth is simpler, subtler, and much more interesting.
Many non-numerate people use three-setting mental dials and think in low resolutions: impossible, sure or maybe “50-50”. It was shown (this is somewhat trivial) that people who gave 50%-50% often as an answer were much less accurate in their forecasts.
Non-numerate people also tend to believe in fate, which is the opposite of probabilistic thinking. Finding meaning in events is positively correlated with well-being but negatively with forecasting ability.
- I relate a lot to this argument that numeracy is misunderstood as an advantage in forecasting/investing. I consider myself numerate but don’t use any arcane math in investing. I also believe that Buffett speaks the truth when he says he doesn’t use any higher order math, although I’m sure he knows higher order math.
Chapter 7 – Supernewsjunkies
The best forecasters continually updated their probability forecasts to get a good Brier score by keeping current to the news. They craftily balanced underreaction with overreaction to news.
When the facts change, I change my mind. What do you do? – J.M. Keynes
- in investing the danger of news is primarily to overreact because of recency bias (facts that are easiest to recall get more weight in judgments)
- One way to avoid this is described by Guy Spier in his book The Education of a Value Investor. Guy uses the rule to never trade while the markets are open.
- by using this rule, recency bias is attenuated because even the most recent news becomes older. It also forces an investor to rethink, similar to the finding from Chapter 5 that forgetting one’s first analysis and redoing it benefits the accuracy of the analysis
- when news is unfavorable to an investor’s position, the danger shifts to underreaction because of confirmation bias
Chapter 8 – Perpetual beta
Superforecasting requires a lot of mental energy to keep system 2 thinking running. A great forecaster knows that he should always be on guard against mental biases. Overconfidence might lure him back to mediocrity through snap judgments or system 1 thinking, because he has built a great track record and hence “his judgment must be correct”. For a superforecaster to stay on track, he needs to keep improving his abilities and question himself. In computer programmer terms he is in perpetual beta.
An important factor in remaining grounded in reality is good feedback. Research has shown that police officers are overconfident about their lie-detection abilities, and become even more detached from reality the older and more “experienced” they become. This is because their feedback is very limited.
Superforecasters make post-mortems of wrong estimates and put a lot of effort in these analyses.
- It speaks for itself that this is true for investors, especially when they are in the spotlight for good performance. For CEOs there’s a similar effect: the stocks of “manager of the year” companies perform pretty badly in the subsequent year. There may be a myriad of other causes, but one of them is almost certainly the overconfidence one gets from the public spotlight.
- On feedback: luckily, markets provide feedback, although in the short term it is very noisy. Despite the noise, it is important to remain fact-based and not mentally retain big winners while forgetting one’s losing positions. An easy shortcut to stay grounded in reality is to look at aggregate portfolio performance, which is already less noisy than individual positions.
Chapter 9 – Superteams
Teams are difficult to analyse and to control for.
General findings were that teams that
- got instructions to work together constructively but avoid groupthink, e.g. “tell me what you think is wrong about my reasoning” (i.e. pre-mortem encouragement), performed better than the average of the individuals
- had members with diverse skills performed better, as the wisdom-of-crowds effect kicks in when diverse information is put together
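A small sketch of why pooling forecasts can beat the average individual. The four team forecasts below are invented; the scoring rule is the squared error of a single probability forecast, as in the Brier score.

```python
from statistics import mean


def squared_error(probability, outcome):
    """Brier-style penalty for a single forecast (0 is perfect)."""
    return (probability - outcome) ** 2


# Hypothetical team: four members forecast the same event, which then happens.
member_forecasts = [0.9, 0.6, 0.4, 0.8]
outcome = 1

team_forecast = mean(member_forecasts)
team_score = squared_error(team_forecast, outcome)
average_member_score = mean(squared_error(p, outcome) for p in member_forecasts)

print(f"team forecast {team_forecast:.3f} scores {team_score:.3f}")
print(f"average individual score {average_member_score:.3f}")
```

Because the squared penalty is convex, the averaged forecast can never score worse than the average of the individual scores, and the gain is larger the more the members disagree. That is one reason diverse information helps. It does not guarantee that the team beats its best member.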
Chapter 10 – The Leader’s Dilemma
To be a great forecaster takes self-doubt and experimentation. How do you reconcile this with the qualities that are expected from a leader (i.e. a prominent decision-maker), like confidence and decisiveness?
An answer was found in the Prussian army and the Wehrmacht. These armies had a very decentralized structure where superiors tell their subordinates what to achieve but not how to achieve it. Note: Hitler violated these rules he inherited from the Wehrmacht, and some key Wehrmacht mistakes during the Normandy landings were arguably due to Hitler’s insistence on top-down commands.
A culture of second guessing superiors about battlefield tactics was nurtured in junior officers. Although everyone was encouraged to second-guess oneself and everyone else, the leaders’ mindset exhibited calmness and decisiveness once that decision-making process was completed.
- the decentralized way the Prussian army functioned is similar to Berkshire Hathaway. Buffett clearly communicates to his managers he prioritizes high ROIC over growth for its own sake, but he does not tell his managers how to achieve that
Staying a superforecaster is inherently difficult because of the overconfidence pulling us back toward system one thinking.
We can be cautiously optimistic about the future of scientific forecasting. On the one hand, the rise of prediction markets and the trend at intelligence agencies to use probability numbers instead of vague language seem to be firmly established.
On the other hand, forecasting will always remain a sensitive topic: forecasting that Trump might win the presidential election creates a backlash from the Democratic Party, which might conflate the forecast with political views. Many professional career pundits will keep on self-selecting for sensational content for mainstream media channels that need revenues. They will keep on using vague language if they want to be assured of their long-term career. Famous talking heads will keep on using the wrong-side-of-maybe fallacy after the fact to defend their reputation.
Lastly, some questions that really do matter cannot be framed in forecasting questions. The skill of asking the right question (a question that is worth investigating) is often independent of the skill of forecasting. An example is the Korean conflict: “will North Korea launch a rocket to country X by time T?” is a question that is perhaps not that worthwhile to answer. Meanwhile, “how does this all unfold?” is more worthwhile but more difficult to frame in binary/multiple choice forecast problems. A solution can be to dissect the big question that matters into sub-problems, or a “cluster of forecasts” belonging to the big question (e.g. nuclear tests, rocket launch to country X , Y, cyber attacks, artillery shelling, nuclear war). A column writer like Thomas Friedman doesn’t have a great track record for well-defined clear forecasts, but people like him that think more qualitatively raise many interesting questions (e.g. certain aspects of the outcome of the Iraq war) that forecasters can then try to solve.
- there exist several hidden agendas for biased or vague forecasts: political motives, brokers wanting to generate buzz and commissions, mainstream media pundits aiming for a career in eye-grabbing vague statements
- to improve forecasting ability, one needs feedback. The only way to get clear feedback is to lay out clear forecasts in numbers (that is, probability estimates)
- use base rates first, adapt to qualitative specifics after
- ask yourself scientific questions:
- What would have to occur for this (not) to happen?
- What information would allow me to answer this question?
- decompose the problem into subproblems “the Fermi technique”
- assess several points of view and thinkers, steer clear of ideology
- forecasting well is inherently fragile as overconfidence directs us to use our “great” snap judgment again
- asking the right questions is a whole different problem