When Should You Deviate From the Base Rate?

by Ryan Beck

  • From the Metaculus Journal series 'Shared Vision: Pro Forecaster Essays on Predicting the Future Better'

Ryan Beck is Metaculus's Forecasting Program Coordinator, has predicted as a Metaculus Pro Forecaster, and is an INFER Pro forecaster, where he currently ranks 5th on the all-time forecasting leaderboard. Find him on Twitter here.

If you ask an experienced forecaster for guidance, one of the first things they’ll tell you is to consider the base rate. The base rate is a measure of how prevalent an occurrence is in a certain population: how common a trait is in a group of people, for example, or how frequently an event has occurred over a certain period of time. The term “base rate” is often used interchangeably with other terms and concepts, such as the outside view, reference class, prior probability, and recurrence interval. But are these concepts really all the same, and how do skilled forecasters know when to deviate from the base rate?

    Why Base Rates?

The value of the base rate and its application to forecasting were popularized by psychology researchers Daniel Kahneman and Amos Tversky. In their paper On the Psychology of Prediction (Psychological Review, 1973), they present several experiments demonstrating base rate neglect: people’s tendency to favor information about specifics over the underlying statistical probability.

In one experiment, three groups were posed different questions. One group estimated the prevalence of nine graduate school specializations, while the other two were given a description of a hypothetical student. The first of these two groups ranked how similar the student was to each of the nine specializations, and the second predicted the student’s likely specialization. The results showed a stronger correlation between the similarity and likelihood rankings than between the base rates and likelihood rankings, suggesting participants weighted the specific details of the description more heavily than the prevalence of each specialization.

    Kahneman and Tversky (Psychological Review, 1973) — Table 1

In a second experiment, participants were given personality test descriptions of hypothetical individuals, with one group being told the tests were from a sample of 30 engineers and 70 lawyers, while another group heard the reverse ratio. The participants were tasked with determining the probability that each description belonged to an engineer, as well as the chance of randomly selecting an engineer's test from the full sample. While they accurately gauged the latter, they seemed to overlook the base rate of engineers in their sample when evaluating individual descriptions, assigning probabilities based on the described characteristics instead.

    These experiments demonstrate how easy it can be to ignore base rates. This has strong implications for making accurate forecasts. When making predictions we might be tempted to focus primarily on the individual factors we believe are relevant and ignore the underlying prevalence, and this tendency can lead us astray.

    Kahneman and Lovallo (Management Science, 1993) further described these concepts as the inside view and the outside view, which they defined as follows:

    An inside view forecast is generated by focusing on the case at hand, by considering the plan and the obstacles to its completion, by constructing scenarios of future progress, and by extrapolating current trends. . . [The outside view] focuses on the statistics of a class of cases chosen to be similar in relevant respects to the present one. The case at hand is also compared to other members of the class, in an attempt to assess its position in the distribution of outcomes for the class.

By incorporating the base rate or outside view, forecasters guard against focusing too much on specific circumstances and overlooking larger factors. The outside view is broad: it typically does not account for the specifics of the case at hand, but it establishes a probability grounded in all the factors that contributed to previous occurrences.

    Are These Terms Really Interchangeable?

    At the beginning I said:

    The term “base rate” is often used interchangeably with other terms and concepts, such as the outside view, reference class, prior probability, and recurrence interval.

However, there are important differences between these concepts. In the original examples provided by Kahneman and Tversky, the “base rate” always referred to an underlying statistic about a population: an inherent prevalence of some trait. A classic illustration is the unintuitive result of administering a test with a certain rate of false positives: when a trait has a low incidence in a population, a test for that trait with even a moderate false-positive rate may return more false positives than true positives.
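
To make this concrete, here is a minimal sketch in Python using Bayes’ theorem. The prevalence, sensitivity, and false-positive rate below are hypothetical numbers chosen for illustration, not figures from Kahneman and Tversky:

```python
# Hypothetical illustration of the base rate fallacy via Bayes' theorem.
base_rate = 0.01       # P(trait): 1% of the population has the trait
sensitivity = 0.95     # P(positive | trait)
false_positive = 0.05  # P(positive | no trait)

# Total probability of a positive test (law of total probability)
p_positive = sensitivity * base_rate + false_positive * (1 - base_rate)

# P(trait | positive) via Bayes' theorem
p_trait_given_positive = sensitivity * base_rate / p_positive

print(f"P(trait | positive test) = {p_trait_given_positive:.1%}")  # ~16.1%
```

Even with a fairly accurate test, a positive result here is wrong about five times out of six, because true positives are so rare.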

In forecasting, the concept of a base rate has been applied more broadly. The base rate is often taken to mean the historical prevalence of an event: the number of pandemics of a certain size, for example, or the annual count of hurricanes. But are there really underlying traits for these events inherent in the period of interest? For hurricanes, I find it likely that there are indeed underlying factors that make a certain range of outcomes probable. For other types of events, however, there are key differences. Say you know that 5% of a population has red hair, and you select someone at random from that population. You can be confident that the person you selected has a 5% chance of having red hair. But say you know (using arbitrary numbers) that North Korea has tested an ICBM in 5% of the last 50 years. What then is the probability that North Korea will test an ICBM next year? It may be good practice to start from a baseline of 5%, but there’s certainly no basis for arguing that the inherent statistical probability is actually 5%. An ICBM test is not a random occurrence; significant political factors dominate the outcome.

    BBC: North Korea: What missiles does it have? — March 20, 2023

While this may seem like a minor semantic detail, it’s important to recognize that not all base rates are created equal, and using the term indiscriminately can obscure how much weight different kinds of base rates deserve. The term “outside view,” as defined by Kahneman and Lovallo, may be more applicable, as it refers to the broader practice of determining the prevalence of some occurrence within a reference class.

Even the term “outside view” can be abused. As Daniel Kokotajlo argues, forecasters sometimes use it to mean various things, from trend extrapolation to their own intuition about the prevalence of some event. Be wary of claims based on an “outside view” that don’t actually consider the prevalence of an occurrence within a reference class!

    And finally, even if you do take care to base your forecasts only on a considered outside view backed by a reference class, it’s important to think about the usefulness of that reference class. Picking a reference class is hard. Should I only consider events in the last 10 years? 50 years? All of known history? Should I use the prevalence of all civil wars to determine the probability of civil war in the United States, or limit it to only civil wars in countries with certain traits? This is a challenge that’s been known since at least 1876, when John Venn (yes, the inventor of Venn diagrams) wrote the following:

    . . .it must be remembered that each individual thing has not one distinct and appropriate class or group, to which, and to which alone, it properly belongs. We may indeed be practically in the habit of considering it under such a single aspect, and it may therefore seem to us more familiar when it occupies a place in one series rather than in another; but such a practice is merely customary on our part, not obligatory. It is obvious that every individual thing or event has an indefinite number of properties or attributes observable in it, and might therefore be considered as belonging to an indefinite number of different classes of things.

Still, all is not lost. A forecaster can use their discretion to pick the most applicable reference class. It may even be valuable to compare the probabilities found from several reference classes and aggregate them (a sketch of one simple aggregation approach follows Venn’s passage below). Methods have even been devised to help alleviate the issue of picking the right units of time. At the end of the relevant chapter in his book, John Venn describes the competing factors that must be balanced to arrive at the best reference class:

    With regard to choosing one of these series rather than another, we have two opposing principles of guidance. On the one hand, the more special the series the better; for, though not more right in the end, we shall thus be more nearly right all along. But, on the other hand, if we try to make the series too special, we shall generally meet the practical objection arising from insufficient statistics.
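
To illustrate the aggregation idea mentioned above, here is a minimal sketch that pools the base rates from several candidate reference classes by averaging their log-odds, one common way to combine probability estimates. The reference classes and event counts are invented for illustration:

```python
import math

# Hypothetical base rates drawn from several candidate reference classes
# (the event counts and windows below are invented for illustration).
reference_classes = {
    "all countries, last 10 years": 1 / 10,
    "all countries, last 50 years": 4 / 50,
    "similar countries, last 50 years": 6 / 50,
}

def log_odds(p):
    return math.log(p / (1 - p))

def from_log_odds(x):
    return 1 / (1 + math.exp(-x))

# Average the estimates in log-odds space, then convert back to a probability.
mean = sum(log_odds(p) for p in reference_classes.values()) / len(reference_classes)
pooled = from_log_odds(mean)
print(f"Pooled base rate: {pooled:.1%}")  # ~9.9% for these numbers
```

Averaging log-odds rather than raw probabilities is a judgment call; a simple mean of the probabilities is also a defensible choice.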

    Deviating From the Base Rate

The evidence suggests that using base rates is correlated with better forecasting accuracy, so you should be very careful about deviating too far from the base rate. But is it ever okay to deviate far from it? And if it’s sometimes acceptable, how do you know when you should?

There are no hard and fast rules, but there are signposts to look for. The key question to ask is whether the underlying conditions have changed dramatically. This question too is nontrivial: some aspects can change while others remain the same, and if in doubt, don’t throw the base rate out. A number of factors can be consistent across similar outcomes that, at the surface level, appear to have different causes. The base rate is valuable because it captures these variables, which don’t easily surface in an analysis of case-specific factors. The inside view focuses only on the apparent factors while neglecting the others at play.

Here’s a concrete example where deviating from the base rate was useful: the effects of COVID-19 on inflation. A naive view from early in the pandemic might note that 15 of the last 73 years had annual inflation over 5%. Or perhaps you believe a more recent period is applicable, in which case you might note that zero of the last 30 years experienced inflation over 5%. So your base rate for a year surpassing 5% inflation could range from 0% to about 20%. (Note that you should always account for a prior; a base rate of 0% is rarely sensible.) This approach would have led you astray, because most of the sample under consideration did not include extreme scenarios such as a global pandemic. This adherence to the base rate might have led institutions like the Federal Reserve and other economic organizations to underestimate the surge in inflation.
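
As a rough sketch of how those numbers translate into base rates, using the counts cited above (the add-one smoothing is my own illustrative way to keep a small sample from producing an implausible 0%):

```python
# Base rate of >5% annual inflation over two lookback windows,
# using the counts cited above.
windows = {
    "last 73 years": (15, 73),
    "last 30 years": (0, 30),
}

for label, (hits, years) in windows.items():
    raw = hits / years
    # Laplace (rule of succession) smoothing avoids a hard 0% estimate.
    smoothed = (hits + 1) / (years + 2)
    print(f"{label}: raw {raw:.1%}, smoothed {smoothed:.1%}")
```

Even the smoothed estimates would have badly underrated the post-pandemic inflation surge, which is the point of the example.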

    Annual Percentage Change in the Consumer Price Index (CPI)

The pandemic shook the status quo in countless ways, and I wouldn't suggest all of these were predictable. I myself adhered too closely to base rates and did not foresee the substantial increase in inflation or how long it would last. However, despite the challenge of predicting sudden shocks or departures from the status quo, we aren't powerless: considering whether events represent a departure from the conditions that produced the base rate can help us know when to favor other factors, or when to pivot to a more representative reference class that enables more accurate and better calibrated forecasting.
