BARBERIS AND THALER
people often fail to take the size of the sample into account: after all, a small sample can be just as representative as a large one. Six tosses of a coin resulting in three heads and three tails are as representative of a fair coin as 500 heads and 500 tails are in a total of 1,000 tosses. Representativeness implies that people will find the two sets of tosses equally informative about the fairness of the coin, even though the second set is much more so.
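The difference in informativeness can be made concrete with a likelihood comparison. The sketch below is illustrative only: the alternative hypothesis of a biased coin with a 0.6 probability of heads, and the function name `log_likelihood_ratio`, are assumptions introduced here and do not come from the text. It compares how strongly each sample favors a fair coin over that hypothetical alternative.

```python
from math import exp, log

def log_likelihood_ratio(heads, tosses, p_fair=0.5, p_biased=0.6):
    """Log of P(data | fair coin) / P(data | biased coin).

    The binomial coefficient is the same under both hypotheses,
    so it cancels and is omitted. p_biased = 0.6 is an assumed,
    purely illustrative alternative.
    """
    tails = tosses - heads
    return (heads * log(p_fair / p_biased)
            + tails * log((1 - p_fair) / (1 - p_biased)))

small = log_likelihood_ratio(3, 6)        # three heads in six tosses
large = log_likelihood_ratio(500, 1000)   # 500 heads in 1,000 tosses

print(f"6 tosses:     likelihood ratio = {exp(small):.2f}")
print(f"1,000 tosses: likelihood ratio = {exp(large):.3g}")
```

Under these assumptions the six-toss sample favors the fair coin by a factor of only about 1.13, while the 1,000-toss sample favors it by a factor on the order of 10^8, even though both samples show exactly 50 percent heads.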
Sample-size neglect means that in cases where people do not initially know the data-generating process, they will tend to infer it too quickly on the basis of too few data points. For instance, they will come to believe that a financial analyst with four good stock picks is talented because four successes are not representative of a bad or mediocre analyst. It also generates a “hot hand” phenomenon, whereby sports fans become convinced that a basketball player who has made three shots in a row is on a hot streak and will score again, even though there is no evidence of a hot hand in the data (Gilovich, Vallone, and Tversky 1985). This belief that even small samples will reflect the properties of the parent population is sometimes known as the “law of small numbers” (Rabin 2002).
In situations where people do know the data-generating process in advance, the law of small numbers leads to a gambler’s fallacy effect. If a fair coin generates five heads in a row, people will say that “tails are due.” Since they believe that even a short sample should be representative of the fair coin, there have to be more tails to balance out the large number of heads.
While representativeness leads to an underweighting of base rates, there are situations where base rates are over-emphasized relative to sample evidence. In an experiment run by Edwards (1968), there are two urns, one containing 3 blue balls and 7 red ones, and the other containing 7 blue balls and 3 red ones. A random draw of 12 balls, with replacement, from one of the urns yields 8 reds and 4 blues. What is the probability the draw was made from the first urn? While the correct answer is 0.97, most people estimate a number around 0.7, apparently overweighting the base rate of 0.5.
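The correct answer of 0.97 follows from Bayes' rule with equal prior probability on the two urns. The following is a minimal sketch of that calculation; the function name `posterior_first_urn` and its parameter names are illustrative choices made here, not part of Edwards's study.

```python
def posterior_first_urn(reds, blues,
                        p_red_first=0.7, p_red_second=0.3,
                        prior_first=0.5):
    """Posterior probability that the draws came from the first urn.

    The first urn holds 3 blue and 7 red balls, so P(red) = 0.7;
    the second holds 7 blue and 3 red, so P(red) = 0.3. Draws are
    made with replacement, so they are independent. The binomial
    coefficient is identical for both urns and cancels.
    """
    like_first = p_red_first ** reds * (1 - p_red_first) ** blues
    like_second = p_red_second ** reds * (1 - p_red_second) ** blues
    weighted_first = prior_first * like_first
    weighted_second = (1 - prior_first) * like_second
    return weighted_first / (weighted_first + weighted_second)

posterior = posterior_first_urn(reds=8, blues=4)
print(round(posterior, 2))  # 0.97
```

The likelihood ratio in favor of the first urn is (7/3)^4, roughly 29.6, so a prior of 0.5 is updated to 2401/2482, or about 0.967. The typical estimate of 0.7 moves far less from the base rate than the evidence warrants.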
At first sight, the evidence of conservatism appears at odds with representativeness. However, there may be a natural way in which they fit together. It appears that if a data sample is representative of an underlying model, then people overweight the data. However, if the data is not representative of any salient model, people react too little to the data and rely too much on their priors. In Edwards’s experiment, the draw of 8 red and 4 blue balls is not particularly representative of either urn, possibly leading to an overreliance on prior information.11
11 Mullainathan (2001) presents a formal model that neatly reconciles the evidence on underweighting sample information with the evidence on overweighting sample information.