What Is Sample Size Neglect?
Sample Size Neglect is a cognitive bias famously studied by Amos Tversky and Daniel Kahneman. It occurs when users of statistical information draw false conclusions because they fail to consider the sample size of the data in question. The underlying cause of Sample Size Neglect is that people often fail to understand that extreme, highly variable results are more likely to occur in small samples. Therefore, it is critical to determine whether the sample size used to produce a given statistic is large enough to allow for meaningful conclusions. Knowing when a sample size is sufficiently large can be challenging for those who do not have a good understanding of statistical methods.
- Sample Size Neglect is a cognitive bias studied by Amos Tversky and Daniel Kahneman.
- It consists of drawing false conclusions from statistical information because the effects of sample size were not considered.
- Those wishing to reduce the risk of Sample Size Neglect should remember that smaller sample sizes are associated with more volatile statistical results, and vice versa.
Understanding Sample Size Neglect
Most statistical inference depends on the law of large numbers. This says that as a sample grows larger, its characteristics, such as its average, tend to converge on those of the population from which it is drawn. With a large enough sample, the characteristics of the population can therefore be inferred, with some degree of confidence, from the characteristics of the sample. When a sample is too small, accurate and trustworthy conclusions cannot be drawn. Sample Size Neglect consists of ignoring the effect of small samples on our ability to draw such conclusions. In the context of finance, this can mislead investors in various ways.
For instance, an investor might see an advertisement for a new investment fund that boasts of having generated 15% annualized returns since its inception. The investor might be quick to conclude that this fund is a ticket to rapid wealth generation. However, if the fund has not been around very long, this conclusion may be unwarranted. The results may be due to short-term anomalies and have little to do with the fund's actual investment methodology.
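The effect of a short track record can be sketched numerically. The snippet below is a rough illustration, not a valuation method. It assumes that yearly returns are independent and uses a hypothetical volatility of 20% per year, a figure chosen for illustration and not taken from any fund. It also treats the reported figure as a simple average of yearly returns, which is a simplification of an annualized return. It relies on the standard result that the standard error of an average of n observations is the standard deviation divided by the square root of n. The function name `standard_error` is our own.

```python
import math


def standard_error(volatility: float, years: int) -> float:
    """Standard error of the average yearly return over `years` independent years."""
    if years <= 0:
        raise ValueError("years must be positive")
    return volatility / math.sqrt(years)


# Hypothetical volatility of yearly returns (an assumption for illustration only).
ASSUMED_VOLATILITY = 0.20

for years in (1, 4, 25):
    se = standard_error(ASSUMED_VOLATILITY, years)
    print(f"{years:>2} years of history: average return uncertain by about +/- {se:.1%}")
```

Under these assumptions, four years of history still leave the average return uncertain by about 10 percentage points, so a 15% figure from a young fund says little about what the fund will earn in the long run.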
Sample Size Neglect is often confused with Base Rate Neglect, which is a related cognitive bias. While Sample Size Neglect refers to the failure to consider the role of sample sizes in determining the trustworthiness of statistical claims, Base Rate Neglect relates to people's tendency to neglect existing knowledge about how common a phenomenon is (its base rate) when evaluating new information.
Real World Example of Sample Size Neglect
To better understand Sample Size Neglect, consider the following example, which is drawn from Tversky and Kahneman's research:
One person draws a sample of five balls and finds that four are red and one is green.
A second person draws a sample of 20 balls and finds that 12 are red and eight are green.
Which sample provides better evidence that the balls are predominantly red?
Most people say that the first, smaller sample provides much stronger evidence because its ratio of red to green is much higher than in the larger sample. In reality, the higher ratio is outweighed by the smaller sample size. The sample of 20 actually provides stronger evidence.
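The comparison can be checked with a short calculation. The passage above does not say what the balls are drawn from, so this sketch assumes the setup usually given for this problem: the container holds either two-thirds red and one-third green balls or the reverse, both possibilities are equally likely beforehand, and the draws are independent. The function name `posterior_odds` is our own.

```python
from fractions import Fraction


def posterior_odds(red: int, green: int, p_major: Fraction = Fraction(2, 3)) -> Fraction:
    """Odds that red is the majority color, given the observed draws.

    Assumes two equally likely hypotheses (the share of red balls is
    `p_major` or 1 - `p_major`) and independent draws.
    """
    p_minor = 1 - p_major
    likelihood_red_majority = p_major**red * p_minor**green
    likelihood_green_majority = p_minor**red * p_major**green
    # With equal prior odds, the posterior odds equal the likelihood ratio.
    return likelihood_red_majority / likelihood_green_majority


print(posterior_odds(4, 1))   # 8
print(posterior_odds(12, 8))  # 16
```

Under these assumptions the odds depend only on the difference between the red and green counts (three in the small sample, four in the large one), which is why the sample of 20 favors a red majority at 16 to 1 while the sample of five reaches only 8 to 1.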
Another example from Tversky and Kahneman is as follows:
A town is served by two hospitals. In the larger hospital, an average of 45 babies are born each day, and in the smaller hospital about 15 babies are born each day. Although 50% of all babies are boys, the exact percentage fluctuates from day to day.
During one year, each hospital recorded the days on which more than 60% of the babies happened to be boys. Which hospital recorded more such days?
When asked this question, 22% of respondents said that the larger hospital would record more such days, while 56% said that the results would be about the same for both hospitals. That leaves only about 22% who chose the smaller hospital. In fact, the correct answer is that the smaller hospital would record more such days, because the daily proportion of boys varies more when fewer babies are born.
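The answer can be verified with the binomial distribution. The sketch below assumes that each birth is independently a boy with probability 0.5 and that each hospital delivers exactly 15 or 45 babies every day, which simplifies the "about 15" and "about 45" in the question. The function name `prob_more_than_share` is our own.

```python
from math import comb


def prob_more_than_share(births: int, share_percent: int = 60, p_boy: float = 0.5) -> float:
    """Probability that strictly more than `share_percent`% of `births` are boys."""
    return sum(
        comb(births, boys) * p_boy**boys * (1 - p_boy) ** (births - boys)
        for boys in range(births + 1)
        # Integer comparison avoids floating-point error at the 60% boundary.
        if 100 * boys > share_percent * births
    )


p_small = prob_more_than_share(15)  # 10 or more boys out of 15
p_large = prob_more_than_share(45)  # 28 or more boys out of 45

print(f"Smaller hospital: {p_small:.1%} of days")
print(f"Larger hospital:  {p_large:.1%} of days")
```

Under these assumptions the smaller hospital exceeds 60% boys on roughly 15% of days, about twice as often as the larger hospital at roughly 7% of days.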
As noted earlier, the foundation of Sample Size Neglect is that people often fail to understand that extreme, highly variable results are more likely to occur in small samples. In investing, this misunderstanding can be very costly.