What Is a P-Value?

In statistics, the p-value is the probability of obtaining results at least as extreme as the observed results of a statistical hypothesis test, assuming that the null hypothesis is correct. The p-value serves as an alternative to rejection points by providing the smallest level of significance at which the null hypothesis would be rejected. A smaller p-value means there is stronger evidence in favor of the alternative hypothesis.
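One common symbolic form of this definition, for a two-tailed test where T is the test statistic, t_obs its observed value, and H_0 the null hypothesis, is:

\[
p = P\left( |T| \ge |t_{\mathrm{obs}}| \;\middle|\; H_0 \right)
\]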

How Is P-Value Calculated?

P-values are calculated from p-value tables or with spreadsheet and statistical software. Because different researchers use different levels of significance when examining a question, a reader may sometimes have difficulty comparing results from two different tests. P-values provide a solution to this problem.
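For a concrete sense of the software route, here is a minimal sketch assuming Python with SciPy (the article names no particular tool); the sample values are made up for illustration:

```python
# Minimal sketch of computing a p-value in software, assuming Python
# with SciPy. ttest_1samp returns the test statistic and the
# two-tailed p-value in a single call.
from scipy import stats

# Hypothetical sample of daily returns (illustrative values only)
sample = [0.012, -0.004, 0.009, 0.015, -0.002, 0.007, 0.011, -0.006]

# Test the null hypothesis that the true mean return is zero
result = stats.ttest_1samp(sample, popmean=0.0)
print(f"t-statistic: {result.statistic:.3f}")
print(f"p-value:     {result.pvalue:.3f}")
```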

For example, if a study comparing returns from two particular assets were undertaken by two different researchers who used the same data but different significance levels, the researchers might come to opposite conclusions regarding whether the assets differ.

To avoid this problem, the researchers could report the p-value of the hypothesis test and allow the reader to interpret the statistical significance themselves. This is called a p-value approach to hypothesis testing.

P-Value Approach to Hypothesis Testing

The p-value approach to hypothesis testing uses the calculated probability to determine whether there is evidence to reject the null hypothesis. The null hypothesis, also known as the conjecture, is the initial claim about a population (or data generating process).

The alternative hypothesis states that the population parameter differs from the value of the population parameter stated in the conjecture.

In practice, the significance level is stated in advance to determine how small the p-value must be in order to reject the null hypothesis.
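As a sketch of that decision rule (a hypothetical Python helper; the 0.05 threshold is a common convention, not a requirement):

```python
# Hypothetical helper illustrating the pre-specified decision rule:
# fix the significance level first, then compare the p-value to it.
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Return the test decision for a pre-specified significance level."""
    if p_value < alpha:
        return "Reject the null hypothesis"
    return "Fail to reject the null hypothesis"

print(decide(0.03))  # Reject the null hypothesis
print(decide(0.20))  # Fail to reject the null hypothesis
```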

Type I Error

A type I error is a false rejection of the null hypothesis. It occurs when the null hypothesis is true in reality but is rejected anyway because the p-value falls below the significance level (often 0.05). The probability of a type I error equals the significance level (again, often 0.05): it is the long-run relative frequency with which a p-value below the significance level is obtained when the null hypothesis is in fact true.
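This property can be checked by simulation. The sketch below (assuming Python with NumPy and SciPy) repeatedly tests samples drawn from a population where the null hypothesis really is true; the fraction of p-values falling below 0.05 should settle near the significance level:

```python
# Simulation sketch (NumPy/SciPy assumed): when the null hypothesis is
# true, the fraction of tests with p < 0.05 should approach 0.05 --
# that is, the type I error rate equals the significance level.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
alpha = 0.05
n_trials = 10_000

false_rejections = 0
for _ in range(n_trials):
    # Draw a sample whose true mean really is 0, so the null is true
    sample = rng.normal(loc=0.0, scale=1.0, size=30)
    p_value = stats.ttest_1samp(sample, popmean=0.0).pvalue
    if p_value < alpha:
        false_rejections += 1

print(f"Type I error rate: {false_rejections / n_trials:.3f}")  # ~0.05
```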

Real-World Example of P-Value

Assume an investor claims that their investment portfolio's performance is equivalent to that of the Standard & Poor's (S&P) 500 Index. To determine this, the investor conducts a two-tailed test. The null hypothesis states that the portfolio's returns are equivalent to the S&P 500's returns over a specified period, while the alternative hypothesis states that the portfolio's returns and the S&P 500's returns are not equivalent. (If the investor conducted a one-tailed test, the alternative hypothesis would state that the portfolio's returns are either less than or greater than the S&P 500's returns.)

One commonly used significance level is 0.05. If the investor finds that the p-value is less than 0.05, then there is evidence against the null hypothesis. As a result, the investor would reject the null hypothesis and accept the alternative hypothesis. The smaller the p-value, the greater the evidence against the null hypothesis. Thus, if the investor finds that the p-value is 0.001, there is strong evidence against the null hypothesis, and the investor can confidently conclude the portfolio's returns and the S&P 500's returns are not equivalent.

Conversely, a p-value that is greater than 0.05 indicates that there is (at best) weak evidence against the conjecture, so the investor would fail to reject the null hypothesis. In this case, the differences observed between the investment portfolio data and the S&P 500 data are explainable by chance alone.
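Putting the investor's example into code, here is a sketch assuming Python with NumPy and SciPy; the daily return series are simulated stand-ins, since the article supplies no actual data:

```python
# Sketch of the investor's two-tailed test (Python/SciPy assumed; the
# return series below are simulated stand-ins, not real market data).
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

# Hypothetical daily returns for the portfolio and the S&P 500 over
# one trading year (252 days)
portfolio = rng.normal(loc=0.0006, scale=0.010, size=252)
sp500 = rng.normal(loc=0.0004, scale=0.009, size=252)

# Paired two-tailed t-test on the day-by-day return differences,
# since both series cover the same trading days
result = stats.ttest_rel(portfolio, sp500)
print(f"p-value: {result.pvalue:.3f}")

if result.pvalue < 0.05:
    print("Reject the null: the returns are not equivalent.")
else:
    print("Fail to reject the null: differences are explainable by chance.")
```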