Quantitative Methods - Common Probability Distributions
The topics in this section provide a number of the quantitative building blocks useful in analyzing and predicting random variables such as future sales and earnings, growth rates, market index returns and returns on individual asset classes and specific securities. All of these variables have uncertain outcomes; thus there is risk that any downside uncertainty can result in a surprising and material impact. By understanding the mechanics of probability distributions, such risks can be understood and analyzed, and measures taken to hedge or reduce their impact.
A probability distribution gathers together all possible outcomes of a random variable (i.e. any quantity for which more than one value is possible), and summarizes these outcomes by indicating the probability of each of them. While a probability distribution is often associated with the bell-shaped curve, recognize that such a curve is indicative of only one specific type of probability distribution, the so-called normal probability distribution. The CFA curriculum does focus on normal distributions, since they frequently apply to financial and investment variables and are used in hypothesis testing. However, in real life, a probability distribution can take any shape, size and form.
Example: Probability Distribution
For example, say we wanted to choose a day at random in the future to schedule an event, and we needed to avoid scheduling it on a Sunday, so we wanted to know the probability that this day would fall on a Sunday. With seven days in a week, the probability that a random day would happen to be a Sunday is one-seventh, or about 14.29%. Of course, the same 14.29% probability would be true for any of the other six days.
In this case, we would have a uniform probability distribution: the chances that our random day would fall on any particular day are the same, and the graph of our probability distribution would be a flat (horizontal) line.
|Figure 2.8: Probability Distribution|
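The day-of-the-week example can be sketched in a few lines of Python. This is only an illustration: the names DAYS and uniform_pmf are made up for this sketch, and exact fractions are used so the probabilities sum to exactly 1.

```python
from fractions import Fraction

# The seven possible outcomes of the random variable "day of the week".
DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]

# Uniform distribution: every outcome gets the same probability, 1/n.
uniform_pmf = {day: Fraction(1, len(DAYS)) for day in DAYS}

p_sunday = uniform_pmf["Sunday"]
print(f"P(Sunday) = {p_sunday} = {float(p_sunday):.2%}")  # 1/7, about 14.29%

# All probabilities together must account for every possible outcome.
total = sum(uniform_pmf.values())
print("Sum of all probabilities:", total)  # 1
```

Plotting these seven equal values against the days would give the flat line described above.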
Probability distributions can be simple to understand as in this example, or they can be very complex and require sophisticated techniques (e.g., option pricing models, Monte Carlo simulations) to help describe all possible outcomes.
Discrete Random Variables
Discrete random variables can take on a finite or countable number of possible outcomes. The previous example asking for a day of the week is an example of a discrete variable, since it can only take seven possible values. Monetary variables expressed in dollars and cents are always discrete, since money is rounded to the nearest $0.01. In other words, we may have a formula that suggests a stock worth $15.75 today will be $17.1675 after it grows 9%, but you can't give or receive three-quarters of a penny, so our formula would round the outcome of 9% growth to an amount of $17.17.
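The rounding step in the stock example can be checked with Python's decimal module, which avoids binary floating-point surprises when working with cents. This is a minimal sketch; the variable names are illustrative, and the half-up rounding rule is an assumption, since the text only says the result is rounded to the nearest cent.

```python
from decimal import Decimal, ROUND_HALF_UP

price_today = Decimal("15.75")   # stock price from the example
growth_rate = Decimal("0.09")    # 9% growth

# The formula's exact result has four decimal places.
raw_value = price_today * (1 + growth_rate)

# Money is discrete: round to the nearest $0.01 (half-up rule assumed here).
rounded_value = raw_value.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)

print(raw_value, "->", rounded_value)  # 17.1675 -> 17.17
```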
Continuous Random Variables
A continuous random variable can take any value within a range, so its possible outcomes are infinite and cannot be listed or counted. A rate of return (e.g. growth rate) is continuous:
- a stock can grow by 9% next year or by 10%, and in between this range it could grow by 9.3%, 9.4%, 9.5%
- in between 9.3% and 9.4% the rate could be 9.31%, 9.32%, 9.33%, and in between 9.32% and 9.33% it could grow 9.32478941%
- clearly there is no end to how precise the outcomes could be broken down; thus it's described as a continuous variable.
Outcomes in Discrete vs. Continuous Variables
The rule of thumb is that a discrete variable can have all possibilities listed out, while a continuous variable must be expressed in terms of its upper and lower limits, and greater-than or less-than indicators. Of course, listing out a large set of possible outcomes (which is usually the case for money variables) is usually impractical - thus money variables will usually have outcomes expressed as if they were continuous.
Rates of return can theoretically range from -100% to positive infinity. Time is bounded on the lower side by 0. The market price of a security also has a lower limit of $0, while its upper limit depends on the security - stocks have no upper limit (thus a stock price's outcome > $0), but bond prices are more complicated, bounded by factors such as time-to-maturity and embedded call options. If the face value of a bond is $1,000, there is an upper limit (somewhere above $1,000) above which the price of the bond will not go, but pinpointing that upper value is imprecise.
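To see why a rate of return is bounded below by -100% but has no upper bound, consider a simple holding-period return computed from a start and end price. The sketch below assumes the standard simple-return formula; the function name simple_return and the prices are hypothetical, chosen only for illustration.

```python
def simple_return(start_price, end_price):
    """Simple holding-period return: (end - start) / start."""
    if start_price <= 0:
        raise ValueError("start_price must be positive")
    return (end_price - start_price) / start_price

start = 50.0  # hypothetical purchase price

# The end price cannot fall below $0, so the return cannot fall below -100%.
# There is no cap on the end price, so there is no cap on the return.
for end in (0.0, 25.0, 50.0, 100.0, 500.0):
    print(f"end price {end:>6.2f} -> return {simple_return(start, end):+.0%}")
```

The worst case, an end price of $0, gives exactly -100%; any higher end price simply gives a higher return.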
A probability function specifies the probability that a random variable takes on each of its specific values. For a discrete variable, if (x1, x2, x3, x4 ...) is the complete set of possible outcomes, p(x) is the probability that the random variable X will be equal to x. Each x in the list for a discrete variable will have a p(x). For a continuous variable, a probability function is expressed as f(x).
The two key properties of a probability function, p(x) (or f(x) for continuous), are the following:
- 0 ≤ p(x) ≤ 1, since probability must always be between 0 and 1, inclusive.
- Add up all probabilities of all distinct possible outcomes of a random variable, and the sum must equal 1.
Whether a function satisfies the first property is easy to check, since we know that probabilities always lie between 0 and 1. In other words, p(x) could never be 1.4 or -0.2. To illustrate the second property, say we are given a set of three possibilities for X: (1, 2, 3) and a set of three for Y: (6, 7, 8), along with the probability functions f(x) and g(y).
For all possibilities of f(x), the sum is 0.31 + 0.43 + 0.26 = 1, so we know it is a valid probability function. For all possibilities of g(y), the sum is 0.32 + 0.40 + 0.23 = 0.95, which violates the second property. Either the given probabilities for g(y) are wrong, or there is a fourth possibility for y where g(y) = 0.05. Either way, the probabilities need to sum to 1.
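Both properties can be checked mechanically. The following is a minimal sketch, not a standard library routine: the helper is_valid_probability_function and its tolerance are assumptions made for illustration, and the probabilities are the ones quoted above for f(x) and g(y).

```python
import math

def is_valid_probability_function(probs, tol=1e-9):
    """Return True if every probability lies in [0, 1] and they sum to 1."""
    in_range = all(0.0 <= p <= 1.0 for p in probs)
    # Compare with a small tolerance because floats are not exact.
    sums_to_one = math.isclose(sum(probs), 1.0, abs_tol=tol)
    return in_range and sums_to_one

f_probs = [0.31, 0.43, 0.26]  # probabilities given for f(x)
g_probs = [0.32, 0.40, 0.23]  # probabilities given for g(y)

print(is_valid_probability_function(f_probs))  # True
print(is_valid_probability_function(g_probs))  # False

# Probability left unaccounted for in g(y).
missing = round(1.0 - sum(g_probs), 2)
print(missing)  # 0.05
```

A value such as 1.4 or -0.2 anywhere in the list would fail the first check regardless of the total.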
Probability Density Function
A probability density function (or pdf) describes a probability function in the case of a continuous random variable. Also known simply as the "density", a probability density function is denoted by "f(x)". Since a pdf refers to a continuous random variable, its probabilities are assigned to ranges of values rather than to individual values, as is done for a discrete variable. For example, if a stock has a 20% chance of a negative return, the pdf in its simplest terms could be expressed as: