The Basics of Probability Density Function (PDF), With an Example

What Is a Probability Density Function (PDF)?

A probability density function (PDF) is a statistical expression that defines a probability distribution (the likelihood of an outcome) for a continuous random variable (e.g., the return on a stock or ETF), as opposed to a discrete random variable. The difference is that a discrete random variable can take only specific, separately identifiable values, while a continuous random variable can take any value within a range.

The normal distribution is a common example of a PDF, forming the well-known bell curve shape.

In finance, traders and investors use PDFs to understand how price returns are distributed in order to evaluate their risk and expected return profile.

Key Takeaways

  • Probability density functions are a statistical measure used to gauge the likely outcomes of a continuous variable (e.g., the return on a stock or ETF).
  • PDFs are plotted on a graph, often resembling a bell curve, with the probability of an outcome falling in a given range equal to the area under the curve over that range.
  • A discrete variable takes specific values that can be identified exactly, while a continuous variable can take any of an infinite number of values within a range.
  • PDFs can be used to gauge the potential risk/reward of a particular security or fund in a portfolio.
  • The normal distribution is often cited, forming a bell-shaped curve.

Understanding Probability Density Functions (PDFs)

PDFs are used in finance to gauge the risk of a particular security, such as an individual stock or ETF.

They are typically depicted on a graph, with a symmetric bell curve indicating neutral market risk and a curve skewed toward either end indicating greater or lesser risk/reward. When the PDF is portrayed graphically, the area under the curve over an interval indicates how likely the variable is to fall within that interval. That area equals the probability that the random variable takes a value in the interval, and the total area under the whole curve equals 1 (100%).

More precisely, since the absolute likelihood of a continuous random variable taking on any specific value is zero due to the infinite set of possible values available, the value of a PDF can be used to determine the likelihood of a random variable falling within a specific range of values.
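The idea that probability is an area rather than a height can be checked numerically. The sketch below is illustrative only: it assumes a standard normal distribution (mean 0, standard deviation 1), and the helper names `normal_pdf` and `area_under_pdf` are hypothetical, not part of any library.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Height of the normal density curve at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def area_under_pdf(a, b, steps=10_000):
    """Approximate P(a <= X <= b) with the midpoint rule."""
    width = (b - a) / steps
    return sum(normal_pdf(a + (i + 0.5) * width) for i in range(steps)) * width

# Probability of landing within one standard deviation of the mean.
print(round(area_under_pdf(-1.0, 1.0), 4))  # close to 0.6827

# A single point has zero width, so it carries zero probability.
print(area_under_pdf(0.5, 0.5))  # 0.0
```

The density at 0.5 is positive, yet the probability of hitting exactly 0.5 is zero, which is why a PDF is read over ranges rather than at single points.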

Normal Distribution

Image by Julie Bang © Investopedia 2020

A distribution skewed to the right side of the curve suggests greater upside reward, while a distribution skewed to the left indicates greater downside risk for traders.

Probability distributions can also be used to create cumulative distribution functions (CDFs), which add up the probability of occurrences cumulatively and always start at zero and end at 100%.

Investors should use PDFs as one of many tools to calculate the overall risk/reward in play in their portfolios.

Discrete vs. Continuous Probability Distribution Functions

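Probability distributions can describe either discrete or continuous data. The difference is that discrete variables can take on only specific values, such as integers, yes vs. no, and so on. A continuous variable, in contrast, can take on any value within a range, including very small fractions or decimals out to a theoretically infinite number of places. Strictly speaking, a discrete distribution is described by a probability mass function (PMF), which assigns a probability to each possible value, while the term density applies to continuous variables.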

Discrete vs. Continuous PDF.

Image by Julie Bang © Investopedia 2020

Calculating a Probability Distribution Function

PDFs are often characterized by their mean, standard deviation, kurtosis, and skewness.

  • Mean: the arithmetic average value
  • Standard deviation: the dispersion of the data about the mean
  • Kurtosis: describes the "fatness" of the tails of the PDF
  • Skewness: refers to deviations in the PDF's symmetry
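As a rough illustration of the four measures above, the following sketch computes them for a small sample. The data are hypothetical returns in percent, the function name `describe` is made up for this example, and the moments use the population (divide-by-n) convention, which is only one of several conventions in use.

```python
import math

def describe(data):
    """Return mean, standard deviation, skewness, and kurtosis of a sample."""
    n = len(data)
    mean = sum(data) / n
    # Central moments: average powers of the deviation from the mean.
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    std = math.sqrt(m2)
    return {
        "mean": mean,
        "std": std,
        "skewness": m3 / std**3,   # 0 for a symmetric sample
        "kurtosis": m4 / m2**2,    # 3 for a normal distribution
    }

returns_pct = [-2, -1, 0, 1, 2]  # hypothetical, perfectly symmetric
stats = describe(returns_pct)
for name, value in stats.items():
    print(name, round(value, 4))
```

Because the sample is symmetric, its skewness comes out as zero, and its kurtosis is well below 3, reflecting tails that are thinner than those of a normal distribution.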

Computing a PDF and plotting it graphically can involve complex calculations that use differential equations or integral calculus. In practice, graphing calculators or statistical software packages are typically used to calculate a probability distribution function.

The Normal Distribution

As an example, the calculation for the PDF of the normal distribution is as follows:

$$
\begin{aligned}
&f(x) = \frac{1}{\sigma \sqrt{2 \pi}} e^{-\frac{1}{2} \left( \frac{x - \mu}{\sigma} \right)^2} \\
&\textbf{where:} \\
&x = \text{Value of the variable or data being examined} \\
&\mu = \text{Mean} \\
&\sigma = \text{Standard deviation}
\end{aligned}
$$

A normal distribution always has a skewness = 0 and kurtosis = 3.0.
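The normal PDF formula translates almost directly into code. This is a minimal sketch using only the standard library. The function name `normal_pdf` and the example parameters (a mean of 5 and a standard deviation of 2) are assumptions chosen for illustration.

```python
import math

def normal_pdf(x, mu, sigma):
    """f(x) = 1 / (sigma * sqrt(2*pi)) * exp(-0.5 * ((x - mu) / sigma)**2)"""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

mu, sigma = 5.0, 2.0  # hypothetical mean and standard deviation

peak = normal_pdf(mu, mu, sigma)         # the curve is highest at the mean
left = normal_pdf(mu - 1.5, mu, sigma)   # equal distances either side of the
right = normal_pdf(mu + 1.5, mu, sigma)  # mean give equal heights (symmetry)

print(peak, left, right)
```

The equal heights on either side of the mean are the symmetry behind the zero skewness noted above.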

Other Probability Distribution Functions

While the normal distribution is often the most-cited and well-known, several other PDFs exist.

Uniform Distribution

One of the simplest distributions is the uniform distribution, in which all outcomes have an equal chance of occurring. A fair six-sided die has a (discrete) uniform distribution: each outcome has a probability of about 16.67% (1/6).

Image by Julie Bang © Investopedia 2020

Binomial Distribution

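The binomial distribution describes the number of successes in a fixed number of independent trials, each of which can take on only one of two values, such as the flip of a coin (heads vs. tails) or logical expressions that take the form of yes/no, on/off, etc. For example, it gives the probability of getting exactly three heads in four coin flips.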
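For a concrete case, the sketch below lists the probability of each possible number of heads in four flips of a fair coin. The function name `binomial_pmf` is a hypothetical helper, and `math.comb` requires Python 3.8 or later.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Four flips of a fair coin: probability of 0, 1, 2, 3, or 4 heads.
pmf = [binomial_pmf(k, 4, 0.5) for k in range(5)]
print(pmf)       # [0.0625, 0.25, 0.375, 0.25, 0.0625]
print(sum(pmf))  # 1.0
```

Two heads is the most likely count, and the probabilities across all possible counts sum to 1.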

A histogram of a binomial distribution. C.K.Taylor

Lognormal Distribution

The lognormal distribution is important in finance because it describes asset prices, which cannot fall below zero, better than the normal distribution does. This PDF has positive (right) skewness and higher kurtosis than the normal distribution.
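A lognormal variable is the exponential of a normally distributed one, which is what keeps it positive and gives it a long right tail. The simulation below is a sketch with made-up parameters (an underlying normal with mean 0 and standard deviation 0.5) and a fixed seed for repeatability.

```python
import math
import random
import statistics

random.seed(42)  # fixed seed so the run is repeatable

# Hypothetical parameters for the underlying normal distribution.
normal_draws = [random.gauss(0.0, 0.5) for _ in range(10_000)]
lognormal_draws = [math.exp(z) for z in normal_draws]

sample_mean = statistics.mean(lognormal_draws)
sample_median = statistics.median(lognormal_draws)

# exp() of any real number is positive, so no draw can fall to zero or below.
print(min(lognormal_draws) > 0)

# A long right tail pulls the mean above the median (positive skew).
print(sample_mean > sample_median)
```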

Lognormal Distribution.

Image by Julie Bang © Investopedia 2020

Poisson Distribution

The Poisson distribution is used to describe count variables, or the probabilities that a certain number of occurrences will happen. Examples include how many apples are found on an apple tree, how many bees are alive in a beehive at a given time, or on how many trading days a portfolio will lose 5% or more.
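To make the trading-day example concrete, the sketch below assumes (purely for illustration) that a portfolio suffers such a loss on an average of two days per year. The rate `lam = 2.0` and the function name `poisson_pmf` are hypothetical.

```python
import math

def poisson_pmf(k, lam):
    """Probability of exactly k occurrences when the average count is lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

lam = 2.0  # hypothetical average number of large-loss days per year

p_none = poisson_pmf(0, lam)
p_at_most_two = sum(poisson_pmf(k, lam) for k in range(3))

print(round(p_none, 4))         # 0.1353
print(round(p_at_most_two, 4))  # 0.6767
```

Under this assumed rate, a year with no such days is unlikely but far from impossible, and roughly two years in three would see two or fewer.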

Image by Julie Bang © Investopedia 2020

Beta Distribution

The beta distribution is a general type of PDF that can take on a variety of shapes and characteristics, as defined by just two parameters: alpha and beta. It is often used in finance to estimate bond default recovery rates or mortality rates in insurance.
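How much the shape can change with just two parameters is easy to see by evaluating the density directly. The following is a minimal sketch using the standard formula for the beta density on the interval from 0 to 1. The function name `beta_pdf` and the parameter choices are illustrative assumptions.

```python
import math

def beta_pdf(x, alpha, beta):
    """Beta density at x (0 < x < 1) for shape parameters alpha and beta."""
    norm = math.gamma(alpha + beta) / (math.gamma(alpha) * math.gamma(beta))
    return norm * x ** (alpha - 1) * (1 - x) ** (beta - 1)

flat = beta_pdf(0.5, 1, 1)           # alpha = beta = 1 gives a flat (uniform) curve
hump = beta_pdf(0.5, 2, 2)           # alpha = beta = 2 gives a symmetric hump
skew_low = beta_pdf(0.2, 2, 5)       # alpha < beta puts most weight near 0
skew_high = beta_pdf(0.8, 2, 5)

print(flat, hump, skew_low, skew_high)
```

With alpha = 2 and beta = 5, the density at 0.2 is far higher than at 0.8, the kind of lopsided shape that suits rates bounded between 0% and 100%.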

Beta Distribution Variations.

Example of a Probability Density Function

As a simple example of a probability distribution, let us look at the number observed when rolling two standard six-sided dice. Each die has a 1/6 probability of rolling any single number, one through six, but the sum of two dice will form the probability distribution depicted in the image below.

Seven is the most common outcome (1+6, 6+1, 5+2, 2+5, 3+4, 4+3). Two and twelve, on the other hand, are far less likely (1+1 and 6+6).
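The two-dice distribution can be built by simply counting all 36 equally likely combinations. The sketch below uses exact fractions so that no rounding is involved.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Count how many of the 36 (die 1, die 2) pairs produce each total.
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))

# Convert counts to exact probabilities.
pmf = {total: Fraction(n, 36) for total, n in sorted(counts.items())}

for total, p in pmf.items():
    print(total, p)
```

The output confirms the pattern described above: a total of 7 has probability 1/6 (six of the 36 combinations), while 2 and 12 each have probability 1/36.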

Image by Sabrina Jiang © Investopedia 2020

What Does a Probability Density Function (PDF) Tell Us?

A probability density function (PDF) describes how likely it is to observe some outcome resulting from a data-generating process. For instance, how likely is it for a flipped fair coin to come up heads (50%), or for the roll of a die to come up 6 (1/6 = 16.7%)? A PDF can therefore tell us which values are most likely to appear and which are less likely. This will change depending on the shape and characteristics of the PDF.

What Is the Central Limit Theorem (CLT) and How Does It Relate to PDFs?

The central limit theorem (CLT) states that the distribution of the mean (or sum) of a sample of independent random variables will approach a normal distribution as the sample size becomes larger, regardless of the true shape of the underlying distribution (provided it has finite variance). For example, a single coin flip is a binary process (heads or tails), and the number of heads in several flips is described by the binomial distribution. As we consider more tosses, the odds of getting any particular number of heads begin to differ. For instance, if we were to flip the coin ten times, getting five of each is the most likely outcome, while getting ten heads in a row is extremely rare. Imagine 1,000 coin flips, and the distribution of the number of heads approaches the normal bell curve.
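The coin-flip example can be checked with exact arithmetic. The sketch below first computes the ten-flip probabilities, then compares exact probabilities for 1,000 flips with a normal curve that has the same mean (500) and standard deviation (the square root of 250). The helper names are hypothetical, and `math.comb` requires Python 3.8 or later.

```python
import math

def prob_heads(k, n):
    """Exact probability of k heads in n fair coin flips."""
    return math.comb(n, k) / 2**n

def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

print(prob_heads(5, 10))   # 0.24609375 (252 out of 1,024 sequences)
print(prob_heads(10, 10))  # 0.0009765625 (1 out of 1,024 sequences)

# With 1,000 flips, the exact probabilities track the normal curve closely.
n = 1000
mu, sigma = n * 0.5, math.sqrt(n * 0.25)
for k in (450, 480, 500):
    print(k, prob_heads(k, n), normal_pdf(k, mu, sigma))
```

The last loop prints each exact probability next to the matching height of the normal curve. The two columns agree closely, which is the central limit theorem at work.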

What Is a PDF vs. a CDF?

A probability density function (PDF) explains which values are likely to appear in a data-generating process at any given time or for any given draw.

A cumulative distribution function (CDF) instead depicts how these marginal probabilities add up, ultimately reaching 100% (or 1.0) of possible outcomes. Using a CDF we can see how likely it is that a variable's outcome will be less than or equal to some predicted value.
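For the normal distribution, the CDF can be written in terms of the error function, which Python provides as `math.erf`. The sketch below assumes a standard normal distribution (mean 0, standard deviation 1), and the function name `normal_cdf` is a hypothetical helper.

```python
import math

def normal_cdf(x, mu=0.0, sigma=1.0):
    """P(X <= x) for a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

for x in (-3, -1, 0, 1, 3):
    print(x, round(normal_cdf(x), 4))
# -3 0.0013
# -1 0.1587
# 0 0.5
# 1 0.8413
# 3 0.9987
```

The values rise steadily from near 0 toward 1, and exactly half of the probability lies at or below the mean.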

The figure below, for example, shows the CDF for a normal distribution.

CDF.

Image by Julie Bang © Investopedia 2020

The Bottom Line

Probability density functions (PDFs) describe how likely the different values of a random variable are. The shape of the PDF indicates how likely it is that a given value will be observed. The normal distribution is a commonly used example that can be described by just its mean and standard deviation. Other PDFs are more complex and nuanced. Stock prices are often modeled with a lognormal distribution rather than a normal one, since prices cannot fall below zero and the distribution is skewed to the right.