What Is Skewness?
Skewness is a measure of the asymmetry of a distribution, that is, of how far a data set departs from a symmetrical distribution. Skewness shows up on a bell curve when data points are not distributed symmetrically to the left and right of the median. If the bell curve is shifted to the left or the right, it is said to be skewed.
Skewness can be quantified as a representation of the extent to which a given distribution varies from a normal distribution. A normal distribution has a zero skew, while a lognormal distribution, for example, would exhibit some right skew.
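One common way to put a number on this is the moment coefficient of skewness: the third central moment divided by the cube of the standard deviation. The article does not name this formula, so treat the sketch below as one assumed definition among several; the function name and the three sample lists are hypothetical and chosen only for illustration.

```python
from statistics import fmean

def moment_skewness(data):
    """Population skewness: third central moment over the cubed standard deviation."""
    n = len(data)
    mean = fmean(data)
    m2 = sum((x - mean) ** 2 for x in data) / n  # second central moment (variance)
    m3 = sum((x - mean) ** 3 for x in data) / n  # third central moment
    if m2 == 0:
        raise ValueError("skewness is undefined when all values are identical")
    return m3 / m2 ** 1.5

# Hypothetical samples: one symmetric, one with a long right tail, one mirrored
symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 2, 3, 4, 15]
left_tailed = [-15, -4, -3, -2, -1]

print(moment_skewness(symmetric))     # zero for symmetric data
print(moment_skewness(right_tailed))  # positive: tail on the right
print(moment_skewness(left_tailed))   # negative: tail on the left
```

The sign gives the direction of the longer tail and the magnitude gives the degree of asymmetry. Statistical packages often apply a small-sample correction on top of this basic ratio, so their results may differ slightly.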
- Skewness, in statistics, is the degree of asymmetry observed in a probability distribution.
- Distributions can exhibit right (positive) skewness or left (negative) skewness to varying degrees. A normal distribution (bell curve) exhibits zero skewness.
- Investors note right-skewness when judging a return distribution because it, like excess kurtosis, better represents the extremes of the data set rather than focusing solely on the average.
- Skewness informs users of the direction of outliers, though it does not tell users the number of outliers.
- Skewness is often found in stock market returns as well as the distribution of average individual income.
There are several different types of distributions and skews. The "tail," or string of data points far from the median, is what distinguishes positive from negative skew. Negative skew refers to a longer or fatter tail on the left side of the distribution, while positive skew refers to a longer or fatter tail on the right. The names of these two skews refer to the direction in which the tail of the distribution extends.
In addition, a distribution can have a zero skew. Zero skew occurs when a data graph is symmetrical, regardless of how long or fat the distribution tails are. A normal distribution has zero skew, but zero skew on its own does not indicate that the data are normally distributed. A data set can also have an undefined skewness should the data not provide sufficient information about its distribution.
The mean of positively skewed data will generally be greater than the median. In a negatively skewed distribution, the opposite is the case: the mean will generally be less than the median. If the data graphs symmetrically, the mean and median coincide and the distribution has zero skewness, regardless of how long or fat the tails are.
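The relationship between mean and median can be checked on small samples. The three lists below are hypothetical and built only to illustrate the pattern; with real data the ordering is a useful rule of thumb rather than a guarantee.

```python
from statistics import mean, median

# Hypothetical samples, not taken from the article
samples = {
    "right-skewed": [2, 3, 3, 4, 4, 5, 30],         # one large value on the right
    "left-skewed": [-30, -5, -4, -4, -3, -3, -2],   # mirror image, tail on the left
    "symmetric": [1, 2, 3, 4, 5, 6, 7],
}

for name, data in samples.items():
    m, md = mean(data), median(data)
    if m > md:
        relation = "mean > median"
    elif m < md:
        relation = "mean < median"
    else:
        relation = "mean = median"
    print(f"{name}: mean={m:.2f}, median={md:.2f} ({relation})")
```

The single extreme value pulls the mean toward the tail while the median barely moves, which is why the gap between the two is often used as a quick indicator of skew.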
The three probability distributions depicted below are positively-skewed (or right-skewed) to an increasing degree. Negatively-skewed distributions are also known as left-skewed distributions.
Skewness is used along with kurtosis to better judge the likelihood of events falling in the tails of a probability distribution.
There are several ways to measure skewness. Pearson’s first and second coefficients of skewness are two common methods. Pearson’s first coefficient of skewness, or Pearson mode skewness, subtracts the mode from the mean and divides the difference by the standard deviation. Pearson’s second coefficient of skewness, or Pearson median skewness, subtracts the median from the mean, multiplies the difference by three, and divides the product by the standard deviation.
Formula for Pearson's Skewness
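$$Sk_1 = \frac{\bar{X} - Mo}{s} \qquad Sk_2 = \frac{3(\bar{X} - Md)}{s}$$
where:
$Sk_1$ = Pearson's first coefficient of skewness
$Sk_2$ = Pearson's second coefficient of skewness
$s$ = the standard deviation for the sample
$\bar{X}$ = the mean value
$Mo$ = the modal (mode) value
$Md$ = the median value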
Pearson’s first coefficient of skewness is useful if the data exhibit a strong mode. If the data have a weak mode or multiple modes, Pearson’s second coefficient may be preferable, as it does not rely on mode as a measure of central tendency.
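Both coefficients can be sketched with Python's standard `statistics` module. The function names and the sample data are hypothetical; the calculations follow the verbal definitions given above and use the sample standard deviation. Raising an error when there is no single mode is a design choice made for this sketch, not part of Pearson's definition.

```python
from statistics import mean, median, multimode, stdev

def pearson_first(data):
    """Pearson mode skewness: (mean - mode) / sample standard deviation."""
    modes = multimode(data)
    if len(modes) != 1:
        # With several modes the first coefficient is ambiguous
        raise ValueError("no single mode; the second coefficient may be preferable")
    return (mean(data) - modes[0]) / stdev(data)

def pearson_second(data):
    """Pearson median skewness: 3 * (mean - median) / sample standard deviation."""
    return 3 * (mean(data) - median(data)) / stdev(data)

# Hypothetical sample with a clear mode at 2 and a tail to the right
sample = [1, 2, 2, 2, 3, 4, 7, 11]

sk1 = pearson_first(sample)
sk2 = pearson_second(sample)
print(f"Sk1 = {sk1:.3f}, Sk2 = {sk2:.3f}")
```

For this sample the mean is 4, the mode is 2, and the median is 2.5, so both coefficients come out positive, which is consistent with the right tail. Both functions fail if every value is identical, because the standard deviation is then zero.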
Skewness tells you where the outliers occur, although it doesn't tell you how many outliers occur.
What Does Skewness Tell You?
Investors note skewness when judging a return distribution because it, like kurtosis, considers the extremes of the data set rather than focusing solely on the average. Short- and medium-term investors in particular need to look at extremes because they are less likely to hold a position long enough to be confident that the average will work itself out.
Investors commonly use standard deviation to predict future returns, but the standard deviation assumes a normal distribution. As few return distributions come close to normal, skewness is a better measure on which to base performance predictions. This is due to skewness risk.
Skewness risk is the increased risk of turning up a data point of high skewness in a skewed distribution. Many financial models that attempt to predict the future performance of an asset assume a normal distribution, in which measures of central tendency are equal. If the data are skewed, this kind of model will always underestimate skewness risk in its predictions. The more skewed the data, the less accurate this financial model will be.
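To make the idea concrete, the sketch below compares how often a large loss actually occurs in a made-up, left-skewed set of returns with how often a normal model fitted to the same mean and standard deviation says it should occur. The return figures are entirely hypothetical, and the two-standard-deviation threshold is an arbitrary choice for illustration.

```python
from statistics import NormalDist, fmean, pstdev

# Hypothetical period returns in percent: 90 small gains and 10 large losses
returns = [1.0] * 90 + [-20.0] * 10

mu = fmean(returns)
sigma = pstdev(returns)
threshold = mu - 2 * sigma  # "large loss" = more than 2 standard deviations below the mean

# Share of observed returns that fall below the threshold
empirical = sum(r < threshold for r in returns) / len(returns)

# Probability of the same event under a normal model with the same mean and sigma
normal_model = NormalDist(mu, sigma).cdf(threshold)

print(f"observed frequency of large losses: {empirical:.3f}")
print(f"normal-model probability:           {normal_model:.3f}")
```

In this toy data set the large losses occur in 10% of periods, while the normal model assigns the same event a probability of a little over 2%. The gap between the two figures is the kind of underestimate the paragraph above describes.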
Examples of a Skewed Distribution
The departure from "normal" returns has been observed with more frequency in the last two decades, beginning with the internet bubble of the late 1990s. In fact, asset returns tend to be increasingly right-skewed. This volatility occurred with notable events, such as the Sept. 11 terrorist attacks, the housing bubble collapse and subsequent financial crisis, and during the years of quantitative easing (QE).
The broad stock market is often considered to have a negatively skewed distribution. The notion is that the market more often delivers a small positive return and less often a large negative loss. However, studies have shown that the equity of an individual firm may tend to be left-skewed.
A common example of skewness is the distribution of household income within the United States, as relatively few households earn very high annual incomes. For example, consider 2020 household income statistics. The lowest quintile of income ranged from $0 to $27,026, while the highest quintile of income ranged from $85,077 to $141,110. With the highest quintile spanning a range more than twice as wide as the lowest quintile, higher-income data points are more dispersed and cause a positively skewed distribution.
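The comparison of the two income bands is simple arithmetic, sketched below using only the dollar figures quoted in the paragraph above. The variable names are hypothetical.

```python
# Income bands as quoted above (2020 household income, in dollars)
lowest_band = (0, 27_026)
highest_band = (85_077, 141_110)

lowest_width = lowest_band[1] - lowest_band[0]
highest_width = highest_band[1] - highest_band[0]
ratio = highest_width / lowest_width

print(f"lowest band width:  ${lowest_width:,}")
print(f"highest band width: ${highest_width:,}")
print(f"ratio: {ratio:.2f}")
```

Each band holds the same share of households, so a band that is about twice as wide means its data points are spread over about twice the income range. That wider spread at the top is what stretches the right tail.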
What Does Skewness Tell Us?
Skewness tells us the direction of outliers. In a positive skew, the tail of a distribution curve is longer on the right side. This means the outliers of the distribution curve are further out towards the right and closer to the mean on the left. Skewness does not inform on the number of outliers; it only communicates the direction of outliers.
What Causes Skewness?
Skewness is simply a reflection of a data set in which activity is heavily concentrated in one range and less concentrated in another. Imagine scores being measured at an Olympic long jump contest. Many jumpers will likely land long distances, while fewer will land short distances. Because the tail of infrequent results lies on the short-distance side, this often creates a left-skewed distribution. Therefore, the relationship between the data points and how often they occur causes skewness.
Is Skewness Normal?
Skewness is commonly found when analyzing data sets, as there are situations in which skewness is simply a feature of the data being analyzed. For example, consider the human lifespan. Most people die after reaching old age, and relatively few die when they are younger. In this case, skewness is expected and normal.
What Does High Skewness Mean?
High skewness means a distribution curve has a short tail on one end and a long tail on the other. Rather than following a symmetrical normal distribution curve, highly skewed data is not evenly distributed around its center. The data points favor one side of the distribution due to the nature of the underlying data.