What is Correlation?
Correlation, in the finance and investment industries, is a statistic that measures the degree to which two securities move in relation to each other. Correlation is used in advanced portfolio management and is expressed as the correlation coefficient, which always falls between -1.0 and +1.0.
- Correlation is a statistic that measures the degree to which two variables move in relation to each other.
- In finance, correlation can measure how a stock moves relative to a benchmark index, such as the S&P 500.
- Correlation does not imply causation!
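The Formula for Correlation Is
$$r = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum (x_i - \bar{x})^2 \, \sum (y_i - \bar{y})^2}}$$
Where r is the correlation coefficient and:
x_i and y_i are the individual observations of variables x and y; x̄ is the average of observations of variable x; and ȳ is the average of observations of variable y.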
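As a minimal sketch, the correlation coefficient can be computed directly from deviations around the averages x̄ and ȳ. The function name pearson_r and its error handling are illustrative choices, not part of the original text.

```python
import math

def pearson_r(xs, ys):
    """Correlation coefficient from deviations around each series' mean.

    Hypothetical helper name; the inputs are two equal-length number lists.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    x_bar = sum(xs) / len(xs)  # average of x
    y_bar = sum(ys) / len(ys)  # average of y
    # Numerator: sum of products of paired deviations
    cross = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    # Denominator terms: sums of squared deviations
    x_ss = sum((x - x_bar) ** 2 for x in xs)
    y_ss = sum((y - y_bar) ** 2 for y in ys)
    if x_ss == 0 or y_ss == 0:
        raise ValueError("correlation is undefined when a series is constant")
    return cross / math.sqrt(x_ss * y_ss)
```

The check for a constant series is there because the denominator would otherwise be zero.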
A perfect positive correlation means that the correlation coefficient is exactly 1. This implies that as one security moves, either up or down, the other security moves in lockstep, in the same direction. A perfect negative correlation means that the coefficient is exactly -1 and the two assets move in lockstep in opposite directions, while a zero correlation implies no linear relationship at all.
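The three cases can be illustrated with small made-up series; the numbers below are hypothetical and chosen only to make the outcomes easy to see. The last pair shows that a coefficient of zero rules out only a straight-line relationship: y is fully determined by x (y = x squared), yet the coefficient is zero.

```python
import math

def corr(xs, ys):
    """Correlation coefficient from deviations around the means (illustrative helper)."""
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    cross = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    x_ss = sum((x - x_bar) ** 2 for x in xs)
    y_ss = sum((y - y_bar) ** 2 for y in ys)
    return cross / math.sqrt(x_ss * y_ss)

base = [1, 2, 3, 4, 5]
lockstep = [10, 20, 30, 40, 50]   # rises whenever base rises
mirror = [50, 40, 30, 20, 10]     # falls whenever base rises
centered = [-2, -1, 0, 1, 2]
squared = [x * x for x in centered]  # dependent on x, but not linearly

r_positive = corr(base, lockstep)    # perfect positive case
r_negative = corr(base, mirror)      # perfect negative case
r_zero = corr(centered, squared)     # zero correlation despite dependence

print(r_positive, r_negative, r_zero)
```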
For example, large-cap mutual funds generally have a high positive correlation to the Standard and Poor's (S&P) 500 Index - very close to 1. Small-cap stocks have a positive correlation to that same index, but it is not as high - generally around 0.8.
However, put option prices and their underlying stock prices will tend to have a negative correlation. As the stock price increases, the put option prices go down. This is a direct and high-magnitude negative correlation.
Investment managers, traders and analysts find it very important to calculate correlation, because the risk reduction benefits of diversification rely on this statistic. Financial spreadsheets and software can calculate the value of correlation quickly.
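One way to see why diversification depends on correlation is the standard two-asset portfolio variance formula, sketched below. The weights and volatilities are hypothetical inputs chosen for illustration, and the function name is an assumption rather than anything from the original text.

```python
import math

def two_asset_volatility(w1, sigma1, sigma2, rho):
    """Volatility of a two-asset portfolio with weights w1 and (1 - w1).

    Uses variance = w1^2*s1^2 + w2^2*s2^2 + 2*w1*w2*rho*s1*s2.
    """
    w2 = 1.0 - w1
    variance = (w1 * sigma1) ** 2 + (w2 * sigma2) ** 2 \
        + 2 * w1 * w2 * rho * sigma1 * sigma2
    # Guard against tiny negative values caused by floating-point rounding
    return math.sqrt(max(variance, 0.0))

# Hypothetical example: equal weights, both assets with 20% volatility
for rho in (1.0, 0.5, 0.0, -0.5, -1.0):
    vol = two_asset_volatility(0.5, 0.20, 0.20, rho)
    print(f"correlation {rho:+.1f} -> portfolio volatility {vol:.4f}")
```

Under these assumptions, portfolio volatility equals the assets' own volatility when the correlation is +1 and shrinks as the correlation falls, which is the risk reduction the paragraph above refers to.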
Assume an analyst needs to calculate the correlation for the following two data sets:
X: 41, 19, 23, 40, 55, 57, 33
Y: 94, 60, 74, 71, 82, 76, 61
There are three steps involved in finding the correlation. The first is to add up all the X values to find SUM(X), add up all the Y values to find SUM(Y), and multiply each X value by its corresponding Y value and sum the products to find SUM(X,Y):
SUM(X) = (41 + 19 + 23 + 40 + 55 + 57 + 33) = 268
SUM(Y) = (94 + 60 + 74 + 71 + 82 + 76 + 61) = 518
SUM(X,Y) = (41 x 94) + (19 x 60) + (23 x 74) + ... + (33 x 61) = 20,391
The next step is to take each X value, square it, and sum up all these values to find SUM(X^2). The same must be done for the Y values:
SUM(X^2) = (41^2) + (19^2) + (23^2) + ... + (33^2) = 11,534
SUM(Y^2) = (94^2) + (60^2) + (74^2) + ... + (61^2) = 39,174
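The five sums from these first two steps can be checked with a few lines of Python. The variable names are illustrative; the data are the X and Y sets given in the example.

```python
xs = [41, 19, 23, 40, 55, 57, 33]
ys = [94, 60, 74, 71, 82, 76, 61]

sum_x = sum(xs)                              # SUM(X)
sum_y = sum(ys)                              # SUM(Y)
sum_xy = sum(x * y for x, y in zip(xs, ys))  # SUM(X,Y): paired products
sum_x2 = sum(x * x for x in xs)              # SUM(X^2)
sum_y2 = sum(y * y for y in ys)              # SUM(Y^2)

print(sum_x, sum_y, sum_xy, sum_x2, sum_y2)  # prints 268 518 20391 11534 39174
```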
Noting that there are seven observations, n, the following formula can be used to find the correlation coefficient, r:
r = (n x SUM(X,Y) - SUM(X) x SUM(Y)) / SquareRoot((n x SUM(X^2) - SUM(X)^2) x (n x SUM(Y^2) - SUM(Y)^2))
In this example, the correlation would be:
r = (7 x 20,391 - (268 x 518)) / SquareRoot((7 x 11,534 - 268^2) x (7 x 39,174 - 518^2)) = 3,913 / 7,248.4 = 0.54
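The final step can be reproduced from the sums alone. This is a sketch of the computational formula used in the example, with a hypothetical function name; the inputs are the totals stated in the text.

```python
import math

def correlation_from_sums(n, sum_x, sum_y, sum_xy, sum_x2, sum_y2):
    """Correlation coefficient from precomputed sums (illustrative helper)."""
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator

# Totals taken from the worked example (seven observations)
r = correlation_from_sums(7, 268, 518, 20391, 11534, 39174)
print(f"{r:.2f}")  # prints 0.54
```

Working from the sums mirrors the hand calculation; it gives the same result as computing deviations from the averages.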