DEFINITION of Sum Of Squares

Sum of Squares is a statistical technique used in regression analysis to determine the dispersion of data points. In a regression analysis, the goal is to determine how well a data series can be fitted to a function which might help to explain how the data series was generated. The sum of squares is used as a mathematical way to find the function which best fits (varies least) from the data.

The calculation for Sum of Squares is ∑(xi – x̄)2

Sum of squares is also known as Variation.


The sum of squares is a measure of deviation from the mean. In statistics, the mean is the average of a set of numbers and is the most commonly used measure of central tendency. The arithmetic mean is simply calculated by summing up the values in the data set and dividing by the number of values. Let’s say the closing prices of Microsoft (MSFT) in the last five days were {74.01, 74.77, 73.94, 73.61, 73.40} in US dollars. The sum of the total prices is $369.73 and the mean or average price of the textbook is, therefore, $369.73/5 = $73.95.

But knowing the mean of a measurement set is not always enough. Sometimes, it is helpful to know how much variation there is in a set of measurements. How far apart the individual values are from the mean may give some insight into how fit the observations or values are to the regression model that is created. For example, if an analyst wanted to know whether the share price of MSFT moves in tandem with the price of Apple (AAPL), he can list out the set of observations for the process of both stocks for a certain period, say 1, 2, or 10 years and create a linear model with each of the observations or measurements recorded. If the relationship between both variables i.e. price of AAPL and price of MSFT, is not a straight line, then there are variations in the data set that need to be scrutinized. In statistics speak, if the line in the linear model created does not pass through all the measurements of value, then some of the variability that has been observed in the share prices is unexplained.

The sum of squares is used to calculate whether a linear relationship exists between two variables, and any unexplained variability is referred to as the residual sum of squares.

The sum of squares is the sum of the square of variation, where variation is defined as the spread between each individual value and the mean. To determine the sum of squares, the distance between each data point and the line of best fit is squared and then summed up. The line of best fit will minimize this value.

The formula for sum of squares is ∑(xi – x̄)2

where ∑ = sum

             xi = each value in the set

             x̄ = mean

             xi – x̄ = deviation

             (xi – x̄)2 = square of deviation

Now you can see why the measurement is called the sum of squared deviations, or the sum of squares for short. Using our MSFT example above, the sum of squares can be calculated as:

SS = (74.01 - 73.95)2 + (74.77 - 73.95)2 + (73.94 - 73.95)2 + (73.61 - 73.95)2 + (73.40 - 73.95)2

SS = (0.06) 2 + (0.82)2 + (-0.01)2 + (-0.34)2 + (-0.55)2

SS = 1.0942

Adding the sum of the deviations alone without squaring will result in a number equal to or close to zero since the negative deviations will almost perfectly offset the positive deviations. To get a more realistic number, the sum of deviations must be squared. The sum of squares will always be a positive number because the square of any number, whether positive or negative, is always positive. 

A high sum of squares indicates that most of the values are farther away from the mean, and hence, there is large variability in the data. A low sum of squares refers to low variability in the set of observations. In the example above, 1.0942 shows that the variability in the stock price of MSFT in the last five days is very low and investors looking to invest in stocks characterized by price stability and low volatility may opt for MSFT. However, making an investment decision on what stock to purchase requires much more observations than the five listed here. An analyst may have to work with years of data to know with a higher certainty how high or low the variability of an asset is. As more data points are added to the set, the sum of squares becomes larger as the values will be more spread out.

The most widely used measurements of variation are the standard deviation and the variance. However, to calculate either of the two metrics, the sum of squares must first be calculated. The variance is the average of the sum of squares, i.e.the sum of squares divided by the number of observations. The standard deviation is the square root of the variance.

There are two methods of regression analysis which use the sum of squares: the linear least squares method and the non-linear least squares method. Least squares refers to the fact that the regression function minimizes the sum of the squares of the variance from the actual data points. In this way, it is possible to draw a function which statistically provides the best fit for the data. Note that a regression function can either be linear (a straight line) or non-linear (a curving line).