DEFINITION of 'Sum Of Squares'

Sum of Squares is a statistical technique used in regression analysis to determine the dispersion of data points. In a regression analysis, the goal is to determine how well a data series can be fitted to a function which might help to explain how the data series was generated. The sum of squares is used as a mathematical way to find the function which best fits (varies least) from the data.

The calculation for Sum of Squares is ∑(xi – x̄)2

Sum of squares is also known as Variation.

BREAKING DOWN 'Sum Of Squares'

The sum of squares is a measure of deviation from the mean. In statistics, the mean is the average of a set of numbers and is the most commonly used measure of central tendency. The arithmetic mean is simply calculated by summing up the values in the data set and dividing by the number of values. Let’s say the closing prices of Microsoft (MSFT) in the last five days were {74.01, 74.77, 73.94, 73.61, 73.40} in US dollars. The sum of the total prices is $369.73 and the mean or average price of the textbook is, therefore, $369.73/5 = $73.95.

But knowing the mean of a measurement set is not always enough. Sometimes, it is helpful to know how much variation there is in a set of measurements. How far apart the individual values are from the mean may give some insight into how fit the observations or values are to the regression model that is created. For example, if an analyst wanted to know whether the share price of MSFT moves in tandem with the price of Apple (AAPL), he can list out the set of observations for the process of both stocks for a certain period, say 1, 2, or 10 years and create a linear model with each of the observations or measurements recorded. If the relationship between both variables i.e. price of AAPL and price of MSFT, is not a straight line, then there are variations in the data set that need to be scrutinized. In statistics speak, if the line in the linear model created does not pass through all the measurements of value, then some of the variability that has been observed in the share prices is unexplained. The sum of squares is used to calculate whether a linear relationship exists between two variables, and any unexplained variability is referred to as the residual sum of squares.

The sum of squares is the sum of the square of variation, where variation is defined as the spread between each individual value and the mean. To determine the sum of squares, the distance between each data point and the line of best fit is squared and then summed up. The line of best fit will minimize this value.

The formula for sum of squares is ∑(xi – x̄)2

where ∑ = sum

             xi = each value in the set

             x̄ = mean

             xi – x̄ = deviation

             (xi – x̄)2 = square of deviation

Now you can see why the measurement is called the sum of squared deviations, or the sum of squares for short. Using our MSFT example above, the sum of squares can be calculated as:

SS = (74.01 - 73.95)2 + (74.77 - 73.95)2 + (73.94 - 73.95)2 + (73.61 - 73.95)2 + (73.40 - 73.95)2

SS = (0.06) 2 + (0.82)2 + (-0.01)2 + (-0.34)2 + (-0.55)2

SS = 1.0942

Adding the sum of the deviations alone without squaring will result in a number equal to or close to zero since the negative deviations will almost perfectly offset the positive deviations. To get a more realistic number, the sum of deviations must be squared. The sum of squares will always be a positive number because the square of any number, whether positive or negative, is always positive. 

A high sum of squares indicates that most of the values are farther away from the mean, and hence, there is large variability in the data. A low sum of squares refers to low variability in the set of observations. In the example above, 1.0942 shows that the variability in the stock price of MSFT in the last five days is very low and investors looking to invest in stocks characterized by price stability and low volatility may opt for MSFT. However, making an investment decision on what stock to purchase requires much more observations than the five listed here. An analyst may have to work with years of data to know with a higher certainty how high or low the variability of an asset is. As more data points are added to the set, the sum of squares becomes larger as the values will be more spread out.

The most widely used measurements of variation are the standard deviation and the variance. However, to calculate either of the two metrics, the sum of squares must first be calculated. The variance is the average of the sum of squares, i.e.the sum of squares divided by the number of observations. The standard deviation is the square root of the variance.

There are two methods of regression analysis which use the sum of squares: the linear least squares method and the non-linear least squares method. Least squares refers to the fact that the regression function minimizes the sum of the squares of the variance from the actual data points. In this way, it is possible to draw a function which statistically provides the best fit for the data. Note that a regression function can either be linear (a straight line) or non-linear (a curving line).

  1. Square Position

    A square position is a situation where the market risk has been ...
  2. Sales Per Square Foot

    Sales per square foot is simply the total revenue divided by ...
  3. Standard Deviation

    The standard deviation is a statistic that measures the dispersion ...
  4. Chi Square Statistic

    A chi square statistic is a measurement of how expectations compare ...
  5. Regression

    A statistical measure that attempts to determine the strength ...
  6. Coefficient of Determination

    The coefficient of determination is a measure used in statistical ...
Related Articles
  1. Tech

    What is Square, Inc?

    Find out how Square helps small businesses accept credit cards, track sales and inventory through their mobile point of sale (POS) systems.
  2. Tech

    Square Inc. to Report Q3 Earnings (SQ)

    The global fintech leader may be positioned to secure its fourth consecutive earnings beat since its public debut in November of last year.
  3. Tech

    How Square Inc. Transformed in 2016

    Jack Dorsey's fintech platform saw rapid growth in Square Capital and with larger sellers.
  4. Tech

    Square Moves to Attract 'Upmarket' Clients (SQ)

    In the fintech leader's latest earnings call, the company indicates a push to harness larger businesses in order to gain more stable and lucrative revenue.
  5. Investing

    Square Seen Rising Higher on Bitcoin, M&A

    Square's stock already has risen nearly 60% in the past year, far outpacing the S&P 500.
  6. Investing

    Square Is Winning With Its Cash App: Nomura

    Square is making swift strides against its payment-processing competitors.
  7. Tech

    Fintech: Square Maintains Hold on Mobile Payments (SQ)

    With the announcement of faster chip card readers, Square's bread and butter remains in mobile payments.
  8. Investing

    Square’s 70% Run-Up Prompts Credit Suisse to Downgrade

    Credit Suisse thinks all the digital payments provider's upside is already baked into the stock
  9. Insights

    Apple, Square Announce New Payments Partnership (AAPL, SQ)

    Combined, both services give consumers better, more secure options than they currently had with credit and debit cards, the executives said.
  10. Investing

    Square's Overbought Shares May Plunge in Short Term

    Square Inc. stock has soared in 2018, rising by nearly 78%, but shares may be poised to pull back.
  1. What is the difference between the expected return and the standard deviation of ...

    Learn about the expected return and standard deviation and the difference between the expected return and standard deviation ... Read Answer >>
  2. What is the difference between linear regression and multiple regression?

    Learn the difference between linear regression and multiple regression and how multiple regression encompasses not only linear ... Read Answer >>
Trading Center