What is a 'Coefficient of Determination'
The coefficient of determination is a measure used in statistical analysis to assess how well a model explains observed outcomes and, by extension, how well it may predict future ones. It indicates the proportion of the variability in the data set that the model explains. The coefficient of determination, also commonly known as "R-squared," is used as a guideline for judging how well the model fits the data.
BREAKING DOWN 'Coefficient of Determination'
The coefficient of determination is used to explain how much of the variability of one factor can be explained by its relationship to another factor. It describes association, not causation. It is relied on heavily in trend analysis and is represented as a value between zero and one. The closer the value is to one, the better the fit, or relationship, between the two factors. The coefficient of determination is the square of the correlation coefficient, also known as "R," which allows it to display the degree of linear correlation between two variables.
This measure is often described as the "goodness of fit." A value of one indicates a perfect fit to the observed data, which suggests, but does not guarantee, that the model will be reliable for future forecasts. A value of zero, on the other hand, indicates that the model explains none of the variability in the data.
Analyzing the Coefficient of Determination
The coefficient of determination is the square of the correlation between the predicted scores in a data set and the actual scores. For a simple linear regression, it can also be expressed as the square of the correlation between the X and Y scores, with X being the independent variable and Y being the dependent variable.
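As a minimal sketch of the two expressions described above, the Python below fits a least-squares line to a small made-up data set and computes R-squared both ways. The data values and the helper names (mean, pearson_r, fit_line) are illustrative assumptions, not part of the original article.

```python
from math import sqrt

def mean(values):
    return sum(values) / len(values)

def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    ma, mb = mean(a), mean(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    var_a = sum((u - ma) ** 2 for u in a)
    var_b = sum((v - mb) ** 2 for v in b)
    return cov / sqrt(var_a * var_b)

def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    mx, my = mean(x), mean(y)
    slope = sum((u - mx) * (v - my) for u, v in zip(x, y)) / sum((u - mx) ** 2 for u in x)
    return slope, my - slope * mx

# Hypothetical data: x is the independent variable, y the dependent variable.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(x, y)
predicted = [slope * u + intercept for u in x]

# Square of the correlation between predicted and actual scores.
r2_from_predictions = pearson_r(predicted, y) ** 2
# Square of the correlation between X and Y scores.
r2_from_xy = pearson_r(x, y) ** 2

print(r2_from_predictions, r2_from_xy)
```

For a straight-line fit with a single independent variable, the two printed values should agree up to floating-point rounding.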
Regardless of representation, an R-squared equal to zero means that the dependent variable cannot be predicted from the independent variable any better than by simply using its mean. Conversely, if it equals one, the dependent variable is predicted by the independent variable without error. A coefficient of determination between these two values measures the extent to which the dependent variable is predictable from the independent variable. An R-squared of 0.20, for example, means that 20% of the variance in the dependent variable is explained by the independent variable.
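The two endpoints can be illustrated with the common "one minus unexplained variation over total variation" form of R-squared. This is a sketch under that assumption. The function name r_squared and the data are hypothetical.

```python
def r_squared(actual, predicted):
    """R-squared as 1 - (residual sum of squares / total sum of squares)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

# Hypothetical observed values of the dependent variable.
actual = [3.0, 5.0, 7.0, 9.0]

# A model that reproduces every observation exactly.
perfect = list(actual)
# A model that ignores the independent variable and always predicts the mean (6.0).
mean_only = [6.0] * len(actual)

r2_perfect = r_squared(actual, perfect)      # 1.0
r2_mean_only = r_squared(actual, mean_only)  # 0.0
print(r2_perfect, r2_mean_only)
```

A value such as 0.20 would fall between these two cases: the model removes 20% of the variation that remains when only the mean is used as the prediction.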
What Is the Goodness of Fit?
The goodness of fit, or the degree of linear correlation, measures the distance between a fitted line on a graph and all the data points that are scattered around the graph. A tight set of data will have a regression line that is very close to the points and a high level of fit, meaning that the distance between the line and the data is very small. A good fit has an R-squared that is close to one.
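To make the link between distance from the line and R-squared concrete, the sketch below fits a line to two made-up data sets that follow the same trend, one tightly and one loosely. All data values and helper names are illustrative assumptions.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    slope = sum((u - mx) * (v - my) for u, v in zip(x, y)) / sum((u - mx) ** 2 for u in x)
    return slope, my - slope * mx

def fit_summary(x, y):
    """Return (R-squared, mean absolute vertical distance from the fitted line)."""
    slope, intercept = fit_line(x, y)
    residuals = [v - (slope * u + intercept) for u, v in zip(x, y)]
    my = sum(y) / len(y)
    ss_res = sum(r ** 2 for r in residuals)
    ss_tot = sum((v - my) ** 2 for v in y)
    mean_abs_distance = sum(abs(r) for r in residuals) / len(residuals)
    return 1.0 - ss_res / ss_tot, mean_abs_distance

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
# Hypothetical points that stay within 0.1 of the trend y = 2x.
tight_y = [2.1, 3.9, 6.1, 7.9, 10.1, 11.9]
# Hypothetical points that stray 3.0 above or below the same trend.
scattered_y = [5.0, 1.0, 9.0, 5.0, 13.0, 9.0]

r2_tight, dist_tight = fit_summary(x, tight_y)
r2_scattered, dist_scattered = fit_summary(x, scattered_y)

print(f"tight:     R-squared={r2_tight:.3f}, mean distance={dist_tight:.3f}")
print(f"scattered: R-squared={r2_scattered:.3f}, mean distance={dist_scattered:.3f}")
```

The tight set should report an R-squared close to one and a small mean distance, while the scattered set reports a much lower R-squared and a larger mean distance.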
However, R-squared cannot determine whether the data points or predictions are biased. It also does not tell the analyst or user whether a given coefficient of determination value is good or not. A low R-squared is not necessarily bad, for example, and it is up to the analyst to judge the R-squared number in the context of the problem.
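One way to see how a high R-squared can coexist with biased predictions is to fit a straight line to data that actually follow a curve. The sketch below uses a made-up quadratic data set for this purpose. The data and helper names are assumptions chosen for illustration only.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    slope = sum((u - mx) * (v - my) for u, v in zip(x, y)) / sum((u - mx) ** 2 for u in x)
    return slope, my - slope * mx

# Hypothetical data that follow a curve (y = x squared), not a straight line.
x = [float(i) for i in range(11)]
y = [u ** 2 for u in x]

slope, intercept = fit_line(x, y)
residuals = [v - (slope * u + intercept) for u, v in zip(x, y)]

mean_y = sum(y) / len(y)
r2 = 1.0 - sum(r ** 2 for r in residuals) / sum((v - mean_y) ** 2 for v in y)

print(f"R-squared = {r2:.3f}")
# The sign of each residual shows where the line under- or over-predicts.
for u, r in zip(x, residuals):
    print(f"x={u:4.1f}  residual={r:7.2f}")
```

The R-squared here is above 0.9, yet the residuals are positive at both ends of the range and negative in the middle. The line systematically under-predicts at the extremes and over-predicts in between, a pattern the single R-squared number does not reveal.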