### What is Econometrics?

Econometrics is the quantitative application of statistical and mathematical models to data in order to develop theories or test existing hypotheses in economics, and to forecast future trends from historical data. It subjects real-world data to statistical tests and then compares the results against the theory or theories being tested. Depending on whether you are interested in testing an existing theory or in using existing data to develop a new hypothesis based on those observations, econometrics can be subdivided into two major categories: theoretical and applied. Those who routinely engage in this practice are commonly known as econometricians.

### BREAKING DOWN Econometrics

Econometrics analyzes data using statistical methods in order to test or develop economic theory. These methods quantify and analyze economic theories with tools such as frequency distributions, probability and probability distributions, statistical inference, correlation analysis, simple and multiple regression analysis, simultaneous equations models and time series methods.

Econometrics was pioneered by Lawrence Klein, Ragnar Frisch and Simon Kuznets. All three won the Nobel Prize in economics for their contributions: Frisch in 1969, Kuznets in 1971 and Klein in 1980. Today, it is used regularly by academics as well as by practitioners such as Wall Street traders and analysts.

An example of the application of econometrics is the study of the income effect using observable data. An economist may hypothesize that as a person's income increases, their spending will also increase. If the data show that such an association is present, a regression analysis can then be conducted to measure the strength of the relationship between income and consumption and to determine whether that relationship is statistically significant, that is, unlikely to be due to chance alone.
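The income and consumption example can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a full analysis: the figures are made up, and the helper `simple_ols` is a hypothetical name introduced here. It fits a line by least squares and reports the t-statistic of the slope, which is the usual quantity behind a significance test.

```python
import math

# Hypothetical annual figures in thousands of dollars (made up for illustration).
income = [30, 35, 40, 45, 50, 55, 60, 65, 70, 75]
consumption = [27, 30, 35, 37, 43, 45, 50, 52, 58, 60]

def simple_ols(x, y):
    """Fit y = a + b*x by least squares; return (a, b, t-statistic of b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    # Residual variance uses n - 2 degrees of freedom (two estimated parameters).
    rss = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    se_b = math.sqrt(rss / (n - 2) / sxx)
    return a, b, b / se_b

intercept, slope, t_stat = simple_ols(income, consumption)
print(f"consumption = {intercept:.2f} + {slope:.3f} * income, t = {t_stat:.1f}")
```

The slope estimates how much consumption changes per extra unit of income. With 10 observations, a t-statistic larger than about 2.3 in absolute value is conventionally read as significant at the 5% level.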

### The Methodology of Econometrics

The first step in econometric methodology is to obtain and analyze a set of data and define a specific hypothesis that explains the nature and shape of the set. The data may be, for example, the historical prices of a stock index, observations collected from a survey of consumer finances, or unemployment and inflation rates in different countries.

If you are interested in the relationship between the annual price change of the S&P 500 and the unemployment rate, you'd collect both sets of data. Here, you want to test the idea that higher unemployment leads to lower stock market prices. The stock market price change is therefore your *dependent variable* and the unemployment rate is the *independent* or *explanatory variable*. The most commonly assumed relationship is linear, meaning that a one-unit change in the explanatory variable is associated with a constant change in the dependent variable, which may be positive or negative. In that case a simple regression model is often used to explore the relationship. This amounts to fitting a best-fit line through the two sets of data and then measuring how far each data point lies, on average, from that line.
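In symbols, the simple regression described here can be written as follows. The notation is introduced for illustration and is not taken from the original text: $y_i$ is the annual price change of the S&P 500 in year $i$, $x_i$ is the unemployment rate, and $\varepsilon_i$ is an error term.

$$
y_i = \beta_0 + \beta_1 x_i + \varepsilon_i
$$

The least-squares estimates of the best-fit line are

$$
\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2},
\qquad
\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x},
$$

and the typical distance of the data points from the line is summarized by the residual standard error

$$
\hat{\sigma} = \sqrt{\frac{1}{n-2} \sum_{i=1}^{n} \left( y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i \right)^2 }.
$$

Under the hypothesis that higher unemployment goes with lower stock prices, one would expect $\hat{\beta}_1 < 0$.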

Note that you can have several explanatory variables in your analysis, for example changes in GDP and inflation in addition to unemployment when explaining stock market prices. When more than one explanatory variable is used, the model is referred to as multiple linear regression, the most commonly used tool in econometrics.
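A multiple linear regression can be sketched without any libraries by solving the normal equations. The example below is only an illustration: the data are synthetic, the "true" coefficients are chosen arbitrarily so that the estimates can be checked, and the helper names `solve` and `multiple_ols` are hypothetical. In practice a statistics package would handle this step.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def multiple_ols(columns, y):
    """Estimate y = b0 + b1*x1 + ... + bk*xk from the normal equations (X'X) b = X'y."""
    rows = [[1.0] + list(vals) for vals in zip(*columns)]  # prepend the intercept term
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

# Synthetic explanatory variables, in percent (made up for illustration).
unemployment = [4.0, 5.5, 6.1, 7.2, 8.0, 5.0, 4.4, 9.1]
gdp_growth = [3.1, 2.0, 0.5, 2.8, -0.5, 1.2, 3.3, 0.4]
inflation = [2.0, 3.1, 1.5, 2.8, 1.1, 3.5, 2.2, 0.9]

# Stock returns built from assumed coefficients with no noise, so OLS should recover them.
true_coefs = [10.0, -1.0, 2.0, -0.5]
returns = [
    true_coefs[0] + true_coefs[1] * u + true_coefs[2] * g + true_coefs[3] * p
    for u, g, p in zip(unemployment, gdp_growth, inflation)
]

coefs = multiple_ols([unemployment, gdp_growth, inflation], returns)
print([round(c, 4) for c in coefs])
```

Each coefficient is read as the change in the dependent variable associated with a one-unit change in that explanatory variable, holding the others fixed.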

Several different regression models exist, each suited to the nature of the data being analyzed and the type of question being asked. The most common example is *ordinary least-squares* (OLS) regression, which can be applied to several types of cross-sectional or time-series data. If you're interested in a binary (yes-no) outcome, for instance how likely you are to be fired from a job (yes, you get fired, or no, you do not) based on your productivity, you can use a logistic regression or a probit model. Today, there are hundreds of models that an econometrician has at their disposal.
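For the binary outcome case, a logistic regression can be sketched as follows. This is a minimal illustration, assuming made-up data and a plain gradient-ascent fit; the names `sigmoid` and `fit_logit`, the step count and the learning rate are all choices made for this example, and statistical packages use faster and more robust estimation routines. A probit model would replace the logistic function with the normal cumulative distribution function and is not shown.

```python
import math

# Hypothetical data: productivity score (0 to 10) and whether the worker was fired (1) or not (0).
productivity = [1, 2, 2, 3, 4, 4, 5, 6, 6, 7, 8, 9]
fired = [1, 1, 1, 1, 0, 1, 0, 1, 0, 0, 0, 0]

def sigmoid(z):
    """Logistic function mapping any real number to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logit(x, y, steps=5000, rate=0.1):
    """Fit P(y = 1) = sigmoid(a + b*x) by gradient ascent on the mean log-likelihood."""
    a = b = 0.0
    n = len(x)
    for _ in range(steps):
        errors = [yi - sigmoid(a + b * xi) for xi, yi in zip(x, y)]
        a += rate * sum(errors) / n
        b += rate * sum(e * xi for e, xi in zip(errors, x)) / n
    return a, b

a, b = fit_logit(productivity, fired)
for score in (2, 5, 8):
    print(f"productivity {score}: estimated P(fired) = {sigmoid(a + b * score):.2f}")
```

A negative slope `b` means that higher productivity is associated with a lower estimated probability of being fired.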

Econometrics is now conducted using statistical analysis software packages designed for these purposes, such as Stata, SPSS, or R. These packages can also easily test for statistical significance, which provides support for the claim that the empirical results produced by these models are not merely the result of chance. R-squared, t-tests, p-values, and null-hypothesis testing are all methods used by econometricians to evaluate the validity of their model results.
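Of the measures listed, R-squared is the simplest to compute by hand: it is the share of the variation in the dependent variable that the model's fitted values account for. The sketch below uses made-up numbers, and the function name `r_squared` is introduced here for illustration; the fitted values are assumed to come from some already estimated model.

```python
def r_squared(actual, fitted):
    """Return 1 - (residual sum of squares / total sum of squares)."""
    mean_y = sum(actual) / len(actual)
    ss_res = sum((y - f) ** 2 for y, f in zip(actual, fitted))
    ss_tot = sum((y - mean_y) ** 2 for y in actual)
    return 1.0 - ss_res / ss_tot

# Hypothetical observations and the values a fitted model predicts for them.
actual = [2.0, 4.1, 5.9, 8.2, 9.8]
fitted = [2.0, 4.0, 6.0, 8.0, 10.0]

r2 = r_squared(actual, fitted)
print(f"R-squared = {r2:.4f}")

# A model that always predicts the sample mean explains none of the variation.
baseline = r_squared(actual, [sum(actual) / len(actual)] * len(actual))
print(f"R-squared of the mean-only model = {baseline:.4f}")
```

A value near 1 indicates that the fitted values track the data closely, but a high R-squared alone says nothing about whether the estimated relationship is causal.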

Econometrics is sometimes criticized for relying too heavily on the interpretation of data without linking it to established economic theory. It is crucial that the findings revealed in the data can be adequately explained by a theory, even if that means developing your own theory of the underlying processes. Regression analysis also does not prove causation: an association between two data sets may be spurious. For example, drowning deaths in swimming pools increase with GDP. Does a growing economy cause people to drown? Of course not, but perhaps more people buy pools when the economy is booming.
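The pool example can be illustrated with a small simulation. Everything here is an assumption made for the sketch: the data are randomly generated, GDP is set up to affect drownings only through pool ownership, and the helper names `corr` and `residuals` are hypothetical. The code compares the raw correlation between GDP and drownings with the correlation that remains once pool ownership is accounted for.

```python
import math
import random

def corr(x, y):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def residuals(y, x):
    """Residuals from a simple least-squares regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((a - mx) * (c - my) for a, c in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return [c - my - slope * (a - mx) for a, c in zip(x, y)]

random.seed(0)  # fixed seed so the simulation is repeatable
n = 2000
gdp = [random.gauss(0, 1) for _ in range(n)]
pools = [g + random.gauss(0, 1) for g in gdp]        # pool ownership rises with GDP
drownings = [p + random.gauss(0, 1) for p in pools]  # drownings depend on pools only

raw = corr(gdp, drownings)
# Partial correlation: correlate what is left of each series after removing the part explained by pools.
partial = corr(residuals(gdp, pools), residuals(drownings, pools))
print(f"raw correlation = {raw:.2f}, after controlling for pools = {partial:.2f}")
```

In this setup the raw correlation is clearly positive, while the correlation after controlling for pools should be close to zero, which is what one would expect if pool ownership is the channel linking the two series.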