# Chi Square Statistic

## What is a 'Chi Square Statistic'

A chi square statistic measures how observed results compare with expected results. The data used in calculating a chi square statistic must be random, raw, mutually exclusive, drawn from independent variables and drawn from a large enough sample. For example, the results of tossing a coin 100 times meet these criteria.

## BREAKING DOWN 'Chi Square Statistic'

As a simple example of how to calculate and use the chi square statistic, consider tossing a coin 100 times. The expected result of tossing a fair coin 100 times is that heads will come up 50 times and tails will come up 50 times. The actual result might be that heads comes up 45 times and tails comes up 55 times. The chi square statistic condenses the discrepancies between the expected results and the actual results into a single number.
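The coin-toss comparison can be sketched in a few lines of Python. This is a minimal illustration, assuming the standard form of the statistic (the sum over outcomes of the squared difference between observed and expected counts, divided by the expected count); the function name `chi_square` and the variable names are hypothetical, chosen only for this example.

```python
def chi_square(observed, expected):
    """Sum of (O - E)^2 / E over paired observed and expected counts."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Coin-toss example from the text: 45 heads and 55 tails observed,
# against 50 heads and 50 tails expected for a fair coin.
coin_statistic = chi_square([45, 55], [50, 50])
print(coin_statistic)  # 1.0
```

Each outcome contributes (5 ^ 2) / 50 = 0.5, so the two outcomes together give 1.0. A result of exactly 50 heads and 50 tails would give 0.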

## Example Chi Squared Calculation

Imagine a random poll of 2,000 voters, both male and female. The respondents were classified by their gender and by whether they were Republican, Democrat or independent. Imagine a grid with the columns labeled Republican, Democrat and independent and two rows labeled male and female. Assume the data from the 2,000 respondents are as follows:

| | Republican | Democrat | Independent | Total |
| --- | --- | --- | --- | --- |
| Male | 400 | 300 | 100 | 800 |
| Female | 500 | 600 | 100 | 1,200 |
| Total | 900 | 900 | 200 | 2,000 |

The first step to calculate the chi squared statistic is to find the expected frequencies. These are calculated for each "cell" in the grid. Since there are two categories of gender and three categories of political view, there are six total expected frequencies. The formula for the expected frequency is:

E(r,c) = (n(r) x n(c)) / n

Where "r" is the row in question, "c" is the column in question, n(r) is the total for row r, n(c) is the total for column c, and "n" is the grand total. In this example, the expected frequencies are:

E(1,1) = (900 x 800) / 2,000 = 360

E(1,2) = (900 x 800) / 2,000 = 360

E(1,3) = (200 x 800) / 2,000 = 80

E(2,1) = (900 x 1,200) / 2,000 = 540

E(2,2) = (900 x 1,200) / 2,000 = 540

E(2,3) = (200 x 1,200) / 2,000 = 120
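The six expected frequencies above can be reproduced with a short Python sketch. It assumes the poll counts are stored as a list of rows (male, female) with columns in the order republican, democrat, independent; the variable names are illustrative only.

```python
# Observed counts from the poll: rows are male, female;
# columns are republican, democrat, independent.
observed = [
    [400, 300, 100],
    [500, 600, 100],
]

row_totals = [sum(row) for row in observed]        # [800, 1200]
col_totals = [sum(col) for col in zip(*observed)]  # [900, 900, 200]
grand_total = sum(row_totals)                      # 2000

# Expected count for each cell: row total x column total / grand total.
expected = [
    [r * c / grand_total for c in col_totals]
    for r in row_totals
]

for row in expected:
    print(row)
# [360.0, 360.0, 80.0]
# [540.0, 540.0, 120.0]
```

Note that the expected counts in each row and column add up to the same totals as the observed counts, which is a quick way to check the arithmetic.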

Next, these values are used to calculate the chi squared statistic using the following formula:

Chi squared = Sum of [(O(r,c) - E(r,c)) ^ 2 / E(r,c)], where O(r,c) is the observed count for the given row and column and the sum runs over every cell in the grid.

In this example, the contribution of each cell to the sum is:

Cell (1,1): (400 - 360) ^ 2 / 360 = 4.44

Cell (1,2): (300 - 360) ^ 2 / 360 = 10

Cell (1,3): (100 - 80) ^ 2 / 80 = 5

Cell (2,1): (500 - 540) ^ 2 / 540 = 2.96

Cell (2,2): (600 - 540) ^ 2 / 540 = 6.67

Cell (2,3): (100 - 120) ^ 2 / 120 = 3.33

The chi squared statistic then equals the sum of these values, or 32.41.
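As a check on the arithmetic, the whole calculation can be sketched in Python. The observed and expected counts below are copied from the example; the variable names are illustrative, and the code simply applies the sum of (O - E) ^ 2 / E to every cell.

```python
# Observed and expected counts from the example
# (rows: male, female; columns: republican, democrat, independent).
observed = [[400, 300, 100], [500, 600, 100]]
expected = [[360, 360, 80], [540, 540, 120]]

# Contribution of each cell: (O - E)^2 / E.
contributions = [
    [(o - e) ** 2 / e for o, e in zip(obs_row, exp_row)]
    for obs_row, exp_row in zip(observed, expected)
]

# The chi squared statistic is the sum of all cell contributions.
chi_squared = sum(sum(row) for row in contributions)
print(round(chi_squared, 2))  # 32.41
```

Summing the unrounded contributions gives about 32.407, which rounds to the same 32.41 obtained by adding the rounded values above.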