What is a T-Test?

A t-test is a type of inferential statistic used to determine whether there is a significant difference between the means of two groups, which may be related in certain features. It is mostly used when the data sets, like the number of heads recorded over repeated runs of 100 coin flips, follow an approximately normal distribution and may have unknown variances. A t-test is used as a hypothesis testing tool, which allows testing of an assumption applicable to a population.

A t-test looks at the t-statistic, the t-distribution values, and the degrees of freedom to determine the statistical significance of the difference between two sets of data. To compare the means of three or more groups, one must use an analysis of variance (ANOVA).



Breaking Down a T-Test

Consider a drug manufacturer that wants to test a newly invented medicine. It follows the standard procedure of trying the drug on one group of patients and giving a placebo to another group, called the control group. The placebo given to the control group is a substance of no intended therapeutic value and serves as a benchmark to measure how the other group, which is given the actual drug, responds. After the drug trial, the placebo-fed control group shows an increase in average life expectancy of three years, while the group prescribed the new drug shows an increase of four years. A first look may suggest that the drug is indeed working, since the results are better for the group using the drug. However, it is also possible that the difference is due to chance. A t-test helps determine whether the difference is statistically significant and can be generalized to the entire population.

In a school, 100 students in class A scored an average of 85% with a standard deviation of 3%. Another 100 students belonging to class B scored an average of 87% with a standard deviation of 4%. While the average of class B is better than that of class A, it may not be correct to jump to the conclusion that the overall performance of students in class B is better than that of students in class A. This is because, along with the mean, the standard deviation of class B is also higher than that of class A, which indicates that its scores were more spread out on both the lower and higher sides. A t-test can help to determine whether the difference between the two class averages is statistically significant.

Essentially, a t-test allows us to compare the average values of the two data sets and determine if they came from the same population. In the above examples, if we were to take a sample of students from class A and another sample of students from class B, we would not expect them to have exactly the same mean and standard deviation. Similarly, samples taken from the placebo-fed control group and those taken from the drug-prescribed group would be expected to have slightly different means and standard deviations.

Mathematically, the t-test takes a sample from each of the two sets and establishes the problem statement by assuming a null hypothesis that the two means are equal. Based on the applicable formulas, certain values are calculated and compared against the standard values, and the null hypothesis is either rejected or not rejected accordingly. If the null hypothesis is rejected, it indicates that the observed difference is unlikely to be due to chance alone.
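In symbols, the two competing statements for a two-sided test can be sketched as follows. The symbols for the population means of the two groups are notation assumed here for illustration.

```latex
H_0 : \mu_1 = \mu_2
\qquad \text{versus} \qquad
H_1 : \mu_1 \neq \mu_2
```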

Formula for Performing a T-Test

Calculating a t-test requires three key data values: the difference between the mean values from each data set (called the mean difference), the standard deviation of each group, and the number of data values in each group.

The t-test produces a t-value. This calculated t-value is then compared against a value obtained from a critical value table (called the T Distribution Table). This comparison helps to determine whether the difference between the means is likely to have occurred by chance or whether the data sets really differ. The t-test asks whether the difference between the groups represents a true difference or merely random variation.

The T Distribution table is available in one-tailed and two-tailed formats. The former is used for assessing cases which have a fixed value or range with a clear direction (positive/negative), for instance, the probability of an output value remaining below -3, or of getting more than seven when rolling a pair of dice. The latter is used for range-bound analysis, such as asking whether the coordinates fall between -2 and +2.

The calculations can be performed with standard software programs that support the necessary statistical functions, like those found in MS Excel.

The t-test produces two values as its output: the t-value and the degrees of freedom. The t-value is a ratio of the difference between the means of the two sample sets to the variation that exists within the sample sets. While the numerator (the difference between the means of the two sample sets) is straightforward to calculate, the denominator (the variation that exists within the sample sets) can become a bit complicated depending upon the type of data values involved. The denominator of the ratio is a measurement of the dispersion, or variability. A larger absolute t-value, also called a t-score, indicates that a large difference exists between the two sample sets. The smaller the absolute t-value, the more similarity exists between the two sample sets.

  • A large t-score indicates that the groups are different.
  • A small t-score indicates that the groups are similar.
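As a rough illustration of this ratio, the sketch below computes a t-value from summary statistics alone, using the class A and class B figures from the earlier example. The function name `t_from_summary` is hypothetical, and the denominator uses the unpooled standard error of the difference, which is one common choice (with equal group sizes it gives the same t-value as the pooled form).

```python
import math


def t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """t-value for two independent groups, from summary statistics only."""
    # Numerator: difference between the two sample means.
    mean_difference = mean1 - mean2
    # Denominator: standard error of that difference (unpooled form),
    # a measure of the variability within the two groups.
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return mean_difference / standard_error


# Class A: mean 85, SD 3, n = 100; class B: mean 87, SD 4, n = 100.
t_value = t_from_summary(85, 3, 100, 87, 4, 100)
print(round(t_value, 2))
```

The result is about -4, a fairly large t-score in absolute terms, which suggests the gap between the two class averages is larger than the spread within the classes would readily explain.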

Degrees of freedom are the number of values in a study that have the freedom to vary and are essential for assessing the importance and the validity of the null hypothesis. Computation of these values usually depends upon the number of data records available in the sample set.

Different Types of T-Tests

There are three types of t-tests, which fall into two categories: dependent and independent t-tests.

Correlated (or Paired) T-Test: The correlated t-test is performed when the samples typically consist of matched pairs of similar units, or when there are cases of repeated measures. For example, there may be instances of the same patients being tested repeatedly - before and after receiving a particular treatment. In such cases, each patient is being used as a control sample against themselves. This method also applies to cases where the samples are related in some manner or have matching characteristics, like a comparative analysis involving children, parents or siblings. Correlated or paired t-tests are of a dependent type, as these involve cases where the two sets of samples are related.

The formula for computing the t-value and degrees of freedom for a paired t-test is:

mean1 and mean2 are the average values of each of the sample sets, while var1 and var2 represent the variance of each of the sample sets.
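A standard textbook form of the paired statistic, written with the mean1 and mean2 names used above, is sketched below. The symbols s_diff (the sample standard deviation of the pairwise differences) and n (the number of pairs) are notation assumed here for illustration. The per-set variances var1 and var2 do not appear directly, because the paired test works on the differences within each pair.

```latex
t = \frac{\text{mean1} - \text{mean2}}{s_{\text{diff}} / \sqrt{n}},
\qquad
\text{degrees of freedom} = n - 1
```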

The remaining two types are independent t-tests. The samples of these types are selected independently of each other, that is, the data sets in the two groups don't refer to the same subjects. They include cases like a group of 100 patients being split into two sets of 50 patients each. One of the groups becomes the control group and is given a placebo, while the other group receives the prescribed treatment. This constitutes two independent sample groups which are unpaired with each other.

Equal Variance (or Pooled) T-Test: The equal variance t-test is used when the number of samples in each group is the same, or the variance of the two data sets is similar. The following formula is used for calculating the t-value and degrees of freedom for an equal variance t-test:
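In its standard textbook form, the pooled statistic can be written as follows, where n1 and n2 denote the number of records in each sample set and var1 and var2 are the sample variances. These symbol names are assumed here to match those used elsewhere in this article.

```latex
t = \frac{\text{mean1} - \text{mean2}}
         {\sqrt{\dfrac{(n_1 - 1)\,\text{var1} + (n_2 - 1)\,\text{var2}}{n_1 + n_2 - 2}
                \left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}},
\qquad
\text{degrees of freedom} = n_1 + n_2 - 2
```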

Unequal Variance T-Test: The unequal variance t-test is used when the number of samples in each group is different, and the variance of the two data sets is also different. This test is also called Welch's t-test. The following formula is used for calculating the t-value and degrees of freedom for an unequal variance t-test:
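The standard form of Welch's statistic, together with the usual approximation for its degrees of freedom, can be sketched as below. As before, n1 and n2 are the sample sizes and var1 and var2 the sample variances; the symbol names are assumptions chosen to match the rest of this article.

```latex
t = \frac{\text{mean1} - \text{mean2}}
         {\sqrt{\dfrac{\text{var1}}{n_1} + \dfrac{\text{var2}}{n_2}}},
\qquad
\text{degrees of freedom} =
\frac{\left(\dfrac{\text{var1}}{n_1} + \dfrac{\text{var2}}{n_2}\right)^{2}}
     {\dfrac{(\text{var1}/n_1)^{2}}{n_1 - 1} + \dfrac{(\text{var2}/n_2)^{2}}{n_2 - 1}}
```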

Determining the Correct T-Test to Use

The following flowchart can be used to determine which t-test should be used based on the characteristics of the sample sets. The key items to be considered include whether the sample records are similar, the number of data records in each sample set, and the variance of each sample set.
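The same decision can be sketched in code, following the criteria given in the test descriptions above: paired samples first, then sample sizes, then variances. The function name `choose_t_test` and its parameters are hypothetical, and deciding whether two variances count as "similar" is left to the caller.

```python
def choose_t_test(samples_are_paired, n1, n2, variances_similar):
    """Pick a t-test type from the characteristics of the two sample sets."""
    # Matched pairs or repeated measures call for the dependent test.
    if samples_are_paired:
        return "paired"
    # Independent samples: equal sizes or similar variances.
    if n1 == n2 or variances_similar:
        return "equal variance"
    # Independent samples with different sizes and different variances.
    return "unequal variance"


# Two independent sets of 10 and 20 records with clearly different variances.
print(choose_t_test(False, 10, 20, False))  # prints: unequal variance
```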

Example of Unequal Variance T-Test Calculation

Assume that we are taking diagonal measurements of paintings received in an art gallery. One group of samples includes 10 paintings, while the other includes 20 paintings. The data sets, with the corresponding mean and variance values, are as follows:

Though the mean of Set 2 is higher than that of Set 1, we cannot conclude that all paintings have an average length around 21.6 units, since the variance of Set 2 is significantly higher than that of Set 1. Is the difference in means due to chance, or do differences really exist in the overall population of all the paintings received in the art gallery? We establish the problem by assuming the null hypothesis that the mean is the same between the two sample sets and conduct a t-test to check whether the hypothesis can be rejected.

Since the number of data records is different (n1 = 10 and n2 = 20) and the variance is also different, the t-value and degrees of freedom are computed for the above data set using the formula mentioned in the Unequal Variance T-Test section.
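A minimal sketch of such a computation from raw data is shown below, using the standard Welch formulas and only the Python standard library. The function name `welch_t_test` and the two short lists are hypothetical and made up purely for illustration; they are not the gallery measurements.

```python
import math
import statistics


def welch_t_test(sample1, sample2):
    """Return (t_value, degrees_of_freedom) for an unequal variance t-test."""
    n1, n2 = len(sample1), len(sample2)
    mean1, mean2 = statistics.mean(sample1), statistics.mean(sample2)
    # statistics.variance is the sample variance (divides by n - 1).
    var1, var2 = statistics.variance(sample1), statistics.variance(sample2)
    a, b = var1 / n1, var2 / n2
    t_value = (mean1 - mean2) / math.sqrt(a + b)
    # Welch-Satterthwaite approximation; generally not a whole number.
    df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return t_value, df


# Hypothetical measurements for illustration only.
set_a = [19.0, 20.5, 18.5, 19.5, 20.0]
set_b = [24.0, 17.5, 26.5, 21.0, 15.0, 25.5, 22.0, 19.5]
t_value, df = welch_t_test(set_a, set_b)
print(round(t_value, 3), round(df, 2))
```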

The t-value is -2.24787. Since only the magnitude matters when comparing against the table value in a two-tailed test, the minus sign can be ignored, and the computed value is taken as 2.24787.

The degrees of freedom value is 24.38, which is rounded down to 24, the nearest lower integer, so that it can be looked up in the t-distribution table.

Whenever a normal distribution is assumed, one can specify a level of significance (also called the alpha level) as the criterion for rejecting the null hypothesis. In most cases, a 5% value is used.

Using a degrees of freedom value of 24 and a 5% level of significance, a look at the two-tailed t-distribution table gives a value of 2.064. Comparing this value against the computed value of 2.248 indicates that the calculated t-value is greater than the table value at a significance level of 5%. Therefore, the null hypothesis that there is no difference between the means can be rejected at the 5% level. The data suggest that the two sets differ intrinsically and that the difference is unlikely to be due to chance.
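The final comparison can be sketched in a few lines. All three numbers below are taken from the example above; the variable names are assumptions, and the critical value is copied from the table rather than computed.

```python
import math

t_value = -2.24787      # computed t-value from the example
raw_df = 24.38          # degrees of freedom before rounding
critical_value = 2.064  # two-tailed table value for df = 24 at 5%

# Round the degrees of freedom down to a whole number for the table lookup.
df = math.floor(raw_df)

# Two-tailed test: compare the magnitude of t against the table value.
reject_null = abs(t_value) > critical_value
print(df, reject_null)  # prints: 24 True
```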

The Bottom Line

The t-test is just one of many tests used for hypothesis testing. Statisticians use other tests to examine more variables or to work with larger sample sizes. For a large sample size, statisticians use a z-test. Other testing options include the chi-square test and the F-test.