What is a Quartile
A quartile is a statistical term describing a division of observations into four defined intervals based upon the values of the data and how they compare to the entire set of observations.
Try not to confuse a quarter with a quartile.
BREAKING DOWN Quartile
To understand the quartile, it is important to understand the median as a measure of central tendency. The median in statistics is the middle value of a set of numbers arranged in order. It is the point at which half of the data lies below and half lies above. So, given a sorted set of 13 numbers, the median would be the seventh number. The six numbers preceding this value are the lowest numbers in the data, and the six numbers after the median are the highest numbers in the data set. Because the median is not affected by extreme values or outliers in the distribution, it is sometimes preferred to the mean.
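As a minimal sketch of the idea above, the snippet below finds the median of 13 numbers and shows that an extreme value moves the mean but not the median. The sample values are made up for illustration and do not come from any real data set.

```python
from statistics import mean, median

# Hypothetical sample of 13 values (illustrative only), deliberately unsorted.
values = [12, 3, 7, 25, 9, 14, 1, 30, 18, 5, 21, 10, 16]

ordered = sorted(values)
middle_index = len(ordered) // 2        # 13 // 2 == 6, i.e. the seventh value
manual_median = ordered[middle_index]

print(manual_median)                    # 12
print(median(values))                   # 12, the library agrees

# Replace the largest value with an extreme outlier.
with_outlier = ordered[:-1] + [3000]
print(median(with_outlier))             # still 12
print(mean(with_outlier) > mean(ordered))  # True: the mean is pulled upward
```

Six values sit on each side of the seventh value, which is why it is the median regardless of how large the largest value becomes.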
While the median is a robust estimator of location, it says nothing about how the data on either side of its value is spread or dispersed. Quartiles measure the spread of values above and below the median by dividing the distribution into four groups. Just as the median divides the data in half so that 50% of the measurements lie below the median and 50% lie above it, the quartiles break the data into quarters so that 25% of the measurements are less than the lower quartile, 50% are less than the median, and 75% are less than the upper quartile.
Quartiles divide the data at three points (a lower quartile, the median, and an upper quartile) to form four groups of the data set. The lower quartile or first quartile is denoted as Q1, and is the middle number that falls between the smallest value of the data set and the median. The second quartile, Q2, is also the median. The upper or third quartile, denoted as Q3, is the central point that lies between the median and the highest number of the distribution. Now, we can map out the four groups formed from the quartiles. The first group of values contains the smallest number up to Q1; the second group includes Q1 to the median; the third group is the median to Q3; and the fourth group comprises Q3 to the highest data point of the entire set.
Each of the four groups contains 25% of the total observations. Generally, the data is arranged from smallest to largest, with the lowest 25% of observations allocated to the 1st quartile, observations above 25% and up to 50% allocated to the 2nd quartile, observations above 50% and up to 75% allocated to the 3rd quartile, and the remaining observations allocated to the 4th quartile.
Let’s work with an example. Suppose the distribution of math scores in a class of 19 students, in ascending order, is:
59, 60, 65, 65, 68, 69, 70, 72, 75, 75, 76, 77, 81, 82, 84, 87, 90, 95, 98
First, mark down the median, Q2, which in this case is the tenth value – 75.
Q1 is the central point between the smallest score and the median. In this case, Q1 is the middle of the nine scores below the median, which is the fifth score: 68. [Note that the median can also be included when calculating Q1 or Q3 for a data set with an odd number of values. If we were to include the median on either side of the middle point, then Q1 would be the middle value between the first and tenth score, which is the average of the fifth and sixth score: (fifth + sixth)/2 = (68 + 69)/2 = 68.5.]
Q3 is the middle value between Q2 and the highest score, which is the fifteenth score: 84. [Or, if you include the median, Q3 is the average of the fourteenth and fifteenth score: (82 + 84)/2 = 83.]
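The calculation above can be sketched in a few lines of Python. The helper name `quartiles` and its `include_median` flag are hypothetical, introduced here only to show both conventions (median excluded from, or included in, each half) side by side.

```python
from statistics import median

# Math scores from the example, already in ascending order.
scores = [59, 60, 65, 65, 68, 69, 70, 72, 75, 75,
          76, 77, 81, 82, 84, 87, 90, 95, 98]

def quartiles(sorted_values, include_median=False):
    """Return (Q1, Q2, Q3) as medians of the lower and upper halves.

    include_median only has an effect when the number of values is odd.
    """
    n = len(sorted_values)
    half = n // 2
    q2 = median(sorted_values)
    if n % 2 == 1 and include_median:
        lower = sorted_values[:half + 1]   # smallest value up to the median
        upper = sorted_values[half:]       # median up to the largest value
    else:
        lower = sorted_values[:half]       # values strictly below the middle
        upper = sorted_values[n - half:]   # values strictly above the middle
    return median(lower), q2, median(upper)

print(quartiles(scores))                       # (68, 75, 84)
print(quartiles(scores, include_median=True))  # (68.5, 75, 83.0)
```

Both conventions give the same Q2; they differ only in whether the tenth score is counted in each half.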
Now that we have our quartiles, let’s interpret their numbers. A score of 68 (Q1) represents the first quartile and is the 25th percentile. 68 is the median of the lower half of the score set, i.e., the median of the first nine scores, from 59 to 75. Q1 tells us that about 25% of the scores are less than 68 and about 75% of the class scores are greater. Q2 (the median) is the 50th percentile and shows that about 50% of the scores are less than 75, and about 50% of the scores are above 75. Finally, Q3, the 75th percentile, reveals that about 25% of the scores are greater and about 75% are less than 84.
If Q1 is farther away from the median than Q3 is from the median, then we can say that there is a greater dispersion among the smaller values of the data set than among the larger values. The same logic applies in reverse: if Q3 is farther away from the median than Q1 is, the larger values are more dispersed than the smaller ones.
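To make that comparison concrete, the short sketch below measures the distance from each outer quartile to the median, using the quartile values worked out for the math scores (with the median excluded from each half). The variable names are illustrative.

```python
# Quartiles from the math-score example (median excluded from each half).
q1, q2, q3 = 68, 75, 84

lower_gap = q2 - q1   # distance from Q1 up to the median
upper_gap = q3 - q2   # distance from the median up to Q3

if lower_gap > upper_gap:
    verdict = "smaller values are more dispersed"
elif upper_gap > lower_gap:
    verdict = "larger values are more dispersed"
else:
    verdict = "both sides are equally dispersed"

print(lower_gap, upper_gap, verdict)  # 7 9 larger values are more dispersed
```

For this class, Q3 sits slightly farther from the median than Q1 does, so the higher scores are a little more spread out than the lower ones.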
If there is an even number of data points, the median will be the average of the middle two numbers. In our example above, if we had 20 students instead of 19, the median of their scores would be the arithmetic average of the tenth and eleventh scores.
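A quick sketch of the even-count case follows. The twentieth score of 100 is a made-up value added purely for illustration; the example itself has only 19 students.

```python
from statistics import median

scores = [59, 60, 65, 65, 68, 69, 70, 72, 75, 75,
          76, 77, 81, 82, 84, 87, 90, 95, 98]

# Hypothetical twentieth student (assumed score, not from the example).
scores_20 = sorted(scores + [100])

tenth = scores_20[9]       # index 9 is the tenth score
eleventh = scores_20[10]   # index 10 is the eleventh score
manual_median = (tenth + eleventh) / 2

print(tenth, eleventh, manual_median)  # 75 76 75.5
print(median(scores_20))               # 75.5
```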
Quartiles are used to calculate the interquartile range, which is a measure of variability around the median. The interquartile range is simply calculated as the difference between the third and first quartile: Q3 – Q1. In effect, it is the range of the middle half of the data and shows how spread out that middle half is.
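As a small illustration, the interquartile range for the math scores can be computed as follows, assuming the convention in which the median is left out of both halves. The full range is printed alongside it for comparison.

```python
from statistics import median

scores = [59, 60, 65, 65, 68, 69, 70, 72, 75, 75,
          76, 77, 81, 82, 84, 87, 90, 95, 98]

half = len(scores) // 2          # 9 scores on each side of the median
q1 = median(scores[:half])       # median of the nine lowest scores
q3 = median(scores[half + 1:])   # median of the nine highest scores

iqr = q3 - q1
full_range = scores[-1] - scores[0]

print(q1, q3, iqr)   # 68 84 16
print(full_range)    # 39
```

The middle half of the class spans 16 points, while the whole class spans 39 points.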
For large data sets, Microsoft Excel can be used to calculate quartiles by using the QUARTILE function.
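Outside of a spreadsheet, a comparable calculation can be sketched with Python's standard library, which provides `statistics.quantiles` (Python 3.8 or later). This is offered as an analogue, not as a statement about how Excel's QUARTILE arrives at its result. The function interpolates between data points, and its two methods happen to reproduce both sets of quartiles from the math-score example; that agreement should not be assumed for other data sets.

```python
from statistics import quantiles

scores = [59, 60, 65, 65, 68, 69, 70, 72, 75, 75,
          76, 77, 81, 82, 84, 87, 90, 95, 98]

# n=4 asks for the three cut points that split the data into four groups.
exclusive = quantiles(scores, n=4, method="exclusive")
inclusive = quantiles(scores, n=4, method="inclusive")

print(exclusive)  # [68.0, 75.0, 84.0]
print(inclusive)  # [68.5, 75.0, 83.0]
```

Because different tools use different interpolation rules, it is worth checking which rule is in effect before comparing quartiles computed in different programs.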