What Is Stratified Random Sampling?
Stratified random sampling is a method of sampling that involves the division of a population into smaller sub-groups known as strata. In stratified random sampling, or stratification, the strata are formed based on members' shared attributes or characteristics such as income or educational attainment.
Stratified random sampling is also called proportional random sampling or quota random sampling.
Key Takeaways
- Stratified random sampling allows researchers to obtain a sample population that best represents the entire population being studied.
- Stratified random sampling involves dividing the entire population into homogeneous groups called strata.
- Stratified random sampling differs from simple random sampling, which involves the random selection of data from an entire population, so each possible sample is equally likely to occur.
How Stratified Random Sampling Works
When analyzing or researching a group of entities with similar characteristics, a researcher may find that the population is too large to study in full. To save time and money, an analyst may take a more feasible approach by selecting a small group from the population. This small group is referred to as a sample, a subset of the population that is used to represent the entire population. A sample may be selected from a population in a number of ways, one of which is stratified random sampling.
Stratified random sampling involves dividing the entire population into homogeneous groups called strata (the plural of stratum). Random samples are then selected from each stratum. For example, consider an academic researcher who would like to know the number of MBA students in 2007 who received a job offer within three months of graduation.
The researcher will soon find that there were almost 200,000 MBA graduates for the year. They might decide to just take a simple random sample of 50,000 graduates and run a survey. Better still, they could divide the population into strata and take a random sample from each stratum. To do this, they would create population groups based on gender, age range, race, nationality, and career background. A random sample is taken from each stratum in a number proportional to the stratum's size relative to the population. These subsets of the strata are then pooled to form a random sample.
Stratified sampling is used to highlight differences between groups in a population, as opposed to simple random sampling, which treats all members of a population as equal, with an equal likelihood of being sampled.
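The procedure described above (partition, draw proportionally from each stratum, pool) can be sketched in Python. This is a minimal illustration, not a definitive implementation: the function name `proportional_stratified_sample`, the `stratum_of` helper, and the toy population with groups "A", "B", and "C" are all hypothetical and do not come from the MBA example.

```python
import random
from collections import defaultdict


def proportional_stratified_sample(population, stratum_of, sample_size, seed=None):
    """Draw a proportionally allocated stratified sample (illustrative sketch).

    population  : list of records
    stratum_of  : function mapping a record to its stratum label (hypothetical)
    sample_size : total number of records to draw
    """
    rng = random.Random(seed)

    # Step 1: partition the population into non-overlapping strata.
    strata = defaultdict(list)
    for record in population:
        strata[stratum_of(record)].append(record)

    # Step 2: sample each stratum in proportion to its share of the population.
    sample = []
    for members in strata.values():
        k = round(sample_size * len(members) / len(population))
        sample.extend(rng.sample(members, k))  # simple random sample within the stratum

    # Step 3: the pooled draws form the stratified sample.
    return sample


# Hypothetical population: 600 records in group A, 300 in B, 100 in C.
population = [
    {"id": i, "group": "A" if i < 600 else "B" if i < 900 else "C"}
    for i in range(1000)
]
sample = proportional_stratified_sample(population, lambda r: r["group"], 100, seed=1)
```

Because each stratum's allocation is rounded separately, the pooled sample can differ from the requested size by a few records when the shares do not divide evenly.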
Example of Stratified Random Sampling
Suppose a research team wants to determine the GPA of college students across the U.S. The research team has difficulty collecting data from all 21 million college students; it decides to take a random sample of the population by using 4,000 students.
Now assume that the team looks at the different attributes of the sample participants and wonders whether GPAs differ by major. Suppose it finds that 560 students are English majors, 1,135 are science majors, 800 are computer science majors, 1,090 are engineering majors, and 415 are math majors. The team wants to use a proportional stratified random sample, in which each stratum's share of the sample matches its share of the population.
Assume the team researches the demographics of college students in the U.S. and finds the percentage of students in each major: 12% major in English, 28% major in science, 24% major in computer science, 21% major in engineering, and 15% major in mathematics. Thus, five strata are created from the stratified random sampling process.
The team then needs to confirm that each stratum's share of the sample matches its share of the population; however, it finds that the proportions are not equal. The team therefore needs to resample 4,000 students from the population, randomly selecting 480 English, 1,120 science, 960 computer science, 840 engineering, and 600 mathematics students.
With those, it has a proportionate stratified random sample of college students, which provides a better representation of students' college majors in the U.S. The researchers can then highlight a specific stratum, observe the various fields of study of U.S. college students, and observe the various grade point averages.
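The allocation in this example can be checked with a few lines of Python. The figures are the ones given above; the variable names are illustrative only.

```python
sample_size = 4_000

# Share of U.S. college students in each major, in percent (from the example).
major_share_pct = {
    "English": 12,
    "science": 28,
    "computer science": 24,
    "engineering": 21,
    "mathematics": 15,
}

# Counts observed in the team's first simple random sample.
first_sample = {
    "English": 560,
    "science": 1_135,
    "computer science": 800,
    "engineering": 1_090,
    "mathematics": 415,
}

# Proportional target for each stratum: sample size x population share.
target = {major: sample_size * pct // 100 for major, pct in major_share_pct.items()}

# Positive values mean the first sample over-represents that major.
gap = {major: first_sample[major] - target[major] for major in target}

for major in target:
    print(f"{major}: target {target[major]}, first sample {first_sample[major]}, gap {gap[major]:+d}")
```

The gaps sum to zero because both samples contain 4,000 students; in the first sample, engineering is the most over-represented major and mathematics the most under-represented.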
Simple Random Versus Stratified Random Samples
Simple random samples and stratified random samples are both statistical measurement tools. A simple random sample is used to represent the entire data population. A stratified random sample divides the population into smaller groups, or strata, based on shared characteristics.
The simple random sample is often used when there is very little information available about the data population, when the data population has far too many differences to divide into various subsets, or when there is only one distinct characteristic among the data population.
For instance, a candy company may want to study the buying habits of its customers in order to determine the future of its product line. If there are 10,000 customers, it may choose 100 of those customers as a random sample. It can then apply what it finds from those 100 customers to the rest of its base. Unlike stratification, it will sample 100 members purely at random without any regard for their individual characteristics.
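As a rough sketch, the candy company's simple random sample could be drawn as follows. The customer IDs and the fixed seed are assumptions made for illustration; a real customer list would replace the `range`.

```python
import random

rng = random.Random(0)  # fixed seed only so the sketch is repeatable

# Hypothetical customer IDs 1..10,000.
customer_ids = range(1, 10_001)

# Every customer has the same chance of selection; no characteristics are considered.
sampled_customers = rng.sample(customer_ids, 100)
```

`random.sample` draws without replacement, so no customer appears twice.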
Proportionate and Disproportionate Stratification
Stratified random sampling ensures that each subgroup of a given population is adequately represented within the whole sample population of a research study. Stratification can be proportionate or disproportionate. In a proportionate stratified method, the sample size of each stratum is proportionate to the population size of the stratum.
For example, if the researcher wanted a sample of 50,000 graduates using age range, the proportionate stratified random sample will be obtained using this formula: (sample size/population size) x stratum size. The table below assumes a population size of 180,000 MBA graduates per year.
| Age group | 24-28 | 29-33 | 34-37 | Total |
| --- | --- | --- | --- | --- |
| Number of people in stratum | 90,000 | 60,000 | 30,000 | 180,000 |
| Strata sample size | 25,000 | 16,667 | 8,333 | 50,000 |
The strata sample size for MBA graduates in the age range of 24 to 28 years old is calculated as (50,000/180,000) x 90,000 = 25,000. The same method is used for the other age groups. Now that the strata sample sizes are known, the researcher can perform simple random sampling within each stratum to select survey participants. In other words, 25,000 graduates will be selected randomly from the 24-28 age group, 16,667 graduates will be selected randomly from the 29-33 age group, and so on.
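The formula (sample size/population size) x stratum size can be applied to all three age groups at once. The sketch below uses the figures from the table; rounding to the nearest whole person is an assumption about how the table's values were obtained.

```python
sample_size = 50_000
stratum_sizes = {"24-28": 90_000, "29-33": 60_000, "34-37": 30_000}
population_size = sum(stratum_sizes.values())  # 180,000

# (sample size / population size) x stratum size, rounded to whole people.
stratum_samples = {
    age: round(sample_size * size / population_size)
    for age, size in stratum_sizes.items()
}

for age, n in stratum_samples.items():
    print(f"{age}: {n:,}")
```

Here the rounded values happen to add up to exactly 50,000; with other figures, rounding can leave the total off by one or two, and the researcher would need to adjust one stratum.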
In a disproportionate stratified sample, the sample size of each stratum is not proportional to the stratum's size in the population. For example, the researcher may decide to sample 1/2 of the graduates within the 34-37 age group and 1/3 of the graduates within the 29-33 age group.
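To see what these fractions imply, the sketch below applies them to the stratum sizes from the table. Only the two age groups with stated fractions are included; no fraction is given for the 24-28 group, so it is left out rather than guessed.

```python
from fractions import Fraction

stratum_sizes = {"29-33": 60_000, "34-37": 30_000}

# Sampling fractions chosen by the researcher, not derived from stratum size.
sampling_fractions = {"29-33": Fraction(1, 3), "34-37": Fraction(1, 2)}

stratum_samples = {
    age: int(stratum_sizes[age] * sampling_fractions[age])
    for age in stratum_sizes
}

# For comparison: the single fraction used in the proportionate design.
proportionate_fraction = Fraction(50_000, 180_000)  # reduces to 5/18
```

This gives 20,000 graduates from the 29-33 group and 15,000 from the 34-37 group, so the smaller stratum is sampled more heavily than the 5/18 rate used in the proportionate design.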
It is important to note that one person cannot fit into multiple strata. Each entity must only fit in one stratum. Having overlapping subgroups means that some individuals will have higher chances of being selected for the survey, which completely negates the concept of stratified sampling as a type of probability sampling.
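One way to catch this problem before sampling is to check that every member of the population appears in exactly one stratum. The helper below is a hypothetical sketch; the function name `check_strata` and the six-person toy population are made up for illustration.

```python
from collections import Counter


def check_strata(population_ids, strata):
    """Return (overlapping, unassigned) member IDs for a proposed stratification."""
    # Count how many strata each member was placed in.
    counts = Counter(member for members in strata.values() for member in members)

    overlapping = sorted(m for m, c in counts.items() if c > 1)  # in more than one stratum
    unassigned = sorted(set(population_ids) - set(counts))       # in no stratum at all
    return overlapping, unassigned


# Toy example: member 3 was placed in two age groups and member 6 in none.
population_ids = [1, 2, 3, 4, 5, 6]
strata = {"24-28": [1, 2, 3], "29-33": [3, 4], "34-37": [5]}

overlapping, unassigned = check_strata(population_ids, strata)
```

A valid stratification returns two empty lists; here the check flags member 3 as overlapping and member 6 as unassigned.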
Portfolio managers can use stratified random sampling to create portfolios by replicating an index such as a bond index.
Advantages of Stratified Random Sampling
The main advantage of stratified random sampling is that it captures key population characteristics in the sample. Similar to a weighted average, this method of sampling produces characteristics in the sample that are proportional to the overall population. Stratified random sampling works well for populations with a variety of attributes but is ineffective if subgroups cannot be formed.
Stratification gives a smaller error in estimation and greater precision than the simple random sampling method. The greater the differences between the strata, the greater the gain in precision.
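The precision gain can be illustrated with a small calculation. The sketch below compares the variance of the sample mean under simple random sampling with that under proportionally allocated stratified sampling. It rests on stated assumptions: the two strata and their values are hypothetical, and sampling is treated as with replacement, so finite-population corrections are ignored.

```python
from statistics import fmean, pvariance

# Hypothetical GPA-like values for two strata whose means differ widely.
strata = {
    "A": [2.0, 2.2, 2.4, 2.6],
    "B": [3.4, 3.6, 3.8, 4.0],
}
n = 4  # total sample size

population = [x for values in strata.values() for x in values]
N = len(population)

# Simple random sampling: Var(mean) = (population variance) / n.
var_srs = pvariance(population) / n

# Proportional stratified sampling: Var(mean) = sum(W_h * variance_h) / n,
# where W_h is stratum h's share of the population.
var_strat = sum(len(v) / N * pvariance(v) for v in strata.values()) / n

# The difference is the between-strata component: sum(W_h * (mean_h - mean)^2) / n.
overall_mean = fmean(population)
between = sum(len(v) / N * (fmean(v) - overall_mean) ** 2 for v in strata.values()) / n

print(f"simple random: {var_srs:.4f}, stratified: {var_strat:.4f}, gain: {between:.4f}")
```

With these made-up values, the stratified variance is roughly a tenth of the simple random sampling variance. The gain equals the between-strata term, which is why it grows as the stratum means move further apart.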
Disadvantages of Stratified Random Sampling
Unfortunately, this method of research cannot be used in every study. The method's disadvantage is that several conditions must be met for it to be used properly. Researchers must identify every member of a population being studied and classify each of them into one, and only one, subpopulation. As a result, stratified random sampling is disadvantageous when researchers can't confidently classify every member of the population into a subgroup. Also, finding an exhaustive and definitive list of an entire population can be challenging.
Overlapping can be an issue if there are subjects that fall into multiple subgroups. When simple random sampling is performed, those who are in multiple subgroups are more likely to be chosen. The result could be a misrepresentation or inaccurate reflection of the population.
Some classifications are easy: undergraduate, graduate, male, and female are clearly defined groups. In other situations, however, classification might be far more difficult. Imagine incorporating characteristics such as race, ethnicity, or religion. The sorting process becomes more difficult, making stratified random sampling a less effective method.