What Is Stratified Random Sampling?
Stratified random sampling is a method of sampling that involves the division of a population into smaller subgroups known as strata. In stratified random sampling, or stratification, the strata are formed based on members’ shared attributes or characteristics, such as income or educational attainment. Stratified random sampling has numerous applications and benefits, such as studying population demographics and life expectancy.
Stratified random sampling is also called proportional random sampling or quota random sampling.
- Stratified random sampling allows researchers to obtain a sample population that best represents the entire population being studied.
- Sampling involves statistical inference made using a subset of a population.
- Stratified random sampling is done by dividing the entire population into homogeneous groups called strata.
- Proportional stratified random sampling involves taking random samples from stratified groups, in proportion to the population. In disproportionate sampling, the sample sizes of the strata are not proportional to the strata’s shares of the population.
- Stratified random sampling differs from simple random sampling, which involves the random selection of data from an entire population, so each possible sample is equally likely to occur.
How Stratified Random Sampling Works
When completing analysis or research on a group of entities with similar characteristics, a researcher may find that the population is too large to study in full. To save time and money, an analyst may take a more feasible approach by selecting a small group from the population. The small group is referred to as a sample, which is a subset of the population used to represent the entire population. A sample may be selected from a population in a number of ways, one of which is the stratified random sampling method.
Stratified random sampling involves dividing the entire population into homogeneous groups called strata (plural of stratum). Random samples are then selected from each stratum. For example, consider an academic researcher who would like to know the number of MBA students in 2021 who received a job offer within three months of graduation.
The researcher will soon find that there were almost 200,000 MBA graduates for the year. They might decide just to take a simple random sample of 50,000 graduates and run a survey. Better still, they could divide the population into strata and take a random sample from the strata. To do this, they would create population groups based on gender, age range, race, country of nationality, and career background. A random sample from each stratum is taken in a number proportional to the stratum’s size compared with the population. These subsets of the strata are then pooled to form a random sample.
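The three steps just described (divide into strata, sample each stratum in proportion to its size, pool the results) can be sketched in Python. This is a minimal illustration using only the standard library, not a definitive implementation: the function name `stratified_sample`, the `background` field, and the 1,000-record population are all hypothetical.

```python
import random
from collections import defaultdict


def stratified_sample(population, stratum_of, sample_size, seed=None):
    """Draw a proportional stratified random sample.

    population  : list of records
    stratum_of  : function mapping a record to its single stratum label
    sample_size : total number of records wanted across all strata
    """
    rng = random.Random(seed)

    # Step 1: divide the population into non-overlapping strata.
    strata = defaultdict(list)
    for record in population:
        strata[stratum_of(record)].append(record)

    # Step 2: sample each stratum in proportion to its share of the population.
    # Rounding can make the total differ slightly from sample_size.
    sample = []
    for members in strata.values():
        k = round(sample_size * len(members) / len(population))
        sample.extend(rng.sample(members, k))

    # Step 3: the pooled per-stratum samples form the stratified sample.
    return sample


# Hypothetical population: 500 finance, 300 technology, 200 other backgrounds.
graduates = (
    [{"id": i, "background": "finance"} for i in range(500)]
    + [{"id": i, "background": "technology"} for i in range(500, 800)]
    + [{"id": i, "background": "other"} for i in range(800, 1000)]
)
sample = stratified_sample(graduates, lambda g: g["background"], 100, seed=1)
```

With these made-up numbers, the sample holds 50 finance, 30 technology, and 20 other graduates, mirroring the 50/30/20 split of the population.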
Stratified sampling is used to highlight differences among groups in a population, as opposed to simple random sampling, which treats all members of a population as equal, with an equal likelihood of being sampled.
Example of Stratified Random Sampling
Suppose a research team wants to determine the grade point average (GPA) of college students across the United States. Because the team has difficulty collecting data from all 21 million college students, it decides to take a random sample of 4,000 students.
Now assume that the team looks at the different attributes of the sample participants and wonders if there are any differences in GPAs and students’ majors. Suppose it finds that 560 students are English majors, 1,135 are science majors, 800 are computer science majors, 1,090 are engineering majors, and 415 are math majors. The team wants to use a proportional stratified random sample, in which each stratum’s share of the sample matches its share of the population.
Assume the team researches the demographics of college students in the U.S. and finds the percentage of what students major in: 12% major in English, 28% major in science, 24% major in computer science, 21% major in engineering, and 15% major in mathematics. Thus, five strata are created from the stratified random sampling process.
The team then needs to confirm that each stratum’s share of the sample matches its share of the population; however, it finds that the proportions are not equal. The team therefore resamples 4,000 students from the population, randomly selecting 480 English, 1,120 science, 960 computer science, 840 engineering, and 600 mathematics students.
With those groups, it has a proportionate stratified random sample of college students, which provides a better representation of students’ college majors in the U.S. The researchers can then highlight specific strata, observe the varying types of studies of U.S. college students, and observe the various GPAs.
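The arithmetic behind these stratum sizes can be checked with a short Python sketch. The percentages and the first-sample counts come from the example above; the variable names are only illustrative.

```python
SAMPLE_SIZE = 4_000

# Share of each major among U.S. college students in the example (percent).
population_share = {
    "English": 12,
    "science": 28,
    "computer science": 24,
    "engineering": 21,
    "mathematics": 15,
}

# Counts by major found in the first simple random sample of 4,000 students.
observed = {
    "English": 560,
    "science": 1_135,
    "computer science": 800,
    "engineering": 1_090,
    "mathematics": 415,
}

# Proportional target for each stratum: sample size x population share.
target = {major: SAMPLE_SIZE * pct // 100 for major, pct in population_share.items()}

# Positive values mean the first sample over-represented that major.
difference = {major: observed[major] - target[major] for major in target}

for major in target:
    print(f"{major:17} target={target[major]:5} observed={observed[major]:5} "
          f"diff={difference[major]:+}")
```

The first sample over-represents engineering majors by 250 students and under-represents mathematics majors by 185, which is why the team resamples to the proportional targets.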
Simple vs. Stratified Random Samples
Simple random samples and stratified random samples are both statistical measurement tools. A simple random sample is used to represent the entire data population. A stratified random sample divides the population into smaller groups, or strata, based on shared characteristics. However, stratified sampling is more complicated, time-consuming, and potentially more expensive to carry out than simple random sampling.
The simple random sample is often used when there is very little information available about the data population, when the data population has far too many differences to divide into various subsets, or when there is only one distinct characteristic among the data population.
For instance, a candy company may want to study the buying habits of its customers to determine the future of its product line. If there are 10,000 customers, it may choose 100 of those customers as a random sample. It can then apply what it finds from those 100 customers to the rest of its base. Unlike stratification, it will sample 100 members purely at random without any regard for their individual characteristics.
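The candy company’s draw can be sketched in a few lines of Python. The customer IDs are hypothetical placeholders, and the seed is fixed only so that the draw is repeatable.

```python
import random

# Hypothetical customer IDs 1..10,000; a real study would use the customer list.
customer_ids = list(range(1, 10_001))

# Fixed seed so the same 100 customers are drawn on every run.
rng = random.Random(42)

# Simple random sample: every customer has the same chance of selection,
# and no customer characteristic is consulted.
srs = rng.sample(customer_ids, 100)
```

Because `random.Random.sample` draws without replacement, no customer appears twice in the sample.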
Proportionate and Disproportionate Stratification
Stratified random sampling ensures that each subgroup of a given population is adequately represented within the whole sample population of a research study. Stratification can be proportionate or disproportionate. In a proportionate stratified method, the sample size of each stratum is proportionate to the population size of the stratum. This type of stratified random sampling often yields more precise estimates because the sample better represents the overall population.
For example, if the researcher wanted a sample of 50,000 graduates using age range, the proportionate stratified random sample will be obtained using this formula: (sample size/population size) × stratum size. The table below assumes a population size of 180,000 MBA graduates per year.
|Age group|24–28|29–33|34–37|Total|
|---|---|---|---|---|
|Number of people in stratum|90,000|60,000|30,000|180,000|
|Strata sample size|25,000|16,667|8,333|50,000|
The strata sample size for MBA graduates in the age range of 24 to 28 years old is calculated as (50,000/180,000) × 90,000 = 25,000. The same method is used for the other age-range groups. Now that the strata sample sizes are known, the researcher can perform simple random sampling within each stratum to select survey participants. In other words, 25,000 graduates will be selected randomly from the 90,000 in the 24–28 age group, 16,667 graduates will be selected randomly from the 60,000 in the 29–33 age group, and so on.
In a disproportional stratified sample, the sample size of each stratum is not proportional to the stratum’s size in the population. The researcher may decide to sample half of the graduates within the 34–37 age group and one-third of the graduates within the 29–33 age group.
It is important to note that one person cannot fit into multiple strata. Each entity must fit in only one stratum. Having overlapping subgroups means that some individuals will have higher chances of being selected for the survey, which completely negates the concept of stratified sampling as a type of probability sampling.
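The formula (sample size/population size) × stratum size can be applied to every stratum at once. The Python sketch below uses the stratum sizes from this example; the function name `proportional_allocation` is hypothetical.

```python
def proportional_allocation(stratum_sizes, sample_size):
    """Return the sample size for each stratum under proportional allocation."""
    population_size = sum(stratum_sizes.values())
    return {
        label: round(sample_size * size / population_size)
        for label, size in stratum_sizes.items()
    }


# Stratum sizes for the 180,000 MBA graduates, keyed by age range.
mba_strata = {"24-28": 90_000, "29-33": 60_000, "34-37": 30_000}

allocation = proportional_allocation(mba_strata, 50_000)
print(allocation)
```

Rounding to whole people gives 25,000, 16,667, and 8,333, which sum to the 50,000 graduates the researcher wants. In other data sets, rounding can leave the total one or two off the target, so the sum is worth checking.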
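A disproportionate design can be sketched by giving each stratum its own sampling fraction. The fractions below are the ones named in this paragraph, and the stratum sizes are the 60,000 and 30,000 graduates from the MBA example. The design weights are an added assumption, reflecting the common practice of weighting each respondent by the inverse of the stratum’s sampling fraction when estimating population-wide figures.

```python
from fractions import Fraction

# Stratum sizes from the MBA example, keyed by age range.
stratum_sizes = {"29-33": 60_000, "34-37": 30_000}

# Sampling fractions chosen by the researcher (not proportional to stratum size).
sampling_fraction = {"29-33": Fraction(1, 3), "34-37": Fraction(1, 2)}

# Number of graduates to draw from each stratum.
stratum_sample = {
    label: int(size * sampling_fraction[label])
    for label, size in stratum_sizes.items()
}

# Assumed analysis step: each sampled person stands in for 1/fraction people.
design_weight = {label: 1 / f for label, f in sampling_fraction.items()}

print(stratum_sample)
print(design_weight)
```

Here the smaller 34–37 stratum is sampled at a higher rate than the 29–33 stratum, so each of its respondents represents 2 graduates rather than 3.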
Portfolio managers can use stratified random sampling to create portfolios by replicating an index such as a bond index.
Advantages of Stratified Random Sampling
The main advantage of stratified random sampling is that it captures key population characteristics in the sample. Similar to a weighted average, this method of sampling produces characteristics in the sample that are proportional to the overall population. Stratified random sampling works well for populations with a variety of attributes but is otherwise ineffective if subgroups cannot be formed.
Stratification gives a smaller error in estimation and greater precision than the simple random sampling method. The greater the differences among the strata, the greater the gain in precision.
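The link between differences among strata and the gain in precision can be illustrated with a small calculation. The sketch below splits a population’s variance into a within-strata part and a between-strata part. The two strata and their values are invented for illustration, and the comparison assumes proportional allocation and ignores the finite population correction.

```python
from statistics import mean, pvariance

# Hypothetical GPA-like values for two strata (invented numbers).
strata = {
    "A": [2.0, 2.2, 2.4, 2.6],
    "B": [3.4, 3.6, 3.8, 4.0],
}

everyone = [x for values in strata.values() for x in values]
overall_mean = mean(everyone)

# Each stratum's share of the population.
weight = {h: len(values) / len(everyone) for h, values in strata.items()}

# Variance that remains inside the strata, averaged by stratum share.
within = sum(weight[h] * pvariance(values) for h, values in strata.items())

# Variance explained by differences among the stratum means.
between = sum(
    weight[h] * (mean(values) - overall_mean) ** 2 for h, values in strata.items()
)

total = pvariance(everyone)

# For a sample of n, the variance of the sample mean is roughly total / n under
# simple random sampling and within / n under proportional stratification.
n = 4
approx_var_simple = total / n
approx_var_stratified = within / n
```

The total variance equals the within part plus the between part, so the further apart the stratum means are, the larger the share of variance that stratification removes from the estimate.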
Disadvantages of Stratified Random Sampling
Unfortunately, this method of research cannot be used in every study. The method’s disadvantage is that several conditions must be met for it to be used properly. Researchers must identify every member of a population being studied and classify each of them into one, and only one, subpopulation. As a result, stratified random sampling is disadvantageous when researchers can’t confidently classify every member of the population into a subgroup. Also, finding an exhaustive and definitive list of an entire population can be challenging.
Overlapping can be an issue if there are subjects that fall into multiple subgroups. When simple random sampling is performed, those who are in multiple subgroups are more likely to be chosen. The result could be a misrepresentation or inaccurate reflection of the population.
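Before sampling, a researcher can check that the strata satisfy both conditions: no member appears in more than one stratum, and every member of the population is classified. The Python sketch below is one way to run that check; the function names and the four-person population are hypothetical.

```python
from collections import Counter


def find_overlaps(strata):
    """Return the members that appear in more than one stratum."""
    counts = Counter(
        member for members in strata.values() for member in set(members)
    )
    return {member for member, count in counts.items() if count > 1}


def find_unclassified(population, strata):
    """Return the members of the population that belong to no stratum."""
    classified = set()
    for members in strata.values():
        classified.update(members)
    return set(population) - classified


# Hypothetical population and a flawed classification of it.
population = {"ana", "ben", "cy", "dee"}
strata = {
    "undergraduate": {"ana", "ben"},
    "graduate": {"ben", "cy"},
}

overlaps = find_overlaps(strata)                      # ben is in both strata
unclassified = find_unclassified(population, strata)  # dee is in neither
```

Both results should be empty before any per-stratum sampling begins. Here they are not, so the classification would need to be fixed first.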
The above examples make it easy: Undergraduate, graduate, male, and female are clearly defined groups. In other situations, however, it might be far more difficult. Imagine incorporating characteristics such as race, ethnicity, or religion. The sorting process becomes more difficult, rendering stratified random sampling an ineffective and less-than-ideal method.
When would you use stratified random sampling?
Stratified random sampling is often used when researchers want to know about different subgroups, or strata, within the population being studied; for instance, when one is interested in differences among groups based on race, gender, or education.
Which sampling method is best?
The best sampling method to use will depend on the nature of the analysis and the data being used. In general, simple random sampling is often the easiest and cheapest, but stratified sampling can produce a more accurate sample relative to the population under study.
What are the two types of stratified random sampling?
Proportionate sampling takes each stratum in the sample as proportionate to the population size of the stratum. In disproportionate sampling, the analyst will over- or under-sample certain strata based on the research question or study design that they are employing. For example, those interested in childhood education outcomes may over-sample school-age children and those in their early work lives while under-sampling younger and older strata.
How are strata chosen for stratified random sampling?
The strata will depend on which subgroups of your population you are interested in. These subgroups are based on characteristics that participants share, such as gender, race, educational attainment, geographic location, or age group.