What Is a Stratified Random Sample?

Match Sample Type to Research Purpose

Representative Focus Group
Nicole S. Young/E+/Getty Images

A stratified random sample is a type of probabilistic sampling procedure. The two basic parts of this type of sample are: 1) It is stratified, and 2) it is probabilistic. So what does this mean exactly and why is it important? A stratified random sample is also known as a proportional random sampling or a quota random sampling.

What Is a Stratified Random Sample? 

A sample is a mini-representation of a larger population.

Samples can be determined informally or formally. But samples that are systematically developed according to certain scientific methods are generally perceived as being more useful for making generalizations about the larger population.

What Does Stratified Mean?

Stratified samples consist of homogeneous sub-groups that are considered to be distinct in important ways. A collection of these homogeneous sub-groups is referred to as strata. This method of sampling procedures enables the population to be dividing into homogeneous subgroups from which simple random samples may be selected.

Why Is a Stratified Sample Useful?

The aim of stratified random sampling is to select participants from different subgroups who are believed to have relevance to the research that will be conducted. For instance, the results of a study could be influenced by the subjects’ attributes, such as their ages, gender, work experience level, racial and ethnic group, economic situation, level of education attained, and so forth.

A stratified sample is constructed so that these potentially influential characteristics are can be reasonably assumed to reflect the pattern of these characteristics in the overall population. In this way, the sample reflects the population from which it has been taken, but the sample cannot be said to be representative of the larger population.

Remember, the selection of members of a stratified sample is not a random process. That said, once the strata have been established, simple random sampling is used to select the members of the samples for each strata.

What Does Probabilistic Mean?

A stratified random sample is probabilistic because every method used to select the sample population provide a reasonably reliable way of estimating how representative the sample population is to the larger (universe) population from which the sample was selected. In other words, probabilistic sample permits a researcher to estimate the odds that the sample selected does or does not represent the larger population from which the sample was drawn.

Examples

Use stratified random sampling methods when there is interest in the differences between homogeneous subgroups and the larger sample population as a whole.

Let’s say that a population of business clients can be divided into three groups: Gen-Xers, Gen-Yers (Millennial), and Baby Boomers. Moreover, we have reason to believe that both the Gen-Xers and the Gen-Yers are relatively smaller minorities of the overall business clientele. Gen-Xers make up about 5 percent of the overall population of the clientele and Gen-Yers make up about 10 percent of the clientele.

A simple random sample of 100 members (n = 100) might generate 5 Gen-Xers and 10 Gen-Yers if we used a sampling fraction of 10 percent. It would be possible to get even fewer Gen-Xers and fewer Gen-Yers than that in the sample – just by chance. Stratification is likely to produce more representative outcomes. Say we want to have at least 25 people in each group. If we still take a sample of 100 (n = 100), then we can sample 25 Gen-Xers, 25 Gen-Yers, and 50 Baby Boomers.

We know that 10 percent of the population is Millennials or Gen-Yers (or about 100 of our clients. A random sample of 25 clients will give a within-stratum sampling fraction of 25/100 or 25 percent. We also know that 5 percent of the 50 clients who are not are Baby Boomers are Gen-Xers. This means that the within-stratum fraction will be 25/50 or 50 percent.

So, 50 Gen-Xers plus 100 Gen-Yers is a total of 150 of our client sample. Since the overall client population is 1000, we subtract the Gen-Xers plus the Gen-Yers (a total of 150 clients) which leaves 850 clients, who are Baby Boomers. The within-stratum sampling fraction for the Baby Boomers is 50/850 or about 5.88 percent.

Two things are evident: (1) The three groups are more homogeneous within-group than across the whole population. This means that there is less variance, which provides the opportunity for greater statistical precision. (2) And since the sample has been stratified, there will be enough members from each group to be able to make meaningful subgroup inferences.

Stratified sampling might be preferred over simple random sampling when it is important to represent the overall population and to represent the key subgroups of the population, especially when the subgroups are quite small but distinguished in important ways. By using stratified sampling methods, a researcher can effectively assure that subgroups can be differentiated in the discussion of the research findings.