Technically, a representative sample requires only whatever percentage of the statistical population is necessary to replicate as closely as possible the quality or characteristic being studied or analyzed. For example, in a population of 1,000 that is made up of 600 men and 400 women used in an analysis of buying trends by gender, a representative sample can consist of a mere five members, three men and two women, or 0.5 percent of the population. However, while this sample is nominally representative of the larger population, it is likely to result in a high degree of sampling error when making inferences regarding the larger population because it is so small.
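The three-men, two-women split above follows from allocating the sample in proportion to each group's share of the population. The sketch below shows that arithmetic. The function name `proportional_allocation` and the largest-remainder rounding rule are assumptions made for illustration, not a method the text prescribes.

```python
from fractions import Fraction


def proportional_allocation(strata_sizes, sample_size):
    """Split sample_size across groups in proportion to their population sizes.

    Largest-remainder rounding keeps the parts summing to sample_size.
    """
    total = sum(strata_sizes.values())
    # Exact (possibly fractional) share of the sample owed to each group.
    exact = {k: Fraction(v * sample_size, total) for k, v in strata_sizes.items()}
    # Start from the whole-number part of each share.
    alloc = {k: int(q) for k, q in exact.items()}
    leftover = sample_size - sum(alloc.values())
    # Give any remaining slots to the groups with the largest fractional parts.
    by_remainder = sorted(exact, key=lambda k: exact[k] - alloc[k], reverse=True)
    for k in by_remainder[:leftover]:
        alloc[k] += 1
    return alloc


population = {"men": 600, "women": 400}
allocation = proportional_allocation(population, 5)
print(allocation)  # {'men': 3, 'women': 2}
```

The same call with a larger `sample_size` keeps the 60/40 balance while adding the observations needed to bring sampling error down.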
Sampling error is an unavoidable consequence of employing samples to analyze a larger group. Obtaining data from a sample is a process that is limited and incomplete by its very nature. But because sampling is so often necessary given the limited availability of resources, economic analysts employ methods that can reduce sampling error to statistically negligible levels. While representative sampling is one of the most effective methods used to reduce error, it is often not enough to do so sufficiently on its own.
One strategy used in combination with representative sampling is making sure that the sample is large enough to keep error acceptably low. In general, the larger the sample, the smaller the error. At a certain point, however, the reduction becomes so minimal that it does not justify the additional expense necessary to make the sample larger.
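The diminishing return from larger samples can be made concrete with the standard margin-of-error formula for a proportion under simple random sampling, z * sqrt(p(1 - p) / n). The sketch below assumes a 95 percent confidence level (z = 1.96) and the worst-case proportion p = 0.5. The function name `margin_of_error` is chosen for illustration, and the normal approximation behind the formula is rough for very small samples such as n = 5.

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """Approximate margin of error for a proportion from a simple random sample.

    p=0.5 is the worst case; z=1.96 corresponds to roughly 95% confidence.
    """
    return z * math.sqrt(p * (1 - p) / n)


# Each step multiplies the sample size (and roughly the cost) by ten.
for n in (5, 50, 500, 5000, 50000):
    print(f"n={n:>6}  margin of error: +/-{margin_of_error(n):.1%}")
```

Because the error shrinks with the square root of n, each tenfold increase in sample size cuts the margin of error by only a factor of about 3.2, so the gain per additional respondent keeps falling.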
Just as the use of a technically representative but tiny sample is not enough to reduce sampling error on its own, simply choosing a large group without taking representation into account may lead to even more flawed results than using the small representative sample. Returning to the example above, a group of 600 males is statistically useless on its own when analyzing gender differences in buying trends.
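A short calculation shows why size cannot compensate for a skewed sample. The purchase rates below (30 percent for men, 60 percent for women) are hypothetical figures invented purely for this illustration, and `expected_estimate` is an illustrative helper, not a standard function. It computes the overall purchase rate a sample of a given makeup would report on average.

```python
# Hypothetical purchase rates, invented purely for illustration.
RATE = {"men": 0.30, "women": 0.60}
POPULATION = {"men": 600, "women": 400}


def expected_estimate(sample_counts):
    """Average overall purchase rate reported by a sample with this makeup."""
    n = sum(sample_counts.values())
    return sum(RATE[group] * count for group, count in sample_counts.items()) / n


true_rate = expected_estimate(POPULATION)
small_representative = expected_estimate({"men": 3, "women": 2})
large_unrepresentative = expected_estimate({"men": 600, "women": 0})

print(f"population rate:            {true_rate:.2f}")
print(f"5-person representative:    {small_representative:.2f}")
print(f"600-person all-male sample: {large_unrepresentative:.2f}")
```

Under these assumed rates, the five-person sample is noisy but centered on the population figure, while the all-male sample is off by the same amount no matter how many men it includes, and it says nothing at all about how women's buying differs.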
Surprisingly, the sampling fraction has very little to do with the error of the results when random sampling is used, provided the sample is a small share of the population. The main determinant of error is the absolute sample size, not the sample size relative to the population size.
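This can be checked with the standard error of a proportion under simple random sampling without replacement, which multiplies sqrt(p(1 - p) / n) by the finite population correction sqrt((N - n) / (N - 1)). The sketch below holds the sample at 1,000 and varies the population size. The function name `standard_error` and the worst-case p = 0.5 are assumptions made for illustration.

```python
import math


def standard_error(n, population_size, p=0.5):
    """Standard error of a sample proportion, sampling without replacement."""
    srs = math.sqrt(p * (1 - p) / n)
    # Finite population correction: close to 1 when n is a small share of N.
    fpc = math.sqrt((population_size - n) / (population_size - 1))
    return srs * fpc


n = 1000
for population_size in (10_000, 1_000_000, 100_000_000):
    fraction = n / population_size
    se = standard_error(n, population_size)
    print(f"N={population_size:>11,}  sampling fraction={fraction:.5%}  SE={se:.5f}")
```

The sampling fraction varies by a factor of 10,000 across these three populations, yet the standard error barely moves. Doubling n, by contrast, lowers it by about 29 percent regardless of N.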