When the population variance (σ²) is known, the z-statistic can be used to calculate a reliability factor. Relative to the t-distribution, it results in tighter confidence intervals, because the z reliability factor is smaller than the t reliability factor at the same level of confidence. Z-values are based on the standard normal distribution.
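To establish a confidence interval when the population variance is known, the interval is constructed with this formula: sample mean ± Zα/2 * (σ/√n), where σ is the population standard deviation, n is the sample size and Zα/2 is the reliability factor.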
For an alpha of 5% (i.e. a 95% confidence interval), the reliability factor (Zα/2) is 1.96, but for a CFA exam problem, it is usually sufficient to round to an even 2 to solve the problem. (Remember that the z-value at 95% confidence is approximately 2, as tables for z-values are sometimes not provided!) Given a sample size of 16, a sample mean of 20 and a population standard deviation of 25, a 95% confidence interval would be 20 ± 2*(25/√16) = 20 ± 2*(25/4) = 20 ± 12.5. In short, for this sample size and for these sample statistics, we would be 95% confident that the actual population mean falls in a range from 7.5 to 32.5.
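The arithmetic above can be checked with a short Python sketch. The function name z_interval and its parameters are our own, chosen for illustration; the sketch assumes only the standard library and lets you pass the rounded exam value (z = 2) or fall back to the more precise 1.96.

```python
import math
from statistics import NormalDist

def z_interval(sample_mean, pop_sd, n, confidence=0.95, z=None):
    """Return (low, high) for a z-based confidence interval.

    If z is not supplied, the reliability factor is looked up from the
    standard normal distribution for the requested confidence level.
    """
    if z is None:
        alpha = 1.0 - confidence
        z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    margin = z * pop_sd / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin

# Exam shortcut from the text: round the 95% reliability factor to 2.
rounded = z_interval(20, 25, 16, z=2)
# Same inputs with the unrounded factor (about 1.96).
exact = z_interval(20, 25, 16)

print("rounded:", rounded)
print("exact:  ", exact)
```

With the unrounded factor the interval is slightly narrower than 7.5 to 32.5, which is why rounding to 2 is acceptable only as an exam shortcut.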
Suppose that this 7.5-to-32.5 range was deemed too broad for our purposes. Narrowing the confidence interval can be accomplished in two ways: (1) increasing the sample size, and (2) decreasing our required level of confidence.
1. Increasing sample size from 16 to 100 - Our 95% confidence interval is now equal to 20 ± 2*(25/√100) = 20 ± 2*(25/10) = 20 ± 5. In other words, increasing the sample size to 100 narrows the 95% confidence range: min 15 to max 25.
2. Using 90% confidence - Keeping the sample size at 100 and using the 90% reliability factor of 1.65, our interval is now equal to 20 ± 1.65*(25/√100) = 20 ± 1.65*(25/10) = 20 ± 4.125. In other words, decreasing the confidence level to 90% reduces the range: min 15.875 to max 24.125.
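To see both effects side by side, the following sketch prints the half-width of the interval (the amount added to and subtracted from the mean) for each combination of sample size and confidence level. The helper name margin_of_error is an illustrative assumption. It uses unrounded reliability factors, so the results differ slightly from the rounded figures of 2 and 1.65 used in the text.

```python
import math
from statistics import NormalDist

def margin_of_error(pop_sd, n, confidence):
    """Half-width of a two-sided z interval: z * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return z * pop_sd / math.sqrt(n)

# Population standard deviation of 25 and sample mean of 20, as in the text.
for n in (16, 100):
    for confidence in (0.95, 0.90):
        m = margin_of_error(25, n, confidence)
        print(f"n={n:3d}  confidence={confidence:.0%}  interval = 20 ± {m:.3f}")
```

Because the half-width is proportional to 1/√n, the sample size has to grow substantially to tighten the interval, while lowering the confidence level shrinks only the reliability factor.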
When population variance is unknown, we will need to use the t-distribution to establish confidence intervals. The t-statistic is more conservative; that is, it results in broader intervals. Assume the following sample statistics: sample size = 16, sample mean = 20, sample standard deviation = 25.
To use the t-distribution, we must first calculate degrees of freedom, which for sample size 16 is equal to n - 1 = 15. Using an alpha of 5% (95% confidence interval), the reliability factor for 15 degrees of freedom is 2.131, so our confidence interval is 20 ± 2.131*(25/√16) = 20 ± 13.32, which gives a range minimum of 6.68 and a range maximum of 33.32.
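A minimal sketch of the same calculation follows. Python's standard library has no t-distribution lookup, so the reliability factor of 2.131 is passed in directly, as it would be read from a t-table; the function name t_interval is our own.

```python
import math

def t_interval(sample_mean, sample_sd, n, t_crit):
    """Return (degrees of freedom, low, high) for a t-based interval.

    t_crit is the two-tailed reliability factor read from a t-table
    for n - 1 degrees of freedom.
    """
    df = n - 1
    margin = t_crit * sample_sd / math.sqrt(n)
    return df, sample_mean - margin, sample_mean + margin

# Values from the text: n = 16, mean = 20, sample standard deviation = 25,
# and a 95% reliability factor of 2.131 for 15 degrees of freedom.
df, low, high = t_interval(20, 25, 16, t_crit=2.131)
print(f"df={df}  low={low:.2f}  high={high:.2f}")
```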
As before, we can reduce this range with (1) larger samples and/or (2) a lower required level of confidence:
1. Increase sample size from 16 to 100 - The range is now equal to 20 ± 2*(25/10), giving a minimum of 15 and a maximum of 25 (for large sample sizes the t-distribution is sufficiently close to the standard normal distribution that the z-value becomes an acceptable alternative).
2. Reduce confidence from 95% to 90% - With the sample size still at 100, the range is now equal to 20 ± 1.65*(25/10), giving a minimum of 15.875 and a maximum of 24.125.
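How close is "sufficiently close"? The sketch below compares the half-width of the interval using t reliability factors for 99 degrees of freedom against the corresponding z factors. The t values (1.984 and 1.660) are not given in the text; they are assumed here from a standard t-table, and the function name margins_n100 is our own.

```python
import math
from statistics import NormalDist

# Two-tailed t reliability factors for 99 degrees of freedom, taken from a
# standard t-table (an assumption; these values do not appear in the text).
T_CRIT_DF99 = {0.95: 1.984, 0.90: 1.660}

def margins_n100(sample_sd, confidence):
    """Return (t-based, z-based) half-widths for a sample of size 100."""
    standard_error = sample_sd / math.sqrt(100)
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    t = T_CRIT_DF99[confidence]
    return t * standard_error, z * standard_error

for confidence in (0.95, 0.90):
    t_margin, z_margin = margins_n100(25, confidence)
    print(f"{confidence:.0%}: t gives 20 ± {t_margin:.3f}, z gives 20 ± {z_margin:.3f}")
```

The t-based interval is still slightly wider, but at this sample size the difference is small relative to the width of the interval.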
Large Sample Size
In our earlier discussion on the central limit theorem, we stated that for large samples the sampling distribution of the sample mean will tend to be normal even when the underlying population is non-normal. Moreover, at sufficiently large samples, where there are enough degrees of freedom, the z and t statistics will provide approximately the same reliability factor, so we can default to the standard normal distribution and the z-statistic. The structure for the confidence interval is similar to our previous examples.
For a 95% confidence interval, if sample size = 100, sample standard deviation = 10 and our point estimate is 15, the confidence interval is 15 ± 2*(10/√100) or 15 ± 2. We are 95% confident that the population mean will fall between 13 and 17.
Suppose we wanted to construct a 99% confidence interval. The reliability factor now becomes 2.58 and we have 15 ± 2.58*(10/√100) or 15 ± 2.58, or a minimum of 12.42 and a maximum of 17.58.
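The large-sample case can be sketched the same way, with the sample standard deviation standing in for the population value. The function name large_sample_interval is an illustrative assumption, and the reliability factors are unrounded (about 1.96 and 2.576), so the 95% result is marginally narrower than the 13-to-17 range obtained with the rounded factor of 2.

```python
import math
from statistics import NormalDist

def large_sample_interval(point_estimate, sample_sd, n, confidence):
    """z-based interval for a large sample with unknown population variance."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    standard_error = sample_sd / math.sqrt(n)
    margin = z * standard_error
    return point_estimate - margin, point_estimate + margin

# Values from the text: n = 100, sample standard deviation = 10, estimate = 15.
low95, high95 = large_sample_interval(15, 10, 100, 0.95)
low99, high99 = large_sample_interval(15, 10, 100, 0.99)
print(f"95%: {low95:.2f} to {high95:.2f}")
print(f"99%: {low99:.2f} to {high99:.2f}")
```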
The table below summarizes the statistics used in constructing confidence intervals, given various situations:
|Distribution|Population Variance|Sample Size|Appropriate Statistic|
|Normal|Unknown|Large|t or z|
|Non-Normal|Unknown|Large|t or z|
Exam Tips and Tricks
While these calculations don't seem difficult, this material can seem to run together at times, particularly if a CFA candidate has never used it or hasn't studied it in some time. While not likely to be a major point of emphasis, expect at least a few questions on confidence intervals and, in particular, a case study that will test basic knowledge of definitions, or that will compare and contrast the two statistics presented (t-distribution and z-value) to make sure you know which is useful in a given application. More than anything, the idea is to introduce confidence intervals and how they are constructed as a prerequisite for hypothesis testing.