Quantitative Methods - Hypothesis Testing

Hypothesis testing provides a basis for taking ideas or theories that someone initially develops about the economy, investing or markets, and then deciding whether these ideas are true or false. More precisely, hypothesis testing helps decide whether the tested ideas are probably true or probably false, because conclusions reached through the hypothesis-testing process are never made with 100% confidence. As we found in the sampling and estimating process, we have degrees of confidence (e.g. 95% or 99%) but not absolute certainty. Hypothesis testing is often associated with the scientific method, the procedure for acquiring and developing knowledge. As such, it relates the fields of investment and economic research (i.e., business topics) to other traditional branches of science (mathematics, physics, medicine, etc.).

Hypothesis testing is similar in some respects to the estimation processes presented in the previous section. Indeed, the field of statistical inference, where conclusions on a population are drawn from observing subsets of the larger group, is generally divided into two groups: estimation and hypothesis testing. With estimation, the focus was on determining (with a degree of confidence) the value of a parameter, or else a range within which the parameter most likely falls. Think of estimating as working from general to specific. With hypothesis testing, the focus is shifted: we start by making a statement about the parameter's value, and then the question becomes whether the statement is true or not true. In other words, it starts with a specific value and works the other way to make a general statement.

What is a Hypothesis?
A hypothesis is a statement made about a population parameter. These are typical hypotheses: "the mean annual return of this mutual fund is greater than 12%", and "the mean return is greater than the average return for the category". Stating the hypothesis is the initial step in a defined seven-step process for hypothesis testing - a process developed based on the scientific method. We indicate each step below. In the remainder of this section of the study guide, we develop a detailed explanation for how to answer each step's question.

Hypothesis testing seeks to answer seven questions:

  1. What are the null hypothesis and the alternative hypothesis?
  2. Which test statistic is appropriate, and what is the probability distribution?
  3. What is the required level of significance?
  4. What is the decision rule?
  5. Based on the sample data, what is the value of the test statistic?
  6. Do we reject or fail to reject the null hypothesis?
  7. Based on our rejection or inability to reject, what is our investment or economic decision?

Null Hypothesis
Step #1 in our process involves stating the null and alternative hypotheses. The null hypothesis is the statement that will be tested. The null hypothesis is usually denoted with "H0". For investment and economic research applications, and as it relates to the CFA exam, the null hypothesis will be a statement on the value of a population parameter, usually the mean value if a question relates to return, or the standard deviation if it relates to risk. It can also refer to the value of any random variable (e.g. sales at company XYZ are at least $10 million this quarter). In hypothesis testing, the null hypothesis is initially regarded to be true. Based on our process, we then either gather enough evidence to reject the null hypothesis, or we fail to reject it.

Alternative Hypothesis
The alternative hypothesis is a statement that will be accepted as a result of the null hypothesis being rejected. The alternative hypothesis is usually denoted "Ha". In hypothesis testing, we do not directly test the worthiness of the alternative hypothesis, as our testing focus is on the null. Think of the alternative hypothesis as the residual of the null - for example, if the null hypothesis states that sales at company XYZ are at least $10 million this quarter, the alternative hypothesis to this null is that sales will fail to reach the $10 million mark. Between the null and the alternative, it is necessary to account for all possible values of a parameter. In other words, if we gather evidence to reject this null hypothesis, then we must necessarily accept the alternative. If we fail to reject the null, then we do not accept the alternative.

One-Tailed Test
The labels "one-tailed" and "two-tailed" refer to the tails of the standard normal distribution (as well as all of the t-distributions). The key words for identifying a one-tailed test are "greater than" or "less than". For example, if our null hypothesis is that the annual return on this mutual fund will be greater than or equal to 8%, it's a one-tailed test in which the null will be rejected based only on finding observations in the left tail.

Figure 2.13 below illustrates a one-tailed test for a null hypothesis of "greater than" (rejection in left tail). (A one-tailed test for a null hypothesis of "less than" would look similar to the graph below, with the rejection region in the right tail rather than the left.)

Two-Tailed Test
A two-tailed test is characterized by the words "equal to" or "not equal to". For example, if our hypothesis were that the return on a mutual fund is equal to 8%, we could reject it based on observations in either tail (sufficiently higher than 8% or sufficiently lower than 8%).

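Choosing the null and the alternative hypothesis:
If θ (theta) is the actual value of a population parameter (e.g. mean or standard deviation), and θ0 (theta sub-zero) is the value of theta according to our hypothesis, the null and alternative hypothesis can be formed in three different ways:

  1. H0: θ = θ0 versus Ha: θ ≠ θ0 (two-tailed test)
  2. H0: θ ≤ θ0 versus Ha: θ > θ0 (one-tailed test, rejection in the right tail)
  3. H0: θ ≥ θ0 versus Ha: θ < θ0 (one-tailed test, rejection in the left tail)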

Choosing what will be the null and what will be the alternative depends on the case and what it is we wish to prove. We usually have two different approaches to what we could make the null and alternative, but in most cases, it's preferable to make the null what we believe we can reject, and then attempt to reject it. For example, in our case of a one-tailed test with the return hypothesized to be greater than 8%, we could make the greater-than case the null (with the alternative being less than), or we could make the greater-than case the alternative (with less than or equal to as the null). Which should we choose? A hypothesis test is typically designed to look for evidence that may possibly reject the null. So in this case, we would make the null hypothesis "the return is less than or equal to 8%", which means we are looking for observations in the right tail (sample returns sufficiently above 8%). If we reject the null, then we accept the alternative, and we conclude the fund is likely to return more than 8%.

Test Statistic
Step #2 in our seven-step process involves identifying an appropriate test statistic. In hypothesis testing, a test statistic is defined as a quantity taken from a sample that is used as the basis for testing the null hypothesis (rejecting or failing to reject the null).

The calculation of a test statistic varies based upon the case and our choice of probability distribution (for example, the t-distribution or the standard normal distribution). The general format of the calculation is:

Formula 2.36
Test statistic = [(sample statistic) - (value of parameter according to null)] / (standard error of sample statistic)

Type I and Type II Errors
Step #3 in hypothesis testing involves specifying the significance level of our hypothesis test. The significance level is similar in concept to the confidence level associated with estimating a parameter - both involve choosing the probability of making an error (denoted by α, or alpha), with lower alphas reducing the percentage probability of error. In the case of estimators, the tradeoff of reducing this error was to accept a wider (less precise) confidence interval. In the case of hypothesis testing, choosing lower alphas also involves a tradeoff - in this case, increasing a second type of error.

Errors in hypothesis testing come in two forms: Type I and Type II. A type I error is defined as rejecting the null hypothesis when it is true. A type II error is defined as not rejecting the null hypothesis when it is false. As the table below indicates, these errors represent two of the four possible outcomes of a hypothesis test:

Decision               H0 is true          H0 is false
Reject H0              Type I error        Correct decision
Fail to reject H0      Correct decision    Type II error

The reason for separating type I and type II errors is that their consequences differ from case to case. In some cases a type I error has the more serious consequences, while in others it is the type II error that most needs to be avoided, so it is important to understand which type matters more in a given test.

Significance Level
Denoted by α, or alpha, the significance level is the probability of making a type I error, or the probability that we will reject the null hypothesis when it is true. So if we choose a significance level of 0.05, it means there is a 5% chance of making a type I error. A 0.01 significance level means there is just a 1% chance of making a type I error. As a rule, a significance level is specified prior to calculating the test statistic. Otherwise, the analyst conducting the research might let the result of the test statistic calculation influence the choice of significance level (prompting a change to a higher or lower significance level). Such a change would take away from the objectivity of the test.

While any level of alpha is permissible, in practice there is likely to be one of three possibilities for significance level: 0.10 (semi-strong evidence for rejecting the null hypothesis), 0.05 (strong evidence), and 0.01 (very strong evidence). Why wouldn't we always opt for 0.01 or even lower probabilities of type I errors - isn't the idea to reduce and eliminate errors? In hypothesis testing, we have to control two types of errors, with a tradeoff that when one type is reduced, the other type is increased. In other words, by lowering the chances of a type I error, we must reject the null less frequently - including when it is false (a type II error). Actually quantifying this tradeoff is difficult because the probability of a type II error (denoted by β, or beta) is not easy to define (i.e. it changes for each value of θ). Only by increasing sample size can we reduce the probability of both types of errors.
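
The tradeoff described above can be sketched numerically if we are willing to assume a specific true value of the parameter, which is exactly the piece of information an analyst normally lacks. The Python sketch below is an illustration under stated assumptions, not part of the study guide's procedure: it assumes a one-tailed z-test of H0: mean ≤ 8% against Ha: mean > 8%, a known population standard deviation of 6%, and a hypothetical true mean of 10%. The function name `type_ii_probability` is likewise an assumption.

```python
import math
from statistics import NormalDist


def type_ii_probability(alpha, mu0, mu_true, sigma, n):
    """Beta for a right-tailed z-test of H0: mu <= mu0, given an ASSUMED true mean.

    We fail to reject when the z-statistic is below the rejection point.
    Under the assumed true mean, the z-statistic is normal with mean
    (mu_true - mu0) / standard error and standard deviation 1.
    """
    std_normal = NormalDist()
    rejection_point = std_normal.inv_cdf(1 - alpha)
    shift = (mu_true - mu0) / (sigma / math.sqrt(n))
    return std_normal.cdf(rejection_point - shift)


# Hypothetical inputs in percentage points.
beta_at_05 = type_ii_probability(alpha=0.05, mu0=8.0, mu_true=10.0, sigma=6.0, n=36)
beta_at_01 = type_ii_probability(alpha=0.01, mu0=8.0, mu_true=10.0, sigma=6.0, n=36)
beta_at_01_large_n = type_ii_probability(alpha=0.01, mu0=8.0, mu_true=10.0, sigma=6.0, n=100)

print(f"alpha=0.05, n=36:  beta={beta_at_05:.3f}")
print(f"alpha=0.01, n=36:  beta={beta_at_01:.3f}")
print(f"alpha=0.01, n=100: beta={beta_at_01_large_n:.3f}")
```

With these assumed inputs, moving alpha from 0.05 to 0.01 raises beta, while enlarging the sample from 36 to 100 lowers beta at the same alpha. Choosing a different assumed true mean would give different beta values, which is why beta is hard to pin down in practice.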

Decision Rule
Step #4 in the hypothesis-testing process requires stating a decision rule. This rule is crafted by comparing two values: (1) the calculated value of the test statistic, which we will complete in step #5, and (2) a rejection point, or critical value (or values), that is a function of our significance level and the probability distribution being used in the test. If the calculated value of the test statistic is as extreme as (or more extreme than) the rejection point, then we reject the null hypothesis, and state that the result is statistically significant. Otherwise, if the test statistic does not reach the rejection point, then we cannot reject the null hypothesis and we state that the result is not statistically significant. A rejection point depends on the probability distribution, on the chosen alpha, and on whether the test is one-tailed or two-tailed.

For example, if in our case we are able to use the standard normal distribution (the z-value), if we choose an alpha of 0.05, and we have a two-tailed test (i.e. reject the null hypothesis when the test statistic is either above or below), the two rejection points are taken from the z-values for standard normal distributions: below -1.96 and above +1.96. Thus if the calculated test statistic is in these two rejection ranges, the decision would be to reject the null hypothesis. Otherwise, we fail to reject the null hypothesis.
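
A decision rule of this kind can be sketched in a few lines of Python using the standard library's `statistics.NormalDist`. This is a minimal sketch that assumes the standard normal distribution applies; the function names, the `tail` labels and the sample test statistics (2.5 and 0.5) are hypothetical.

```python
from statistics import NormalDist


def rejection_points(alpha, tail="two"):
    """Return (lower, upper) z rejection points; None means that tail is unused."""
    std_normal = NormalDist()
    if tail == "two":
        upper = std_normal.inv_cdf(1 - alpha / 2)  # alpha is split across both tails
        return (-upper, upper)
    if tail == "right":
        return (None, std_normal.inv_cdf(1 - alpha))
    if tail == "left":
        return (std_normal.inv_cdf(alpha), None)
    raise ValueError("tail must be 'two', 'left' or 'right'")


def reject_null(test_stat, alpha, tail="two"):
    """True when the statistic is as extreme as, or more extreme than, a rejection point."""
    lower, upper = rejection_points(alpha, tail)
    in_left = lower is not None and test_stat <= lower
    in_right = upper is not None and test_stat >= upper
    return in_left or in_right


lower, upper = rejection_points(0.05, tail="two")
print(round(lower, 2), round(upper, 2))   # -1.96 1.96
print(reject_null(2.5, 0.05))             # True: beyond +1.96
print(reject_null(0.5, 0.05))             # False: inside the two rejection points
```

Note that a one-tailed test at the same alpha puts the whole 5% in a single tail, so its rejection point (about 1.645 in absolute value) is closer to zero than the two-tailed 1.96.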

Look Out!

Traditionally, it was said that we accepted the null hypothesis; however, the authors have discouraged use of the word "accept", in terms of accepting the null hypothesis, as that term implies a greater degree of conviction about the null than is warranted. Having made the effort to make this distinction, do not be surprised if this subtle change (which seems inconsequential on the surface) somehow finds its way onto the CFA exam (if you answer "accept the null hypothesis", you get the question wrong, and if you answer "fail to reject the null hypothesis", you score points).

Power of a Test
The power of a hypothesis test refers to the probability of correctly rejecting the null hypothesis, that is, of rejecting it when it is false. There are two possible outcomes when the null hypothesis is false: either we (1) reject it (as we correctly should) or (2) we fail to reject it - and make a type II error. Thus the power of a test is equivalent to 1 minus beta (β), the probability of a type II error. Since beta usually isn't quantified, neither is the power of a test. For hypothesis tests, it is sufficient to specify the significance level, or alpha. However, given a choice between more than one test statistic (for example, z-test, t-test), we will always choose the test with the greater power, all other factors equal.

Confidence Intervals vs. Hypothesis Tests
Confidence intervals, as a basis for estimating population parameters, were constructed as a function of "number of standard deviations away from the mean". For example, for 95% confidence that our interval will include the population mean (μ), when we use the standard normal distribution (z-statistic), the interval is: (sample mean) ± 1.96 * (standard error), or, equivalently, -1.96 * (standard error) < (sample mean) - μ < +1.96 * (standard error).

Hypothesis tests, as a basis for testing the value of population parameters, are also set up to reject or not reject based on "number of standard deviations away from the mean". The basic structure for not rejecting the null hypothesis in a two-tailed test at the 5% significance level, again using the standard normal, is -1.96 < [(sample mean - hypothesized population mean) / standard error] < +1.96, or, equivalently, -1.96 * (std. error) < (sample mean) - (hypo. pop. mean) < +1.96 * (std. error).

In hypothesis testing, we essentially create an interval within which the null will not be rejected, and we are 95% confident in this interval (i.e. there's a 5% chance of a type I error). By slightly rearranging terms, the structure for a confidence interval and the structure for rejecting/not rejecting a null hypothesis appear very similar - an indication of the relationship between the concepts.
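
The relationship can be checked numerically. The sketch below, which assumes the standard normal distribution and uses hypothetical figures (a sample mean of 10 and a standard error of 1, in percentage points), builds a 95% confidence interval and then tests several hypothesized means at the 5% level in a two-tailed test. The function names are assumptions made for illustration.

```python
from statistics import NormalDist


def confidence_interval(sample_mean, standard_error, confidence=0.95):
    """Two-sided z-based interval: sample mean +/- z * standard error."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return (sample_mean - z * standard_error, sample_mean + z * standard_error)


def fails_to_reject(sample_mean, null_mean, standard_error, alpha=0.05):
    """Two-tailed z-test: True when the test statistic lies between the rejection points."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    test_stat = (sample_mean - null_mean) / standard_error
    return -z < test_stat < z


sample_mean, standard_error = 10.0, 1.0  # hypothetical values
low, high = confidence_interval(sample_mean, standard_error)
print(f"95% interval: {low:.2f} to {high:.2f}")

for null_mean in (7.0, 8.5, 10.0, 11.9, 12.5):
    inside = low < null_mean < high
    kept = fails_to_reject(sample_mean, null_mean, standard_error)
    print(null_mean, "inside interval:", inside, "| fail to reject:", kept)
```

For each hypothesized mean, the two columns agree: a value inside the 95% interval is one we fail to reject at the 5% level, and a value outside it is one we reject.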

Making a Statistical Decision
Step #6 in hypothesis testing involves making the statistical decision, which actually compares the test statistic to the value computed as the rejection point; that is, it carries out the decision rule created in step #4. For example, with a significance level of 0.05, using the standard normal distribution, on a two-tailed test (i.e. null is "equal to"; alternative is "not equal to"), we have rejection points below -1.96 and above +1.96. If our calculated test statistic [(sample mean - hypothesized mean) / standard error] = 0.6, then we cannot reject the null hypothesis. If the calculated value is 3.6, we reject the null hypothesis and accept the alternative.

The final step, or step #7, involves making the investment or economic decision (i.e. the real-world decision). In this context, the statistical decision is but one of many considerations. For example, take a case where we created a hypothesis test to determine whether a mutual fund outperformed its peers in a statistically significant manner. For this test, the null hypothesis was that the fund's mean annual return was less than or equal to a category average; the alternative was that it was greater than the average. Assume that at a significance level of 0.05, we were able to establish statistical significance and reject the null hypothesis, thus accepting the alternative. In other words, our statistical decision was that this fund would outperform peers, but what is the investment decision? The investment decision would likely take into account (for example) the risk tolerance of the client and the volatility (risk) measures of the fund, and it would assess whether transaction costs and tax implications make the investment decision worth making. In other words, rejecting/not rejecting a null hypothesis does not automatically require that a decision be carried out; thus there is the need to assess the statistical decision and the economic or investment decision in two separate steps.
