Null Hypothesis
From SkepticWiki
Definition
In statistics, the null hypothesis is the statement that a treatment or experimental condition has no effect, so that any difference observed is due to chance alone. It contrasts with the experimental hypothesis (also called the alternative hypothesis), which states that a real, measurable effect is occurring.
Hypothesis testing asks whether the evidence is compatible with the null hypothesis. If results like those observed would be unlikely were the null hypothesis true, those performing the test can reject the null hypothesis in favor of the experimental hypothesis.
Examples
Example 1: A proposed new vitamin is offered as a treatment to reduce the number and severity of colds. To test the effectiveness of this treatment, you develop the following hypotheses:
- Null hypothesis (often written H0): People taking this vitamin will get exactly as many colds as people who don't take this vitamin.
- Experimental hypothesis (often written H1): People taking this vitamin will get fewer colds than people who don't take this vitamin.
The test is to administer the vitamin to an experimental group while giving a control group nothing (or, more properly, a placebo), expose both groups to cold viruses, and count the number of colds caught by each group. If the experimental group catches as many colds as the control group, or more, the test provides no evidence that the vitamin reduces the number of colds.
You can apply this method in many ways.
Example 2: A psychic claims to be able to predict whether a coin flip will come up heads or tails. To test the claim, we phrase the hypotheses as follows:
- H0: The psychic will have exactly as many hits as misses.
- H1: The psychic will have more hits than misses.
The test is to flip the coin an agreed-upon number of times, ideally a large number, and record each prediction as a hit or a miss. If the psychic scores no more hits than misses, the results give no support to the claim.
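One way to see what the null hypothesis in Example 2 predicts is to simulate a psychic who is merely guessing. The sketch below is only an illustration: the function name `simulated_p_value`, the number of simulated runs, the fixed random seed, and the score of 60 hits in 100 flips are all assumptions chosen for the example, not part of the original test.

```python
import random

def simulated_p_value(observed_hits, flips, n_runs=10_000, seed=0):
    """Estimate how often pure guessing scores at least `observed_hits`.

    Under H0 every prediction is right with probability 1/2, so each
    simulated run is just `flips` independent 50/50 outcomes.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    at_least_as_good = 0
    for _ in range(n_runs):
        hits = sum(rng.random() < 0.5 for _ in range(flips))
        if hits >= observed_hits:
            at_least_as_good += 1
    return at_least_as_good / n_runs

# Hypothetical outcome: the psychic scores 60 hits in 100 flips.
print(simulated_p_value(60, 100))
```

The returned fraction estimates how often chance alone would do at least as well as the psychic did; a small fraction counts against the null hypothesis.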
Example 3: A researcher claims that women remember the names of other people better than men do.
- H0: A group of women in a name-recall test will achieve the same average score as a group of men.
- H1: A group of women in a name-recall test will achieve a higher average score than a group of men.
This test is also simple to administer and to score.
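A comparison of two group averages, as in Example 3, can be sketched with a permutation test: if H0 is true, the group labels are interchangeable, so shuffling them shows how large a difference chance alone tends to produce. This is one possible technique among several, not a method prescribed by the article. The scores below are invented purely for illustration, and the function name `permutation_p_value` is hypothetical.

```python
import random
from statistics import mean

def permutation_p_value(women, men, n_permutations=5_000, seed=0):
    """One-sided permutation test for mean(women) > mean(men)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = mean(women) - mean(men)
    pooled = list(women) + list(men)
    n_women = len(women)
    count = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign the group labels at random
        diff = mean(pooled[:n_women]) - mean(pooled[n_women:])
        # small tolerance guards against floating-point rounding
        if diff >= observed - 1e-12:
            count += 1
    return count / n_permutations

# Made-up name-recall scores (not real data).
women_scores = [14, 12, 15, 11, 13, 16, 12, 14]
men_scores = [11, 13, 10, 12, 9, 14, 11, 10]
print(permutation_p_value(women_scores, men_scores))
```

The result is the fraction of random relabelings that favor the women's group at least as strongly as the actual data did.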
Discussion
Interpreting test results can be difficult because any statistical measurement involves uncertainty. For example, a test of 101 coin flips cannot yield equal numbers of hits and misses, because 101 is odd. The closest possible result to an even split is 50 hits and 51 misses, or vice versa.
Statisticians usually evaluate hypotheses in terms of a p-value: the probability, assuming the null hypothesis is true, of obtaining a result at least as extreme as the one observed. By that measure, a result of 51 hits and 50 misses, although technically in favor of the experimental hypothesis in Example 2, falls well within what would be expected if the null hypothesis were true.
On the other hand, a result of 70 hits and 31 misses would be very unlikely if the null hypothesis were true. You would therefore reject the null hypothesis if the p-value of the experimental results were smaller than the alpha cutoff, a significance level chosen before the test (0.05 is a common choice).
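The two outcomes discussed above, 51 hits and 70 hits out of 101 flips, can be checked with an exact binomial calculation. The following is a minimal sketch under stated assumptions: the test is one-sided (only high scores count as extreme), each guess is right with probability 1/2 under H0, and the cutoff `ALPHA = 0.05` is a conventional value picked here for illustration. The function name `one_sided_p_value` is hypothetical.

```python
from math import comb

def one_sided_p_value(hits, flips):
    """P(at least `hits` correct out of `flips`) when every guess is 50/50."""
    # Count the equally likely hit/miss sequences with `hits` or more hits.
    favourable = sum(comb(flips, k) for k in range(hits, flips + 1))
    return favourable / 2**flips

ALPHA = 0.05  # assumed significance level, fixed before looking at the data

for hits in (51, 70):
    p = one_sided_p_value(hits, 101)
    verdict = "reject H0" if p < ALPHA else "fail to reject H0"
    print(f"{hits} hits of 101: p = {p:.3g} -> {verdict}")
```

By symmetry, 51 or more hits out of 101 has a probability of exactly one half under H0, so that result is unremarkable, while 70 or more hits has a p-value far below the assumed cutoff.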
It is important to remember that this framework can only reject the null hypothesis; it cannot disprove the experimental hypothesis. The vitamin in Example 1 might have an effect too small to detect in the sample available. No amount of statistical testing can conclusively demonstrate that there is no effect at all, a limitation often summarized as "You Can't Prove a Negative". What testing can establish is that, if there is an effect, it must be smaller than some stated size and may therefore be unimportant.
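The idea that an effect, if present, "must be smaller than such-and-such" can be made concrete with an upper confidence bound. The sketch below applies this to the psychic of Example 2, using an exact (Clopper-Pearson style) one-sided bound found by bisection. It is one possible approach, not the article's own method; the 95% confidence level, the tolerance, and the function names are assumptions made for illustration.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def upper_bound_hit_rate(hits, flips, alpha=0.05, tol=1e-9):
    """Largest true hit rate still compatible with seeing so few hits.

    Solves P(X <= hits | rate) = alpha for `rate` by bisection; any larger
    hit rate would make a score this low too improbable.
    """
    if hits >= flips:
        return 1.0  # a perfect score rules nothing out
    lo, hi = hits / flips, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if binom_cdf(hits, flips, mid) > alpha:
            lo = mid  # still plausible: the bound lies higher
        else:
            hi = mid  # too improbable: the bound lies lower
    return hi

# The 51-hits-of-101 result from the discussion above.
print(upper_bound_hit_rate(51, 101))
```

The printed value is an upper limit on how good the psychic's true hit rate could plausibly be. It does not show the rate is exactly one half, only that any real ability must be modest.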
References
Larry Gonick: "The Cartoon Guide to Statistics"
