Chapter 22 Comparing Three or More Groups Using ANOVA
Analysis of variance (ANOVA) is an extension of t-tests. It tests whether the means of measurements from three or more treatment groups are equal. It works by comparing whether individuals chosen from different groups are, on average, more different than individuals chosen from the same group.
If your ANOVA test reports a significant p-value, that tells you that at least one of the means is different from the others, but it does not say which treatment groups are different. To compare each pair of groups, we use a post-hoc test like the Tukey-Kramer test. Most post-hoc tests are modified versions of a t-test.
The most common version of ANOVA is the one-way ANOVA. Like a two-sample t-test, it tests a null hypothesis that the means of a measurement variable are the same in three or more independently sampled groups. Repeated measures ANOVA is like the paired t-test, in that it tests a null hypothesis that the mean difference in a measured variable between three or more categorical or treatment groups is zero.
As with t-tests, there are two different versions of ANOVA. Fisher’s ANOVA is used when the variance is about the same in all of the groups. Welch’s ANOVA is the better choice if there is unequal variance in the groups.
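One way to judge whether the variance is "about the same" is to compute the sample variance of each group and compare them. The sketch below does this for the blood cholesterol values from Table 2 later in this chapter. The variable names are our own, and the code deliberately does not apply a cutoff; deciding how large a ratio is too large is left to you or your instructor.

```python
from statistics import variance

# Blood cholesterol (mg/dl) for each group, copied from Table 2.
cholesterol = {
    "Female omnivore": [172, 157, 169, 171, 158, 170, 175, 175, 181, 183],
    "Female vegetarian": [148, 136, 141, 144, 135, 158, 149, 162, 142, 143],
    "Male omnivore": [199, 180, 192, 194, 187, 189, 191, 185, 194, 201],
    "Male vegetarian": [165, 166, 158, 174, 164, 153, 175, 178, 163, 181],
}

# Sample variance (n - 1 in the denominator) for each group.
group_variances = {group: variance(values) for group, values in cholesterol.items()}

for group, var in group_variances.items():
    print(f"{group}: variance = {var:.1f}")

# Ratio of the largest to the smallest variance; a value near 1 means
# the groups have similar spread.
variance_ratio = max(group_variances.values()) / min(group_variances.values())
print(f"Largest / smallest variance: {variance_ratio:.2f}")
```

For these data the largest variance is roughly twice the smallest.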
This video is a good introduction to ANOVA: Video Intro to ANOVA
22.1 An Example of One-Way ANOVA
There is a disagreement among your friends about the benefits of being a vegetarian. Some say it lowers blood cholesterol (a benefit), while others argue it lowers blood iron levels (which is not good). You and your friends decide to find out which claim (if either) is true by comparing blood cholesterol and iron levels of male vegetarian (MV), female vegetarian (FV), male omnivorous (MO), and female omnivorous (FO) students.
You have four categories (FO, MO, FV, and MV) that you are comparing for two measurement variables (cholesterol, iron). How do you put the data in a form you can evaluate using ANOVA?
22.2 What Do the Statistical Hypotheses Look Like For One-Way ANOVA?
The null hypothesis is that the population means are the same for all groups. We can state it mathematically as:
H0: μMV = μFV = μMO = μFO
The alternative hypothesis is that at least one mean is different from the others.
HA: μMV ≠ μFV
or
μMV ≠ μMO
or
μMV ≠ μFO
or
μFV ≠ μMO
or
μFV ≠ μFO
or
μMO ≠ μFO
22.3 Running the Experiment
You recruit 40 volunteers to help you with your study. Here are the raw data you collect.
Table 1. Blood cholesterol and iron levels for male and female omnivores and vegetarians.
Group | Blood cholesterol (mg/dl) | Blood iron (μg/dl) |
---|---|---|
Female omnivore | 172 | 111 |
Female omnivore | 157 | 113 |
Female omnivore | 169 | 124 |
Female omnivore | 171 | 116 |
Female omnivore | 158 | 112 |
Female omnivore | 170 | 116 |
Female omnivore | 175 | 113 |
Female omnivore | 175 | 122 |
Female omnivore | 181 | 108 |
Female omnivore | 183 | 114 |
Female vegetarian | 148 | 104 |
Female vegetarian | 136 | 90 |
Female vegetarian | 141 | 93 |
Female vegetarian | 144 | 90 |
Female vegetarian | 135 | 86 |
Female vegetarian | 158 | 94 |
Female vegetarian | 149 | 82 |
Female vegetarian | 162 | 95 |
Female vegetarian | 142 | 91 |
Female vegetarian | 143 | 96 |
Male omnivore | 199 | 131 |
Male omnivore | 180 | 146 |
Male omnivore | 192 | 157 |
Male omnivore | 194 | 150 |
Male omnivore | 187 | 146 |
Male omnivore | 189 | 156 |
Male omnivore | 191 | 146 |
Male omnivore | 185 | 181 |
Male omnivore | 194 | 133 |
Male omnivore | 201 | 155 |
Male vegetarian | 165 | 121 |
Male vegetarian | 166 | 108 |
Male vegetarian | 158 | 117 |
Male vegetarian | 174 | 121 |
Male vegetarian | 164 | 129 |
Male vegetarian | 153 | 125 |
Male vegetarian | 175 | 117 |
Male vegetarian | 178 | 125 |
Male vegetarian | 163 | 121 |
Male vegetarian | 181 | 127 |
Table 1 has all of the data we need, but which measurements should we be averaging? Should we include all of the measurements in the ANOVA?
A common mistake we see students make when they first start using one-way ANOVA is arranging their data incorrectly for analysis. We intentionally made the experiment a little confusing so we can show you the problem and help you learn to arrange the data in a more intuitive way. If we rearrange the data, it becomes easier to see which groups of numbers you will compare using ANOVA.
Table 2. Blood cholesterol data (in mg/dl)
Female omni. | Female veget. | Male omni. | Male veget. |
---|---|---|---|
172 | 148 | 199 | 165 |
157 | 136 | 180 | 166 |
169 | 141 | 192 | 158 |
171 | 144 | 194 | 174 |
158 | 135 | 187 | 164 |
170 | 158 | 189 | 153 |
175 | 149 | 191 | 175 |
175 | 162 | 185 | 178 |
181 | 142 | 194 | 163 |
183 | 143 | 201 | 181 |
Table 3. Blood iron data (in μg/dl)
Female omni. | Female veget. | Male omni. | Male veget. |
---|---|---|---|
111 | 104 | 131 | 121 |
113 | 90 | 146 | 108 |
124 | 93 | 157 | 117 |
116 | 90 | 150 | 121 |
112 | 86 | 146 | 129 |
116 | 94 | 156 | 125 |
113 | 82 | 146 | 117 |
122 | 95 | 181 | 125 |
108 | 91 | 133 | 121 |
114 | 96 | 155 | 127 |
The numbers we need to compare by ANOVA now are in separate columns according to groups. The four columns in each table are the groups we will compare. Notice that we also separated the data for blood cholesterol from blood iron, because a one-way ANOVA only works with one measurement variable at a time. Blood cholesterol and blood iron levels are different measurements, so we cannot compare them directly. We have to separate the two types of measurements for analysis.
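If your data start out like Table 1, with one row per volunteer, the same rearrangement can be done in a few lines of code. The sketch below is only an illustration: it uses the first two rows of each group from Table 1 to stay short, and the variable names are our own.

```python
from collections import defaultdict

# Rows in the layout of Table 1: (group, cholesterol in mg/dl, iron in ug/dl).
# Only the first two volunteers from each group are included here.
records = [
    ("Female omnivore", 172, 111),
    ("Female omnivore", 157, 113),
    ("Female vegetarian", 148, 104),
    ("Female vegetarian", 136, 90),
    ("Male omnivore", 199, 131),
    ("Male omnivore", 180, 146),
    ("Male vegetarian", 165, 121),
    ("Male vegetarian", 166, 108),
]

# One dictionary per measurement variable, because a one-way ANOVA
# handles a single measurement variable at a time.
cholesterol_by_group = defaultdict(list)
iron_by_group = defaultdict(list)

for group, cholesterol, iron in records:
    cholesterol_by_group[group].append(cholesterol)
    iron_by_group[group].append(iron)

# Each dictionary now has one "column" of values per group,
# like Table 2 (cholesterol) and Table 3 (iron).
for group, values in cholesterol_by_group.items():
    print(group, values)
```

Each dictionary would then be analyzed with its own one-way ANOVA.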
22.4 Calculating ANOVA
Technically you can run ANOVA in Excel, but we do not recommend setting it up yourself. Even with the Data Analysis package, it is very easy to set up incorrectly. Instead we recommend using this pre-formatted ANOVA Excel spreadsheet, created by Dr. John H. McDonald at the University of Delaware. His excellent online book of basic statistics includes Excel spreadsheets for many tests.
Another option is to use one of these online ANOVA calculators.
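To see roughly what a spreadsheet or calculator is doing, here is a minimal sketch of the F statistic for a Fisher (equal variance) one-way ANOVA, applied to the cholesterol data in Table 2. It compares the variation between group means with the variation within groups. The function name `one_way_anova_f` is our own, and the sketch stops at F and its degrees of freedom; turning F into a p-value needs the F distribution, which is best left to a spreadsheet, calculator, or statistics package.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of numbers."""
    all_values = [x for group in groups for x in group]
    grand_mean = mean(all_values)
    group_means = [mean(group) for group in groups]

    # Between-group sum of squares: how far each group mean is from the grand mean.
    ss_between = sum(
        len(group) * (m - grand_mean) ** 2 for group, m in zip(groups, group_means)
    )
    # Within-group sum of squares: how far each value is from its own group mean.
    ss_within = sum(
        (x - m) ** 2 for group, m in zip(groups, group_means) for x in group
    )

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)

    # F is the ratio of the two mean squares.
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Blood cholesterol (mg/dl) from Table 2, one list per group.
female_omni = [172, 157, 169, 171, 158, 170, 175, 175, 181, 183]
female_veg = [148, 136, 141, 144, 135, 158, 149, 162, 142, 143]
male_omni = [199, 180, 192, 194, 187, 189, 191, 185, 194, 201]
male_veg = [165, 166, 158, 174, 164, 153, 175, 178, 163, 181]

f_stat, df_between, df_within = one_way_anova_f(
    [female_omni, female_veg, male_omni, male_veg]
)
print(f"F({df_between}, {df_within}) = {f_stat:.2f}")
```

For the cholesterol data this gives an F of about 51.3 with 3 and 36 degrees of freedom. A large F means the group means differ by much more than the scatter within groups would explain.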
If your initial ANOVA tells you that at least one of the means is different from the others (p<0.05), you will need to perform a post-hoc test to determine which groups are significantly different. Don’t just compare the groups using a two-sample t-test over and over; every extra test adds another chance of a false positive, so you risk saying two groups are different when they are not. Instead use a Tukey-Kramer test (or some other post-hoc test) to determine which groups are different from each other.
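The arithmetic behind that warning can be sketched as follows. With four groups there are six possible pairs, and if each pair is tested at a 0.05 significance level, the chance of at least one false positive across all six tests is much higher than 0.05. The calculation assumes the six tests are independent, which is a simplification (they share data), so treat the result as a rough guide only.

```python
from itertools import combinations

groups = ["FO", "MO", "FV", "MV"]
alpha = 0.05  # significance level for each individual t-test

# Every possible pair of groups that repeated t-tests would compare.
pairs = list(combinations(groups, 2))

# Chance of at least one false positive across all pairs,
# assuming (as a simplification) that the tests are independent.
familywise_error = 1 - (1 - alpha) ** len(pairs)

print(f"Number of pairwise comparisons: {len(pairs)}")
print(f"Chance of at least one false positive: {familywise_error:.3f}")
```

Under this simplifying assumption, six comparisons give roughly a 26% chance of at least one false positive.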
22.5 How to Report and Interpret ANOVA Statistics
When reporting the results of a one-way ANOVA in text, you need to include the p-value. Your statement summarizing our thought experiment might look like this:
There was a significant difference (p<0.00001) in blood cholesterol overall, and also in blood iron (p<0.005) overall between the four groups (see Figure N). However, there was no significant difference between vegetarians and omnivores in either blood cholesterol or blood iron (p=NS, Tukey-Kramer post-hoc test). We found blood cholesterol was significantly higher in males than females, regardless of diet (p<0.0001 for vegetarians, p<0.05 for omnivores). Similarly, blood iron was significantly higher in males than females, regardless of diet (p<0.00001 for vegetarians, p<0.005 for omnivores).
The findings of this study highlight another important thing to remember when writing the discussion of your report: statistical significance is not the end of the story. Statistical results need to be interpreted. If we had stopped with the ANOVA and not looked at the post-hoc tests, we might have assumed (incorrectly) that the difference between the groups was due to diet, and come to the wrong conclusion.
22.6 There Are Other Kinds of ANOVA
You are unlikely to need other types of ANOVA in a basic biology course, but it helps to know they exist. Two-way ANOVA is used if you have one measurement variable and two categorical variables.
There is a special type of two-way ANOVA called repeated measures ANOVA (rmANOVA), which works essentially the same way as a paired t-test. In rmANOVA, observations or measurements are made on the same individual more than once, usually at different time points. The first measurement on each individual is the control value for that individual. Subsequent measurements are compared back to that value.
If you must run an rmANOVA, we recommend using dedicated statistical software. Outcomes are reported the same way as for a one-way ANOVA.