Chapter 4 What IS a Hypothesis?

Part 2 of this Guide shows you how hypotheses and experimental design are connected to scientific writing. You will learn how to turn your questions about the world around us into testable hypotheses, and how to design an experiment that provides you with reliable data that you can use to start answering your questions.

4.1 Anatomy of a Hypothesis

All hypotheses have two basic parts: a set of conditions that exist or can be created (the “if” part), and a predicted outcome (the “then” part).

Hypotheses are not limited to the sciences. They are a part of how we solve problems in our everyday life. You use them every time you ask or think:

  • If I do “A,” then “B” is probably going to happen.
  • If “C” is true, then I predict “D” is not true.

Even very young children can put together complex chains of observations and hypotheses to solve problems. Imagine a 4-year-old child sees a box of cookies on a shelf that they cannot reach. If we could listen in on their internal conversation, we might hear:

  • “If I pull the chair to the counter, then I can reach the box of cookies.”
  • “If mom or dad catches me, then they will be mad.”

These two hypotheses do not appear out of nowhere. Every hypothesis is based on prior observations (information, knowledge, or experiences) that the person uses to make their prediction. Going back to our cookies example, what prior observations does the child have?

  • “When my parents stood me on the chair to comb my hair, I was higher up and could see the shelf where the cookies are now.”
  • “The last time I ate cookies without asking, I got scolded and my parents took away the cookies.”

Using just two prior observations, our young cookie thief can predict the outcomes if they create a set of conditions (moving the chair), or if some outside event occurs (either parent catches them).

We can connect past observations, new observations, and hypotheses together in complex chains that we use to solve problems and make decisions.

  • “Is mom outside? Is dad in the basement? If they are not close, then they will not hear me move the chair, and I can get cookies.”
  • “Dad is in the next room (a new observation) so if I move the chair he will hear and catch me.”
  • “I’m not moving the chair.”

We call this type of thinking hypothetico-deductive reasoning.

4.2 Informal vs. Formal Hypotheses

Let’s look at another situation from daily life. Remember, a hypothesis is a testable prediction based on previous observations.

You would like to run in the campus 5K race with your friend in a few months. Your friend runs regularly and can run a 5K in 25 minutes. You are not as fast, and need 32 minutes to run a 5K. What are some hypotheses you could make about how to improve?

  • If I drink 3 cups of coffee before running, then I will run faster.
  • If I run twice a day, then I will run faster.
  • If I run with my friend instead of alone, then I will run faster.
  • If I practice running faster for 1K every day, then I will run faster.

All of these are informal hypotheses. They have conditions (if statements) and predictions (then statements), but there are no specific predictions that can be tested. A formal or testable hypothesis provides specific conditions and a specific prediction that can be measured or evaluated in a consistent, unbiased way.

These are the same hypotheses rewritten so they can be tested.

  • Informal: If I drink 3 cups of coffee before running, then I will run faster.
    • How big is the cup of coffee?
    • How soon before running?
    • How will you measure improvement in running speed?
  • Testable: If I drink three 8-ounce cups of coffee 30 minutes before running, then I will run the 5K distance in less than 32 minutes.
    • This is better but still could be improved. For instance, would 31 minutes, 45 seconds be an improvement?


  • Informal: If I run twice a day, then I will run faster.
    • When will you run?
    • How long will you run each time?
    • Again, would 31 minutes, 45 seconds (less than 32 minutes) be an improvement?
  • Testable: If I run twice a day (once for 20 minutes in the morning, and once for 40 minutes in the afternoon), then I will run the 5K distance in 30 minutes instead of 32 minutes.


  • Informal: If I run with my friend instead of alone, then I will run faster.
    • What are you doing differently?
    • What improvement do you predict you will see? When?
  • Testable: If I run with my friend instead of alone, and try to run at their pace each time, then after 60 days I will run the 5K distance at their pace.
    • This is more specific about the prediction, but what if your friend slows down to match your pace? How will you know? Can you measure pace more rigorously?


  • Informal: If I practice running faster for 1K every day, then I will run faster.
    • Will you do this at the start, in the middle, or at the end of your run?
    • How much faster will you go?
  • Testable: If I run 3K every day, and try to run the second kilometer at a 24-minute 5K pace (4 minutes, 48 seconds), then after 60 days I will be able to run the entire 5K distance in less than 27 minutes.
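
One way to see why the testable versions are easier to evaluate is to write one down as data plus a measurement rule. The sketch below is only an illustration: the class name RunningHypothesis, the helper pace_min_per_km, and the stopwatch readings are hypothetical and are not part of the examples above.

```python
from dataclasses import dataclass

RACE_KM = 5.0  # length of the campus race in kilometers


def pace_min_per_km(total_minutes: float, distance_km: float = RACE_KM) -> float:
    """Average pace in minutes per kilometer."""
    return total_minutes / distance_km


@dataclass(frozen=True)
class RunningHypothesis:
    condition: str         # the specific "if" part
    target_minutes: float  # the specific "then" part: the 5K time to beat

    def supported_by(self, observed_minutes: float) -> bool:
        """True when the measured 5K time is below the predicted time."""
        return observed_minutes < self.target_minutes


coffee = RunningHypothesis(
    condition="three 8-ounce cups of coffee 30 minutes before running",
    target_minutes=32.0,
)

# Hypothetical stopwatch readings, invented purely for illustration.
print(coffee.supported_by(31.75))  # 31 minutes, 45 seconds
print(coffee.supported_by(33.0))

# Pace makes the gap between you and your friend explicit.
print(pace_min_per_km(32.0), pace_min_per_km(25.0))
```

Because the prediction is a number, anyone holding the stopwatch reaches the same verdict. The rule also exposes the weakness noted above: a time of 31 minutes, 45 seconds counts as support even though the improvement is small.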

4.3 There Are Different Kinds of Hypotheses

A testable hypothesis can be stated as a biological hypothesis, or as a statistical hypothesis. The biological hypothesis is a descriptive statement of what we predict we will observe. The statistical hypothesis puts the biological hypothesis into mathematical terms that we can evaluate using statistics. Both types can be split into a null hypothesis and alternate hypothesis.

This language gives many students trouble, so let’s look at the terms in the context of another experiment.

4.3.1 Biological Hypotheses

Imagine you notice that when egg-laying chickens are fed chocolate, more female chickens hatch from the eggs than males. You decide to test this observation formally.

  • Testable Hypothesis: If chickens are fed chocolate, then the sex ratio of males to females hatched from eggs laid by those chickens will be less than 1:1.

We’ll start with the null hypothesis (BO). It describes what you expect to see if the conditions you create have no effect. The alternate hypothesis (BA) describes what you expect to see if there IS an effect. Put another way, the null hypothesis is boring and dull (null, dull, get it?!), and the alternate hypothesis is interesting.

The null biological hypothesis (BO) is that the ratio of males to females hatched is 1:1 regardless of whether the hens that laid those eggs ate chocolate.

The alternate hypothesis is that the test group(s) are different from each other, or different from a theoretical expectation. Here the alternate biological hypothesis is that chickens that are fed chocolate lay eggs that have a sex ratio different from 1:1.

In practice you rarely see a formally stated biological null hypothesis in a scientific journal article, only the alternate hypothesis. So you might wonder why we bother. Stating the biological null hypothesis formally helps us state our statistical hypothesis accurately. It also helps us think more clearly about our experimental design, particularly about what controls we need.

4.3.2 Statistical Hypotheses

The goal of statistical hypothesis testing is to estimate how likely it is that an observed result arose from random variation alone (in other words, just coincidence).

Suppose you feed chocolate to a bunch of chickens, then look at the sex ratio in their offspring. It’s very tempting to look for patterns in your data that support the exciting alternative hypothesis. If you get more females than males, it would be a tremendously exciting discovery about the mechanism of sex determination that you could publish in Science or Nature. Female chickens are more valuable than male chickens in egg-laying breeds, and poultry scientists have spent a lot of time and money trying to change the sex ratio in chickens. On the other hand, if chocolate doesn’t change the sex ratio, you would have a hard time getting your study published in the Eastern Rhode Island Journal of Chickenology.

You run an experiment feeding chocolate to 20 egg-laying chickens. As a control, you feed another 20 chickens regular feed without chocolate. For both groups you count the number of eggs laid in 7 days that produce male chicks, and the number of eggs that produce female chicks. Let’s consider 3 possible outcomes:

Possible Outcome 1: You get 47 female chicks and 1 male chick from the chocolate-fed hens. The effect is so dramatic that you conclude that chocolate really changed the sex ratio based on just the numbers alone.

Possible Outcome 2: You count 25 female chicks and 23 male chicks from chocolate-fed hens, and 18 female chicks and 19 male chicks from the control hens. These results give us no reason to doubt a 1:1 ratio of females to males in both the test and control groups.

Possible Outcome 3: Chocolate-fed chickens lay eggs that produce 31 females and only 17 males (a sex ratio a little under 2:1). The chickens that were fed regular chow laid eggs that produced 25 males and 24 females (a sex ratio of about 1:1). Now it is not so clear-cut. Could this just be coincidence? Stating this in more mathematical terms:

“If the boring biological null hypothesis is really true, and chocolate does not affect sex ratio, what’s the probability of getting a sex ratio of 2:1 just due to random chance?”

This is our statistical hypothesis, and it too has null and alternate versions (abbreviated HO and HA).

Null (HO): Sex ratio (choco-chix) = Sex ratio (control)
Alternate (HA): Sex ratio (choco-chix) ≠ Sex ratio (control)
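
The question quoted above can be turned into a calculation. The following is a minimal sketch, not the method this Guide teaches later. It looks only at the chocolate-fed group from Possible Outcome 3, and it assumes each chick is independently female or male with equal probability (an exact binomial calculation). The function name two_sided_binomial_p is hypothetical.

```python
from math import comb


def two_sided_binomial_p(females: int, total: int) -> float:
    """Probability, if the true sex ratio is 1:1, of a split at least as
    lopsided as the observed one (in either direction)."""
    expected = total / 2
    observed_gap = abs(females - expected)
    # Count every way of getting a female total that is at least as far
    # from the expected 50:50 split as the observed total.
    ways = sum(
        comb(total, k)
        for k in range(total + 1)
        if abs(k - expected) >= observed_gap
    )
    # Each of the 2**total female/male sequences is equally likely.
    return ways / 2 ** total


# Possible Outcome 3, chocolate-fed hens: 31 females out of 48 chicks.
print(round(two_sided_binomial_p(females=31, total=48), 3))
```

Under these assumptions the probability comes out near 0.06, which is why this outcome is harder to call than the other two. Note that this sketch compares one group against a theoretical 1:1 ratio, while the statistical hypotheses above compare the chocolate-fed group against the control group.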

Statistical tests estimate the p-value, which is the probability of obtaining results at least as extreme as the ones you observed, assuming the null hypothesis is true (i.e., by chance alone). Statistical hypothesis testing methods are explained in a later section of this Guide. For now what you need to know is:

  • If the p-value is high, meaning results like yours would be fairly likely to occur through random variation alone, you would say that you “fail to reject the null hypothesis.” Don’t say that the alternative hypothesis is wrong. Statistical testing does not give us that level of certainty.
  • If the observed results are unlikely under the null hypothesis, you would say that you “reject the null hypothesis.”
  • In statistical testing there always is some margin of error. That is why we cannot prove conclusively (and never say) that the alternative hypothesis is correct.
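
The wording in the bullets above can be tied to numbers with a short calculation. The sketch below is one possible way to compare the two groups in Possible Outcome 3, and it should not be read as the method this Guide prescribes. It uses Fisher's exact test, written out with the Python standard library, and a cutoff of 0.05 that is a common convention but an assumption here. The names fisher_exact_two_sided, decision, and ALPHA are hypothetical.

```python
from math import comb

ALPHA = 0.05  # assumed cutoff; the text above does not specify one


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows are groups (test, control); columns are outcomes (female, male).
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def table_prob(x: int) -> float:
        # Probability of x females in the test group when the row and
        # column totals are held fixed (hypergeometric distribution).
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = table_prob(a)
    # Add up every possible table that is no more probable than the
    # observed one; the small tolerance guards against rounding error.
    total = sum(
        p
        for p in (table_prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )
    return min(1.0, total)


def decision(p_value: float, alpha: float = ALPHA) -> str:
    """Phrase the conclusion the way the bullets above recommend."""
    if p_value < alpha:
        return "reject the null hypothesis"
    return "fail to reject the null hypothesis"


# Possible Outcome 3: chocolate-fed 31 F / 17 M, control 24 F / 25 M.
p_outcome3 = fisher_exact_two_sided(31, 17, 24, 25)
print(round(p_outcome3, 3), decision(p_outcome3))
```

With these counts the p-value is above the assumed 0.05 cutoff, so the sketch prints “fail to reject the null hypothesis.” That is not the same as showing chocolate has no effect; it only means these data are not unusual enough to rule out coincidence.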

4.4 Where’s the Hypothesis in a Research Article?

Our students get confused when we say we want them to make their hypotheses as “if-then” statements, when they do not see such formal statements in most of the scientific articles they read. We are not being inconsistent, just trying to develop a thinking skill. We ask our students to state their hypothesis in the if-then form so they learn to THINK in those terms. As they (and you) gain experience, it is not always necessary to explicitly state the hypothesis as an “if-then” statement.

Nearly all primary literature has at least one testable hypothesis, but it may not be worded in a way that is easy to find. Look at this example:

"Based on the previous conclusions of Betto and Bell (2019) related to mating seasonality in passerines, it is reasonable to suggest that non-passerine species will have different seasonal mating patterns too."

There IS an if-then statement hiding in there. We can find it by revising and rearranging the wording a bit.

"In 2019, Betto and Bell concluded that when it is warmer than usual, passerine birds will mate later in the season. IF Betto and Bell are right, THEN we predict non-passerine birds will do the same thing. IF the weather is warmer than usual, THEN non-passerine birds also will mate later in the season."

The second cause for confusion is that, for most published articles, the Introduction section is one giant “if” statement. In essence the authors are saying:

"Here are our prior observations, and here is what all of these other researchers are saying about our model or a related system. This is how we are interpreting these findings. Now IF all of the stuff we just told you is true, THEN we expect to find..."

When articles are written this way, the authors are assuming you as the reader realize that the Introduction is their “if” statement.

Sometimes authors have no obvious hypothesis and don’t actually make any specific predictions. Instead they state their objective or goals for the study. This is very common in applied science research. For example, this is an excerpt from a recent abstract:

Social and ecological differences in early SARS-CoV-2 pandemic screening and outcomes have been documented, but the means by which these differences have arisen are not understood. The objective of this study is to characterize social, economic, and chronic disease mechanisms underlying differences in outcomes for patients within the Cleveland Clinic Health System... (Dalton & Gunzler, 2021; https://doi.org/10.1371/journal.pone.0255343)

In this case, the hypothesis is implied. The authors of this study are assuming that there is some difference between patients of different socio-economic and chronic disease status that affects their outcomes if they are infected with SARS-CoV-2. Their “if” statement is implied, but we can still write it out as a clear biological alternate hypothesis:

"If there are differences in the social, economic, and chronic disease status of patients with COVID-19, then we predict there will be measurable differences in their health outcomes."

What is MISSING from this hypothesis are specific predictors. The study authors do not know which factors are going to be important, but they are predicting that at least one social, economic, or chronic health factor will be correlated with a difference in health outcome after COVID-19 infection.