A hypothesis (plural hypotheses) is a proposed explanation for a phenomenon. For a hypothesis to be a scientific hypothesis, the scientific method requires that one can test it. Scientists generally base scientific hypotheses on previous observations that cannot satisfactorily be explained with the available scientific theories. Even though the words "hypothesis" and "theory" are often used synonymously, a scientific hypothesis is not the same as a scientific theory. A working hypothesis is a provisionally accepted hypothesis proposed for further research.
A different meaning of the term hypothesis is used in formal logic, to denote the antecedent of a proposition; thus in the proposition "If P, then Q", P denotes the hypothesis (or antecedent) and Q can be called the consequent. P is the assumption in a (possibly counterfactual) "what if" question.
The adjective hypothetical, meaning "having the nature of a hypothesis", or "being assumed to exist as an immediate consequence of a hypothesis", can refer to any of these meanings of the term "hypothesis".
In its ancient usage, hypothesis referred to a summary of the plot of a classical drama. The English word hypothesis comes from the ancient Greek word ὑπόθεσις (hupothesis), meaning "to put under" or "to suppose".
In Plato's Meno (86e–87b), Socrates dissects virtue with a method used by mathematicians, that of "investigating from a hypothesis." In this sense, 'hypothesis' refers to a clever idea or to a convenient mathematical approach that simplifies cumbersome calculations. Cardinal Bellarmine gave a famous example of this usage in the warning issued to Galileo in the early 17th century: that he must not treat the motion of the Earth as a reality, but merely as a hypothesis.
In common usage in the 21st century, a hypothesis refers to a provisional idea whose merit requires evaluation. For proper evaluation, the framer of a hypothesis needs to define specifics in operational terms. A hypothesis requires more work by the researcher in order to either confirm or disprove it. In due course, a confirmed hypothesis may become part of a theory or occasionally may grow to become a theory itself. Normally, scientific hypotheses have the form of a mathematical model. Sometimes, but not always, one can also formulate them as existential statements, stating that some particular instance of the phenomenon under examination has some characteristic, or as causal explanations, which have the general form of universal statements, stating that every instance of the phenomenon has a particular characteristic.
In entrepreneurial science, a hypothesis is used to formulate provisional ideas within a business setting. The formulated hypothesis is then evaluated, and shown to be "true" or "false" through a verifiability- or falsifiability-oriented experiment.
Any useful hypothesis will enable predictions by reasoning (including deductive reasoning). It might predict the outcome of an experiment in a laboratory setting or the observation of a phenomenon in nature. The prediction may also invoke statistics and only talk about probabilities. Karl Popper, following others, has argued that a hypothesis must be falsifiable, and that one cannot regard a proposition or theory as scientific if it does not admit the possibility of being shown false. Other philosophers of science have rejected the criterion of falsifiability or supplemented it with other criteria, such as verifiability (e.g., verificationism) or coherence (e.g., confirmation holism). The scientific method involves experimentation to test the ability of some hypothesis to adequately answer the question under investigation. In contrast, unfettered observation is not as likely to raise unexplained issues or open questions in science as would the formulation of a crucial experiment to test the hypothesis. A thought experiment might also be used to test the hypothesis.
In framing a hypothesis, the investigator must not yet know the outcome of a test, or the outcome must remain reasonably under continuing investigation. Only in such cases does the experiment, test or study potentially increase the probability of showing the truth of a hypothesis. If the researcher already knows the outcome, it counts as a "consequence", and the researcher should have already considered this while formulating the hypothesis. If one cannot assess the predictions by observation or by experience, the hypothesis needs to be tested by others providing observations. For example, a new technology or theory might make the necessary experiments feasible.
People refer to a trial solution to a problem as a hypothesis, often called an "educated guess" because it provides a suggested solution based on the evidence. However, some scientists reject the term "educated guess" as incorrect. Experimenters may test and reject several hypotheses before solving the problem.
According to Schick and Vaughn, researchers weighing up alternative hypotheses may take into consideration:
- Testability (compare falsifiability as discussed above)
- Parsimony (as in the application of "Occam's razor", discouraging the postulation of excessive numbers of entities)
- Scope – the apparent application of the hypothesis to multiple cases of phenomena
- Fruitfulness – the prospect that a hypothesis may explain further phenomena in the future
- Conservatism – the degree of "fit" with existing recognized knowledge-systems.
Main article: Working hypothesis
A working hypothesis is a hypothesis that is provisionally accepted as a basis for further research in the hope that a tenable theory will be produced, even if the hypothesis ultimately fails. Like all hypotheses, a working hypothesis is constructed as a statement of expectations, which can be linked to the exploratory research purpose in empirical investigation. Working hypotheses are often used as a conceptual framework in qualitative research.
The provisional nature of working hypotheses makes them useful as an organizing device in applied research. Here they act as a useful guide to address problems that are still in a formative phase.
In recent years, philosophers of science have tried to integrate the various approaches to evaluating hypotheses, and the scientific method in general, to form a more complete system that integrates the individual concerns of each approach. Notably, Imre Lakatos and Paul Feyerabend, Karl Popper's colleague and student, respectively, have produced novel attempts at such a synthesis.
Hypotheses, concepts and measurement
Concepts in Hempel's deductive-nomological model play a key role in the development and testing of hypotheses. Most formal hypotheses connect concepts by specifying the expected relationships between propositions. When a set of hypotheses is grouped together, it becomes a type of conceptual framework. When a conceptual framework is complex and incorporates causality or explanation, it is generally referred to as a theory. According to the noted philosopher of science Carl Gustav Hempel, "An adequate empirical interpretation turns a theoretical system into a testable theory: The hypothesis whose constituent terms have been interpreted become capable of test by reference to observable phenomena. Frequently the interpreted hypothesis will be derivative hypotheses of the theory; but their confirmation or disconfirmation by empirical data will then immediately strengthen or weaken also the primitive hypotheses from which they were derived."
Hempel provides a useful metaphor that describes the relationship between a conceptual framework and the framework as it is observed and perhaps tested (interpreted framework). "The whole system floats, as it were, above the plane of observation and is anchored to it by rules of interpretation. These might be viewed as strings which are not part of the network but link certain points of the latter with specific places in the plane of observation. By virtue of those interpretative connections, the network can function as a scientific theory." Hypotheses with concepts anchored in the plane of observation are ready to be tested. In "actual scientific practice the process of framing a theoretical structure and of interpreting it are not always sharply separated, since the intended interpretation usually guides the construction of the theoretician." It is, however, "possible and indeed desirable, for the purposes of logical clarification, to separate the two steps conceptually."
Statistical hypothesis testing
Main article: Statistical hypothesis testing
When a possible correlation or similar relation between phenomena is investigated, such as whether a proposed remedy is effective in treating a disease, the hypothesis that a relation exists cannot be examined the same way one might examine a proposed new law of nature. In such an investigation, if the tested remedy shows no effect in a few cases, these do not necessarily falsify the hypothesis. Instead, statistical tests are used to determine how likely it is that the overall effect would be observed if the hypothesized relation does not exist. If that likelihood is sufficiently small (e.g., less than 1%), the existence of a relation may be assumed. Otherwise, any observed effect may be due to pure chance.
In statistical hypothesis testing, two hypotheses are compared. These are called the null hypothesis and the alternative hypothesis. The null hypothesis is the hypothesis that states that there is no relation between the phenomena whose relation is under investigation, or at least not of the form given by the alternative hypothesis. The alternative hypothesis, as the name suggests, is the alternative to the null hypothesis: it states that there is some kind of relation. The alternative hypothesis may take several forms, depending on the nature of the hypothesized relation; in particular, it can be two-sided (for example: there is some effect, in a yet unknown direction) or one-sided (the direction of the hypothesized relation, positive or negative, is fixed in advance).
Conventional significance levels for testing hypotheses (acceptable probabilities of wrongly rejecting a true null hypothesis) are .10, .05, and .01. The criteria under which the null hypothesis is rejected and the alternative hypothesis accepted must be determined in advance, before the observations are collected or inspected. If these criteria are determined later, when the data to be tested are already known, the test is invalid.
The above procedure depends on the number of participants (units or sample size) included in the study. For instance, the sample size may be too small to reject a null hypothesis; it is therefore recommended to specify the sample size from the beginning. It is also advisable to define small, medium, and large effect sizes for each of the important statistical tests used to test the hypotheses.
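As a rough illustration of the procedure described above, the following Python sketch runs a two-sided z-test for the difference between two proportions (the normal-approximation test; the function name and the remedy-versus-placebo numbers are hypothetical, standing in for the kind of data a remedy trial might produce):

```python
from math import sqrt, erfc

def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Two-sided z-test for a difference between two proportions,
    using the normal approximation. Returns (z statistic, p-value)."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    # Pooled proportion under the null hypothesis of no difference.
    pooled = (successes_a + successes_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value: probability of a |z| at least this large
    # under the standard normal distribution.
    p_value = erfc(abs(z) / sqrt(2))
    return z, p_value

# Hypothetical trial: 30/200 recoveries with the remedy vs. 20/200 without.
z, p = two_proportion_z_test(30, 200, 20, 200)
print(f"z = {z:.3f}, p = {p:.3f}")
```

With these illustrative numbers the p-value comes out near 0.13, so at the conventional .05 significance level the null hypothesis of no relation would not be rejected: the observed difference is consistent with pure chance.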
References
- Hilborn, Ray; Mangel, Marc (1997). The Ecological Detective: Confronting Models with Data. Princeton University Press. p. 24. ISBN 978-0-691-03497-3. Retrieved 22 August 2011.
- Wilbur R. Knorr, "Construction as existence proof in ancient geometry", p. 125, as selected by Jean Christianidis (ed.), Classics in the History of Greek Mathematics, Kluwer.
- Gregory Vlastos, Myles Burnyeat (1994). Socratic Studies. Cambridge. ISBN 0-521-44735-6. p. 1.
- "Neutral hypotheses, those of which the subject matter can never be directly proved or disproved, are very numerous in all sciences." Morris Cohen and Ernest Nagel (1934). An Introduction to Logic and Scientific Method. New York: Harcourt, Brace, and Company. p. 375.
- "Bellarmine (Ital. Bellarmino), Roberto Francesco Romolo", Encyclopædia Britannica, Eleventh Edition: "Bellarmine did not proscribe the Copernican system ... all he claimed was that it should be presented as a hypothesis until it should receive scientific demonstration." This article incorporates text from a publication now in the public domain: Chisholm, Hugh, ed. (1911). "Hypothesis". Encyclopædia Britannica. 14 (11th ed.). Cambridge University Press. p. 208.
- Crease, Robert P. (2008). The Great Equations. ISBN 978-0-393-06204-5. p. 112 lists the conservation of energy as an example of accounting a constant of motion. Hypothesized by Sadi Carnot, truth demonstrated by James Prescott Joule, proven by Emmy Noether.
- Harvard Business Review (2013). "Why Lean Startup Changes Everything".
- Tristan Kromer (2014). "Success Metric vs. Fail Condition".
- Lean Startup Circle. "What is Lean Startup?"
- Popper 1959.
- "When it is not clear under which law of nature an effect or class of effect belongs, we try to fill this gap by means of a guess. Such guesses have been given the name conjectures or hypotheses." Hans Christian Ørsted (1811). "First Introduction to General Physics", ¶18. Selected Scientific Works of Hans Christian Ørsted. ISBN 0-691-04334-5. p. 297.
- "In general we look for a new law by the following process. First we guess it. ..." Richard Feynman (1965). The Character of Physical Law. p. 156.
- Schick, Theodore; Vaughn, Lewis (2002). How to Think About Weird Things: Critical Thinking for a New Age. Boston: McGraw-Hill Higher Education. ISBN 0-7674-2048-9.
- Oxford Dictionary of Sports Science & Medicine. Eprint via Answers.com.
- See "hypothesis" in Century Dictionary Supplement, v. 1, 1909, New York: The Century Company. Reprinted, v. 11, p. 616 (via Internet Archive) of the Century Dictionary and Cyclopedia, 1911: "hypothesis [...] Working hypothesis, a hypothesis suggested or supported in some measure by features of observed facts, from which consequences may be deduced which can be tested by experiment and special observations, and which it is proposed to subject to an extended course of such investigation, with the hope that, even should the hypothesis thus be overthrown, such research may lead to a tenable theory."
- Patricia M. Shields, Hassan Tajalli (2006). "Intermediate Theory: The Missing Link in Successful Student Scholarship". Journal of Public Affairs Education. 12 (3): 313–334.
- Patricia M. Shields (1998). "Pragmatism as a Philosophy of Science: A Tool for Public Administration". In Jay D. White (ed.), Research in Public Administration. 4. pp. 195–225. ISBN 1-55938-888-9.
- Patricia M. Shields and Nandhini Rangarajan (2013). A Playbook for Research Methods: Integrating Conceptual Frameworks and Project Management. Stillwater, OK: New Forums Press. pp. 109–157.
- Hempel, C. G. (1952). Fundamentals of Concept Formation in Empirical Science. Chicago, Illinois: The University of Chicago Press. p. 36.
- Hempel, C. G. (1952). Fundamentals of Concept Formation in Empirical Science. Chicago, Illinois: The University of Chicago Press. p. 33.
- Altman, D. G. (1990). Practical Statistics for Medical Research. CRC Press. Section 8.5.
- Mellenbergh, G. J. (2008). "Chapter 8: Research designs: Testing of research hypotheses". In H. J. Adèr & G. J. Mellenbergh (Eds.) (with contributions by D. J. Hand), Advising on Research Methods: A Consultant's Companion. Huizen, The Netherlands: Johannes van Kessel Publishing. pp. 183–209.
- Altman, D. G. (1990). Practical Statistics for Medical Research. CRC Press. Section 15.3.
- "How science works", Understanding Science by the University of California Museum of Paleontology.
Update: I’ve since revised this hypothesis format. You can find the most current version in this article:
“My hypothesis is …”
These words are becoming more common every day. Product teams are starting to talk like scientists. Are you?
The internet industry is going through a mindset shift. Instead of assuming we have all the right answers, we are starting to acknowledge that building products is hard. We are accepting the reality that our ideas are going to fail more often than they are going to succeed.
Rather than waiting to find out which ideas are which after engineers build them, smart product teams are starting to integrate experimentation into their product discovery process. They are asking themselves, how can we test this idea before we invest in it?
This process starts with formulating a good hypothesis.
These Are Not the Hypotheses You Are Looking For
When we are new to hypothesis testing, we tend to start with hypotheses like these:
- Fixing the hard-to-use comment form will increase user engagement.
- A redesign will improve site usability.
- Reducing prices will make customers happy.
There’s only one problem. These aren’t testable hypotheses. They aren’t specific enough.
A good hypothesis can be clearly refuted or supported by an experiment. – Tweet This
The 5 Components of a Good Hypothesis
To make sure that your hypotheses can be supported or refuted by an experiment, you will want to include each of these elements:
- the change that you are testing
- what impact you expect the change to have
- who you expect it to impact
- by how much
- after how long
The Change: This is the change that you are introducing to your product. You are testing a new design, you are adding new copy to a landing page, or you are rolling out a new feature.
Be sure to get specific. Fixing a hard-to-use comment form is not specific enough. How will you fix it? Some solutions might work. Others might not. Each is a hypothesis in its own right.
Design changes can be particularly challenging. Your hypothesis should cover a specific design, not the idea of a redesign.
In other words, use this:
- This specific design will increase conversions.
Not this:
- Redesigning the landing page will increase conversions.
The former can be supported or refuted by an experiment. The latter can encompass dozens of design solutions, where some might work and others might not.
The Expected Impact: The expected impact should clearly define what you expect to see as a result of making the change.
How will you know if your change is successful? Will it reduce response times, increase conversions, or grow your audience?
The expected impact needs to be specific and measurable. – Tweet This
You might hypothesize that your new design will increase usability. This isn’t specific enough.
You need to define how you will measure an increase in usability. Will it reduce the time to complete some action? Will it increase customer satisfaction? Will it reduce bounce rates?
There are dozens of ways that you might measure an increase in usability. In order for this to be a testable hypothesis, you need to define which metric you expect to be affected by this change.
Who Will Be Impacted: The third component of a good hypothesis is who will be impacted by this change. Too often, we assume everyone. But this is rarely the case.
I was recently working with a product manager who was testing a sign up form popup upon exiting a page.
I’m sure you’ve seen these before. You are reading a blog post and just as you are about to navigate away, you get a popup that asks, “Would you like to subscribe to our newsletter?”
She A/B tested this change by showing it to half of her population, leaving the rest as her control group. But there was a problem.
Some of her visitors were already subscribers. They don’t need to subscribe again. For this population, the answer to this popup will always be no.
Rather than testing with her whole population, she should be testing with just the people who are not currently subscribers.
This isn’t easy to do. And it might not sound like it’s worth the effort, but it’s the only way to get good results.
Suppose she has 100 visitors. Fifty see the popup and fifty don’t. If 45 of the people who see the popup are already subscribers and as a result they all say no, and of the five remaining visitors only 1 says yes, it’s going to look like her conversion rate is 1 out of 50, or 2%. However, if she limits her test to just the people who haven’t subscribed, her conversion rate is 1 out of 5, or 20%. This is a huge difference.
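The arithmetic in that example is worth making explicit. A few lines of Python (numbers taken directly from the scenario above) show how filtering out already-subscribed visitors changes the measured rate:

```python
# Hypothetical numbers from the popup example above.
visitors_shown = 50        # visitors who saw the popup
already_subscribed = 45    # of those, visitors who were already subscribers
conversions = 1            # new subscriptions from the popup

# Naive rate: treats every visitor who saw the popup as eligible.
naive_rate = conversions / visitors_shown

# Corrected rate: only non-subscribers could possibly convert.
eligible = visitors_shown - already_subscribed
corrected_rate = conversions / eligible

print(f"naive: {naive_rate:.0%}, corrected: {corrected_rate:.0%}")
# naive: 2%, corrected: 20%
```

The same single conversion reads as either a 2% or a 20% rate depending on whose behavior you count, which is why defining the "who" up front matters.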
Who you test with is often the most important factor for getting clean results. – Tweet This
By how much: The fourth component builds on the expected impact. You need to define how much of an impact you expect your change to have.
For example, if you are hypothesizing that your change will increase conversion rates, then you need to estimate by how much, as in the change will increase conversion rate from x% to y%, where x is your current conversion rate and y is your expected conversion rate after making the change.
This can be hard to do and is often a guess. However, you still want to do it. It serves two purposes.
First, it helps you draw a line in the sand. This number should determine in black and white terms whether your hypothesis passes or fails, and it should dictate how you act on the results.
Suppose you hypothesize that the change will improve conversion rates by 10%. If your change results in only a 9% increase, your hypothesis fails.
This might seem extreme, but it’s a critical step in making sure that you don’t succumb to your own biases down the road.
It’s very easy after the fact to determine that 9% is good enough. Or that 2% is good enough. Or that -2% is okay, because you like the change. Without a line in the sand, you are setting yourself up to ignore your data.
The second reason why you need to define by how much is so that you can calculate for how long to run your test.
After how long: Too many teams run their tests for an arbitrary amount of time or stop the test as soon as one version is winning.
This is a problem. It opens you up to false positives and releasing changes that don’t actually have an impact.
If you hypothesize the expected impact ahead of time, then you can use a duration calculator to determine for how long to run the test.
Finally, you want to add the duration of the test to your hypothesis. This will help to ensure that everyone knows that your results aren’t valid until the duration has passed.
If your traffic is sporadic, “how long” doesn’t have to be defined in time. It can also be defined in page views or sign ups or after a specific number of any event.
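A duration calculator of this sort is, at bottom, a sample-size calculation: given your baseline rate and the lift you expect, it tells you how many visitors each variant needs. As a rough sketch (the function name and defaults are illustrative, using the standard normal-approximation formula for comparing two proportions, not any particular calculator's internals):

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_variant(p_base, p_target, alpha=0.05, power=0.8):
    """Approximate visitors needed per variant to detect a change in a
    conversion rate from p_base to p_target with a two-sided test at
    significance level alpha and the given statistical power."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # ~1.96 for alpha = .05
    z_beta = NormalDist().inv_cdf(power)           # ~0.84 for 80% power
    variance = p_base * (1 - p_base) + p_target * (1 - p_target)
    effect = p_target - p_base
    return ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)

# Detecting a lift from a 10% to an 11% conversion rate:
n = sample_size_per_variant(0.10, 0.11)
print(n, "visitors per variant")
```

Note how sensitive the result is to the expected lift: detecting a 1-point lift takes on the order of tens of thousands of visitors per variant, while a 10-point lift takes a few hundred. This is exactly why estimating "by how much" up front is a prerequisite for "after how long."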
Putting It All Together
Use the following examples as templates for your own hypotheses:
- Design x [the change] will increase conversions [the impact] for search campaign traffic [the who] by 10% [the how much] after 7 days [the how long].
- Reducing the sign up steps from 3 to 1 will increase sign ups by 25% for new visitors after 1,000 visits to the sign up page.
- This subject line will increase open rates for daily digest subscribers by 15% after 3 days.
After you write a hypothesis, break it down into its five components to make sure that you haven’t forgotten anything.
- Change: this subject line
- Impact: will increase open rates
- Who: for daily digest subscribers
- By how much: by 15%
- After how long: after 3 days
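This breakdown can even be made mechanical. A small Python sketch (the class and field names are my own invention, not any standard tool) captures the five components and renders them back into a hypothesis statement, so a missing component shows up as a missing argument:

```python
from dataclasses import dataclass

@dataclass
class Hypothesis:
    """The five components of a testable hypothesis."""
    change: str      # the change you are testing
    impact: str      # the specific, measurable expected impact
    who: str         # the population you expect it to affect
    how_much: str    # the size of the expected effect
    how_long: str    # the duration (or event count) of the test

    def statement(self) -> str:
        """Render the components as a single hypothesis sentence."""
        return (f"{self.change} {self.impact} for {self.who} "
                f"by {self.how_much} after {self.how_long}.")

h = Hypothesis(
    change="This subject line",
    impact="will increase open rates",
    who="daily digest subscribers",
    how_much="15%",
    how_long="3 days",
)
print(h.statement())
# This subject line will increase open rates for daily digest subscribers by 15% after 3 days.
```

Because every field is required, constructing the object forces you to supply all five components before the hypothesis reads as complete.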
And then ask yourself:
- Is your expected impact specific and measurable?
- Can you clearly explain why the change will drive the expected impact?
- Are you testing with the right population?
- Did you estimate your how much based on a baseline and / or comparable changes? (more on this in a future post)
- Did you calculate the duration using a duration calculator?
It’s easy to give lip service to experimentation and hypothesis testing. But if you want to get the most out of your efforts, make sure you are starting with a good hypothesis.
Did you learn something new reading this article? Keep learning. Subscribe to the Product Talk mailing list to get the next article in this series delivered to your inbox.