A/B testing has been used to optimize and evaluate the effectiveness of programs, products, and services in a wide range of fields, such as financial management, environment, transportation, and health. In general, A/B testing works best when you can identify a clear touchpoint (a way of reaching end-users), as well as a clear, measurable outcome that you are trying to achieve. It’s important to have good data on your outcome: you’ll need to know whether each user in each group achieved the desired outcome.
Before and after comparisons can be valid in some circumstances, but always leave some doubt as to whether the differences you observe in program outcomes are due to what you changed in your program, or to other changes that are occurring in the background at the same time. A/B testing, by design, controls for these factors and provides greater confidence that observed differences are due to the changes you’ve made.
Yes. Many tests simultaneously compare multiple treatment groups and a control. However, this tool is designed only to help run the simplest A/B test, which compares just two groups. Running a test with multiple treatment groups often requires more complicated randomization and analysis designs; we recommend consulting with an expert in policy analysis to help you with this.
The statistical tests that our A/B testing tool runs on the Prepare and Analyze pages are simplest when you can express your outcome as the percent of people who take an action. Many outcomes important to practitioners can be expressed this way. For example, a practitioner using the tool to compare the impact of a payment reminder could frame their outcome as “the percent of people who got a reminder who pay within thirty days.” If you can’t frame your outcome as a percent, the tool needs a little more information about the distribution of your data in order to do its calculations. The tool will guide you through this.
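To make the "percent of people who take an action" framing concrete, here is a minimal sketch of one common way to compare two such percentages: a two-sided, pooled two-proportion z-test. This is an illustration under stated assumptions, not a description of the calculation the tool actually performs. The function name and the example counts are hypothetical.

```python
import math


def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Two-sided pooled z-test for a difference between two proportions.

    Hypothetical helper for illustration; not the tool's own code.
    Returns (rate_a, rate_b, z, p_value).
    """
    if n_a <= 0 or n_b <= 0:
        raise ValueError("Both groups need at least one participant.")

    rate_a = successes_a / n_a
    rate_b = successes_b / n_b

    # Pooled rate under the null hypothesis that both groups share one rate.
    pooled = (successes_a + successes_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        raise ValueError("No variation in the outcome; the test is undefined.")

    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return rate_a, rate_b, z, p_value


if __name__ == "__main__":
    # Made-up counts: people who paid within thirty days in each group.
    rate_a, rate_b, z, p = two_proportion_z_test(120, 1000, 150, 1000)
    print(f"control={rate_a:.1%} reminder={rate_b:.1%} z={z:.2f} p={p:.3f}")
```

The normal approximation used here is reasonable when each group has a fair number of people who did and did not take the action; with very small counts, an exact test is usually a safer choice.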
It’s important to run your test with a large enough sample size that you will be able to detect an effect. If our sample size calculator warns you that your sample size is too low, you should try to find a way to increase it before continuing.
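As a rough illustration of why sample size matters, the sketch below uses a standard textbook approximation for the number of people needed per group to detect a difference between two percentages. It is an assumption-laden example, not the formula behind the tool's sample size calculator. The function name, the default 5% significance level, and the default 80% power are choices made here for illustration.

```python
import math
from statistics import NormalDist


def sample_size_per_group(base_rate, expected_rate, alpha=0.05, power=0.80):
    """Approximate per-group sample size for a two-sided test of two proportions.

    Hypothetical helper for illustration; assumes equal group sizes and
    the normal approximation.
    """
    if base_rate == expected_rate:
        raise ValueError("The two rates must differ to compute a sample size.")

    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power

    mean_rate = (base_rate + expected_rate) / 2
    null_term = z_alpha * math.sqrt(2 * mean_rate * (1 - mean_rate))
    alt_term = z_power * math.sqrt(
        base_rate * (1 - base_rate) + expected_rate * (1 - expected_rate)
    )
    n = (null_term + alt_term) ** 2 / (expected_rate - base_rate) ** 2
    return math.ceil(n)


if __name__ == "__main__":
    # Made-up scenario: 10% currently pay on time; we hope a reminder lifts that to 15%.
    print(sample_size_per_group(0.10, 0.15))
    # A smaller expected lift needs far more people per group.
    print(sample_size_per_group(0.10, 0.12))
```

The key pattern is that the required sample grows quickly as the expected difference shrinks, which is why a test of a subtle change can need many more participants than a test of a large one.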
Some common ways to increase sample size include:
We designed this tool to lead practitioners through the simplest possible example of how to run an A/B test. It is probably not the best fit for people who want to use more advanced statistical tools or run complex tests. In those situations, we recommend consulting an expert in program evaluation.
We want A/B testing to be more accessible. In much of the work we have done with private, public, and non-profit partners, we see demand for more evidence-based and data-driven program design. An increasing number of actors across the world are using A/B testing to get direct feedback from customers and optimize their service offerings. ideas42 has drawn on its experience running randomized tests with a wide range of organizations to design a tool that is easy for anyone to use.
For a selection of ideas42 projects, visit our project page. You can also learn more about behavioral innovations that have worked well in government and other sectors at the B-Hub, our online home for evidence-based behavioral solutions.