AB testing button colours — is it even worth the effort?

You may have heard an AB test success story that goes something like, “All we did was change the button colour from orange to blue and our conversion rate increased 30%!”

What is the likelihood that you could replicate this triumph with a similarly tiny change?

When designing an experiment, one of the key decisions is how many different variables to test. The answer to this question will depend on our objectives in running the experiment.

At one extreme, we could change the smallest thing possible, such as a button colour, a font size or even a single word. At the other extreme, we could design a completely new version of a page, with different imagery, text, action buttons and so on. Every other potential variation sits somewhere on the spectrum between those two extremes.

Let’s look at each of the extremes:

1. Testing micro-variables

Testing one small variable at a time is the most scientific approach, because any change in the result can be attributed to that variable alone. While this yields the most precise learnings, it has two obvious downsides:

  1. The more granular the change, the less likely it is to have a measurable effect.
  2. Each test takes time to run and thus there are opportunity costs with every experiment.
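To put rough numbers on both downsides, here is a minimal sketch of the standard sample-size approximation for comparing two conversion rates. The function name, the 5% baseline conversion rate, the significance level and the power are all assumptions chosen for illustration, not figures from a real test.

```python
from math import ceil
from statistics import NormalDist


def visitors_per_variant(baseline_rate, relative_lift, alpha=0.05, power=0.8):
    """Approximate visitors needed in EACH variant to detect a relative lift.

    Uses the common normal-approximation formula for a two-sided test on
    two proportions. All default values are illustrative assumptions.
    """
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # significance threshold
    z_beta = NormalDist().inv_cdf(power)           # desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


# Hypothetical 5% baseline conversion rate, three sizes of expected lift
for lift in (0.02, 0.10, 0.30):
    needed = visitors_per_variant(0.05, lift)
    print(f"{lift:.0%} relative lift: about {needed:,} visitors per variant")
```

Under these assumptions, the required traffic grows roughly with the inverse square of the lift you expect, which is why a tiny change can tie up a page for a long time before it shows a reliable result.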

2. Testing a completely new page design

Testing a whole new page also has its positives and negatives. While a new page design may produce a large change in the result, there is a good chance that we won’t know which elements on the page contributed most to the outcome.

When testing a new page design, there’s also a chance that some of the variables might be causing a hidden negative lift. That is, they are negating the positive impact that other variables are having. If you had tested each of the elements separately, you would have seen a negative lift, but by testing a whole page, an overall positive lift has hidden their effect.

Let’s look at an example where this could occur.

In the image below, you can see the control (A) and the experiment (B). We have changed the main image and the text in the experiment condition. Our goal is for users to click the ‘Browse catalogue’ link.

If we were to test the image change and text change both together and separately, we might receive the following results.

We can see that while the new image has persuaded more people to click the ‘Browse catalogue’ link, the text change has discouraged people from clicking the link.

There is another possible behaviour to consider. Even though the text change didn’t succeed on its own, it might make more sense to the user when paired with the new image, and so be more effective in combination.
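The arithmetic behind a hidden negative lift and a combination effect can be sketched as follows. The conversion rates here are hypothetical values invented for illustration (they are not the results from the example above), and the function and key names are likewise assumptions.

```python
# Hypothetical conversion rates for the four combinations of image and text.
# These numbers are invented for illustration only.
rates = {
    ("old_image", "old_text"): 0.100,  # control (A)
    ("new_image", "old_text"): 0.130,  # image change only
    ("old_image", "new_text"): 0.090,  # text change only
    ("new_image", "new_text"): 0.125,  # both changes (B)
}


def decompose_lift(rates):
    """Split the combined lift into per-change lifts and an interaction term."""
    control = rates[("old_image", "old_text")]
    image_only = rates[("new_image", "old_text")] - control
    text_only = rates[("old_image", "new_text")] - control
    combined = rates[("new_image", "new_text")] - control
    # Interaction: what the combination adds beyond the sum of the parts
    interaction = combined - (image_only + text_only)
    return {
        "image_only": image_only,
        "text_only": text_only,
        "combined": combined,
        "interaction": interaction,
    }


for name, lift in decompose_lift(rates).items():
    print(f"{name}: {lift:+.3f}")
```

With these made-up numbers, the combined page still beats the control, so testing A against B alone would hide the fact that the text change loses on its own. A positive interaction term would indicate that the text performs better alongside the new image than it does by itself.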

Nothing is ever so simple, right?

Defining the goals of the experiment

Commonly, experimenters will err on the side of caution and test only a small change. As previously described, this method has an opportunity cost that may not be appropriate for every situation.

Where we have prior evidence or a strong hunch about a change, we may choose to test several variables on the page at the same time to get results more quickly. This approach is a good way to validate a hypothesis, but it won’t give us as much granularity in our results.

It is important to define the goals of the test upfront and to always consider whether an AB test is the most appropriate next experiment to run. Other experiment types, such as customer interviews, can give an early gauge of whether your change might have an effect at a larger scale.

Happy experimenting!

Senior Product Manager @ Campaign Monitor