Why Retesting Your Experiments Is Crucial for Real Optimization Impact

Willem Isbrucker's Product Tank Hamburg talk about retesting experiments

Have you run A/B tests during your career as a product person?

Did you achieve significant results with at least one of them?

Did you reject the same idea when a colleague suggested it as an experiment six or more months after you ran it?
Not good. Wait, what?

What might seem counterintuitive at first is one of the most common mistakes conversion optimizers make throughout their careers.
The primary motivation for testing hypotheses with A/B tests is to gather data and avoid decisions based solely on gut feeling.
You also want to reduce waste by preventing colleagues in your company from putting effort into a testing idea you have already invalidated.
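To make "gathering data" concrete: a minimal sketch of how an A/B test result is typically evaluated, using a two-sided two-proportion z-test. The function name and the conversion numbers below are hypothetical, chosen purely for illustration; only the Python standard library is used.

```python
import math

def z_test_two_proportions(conv_a, n_a, conv_b, n_b):
    """Two-sided two-proportion z-test; returns (z, p_value).

    conv_a/n_a: conversions and sample size of the control,
    conv_b/n_b: conversions and sample size of the variant.
    """
    p_a, p_b = conv_a / n_a, conv_b / n_b
    # Pooled conversion rate under the null hypothesis (no difference)
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal CDF (via math.erf)
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value

# Hypothetical numbers: control converts 500/10,000, variant 580/10,000
z, p = z_test_two_proportions(500, 10_000, 580, 10_000)
print(f"z = {z:.2f}, p = {p:.4f}")  # call it significant if p < 0.05
```

The point of the article still applies: a p-value below 0.05 tells you which version won *under the conditions of that test*, not that the question is settled forever.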

But what if you’re causing more harm than good with this behavior? I recently had the chance to attend a talk by Willem Isbrucker, Senior Product Owner at Booking.com, during our Product Tank Hamburg event, and was taught a valuable lesson.

When he presented an A/B test setup he had run and asked the audience to pick the winner, 80% of the room predicted the right version (the variant). But then he revealed that the 20% who expected the control to win were also right, in a certain way: the same experiment had run a year earlier with the opposite result.
While you could assume a mistake in the analytics, the real reason was that the environment around the tested element (a search box) had evolved, so the element performed better with a different visualization.

And looking back at my own set of A/B tests, I too rejected retesting ideas because… well, we had already proved the hypothesis right or wrong.

So, whenever you are presented with a testing idea that ran six or more months ago, look at your product again and reassess the situation:
Did you make changes to the main navigation in the meantime? Did you test within a particular segment or across the whole user base? Would it make sense to target the change at a particular user group? Could the recently introduced pricing tier affect the test outcome?
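The segment question above is worth illustrating: a variant can lose overall yet win within a segment, so a retest targeted at that segment (or run after the traffic mix has shifted) can flip the conclusion. All the segment names and conversion numbers below are invented for illustration.

```python
# Hypothetical retest data split by segment.
# segment: (control conversions, control users, variant conversions, variant users)
segments = {
    "new_users":       (120, 2000, 180, 2000),  # variant wins here
    "returning_users": (480, 3000, 390, 3000),  # control wins here
}

def rate(conv, n):
    """Conversion rate as a fraction."""
    return conv / n

for name, (ca, na, va, nb) in segments.items():
    print(f"{name}: control {rate(ca, na):.1%} vs variant {rate(va, nb):.1%}")

# The aggregate result depends on the traffic mix: with returning users
# dominating, the control wins overall, even though the variant is
# clearly better for new users.
total_c = sum(s[0] for s in segments.values()) / sum(s[1] for s in segments.values())
total_v = sum(s[2] for s in segments.values()) / sum(s[3] for s in segments.values())
print(f"overall: control {total_c:.1%} vs variant {total_v:.1%}")
```

If your acquisition channels change and new users become the majority, rerunning the "already invalidated" variant may suddenly produce a winner.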

Try to find a balance between reducing waste and embracing the retesting of known hypotheses. You might be surprised by the impact you can generate.