A/B testing is an essential element of any product manager's playbook. However, having the freedom and flexibility to customize tests based on what the data is saying often requires significant time and effort, particularly in engineering resources. Optimizely offers a flexible approach to experimentation through feature testing, which provides more customization options without the additional development effort typically required to implement these optimizations. Megan Bubley, a Senior Product Manager at The Zebra, shares her experience working with Optimizely’s feature tests to create a results page, driven by actual user needs, where users can compare multiple auto insurance options, as well as how she customized the experience by device platform.
How The Zebra Utilized Feature Experiments To Increase Carrier Card Engagement by 10%
1. How The Zebra Utilized Feature
Experiments To Increase Carrier Card
Engagement by 10%
Megan Bubley
Sr. Product Manager
THE ZEBRA
2. The Zebra
We make insurance black and white.
The Zebra was founded with a single goal: to
simplify insurance, and in doing so we’ve
become the nation’s leading insurance
comparison site.
Our vision is to be a trusted partner for
consumers, helping them to better manage and
understand all their insurance needs through
every stage of life, even in the face of the
unexpected.
3. Carrier Card Redesign
Auto Results Page - Multivariate Feature Test
Our goal with this redesign was to rapidly test and learn. We
validated our design choices with usability tests but still had
questions around which combination of features would create
the best user experience and how performance would differ
based on device.
We wanted to test a combination of carrier card features but
were concerned about the time it would take to analyze test
performance and the amount of engineering bandwidth
required to introduce a new round of A/B tests.
In the past, running a multivariate experiment this robust would have required weeks of engineering time and drastically increased our delivery timelines. Through the use of feature testing, however, we were able to test multiple variants with limited engineering effort in a short amount of time!
4. Differentiation by Platform
Desktop vs. Mobile
For our first round of tests we wanted to focus on
information layout and creating consistency across
platforms so we could better assess how design
impacted user engagement.
We knew that historically, desktop and mobile had performed very differently in terms of engagement.
● Mobile typically performed worse than desktop when more elements were added to carrier cards.
● Since desktop carrier cards have more real estate to utilize, more content was typically added to them, creating a very different user experience across platforms.
5.
Step One: Work with design and engineering to identify the carrier card elements we wanted to test, along with the number of variants we wanted to start with.
Step Two: Create feature variables within Optimizely that allow us to turn carrier card elements on or off so we can customize the design.
Step Three: Once the test goes live, use real-time data to monitor test performance and come up with future variants to test based on carrier card conversion rates.
Development Strategy
7. Running The Test
The most challenging part of A/B testing has typically been saving space for optimizations, particularly when a feature is in testing and developers are no longer actively working on it.
In the past we have had issues with prioritizing future optimizations based on how we think a test will perform, in order to keep the momentum going with engineers.
With feature-based testing we were able to make changes to our test based on the data without any additional engineering effort; essentially, we had the ability to create a custom carrier card without any engineering resources.
Example of the feature test setup used for the carrier card
redesign.
9. Monitoring Performance & Use of Feature Flags
Achieving Our Long Term Vision
While we had hoped to implement a similar experience across desktop and mobile, we noticed that different variants performed better on different platforms: for example, rectangular logos performed better on mobile while square logos did better on desktop.
Since we wanted to come back to some of these findings and retest at a later date, we utilized the feature flag functionality once we had a winning variant to promote that feature to 100% of traffic without eating into our traffic allotment in Optimizely.
10. Final Outcomes
10% increase in carrier card engagement across platforms
60% reduction in engineering effort and development time
We ran the carrier card feature test for over 6 months and
worked closely with stakeholders to make changes to our
variants as our product changed.
By running this test as a feature experiment we were able
to implement changes to our variants without any
additional development effort—something that would have
taken us weeks to implement in the past.
We were also able to test different variants on mobile and
desktop which allowed us to differentiate our experience
by platform based on user behavior.
As a result of these tests we saw significant improvement
to carrier card engagement across platforms.
11. Benefits
1. Lots of flexibility when it comes to
testing multiple designs.
2. Ability to turn on and off feature
variables with the flip of a switch—no
engineering effort required.
3. Ability to move the experiment behind a
feature flag to reduce the strain on
traffic allotments.
4. Ability to retest features at a later date —
once new features have been
launched—to understand full impact of
the final product on the user experience.
Challenges
1. Lots of frontloading required to identify
the different feature variables and
testing strategy.
2. Additional engineering effort required up
front to set up the feature test; it
was more challenging to implement
than a straight A/B test.
3. Additional testing required to ensure all
feature variables display correctly in
production.
4. Multiple dashboards were needed to record
performance, and there was no
straightforward way to track performance in
Optimizely.
I'm a senior product manager and I've been at The Zebra for a little over a year and a half now. My main focus has been improving the user experience on our auto results page, and this presentation covers one of the features we launched through the use of Optimizely’s feature experiment tool.
[ID: CODE B04]
How The Zebra Utilized Feature Experiments To Increase Carrier Card Engagement by 10%
Speaker: Megan Bubley, Sr. Product Manager, The Zebra
SESSION TYPE: Customer
TRACK: Product, Engineering
The Zebra is an online insurance marketplace where users can search and compare multiple carriers to find a policy that works best for them.
We realize shopping for insurance can be overwhelming, so our goal is to take some of the stress and confusion out of that process and essentially make purchasing black and white for users.
Before we jump in I wanted to provide a bit of insight on our results page.
This is essentially our main product: users complete a funnel with questions about their driving history and personal background in order for us to return rates from carriers.
We have two types of carrier cards on our results page - ads and quotes - and each of these carrier cards has slightly different information - for example quotes return prices and ads do not.
Prior to this redesign we conducted a lot of user research around our existing results page experience to better understand some of our users' pain points and help us prioritize the features we planned to test as part of this experiment.
This test was focused on redesigning our results page carrier cards.
Prior to this redesign there were a lot of discrepancies between what content was visible on desktop vs. mobile - which I’ll get into a bit more later on in the presentation.
As part of this test we wanted the flexibility of testing a lot of different variants in a short amount of time.
Our goal with this experiment was to be able to rapidly test, learn, and iterate.
Even though we had a lot of user research guiding our testing strategy, we wanted the ability to incorporate both qual and quant findings into our testing decisions; if we could make changes to the test in real time based on what the data was telling us, we could create a carrier card that best served our users.
We knew we wanted to create as much consistency as possible for users no matter what platform they were shopping on, but also noted that historically, user behavior was pretty different when it came to engagement with carrier cards on desktop vs. mobile.
On mobile, when more content was added to carrier cards we typically saw a decrease in users who clicked on those cards - our assumption here was that more information on smaller screen sizes was overwhelming for users and increased the likelihood of decision fatigue.
On desktop we had more real estate to play around with; however, we felt the content we were displaying on the cards was not useful to users. There was a lot of duplicate information on these cards, and our assumption was that users would skim right over it and not find any real value in the way the information was presented.
While we had high hopes that we could find a happy medium across both platforms, we also understood that we might need to change up our testing strategy based on how the test was performing, and running this as a feature experiment provided us with the flexibility to do that without any additional engineering effort.
We had very specific goals we were trying to achieve with this test, and we wanted the freedom to move quickly but the flexibility to test a lot of different things, which can be incredibly challenging, especially when thinking about engineering resources and bandwidth.
We knew we wanted to run this as a feature experiment where we could essentially “create our own carrier card” by creating individual feature variables for all the components on the card.
This required a lot of thought and effort up front to ensure we knew all the different variables we wanted to test and that those variables would display correctly on the cards (more on this later).
Once we identified all the variables we wanted to test we were able to add them into Optimizely and begin developing the feature.
Here’s some insight into how we broke out the different feature variables on the carrier cards: creating a separate variable for each component gave us the ability to build our own carrier card, essentially with the flip of a switch.
As a company that is incredibly focused on A/B testing and on validating new designs by implementing them fully into our main product, one of our biggest challenges has been saving space and time to optimize tests based on the data we're gathering. As a result, we either pivot to a new feature without fully achieving our final vision for it, or we begin to prioritize optimizations based on how we think the test will perform. I think everyone will agree this is not an ideal way to operate.
The benefit of utilizing feature testing is that we have the ability to make changes to the product without engineering resources.
In the past this has always been the most difficult part of prioritization: thinking through the time it takes for devs to deinstrument a previous A/B test, build a new feature, and then spin up a new A/B test has taken weeks.
We also had to be cognizant of the time it took for a test to run and what developers would be working on while the test was in progress. Do we move on to a new feature and try to circle back to the one that's currently testing when we have the time? Do we wait for the test to finish before kicking off any new feature work, and is that the best use of developers' time? Needless to say, these questions made it difficult for us to feel confident in our next steps.
Feature testing took a lot of pressure off both product and engineering since the development work to support multiple variants was done up front; in doing so, we could essentially run dozens of A/B tests in a fraction of the time it would normally take us.
Here on the screen you can see an example of the variable key with all the different carrier card elements. We were able to turn on an element by switching the value to true and within minutes the new design would be visible in production.
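The variable key described above can be modeled with a short sketch. This is an illustrative Python stand-in, not the real Optimizely SDK, and the variable names (`show_logo`, etc.) are hypothetical, not The Zebra's actual keys; the pattern shown is the one from the talk — one boolean feature variable per card element, flipped to true or false without redeploying any code.

```python
# Illustrative sketch (not the real Optimizely SDK): each carrier card
# element is modeled as a boolean feature variable, so a "variant" is
# just a dict of switches that can be flipped without a deploy.
# Variable keys below are hypothetical stand-ins.

CARD_ELEMENTS = ["show_logo", "show_rating", "show_coverage_summary"]

def render_card(variables: dict) -> list:
    """Return the card elements a user in this variant would see.

    Any variable missing from the dict defaults to off, mirroring a
    feature whose elements start disabled.
    """
    return [e for e in CARD_ELEMENTS if variables.get(e, False)]

# Control shows only the logo; a test variant turns more elements on.
control = {"show_logo": True}
variant_b = {"show_logo": True, "show_rating": True,
             "show_coverage_summary": True}

print(render_card(control))    # ['show_logo']
print(render_card(variant_b))  # all three elements
```

In a real Optimizely setup the dict of switches would come from the experiment's variable values rather than being hard-coded, which is what lets a new "variant" go live within minutes of flipping a value.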
As I mentioned earlier, we decided to launch the carrier card redesign with 3 potential designs: control and 2 variants, which you can see on the screen. These are the first-round mobile designs we tested with, and even though it's not pictured on the screen, we followed the same strategy on desktop as well.
We planned to let the data help us make the call on what we did next so if control won we could assume adding more content to the cards was distracting and limit the number of changes we made in the next round of testing. If a variant won we planned to test that against a slightly different design to see if there was a potential to optimize the experience further.
Once the test is live, monitoring performance is pretty much the fun part. Our hope was that a similar trend would emerge across desktop and mobile, but that was ultimately not the case, so we ended up testing slightly different experiences across platforms in order to identify a winning variant.
Once we had a winner on both platforms, we utilized the feature flag functionality to promote the winner to 100% without having to deinstrument the experiment. The nice part about using feature flags is that they allowed us to keep our experiment intact in case we wanted to come back to it at a future date, once new features had been introduced, and retest without eating away at our traffic allocation in Optimizely.
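Promoting a winner behind a flag works roughly like the deterministic-bucketing sketch below. This is a hedged illustration, not Optimizely's actual implementation (Optimizely handles hashing and traffic allocation internally): each user hashes to a stable point in [0, 1), and raising the rollout percentage to 100% sends every user to the winning variant while the experiment definition stays untouched.

```python
import hashlib

def bucket(user_id: str) -> float:
    """Deterministically map a user id to a stable point in [0, 1)."""
    digest = hashlib.md5(user_id.encode()).hexdigest()
    return int(digest[:8], 16) / 0x100000000

def assign(user_id: str, rollout_pct: float) -> str:
    """Users whose bucket falls inside the rollout see the winning
    variant; everyone else keeps seeing control."""
    return "winner" if bucket(user_id) < rollout_pct else "control"

# At 100% rollout every user sees the winner; at 0% nobody does.
# Either way, no experiment code has to be deinstrumented.
assert all(assign(f"user-{i}", 1.0) == "winner" for i in range(1000))
assert all(assign(f"user-{i}", 0.0) == "control" for i in range(1000))
```

Because the hash is deterministic, each user sees the same variant on every visit, and dialing the percentage back down later (for a retest) re-exposes the original split without consuming fresh experiment traffic.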
This turned out to be a really smart move, as we did come back to this test a few times over the course of 6 months to test minor changes to the carrier cards. That ended up being really impactful, not only because we were able to make these adjustments without any additional engineering resources, but also because we were able to address stakeholder concerns and feedback in real time, which helped us build an even stronger collaboration process.
As I mentioned we ended up running this test for about 6 months with great success.
We were able to launch 3 different versions of carrier cards with our first round of testing to create a performance baseline we could build off of going forward.
We were able to come back to the experiment and make changes to carrier cards to better understand how individual elements on the cards impacted user engagement.
Once the test was live we were able to rapidly test and iterate with no additional engineering effort, something that would have taken us months of time and effort in the past, and as a result we were able to reduce development time by 60%.
This setup also had a really positive impact on stakeholder relationships and collaboration since we were able to loop them into our decision making process more regularly and provide them with more freedom to guide our iteration process.
Finally we were able to improve the experience for our users, particularly in terms of carrier card conversion rates, by using data to drive real time changes to our carrier cards and regularly change up the experience based on how that feature was performing even after we introduced new product features on our results page.
For my last slide I wanted to quickly run through some of the benefits and challenges of using feature testing for this carrier card redesign experiment.