Litmus - Experimentation platform

Litmus is an experimentation platform that enables teams in Gojek to run meaningful experiments and make data-informed decisions, bringing us closer to being the number one super app that customers love.

The Team


Data experimentation team, including 2 Product Managers, a Tech Lead, and 4 Engineers

My Role

Product Designer


Strategy, User research, Design, Testing and
Product collaboration


2021 - 2022



Gojek represents a better future, with more opportunities, for all our customers and partners in SEA and other developing nations. Yet to achieve the goal of becoming the operating system of developing nations, we need to be better and faster at making the right decisions for our customers and business. Making the right decisions has been extremely hard, especially under uncertain COVID conditions, where the patterns and assumptions we used in the past have become almost obsolete. This is where the importance of experimentation comes in.

Litmus is an experimentation platform that enables teams in Gojek to run meaningful experiments to increase our confidence in making the right decisions and bring us closer to being the number one super app that customers love.

About the product

Initially, teams in the Gojek ecosystem had no standardized way to run experiments, even if they wanted to. When product teams ran experiments individually, they did not necessarily follow established experimental practice, and the results usually ended up inconclusive.

Hence, Litmus was built with the aim of making Gojek a data-driven company. Litmus is used for experimenting when:

  • Rolling out a new feature
  • Making changes to an existing feature
  • Running A/B tests

My role

I was responsible for leading UX and UI across key parts of this product and others in the data experimentation team at Gojek.

I collaborated with product managers, business leaders, and engineering to define the vision for the product, and conducted design workshops within the team.

I owned multiple functionalities for the Litmus MVP design; a few of those outcomes are showcased in this case study.


The problem

The product was in a bad shape from an adoption perspective, and leadership wanted design to help shape the vision for the product. This case study focuses on how the product evolved from the MVP stage over time: examining existing methods of creating experiments and building a user-centric experimentation platform.


Before starting any design work, I worked closely with the product manager and tech lead to make sure we were all aligned on the goals of the product, the timeline we were working with, what was within scope, and what impact we were aiming for. This helped us define a clear agenda and a well-thought-out approach, keeping all stakeholders informed on the scope.


Over the next few weeks, I conducted primary and secondary research to understand our users' existing ways of working, their challenges, and their expectations of an experimentation platform. This involved 10-15 users, including Product Managers, Engineers, and Data Analysts from different teams.

My research encompassed:

  • Understanding the user goals and needs
  • Uncovering pain points with the existing user journey
  • Determining how the success of tasks is measured

Understanding user journey

To understand the user journey better, I created a blueprint diagram. This helped me understand how the current system works as a whole. While working on new features or improving existing ones, it guided me in identifying scenarios during design.

Empathizing with users

Through the research, we identified some of the key user pain points:


Not designed for everyone - Litmus was initially designed with engineers as the primary users. As a result, it is not user-friendly for others, making it difficult for PMs, analysts, and other stakeholders to use it to its full potential. It has become a tool for developers and is effectively inaccessible to everyone else.


Dependency on data analysts - Because it was primarily designed for developers, many tasks required PMs to rely on data analysts to know when to finish or modify an experiment, such as determining the required sample size and experiment reach. This created a lot of dependency on other stakeholders, who had their own workarounds for certain tasks.


Users blocked on service requests - Users had to file a service desk request every time they needed to access previous changes made to an experiment. This added a lot of friction between teams: it was difficult to track when an experiment was rolled out, who modified it, and when, and there was no easy way to check.

Key takeaways from research & workshop

During this process, we prepared a detailed CSAT survey report that helped us make informed decisions. The team then sat down again to brainstorm on the research insights and analyse these problem areas in depth.

After spending a month on the discovery phase, we had a lot of information to process and break down so it aligned with the business goals and target metrics:

  • Growth in active experiments
  • User adoption
  • Path to profitability

The design

Redefining the problem statement

How might we make the platform more inclusive, simple and useful for the new and existing users without having a lot of dependency on stakeholders?

Solution 1: Determining Sample Size

Before starting any experiment, the user must determine how many people need to participate in order to meet the desired statistical constraints. Users had devised their own workarounds to solve this problem.

This calculation needed to be centralised so that we could confidently state that the sample size for each experiment was calculated using a reliable method. We designed a sample size calculator so users can do this on Litmus itself, avoiding discrepancies in results.

Number of variants for the experiment (including control) - The number of variants an experiment has, including control. It helps in deciding the correct statistical function to apply for the sample size calculation.

Control group's historical proportion (%) - This is the historical performance of the primary metric for the experiment.

Minimum Detectable Effect (%) - The minimum relative change in baseline rate you would like to be able to detect.

Confidence Level (%) - The percent indicates how confident you want to be that your results are correct.
Confidence Level = 1 - Significance Level

Power (%) - Power is a measure of how well you can distinguish the difference you are detecting from no difference at all.

Result -

  • The number shown is per group. With two groups in an A/B test, we double this number.
  • You can change the x-axis from the left panel to see how changing one of the calculator parameters while keeping the others constant impacts the sample size requirement.
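To make the calculation concrete, here is a minimal sketch of the standard normal-approximation formula behind calculators like this one, for a two-proportion, two-sided test. The function name, defaults, and formula choice are illustrative assumptions on my part, not Litmus's actual implementation:

```python
import math
from statistics import NormalDist

def sample_size_per_group(baseline: float, mde: float,
                          confidence: float = 0.95, power: float = 0.80) -> int:
    """Per-group sample size for a two-proportion A/B test
    (two-sided z-test, normal approximation)."""
    p1 = baseline                        # control group's historical proportion
    p2 = baseline * (1 + mde)            # smallest treatment proportion worth detecting
    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # ~1.96 at 95% confidence
    z_beta = NormalDist().inv_cdf(power)                      # ~0.84 at 80% power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2
    return math.ceil(n)

# A 10% baseline with a 10% relative MDE needs roughly 15k users per group:
n = sample_size_per_group(baseline=0.10, mde=0.10)
```

Note how the sample size shrinks as the MDE grows: small effects are expensive to detect, which is exactly why exposing these parameters in the calculator UI matters.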

Solution 2: Experiment History

Since we allow users to edit experiment configurations even after creation, users wanted an audit log of these changes so they could compare them against any changes in experiment results at those times.

The design allows experiment configuration to be managed like code: version control gives complete visibility and helps build trust in the data. All changes made to an experiment are now easily visible, such as traffic changes, treatment changes, and rule additions or updates.
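The shape of such an audit log can be sketched as an append-only list of immutable entries. All names here (the entry fields, `record_change`) are hypothetical illustrations of the concept, not Litmus's real data model:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass(frozen=True)
class ChangeLogEntry:
    """One audit-log entry for an experiment configuration change."""
    experiment_id: str
    changed_by: str
    field_name: str          # e.g. "traffic", "treatment", "rules"
    old_value: str
    new_value: str
    changed_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

history: list[ChangeLogEntry] = []

def record_change(experiment_id: str, user: str,
                  field_name: str, old: str, new: str) -> ChangeLogEntry:
    """Append an immutable entry; entries are never edited or deleted."""
    entry = ChangeLogEntry(experiment_id, user, field_name, old, new)
    history.append(entry)
    return entry

# e.g. a PM bumps an experiment's traffic allocation:
record_change("exp-42", "pm@example.com", "traffic", "10%", "25%")
```

Because entries are frozen and append-only, anyone can line the log up against the metrics timeline and see which configuration change coincided with a shift in results, without filing a service desk request.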

Solution 3: Rules redesign

It's crucial that the product is obvious and that users can complete tasks without difficulty or hesitation. Defining segmentation rules in Litmus was very difficult because users had to read the documentation to understand the syntax and what they should add.

There was no easy way to validate whether the rules were correct and working as expected, and getting them right was a real pain for users, especially non-developers. Rules in Litmus were taken in the form of free-form strings, for example, iOS version > 15 or app version > 4.32. Even minor typos resulted in the rule not being applied.

Hence, we simplified the experience so that users won't have to worry about syntax, and designed the flow so the system handles validation for them.

In the new UI, we made sure that the product is obvious in nature and that users can complete tasks without any anxiety or friction. The old flow also provided no instant error validation, which made the experience even worse; the redesign validates rules as they are defined, minimising the chances of error.
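The underlying idea can be sketched as replacing free-form strings with a structured rule object that validates itself. The attribute catalog and field names below are assumptions for illustration; the real Litmus attribute list would come from its backend:

```python
from dataclasses import dataclass

OPERATORS = {">", ">=", "<", "<=", "==", "!="}
# Hypothetical attribute catalog, standing in for Litmus's real targeting attributes.
KNOWN_ATTRIBUTES = {"os", "os_version", "app_version"}

@dataclass(frozen=True)
class Rule:
    attribute: str
    operator: str
    value: str

    def validate(self) -> list[str]:
        """Return a list of validation errors; an empty list means the rule is valid."""
        errors = []
        if self.attribute not in KNOWN_ATTRIBUTES:
            errors.append(f"unknown attribute: {self.attribute!r}")
        if self.operator not in OPERATORS:
            errors.append(f"unknown operator: {self.operator!r}")
        return errors

# Structured rules replace free-form strings like "app version > 4.32",
# so a typo is caught instantly instead of silently disabling the rule:
ok = Rule("app_version", ">", "4.32").validate()     # no errors
bad = Rule("app versoin", ">", "4.32").validate()    # typo flagged immediately
```

Because the attribute and operator come from known sets, the UI can offer them as dropdowns and surface errors inline, which is exactly the "instant validation" the redesign aimed for.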


Challenges & Learnings

Unfortunately, one of our main challenges was the lack of resources and time day to day; we were stretched quite thin! Because of this, we often had to iterate on our designs to make them achievable within time and technical constraints while still providing value to our users.


We saw positive results from the above solutions in both quantitative and qualitative data, and started getting lots of feedback and feature requests from product groups wanting help with running experiments.

  • Monthly active experiments increased by 32% 
  • Increased CSAT score by 20% for selected attributes
  • Saved 40k hours on experiment analysis in 6 months