Success of a data science project has as much to do with product management as with data science. I see this in my work. It comes up in conversations with other practitioners. Yet most articles and talks about data science leave the product management topics out. These are slides from Data Science for Product Managers - a talk I gave to a group of product managers at TappedIn — a meetup that CleverTap runs in Santa Monica.
There are 2 big areas of data science - A for “analyze” and B for “build”.
A is product development informed by data. By now it has been adopted pretty widely. Having analytics, running A/B tests, and doing cohort and funnel analysis have become part of the product management culture.
The “build” kind of data science is about building smarts into the product itself and this is the kind I want to talk about.
Implementing some of this requires machine learning and it is important for product managers to understand the level of complexity of some techniques that apply to their products. However, when machine learning is discussed, too much emphasis is put on the algorithms.
More needs to be said about how a smart product gains humans’ trust and makes them feel good about using it.
Take an app that allows you to pay for parking. You fire it up, and it shows a few choices - start a new parking session, see your old sessions, and so on.
Choose “Start a new session”, go to the next screen, pick from several options there - select a parking zone. Done.
I would not give this a second thought on a desktop.
But when I actually use this app, I’m late, holding the phone in one hand, trying to pay for parking while running to the ferry.
I am running and fumbling with the phone and thinking - DON’T YOU KNOW ME?!
It’s a weekday morning, I am at the parking lot next to the ferry terminal, you have seen me here before. More than once.
Just give me one button - PAY NOW. And a small link to all the other features.
Every time a user has this “DON’T YOU KNOW ME?!” moment, it is an opportunity to make a product just a little bit smarter.
Smart products convert DON’T YOU KNOW ME?! into YOU GET ME!
Even when they don’t know my next step exactly, they reduce the search space.
Smarter products - new problems.
Complexity goes way beyond the algorithms.
Take the Nest smart thermostat - great visual design, easy to install, powered by machine learning that learns your preferences. It’s a good product, but even they can’t get it quite right.
We got it right around when our baby was born.
We both like it pretty cool, but my wife felt cold after the birth - exactly when Nest was learning our preferences.
Once it learned them, for some reason it was very tough to get it to adjust.
Another thing - when it turns the heater on, there is no indicator of whether it was a human in the house or the software. I am OK with correcting Nest. But my wife is not.
Making products smarter introduces probabilistic behavior.
Because probabilistic behavior feels kind of like life, you start having different expectations.
Northern California has some very hot days with cold mornings. On a day like that I would not turn the heater on in the morning. But Nest would. It just knows - get to 68 degrees. But it has no context - something that is easy and intuitive for a human is not easy for software.
Getting the relationship of the user with a smart product right is tricky.
Product managers are the best people in a company to get the tradeoffs right.
Just like a PM does not have to be a developer to manage a software product, she does not have to be a mathematician or a data scientist to manage a data product.
But it is necessary to understand some core concepts.
I'll use 2 data products to demonstrate some of these necessary concepts.
One is consumer facing.
Jawbone UP is a mobile app that accompanies the company’s fitness trackers.
A large group of users use the app to manage their diet by logging their meals.
Some obvious metrics - the number of users who attempt to log a meal, the percentage of completions, and the number of habitual users.
If you have ever used a food diary app, you know it is a pain - you have to enter every item you ate or search the database.
There was a lot of abandonment. The PM got together with the data science team to figure out what could be done.
What I would really want is to snap a picture of my plate and have the app figure out what is on it, including all the calories, etc. I have not seen anything like that work reliably yet.
What if we made data entry easier?
Users tend to log entire meals, and some items go together more often than others.
If you typed “cereal”, we can suggest a few items that you might have eaten with your cereal. Then, instead of typing, you just tap.
Many thousands of meals have been logged in the past - that’s where we can find the common food pairings.
This is a very cool visualization by Emi Nomura, a data scientist at Jawbone: from one food item you can trace its most frequent friends, so to speak.
The algorithm is pretty simple. Use historical data to compute, for every item, the 3 to 5 foods that co-occur with it unusually often.
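A minimal sketch of what “co-occur unusually often” can mean - the classic “lift” ratio, computed over a handful of made-up meal logs (the item names and counts here are purely illustrative, not Jawbone’s data):

```python
from collections import Counter
from itertools import combinations

# Toy meal logs - the real product would use many thousands of logged meals.
meals = [
    {"cereal", "milk", "banana"},
    {"cereal", "milk", "coffee"},
    {"burger", "french fries", "soda"},
    {"burger", "french fries"},
    {"salad", "spinach", "chicken"},
]

item_counts = Counter(item for meal in meals for item in meal)
pair_counts = Counter(pair for meal in meals
                      for pair in combinations(sorted(meal), 2))
n = len(meals)

def lift(a, b):
    """How much more often a and b co-occur than chance alone predicts."""
    together = pair_counts[tuple(sorted((a, b)))] / n
    return together / ((item_counts[a] / n) * (item_counts[b] / n))

def suggestions(item, k=3):
    """The top-k foods that co-occur with `item` unusually often."""
    partners = {other for meal in meals if item in meal
                for other in meal if other != item}
    return sorted(partners, key=lambda other: lift(item, other),
                  reverse=True)[:k]
```

Type “cereal” and `suggestions("cereal")` comes back with milk, banana, and coffee - one tap instead of three searches. Lift above 1 means the pair shows up more often than independence would predict, which is exactly the signal you want for suggestions.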
Monica Rogati, who used to be VP of Data at Jawbone, has a saying: “the biggest improvements were achieved by cleaning the data and understanding it deeply.” Yes, you could use a much more advanced algorithm, but this simple one can get you pretty far.
What items are logged frequently with “hamburger”?
There are dozens of different strings that users entered that all mean “hamburger”.
You need to combine them into one in order to get a strong signal.
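One illustrative way to start combining them - a crude normalizer that folds case, punctuation, whitespace, and trailing plurals into one canonical string (a real cleaning pipeline would need dictionaries and fuzzy matching; note the naive plural rule here would mangle words like “fries”):

```python
import re

def normalize(raw):
    """Collapse trivially different spellings into one canonical string."""
    s = raw.lower().strip()
    s = re.sub(r"[^a-z\s]", "", s)       # drop punctuation and digits
    s = re.sub(r"\s+", " ", s).strip()   # collapse whitespace
    if s.endswith("s") and len(s) > 3:   # crude plural stripping
        s = s[:-1]
    return s

# All of these collapse to the single string "hamburger".
variants = ["Hamburger", "hamburgers", "HAMBURGER!!", "  hamburger  "]
```

Four raw strings become one item, and the co-occurrence counts for “hamburger” get four times stronger.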
The improvements that you can get from cleaning your data are great.
The plot of the movie Big Short can be summarized as “guys clean a dataset, get rich”.
In the case of Jawbone meal logging, the biggest lift in performance came from realizing that breakfasts are different from other meals. Spinach in the morning was probably part of an omelette. Spinach at lunch was most likely a salad.
Sometimes, cleaning your data requires a good understanding of the domain you are working with.
Which properties of your data you do and don’t use is to a significant degree a product management decision.
For example, different cuisines disagree on which foods are best eaten together. Do you use this knowledge somehow? It depends on what you know about your users.
Here is the second data product. This one is B2B and works in the background.
Directly helps companies like Airbnb, LinkedIn, and Pinterest with on-demand customer support. When a user submits a support ticket, some of these are sent to Directly, which distributes them to a network of expert users who are ready to answer them. If an expert resolves a question successfully, they get paid and Directly takes a cut. Otherwise, the expert can reroute the ticket back to the customer’s call center.
When questions are created in the helpdesk, how do we find the ones that the expert users can (and want to) solve?
Initially, we relied on our customers to configure some categories that their users chose when they were filling out the support form.
Users are not great about categorizing their issues.
We tried keywords. Very cumbersome to manage.
We need to pick as many tickets as we can, but not create too much noise for the experts.
Solution: let us look at ALL your tickets as they come in, and a machine learning model will choose which ones get sent to the expert users.
Here is how it works:
The model is a classifier, and it needs examples to learn what a good ticket looks like. It can do so by watching how the experts respond to tickets they have seen earlier. If the experts take a ticket and resolve it successfully, it becomes a positive example. If they send the question back, or resolve it but the user reviews their answer negatively, the question becomes a negative example.
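A toy sketch of how such a classifier could learn from those examples - here a tiny Naive Bayes over bag-of-words, built from scratch on invented ticket texts (the actual model and features Directly uses are not described in this talk):

```python
import math
from collections import Counter

# Toy training data: (ticket text, label) pairs. The labels come from
# watching the experts: 1 = an expert resolved it successfully (positive
# example), 0 = sent back or rated negatively (negative example).
# The ticket texts are invented for illustration.
training = [
    ("how do I reset my password", 1),
    ("cannot log in to my account", 1),
    ("how do I change my profile photo", 1),
    ("refund my payment immediately", 0),
    ("legal complaint about another user", 0),
    ("my account was charged twice", 0),
]

def tokenize(text):
    return text.lower().split()

# Word frequencies per class - the guts of a tiny Naive Bayes classifier.
class_counts = Counter(label for _, label in training)
word_counts = {0: Counter(), 1: Counter()}
for text, label in training:
    word_counts[label].update(tokenize(text))
vocab = {w for counts in word_counts.values() for w in counts}

def score(text, label):
    """Log-probability of the text under one class, with add-one smoothing."""
    total = sum(word_counts[label].values()) + len(vocab)
    logp = math.log(class_counts[label] / len(training))
    for word in tokenize(text):
        logp += math.log((word_counts[label][word] + 1) / total)
    return logp

def route_to_experts(text):
    """True if the model thinks the expert users can resolve this ticket."""
    return score(text, 1) > score(text, 0)
```

Every expert resolution or bounce-back becomes a new labeled example, so the model keeps improving as the network handles more tickets.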
How do we know if a model is good?
When “normal software” breaks, it breaks with high visibility. An issue with machine learning is that it will ALWAYS give you an answer.
How do we compare models?
An obvious metric is accuracy - basically, the percentage of predictions that the algorithm gets right. However, in product data science this can be a very bad metric.
It depends on how balanced or unbalanced the classes you are predicting are.
Example: fraud detection, rare disease testing. If 0.1% of transactions are fraudulent, you can create a “very sophisticated” predictive model. When asked “Is this transaction fraudulent?” it will always say “no”. The accuracy of this model will be about 99.9%.
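The fraud example takes a few lines to reproduce (the transaction counts are made up to match the 0.1% rate above):

```python
# 100,000 transactions, 0.1% fraudulent - and a "model" that always says "no".
n = 100_000
labels = [1] * (n // 1000) + [0] * (n - n // 1000)  # 100 fraud, 99,900 legit
predictions = [0] * n                               # never flag anything

accuracy = sum(p == y for p, y in zip(predictions, labels)) / n
print(accuracy)   # 0.999 - looks impressive

# Recall on the fraud class tells the real story: it catches nothing.
caught = sum(1 for p, y in zip(predictions, labels) if y == 1 and p == 1)
recall = caught / sum(labels)
print(recall)     # 0.0
```

This is why class-aware metrics like precision and recall, not raw accuracy, are what you should ask your data science team about.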
Thinking through this is exactly the PM’s job. In this case you don’t need to know the math that underlies the predictive model.
How do we QA data products?
"Why did you show me ‘french fries’?" Well, because this is the item most frequently logged together with “burger”.
"Why did you decide that this transaction is fraudulent? Why did you decide that this customer support ticket is resolvable?"
The simpler the model, the more interpretable it is.
When a model is not easily interpreted but performs well, it’s your task to manage expectations.
If you start building data products, you will keep seeing a lot of companies that offer “machine learning in a box” or “machine learning as a service”.
Those are platforms that expose APIs where you can just send your data and they will do classification, predictions or recommendations.
It is tempting, because it looks like you don’t have to spend time and money building your own infrastructure.
But it is important to remember that machine learning itself is just a part of the project.
I am not saying don’t use them - just think carefully about what percentage of the work they actually save you.