How to be a Good Machine Learning PM by Google Product Manager

In this presentation you will learn:
-Machine Learning definition and the different types of problems it can solve
-Framework to decide if your specific problem could or should be solved with Machine Learning
-The role that a Product Manager plays in each part of the Machine Learning lifecycle

Published in: Technology

  1. 1. www.productschool.com How to be a Good Machine Learning PM by Google Product Manager
  2. 2. FREE INVITE Join 23,000+ Product Managers on
  3. 3. COURSES Product Management Learn the skills you need to land a product manager job
  4. 4. COURSES Coding for Managers Build a website and gain the technical knowledge to lead software engineers
  5. 5. COURSES Data Analytics for Managers Learn the skills to understand web analytics, SQL and machine learning concepts
  6. 6. COURSES Digital Marketing for Managers Learn how to acquire more users and convert them into clients
  7. 7. COURSES Blockchain for Managers Learn how to trade cryptocurrencies and build products using the blockchain
  8. 8. Ruben Lozano TONIGHT’S SPEAKER
  9. 9. Machine Learning for Product Managers Product School | Seattle | Oct 17, 2018
  10. 10. Ruben Lozano-Aguilera Product Manager Google Cloud
  11. 11. Agenda: (1) Overview: What is ML? (2) To ML or NOT to ML: When should I use it? (3) Let’s do ML: What is the ML lifecycle? (4) Communication: How should I partner with ML scientists?
  12. 12. Part 1. Overview: What is ML?
  13. 13. What is ML? Artificial Intelligence (1950s), Machine Learning (1980s), Deep Learning (2010s)
  14. 14. What is ML? Classical programming: Rules + Data → Answers. Machine learning: Data + Answers → Rules. ML workflow: Problem → Data → Algorithm → Model → Output. “The field of study that gives computers the ability to learn without being explicitly programmed.” Arthur Samuel, pioneer of AI research
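
The contrast on this slide can be sketched in a few lines of Python. This is a minimal illustration, not something from the talk: the trip-distance data, the 100 km cut-off, and the function names `is_long_trip_rule` and `learn_threshold` are all hypothetical.

```python
# Classical programming: a person writes the rule by hand.
def is_long_trip_rule(distance_km: float) -> bool:
    return distance_km > 100.0  # hand-picked threshold (assumption)


# Machine learning: the rule (here, a single threshold) is learned
# from data plus known answers.
def learn_threshold(distances, answers):
    """Return the cut-off that classifies the most labeled examples correctly."""
    best_threshold, best_correct = None, -1
    for candidate in sorted(distances):
        correct = sum((d >= candidate) == a for d, a in zip(distances, answers))
        if correct > best_correct:
            best_threshold, best_correct = candidate, correct
    return best_threshold


# Hypothetical labeled examples: distance in km and whether it was a "long trip".
distances = [5.0, 20.0, 60.0, 120.0, 300.0, 450.0]
answers = [False, False, False, True, True, True]

learned = learn_threshold(distances, answers)
print(learned)  # 120.0
```

In the first function the rule is an input supplied by a person; in the second, the rule is an output derived from examples of actual answers.
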
  15. 15. ML and Statistics: ML optimizes on predictive performance, while statistics places importance on interpretability and parsimony/simplicity. Terminology (Statistics | Simply put | ML):
      • Dependent/Response/Output Variable | The thing you’re trying to predict | Label or Target
      • Independent/Explanatory/Input Variable | The data that help you make predictions | Feature
      • Data Transformation | Reshaping data to get more value out of it | Feature Engineering
      • Variable/Subset Selection | Using the most valuable data | Feature Selection
  16. 16. What is ML? Supervised Learning: Regression (Quantity) and Classification (Category), with algorithms such as Linear, Ridge, Lasso, Trees, SVM, KNN. Unsupervised Learning: K-Means, PCA, Collaborative Filtering
  17. 17. Part 2. To ML or Not To ML: When should I use ML?
  18. 18. To ML when your problem… • Handles very complex logic • Scales up fast • Adapts in real time • Requires specialized personalization …and has existing examples of actual answers
  19. 19. Sample ML problems (Problem type | Description):
      • Ranking | Helping users find the most relevant thing
      • Recommendation | Giving users the thing they may be most interested in
      • Classification | Figuring out what kind of thing something is
      • Regression | Predicting a numerical value of a thing
      • Clustering | Putting similar things together
      • Anomaly | Finding uncommon things
      Example (Ranking): Ranking algorithm within Amazon Search
  20. 20. Sample ML problems. Example (Recommendation): Recommendations from Netflix; room suggestions from Google Calendar
  21. 21. Sample ML problems. Example (Classification): Product classification for Amazon catalog (High-Low Dress, Straight Dress, Striped Skirt, Graphic Shirt)
  22. 22. Sample ML problems. Example (Regression): Predicting sales for specific Amazon products (Seasonality | Out of stock | Promotions)
  23. 23. Sample ML problems. Example (Clustering): Related news from Google Search
  24. 24. Sample ML problems. Example (Anomaly): Fruit freshness, before and after (Good, Damage, Serious Damage, Decay)
  25. 25. To ML when your data… can be used, should be used, and is high quality. Attributes: Accessible, Available, Secure, Respects privacy, Fresh, Unbiased, Relevant, Representative
  26. 26. NOT to ML when your problem… • Can be solved by simple rules • Does not adapt to new data • Requires full interpretability • Requires 100% accuracy
  27. 27. NOT to ML when your data… cannot be used, should not be used, or is low quality. Attributes: Inaccessible, Unavailable, Unsecure, Privacy concerns, Stale, Biased, Irrelevant, Scarce or Incomplete
  28. 28. Exercise: To ML or Not To ML A. What apparel items should be protected by copyright laws? B. Which resumes should we prioritize to interview for our candidate pipeline? C. What products should be exclusively sold to Hispanics in the US? D. Which sellers have the greatest revenue potential? E. Where should Amazon build HQ2? F. Which search queries should we scope for the Amazon Fresh store?
  29. 29. Part 3. Let’s do ML! ML Lifecycle
  30. 30. What do you need for ML? People, Processes, Tools & Systems
  31. 31. People: Get the right people. Roles span a spectrum from Science (Math; Statistics; ML Algorithms) to Engineering (ML Libraries; Data Collection Tools; Programming Languages): ML Scientist, Applied Scientist, Research Scientist, Data Scientist, Business Intelligence Engineer, Data Engineer, Software Engineer, Dev Manager, Technical Program Manager
  32. 32. Processes: ML Lifecycle. (1) Formulate problem (2) Select and preprocess data (3) Feature engineering (4) Train, test, and tune models
  33. 33. Formulate the problem (step 1 of 4: PROBLEM, DATA, FEATURES, MODEL). What is the problem to solve? What is the measurable goal? What do you want to predict?
  34. 34. Select and preprocess data (step 2 of 4). Selecting: • Available • Missing • Discarding. Preprocessing: • Formatting • Cleaning • Sampling
  35. 35. Feature engineering (step 3 of 4). • Feature: Individual measurable property or characteristic of the phenomenon being observed • Goal: Use domain and data knowledge to develop relevant features from existing raw features in the data to increase the predictive power of ML. Techniques: Scaling, Decomposition, Aggregation
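
The three techniques named on this slide can be sketched as follows. This is an illustrative example only: the `orders` records, the field names, and the helper functions are hypothetical and do not come from the talk.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical raw order records.
orders = [
    {"customer": "A", "timestamp": "2018-10-01T09:30:00", "amount": 20.0},
    {"customer": "A", "timestamp": "2018-10-06T21:15:00", "amount": 60.0},
    {"customer": "B", "timestamp": "2018-10-03T14:00:00", "amount": 100.0},
]


def min_max_scale(values):
    """Scaling: map values onto the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def decompose_timestamp(ts):
    """Decomposition: split one raw timestamp into several simpler features."""
    dt = datetime.fromisoformat(ts)
    return {
        "hour": dt.hour,
        "day_of_week": dt.weekday(),  # Monday is 0
        "is_weekend": dt.weekday() >= 5,
    }


def total_per_customer(rows):
    """Aggregation: roll individual rows up into one value per customer."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["customer"]] += row["amount"]
    return dict(totals)


scaled_amounts = min_max_scale([o["amount"] for o in orders])
time_features = [decompose_timestamp(o["timestamp"]) for o in orders]
customer_totals = total_per_customer(orders)
```

Each helper turns a raw column into something a model can use more directly, which is where domain knowledge from the PM tends to matter.
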
  36. 36. Train, test and tune models (step 4 of 4). The data set is split into training data and test data; model training on the training data produces the ML model, which is then evaluated on the test data
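
A minimal sketch of the split described on this slide, assuming a simple random hold-out. The 20% test fraction, the fixed seed, and the function name `train_test_split` are illustrative choices, not the speaker's.

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of the rows and hold out a fraction for testing."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


data = list(range(10))  # stand-in for ten labeled examples
train, test = train_test_split(data)
print(len(train), len(test))  # 8 2
```

The model is fitted only on `train`; `test` is kept aside so that the evaluation reflects data the model has not seen.
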
  37. 37. Productionize: Integrate the ML solution with existing software and keep it running successfully over time. Considerations: Deployment environment, Data storage, Monitoring and maintenance, Security and privacy. Some great ML projects cannot be productionized due to high implementation costs or the inability to be tested in practice
  38. 38. Product Manager role in Machine Learning. ML Lifecycle: (1) Formulate problem (2) Select and preprocess data (3) Feature engineering (4) Train, test, and tune models
  39. 39. Formulate the problem
  40. 40. Formulate the problem (PM role). To formulate the problem, ask the following questions: What is the problem? What is the measurable goal? What do you want to predict? Note: The type of problem you solve defines the algorithm to use (e.g., clustering -> k-means)
  41. 41. Problem: You have not used ML before (PM role). To formulate the problem, ask the following questions:
      • What is the problem? Each week, the New Seller Success team onboards hundreds of new Sellers, and this group is expected to grow X% YoY. Personalized coaching time, however, doesn’t scale. As such, the team needed a way to accurately predict top performers to double down on.
      • What is the measurable goal? Increase revenue growth for coached (vs. non-coached) Sellers by X% at the end of six months.
      • What do you want to predict? The top 5% of net new Sellers six months after their launch.
  42. 42. Problem: You are already using ML (PM role). To formulate the problem, ask the following questions:
      • What is the problem? Units per order from category X in the US have remained flat YoY, and engagement has declined as measured by purchase-week frequency.
      • What is the measurable goal? Increase unit order rate for category X in the US by +X% within the next X months without affecting revenue.
      • What do you want to predict? Category X products that are more likely to be added to a customer cart based on items in the customer cart.
  43. 43. Select and preprocess data
  44. 44. Select and preprocess data. Selecting: • Available • Missing • Discarding. Preprocessing: • Formatting • Cleaning • Sampling • Labeling
  45. 45. Selecting data (PM role). Select the right datasets (Public, Custom, Internal) for the right purposes (train and tune models, replace flawed or outdated data, measure success)
  46. 46. Preprocessing data: Formatting (PM role). Format your data consistently, so you can work with it. Data Type | Possible Values | Example Usage:
      • Binary | 0, 1 (arbitrary labels) | binary outcome ("yes/no", "true/false", "success/failure", etc.)
      • Categorical or nominal | 1, 2, ..., K (arbitrary labels) | categorical outcome (specific blood type, political party, word, etc.)
      • Ordinal | integer or real number (arbitrary scale) | relative score, significant only for creating a ranking
      • Binomial | 0, 1, ..., N | number of successes (e.g. yes votes) out of N possible
      • Count | nonnegative integers (0, 1, ...) | number of items (telephone calls, people, molecules, etc.) in given interval/area
  47. 47. Preprocessing data: Cleaning (PM role). Clean means removing or fixing missing data. Data to clean may be Incomplete, Inconsistent, Noisy, or Biased
  48. 48. Preprocessing data: Cleaning (PM role). Clean means removing or fixing missing data. Sample table with missing cells (Keywords | Recognized Session? | Is Prime? | Customer ID | Device | # Searches | $):
      • iphone case Y N A000 3
      • iphone case N Mobile 5
      • iphone case Y N C000 Mobile 10 $ 20
      • iphone case Y Y D000 Mobile 2
      • iphone case N E000 Desktop 7 $ 5,000
      • iphone case N Mobile 4
      • iphone case N F000 Mobile 8 $ 30
      • iphone case N Y Tablet 4
      • iphone case Y Y B000 Mobile $10
      • iphone case Y N A000 Desktop 1 $ 90
      Strategies for missing values: Deletion; Dummy Substitution ($0); Mean Substitution (?); Frequent Substitution (Mobile); Lookup Substitution
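
Four of the strategies named on this slide can be sketched in plain Python. The rows below are hypothetical stand-ins (they are not the slide's table), and the helper names are illustrative; lookup substitution is omitted because it needs a second reference table.

```python
from collections import Counter
from statistics import mean

# Hypothetical rows; None marks a missing cell.
rows = [
    {"device": "Mobile", "searches": 5, "spend": None},
    {"device": "Mobile", "searches": 10, "spend": 20.0},
    {"device": None, "searches": 3, "spend": None},
    {"device": "Desktop", "searches": 7, "spend": 40.0},
]


def delete_incomplete(rows):
    """Deletion: drop every row that has at least one missing cell."""
    return [r for r in rows if all(v is not None for v in r.values())]


def fill_dummy(rows, column, dummy):
    """Dummy substitution: replace missing cells with a fixed placeholder."""
    return [{**r, column: dummy if r[column] is None else r[column]} for r in rows]


def fill_mean(rows, column):
    """Mean substitution: replace missing cells with the mean of observed values."""
    observed = [r[column] for r in rows if r[column] is not None]
    return fill_dummy(rows, column, mean(observed))


def fill_most_frequent(rows, column):
    """Frequent substitution: replace missing cells with the most common value."""
    observed = [r[column] for r in rows if r[column] is not None]
    return fill_dummy(rows, column, Counter(observed).most_common(1)[0][0])


complete_only = delete_incomplete(rows)
with_zero_spend = fill_dummy(rows, "spend", 0.0)
with_mean_spend = fill_mean(rows, "spend")
with_common_device = fill_most_frequent(rows, "device")
```

Each strategy trades something away: deletion shrinks the data set, while the substitutions keep every row but inject values that were never observed.
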
  49. 49. Preprocessing data: Sampling (PM role). Sampling chooses representative data to solve your problem. Strategies: Random, Stratified. Issues: Seasonality, Trends, Leakage, Biases
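
The two sampling strategies can be contrasted with a short sketch. The device mix (90 Mobile rows, 10 Tablet rows), the 10% fraction, and the function names are assumptions made for illustration.

```python
import random
from collections import defaultdict


def random_sample(rows, k, seed=0):
    """Random sampling: every row has the same chance of being picked."""
    return random.Random(seed).sample(rows, k)


def stratified_sample(rows, key, fraction, seed=0):
    """Stratified sampling: sample the same fraction from each group."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    sample = []
    for members in groups.values():
        k = max(1, round(len(members) * fraction))  # keep at least one per group
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical population: 90% Mobile, 10% Tablet.
rows = [{"id": i, "device": "Mobile"} for i in range(90)]
rows += [{"id": i, "device": "Tablet"} for i in range(90, 100)]

simple = random_sample(rows, 10)
stratified = stratified_sample(rows, "device", 0.1)
tablet_count = sum(r["device"] == "Tablet" for r in stratified)
```

A purely random sample of 10 rows may contain no Tablet rows at all, whereas the stratified sample always preserves the group proportions.
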
  50. 50. Preprocessing data: Unintended bias (PM role). Sampling chooses representative data to solve your problem. Examples: Where to offer Prime Free Same-Day Delivery? Auto labeling images
  51. 51. Preprocessing data: Labeling (PM role). Labeling is tagging or classifying your data. Manual, Automated, Biases. Auditors, Incentives, Plurality, Metrics, Gold Standards
  52. 52. Feature engineering
  53. 53. Feature engineering (PM role). Feature engineering develops relevant features from existing raw features. Terminology (ML | Statistics | Simply put):
      • Label or Target | Dependent/Response/Output Variable | The thing you’re trying to predict
      • Feature | Independent/Explanatory/Input Variable | The data that help you make predictions
      • Feature Engineering | Data Transformation | Reshaping data to get more value
      • Feature Selection | Variable/Subset Selection | Using the most valuable data
  54. 54. Train, test and tune models
  55. 55. Train, test and tune models (PM role). Models must be trained, tested, and tuned. The data set is split into training data and test data; model training produces the ML model
  56. 56. How do you evaluate the model? Regression (Continuous) • Root-mean-squared error • R-squared Classification (Categorical) • Accuracy
  57. 57. How do you evaluate the model? Regression (Continuous) • Root-mean-squared error • R-squared Classification (Categorical) • Accuracy • Precision and recall
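
The regression and accuracy metrics listed on this slide have standard definitions, sketched below. The example values are made up for illustration.

```python
from math import sqrt


def rmse(actual, predicted):
    """Root-mean-squared error: typical size of a prediction error."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sqrt(sum(squared) / len(squared))


def r_squared(actual, predicted):
    """R-squared: share of the variance in the actual values that is explained."""
    mean_actual = sum(actual) / len(actual)
    ss_residual = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_total = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_residual / ss_total


def accuracy(actual, predicted):
    """Accuracy: fraction of predictions that match the true category."""
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical regression example (e.g. weekly unit sales).
sales_actual = [10, 20, 30, 40]
sales_predicted = [12, 18, 33, 39]

# Hypothetical classification example.
labels_actual = ["cat", "dog", "cat", "dog"]
labels_predicted = ["cat", "dog", "dog", "dog"]

print(round(rmse(sales_actual, sales_predicted), 3))       # 2.121
print(round(r_squared(sales_actual, sales_predicted), 3))  # 0.964
print(accuracy(labels_actual, labels_predicted))           # 0.75
```
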
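  58. 58. Precision and Recall. Confusion matrix (Prediction vs. True State): predicted Cancer and truly Cancer = True Positive; predicted Cancer and truly No Cancer = False Positive; predicted No Cancer and truly Cancer = False Negative; predicted No Cancer and truly No Cancer = True Negative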
  59. 59. Precision and Recall. Precision (Quality) = Correct True Predictions / All True Predictions = TP / (TP + FP). What proportion of positive identifications was actually correct?
  60. 60. Precision and Recall. Recall (Quantity) = Correct True Predictions / All True Cases = TP / (TP + FN). What proportion of actual positives was identified correctly?
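
The two formulas can be checked with a short worked example. The labels below are hypothetical (1 = Cancer, 0 = No Cancer), and the function names are illustrative.

```python
def confusion_counts(actual, predicted):
    """Count true/false positives and negatives for binary labels (1 = positive)."""
    tp = sum(a == 1 and p == 1 for a, p in zip(actual, predicted))
    fp = sum(a == 0 and p == 1 for a, p in zip(actual, predicted))
    fn = sum(a == 1 and p == 0 for a, p in zip(actual, predicted))
    tn = sum(a == 0 and p == 0 for a, p in zip(actual, predicted))
    return tp, fp, fn, tn


def precision(tp, fp):
    """TP / (TP + FP): how many positive predictions were right."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    """TP / (TP + FN): how many actual positives were found."""
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical true states and model predictions for eight patients.
true_state = [1, 1, 1, 0, 0, 0, 0, 1]
prediction = [1, 1, 0, 1, 0, 0, 0, 0]

tp, fp, fn, tn = confusion_counts(true_state, prediction)
print(tp, fp, fn, tn)             # 2 1 2 3
print(round(precision(tp, fp), 3))  # 0.667
print(recall(tp, fn))               # 0.5
```

Here two of the three positive predictions are correct (precision 2/3), but only two of the four actual cancer cases are caught (recall 1/2).
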
  61. 61. Precision and Recall (chart): precision plotted against recall, each axis running from 0 to 100%
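
One common way to see the tension between precision and recall is to vary the decision threshold of a scoring model. The scores, labels, and thresholds below are hypothetical; this is a sketch of the general trade-off, not a reproduction of the slide's chart.

```python
def precision_recall_at(threshold, scores, actual):
    """Classify score >= threshold as positive, then compute precision and recall."""
    predicted = [1 if s >= threshold else 0 for s in scores]
    tp = sum(a == 1 and p == 1 for a, p in zip(actual, predicted))
    fp = sum(a == 0 and p == 1 for a, p in zip(actual, predicted))
    fn = sum(a == 1 and p == 0 for a, p in zip(actual, predicted))
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical model scores and true labels (1 = positive).
scores = [0.95, 0.80, 0.60, 0.40, 0.20]
actual = [1, 1, 0, 1, 0]

strict = precision_recall_at(0.9, scores, actual)   # few, confident positives
lenient = precision_recall_at(0.3, scores, actual)  # many positives

for threshold in (0.9, 0.5, 0.3):
    p, r = precision_recall_at(threshold, scores, actual)
    print(f"threshold={threshold}: precision={p:.2f}, recall={r:.2f}")
```

With these numbers the strict threshold gives perfect precision but finds only one of three positives, while the lenient threshold finds all positives at the cost of a false alarm. Which end to favor is a product decision about the relative cost of false positives and false negatives.
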
  62. 62. Part 4. Communication: How can I best partner with scientists?
  63. 63. How can I best partner with scientists? Roles: ML Scientist, Applied Scientist, Research Scientist, Data Scientist, Business Intelligence Engineer, Data Engineer, Software Engineer, Dev Manager, Technical Program Manager
  64. 64. How can I best partner with scientists? Treat your ML project as a partnership “A PM from an ML project I worked on basically threw the requirements over the fence to me and was mostly unavailable. To meet timelines, I kept moving forward. Unfortunately, the deliverable at the end of the three-month project, though aligned with initial business requirements, was not what the PM wanted and didn’t meet the need. The model never made it into production and we really didn’t gain any learnings.”
  65. 65. How can I best partner with scientists? Treat your ML project as a partnership Have a clear problem, hypothesis and success metric “PMs who come prepared with a clear, preferably data-driven, problem and hypothesis will have a much more productive discussion with me than otherwise. The problem definition need not be perfect, but I do want to understand what’s been tried, why it isn’t working and what we’re aiming for.”
  66. 66. How can I best partner with scientists? Be willing to make tradeoffs Treat your ML project as a partnership Have a clear problem, hypothesis and success metric
  67. 67. How can I best partner with scientists? Be willing to make tradeoffs • Time vs Quality • White Box vs Black Box • False Positives vs False Negatives • Go vs No-Go Metrics
  68. 68. How can I best partner with scientists? • Help get data and explain it • Scientists are not Software Engineers • ML creates tech debt • Be considerate of scientist time and momentum
  69. 69. Thank you!
  70. 70. www.productschool.com Part-time Product Management, Coding, Data, Digital Marketing and Blockchain courses in San Francisco, Silicon Valley, New York, Santa Monica, Los Angeles, Austin, Boston, Boulder, Chicago, Denver, Orange County, Seattle, Bellevue, Toronto, London and Online