Machine Learning with GraphLab Create

Presented by Neel Kishan.

Published in: Technology

  1. Dato Confidential. Neel Kishan, Technical Sales Lead, neel@dato.com
  2. Hello, my name is Neel Kishan, Technical Sales Lead (former neuroscientist, GPU programmer, Eagle Scout, Chicago sports fan). neel@dato.com. Let's schedule a time to talk: https://calendly.com/dato-neel
  3. We empower developers to create intelligent applications with real-time machine learning services quickly and easily. Diagram labels: Intelligent Applications; Dato Platform; GraphLab Create; Dato Predictive Services; Machine Learning Lifecycle.
  4. Teams have found ways to build intelligent applications: recommenders, lead scoring, churn prediction, multi-channel targeting, auto-summarization, fraud detection, intrusion detection, demand forecasting, data matching, failure prediction.
  5. Why do these projects take so long? • Lengthy code rewrites for scalable production services • Mundane tasks to integrate libraries, transform data to specific formats, fill in missing values, etc. • Many tools are just slow
  6. Challenges for developing intelligent apps • Algorithm-centric APIs create confusion and a steep learning curve • Understanding models has been a craft passed on only through tribal knowledge • Production services are hard to maintain and manage
  7. Dato Machine Learning: built to rapidly deliver intelligent applications • Intuitive APIs: easy to learn, with smart defaults so your first application comes together fast • Deploy instantly as REST: eliminates the lengthy rewrites to integrate and serve live, at scale • Integrated libraries for any data: deep learning, graphs, text, and images on a common scalable data structure eliminate all the glue code and context switching
  8. What makes Dato special?
  9. The Dato Machine Learning Platform, on your infrastructure. Diagram labels: Develop, Train, Deploy Models, Serve (REST API), Monitor, Experiments, Feedback; GraphLab Create & Dato Distributed; Dato Predictive Services. GraphLab Create & Dato Distributed: • Creating models • Data engineering • Evaluation & visualization. Predictive Services: • Serving models • Live experimentation • Model management
  10. Scalable Data Structures for Machine Learning • SFrame: on-disk, columnar, partitioned table • SGraph: graph structure composed of multiple tables • TimeSeries: table with a time index. [Figure: example tables with columns User, Com., Title, Body, User, Disc.]
  11. High performance machine learning. Faster algorithms accelerate teams: recommenders, deep learning & images, graph analytics. [Chart: test error (0.60% to 0.85%) vs. time (0 to 12 hr); series label "H2O.ai: 10 machines/80 cores"; annotation "Fails to complete on other systems!"]
  12. Intuitive API: easily create a live machine learning service. Create a recommender: 5 lines of code, toolkit with auto selection, deploy in minutes.

       import graphlab as gl
       # Load the ratings table into an SFrame
       data = gl.SFrame.read_csv('my_data.csv')
       # Train a recommender; the toolkit selects the model
       model = gl.recommender.create(data, user_id='user', item_id='movie', target='rating')
       recommendations = model.recommend(k=5)
       # Deploy the trained model as a live service
       cluster = gl.deploy.load('s3://path')
       cluster.add('servicename', model)
  13. Dato Machine Learning Toolkits. Applications: • recommender • sentiment_analysis • churn_predictor • data_matching • pattern_mining • anomaly_detection. Fundamentals: • regression • classifier • nearest_neighbors • clustering • deeplearning • text_analytics • graph_analytics. Utilities: • model_parameter_search • cross_validation • evaluation • comparison • feature_engineering. Join us April 7th for a webinar on Deep Learning: Image Similarity and Beyond
  14. Demo of GraphLab Create (GLC) & Predictive Services (PS)
  15. Deployment scenarios
  16. Neel Kishan, Technical Sales Lead, neel@dato.com
  17. Appendix and Supporting Material
  18. Dato is becoming the backbone of intelligent applications for 80+ customers • Commercialization of Carnegie Mellon ML Project founded by Professor Carlos Guestrin in 2013 • Vibrant user community numbering 40,000+ from Coursera and open source projects • Major customers in retail, finance, media, and software
  19. Appendix: Deployment Scenarios & Pricing
  20. Machine Learning Deployment Options • Dato Predictive Services • Batch write of predictions • Embedded process or script • Export (e.g. PMML)
  21. Pricing • Subscription license which includes support and upgrades • Licensed by user for Create and by machine for production use • Training & technical services also available
  22. Use Cases
  23. Our customers are leading the creation of intelligent applications
  24. Quantifying the value: Fastest to Production & Reduced Operational Cost • Built a 90% accurate sentiment analyzer for hotel reviews after 30 minutes of trying Dato's GraphLab Create • Created an efficient pipeline (40 mins in Dato vs. 33 days in R) with 46% lift in accuracy • "[Dato's] GraphLab Create™ gives us easy access to some of the most advanced machine learning and this lets us iterate on our ideas faster" • Simplify the process to develop and deploy internal services for SalesForce PDS and adjacent teams • Reduced hundreds of tools to manage, complexity of solution, and development time • Achieved in 2 days with Dato's GraphLab Create what took 2 weeks in R • Dropped concept to deployment from months to minutes • Replace a heuristic heavy job ranking system to improve job search relevance • Developed in weeks with significant increase in clickthrough after years of no growth
  25. Customer Success Story: Fraud Detection and Security. "Merchant intelligence for safer, more profitable commerce." Others like Alan & G2 Web Services. WHO: Alan Krumholz, Principal Data Scientist. INSPIRATION: Score merchants based on their web presence and actions to help their banking customers identify fraudulent merchants. VALUE: Accelerate business decisions, reducing manual intervention required and minimizing false positives. OUTCOME: Achieved in 2 days with GraphLab Create what took two weeks in R. Dropped deployment from months to minutes.
  26. Customer Success Story: Data Matching. "Fast, free, thorough home search." Others like Nick & Zillow. WHO: Nicholas McClure, Senior Data Scientist. INSPIRATION: Build a service that matches property listings across many inbound data feeds and collapses them to a single most accurate listing. VALUE: Data & listing quality is critical to Zillow's core product. OUTCOME: Created an efficient pipeline (40 mins in GLC vs. a 33-day R pipeline) with much higher accuracy (95%, up from 65%).
  27. Customer Success Story: Recommenders. They are the site for "Advice and support on pregnancy and parenting." Others like Shelley & BabyCenter. WHO: Shelley Klopp, DBA & Chief Architect. INSPIRATION: Build and deploy their first recommender to increase session engagement by recommending relevant content. VALUE: Initial model increased average session by multiple page views. OUTCOME: First prototype built in < 1 week. Ongoing model experimentation is increasing engagement.
  28. Customer Success Story: Sentiment and Text Analysis. "Get hired. Love your job." Others like Marcos & Glassdoor. WHO: Marcos Sainz, Lead Machine Learning Engineer. INSPIRATION: Replace a heuristic heavy job ranking system with an ML driven system to improve job search relevance. VALUE: More relevant jobs led to happier users and higher clickthrough. OUTCOME: Concept to production in weeks.
  29. Customer Success Story: Image Analytics and Deep Features. "Smart waste management." Others like Ben & Compology. WHO: Ben Chehebar, Co-founder/Lead of Product. INSPIRATION: Use machine learning to predict how full dumpsters are. VALUE: This allows them to augment their human classification using Mechanical Turk and to scale their operations. OUTCOME: Concept to deployed service in less than a month, with accuracy as good as or better than the humans.
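
Slide 12 trains a recommender with `gl.recommender.create(...)` and asks it for `recommend(k=5)`. The slides do not say which model the toolkit selects, so the following is only a minimal pure-Python sketch of one common approach, item-to-item cosine similarity, to illustrate what "top-k recommendations for a user" means. The function names `fit_item_similarity` and `recommend` and the sample ratings are hypothetical and are not part of GraphLab Create.

```python
from collections import defaultdict
from math import sqrt


def fit_item_similarity(rows):
    """Build cosine similarities between items from (user, item, rating) rows.

    Illustrative stand-in only; not GraphLab Create's implementation.
    """
    by_user = defaultdict(dict)  # user -> {item: rating}
    by_item = defaultdict(dict)  # item -> {user: rating}
    for user, item, rating in rows:
        by_user[user][item] = float(rating)
        by_item[item][user] = float(rating)

    # Length of each item's rating vector, used to normalise dot products.
    norms = {
        item: sqrt(sum(r * r for r in users.values()))
        for item, users in by_item.items()
    }

    sim = defaultdict(dict)
    items = sorted(by_item)
    for i, a in enumerate(items):
        for b in items[i + 1:]:
            shared = by_item[a].keys() & by_item[b].keys()
            denom = norms[a] * norms[b]
            if not shared or denom == 0:
                continue  # no overlapping users, or an all-zero vector
            dot = sum(by_item[a][u] * by_item[b][u] for u in shared)
            sim[a][b] = sim[b][a] = dot / denom
    return dict(by_user), dict(sim)


def recommend(by_user, sim, user, k=5):
    """Return up to k unseen items, ranked by similarity-weighted ratings."""
    seen = by_user.get(user, {})
    scores = defaultdict(float)
    for item, rating in seen.items():
        for other, s in sim.get(item, {}).items():
            if other not in seen:
                scores[other] += s * rating
    # Sort by descending score; break ties alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:k]]


# Hypothetical sample data with the same columns as slide 12 (user, movie, rating).
rows = [
    ("ann", "Alien", 5), ("ann", "Blade Runner", 4),
    ("bob", "Alien", 5), ("bob", "Blade Runner", 5), ("bob", "Heat", 2),
    ("cat", "Heat", 5), ("cat", "Casino", 4),
]
by_user, sim = fit_item_similarity(rows)
print(recommend(by_user, sim, "ann", k=5))
```

A user with no rating history gets an empty list here; a production toolkit would typically fall back to something like popularity, which this sketch leaves out.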