Dato aims to accelerate the creation of intelligent applications by making sophisticated machine learning as easy as "Hello world." The company provides an integrated machine learning platform that handles data engineering, advanced ML techniques, and deployment of models as predictive services. This allows small teams to be highly productive in building intelligent applications like recommenders, fraud detection, and personalized medicine. Dato's platform provides out-of-core computation, tools for feature engineering, rich data type support, and scalable models to help customers in various industries rapidly iterate and deploy ML applications.
2. Dato Confidential
Hello, my name is…
Shawn Scully
scully@dato.com
Director of Product
(Physicist, Cleantech Geek, Data Scientist, Urban Farmer)
I
Intelligent Applications
5. Dato Confidential
Business must be intelligent
Machine learning applications
• Recommenders
• Fraud detection
• Ad targeting
• Financial models
• Personalized medicine
• Churn prediction
• Smart UX (video & text)
• Personal assistants
• IoT
• Social nets
• …
Last decade: Data management
Last 5 years: Traditional analytics
Now: Intelligent apps
?
7. Dato Confidential
Systems
Elastic, scalable
People
Data scientist
Challenge today: Path from inspiration to production
Scale
Prototyping
Data engineering is painful
• Limited by system memory
• Data munging & feature eng.
• Manipulate complex data types
Data intelligence is hard
• Models don’t scale
• No task-oriented ML
• Algos trapped in papers
Production is fragile
• Build custom services & API
• Write new code to scale
• Model management
Inspiration
Data Intelligence
Data Engineering
Production
9. Dato Confidential
We make small teams extremely productive.
A developer (former DBA) built and deployed their first recommender to increase community engagement (and therefore ad revenue).
A small team of developers built and deployed a recommender in 1/5 the time of previous efforts, with higher performance, to increase sales.
A small team of data scientists is iterating on models more rapidly to improve a state-of-the-art music experience for its users.
A small team is iterating quickly to improve personalization (and increase revenue) in their daily deals.
A two-person team iterated on and deployed a better job search ranking that uses text to increase clicks and therefore revenue.
11. Dato Confidential
• Out-of-core computation
• Tools for feature engineering
• Rich data type support
• Models built for scale
• App-oriented toolkits
• Advanced ML & Extensible
• Deploy models as low-latency REST services
• Same code for distributed computation
• Elastically scale up or out with one command
• Job monitoring & model management
• Deploy existing Python code & models
• Run on AWS EC2 or Hadoop YARN
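Out-of-core computation, the first bullet above, means working on data that is larger than system memory by streaming it from disk. The sketch below illustrates the general idea with the Python standard library only. It is not how Dato's SFrame is implemented; the helper names (`streaming_mean`, `write_demo_file`) and the demo file layout are assumptions made up for this example.

```python
import csv
import os
import tempfile


def streaming_mean(path, column):
    """Compute the mean of one CSV column without loading the whole file."""
    total = 0.0
    count = 0
    with open(path, newline="") as f:
        # csv.DictReader yields one row at a time, so memory use stays
        # roughly constant no matter how large the file is.
        for row in csv.DictReader(f):
            total += float(row[column])
            count += 1
    if count == 0:
        raise ValueError("no rows in %s" % path)
    return total / count


def write_demo_file(n_rows):
    """Write a hypothetical ratings file to a temp location; return its path."""
    fd, path = tempfile.mkstemp(suffix=".csv")
    with os.fdopen(fd, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["user", "rating"])
        for i in range(n_rows):
            # Ratings cycle through 1..5 so the true mean is easy to check.
            writer.writerow(["u%d" % i, i % 5 + 1])
    return path


demo_path = write_demo_file(1000)
mean_rating = streaming_mean(demo_path, "rating")
os.remove(demo_path)
print(mean_rating)
```

The same pattern (keep only running aggregates in memory, read the data in a single pass) extends to counts, sums, and other statistics that can be updated incrementally.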
GraphLab Create: Create Engine, SFrame, SGraph, Canvas, Machine Learning Toolkits, SDK
Dato Distributed: Distributed Engine, Job Client, Direct, Job Mgmt
Dato Predictive Services: Predictive Engine, REST Client, Direct, Model Mgmt
The Dato Machine Learning Platform
12. Dato Confidential
Sophisticated ML made easy - Toolkits
• Recommender
• Image search
• Sentiment analysis
• Data matching
• Auto tagging
• Churn predictor
• Object detector
• Product sentiment
• Click prediction
• Fraud detection
• User segmentation
• Data completion
• Anomaly detection
• Document clustering
• Forecasting
• Search ranking
• Summarization
• …
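import graphlab as gl
# Load the ratings data into an SFrame
data = gl.SFrame.read_csv('my_data.csv')
# Train a recommender on (user, movie, rating) rows
model = gl.recommender.create(data,
                              user_id='user',
                              item_id='movie',
                              target='rating')
# Top 5 recommendations for each user
recommendations = model.recommend(k=5)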
Principles:
• Get started fast
• Rapidly iterate
• Combine for new apps
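To make the recommender toolkit on this slide less of a black box, here is a small sketch of one classic technique, item-to-item co-occurrence, using only the Python standard library. It is not GraphLab Create's algorithm or API; the interaction data and the function names (`build_cooccurrence`, `recommend`) are hypothetical and exist only for illustration.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical (user, item) interactions; not from the original talk.
interactions = [
    ("ann", "matrix"), ("ann", "inception"), ("ann", "alien"),
    ("bob", "matrix"), ("bob", "inception"),
    ("cat", "matrix"), ("cat", "alien"), ("cat", "up"),
    ("dan", "inception"),
]


def build_cooccurrence(pairs):
    """Count how often each pair of items was consumed by the same user."""
    items_by_user = defaultdict(set)
    for user, item in pairs:
        items_by_user[user].add(item)
    counts = defaultdict(lambda: defaultdict(int))
    for items in items_by_user.values():
        for a, b in combinations(sorted(items), 2):
            counts[a][b] += 1
            counts[b][a] += 1
    return items_by_user, counts


def recommend(user, items_by_user, counts, k=5):
    """Rank items the user has not seen by co-occurrence with their items."""
    seen = items_by_user.get(user, set())
    scores = defaultdict(int)
    for item in seen:
        for other, n in counts[item].items():
            if other not in seen:
                scores[other] += n
    # Sort by score (descending), then by name, so results are deterministic.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:k]]


items_by_user, counts = build_cooccurrence(interactions)
print(recommend("dan", items_by_user, counts, k=2))
```

A production toolkit would typically add rating-aware scoring, normalization for popular items, and handling for users with no history; this sketch only shows the core ranking idea.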
13. Dato Confidential
Sophisticated ML made easy - Transfer learning
• Train a model on one task, use it for another task
• Examples
- Learn to walk, use that knowledge to run
- Train an image tagger to recognize cars, use that knowledge to recognize trucks.
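The idea above can be sketched in a few lines. In this toy example a perceptron learns a weight vector on a source task with plenty of labels; the target task then reuses those weights unchanged as a feature extractor and fits only a new cut-off from a handful of examples. This is a simplified stand-in for reusing a trained network's features, not Dato's implementation, and the 2-D data points and function names are invented for illustration.

```python
def train_perceptron(samples, epochs=10):
    """Learn a weight vector on the source task (labels are +1 / -1)."""
    w = [0.0, 0.0]
    for _ in range(epochs):
        for x, y in samples:
            score = w[0] * x[0] + w[1] * x[1]
            if y * score <= 0:  # misclassified: move w toward the correct side
                w[0] += y * x[0]
                w[1] += y * x[1]
    return w


def project(w, x):
    """Frozen feature extractor: reuse the source-task weights unchanged."""
    return w[0] * x[0] + w[1] * x[1]


def fit_threshold(w, samples):
    """Train only a new 'head' for the target task: a cut-off on the feature."""
    pos = [project(w, x) for x, y in samples if y == 1]
    neg = [project(w, x) for x, y in samples if y == -1]
    return (min(pos) + max(neg)) / 2.0


# Source task (hypothetical "car vs. not car" data): many labeled examples.
source = [((2, 1), 1), ((3, 1), 1), ((2, 2), 1),
          ((-2, -1), -1), ((-3, -1), -1), ((-2, -2), -1)]
# Target task (hypothetical "truck vs. car" data): only four labeled examples.
target = [((5, 4), 1), ((6, 3), 1), ((2, 1), -1), ((3, 1), -1)]

w = train_perceptron(source)           # knowledge learned on the source task
threshold = fit_threshold(w, target)   # small amount of target-task training


def is_truck(x):
    return project(w, x) > threshold


print(w, threshold, is_truck((4, 4)), is_truck((2, 2)))
```

The design choice to note is that the weights `w` are never updated on the target task; only the threshold is learned there, which is why so few target examples suffice.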
14. Dato Confidential
Create an intelligent world!
Data Engineering
• Fast & scalable
• Rich data types
• Built for ML
Sophisticated ML
• App-oriented ML
• Supporting utils
• Extensibility
Deployment
• Batch & always-on
• RESTful interface
• Elastic & robust
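As a rough illustration of what "deploy a model behind a RESTful interface" involves, the sketch below wraps a stand-in model in a tiny HTTP service using only the Python standard library. It is not Dato Predictive Services and does not use its API; the `/predict` route, the `toy_model` weights, and every function name here are assumptions made up for the example.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def toy_model(features):
    """Stand-in for a trained model: a fixed linear score (hypothetical weights)."""
    weights = {"clicks": 0.5, "purchases": 2.0}
    return sum(weights.get(name, 0.0) * value for name, value in features.items())


def handle_predict(body_bytes):
    """Turn a JSON request body into an (HTTP status, JSON response bytes) pair."""
    try:
        features = json.loads(body_bytes.decode("utf-8"))
        score = toy_model(features)
    except (ValueError, AttributeError, TypeError):
        error = {"error": "expected a JSON object of numbers"}
        return 400, json.dumps(error).encode("utf-8")
    return 200, json.dumps({"score": score}).encode("utf-8")


class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/predict":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        status, payload = handle_predict(self.rfile.read(length))
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def make_server(port=8000):
    """Build (but do not start) the service; call .serve_forever() to run it."""
    return HTTPServer(("127.0.0.1", port), PredictHandler)


status, payload = handle_predict(b'{"clicks": 4, "purchases": 1}')
print(status, payload.decode("utf-8"))
```

To try it locally, call `make_server(8000).serve_forever()` and POST a JSON object to `http://127.0.0.1:8000/predict`. Keeping the request logic in `handle_predict` separate from the HTTP handler makes the prediction path testable without opening a socket.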
scully@dato.com
15. Dato Confidential
Get the software: dato.com/download
Start learning: dato.com/learn
Bug me: scully@dato.com
Editor's notes
Add message?
This is costly, takes a long time, and limits the impact your teams can have.