0 to Kaggle in 30 minutes

2,990 views


Published in: Data & Analytics

  1. 0 to Kaggle in 30 minutes. You-Cyuan Jhang, Sr. Data Science Engineer @Castlight Health; Ming Tsai, Sr. Data Engineer @Silicon Valley Data Science
  2. Founded 2010. 200,000 data scientists worldwide
  3. Digit Recognizer Contest. US Postal Service: 21 million pieces of mail every hour. More than $1 million could be saved each day sorting zip codes
  4. PostgreSQL, MADlib: run algorithm in-place, no data movement
  5. Kaggle Machine Learning Pipeline: Unknown Data, Training, Prediction, Dataset, Model, Known Data
  6. What can MADlib do? Linear Regression, Logistic Regression, Support Vector Machine, Random Forest, Singular Value Decomposition, Clustering, K-means Clustering
  7. K-means Demo
  8. Demo
  9. Future. K-means Visualization: http://tech.nitoyon.com/en/blog/2013/11/07/k-means/ Source: https://github.com/ming-svds/kmeans-digit-on-madlib
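
The slides name K-means clustering as the demo technique but do not show the algorithm itself. As a rough illustration only, here is a minimal pure-Python sketch of the standard K-means iteration (Lloyd's algorithm). It is not the presenters' code and not MADlib's in-database implementation; the function names, the toy 2-D points, and the seeding strategy are all assumptions made for the example. In the digit contest, each point would presumably be a vector of pixel values rather than a 2-D coordinate.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Component-wise mean of a non-empty list of tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans(points, k, max_iterations=100, seed=0):
    """Illustrative K-means: returns (centroids, cluster index per point)."""
    rng = random.Random(seed)
    # Assumption: seed the centroids with k randomly chosen input points.
    centroids = rng.sample(points, k)
    assignments = None
    for _ in range(max_iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # No point changed cluster, so the result is stable.
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # Leave an empty cluster's centroid where it is.
                centroids[c] = mean(members)
    return centroids, assignments


# Toy data (hypothetical): two well-separated groups of 2-D points.
points = [
    (0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
    (10.0, 10.0), (10.0, 11.0), (11.0, 10.0),
]
centroids, labels = kmeans(points, k=2)
print(labels)
```

The first three points should end up sharing one cluster index and the last three the other; which group gets index 0 depends on the random seeding.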
