Predictive analytics and big data tutorial

This presentation covers data science buzzwords, an introduction to big data, predictive analytics, and model-building methods, along with structured vs. unstructured data and supervised vs. unsupervised learning.

  1. Ben Taylor (@bentaylordata): Predictive Analytics / Data Science
  2. Presentation Objectives
     • Enable you to be smarter than your prospect (data history / lingo)
     • Motivate you to be unstoppable and hyper-confident
     • Motivate you to begin looking for data-driven opportunities
     • Motivate you to become a data scientist
  3. "What the hell is cloud computing?" - Larry Ellison, CEO, Oracle
  4. What is cloud computing?
  5. What is big data?
     • Big data includes datasets or problems that exceed the capacity of a single computer and require a distributed data access system.
     • What counts as "big" is relative to conventional systems and technology, and will change as memory and storage solutions advance.
     http://www.pcmag.com/article2/0,2817,2453838,00.asp
  6. Big data trends
  7. What is a data scientist?
  8. What is a data scientist? [Figure labels: Engineering, Finance, Economics, Mathematics, Computer Science, Physics, Data Science; 6-10 yrs; Python; Bootcamp; $8,000 (3 months); $16,000-$4,000 (3 months); $115K avg]
  9. What is a data scientist?
  10. What is a data scientist? Master Builder
  11. What is a data scientist? Reality distortion: Hyper-confidence
  12. Data Scientist = Peacock
  13. Humans vs. Algorithms (@bentaylordata)
  14. Smartest pirate [Chart; axes: ships captured, treasure chests]
  15. Humans vs. Algorithms [Chart; axis: ships captured; label: NA]
  16. Humans vs. Algorithms: German (1795), French (1806) [Charts; axes: ships captured, treasure chests]
  17. Humans vs. Algorithms (1997): Kasparov vs. IBM Deep Blue
  18. Humans vs. Algorithms (2011): Ken Jennings & Brad Rutter vs. IBM Watson
  19. Humans vs. Algorithms (2014): Hiring panel vs. HireVue Iris
  20. Prediction process: Raw data → Data munging → Training → Model
  21. Prediction process: Raw data → Data munging (data cleaning, clean data, feature selection) → Training → Model
  22. Numeric Excel example (@bentaylordata)
  23. Prediction process: Raw data → Data munging (data cleaning, feature selection) → Training → Model. Training methods: LSR, SVM, Random Forest, Naïve Bayesian, Neural Net
  24. Missing values + categorical (@bentaylordata)
  25. Prediction process: Raw data → Data munging (data cleaning, feature selection) → Training → Model. Training methods: LSR, SVM, Random Forest, Naïve Bayesian, Neural Net. Diagram labels: Retail > 15, Engineering > 95; > 5.67
  26. Resume model
  27. Resume model
  28. Prediction process: Raw data → Data munging (data cleaning, feature selection) → Training → Model. Training methods: LSR, SVM, Random Forest, Naïve Bayesian, Neural Net. Diagram labels: Retail > 15, Engineering > 95; GPA, Colleges, Hobbies; > 5.67
  29. Text deeper dive
  30. Sentiment example
  31. Sentiment example
  32. Sentiment
  33. Given data, find cat? dog? (@bentaylordata)
  34. Talk like a data nerd (@bentaylordata)
  35. Confidence & Over-fitting
  36. Confidence & Over-fitting
  37. Data Lingo: Supervised vs. unsupervised learning
     • Supervised: A labeled training set is provided.
     • Unsupervised: No labels are provided; for example, clustering based on similar attributes.
  38. Data Lingo: Analytic Layers
     • Descriptive Analytics: Telling a data story, plotting, or visualization.
     • Predictive Analytics: Predicting future outcomes, usually with a model trained on a historical training set.
     • Prescriptive Analytics: Using the insight from your predictive model to proactively change something.
     • Interview/Interaction Analytics: Any analytics surrounding the interview or interaction.
  39. Data Lingo: Prediction methods
     • Regression: Predicting a continuous output (e.g., a stock price).
     • Classification: Predicting discrete category outputs (e.g., Yes/Maybe/No).
  40. Data Lingo: Data Types
     • Structured: Does it play well in Excel?
     • Unstructured: Raw text (Twitter), audio, video, photos, resumes, etc.
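
The prediction process on slides 20 to 28 (raw data, data cleaning, training, model) can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the presenter's method: the toy rows, the column meanings (years of experience, department, score), and every function name are hypothetical. Missing values are filled with the column mean, the categorical column is one-hot encoded, and "LSR" is read as least squares regression. Feature selection is left out to keep the sketch short.

```python
# Hypothetical raw data: (years_experience, department, score).
# None marks a missing numeric value.
RAW = [
    (1.0, "retail", 12.0),
    (3.0, "retail", 16.0),
    (None, "retail", 15.0),
    (2.0, "engineering", 24.0),
    (4.0, "engineering", 28.0),
]
CATEGORIES = ["retail", "engineering"]


def clean(rows):
    """Data cleaning: replace missing years with the mean of the known values."""
    known = [years for years, _, _ in rows if years is not None]
    mean_years = sum(known) / len(known)
    return [(mean_years if years is None else years, dept, score)
            for years, dept, score in rows]


def encode(rows, categories):
    """Turn each row into numbers: years plus one 0/1 column per category."""
    X = [[years] + [1.0 if dept == c else 0.0 for c in categories]
         for years, dept, _ in rows]
    y = [score for _, _, score in rows]
    return X, y


def solve(A, b):
    """Solve A w = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    w = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (M[i][n] - tail) / M[i][i]
    return w


def train(X, y):
    """Training: least squares via the normal equations (X^T X) w = X^T y."""
    k = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(k)]
           for i in range(k)]
    Xty = [sum(row[i] * target for row, target in zip(X, y))
           for i in range(k)]
    return solve(XtX, Xty)


def predict(weights, features):
    """Model: weighted sum of the encoded features."""
    return sum(w * x for w, x in zip(weights, features))


cleaned = clean(RAW)
X, y = encode(cleaned, CATEGORIES)
weights = train(X, y)
print("weights:", weights)
# Predict for a hypothetical engineer with 5 years of experience.
print("prediction:", predict(weights, [5.0, 0.0, 1.0]))
```

The one-hot columns stand in for an intercept here, so no separate constant term is added; adding one alongside a full one-hot encoding would make the normal equations singular.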

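The supervised vs. unsupervised distinction from slide 37, applied to the cat/dog question on slide 33, might look like the following minimal sketch. Everything in it is an assumption made for illustration: the two features (weight in kg, height in cm), the toy measurements, and the function names. The supervised half is a nearest-centroid classifier trained on labeled points; the unsupervised half is a basic k-means that sees the same points without labels and only groups them.

```python
from math import dist

# Hypothetical measurements: (weight_kg, height_cm).
POINTS = [(4.0, 25.0), (5.0, 23.0), (3.5, 24.0),
          (20.0, 55.0), (25.0, 60.0), (30.0, 58.0)]
LABELS = ["cat", "cat", "cat", "dog", "dog", "dog"]


def centroid(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def nearest(point, centers):
    """Index of the center closest to the point (Euclidean distance)."""
    return min(range(len(centers)), key=lambda i: dist(point, centers[i]))


def fit_supervised(points, labels):
    """Supervised: use the labels to compute one centroid per class."""
    names = sorted(set(labels))
    centers = [centroid([p for p, l in zip(points, labels) if l == name])
               for name in names]
    return names, centers


def classify(model, point):
    """Classification: return the discrete label of the nearest class centroid."""
    names, centers = model
    return names[nearest(point, centers)]


def kmeans(points, initial_centers, max_iter=100):
    """Unsupervised: group points around k centers without using any labels."""
    centers = list(initial_centers)
    assignment = None
    for _ in range(max_iter):
        new_assignment = [nearest(p, centers) for p in points]
        if new_assignment == assignment:
            break  # assignments stopped changing
        assignment = new_assignment
        for i in range(len(centers)):
            members = [p for p, a in zip(points, assignment) if a == i]
            if members:  # keep the old center if a cluster ends up empty
                centers[i] = centroid(members)
    return assignment, centers


model = fit_supervised(POINTS, LABELS)
print(classify(model, (6.0, 26.0)))
print(classify(model, (22.0, 50.0)))

# Deterministic start: the first and last points serve as initial centers.
clusters, centers = kmeans(POINTS, [POINTS[0], POINTS[-1]])
print(clusters)
```

The classifier answers with a name ("cat" or "dog") because it was given names to learn from. The clustering returns only group indices; deciding which group means "cat" still takes a human or a labeled example.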