From Data to Deployment: Full Stack Data Science

Link to YouTube: https://www.youtube.com/watch?v=LLvvNNWp3D0

Indeed serves over 200 million job seekers a month. To better serve our job seekers, we take advantage of data related to jobs, resumes, clicks, impressions, applies, and hires. Many important product features at Indeed are built using data science solutions that incorporate our data. This talk covers the process and tools we use to build data science solutions.

  1. From Data to Deployment: Full Stack Data Science
  2. Ben Link, Data Scientist
  3. Indeed is the #1 external source of hire. 64% of US job searchers search on Indeed each month. 80.2M unique US visitors per month. 16M jobs, 50+ countries, 28 languages, 200M unique visitors. [Chart: Unique Visitors (millions), 2009-2015]
  4. We help people get jobs.
  5. Data Science @ Indeed
  6. Applicant Quality
  7. [Diagram: Job / Employer and Resume / Job Seeker feed an Application Model that predicts Good Fit?]
  8. What does a data scientist do at Indeed?
  9. [Workflow diagram: Hypothesis Formulation, Gather Data, Explore Data, Label Data, Generate Features, Analyze Labels, Analyze Features, Prototype Models, Evaluate Model, Label Hold-out Data, Model Review, Choose Final Parameters, Deploy Model, A/B Test Model, Monitor Model, Repeat]
  10. [The same workflow diagram, repeated]
  11. [The same workflow diagram, repeated]
  12. [The same workflow diagram, repeated]
  13. Gather Data, Label Data, Hypothesis Formulation, Explore Data
  14. Prototype Models, Generate Features, Analyze Labels, Analyze Features
  15. Evaluate Model, Model Review, Deploy Model, Label Hold-out Data
  16. Choose Final Parameters, A/B Test Model, Monitor Model, Repeat
  17. Full-stack data scientists: (1) prevent handoff mistakes, (2) can contribute on any team, (3) have the big picture in mind
  18. Prevent handoff mistakes (1)
  19. [Diagram: IPython notebook holding Model, Feature Extraction, and Model Building; DB holding Raw Data]
  20. [Diagram: DB with Raw Data; Web Infrastructure with Model and Feature Extraction]
  21. [Diagram: DB; Data Service emitting JSON Data; Web Infrastructure with Model and Feature Extraction]
  22. [Diagram: DB; NoSQL; Data Service emitting JSON Data; Web Infrastructure with Model and Feature Extraction]
  23. [Diagram: Data Service emitting JSON Data; NoSQL; Web Infrastructure with Model and Feature Extraction]
  24. [Diagram: New Service; Data Service emitting JSON Data; NoSQL; Web Infrastructure with Model and Feature Extraction]
  25. [Diagram: Web Infrastructure; New Service with Model, JSON Data, NoSQL, and Feature Extraction]
  26. [Diagram: Web Infrastructure; New Service with Model, JSON Data, NoSQL, and Java Feature Extraction]
  27. Contribute on any team (2)
  28. Drive logging of data
  29. Drive product decisions using external data
  30. Get first data science solution into production quickly
  31. Iterate on existing solutions
  32. Recognize deployment costs during feature / model development
  33. Think Big (3)
  34. Focus on the right problem
  35. Aware of the big picture
  36. Practical Data Science
  37. Job Description Classifiers
  38. Predicting (min) years of experience from a job description
  39. Simple features for first models: { 'regex:5+': 1, 'tfidf:expert': 1.75, 'tfidf:advanced': 0.93, 'tfidfBigram:5 years': 2.25 }
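
A minimal sketch of how a feature dict like the one above could be produced. The function name, the regex, and the term weights are illustrative assumptions, not Indeed's actual extractor; real tf-idf weights would come from a fitted vocabulary.

    import re

    def extract_simple_features(job_description, tfidf_weights):
        """Build a sparse feature dict: a regex indicator plus weighted term counts."""
        features = {}
        # Indicator feature: does the text contain a pattern like "5+"?
        if re.search(r"\b5\+", job_description):
            features["regex:5+"] = 1
        # Unigram tf-idf-style features over a small hand-picked vocabulary.
        tokens = job_description.lower().split()
        for term, weight in tfidf_weights.items():
            count = tokens.count(term)
            if count:
                features["tfidf:" + term] = count * weight
        return features

    weights = {"expert": 1.75, "advanced": 0.93}  # stand-in weights
    print(extract_simple_features("Expert candidates need 5+ years...", weights))
    # {'regex:5+': 1, 'tfidf:expert': 1.75}
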
  40. Label data before, during, and after you build a model. Extract features in one place. Reuse your model building code. Release softly and log everything. Validate and review every model. Monitor after deploying. Retrain when needed.
  41. [The checklist slide, repeated]
  42. The best way to understand your problem is to label your own data. The fastest way to get labels for your data is to label your own data. The easiest way to know your labels are consistent is to label your own data.
  43. Labeling encourages feature development
  44. Labeling creates a human performance benchmark
  45. Labeling throughout gives you indications of shifting data
  46. Is the job part time, full time, or both?
  47. Sometimes you don’t need much data
  48. Need only to do better than a simple heuristic
  49. [Plot: Learning Curve, Score (0.84-1.00) vs. Training Samples (0-7000), showing Training score and Cross-validation score]
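
A curve like the one above can be produced with scikit-learn's learning_curve; a sketch, assuming X and y hold the extracted features and labels for the part-time/full-time task:

    import matplotlib.pyplot as plt
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import learning_curve

    # Compute train and cross-validation scores at increasing training sizes.
    sizes, train_scores, cv_scores = learning_curve(
        RandomForestClassifier(n_estimators=100),
        X, y,
        train_sizes=[1000, 3000, 5000, 7000],  # match the slide's x-axis
        cv=5,
    )
    plt.plot(sizes, train_scores.mean(axis=1), label="Training score")
    plt.plot(sizes, cv_scores.mean(axis=1), label="Cross-validation score")
    plt.xlabel("Training Samples")
    plt.ylabel("Score")
    plt.title("Learning Curve")
    plt.legend()
    plt.show()
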
  50. Now train others to label
  51. Or use experts
  52. Check their consistency
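
One way to check labeler consistency (an illustration; the talk does not specify the method) is inter-annotator agreement on a shared batch of examples, e.g. Cohen's kappa:

    from sklearn.metrics import cohen_kappa_score

    # Two labelers answer "part time, full time, or both?" for the same jobs.
    labeler_a = ["full", "part", "both", "full", "part"]
    labeler_b = ["full", "part", "full", "full", "part"]

    # 1.0 is perfect agreement; values near 0 mean agreement is no better
    # than chance, a sign the labeling guidelines need work.
    print(cohen_kappa_score(labeler_a, labeler_b))
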
  53. Can build next generation model quickly
  54. Always flag weird data
  55. [The checklist slide, repeated]
  56. [Architecture diagram: Feature Extraction produces Features; Model Builder turns Features into a Model; Model Predictor uses the Model to produce Predictions]
  57. Prevents feature inconsistency between train / serve time
  58. Allows faster feature iteration
  59. Encourages feature extraction reuse
  60. Deploy feature extraction services
  61. [Diagram: Feature Extraction → Features → Model Builder → Model]
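
A hypothetical sketch of "extract features in one place": the Model Builder and Model Predictor import the same extraction function, so the train-time and serve-time feature code cannot drift apart. All names here are illustrative, not Indeed's classes:

    from sklearn.ensemble import RandomForestClassifier
    from sklearn.feature_extraction import DictVectorizer

    def extract_features(job_description):
        # The single source of truth for features, shared by build and predict.
        tokens = job_description.lower().split()
        return {"tfidf:expert": 1.75 * tokens.count("expert"),
                "averageWordLength": sum(map(len, tokens)) / max(len(tokens), 1)}

    class ModelBuilder:
        def build(self, texts, labels):
            self.vectorizer = DictVectorizer()
            X = self.vectorizer.fit_transform([extract_features(t) for t in texts])
            model = RandomForestClassifier(n_estimators=100).fit(X, labels)
            return model, self.vectorizer

    class ModelPredictor:
        def __init__(self, model, vectorizer):
            self.model, self.vectorizer = model, vectorizer

        def predict(self, text):
            # Exactly the same extraction code path as training.
            return self.model.predict(self.vectorizer.transform([extract_features(text)]))[0]
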
  62. Job Description Feature Extractor
  63. 0.007 "bigramTfidf:5 years" 0.049 "bigramTfidf:experience in" 0.006 "tfidf:expert" 0.026 "averageWordLength" 5.506 "tfidf:2" 0.017 "tfidf:5" 0.029 "tfidf:years" 0.017 ...
  64. [The checklist slide, repeated]
  65. [Diagram: Features (feature sampling, feature scaling, feature selection) → Model Builder (test/train splits, cross validation, generate plots, email results, export model) → Model]
  66. Model Builder properties file:
      input_file=job_description_years_exp.gz
      output_dir=output/job_description_years_exp_model_builds
      model_name=JobExperience
      model_version=1.2
      model_type=RandomForestClassifier
      model_params=[{`n_estimators`:[100, 125, 150], `max_depth`:[3, 4, 5, 6]}]
      downsampling_ratio=1.75
      use_feature_selection=True
      feature_selection_variance_retained=0.9
      plot_learning_curve=True
      mail_to=benjaminl@indeed.com
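
A sketch of how a properties file like this could drive a reusable Model Builder. The parsing details and file path are assumptions; the parameter grid maps directly onto scikit-learn's GridSearchCV:

    import ast
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import GridSearchCV

    def load_properties(path):
        """Parse simple key=value lines into a dict."""
        props = {}
        with open(path) as f:
            for line in f:
                if "=" in line:
                    key, value = line.strip().split("=", 1)
                    props[key] = value
        return props

    props = load_properties("job_description_years_exp.properties")  # assumed path
    # The slide's model_params uses backticks; normalize them before parsing.
    param_grid = ast.literal_eval(props["model_params"].replace("`", "'"))

    search = GridSearchCV(RandomForestClassifier(), param_grid, cv=5)
    # search.fit(X, y)  # X, y: extracted features and labels for the training set
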
  67. [Plot: ROC Curve, True Positive Rate vs. False Positive Rate]
  68. Feature Name / Feature Importance: experience 0.27, 5 years 0.19, experience in 0.17, expert 0.16, averageWordLength 0.11, years 0.08, ...
      Class      Precision  Recall  F1-Score  Support
      1.0        0.92       0.90    0.91      353
      2.0        0.87       0.92    0.90      310
      5.0        0.90       0.86    0.88      213
      avg/total  0.90       0.90    0.90      876
  69. Output your models into a standard format
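
The talk doesn't name the standard format; one common choice (an assumption here) is a versioned artifact that bundles everything the predictor needs, serialized with joblib:

    import joblib

    # Bundle the fitted estimator and vectorizer with identifying metadata.
    artifact = {
        "model_name": "JobExperience",
        "model_version": "1.2",
        "model": search.best_estimator_,  # from the grid search sketch above
        "vectorizer": vectorizer,         # the fitted feature vectorizer
    }
    joblib.dump(artifact, "JobExperience-1.2.joblib")

    # The Model Predictor loads the same versioned artifact at serve time.
    artifact = joblib.load("JobExperience-1.2.joblib")
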
  70. Deploy quickly
  71. [Diagram: Model + Feature Extraction → Model Predictor → Predictions]
  72. Putting it all together
  73. [Architecture diagram: Feature Extraction produces Features; Model Builder turns Features into a Model; Model Predictor uses the Model to produce Predictions]
  74. [The checklist slide, repeated]
  75. [Proctor screenshot: viewjobeval_en_US, JUDY-419: Proctor test for viewjob evaluation test; buckets: control, test1]
  76. [Proctor screenshot: viewjobeval_en_US, JUDY-419: Proctor test for viewjob evaluation test; buckets: control, test1]
  77. [Proctor screenshot: viewjobeval_en_US, JUDY-419: Proctor test for viewjob evaluation test; test1 allocated 50%]
  78. Log everything
  79. uid=1b0un002j1jfi8mp&type=judyQoaEvalFeatures&appdcname=aus&appinstance=judy&tk=1b0un002d1jfid0o&locale=en_US&f.jdTfidf%3A794=0.07931499364678474&f.candidateResumeRead=0.0&f.trigramJDTfidf%3A2365=0.03493229123324494&f.trigramJDTfidf%3A1135=0.03964128705308954&f.jdTfidf%3A1618=0.08411276446891801&f.jdTfidf%3A2025=0.07554196313862578&f.jdTfidf%3A796=0.10368340560564313&f.trigramJDTfidf%3A1324=0.023586131767642488&f.trigramJDTfidf%3A1300=0.013675981072748583&f.jobApplicantDistance=25000.0&f.tfidfBestFitJobsJobDescriptionSimilarity=0.0&f.jdTfidf%3A2357=0.12212208847891733&f.jdTfidf%3A1786=0.24798453870628528&f.jdTfidf%3A1583=0.11102969484158107&f.trigramJDTfidf%3A440=0.009580278396637679&f.bestFitJobsJobDescriptionSimilarity=0.0&f.jdTfidf%3A16=0.09676734768924529&f.trigramJDTfidf%3A342=0.052695755493244574&f.jdTfidf%3A2961=0.12933227874206563&f.jdTfidf%3A2559=0.0781937359029168&f.coverLetterJobTitleSimilarity=0.0&f.jdTfidf%3A313=0.13274661170267346&f.trigramJDTfidf%3A2844=0.011672658147330478&f.jdTfidf%3A1228=0.0826878541112167&f.jdTfidf%3A386=0.09321074430754722&f.jdTfidf%3A587=0.09338485474725206&f.trigramJDTfidf%3A2007=0.03398987646377408&f.jdTfidf%3A25=0.08485085553898714&f.trigramJDTfidf%3A743=0.052044363109186274&f.trigramJDTfidf%3A742=0.00936380975357828&f.jdTfidf%3A21=0.08956959630539192&f.trigramJDTfidf%3A1465=0.05695667014121465&f.trigramJDTfidf%3A170=0.019054361889691666&f.trigramJDTfidf%3A2041=0.07867225222073676&f.jdTfidf%3A178=0.06740515563149391&f.trigramJDTfidf%3A1348=0.020307558998175355&f.yearsOfWorkExperience=0.0&f.trigramJDTfidf%3A2874=0.021452684048600148&f.trigramJDTfidf%3A2739=0.008846404277542146&f.jtYrsExpRegex%3A0=0.0&f.pastJobTitleSimilarity%3A0=0.0&f.pastJobTitleSimilarity%3A1=0.0&f.tfidfResumeJobDescriptionSimilarity=0.020420184609032756&f.jdTfidf%3A276=0.0865108192737853&f.pastJobTitleSimilarity%3A2=0.0&f.jdTfidf%3A882=0.09227660841710272&f.trigramJDTfidf%3A904=0.028517392545983834&f.applicantsPerJob=0.0&f.majorJobDescriptionSimilarity=0.018518518518518517&f.jobDescriptionCharacterLength=501.0&f.trigramJDTfidf%3A221=0.03856671987843533&f.jdSupervisorTitleRegex%3A3=1.0&f.jdSupervisorTitleRegex%3A1=0.0&f.jdSupervisorTitleRegex%3A2=0.0&f.jdSupervisorTitleRegex%3A0=0.0&f.jdTfidf%3A1937=0.10276933510059638&f.jdTfidf%3A2240=0.16550210190515535&f.jdTfidf%3A264=0.1061544307504775&f.jdTfidf%3A1933=0.08140883446275106&f.trigramJDTfidf%3A2932=0.04909455318062527&f.jdTfidf%3A1082=0.09783192017828135&f.jdTfidf%3A2454=0.08232280250175841&f.jdLicenceRegexp%3A2=0.0&f.tfidfCoverLetterJobDescriptionSimilarity=0.0&f.jdTfidf%3A485=0.11773996424853242&f.trigramJDTfidf%3A1942=0.035001333061248&f.jdLicenceRegexp%3A0=0.0&f.jdLicenceRegexp%3A1=0.0&f.jdTfidf%3A299=0.08046452951090553&f.trigramJDTfidf%3A2261=0.0539089291266305&f.jdTfidf%3A872=0.08711259378092336&f.trigramJDTfidf%3A1377=0.037898645513041965&f.trigramJDTfidf%3A487=0.022278961460829243&
  80. Reuse logs for future models
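
The f.-prefixed keys in a log line like the one above are URL-encoded feature name/value pairs, so the logged feature vector can be recovered with standard query-string parsing; a sketch over a shortened sample line:

    from urllib.parse import parse_qs

    def parse_feature_log(log_line):
        """Recover the feature dict from a logged query string."""
        params = parse_qs(log_line)
        return {key[len("f."):]: float(values[0])
                for key, values in params.items()
                if key.startswith("f.")}

    sample = "uid=1b0un002j1jfi8mp&type=judyQoaEvalFeatures&f.jdTfidf%3A794=0.0793&f.candidateResumeRead=0.0"
    print(parse_feature_log(sample))
    # {'jdTfidf:794': 0.0793, 'candidateResumeRead': 0.0}
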
  81. Logs give us insight into changing data
  82. Logs allow us to see what went wrong
  83. [The checklist slide, repeated]
  84. Quantitative Validation
  85. Training Set
      class      precision  recall  f1-score  support
      0.0        1.00       1.00    1.00      448
      1.0        0.99       1.00    1.00      663
      2.0        1.00       0.98    0.99      269
      avg/total  1.00       1.00    1.00      1380
      [ 2015-12-15 21:42:27,537 INFO ] [indeed.model_builder]
      Test Set
      class      precision  recall  f1-score  support
      0.0        0.85       0.90    0.87      146
      1.0        0.92       0.96    0.94      226
      2.0        0.91       0.70    ...
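
The reports above follow scikit-learn's classification_report layout; a minimal sketch of producing both, assuming the fitted model and the train/test splits exist:

    from sklearn.metrics import classification_report

    print("Training Set")
    print(classification_report(y_train, model.predict(X_train)))
    print("Test Set")
    print(classification_report(y_test, model.predict(X_test)))
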
  86. [Plot: ROC Curve, True Positive Rate vs. False Positive Rate]
  87. Qualitative Validation
  88. Review your Models
  89. Another perspective
  90. Transparency and Reproducibility
  91. Awareness
  92. 1) Context 2) Data 3) Response variable 4) Features 5) Model selection and performance 6) Transparency and recommendations
  93. Context: What should this model enable us to do (highlighting, filtering, sorting, etc.)? What products / interfaces / workflows will initially use this model?
  94. Data: What queries and filters were used? From what time range did your data originate? Did you sample your dataset?
  95. Response variable: How was the response variable labeled or collected? What do the model outputs (predictions) represent, and how should they be scaled or thresholded?
  96. Features: How were your features generated? Which features were most important?
  97. Model selection and performance: Performance reports on train / test sets. Overall CV search strategy and scoring function. Other performance tests (e.g. newer hold-out sets, stress testing). Expected model performance.
  98. Transparency and recommendations: Properties files for Model Builder. Link to branch of Model Builder code. Examples of Model Predictions. Possible directions for future improvements. A couple of sentences on why you think the model is ready for production.
  99. [The checklist slide, repeated]
  100. Features and data are hard dependencies
  101. Need a post-deploy plan
  102. Use log data to check for feature changes
  103. [Histogram: count by bucket for the feature tfidf:`excel`]
       Test Name  ttest_ind  ks_2samp  mannwhit  levene    ranksums
       p-value    3.79e-09   0.00021   8.41e-05  3.79e-09  0.00017
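
All five tests in the table are available in scipy.stats; a sketch of the drift check, with synthetic stand-ins for the training-time and serve-time values of one feature:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    train_values = rng.normal(0.50, 0.10, 5000)   # stand-in: tfidf:`excel` at train time
    recent_values = rng.normal(0.55, 0.10, 5000)  # stand-in: same feature in recent logs

    # Each test returns (statistic, p-value); a small p-value suggests the
    # feature's distribution has shifted since training.
    for name, test in [("ttest_ind", stats.ttest_ind),
                       ("ks_2samp", stats.ks_2samp),
                       ("mannwhit", stats.mannwhitneyu),
                       ("levene", stats.levene),
                       ("ranksums", stats.ranksums)]:
        statistic, p_value = test(train_values, recent_values)
        print(f"{name}: p-value={p_value:.3g}")
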
  104. Check prediction class distributions
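
One way to run that check (an assumption; the talk doesn't name a test) is a chi-square test of recent prediction counts against the class proportions seen at validation time:

    from scipy.stats import chisquare

    validation_proportions = [0.30, 0.50, 0.20]  # classes 0.0, 1.0, 2.0 (illustrative)
    recent_counts = [260, 560, 180]              # predictions from the last window

    total = sum(recent_counts)
    expected = [p * total for p in validation_proportions]
    statistic, p_value = chisquare(recent_counts, f_exp=expected)
    print(p_value)  # a small p-value suggests the prediction mix has drifted
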
  105. [The checklist slide, repeated]
  106. Every model should be validated; retraining is time expensive
  107. Use feature monitoring to determine feature stability
  108. Choose less sensitive features
  109. Avoid counts
  110. Full stack data scientists
  111. Full stack data science organizations
  112. More Indeed Engineering. Careers: indeed.jobs. Twitter: @IndeedEng. Engineering Blog & Talks: indeed.tech. Open Source: opensource.indeedeng.io
  113. Questions? Label data before, during, and after you build a model. Extract features in one place. Reuse your model building code. Release softly and log everything. Validate and review every model. Monitor after deploying. Retrain when needed.
