SlideShare une entreprise Scribd logo
1  sur  43
How to Use Survival Analytics
to Predict Employee Turnover
(Or, Why You Shouldn't Use Logistic Regression to Predict Attrition)
Talent Analytics, Corp.
Pasha Roberts, Chief Scientist
May 16, 2017
San Francisco, CA USA
Demo Code: https://github.com/talentanalytics/class_survival_101/
www.talentanalytics.com © 2017 Talent Analytics, Corp 1 / 41
▸
▸
▹
▹
▸
▸
Who is Talent Analytics?
Predictive Modeling Platform – Advisor™
We predict employee attrition and performance pre-hire.
Much like credit risk modeling:
Predict likelihood to pay / default on mortgage, before extending credit
Predict likelihood to perform / leave role or company early, before extending job offer
PAAS (Prediction As A Service)
Seamless deployment of predictions into talent acquisition process
www.talentanalytics.com © 2017 Talent Analytics, Corp 2 / 41
Turnover and Tenure:
A Timey-Wimey Relationship
www.talentanalytics.com © 2017 Talent Analytics, Corp 3 / 41
▸
▸
Two Different (But Related) Things
Turnover: Percent of employees that terminate
within a period of time
Tenure: Employees’ length of time working at a
role or company
Is it a Wave or a Particle?
www.talentanalytics.com © 2017 Talent Analytics, Corp 4 / 41
Low Tenure, High Turnover
www.talentanalytics.com © 2017 Talent Analytics, Corp 5 / 41
High Tenure, Low Turnover
www.talentanalytics.com © 2017 Talent Analytics, Corp 6 / 41
High Tenure, High Turnover (in 2015)
www.talentanalytics.com © 2017 Talent Analytics, Corp 7 / 41
Low Tenure, Low Turnover (in Nov, Dec)
www.talentanalytics.com © 2017 Talent Analytics, Corp 8 / 41
▸
▸
▸
Main Problem: We Can't See the Future
Everyone will terminate.
Someday, somehow... But when?
Technical Lingo: Right Censoring
www.talentanalytics.com © 2017 Talent Analytics, Corp 9 / 41
▸
▸
▸
Each Side Has Missing Information
Turnover Has No Information About Tenure
Treats a temp the same as a seasoned veteran
Ignores patterns - such as people quitting right after a bonus
Tenure Has No Information About Turnover
High Tenure = staying power or retirement risk?
Each Side Varies Depending on When You Look
www.talentanalytics.com © 2017 Talent Analytics, Corp 10 / 41
▸
▸
Annual Attrition Timing is Arbitrary
What if you could see attrition
at every point in time?
“One Year” only matters to
accountants and astronomers
Cumulative Breakeven matters
more to the business
www.talentanalytics.com © 2017 Talent Analytics, Corp 11 / 41
Survival 101
www.talentanalytics.com © 2017 Talent Analytics, Corp 12 / 41
▸
▸
▸
▸
▸
▸
Survival Analytics
Medical Roots - How long will a patient survive a disease?
Engineering Roots - How long until a machine fails?
Social Science - How long do people live?
Employment - How long until an employee terminates? or promotes?
In General - “Expected duration of time until one or more events happen”
Artfully combines turnover and tenure
Lots of competing jargon and syntax from many application domains:
Failure Rate, Hazard Rate, Force of Mortality, ...
www.talentanalytics.com © 2017 Talent Analytics, Corp 13 / 41
▸
▸
▸
▸
Hazard Rate:
Conditional Daily Risk of Termination
Every day in tenure there is a (small)
probability that the employee will terminate
Conditional on survival to the prior day
Like an actuarial life table
Can rise at different times:
e.g. training, review or bonus time
www.talentanalytics.com © 2017 Talent Analytics, Corp 14 / 41
▸
▸
▸
▸
Survival Rate: Daily Probability
of Being Employed at Time t
How likely are you to be working here,
t days after being hired?
Near 100% on Day 1
(some people actually never show up)
Downhill from there - usually not linear
One role at a time
www.talentanalytics.com © 2017 Talent Analytics, Corp 15 / 41
▸
▸
▸
▸
▸
▸
▹
▹
Survival Rate: Daily Probability
of Being Employed at Time t
How likely are you to be working here,
t days after being hired?
Near 100% on Day 1
(some people actually never show up)
Downhill from there - usually not linear
One role at a time
This is the full picture
Simple transformation of the Hazard Curve
Best to use statistical tool (R, SAS)
Apparently possible in Excel
www.talentanalytics.com © 2017 Talent Analytics, Corp 15 / 41
Attrition is Built Into Survival Curves
www.talentanalytics.com © 2017 Talent Analytics, Corp 16 / 41
Shelf Pattern Sparse Pattern
Many Shapes
www.talentanalytics.com © 2017 Talent Analytics, Corp 17 / 41
Heavy Terms During Training Heavy Terms After 1-Year Bonus
Cliffs or Kinks in the Survival Curve
www.talentanalytics.com © 2017 Talent Analytics, Corp 18 / 41
Everyone Has a Survival Curve
www.talentanalytics.com © 2017 Talent Analytics, Corp 19 / 41
Predicting Survival
To the Individual Employee
www.talentanalytics.com © 2017 Talent Analytics, Corp 20 / 41
▸
▸
▸
▹
▸
▸
▸
Don't Use Logistic Methods
to Predict Attrition!
Commonly Done:
Predict one-year attrition with logistic
Use tenure as a variable along with others
Common Result:
“The biggest cause of termination is tenure”
Not an actionable coef cient
Ignores nuances available to survival methods
Mishandles current employee tenure
Less accurate
www.talentanalytics.com © 2017 Talent Analytics, Corp 21 / 41
▸
▸
▸
▸
Proportional Hazard
Most Survival models are “Proportional Hazard”
Start with Base Survival rate
Convert to Base Cumulative Hazard rate
Multiply Cumulative Hazard by a single factor
(hence “Proportional Hazard")
Linear changes to Hazard lead
to a rotation of Survival Curve
Predictive Goal:
Predict the amount of Survival Curve
rotation for each candidate
www.talentanalytics.com © 2017 Talent Analytics, Corp 22 / 41
▸
▹
▹
▹
▹
▸
▸
▸
▸
Proportional Hazard Prediction Methods
Predicting a Single Number
Cox regression
Vanilla Cox
Stepwise
ElasticNet or LASSO
Variable selection with Random Forest
Decision Trees
Random Forests
Neural Networks
SVM, etc
There are also other survival algorithms,
including parametric methods
www.talentanalytics.com © 2017 Talent Analytics, Corp 23 / 41
▸
▸
▸
▸
Build/Validate a Survival Model
With Thresholds
The Model Predicts:
“Dark Blue” hires will survive the longest
“Light Blue” hires will survive above-median
“Grey” survival curve is current median
survival.
“Light Red” and “Dark Red” candidates are not
likely to survive.
www.talentanalytics.com © 2017 Talent Analytics, Corp 24 / 41
Deeper Dive:
Predicting Survival
with Proportional Hazard
Follow Demo Code on GitHub:
https://github.com/talentanalytics/class_survival_101/
www.talentanalytics.com © 2017 Talent Analytics, Corp 25 / 41
▸
▸
▸
▸
▸
▸
Formulas and Jargon
Hazard Function:   
Probability of termination at time t, conditional on survival up to time t
Ranges from 0 to 1 (typically very small)
Cumulative Hazard Function:  
Cumulative conditional probability of termination up to time t
Ranges from 0 to 4ish (theoretically to In nity)
Survival Function:   
Probability that termination will be later than time t
Ranges from 1 to 0
Theoretically the formulas above are continuous measures, but in practice are discrete daily.
www.talentanalytics.com © 2017 Talent Analytics, Corp
h(t)
H(t) = h(x) dx = −log(S(t))∫
t
0
S(t) = e
−H(t)
26 / 41
Calculating Survival in R
Four Simple Steps with R survival package:
1. Gather and Prepare the Data
2. Calculate Baseline Survival Function
3. Model Proportional Hazard with Cox Regression
4. Validate Model
www.talentanalytics.com © 2017 Talent Analytics, Corp
(t)S0
27 / 41
▸
▸
▸
Calculating Survival in R
Four Simple Steps with R survival package:
1. Gather and Prepare the Data
2. Calculate Baseline Survival Function
3. Model Proportional Hazard with Cox Regression
4. Validate Model
5. Deploy Model into hiring
6. Hire people likely to stay on the job longer
7. Save and earn money
Avoid attributing savings to soft factors like ripple effect, morale, engagement
Hard savings from reduced Replacement Cost
Hard earnings from increased Employee Lifetime Value
www.talentanalytics.com © 2017 Talent Analytics, Corp
(t)S0
27 / 41
▸
▹
▸
▹
▹
1) Gather Employee Data
> attr.data
emp.id hire.date term.date factor.x factor.y factor.z
<int> <dttm> <dttm> <dbl> <dbl> <dbl>
1 52 2016-02-23 <NA> 0.09555369 1.515028 102.3811
2 261 2014-10-30 <NA> 2.40144024 9.065855 100.5727
3 193 2013-12-25 2015-03-28 1.88463382 4.331642 105.6887
4 150 2013-03-27 2013-09-19 3.81625316 7.363454 110.6842
5 31 2015-03-25 <NA> 5.17392728 11.824601 110.0590
Simple Requirements
Just hire.date and term.date is all you need from HRIS
Leave term.date as NA if the employee haven't terminated yet
Merge with predictive input (independent) variables
e.g. assessment results, experience, CV detail, social media, semantic
Here we use factor.x, factor.y and factor.z
www.talentanalytics.com © 2017 Talent Analytics, Corp 28 / 41
▸
▸
▹
▹
▸
▹
Calculate Fields for Survival
> attr.data %>% dplyr::select(-dplyr::contains("factor"))
emp.id hire.date term.date is.term end.date tenure.years
<int> <dttm> <dttm> <lgl> <dttm> <dbl>
1 52 2016-02-23 <NA> FALSE 2017-05-01 1.1854894
2 261 2014-10-30 <NA> FALSE 2017-05-01 2.5023956
3 193 2013-12-25 2015-03-28 TRUE 2015-03-28 1.2539357
4 150 2013-03-27 2013-09-19 TRUE 2013-09-19 0.4818617
5 31 2015-03-25 <NA> FALSE 2017-05-01 2.1026694
Simple Derivations
is.term: Have they terminated? Is term.date NA?
end.date: Either term.date, or censor date, depending on is.term
Could put transfer.date here with no is.term
The last date we know the employee was at work
tenure.years: What is employee tenure from hire.date to end.date?
All we know about the employee's tenure, as of censor date
www.talentanalytics.com © 2017 Talent Analytics, Corp 29 / 41
▸
▸
▸
Dataset is Ready
> glimpse(attr.data)
Observations: 400
Variables: 14
$ label <fctr> a, a, a, a, a, a, a, a, a, a, a, a, a, a, a, a, a, a,...
$ hire.date <dttm> 2013-04-20, 2016-01-14, 2013-09-20, 2014-05-17, 2016-...
$ term.date <dttm> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 2016-...
$ is.term <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE...
$ end.date <dttm> 2017-05-01, 2017-05-01, 2017-05-01, 2017-05-01, 2017-...
$ tenure.years <dbl> 4.02975576, 1.29469635, 3.61029191, 2.95626651, 1.1463...
$ factor.x <dbl> 7.44846858, 1.08376058, 2.81983987, 0.56880027, 0.7868...
$ factor.y <dbl> 16.838313, 9.335225, 9.519951, 12.697147, 12.912898, 1...
$ factor.z <dbl> 103.10732, 100.37171, 99.76382, 95.88645, 97.61509, 10...
$ emp.id <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,...
$ scale.x <dbl> 2.41351313, -0.24754913, 0.47829958, -0.46285224, -0.3...
$ scale.y <dbl> 1.31773198, -0.57718839, -0.53053544, 0.27187182, 0.32...
$ scale.z <dbl> 0.3407242, -0.1849069, -0.3017096, -1.0467221, -0.7145...
$ is.training <lgl> TRUE, TRUE, TRUE, FALSE, FALSE, FALSE, TRUE, TRUE, TRU...
Example is 400-row simulated dataset
Scaled the input factors to consistent z-scale
Separated into 80% training and 20% validation datasets
www.talentanalytics.com © 2017 Talent Analytics, Corp 30 / 41
▸
2) Calculate Baseline Survival ( )
survival::Surv is a single object to hold Tenure and Turnover at once
> surv.obj <- survival::Surv(training.data$tenure.years, training.data$is.term)
Needs “time variable” tenure.years and “event variable” is.term
survival::survfit is the Kaplan-Meier Estimator
> surv.fit <- survival::survfit(surv.obj ~ 1)
> summary(surv.fit)
Call: survfit(formula = surv.obj ~ 1)
time n.risk n.event survival std.err lower 95% CI upper 95% CI
0.000 320 1 0.997 0.00312 0.991 1.000
0.249 308 1 0.994 0.00448 0.985 1.000
0.298 303 1 0.990 0.00554 0.980 1.000
0.320 298 1 0.987 0.00644 0.974 1.000
www.talentanalytics.com © 2017 Talent Analytics, Corp
(t)S0
31 / 41
Native Survival Plot Crafted ggplot2 Plot
Plot Baseline Survival
www.talentanalytics.com © 2017 Talent Analytics, Corp 32 / 41
▸
▸
▸
▸
▸
3) Back to Proportional Hazard
Consider a multiplier x that centers on 1
Multiply
for proportional (linear) changes to Hazard
Survival Curve rotates in response:
Beware - inverse relationship:
x ← 1.1 will increase , decrease
x ← 0.9 will decrease , increase
So we just need to predict x to predict Survival
One number drives the entire curve, based on baseline
www.talentanalytics.com © 2017 Talent Analytics, Corp
(t) = (t)xHx H0
(t) =Sx e
− (t)xH0
(t)Hx (t)Sx
(t)Hx (t)Sx
(t)H0
33 / 41
Model Proportional Hazard
with Cox Regression
> cox.model <- survival::coxph(formula = surv.obj ~ scale.x + scale.y + scale.z,
data = training.data)
> cox.model
Call:
survival::coxph(formula = surv.obj ~ scale.x + scale.y + scale.z,
data = training.data)
coef exp(coef) se(coef) z p
scale.x -0.4298 0.6506 0.0918 -4.68 0.0000028
scale.y -0.3204 0.7258 0.0957 -3.35 0.00082
scale.z -0.1967 0.8215 0.0984 -2.00 0.04558
Likelihood ratio test=38.2 on 3 df, p=0.000000025
n= 320, number of events= 125
www.talentanalytics.com © 2017 Talent Analytics, Corp 34 / 41
▸
▹
▹
▸
▹
▹
4) Predict validation.data
with New Cox Model
> cox.pred <- predict(cox.model, newdata = validation.data, type = "lp")
Note: we built model with training.data, now we predict against validation.data
Our model has never seen anything in validation.data
Great test to see how closely our predictions match known outcomes
The predict.coxph() function returns 5 different types of predictions
We want type = "lp" which is log(multiplier) to baseline
Other types won't work with survivalROC()
www.talentanalytics.com © 2017 Talent Analytics, Corp
(t)H0
35 / 41
▸
▸
▸
▸
How does ROC Work with Survival?
> roc.obj <- survivalROC::survivalROC(Stime = validation.data$tenure.years,
status = validation.data$is.term,
marker = cox.pred,
predict.time = 1,
lambda = 0.003)
> roc.obj$AUC
[1] 0.7102442
ROC typically evaluates classification methods
So, we make this a classi cation problem
Classify: Was the employee terminated at t = 1 or not?
Compare actual vs. predicted classi cations at all cutpoints
Our AUC is 0.71
Not bad for a censored attrition prediction!
www.talentanalytics.com © 2017 Talent Analytics, Corp 36 / 41
Plot AUC with ggplot2
www.talentanalytics.com © 2017 Talent Analytics, Corp 37 / 41
▸
▸
▸
▸
Don't Get Fooled by Randomness
We are predicting AUC, too
Let's try predicting 1000 random samples of the same underlying pattern
> test.repl <- replicate(1000, demoPrediction(verbose = FALSE))
> summary(test.repl)
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.3470 0.6191 0.6833 0.6771 0.7356 0.9446
AUC generally fell between 0.619 and 0.735, but 50% were outside this range
The likely AUC is closer to 0.68 - still good
What if you got a lucky (or unlucky) 80/20 validation split?
www.talentanalytics.com © 2017 Talent Analytics, Corp 38 / 41
▸
▸
Multiple Cross-Validation
Raises the Bar
Against Getting Fooled
Typically run 20-100 CV iterations
Can consume lots of core-hours if
nested for hyperparameters
www.talentanalytics.com © 2017 Talent Analytics, Corp 39 / 41
▸
▸
▸
▸
▸
Download and Experiment With this Code
These slides will be on PAW website
Download demo code and doc from
https://github.com/talentanalytics/class_survival_101/
Clone it, fork it, run it
Send bug xes
Let me know how it goes
www.talentanalytics.com © 2017 Talent Analytics, Corp 40 / 41
Questions and Discussion
Pasha Roberts
pasha@talentanalytics.com
Follow Demo Code on GitHub:
https://github.com/talentanalytics/class_survival_101/
www.talentanalytics.com © 2017 Talent Analytics, Corp 41 / 41

Contenu connexe

Tendances

The Case for Value Stream Architecture (Mik Kersten, Carmen DeArdo)
The Case for Value Stream Architecture (Mik Kersten, Carmen DeArdo)The Case for Value Stream Architecture (Mik Kersten, Carmen DeArdo)
The Case for Value Stream Architecture (Mik Kersten, Carmen DeArdo)Carmen DeArdo
 
Predictive Analytics for HR: A Primer to Get Started on your HR Analytics Jou...
Predictive Analytics for HR: A Primer to Get Started on your HR Analytics Jou...Predictive Analytics for HR: A Primer to Get Started on your HR Analytics Jou...
Predictive Analytics for HR: A Primer to Get Started on your HR Analytics Jou...Dr Susan Entwisle
 
Agile recruiting: Optimizing your Talent Acquisition Operating Model
Agile recruiting: Optimizing your Talent Acquisition Operating ModelAgile recruiting: Optimizing your Talent Acquisition Operating Model
Agile recruiting: Optimizing your Talent Acquisition Operating ModelBeamery
 
How to go from structureless to structured without losing your vibe
How to go from structureless to structured without losing your vibeHow to go from structureless to structured without losing your vibe
How to go from structureless to structured without losing your vibeCamille Fournier
 
Lean Project Portfolio Management
Lean Project Portfolio ManagementLean Project Portfolio Management
Lean Project Portfolio ManagementAlexander Apostolov
 
Agile project management
Agile project managementAgile project management
Agile project managementeng100
 
Agile Delivery Powerpoint Presentation Slides
Agile Delivery Powerpoint Presentation SlidesAgile Delivery Powerpoint Presentation Slides
Agile Delivery Powerpoint Presentation SlidesSlideTeam
 
The New Model for Talent Management: Agenda for 2015
The New Model for Talent Management:  Agenda for 2015The New Model for Talent Management:  Agenda for 2015
The New Model for Talent Management: Agenda for 2015Josh Bersin
 
Top 9 business analyst interview questions answers
Top 9 business analyst interview questions answersTop 9 business analyst interview questions answers
Top 9 business analyst interview questions answersJobinterviews
 
Talent Acquisition Best Practices Process Map
Talent Acquisition Best Practices Process MapTalent Acquisition Best Practices Process Map
Talent Acquisition Best Practices Process Mapblayton551
 
Value Stream Management: Is Your Organization Ready?
Value Stream Management: Is Your Organization Ready?Value Stream Management: Is Your Organization Ready?
Value Stream Management: Is Your Organization Ready?DevOps.com
 

Tendances (20)

The Case for Value Stream Architecture (Mik Kersten, Carmen DeArdo)
The Case for Value Stream Architecture (Mik Kersten, Carmen DeArdo)The Case for Value Stream Architecture (Mik Kersten, Carmen DeArdo)
The Case for Value Stream Architecture (Mik Kersten, Carmen DeArdo)
 
The Rise of People Analytics
The Rise of People AnalyticsThe Rise of People Analytics
The Rise of People Analytics
 
Predictive Analytics for HR: A Primer to Get Started on your HR Analytics Jou...
Predictive Analytics for HR: A Primer to Get Started on your HR Analytics Jou...Predictive Analytics for HR: A Primer to Get Started on your HR Analytics Jou...
Predictive Analytics for HR: A Primer to Get Started on your HR Analytics Jou...
 
Agile recruiting: Optimizing your Talent Acquisition Operating Model
Agile recruiting: Optimizing your Talent Acquisition Operating ModelAgile recruiting: Optimizing your Talent Acquisition Operating Model
Agile recruiting: Optimizing your Talent Acquisition Operating Model
 
People Analytics is the New HR
People Analytics is the New HRPeople Analytics is the New HR
People Analytics is the New HR
 
How to go from structureless to structured without losing your vibe
How to go from structureless to structured without losing your vibeHow to go from structureless to structured without losing your vibe
How to go from structureless to structured without losing your vibe
 
Lean Project Portfolio Management
Lean Project Portfolio ManagementLean Project Portfolio Management
Lean Project Portfolio Management
 
Business agility
Business agilityBusiness agility
Business agility
 
Agile project management
Agile project managementAgile project management
Agile project management
 
HR Metrics That Matter
HR Metrics That MatterHR Metrics That Matter
HR Metrics That Matter
 
Agile Delivery Powerpoint Presentation Slides
Agile Delivery Powerpoint Presentation SlidesAgile Delivery Powerpoint Presentation Slides
Agile Delivery Powerpoint Presentation Slides
 
The New Model for Talent Management: Agenda for 2015
The New Model for Talent Management:  Agenda for 2015The New Model for Talent Management:  Agenda for 2015
The New Model for Talent Management: Agenda for 2015
 
People Analytics.pptx
People Analytics.pptxPeople Analytics.pptx
People Analytics.pptx
 
Top 9 business analyst interview questions answers
Top 9 business analyst interview questions answersTop 9 business analyst interview questions answers
Top 9 business analyst interview questions answers
 
Hr Analytics
Hr AnalyticsHr Analytics
Hr Analytics
 
Talent Acquisition Best Practices Process Map
Talent Acquisition Best Practices Process MapTalent Acquisition Best Practices Process Map
Talent Acquisition Best Practices Process Map
 
Digital HR.pptx
Digital HR.pptxDigital HR.pptx
Digital HR.pptx
 
RethinkingAgile_AAC2019
RethinkingAgile_AAC2019RethinkingAgile_AAC2019
RethinkingAgile_AAC2019
 
Hr analytics
Hr analyticsHr analytics
Hr analytics
 
Value Stream Management: Is Your Organization Ready?
Value Stream Management: Is Your Organization Ready?Value Stream Management: Is Your Organization Ready?
Value Stream Management: Is Your Organization Ready?
 

Similaire à 1345 keynote roberts

1505 Statistical Thinking course extract
1505 Statistical Thinking course extract1505 Statistical Thinking course extract
1505 Statistical Thinking course extractJefferson Lynch
 
10.09.14 glassdoor webinar_all_slides
10.09.14 glassdoor webinar_all_slides10.09.14 glassdoor webinar_all_slides
10.09.14 glassdoor webinar_all_slidesDr. John Sullivan
 
Cloud: Risk profiles, Preference and Decisions in Cloud
Cloud: Risk profiles, Preference and Decisions in CloudCloud: Risk profiles, Preference and Decisions in Cloud
Cloud: Risk profiles, Preference and Decisions in CloudAmazon Web Services
 
The Datafication of HR: Graduating from Metrics to Analytics
The Datafication of HR: Graduating from Metrics to AnalyticsThe Datafication of HR: Graduating from Metrics to Analytics
The Datafication of HR: Graduating from Metrics to AnalyticsVisier
 
Siemens Resource Planning Case Study
Siemens Resource Planning Case StudySiemens Resource Planning Case Study
Siemens Resource Planning Case StudyGreg Bailey
 
Ai revolution for human capital for government v2
Ai revolution for human capital for government v2Ai revolution for human capital for government v2
Ai revolution for human capital for government v2Liew Wei Da Andrew
 
Leveraging The Machine: The Future of People Data Is Now
Leveraging The Machine: The Future of People Data Is Now Leveraging The Machine: The Future of People Data Is Now
Leveraging The Machine: The Future of People Data Is Now Cornerstone OnDemand
 
Asking When, Not If in Predictive Modeling
Asking When, Not If in Predictive ModelingAsking When, Not If in Predictive Modeling
Asking When, Not If in Predictive ModelingAndrea Kropp
 
The Insider's Guide to Workforce Analytics
The Insider's Guide to Workforce AnalyticsThe Insider's Guide to Workforce Analytics
The Insider's Guide to Workforce AnalyticsVisier
 
2017-09-Presentation (1).pptx
2017-09-Presentation (1).pptx2017-09-Presentation (1).pptx
2017-09-Presentation (1).pptxssuser3fb3b5
 
2017-09-Presentation (1).pptx
2017-09-Presentation (1).pptx2017-09-Presentation (1).pptx
2017-09-Presentation (1).pptxssuser1415bc
 
Scalable agile metrics to optimize flow and reduce delivery risks
Scalable agile metrics to optimize flow and reduce delivery risksScalable agile metrics to optimize flow and reduce delivery risks
Scalable agile metrics to optimize flow and reduce delivery risksAdonis El Fakih
 
201505 Statistical Thinking course extract
201505 Statistical Thinking course extract201505 Statistical Thinking course extract
201505 Statistical Thinking course extractJefferson Lynch
 
Complex Problem Solving and Big Data Analytics
Complex Problem Solving and Big Data AnalyticsComplex Problem Solving and Big Data Analytics
Complex Problem Solving and Big Data AnalyticsCoThink
 
Datascience101presentation4
Datascience101presentation4Datascience101presentation4
Datascience101presentation4Salford Systems
 

Similaire à 1345 keynote roberts (20)

1440 track2 roberts
1440 track2 roberts1440 track2 roberts
1440 track2 roberts
 
1505 Statistical Thinking course extract
1505 Statistical Thinking course extract1505 Statistical Thinking course extract
1505 Statistical Thinking course extract
 
10.09.14 glassdoor webinar_all_slides
10.09.14 glassdoor webinar_all_slides10.09.14 glassdoor webinar_all_slides
10.09.14 glassdoor webinar_all_slides
 
Cloud: Risk profiles, Preference and Decisions in Cloud
Cloud: Risk profiles, Preference and Decisions in CloudCloud: Risk profiles, Preference and Decisions in Cloud
Cloud: Risk profiles, Preference and Decisions in Cloud
 
The Datafication of HR: Graduating from Metrics to Analytics
The Datafication of HR: Graduating from Metrics to AnalyticsThe Datafication of HR: Graduating from Metrics to Analytics
The Datafication of HR: Graduating from Metrics to Analytics
 
Siemens Resource Planning Case Study
Siemens Resource Planning Case StudySiemens Resource Planning Case Study
Siemens Resource Planning Case Study
 
Ai revolution for human capital for government v2
Ai revolution for human capital for government v2Ai revolution for human capital for government v2
Ai revolution for human capital for government v2
 
Leveraging The Machine: The Future of People Data Is Now
Leveraging The Machine: The Future of People Data Is Now Leveraging The Machine: The Future of People Data Is Now
Leveraging The Machine: The Future of People Data Is Now
 
Asking When, Not If in Predictive Modeling
Asking When, Not If in Predictive ModelingAsking When, Not If in Predictive Modeling
Asking When, Not If in Predictive Modeling
 
The Insider's Guide to Workforce Analytics
The Insider's Guide to Workforce AnalyticsThe Insider's Guide to Workforce Analytics
The Insider's Guide to Workforce Analytics
 
2017-09-Presentation (1).pptx
2017-09-Presentation (1).pptx2017-09-Presentation (1).pptx
2017-09-Presentation (1).pptx
 
2017-09-Presentation (1).pptx
2017-09-Presentation (1).pptx2017-09-Presentation (1).pptx
2017-09-Presentation (1).pptx
 
Scalable agile metrics to optimize flow and reduce delivery risks
Scalable agile metrics to optimize flow and reduce delivery risksScalable agile metrics to optimize flow and reduce delivery risks
Scalable agile metrics to optimize flow and reduce delivery risks
 
201505 Statistical Thinking course extract
201505 Statistical Thinking course extract201505 Statistical Thinking course extract
201505 Statistical Thinking course extract
 
Complex Problem Solving and Big Data Analytics
Complex Problem Solving and Big Data AnalyticsComplex Problem Solving and Big Data Analytics
Complex Problem Solving and Big Data Analytics
 
Datascience101presentation4
Datascience101presentation4Datascience101presentation4
Datascience101presentation4
 
DFSS short
DFSS shortDFSS short
DFSS short
 
9. Risk.ppt
9. Risk.ppt9. Risk.ppt
9. Risk.ppt
 
9. Risk.ppt
9. Risk.ppt9. Risk.ppt
9. Risk.ppt
 
9. Risk.ppt
9. Risk.ppt9. Risk.ppt
9. Risk.ppt
 

Plus de Rising Media, Inc.

1415 track 1 wu_using his laptop
1415 track 1 wu_using his laptop1415 track 1 wu_using his laptop
1415 track 1 wu_using his laptopRising Media, Inc.
 
1620 keynote olson_using our laptop
1620 keynote olson_using our laptop1620 keynote olson_using our laptop
1620 keynote olson_using our laptopRising Media, Inc.
 
1530 track 2 stuart_using our laptop
1530 track 2 stuart_using our laptop1530 track 2 stuart_using our laptop
1530 track 2 stuart_using our laptopRising Media, Inc.
 
1530 track 1 fader_using our laptop
1530 track 1 fader_using our laptop1530 track 1 fader_using our laptop
1530 track 1 fader_using our laptopRising Media, Inc.
 
1215 daa lunch owusu_using our laptop
1215 daa lunch owusu_using our laptop1215 daa lunch owusu_using our laptop
1215 daa lunch owusu_using our laptopRising Media, Inc.
 
1215 daa lunch a bos intro slides_using our laptop
1215 daa lunch a bos intro slides_using our laptop1215 daa lunch a bos intro slides_using our laptop
1215 daa lunch a bos intro slides_using our laptopRising Media, Inc.
 
855 sponsor movassate_using our laptop
855 sponsor movassate_using our laptop855 sponsor movassate_using our laptop
855 sponsor movassate_using our laptopRising Media, Inc.
 
1325 keynote yale_pdf shareable
1325 keynote yale_pdf shareable1325 keynote yale_pdf shareable
1325 keynote yale_pdf shareableRising Media, Inc.
 
905 keynote peele_using our laptop
905 keynote peele_using our laptop905 keynote peele_using our laptop
905 keynote peele_using our laptopRising Media, Inc.
 

Plus de Rising Media, Inc. (20)

1415 track 1 wu_using his laptop
1415 track 1 wu_using his laptop1415 track 1 wu_using his laptop
1415 track 1 wu_using his laptop
 
Matt gershoff
Matt gershoffMatt gershoff
Matt gershoff
 
Keynote adam greco
Keynote adam grecoKeynote adam greco
Keynote adam greco
 
1620 keynote olson_using our laptop
1620 keynote olson_using our laptop1620 keynote olson_using our laptop
1620 keynote olson_using our laptop
 
1530 track 2 stuart_using our laptop
1530 track 2 stuart_using our laptop1530 track 2 stuart_using our laptop
1530 track 2 stuart_using our laptop
 
1530 track 1 fader_using our laptop
1530 track 1 fader_using our laptop1530 track 1 fader_using our laptop
1530 track 1 fader_using our laptop
 
1415 track 2 richardson
1415 track 2 richardson1415 track 2 richardson
1415 track 2 richardson
 
1215 daa lunch owusu_using our laptop
1215 daa lunch owusu_using our laptop1215 daa lunch owusu_using our laptop
1215 daa lunch owusu_using our laptop
 
1215 daa lunch a bos intro slides_using our laptop
1215 daa lunch a bos intro slides_using our laptop1215 daa lunch a bos intro slides_using our laptop
1215 daa lunch a bos intro slides_using our laptop
 
915 e metrics_claudia perlich
915 e metrics_claudia perlich915 e metrics_claudia perlich
915 e metrics_claudia perlich
 
855 sponsor movassate_using our laptop
855 sponsor movassate_using our laptop855 sponsor movassate_using our laptop
855 sponsor movassate_using our laptop
 
1615 plack using our laptop
1615 plack using our laptop1615 plack using our laptop
1615 plack using our laptop
 
1530 rimmele do not share
1530 rimmele do not share1530 rimmele do not share
1530 rimmele do not share
 
1325 keynote yale_pdf shareable
1325 keynote yale_pdf shareable1325 keynote yale_pdf shareable
1325 keynote yale_pdf shareable
 
1115 fiztgerald schuchardt
1115 fiztgerald schuchardt1115 fiztgerald schuchardt
1115 fiztgerald schuchardt
 
1000 kondic do not share
1000 kondic do not share1000 kondic do not share
1000 kondic do not share
 
905 keynote peele_using our laptop
905 keynote peele_using our laptop905 keynote peele_using our laptop
905 keynote peele_using our laptop
 
Stephen morse sharable
Stephen morse sharableStephen morse sharable
Stephen morse sharable
 
Elder shareable
Elder shareableElder shareable
Elder shareable
 
1115 ramirez using our laptop
1115 ramirez using our laptop1115 ramirez using our laptop
1115 ramirez using our laptop
 

Dernier

Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...HyderabadDolls
 
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteedamy56318795
 
TrafficWave Generator Will Instantly drive targeted and engaging traffic back...
TrafficWave Generator Will Instantly drive targeted and engaging traffic back...TrafficWave Generator Will Instantly drive targeted and engaging traffic back...
TrafficWave Generator Will Instantly drive targeted and engaging traffic back...SOFTTECHHUB
 
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book nowVadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book nowgargpaaro
 
Gulbai Tekra * Cheap Call Girls In Ahmedabad Phone No 8005736733 Elite Escort...
Gulbai Tekra * Cheap Call Girls In Ahmedabad Phone No 8005736733 Elite Escort...Gulbai Tekra * Cheap Call Girls In Ahmedabad Phone No 8005736733 Elite Escort...
Gulbai Tekra * Cheap Call Girls In Ahmedabad Phone No 8005736733 Elite Escort...gragchanchal546
 
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...nirzagarg
 
Jodhpur Park | Call Girls in Kolkata Phone No 8005736733 Elite Escort Service...
Jodhpur Park | Call Girls in Kolkata Phone No 8005736733 Elite Escort Service...Jodhpur Park | Call Girls in Kolkata Phone No 8005736733 Elite Escort Service...
Jodhpur Park | Call Girls in Kolkata Phone No 8005736733 Elite Escort Service...HyderabadDolls
 
RESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptx
RESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptxRESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptx
RESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptxronsairoathenadugay
 
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Valters Lauzums
 
Digital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham WareDigital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham WareGraham Ware
 
Reconciling Conflicting Data Curation Actions: Transparency Through Argument...
Reconciling Conflicting Data Curation Actions:  Transparency Through Argument...Reconciling Conflicting Data Curation Actions:  Transparency Through Argument...
Reconciling Conflicting Data Curation Actions: Transparency Through Argument...Bertram Ludäscher
 
Ranking and Scoring Exercises for Research
Ranking and Scoring Exercises for ResearchRanking and Scoring Exercises for Research
Ranking and Scoring Exercises for ResearchRajesh Mondal
 
Charbagh + Female Escorts Service in Lucknow | Starting ₹,5K To @25k with A/C...
Charbagh + Female Escorts Service in Lucknow | Starting ₹,5K To @25k with A/C...Charbagh + Female Escorts Service in Lucknow | Starting ₹,5K To @25k with A/C...
Charbagh + Female Escorts Service in Lucknow | Starting ₹,5K To @25k with A/C...HyderabadDolls
 
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...nirzagarg
 
20240412-SmartCityIndex-2024-Full-Report.pdf
20240412-SmartCityIndex-2024-Full-Report.pdf20240412-SmartCityIndex-2024-Full-Report.pdf
20240412-SmartCityIndex-2024-Full-Report.pdfkhraisr
 
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...gajnagarg
 
Dubai Call Girls Peeing O525547819 Call Girls Dubai
Dubai Call Girls Peeing O525547819 Call Girls DubaiDubai Call Girls Peeing O525547819 Call Girls Dubai
Dubai Call Girls Peeing O525547819 Call Girls Dubaikojalkojal131
 
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...gajnagarg
 
Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...gajnagarg
 

Dernier (20)

Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...
 
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
 
TrafficWave Generator Will Instantly drive targeted and engaging traffic back...
TrafficWave Generator Will Instantly drive targeted and engaging traffic back...TrafficWave Generator Will Instantly drive targeted and engaging traffic back...
TrafficWave Generator Will Instantly drive targeted and engaging traffic back...
 
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book nowVadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
 
Gulbai Tekra * Cheap Call Girls In Ahmedabad Phone No 8005736733 Elite Escort...
Gulbai Tekra * Cheap Call Girls In Ahmedabad Phone No 8005736733 Elite Escort...Gulbai Tekra * Cheap Call Girls In Ahmedabad Phone No 8005736733 Elite Escort...
Gulbai Tekra * Cheap Call Girls In Ahmedabad Phone No 8005736733 Elite Escort...
 
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
 
Jodhpur Park | Call Girls in Kolkata Phone No 8005736733 Elite Escort Service...
Jodhpur Park | Call Girls in Kolkata Phone No 8005736733 Elite Escort Service...Jodhpur Park | Call Girls in Kolkata Phone No 8005736733 Elite Escort Service...
Jodhpur Park | Call Girls in Kolkata Phone No 8005736733 Elite Escort Service...
 
RESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptx
RESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptxRESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptx
RESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptx
 
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
 
Digital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham WareDigital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham Ware
 
Abortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get CytotecAbortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get Cytotec
 
Reconciling Conflicting Data Curation Actions: Transparency Through Argument...
Reconciling Conflicting Data Curation Actions:  Transparency Through Argument...Reconciling Conflicting Data Curation Actions:  Transparency Through Argument...
Reconciling Conflicting Data Curation Actions: Transparency Through Argument...
 
Ranking and Scoring Exercises for Research
Ranking and Scoring Exercises for ResearchRanking and Scoring Exercises for Research
Ranking and Scoring Exercises for Research
 
Charbagh + Female Escorts Service in Lucknow | Starting ₹,5K To @25k with A/C...
Charbagh + Female Escorts Service in Lucknow | Starting ₹,5K To @25k with A/C...Charbagh + Female Escorts Service in Lucknow | Starting ₹,5K To @25k with A/C...
Charbagh + Female Escorts Service in Lucknow | Starting ₹,5K To @25k with A/C...
 
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
 
20240412-SmartCityIndex-2024-Full-Report.pdf
20240412-SmartCityIndex-2024-Full-Report.pdf20240412-SmartCityIndex-2024-Full-Report.pdf
20240412-SmartCityIndex-2024-Full-Report.pdf
 
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
 
Dubai Call Girls Peeing O525547819 Call Girls Dubai
Dubai Call Girls Peeing O525547819 Call Girls DubaiDubai Call Girls Peeing O525547819 Call Girls Dubai
Dubai Call Girls Peeing O525547819 Call Girls Dubai
 
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
 
Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...
 

1345 keynote roberts

  • 1. How to Use Survival Analytics to Predict Employee Turnover (Or, Why You Shouldn't Use Logistic Regression to Predict Attrition) Talent Analytics, Corp. Pasha Roberts, Chief Scientist May 16, 2017 San Francisco, CA USA Demo Code: https://github.com/talentanalytics/class_survival_101/ www.talentanalytics.com © 2017 Talent Analytics, Corp 1 / 41
  • 2. ▸ ▸ ▹ ▹ ▸ ▸ Who is Talent Analytics? Predictive Modeling Platform – Advisor™ We predict employee attrition and performance pre-hire. Much like credit risk modeling: Predict likelihood to pay / default on mortgage, before extending credit Predict likelihood to perform / leave role or company early, before extending job offer PAAS (Prediction As A Service) Seamless deployment of predictions into talent acquisition process www.talentanalytics.com © 2017 Talent Analytics, Corp 2 / 41
  • 3. Turnover and Tenure: A Timey-Wimey Relationship www.talentanalytics.com © 2017 Talent Analytics, Corp 3 / 41
  • 4. ▸ ▸ Two Different (But Related) Things Turnover: Percent of employees that terminate within a period of time Tenure: Employees’ length of time working at a role or company Is it a Wave or a Particle? www.talentanalytics.com © 2017 Talent Analytics, Corp 4 / 41
  • 5. Low Tenure, High Turnover www.talentanalytics.com © 2017 Talent Analytics, Corp 5 / 41
  • 6. High Tenure, Low Turnover www.talentanalytics.com © 2017 Talent Analytics, Corp 6 / 41
  • 7. High Tenure, High Turnover (in 2015) www.talentanalytics.com © 2017 Talent Analytics, Corp 7 / 41
  • 8. Low Tenure, Low Turnover (in Nov, Dec) www.talentanalytics.com © 2017 Talent Analytics, Corp 8 / 41
  • 9. ▸ ▸ ▸ Main Problem: We Can't See the Future Everyone will terminate. Someday, somehow... But when? Technical Lingo: Right Censoring www.talentanalytics.com © 2017 Talent Analytics, Corp 9 / 41
  • 10. ▸ ▸ ▸ Each Side Has Missing Information Turnover Has No Information About Tenure Treats a temp the same as a seasoned veteran Ignores patterns - such as people quitting right after a bonus Tenure Has No Information About Turnover High Tenure = staying power or retirement risk? Each Side Varies Depending on When You Look www.talentanalytics.com © 2017 Talent Analytics, Corp 10 / 41
  • 11. ▸ ▸ Annual Attrition Timing is Arbitrary What if you could see attrition at every point in time? “One Year” only matters to accountants and astronomers Cumulative Breakeven matters more to the business www.talentanalytics.com © 2017 Talent Analytics, Corp 11 / 41
  • 12. Survival 101 www.talentanalytics.com © 2017 Talent Analytics, Corp 12 / 41
  • 13. ▸ ▸ ▸ ▸ ▸ ▸ Survival Analytics Medical Roots - How long will a patient survive a disease? Engineering Roots - How long until a machine fails? Social Science - How long do people live? Employment - How long until an employee terminates? or promotes? In General - “Expected duration of time until one or more events happen” Artfully combines turnover and tenure Lots of competing jargon and syntax from many application domains: Failure Rate, Hazard Rate, Force of Mortality, ... www.talentanalytics.com © 2017 Talent Analytics, Corp 13 / 41
  • 14. ▸ ▸ ▸ ▸ Hazard Rate: Conditional Daily Risk of Termination Every day in tenure there is a (small) probability that the employee will terminate Conditional on survival to the prior day Like an actuarial life table Can rise at different times: e.g. training, review or bonus time www.talentanalytics.com © 2017 Talent Analytics, Corp 14 / 41
  • 15. ▸ ▸ ▸ ▸ Survival Rate: Daily Probability of Being Employed at Time t How likely are you to be working here, t days after being hired? Near 100% on Day 1 (some people actually never show up) Downhill from there - usually not linear One role at a time www.talentanalytics.com © 2017 Talent Analytics, Corp 15 / 41
  • 16. ▸ ▸ ▸ ▸ ▸ ▸ ▹ ▹ Survival Rate: Daily Probability of Being Employed at Time t How likely are you to be working here, t days after being hired? Near 100% on Day 1 (some people actually never show up) Downhill from there - usually not linear One role at a time This is the full picture Simple transformation of the Hazard Curve Best to use statistical tool (R, SAS) Apparently possible in Excel www.talentanalytics.com © 2017 Talent Analytics, Corp 15 / 41
  • 17. Attrition is Built Into Survival Curves www.talentanalytics.com © 2017 Talent Analytics, Corp 16 / 41
  • 18. Shelf Pattern Sparse Pattern Many Shapes www.talentanalytics.com © 2017 Talent Analytics, Corp 17 / 41
  • 19. Heavy Terms During Training Heavy Terms After 1-Year Bonus Cliffs or Kinks in the Survival Curve www.talentanalytics.com © 2017 Talent Analytics, Corp 18 / 41
  • 20. Everyone Has a Survival Curve www.talentanalytics.com © 2017 Talent Analytics, Corp 19 / 41
  • 21. Predicting Survival To the Individual Employee www.talentanalytics.com © 2017 Talent Analytics, Corp 20 / 41
  • 22. ▸ ▸ ▸ ▹ ▸ ▸ ▸ Don't Use Logistic Methods to Predict Attrition! Commonly Done: Predict one-year attrition with logistic Use tenure as a variable along with others Common Result: “The biggest cause of termination is tenure” Not an actionable coef cient Ignores nuances available to survival methods Mishandles current employee tenure Less accurate www.talentanalytics.com © 2017 Talent Analytics, Corp 21 / 41
  • 23. ▸ ▸ ▸ ▸ Proportional Hazard Most Survival models are “Proportional Hazard” Start with Base Survival rate Convert to Base Cumulative Hazard rate Multiply Cumulative Hazard by a single factor (hence “Proportional Hazard") Linear changes to Hazard lead to a rotation of Survival Curve Predictive Goal: Predict the amount of Survival Curve rotation for each candidate www.talentanalytics.com © 2017 Talent Analytics, Corp 22 / 41
  • 24. ▸ ▹ ▹ ▹ ▹ ▸ ▸ ▸ ▸ Proportional Hazard Prediction Methods Predicting a Single Number Cox regression Vanilla Cox Stepwise ElasticNet or LASSO Variable selection with Random Forest Decision Trees Random Forests Neural Networks SVM, etc There are also other survival algorithms, including parametric methods www.talentanalytics.com © 2017 Talent Analytics, Corp 23 / 41
  • 25. ▸ ▸ ▸ ▸ Build/Validate a Survival Model With Thresholds The Model Predicts: “Dark Blue” hires will survive the longest “Light Blue” hires will survive above-median “Grey” survival curve is current median survival. “Light Red” and “Dark Red” candidates are not likely to survive. www.talentanalytics.com © 2017 Talent Analytics, Corp 24 / 41
  • 26. Deeper Dive: Predicting Survival with Proportional Hazard Follow Demo Code on GitHub: https://github.com/talentanalytics/class_survival_101/ www.talentanalytics.com © 2017 Talent Analytics, Corp 25 / 41
  • 27. ▸ ▸ ▸ ▸ ▸ ▸ Formulas and Jargon Hazard Function:    Probability of termination at time t, conditional on survival up to time t Ranges from 0 to 1 (typically very small) Cumulative Hazard Function:   Cumulative conditional probability of termination up to time t Ranges from 0 to 4ish (theoretically to In nity) Survival Function:    Probability that termination will be later than time t Ranges from 1 to 0 Theoretically the formulas above are continuous measures, but in practice are discrete daily. www.talentanalytics.com © 2017 Talent Analytics, Corp h(t) H(t) = h(x) dx = −log(S(t))∫ t 0 S(t) = e −H(t) 26 / 41
  • 28. Calculating Survival in R Four Simple Steps with R survival package: 1. Gather and Prepare the Data 2. Calculate Baseline Survival Function 3. Model Proportional Hazard with Cox Regression 4. Validate Model www.talentanalytics.com © 2017 Talent Analytics, Corp (t)S0 27 / 41
  • 29. ▸ ▸ ▸ Calculating Survival in R Four Simple Steps with R survival package: 1. Gather and Prepare the Data 2. Calculate Baseline Survival Function 3. Model Proportional Hazard with Cox Regression 4. Validate Model 5. Deploy Model into hiring 6. Hire people likely to stay on the job longer 7. Save and earn money Avoid attributing savings to soft factors like ripple effect, morale, engagement Hard savings from reduced Replacement Cost Hard earnings from increased Employee Lifetime Value www.talentanalytics.com © 2017 Talent Analytics, Corp (t)S0 27 / 41
  • 30. ▸ ▹ ▸ ▹ ▹ 1) Gather Employee Data > attr.data emp.id hire.date term.date factor.x factor.y factor.z <int> <dttm> <dttm> <dbl> <dbl> <dbl> 1 52 2016-02-23 <NA> 0.09555369 1.515028 102.3811 2 261 2014-10-30 <NA> 2.40144024 9.065855 100.5727 3 193 2013-12-25 2015-03-28 1.88463382 4.331642 105.6887 4 150 2013-03-27 2013-09-19 3.81625316 7.363454 110.6842 5 31 2015-03-25 <NA> 5.17392728 11.824601 110.0590 Simple Requirements Just hire.date and term.date is all you need from HRIS Leave term.date as NA if the employee haven't terminated yet Merge with predictive input (independent) variables e.g. assessment results, experience, CV detail, social media, semantic Here we use factor.x, factor.y and factor.z www.talentanalytics.com © 2017 Talent Analytics, Corp 28 / 41
  • 31. ▸ ▸ ▹ ▹ ▸ ▹ Calculate Fields for Survival > attr.data %>% dplyr::select(-dplyr::contains("factor")) emp.id hire.date term.date is.term end.date tenure.years <int> <dttm> <dttm> <lgl> <dttm> <dbl> 1 52 2016-02-23 <NA> FALSE 2017-05-01 1.1854894 2 261 2014-10-30 <NA> FALSE 2017-05-01 2.5023956 3 193 2013-12-25 2015-03-28 TRUE 2015-03-28 1.2539357 4 150 2013-03-27 2013-09-19 TRUE 2013-09-19 0.4818617 5 31 2015-03-25 <NA> FALSE 2017-05-01 2.1026694 Simple Derivations is.term: Have they terminated? Is term.date NA? end.date: Either term.date, or censor date, depending on is.term Could put transfer.date here with no is.term The last date we know the employee was at work tenure.years: What is employee tenure from hire.date to end.date? All we know about the employee's tenure, as of censor date www.talentanalytics.com © 2017 Talent Analytics, Corp 29 / 41
  • 32. ▸ ▸ ▸ Dataset is Ready > glimpse(attr.data) Observations: 400 Variables: 14 $ label <fctr> a, a, a, a, a, a, a, a, a, a, a, a, a, a, a, a, a, a,... $ hire.date <dttm> 2013-04-20, 2016-01-14, 2013-09-20, 2014-05-17, 2016-... $ term.date <dttm> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 2016-... $ is.term <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE... $ end.date <dttm> 2017-05-01, 2017-05-01, 2017-05-01, 2017-05-01, 2017-... $ tenure.years <dbl> 4.02975576, 1.29469635, 3.61029191, 2.95626651, 1.1463... $ factor.x <dbl> 7.44846858, 1.08376058, 2.81983987, 0.56880027, 0.7868... $ factor.y <dbl> 16.838313, 9.335225, 9.519951, 12.697147, 12.912898, 1... $ factor.z <dbl> 103.10732, 100.37171, 99.76382, 95.88645, 97.61509, 10... $ emp.id <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,... $ scale.x <dbl> 2.41351313, -0.24754913, 0.47829958, -0.46285224, -0.3... $ scale.y <dbl> 1.31773198, -0.57718839, -0.53053544, 0.27187182, 0.32... $ scale.z <dbl> 0.3407242, -0.1849069, -0.3017096, -1.0467221, -0.7145... $ is.training <lgl> TRUE, TRUE, TRUE, FALSE, FALSE, FALSE, TRUE, TRUE, TRU... Example is 400-row simulated dataset Scaled the input factors to consistent z-scale Separated into 80% training and 20% validation datasets www.talentanalytics.com © 2017 Talent Analytics, Corp 30 / 41
  • 33. ▸ 2) Calculate Baseline Survival ( ) survival::Surv is a single object to hold Tenure and Turnover at once > surv.obj <- survival::Surv(training.data$tenure.years, training.data$is.term) Needs “time variable” tenure.years and “event variable” is.term survival::survfit is the Kaplan-Meier Estimator > surv.fit <- survival::survfit(surv.obj ~ 1) > summary(surv.fit) Call: survfit(formula = surv.obj ~ 1) time n.risk n.event survival std.err lower 95% CI upper 95% CI 0.000 320 1 0.997 0.00312 0.991 1.000 0.249 308 1 0.994 0.00448 0.985 1.000 0.298 303 1 0.990 0.00554 0.980 1.000 0.320 298 1 0.987 0.00644 0.974 1.000 www.talentanalytics.com © 2017 Talent Analytics, Corp (t)S0 31 / 41
  • 34. Native Survival Plot Crafted ggplot2 Plot Plot Baseline Survival www.talentanalytics.com © 2017 Talent Analytics, Corp 32 / 41
  • 35. ▸ ▸ ▸ ▸ ▸ 3) Back to Proportional Hazard Consider a multiplier x that centers on 1 Multiply for proportional (linear) changes to Hazard Survival Curve rotates in response: Beware - inverse relationship: x ← 1.1 will increase , decrease x ← 0.9 will decrease , increase So we just need to predict x to predict Survival One number drives the entire curve, based on baseline www.talentanalytics.com © 2017 Talent Analytics, Corp (t) = (t)xHx H0 (t) =Sx e − (t)xH0 (t)Hx (t)Sx (t)Hx (t)Sx (t)H0 33 / 41
  • 36. Model Proportional Hazard with Cox Regression > cox.model <- survival::coxph(formula = surv.obj ~ scale.x + scale.y + scale.z, data = training.data) > cox.model Call: survival::coxph(formula = surv.obj ~ scale.x + scale.y + scale.z, data = training.data) coef exp(coef) se(coef) z p scale.x -0.4298 0.6506 0.0918 -4.68 0.0000028 scale.y -0.3204 0.7258 0.0957 -3.35 0.00082 scale.z -0.1967 0.8215 0.0984 -2.00 0.04558 Likelihood ratio test=38.2 on 3 df, p=0.000000025 n= 320, number of events= 125 www.talentanalytics.com © 2017 Talent Analytics, Corp 34 / 41
  • 37. ▸ ▹ ▹ ▸ ▹ ▹ 4) Predict validation.data with New Cox Model > cox.pred <- predict(cox.model, newdata = validation.data, type = "lp") Note: we built model with training.data, now we predict against validation.data Our model has never seen anything in validation.data Great test to see how closely our predictions match known outcomes The predict.coxph() function returns 5 different types of predictions We want type = "lp" which is log(multiplier) to baseline Other types won't work with survivalROC() www.talentanalytics.com © 2017 Talent Analytics, Corp (t)H0 35 / 41
  • 38. ▸ ▸ ▸ ▸ How does ROC Work with Survival? > roc.obj <- survivalROC::survivalROC(Stime = validation.data$tenure.years, status = validation.data$is.term, marker = cox.pred, predict.time = 1, lambda = 0.003) > roc.obj$AUC [1] 0.7102442 ROC typically evaluates classification methods So, we make this a classi cation problem Classify: Was the employee terminated at t = 1 or not? Compare actual vs. predicted classi cations at all cutpoints Our AUC is 0.71 Not bad for a censored attrition prediction! www.talentanalytics.com © 2017 Talent Analytics, Corp 36 / 41
  • 39. Plot AUC with ggplot2 www.talentanalytics.com © 2017 Talent Analytics, Corp 37 / 41
  • 40. ▸ ▸ ▸ ▸ Don't Get Fooled by Randomness We are predicting AUC, too Let's try predicting 1000 random samples of the same underlying pattern > test.repl <- replicate(1000, demoPrediction(verbose = FALSE)) > summary(test.repl) Min. 1st Qu. Median Mean 3rd Qu. Max. 0.3470 0.6191 0.6833 0.6771 0.7356 0.9446 AUC generally fell between 0.619 and 0.735, but 50% were outside this range The likely AUC is closer to 0.68 - still good What if you got a lucky (or unlucky) 80/20 validation split? www.talentanalytics.com © 2017 Talent Analytics, Corp 38 / 41
  • 41. ▸ ▸ Multiple Cross-Validation Raises the Bar Against Getting Fooled Typically run 20-100 CV iterations Can consume lots of core-hours if nested for hyperparameters www.talentanalytics.com © 2017 Talent Analytics, Corp 39 / 41
  • 42. ▸ ▸ ▸ ▸ ▸ Download and Experiment With this Code These slides will be on PAW website Download demo code and doc from https://github.com/talentanalytics/class_survival_101/ Clone it, fork it, run it Send bug xes Let me know how it goes www.talentanalytics.com © 2017 Talent Analytics, Corp 40 / 41
  • 43. Questions and Discussion Pasha Roberts pasha@talentanalytics.com Follow Demo Code on GitHub: https://github.com/talentanalytics/class_survival_101/ www.talentanalytics.com © 2017 Talent Analytics, Corp 41 / 41