Blue Canary was a higher education data and analytics company based in Phoenix, Arizona, USA, acquired by Blackboard Inc. in December 2015. We worked with a university to help predict at-risk students in its undergraduate degree programs. Our model predicted attendance in a given week, since we knew that missing a week of class was a proxy for attrition. The models were trained and selected using standard efficacy measures (precision, recall, F1 score). After using the models in production for six months, we saw that those metrics on live data tracked the training metrics closely. This validated the development of our predictive models.
This presentation was part of the Practitioner Track at LAK16, delivered April 28, 2016.
LAK16 Practitioner Track presentation: "Model Accuracy: Training vs Reality"
Slide 1
Model Accuracy
Training vs Reality
Mike Sharkey & Brian Becker
Blue Canary
Delivered by Dan Rinzel
Blackboard, Inc.
#LAK16 - Practitioner Track
April 28th, 2016
Slide 3
Project Goals
Blue Canary built a predictive model for a client institution's students enrolled in their online program, to assess attrition risk.
- 7-week courses, with rolling starts every week
- Policy definition for weekly attendance: students expected to attend & post on 4 out of 7 days each week
- A strong correlation between attendance & attrition was assumed
Trained the model on data that included attendance and attrition:
- 1,456 distinct courses that ran between Jan 2013 & Aug 2014
- Mean class size (x̄) of 23 enrolled students
- 19,506 distinct students
With the model proven, we ran a live 6-month pilot:
- Rolled out to 100 faculty members teaching 1 of 3 introductory courses in the bachelor's degree program (~4,500 students)
- Enabled integrated alerts for student advisors
- Compared predictions to actual behavior
Slide 4
Data Collection Process
We collected SIS and LMS fields from the institution to get historical data for training the predictive model.
Historically, we know whether or not a student met the attendance requirements, so we have the outcomes needed to develop a model.
From there, we split the data into three buckets: 70% of the data was used to train the model, and two other buckets of 15% each were used to test and validate the model.
We then took specific fields that are important in identifying student behavior and used them to construct features. These features are the inputs to the random forest machine learning modeling process, sketched below.
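As a minimal sketch of that workflow (not the production pipeline; the feature names and DataFrame layout here are assumptions for illustration), the 70/15/15 split and random forest training could look like this in Python with scikit-learn:

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import precision_score, recall_score, f1_score
from sklearn.model_selection import train_test_split

# Hypothetical feature columns; the real model used 18 SIS/LMS features.
FEATURES = ["incoming_gpa", "transfer_credits", "days_since_last_post",
            "posts_last_7_days", "met_prior_week_attendance"]
LABEL = "met_attendance"  # 1 = met the 4-of-7-days policy, 0 = did not

def train_attendance_model(df: pd.DataFrame):
    # 70% train; split the remaining 30% in half for test (15%) and validation (15%).
    train, holdout = train_test_split(df, test_size=0.30, random_state=42)
    test, validate = train_test_split(holdout, test_size=0.50, random_state=42)

    model = RandomForestClassifier(n_estimators=200, random_state=42)
    model.fit(train[FEATURES], train[LABEL])

    # Score the held-out test bucket with the same efficacy measures used here.
    preds = model.predict(test[FEATURES])
    print("precision:", precision_score(test[LABEL], preds))
    print("recall:   ", recall_score(test[LABEL], preds))
    print("F1:       ", f1_score(test[LABEL], preds))
    return model, validate  # validation bucket reserved for model selection
```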
Slide 5
Data Collection Process
Features sourced from SIS data:
- Incoming GPA
- Inbound Transfer Credits
- Previous Course Grade
- Family Income
- Age
- Days since last course
- Gender
- Credits earned (% of attempted)
- Military service
- Degree Program
- # Failed/Dropped Courses
Features sourced from LMS data:
- Current Course Grade
- Met prior week attendance?
- # days with posts in the last 7
- # posts decile (main forum)
- # posts decile (all forums)
- Days since last post
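Most SIS features come directly from institutional records, while the LMS features must be derived from activity logs. As an illustrative sketch only (the `student_id` and `post_date` column names and the log layout are assumptions, not the actual schema), two of the LMS features above could be computed like this:

```python
import pandas as pd

def lms_features(posts: pd.DataFrame, as_of: pd.Timestamp) -> pd.DataFrame:
    """Derive per-student LMS features from a forum-post log.

    posts: one row per post, with columns student_id and post_date (datetime).
    """
    recent = posts[posts["post_date"] > as_of - pd.Timedelta(days=7)]

    # "# days with posts in the last 7": distinct active days per student.
    days_with_posts = (recent.groupby("student_id")["post_date"]
                             .apply(lambda s: s.dt.normalize().nunique())
                             .rename("days_with_posts_last_7"))

    # "Days since last post": gap between the as-of date and the latest post.
    days_since_last = ((as_of - posts.groupby("student_id")["post_date"].max())
                       .dt.days
                       .rename("days_since_last_post"))

    features = pd.concat([days_with_posts, days_since_last], axis=1)
    return features.fillna({"days_with_posts_last_7": 0})
```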
Slide 6
Measuring Efficacy: Methodology
To determine the accuracy of our machine learning model, we use the numerical values from a confusion matrix to calculate precision, recall, and F1 score.
In our scenario, precision is defined on the positive side as: of the students we predicted would attend class that week, what percent actually attended?
Recall is defined as: of the students who did attend class that week, what percent did we accurately predict?
The F1 score is simply the harmonic mean of precision and recall.
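In terms of the confusion matrix counts (true positives TP, false positives FP, false negatives FN), these are the standard definitions:

```latex
\[
\text{precision} = \frac{TP}{TP + FP}, \qquad
\text{recall} = \frac{TP}{TP + FN}, \qquad
F_1 = \frac{2 \cdot \text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}
\]
```

For example, the Week 1-2 training numbers reported in the notes below (precision 84%, recall 91%) give F1 = 2(0.84)(0.91)/(0.84 + 0.91) ≈ 0.87, matching the 87% reported there.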
We went live with predictions in April 2015, fed the model with current data each day, and compared actual weekly results against the accuracy of the initial training model over a 6-month span.
Slide 8
Measuring Efficacy: Results & Lessons Learned
[Graphs comparing precision, recall, and F1 score between training and practice.]
[Figure: horizontal bar chart, "Feature drivers ranked by importance within model," with importance on a 0-0.25 scale for the Week 0-1 and Week 2-6 models. Features, from least to most important: # withdrawn courses, # failed courses, credits earned (% of attempted), degree program, military status, days since last course, gender, current class days since last post, age bracket (decade), previous course grade, salary decile, current class total posts decile, cumulative GPA, transfer credits, current class previous week # posts, current class days with posts (rolling 7 day), current class previous week attendance, and current class cumulative performance.]
Slide 9
Enabling Triage & Intervention
- Augmenting the other tools available to teachers in fully online courses
- Creating efficiencies for advisors who may have large caseloads of students to help with attrition-risk diagnosis & intervention
- Giving both groups supplemental confidence in the prediction numbers
- Providing a Create Alert call to action
Slide 12
Key Takeaways
After running the model for six months, we saw that the actual model efficacy tracked very closely with the predicted model efficacy from training. This is a positive testament to the power and validity of the model.
Additionally, the model accuracy numbers we saw (in the 75-80% range) are very much in line with the accuracy rates we have seen with models at other institutions. This adds another level of confidence for using predictive models as a diagnostic tool to address at-risk students and for turning those models into intervention-based actions.
Speaker Notes
Will the student attend/post on 4 out of the 7 days of the week?
Zero attendance for two weeks triggered an administrative auto-drop.
For this model, this was the set of 18 features that meaningfully contributed to the prediction
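As a concrete restatement of the two attendance rules above (hypothetical helper functions, not code from the product):

```python
def met_weekly_attendance(days_with_activity: int) -> bool:
    """Policy: a student must attend/post on 4 of the 7 days in a week."""
    return days_with_activity >= 4

def auto_dropped(weekly_attendance_days: list[int]) -> bool:
    """Zero attendance for two consecutive weeks meant an administrative auto-drop."""
    return len(weekly_attendance_days) >= 2 and all(
        days == 0 for days in weekly_attendance_days[-2:])
```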
Precision:
- Week 1-2 model: 84% in training, 80% in practice
- Week 3-7 model: 84% in training, 84% in practice
Recall:
- Week 1-2 model: 91% in training, 89% in practice
- Week 3-7 model: 87% in training, 84% in practice
F1 score:
- Week 1-2 model: 87% in training, 85% in practice
- Week 3-7 model: 85% in training, 84% in practice
Originally, one predictive model was built for the entire 7-week course. This presented a problem, however, because the predictors of attendance change as students progress through the course, and creating multiple models would yield higher accuracy. We therefore created seven different models, one for every week of the course, but maintaining seven models in the software proved difficult, and we realized that by combining models from certain weeks we could keep a high level of accuracy with far fewer models. We finally settled on two models (a Week 0-1 model and a Week 2-6 model; see the routing sketch after these notes), since the drivers of the model were similar within these ranges, with cumulative performance standing out as the strongest driver from Week 2 onward.
With the software and technology infrastructure available from the Blackboard acquisition, we will be able to generate and maintain a separate model for every week, so we won't be as concerned with "forcing" a breakpoint like this in the modeling, but it is illustrative.
Notice that the demographic data available before the class begins, used to make the Week 0 prediction, still provides useful drivers, including previous GPA, transfer credits, and previous course grade.
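In software terms, the two-model compromise amounts to dispatching each weekly prediction to one of two trained models based on the course week. A minimal sketch of that routing, assuming two already-trained classifiers such as the random forests from the earlier snippet:

```python
def predict_attendance(week: int, features, week_0_1_model, week_2_6_model):
    """Route a student's feature vector to the model covering the current week."""
    model = week_0_1_model if week <= 1 else week_2_6_model
    return model.predict([features])[0]  # 1 = predicted to attend this week
```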
But so what? We have a solid model with pretty high confidence, but how do we enable action based on these models?
Talking point – show the break point between the Week1-2 model and the Week3-7 model & talk about how we got there.