This document discusses using data science and machine learning to improve the customer experience at Comcast. It describes using error data from set-top boxes and customer behavior data to predict when customers will call and the reasons for their calls. H2O machine learning algorithms were able to more accurately predict calls and reasons compared to Spark ML, improving the customer service experience. Overall, adopting H2O's algorithms provided superior results, faster performance, and better use of memory compared to alternative tools.
Better Customer Experience with Data Science - Bernard Burg, Comcast
1. Better Customer Experience with Data Science
(just add water)
Bernard Burg
Comcast
bernard_burg@comcast.com
7/19/16 H2O Open Tour 2016, New York 1
2. XFINITY TV
XFINITY Internet
XFINITY Voice
XFINITY Home
Digital & Other
Other
*Minority interest and/or non-controlling interest.
Slide is not comprehensive of all Comcast NBCUniversal assets
Updated: December 22, 2015
3. Complex Troubleshooting
• Failure scenario
– Customer orders a Video-on-Demand
– Transaction fails, customer care call initiated
• Consequences
– Unhappy customer: no visibility or opportunity to mitigate issue
– Potentially avoidable phone call
• Numerous potential reasons for failure
– Billing
– Resource unavailable
– Service issue
– Hardware issue (set-top box or router)
– Software issue
– Parental control settings
4. Analysis
• What brought the customer to this point?
– Call records
– Billing history
– Events generated by hardware
– Upstream outages
– Usage spikes
• What’s the best course of action now?
• How can we predict such issues?
5. Project Goals
Improve Customer Experience
• Keep our customers informed
• Empower our CARE agents
– Timely, accurate, complete information & context
– Smart recommendations
• Higher first call resolution
Maximize Efficiency
• Customer self service
– Fewer calls & truck rolls
• Self-healing and assisted-healing equipment
6. Goal of Data Science
Each user's set-top boxes can send 150+ distinct error codes at any time:
Goal 1: predict if a user will call
Goal 2: predict why they call
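As a minimal sketch of how a stream of error codes could become model input for either goal, the snippet below counts how often each code appears for one user. The code names (`E001`, ...) and the `error_count_features` helper are hypothetical and not from the talk, and plain counts ignore timing, which the talk's temporal model presumably captures.

```python
from collections import Counter

# Hypothetical error-code vocabulary; the talk mentions 150+ real codes.
ERROR_CODES = ["E001", "E002", "E003"]


def error_count_features(events, codes=ERROR_CODES):
    """Turn one user's error-code events into a fixed-length count vector."""
    counts = Counter(events)
    # Codes outside the known vocabulary are ignored.
    return [counts.get(code, 0) for code in codes]


# Hypothetical event log for one user over some time window.
user_events = ["E002", "E001", "E002", "E999"]
features = error_count_features(user_events)
print(features)
```

A fixed-length vector like this is the kind of row a Gradient Boosting Machine can train on, with "called / did not call" as the target for Goal 1 and the call reason as the target for Goal 2.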
7. Predicting User Calls
Using Error Model Alone
• Model input: temporal model
• Data science: Gradient Boosting Machine
• Output classes: calls, no-calls
• 66% accuracy
• The algorithm reached a glass ceiling
Using Error + User Behavior Models
• Model inputs: temporal model, behavior model
• Data science: Gradient Boosting Machine
• Output classes: calls, no-calls
• 79% accuracy
8. Predicting Why Users Call
A Single Algorithm Predicting 10 Buckets
• Model input: temporal model
• Data science: Gradient Boosting Machine
• 47% accuracy is not great, but is about 5 times better than random

                | Spark ML             | H2O
Accuracy        | 42%                  | 47%
Processing time | 10 minutes           | 2 minutes
Memory          | Limited size of test | No limit reached
Ease of use     | Program DataFrame    | UI
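The "about 5 times better than random" remark can be checked with a short calculation. This sketch assumes that "random" means guessing uniformly among 10 equally likely buckets, which the slide does not state explicitly; the function names are illustrative.

```python
def random_baseline(n_buckets):
    """Expected accuracy of uniform random guessing over balanced buckets."""
    return 1.0 / n_buckets


def lift_over_random(accuracy, n_buckets):
    """How many times better a model is than the random baseline."""
    return accuracy / random_baseline(n_buckets)


# Accuracies reported on the slide for the single 10-bucket model.
h2o_lift = lift_over_random(0.47, 10)
spark_lift = lift_over_random(0.42, 10)
print(f"H2O: {h2o_lift:.1f}x random, Spark ML: {spark_lift:.1f}x random")
```

If the buckets are unbalanced in practice, always guessing the most frequent bucket would be a stronger baseline than 10%, and the lift would be smaller.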
9. Predicting Why Users Call
10 Specialized Algorithms Predicting 10 Buckets
Very easy to do in Sparkling Water:
• Map the enum to n binary buckets (here, 10 binary buckets)
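The enum-to-binary mapping can be sketched in plain Python as follows. This is only an illustration of the idea, not the Sparkling Water code used in the talk; the `to_binary_buckets` helper and the sample labels are assumptions.

```python
def to_binary_buckets(labels, n_buckets):
    """Map enum labels (0..n_buckets-1) to one 0/1 target list per bucket."""
    return [
        [1 if label == bucket else 0 for label in labels]
        for bucket in range(n_buckets)
    ]


# Hypothetical call reasons for four calls, using the talk's bucket numbering
# (0 = activations, 2 = billing, 9 = technical).
call_reasons = [2, 9, 0, 2]
binary_targets = to_binary_buckets(call_reasons, 10)

print(binary_targets[2])  # target for the "billing" specialist
```

Each of the 10 lists then serves as the target column for one specialized binary model ("is this call about bucket k or not?").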
11. Predicting Why Users Call
Looks good but…
• Model input: temporal model
• Data science: Gradient Boosting Machine
Accuracy               | Spark ML | H2O  | H2O's gain
Bucket 0: activations  | 97%      | 99%  | 2%
Bucket 1: appointment  | 97%      | 99%  | 2%
Bucket 2: billing      | 84%      | 86%  | 2%
Bucket 3: op-3         | 90%      | 93%  | 3%
Bucket 4: op-4         | 85%      | 90%  | 5%
Bucket 5: op-5         | 99%      | 99%  | 0%
Bucket 6: op-6         | 98%      | 100% | 2%
Bucket 7: op-7         | 80%      | 82%  | 2%
Bucket 8: op-8         | 93%      | 97%  | 4%
Bucket 9: technical    | 66%      | 87%  | 21%
Average accuracy       | 89%      | 95%  | 6%
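As a quick sanity check, the per-bucket figures can be averaged directly. The sketch below uses the rounded percentages from the table and a plain unweighted mean; whether the slide's averages were weighted by bucket size or computed from unrounded values is not stated, so small differences from the "Average accuracy" row are to be expected.

```python
# Per-bucket accuracies (percent), buckets 0 to 9, copied from the table.
spark_ml = [97, 97, 84, 90, 85, 99, 98, 80, 93, 66]
h2o = [99, 99, 86, 93, 90, 99, 100, 82, 97, 87]


def mean(values):
    """Unweighted arithmetic mean."""
    return sum(values) / len(values)


spark_mean = mean(spark_ml)
h2o_mean = mean(h2o)
gains = [h - s for s, h in zip(spark_ml, h2o)]

print(f"Spark ML mean: {spark_mean:.1f}%")
print(f"H2O mean: {h2o_mean:.1f}%")
print(f"Largest gain: {max(gains)} points (bucket {gains.index(max(gains))})")
```

The unweighted mean of the rounded H2O column comes out somewhat below the 95% shown on the slide, which may reflect weighting or rounding in the original computation. The largest gain, 21 points, is on bucket 9 (technical).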
• Data science: Gradient Boosting Machine

                | Spark ML             | H2O
Accuracy        | ?                    | 60%
Processing time | 10 * 10 minutes      | 11 * 2 minutes
Memory          | Limited size of test | No limit reached
Ease of use     | Program DataFrame    | UI

Why this drop from 95% to 60%?
12. Predicting Why Users Call
Learning 10 Specialized Algorithms in H2O
13. Overlapping Buckets
The hope raised by the 95% composite accuracy of the 10 binary algorithms did not materialize: overlapping classes cause elements to be misclassified, as shown in the ROC (receiver operating characteristic) charts drawn by H2O.
[Figure: two ROC charts drawn by H2O; axes labeled false positive and true positive]
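The effect of overlapping buckets can be illustrated with a toy example: each binary specialist looks accurate on its own, yet the combined prediction is wrong whenever two specialists both claim the same call. The data below is invented, uses 3 buckets instead of 10, and combines the specialists by taking the highest score, which is an assumption since the talk does not say how the 10 outputs were merged.

```python
def argmax(values):
    """Index of the largest value (first one wins on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


def binary_accuracy(y_true, scores, bucket, threshold=0.5):
    """Accuracy of one specialist answering 'is this call in my bucket?'."""
    hits = sum(
        (row[bucket] >= threshold) == (label == bucket)
        for label, row in zip(y_true, scores)
    )
    return hits / len(y_true)


def composite_accuracy(y_true, scores):
    """Accuracy when the bucket with the highest score is the prediction."""
    hits = sum(argmax(row) == label for label, row in zip(y_true, scores))
    return hits / len(y_true)


# Invented data: true bucket per call, and one score per bucket per call.
y_true = [0, 0, 1, 1, 2, 2]
scores = [
    [0.9, 0.1, 0.2],
    [0.6, 0.7, 0.1],  # buckets 0 and 1 overlap: both specialists say yes
    [0.2, 0.8, 0.1],
    [0.7, 0.6, 0.1],  # buckets 0 and 1 overlap again
    [0.1, 0.2, 0.9],
    [0.1, 0.1, 0.8],
]

per_bucket = [binary_accuracy(y_true, scores, b) for b in range(3)]
mean_binary = sum(per_bucket) / len(per_bucket)
combined = composite_accuracy(y_true, scores)
print(f"mean binary accuracy: {mean_binary:.2f}, combined: {combined:.2f}")
```

In this toy case the specialists average about 0.89 while the combined prediction reaches only about 0.67: a single false positive costs one specialist little, but it can flip the final answer for that call.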
14. Forecasting Improvements with H2O
• Hypothesis case 1: bucket 2 (billing) can be predicted with 100% accuracy
– Replace the estimation with the actual result
• The overall prediction model would jump to 75% accuracy
15. Forecasting Improvements
• By fixing one of the problematic buckets:
– The overall prediction model would jump to 75% accuracy
• By fixing both problematic buckets:
– The overall prediction model would jump to 86% accuracy
These simple forecasts are worth gold,
as they allow us to focus on the essentials
(out of thousands of parameters)
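One way to read these forecasts is as an oracle substitution: replace one bucket's estimate with the true answer and re-measure the overall accuracy, to see how much a perfect fix of that bucket could be worth. The sketch below uses invented scores for 3 buckets and a highest-score combination rule; both are assumptions, not the talk's data or method.

```python
def argmax(values):
    """Index of the largest value (first one wins on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


def composite_accuracy(y_true, scores):
    """Accuracy when the bucket with the highest score is the prediction."""
    hits = sum(argmax(row) == label for label, row in zip(y_true, scores))
    return hits / len(y_true)


def with_oracle(y_true, scores, oracle_buckets):
    """Copy the scores, replacing the chosen buckets' estimates by the truth."""
    fixed = []
    for label, row in zip(y_true, scores):
        new_row = list(row)
        for bucket in oracle_buckets:
            new_row[bucket] = 1.0 if label == bucket else 0.0
        fixed.append(new_row)
    return fixed


# Invented data: true bucket per call, and one score per bucket per call.
y_true = [0, 0, 1, 1, 2, 2]
scores = [
    [0.9, 0.1, 0.2],
    [0.6, 0.7, 0.1],  # misclassified: bucket 0 call scored higher for bucket 1
    [0.2, 0.8, 0.1],
    [0.1, 0.6, 0.7],  # misclassified: bucket 1 call scored higher for bucket 2
    [0.1, 0.2, 0.9],
    [0.1, 0.1, 0.8],
]

baseline = composite_accuracy(y_true, scores)
fix_one = composite_accuracy(y_true, with_oracle(y_true, scores, [0]))
fix_two = composite_accuracy(y_true, with_oracle(y_true, scores, [0, 2]))
print(f"baseline {baseline:.2f}, fix one {fix_one:.2f}, fix two {fix_two:.2f}")
```

Running the substitution for each bucket in turn ranks the buckets by potential payoff, which is what makes it possible to focus effort before touching any of the underlying parameters.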
16. Conclusion
The choice to switch to H2O was simple
• Superior results (accuracy)
• Faster algorithms (factor 3)
• Better use of memory
• Accelerated studies because of
– Input UI that lets you select/deselect columns
– Very smart output UI (ROC, influential parameters…)
• Stable and reliable algorithms
Room for improvement:
• The Sparkling Water interface showed some instabilities
• We worked around it by generating CSV files
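A CSV hand-off of this kind could look like the following minimal sketch, which serializes prepared feature rows with Python's standard `csv` module. The column names, the sample rows and the `rows_to_csv` helper are hypothetical; the actual export in the talk would have come from Spark, and the import into H2O is not shown.

```python
import csv
import io


def rows_to_csv(header, rows):
    """Serialize a header and data rows to CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical training rows: per-user error counts plus the call target.
header = ["user_id", "E001", "E002", "called"]
rows = [
    ["u1", 1, 2, 1],
    ["u2", 0, 0, 0],
]

csv_text = rows_to_csv(header, rows)
print(csv_text)
```

Writing the same text to a file instead of a string gives an artifact that can be loaded through H2O's import tools, decoupling the data preparation from the modeling step.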