This document summarizes building a data solution for a telecom company. The objectives were to personalize recommendations and to increase customer retention and loyalty. The data comprised large volumes of static and dynamic customer data. The pipeline clustered customers into groups, identified high- and low-profit customers as outliers, and applied feature selection before building learning models. The models were used to cluster customers by revenue, profit, usage, discounts, and costs in order to identify optimization opportunities.
Telecom Data Analytics
1. Story of Building a Telecom Data Solution
Sawinder Pal Kaur, PhD
Data Scientist, SAP Labs
2. Outline
1. Defining business objectives and translating the business problem into a data science problem
2. Introduction to telecom data: data scale, volume, continuous and categorical variables, static and dynamic data
3. Architecture and data processing pipeline: big data handling and data science methods for categorical feature selection
4. Solution engineering: how can project managers do feature selection and identify opportunities to optimize the existing plans and services?
4. Business Objective
• Personalize recommendations
• More customer satisfaction
• Improved customer retention
• Increased frequency of selling
• Better mix of products
• Increased customer loyalty
• Better decisions on coupons and discounts
• Develop effective strategy for new product launches
• Better offers to specific customer profiles
• Better product design / pricing
• Improve quality of service for highest margin customers
• Invest where highest margin customers are using the network resources
Recommend Plans and Services
Grouping/Clustering
Identify Profit Maximization Opportunities
6. Data
• How much data is available?
• Data infrastructure
• Data dashboards
• Data preparation for machine learning
• Data protection and privacy
7. Partitioning the data into similar groups
Multi-dimensional clustering
Grouping customers: one-dimensional binning/clustering
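The two grouping ideas on this slide can be sketched in a few lines. This is a minimal illustration, not the pipeline used in the project: the slides do not name an algorithm, so equal-frequency (quantile) binning and plain k-means are assumed here, and the revenue and usage figures are made-up sample data.

```python
import statistics

def quantile_bins(values, n_bins=3):
    """Assign each value to one of n_bins equal-frequency bins (0 = lowest)."""
    # statistics.quantiles returns the n_bins - 1 cut points between bins.
    cuts = statistics.quantiles(values, n=n_bins)
    return [sum(v > c for c in cuts) for v in values]

def kmeans(points, k, n_iter=20):
    """Plain Lloyd's algorithm on tuples of floats; returns (labels, centroids)."""
    # Deterministic start for this sketch: the first k points seed the centroids.
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[j] = tuple(statistics.fmean(dim) for dim in zip(*members))
    return labels, centroids

# Hypothetical monthly revenue per customer (one dimension).
revenue = [12.0, 15.0, 18.0, 40.0, 42.0, 45.0, 90.0, 95.0, 99.0]
revenue_bin = quantile_bins(revenue, n_bins=3)

# Hypothetical (revenue, data usage in GB) pairs (two dimensions).
customers = [(12.0, 1.0), (90.0, 20.0), (15.0, 1.5),
             (95.0, 22.0), (14.0, 0.8), (99.0, 25.0)]
labels, centroids = kmeans(customers, k=2)
```

On real data the features would normally be scaled before clustering, since k-means distances are dominated by whichever variable has the largest range.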
8. High-, low-, and normal-profit customers
One-dimensional outlier detection
Multi-dimensional outlier detection
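For the one-dimensional case, one common way to split customers into low, normal, and high profit is Tukey's interquartile-range rule. The slides do not say which detector was used, so the rule, the 1.5 multiplier, and the profit figures below are all assumptions made for illustration; the multi-dimensional case is not covered by this sketch.

```python
import statistics

def label_by_iqr(values, k=1.5):
    """Label each value 'low', 'normal', or 'high' using Tukey's IQR fences."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    labels = []
    for v in values:
        if v < lower:
            labels.append("low")      # unusually unprofitable customer
        elif v > upper:
            labels.append("high")     # unusually profitable customer
        else:
            labels.append("normal")
    return labels

# Hypothetical profit per customer; a negative value means the customer
# costs more to serve than they pay.
profit = [-60.0, 8.0, 10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 120.0]
profit_label = label_by_iqr(profit)
```

The fences are derived from the quartiles rather than the mean and standard deviation, so a few extreme customers do not shift the thresholds much.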
9. • Dealing with missing values:
• Delete the rows with missing values
• Replace missing values using
• mean/median
• another fixed number
• conditional mean
• a model such as k-nearest neighbors
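The simpler options in this list can be sketched as follows. This is only an illustration under assumptions: missing values are represented as `None`, the usage figures and plan names are made up, and the conditional mean is taken to be the mean within a grouping variable such as plan type. Model-based imputation (for example with k-nearest neighbors) is not shown.

```python
import statistics

def drop_missing(rows):
    """Delete every row that contains at least one missing (None) value."""
    return [r for r in rows if all(v is not None for v in r)]

def impute(values, strategy="mean", fill_value=0.0):
    """Replace None entries with the mean, the median, or a constant."""
    observed = [v for v in values if v is not None]
    if strategy == "mean":
        fill = statistics.fmean(observed)
    elif strategy == "median":
        fill = statistics.median(observed)
    elif strategy == "constant":
        fill = fill_value
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return [fill if v is None else v for v in values]

def impute_conditional_mean(values, groups):
    """Replace None with the mean of the observed values in the same group."""
    by_group = {}
    for v, g in zip(values, groups):
        if v is not None:
            by_group.setdefault(g, []).append(v)
    # A group with no observed values at all would raise KeyError below.
    group_mean = {g: statistics.fmean(vs) for g, vs in by_group.items()}
    return [group_mean[g] if v is None else v for v, g in zip(values, groups)]

# Hypothetical monthly data usage (GB) with gaps, and each customer's plan.
usage = [2.0, None, 4.0, 20.0, None, 30.0]
plan = ["basic", "basic", "basic", "premium", "premium", "premium"]

complete_rows = drop_missing(list(zip(usage, plan)))
usage_mean = impute(usage, "mean")
usage_median = impute(usage, "median")
usage_conditional = impute_conditional_mean(usage, plan)
```

In this sample the overall mean fills both gaps with the same value, while the conditional mean fills the basic-plan gap and the premium-plan gap with different, plan-specific values.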
10. • Filter methods: feature selection done independently of the model, e.g. Pearson correlation, mutual information, mRMR
• Dimensionality reduction: PCA, variational autoencoder
• Feature engineering
• Creating new variables: polynomials, interaction variables, ratios
• Wrapper and embedded methods: used within the model building process
[Diagram labels: Feature selection, Base set, Learning model, Performance]
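A filter method scores each feature against the target without training a model. The sketch below ranks features by absolute Pearson correlation and keeps the top few. The feature names, the values, and the choice of revenue as the target are hypothetical, and the helper names `pearson` and `rank_features` are introduced only for this example.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / (sx * sy)

def rank_features(features, target, top_k):
    """Return the top_k feature names by absolute correlation with the target."""
    scores = {name: abs(pearson(col, target)) for name, col in features.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

# Hypothetical per-customer features and a hypothetical revenue target.
features = {
    "minutes_used": [100, 200, 300, 400, 500],
    "discount_pct": [10, 0, 10, 0, 10],
    "support_calls": [5, 5, 3, 2, 2],
}
revenue = [20.0, 31.0, 39.0, 52.0, 60.0]

selected = rank_features(features, revenue, top_k=2)
```

Pearson correlation only captures linear relationships between numeric variables; mutual information, also listed on the slide, is one alternative for categorical or non-linear cases.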