1
Machine Learning
Applied to Telecom
Using R and H2O.ai
SP Big Data Meetup - March 2016
2
• Master’s Degree in Applied Computational Intelligence (2015)
• Specialist in Database Engineering (2014)
• Former Data Miner in Financial Markets
• Mineração de Dados - https://mineracaodedados.wordpress.com/
• E-mail: flavioclesio@gmail.com / twitter: @flavioclesio
Flávio Clésio – @flavioclesio
Machine Learning Engineer @ Movile
3
Introduction
Problem Definition
Survival Analysis
Deep Learning
Architecture
Conclusions and Recommendations
Overview
4
• Very competitive environment: Competition, government,
economy, tight margins, 3G and mobile internet, and so on
• Customer saturation = skyrocketing churn
• Retention is the name of the game
• Quality over money (content is king!)
Introduction
Value-Added Services (VAS) industry
5
• Ensemble Learning using Survival Analysis and Deep Learning
• After that, we’ll score our subscribers
• Each score range will have a special treatment
• Development of retention strategies: price elasticity, cross-selling,
special offers, channel acquisition analysis, up-front payments,
freemium strategies, promotional strategies, pay-as-you-go
Problem Definition – How to stop the Churn?
Prescriptive Churn Modeling
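The scoring idea on this slide can be sketched as follows. This is a minimal illustration, not the pipeline used at Movile: the equal-weight blend, the score bands, and the treatment names are all assumptions made up for the example, since the slides do not give the real ones.

```python
# Hypothetical score bands (lower bound, treatment), highest risk first.
# The real bands and treatments are not given in the slides.
TREATMENTS = [
    (0.75, "special offer"),
    (0.50, "cross-selling"),
    (0.25, "promotional message"),
    (0.00, "no action"),
]


def ensemble_score(survival_risk, deep_learning_risk, weight=0.5):
    """Blend two churn-risk estimates (each in [0, 1]) with a weighted average.

    A weighted average is only one simple way to combine two models;
    it is assumed here for illustration.
    """
    return weight * survival_risk + (1.0 - weight) * deep_learning_risk


def treatment_for(score):
    """Map a churn score in [0, 1] to the treatment of its score range."""
    for lower_bound, treatment in TREATMENTS:
        if score >= lower_bound:
            return treatment
    raise ValueError("score must be in [0, 1]")


# Example: a subscriber rated 1.0 by one model and 0.5 by the other.
score = ensemble_score(1.0, 0.5)
print(score, treatment_for(score))
```

Keeping the bands in a single table makes it easy to change a treatment for one score range without touching the scoring code.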
6
• Survival function S(t) = P(T > t) = 1 - F(t):
T = Event time
f(t) = Density function
F(t) = P(T <= t) = Cumulative distribution function
Survival Analysis
Statistical procedure that estimates Time-to-Event probability of survival
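As a rough illustration of how a survival curve can be estimated from subscriber data, here is a small Kaplan-Meier sketch. The slides do not say which estimator was used, so Kaplan-Meier is an assumption; the function name and the sample data are hypothetical.

```python
from collections import Counter


def kaplan_meier(durations, churned):
    """Return [(t, S(t))] for each time t with at least one churn event.

    durations: observed time per subscriber (e.g. weeks subscribed)
    churned:   True if the subscriber cancelled at that time, False if
               still active when observation stopped (right-censored)
    """
    events = Counter(t for t, c in zip(durations, churned) if c)
    exits = Counter(durations)  # everyone leaves the risk set at their duration
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = events.get(t, 0)
        if d:
            # Multiply by the fraction of at-risk subscribers who survive time t
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Five hypothetical subscribers: three churned, two were still active.
curve = kaplan_meier([1, 2, 2, 3, 4], [True, True, False, True, False])
for t, s in curve:
    print(f"S({t}) = {s:.2f}")
```

Censored subscribers still count in the risk set until their last observed time, which is what separates this from a plain churn rate.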
7
Survival Analysis
Cumulative curve of survival
Source: New Evidence, Confirmatory overall survival analysis of CLEOPATRA
8
• Several processing hidden layers
• High complexity structures
• Non-Linear problems
• Sorry guys, but it’s a very old idea (Ivakhnenko, A. G. (1971).
Polynomial theory of complex systems)
• Widely applied in Computer Vision
Deep Learning
Abstraction, processing, non-linearity. Together.
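To make the point about hidden layers and non-linearity concrete, here is a toy sketch: a network with one hidden layer of two units and hand-picked weights that computes XOR, a classic non-linear problem. The weights are chosen by hand for illustration, not learned, and the example is unrelated to the model used in the talk.

```python
def relu(x):
    """Rectified linear unit: the non-linear activation."""
    return max(0.0, x)


def identity(x):
    """No activation: the network stays purely linear."""
    return x


def tiny_net(x1, x2, activation):
    """One hidden layer with two units, then a linear output."""
    h1 = activation(x1 + x2)
    h2 = activation(x1 + x2 - 1.0)
    return h1 - 2.0 * h2


inputs = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]

# With ReLU the hidden layer bends the input space and XOR comes out right.
with_relu = [tiny_net(a, b, relu) for a, b in inputs]

# Without it, the same weights collapse into one linear function of x1 + x2.
linear_only = [tiny_net(a, b, identity) for a, b in inputs]

print(with_relu)
print(linear_only)
```

The same weights give the XOR pattern only when the activation is non-linear, which is why stacking layers without one adds no expressive power.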
9
Deep Learning
Abstraction, processing, non-linearity. Together.
Source: Multimedia Laboratory Deep Learning Identity-Preserving Face Space
10
Movile Architecture
Abstraction Layer
11
What worked?
• Ensemble Learning
• Data integration pipeline (use
Amazon Redshift as a meat grinder)
• Weekly processing
• Statistical Sampling
• Fast experimental approach (No long
meetings, no bureaucracy, quick-win
modeling, etc.)
What did not work?
• Daily processing
• Stand-alone solutions like Weka,
Statistica, Python
• Traditional Machine Learning
approaches like SVM and Logistic
Regression
• H2O.ai distributed processing using R or
Python
Conclusions and Recommendations
Hard lessons from the battlefield
12
facebook.com/movile
@Movile_Latam
linkedin.com/company/movile
pinterest.com/movile/
movile.com
55 11 9 8501 6488
Flávio Clésio - @flavioclesio
