Data Driven Solutions:
A practical overview on Machine
Learning
Startup Weekend
Raul Eulogio
Introduction
● Raul Eulogio
○ Data Analyst at Hospice of Santa Barbara
○ Co-founder: inertia7.com
○ President of Data Science at UCSB
○ Self-taught machine learning enthusiast
Data Science at UCSB and Farmers Insurance
Competition
Farmers Insurance is challenging you to put your data skills to the test. Seize this
opportunity to practically apply data science to tackle a problem in the insurance
field.
The top-performing teams will bring home:
● 1st place: $2000
● 2nd place: $1000
● 3rd place: $500
Additionally, all winning teams will get to present their work to a panel of Farmers
employees. MUST BE UCSB Student and Paid Member of Data Science at
UCSB. Application here
Why Machine Learning*?
● Use data collection to your advantage
○ The authors of Lean Analytics state: “data driven learning is the cornerstone of success in
startups. It’s how you learn what’s working and iterate towards the right product and
market before the money runs out.”
● Data Science
○ Enhancing the interpretation of reality
○ Automating machines to respond to their environments
* I will use Machine Learning and Data Science interchangeably
Data Driven Solutions for a Data Driven
Organization in a Data Driven World
I’m not here ...
● to tell you Data Science “is the sexiest job of the 21st century”
● to tell you that studies show less than 1% of data is being analyzed
● to show you the usual Venn Diagram that is presented at almost every data
science talk
Multifaceted Domain
Machine Learning can be ...
○ Exploratory Analysis
■ Exploring trends in data
■ Creating a narrative with data
○ Unsupervised Learning
■ Exploring Trends and Patterns on a larger scale
■ Find hidden structure in data
○ Supervised Learning
■ Predicting an output based on inputs
■ Regression and Classification
Case Studies
Shown through examples. All data is available online, and all work is open source
on my GitHub repository!
○ Exploratory Analysis - Apple Watch Data
■ Existing data collected by customer/user
● Data Collection (Python) and Data Exploration (R)
○ Unsupervised Learning - Spotify Data
■ Data made available by 3rd party source
● Data Exploration and Data Modeling (Python)
○ Supervised Learning - IBM Customer Churn Data
■ Data collected by organization
● Data Exploration (R) and Data Modeling (Python)
Exploratory Analysis: A case study on Apple Watch
● Sisense states: “You do [EDA] by taking a broad look at patterns, trends,
outliers, unexpected results and so on in your existing data, using
visual and quantitative methods to get a sense of the story this tells.
You’re looking for clues that suggest your logical next steps, questions
or areas of research.”
● The example data was gathered daily by an Apple Watch and covers fitness
tracking and other health-related measurements
Questions to consider
How can we use customers’ daily fitness regimens to identify important trends?
Can we detect and prevent days/weeks where our customers will reduce their
workout regimen?
Is there a correlation between our customers’ workout regimens and their use of our
services?
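One of the questions above, spotting weeks where a customer's activity drops, can be sketched in a few lines of Python. This is a minimal illustration rather than the analysis from the talk: the step counts are made up, and the function names and the 20% threshold are assumptions chosen for the example.

```python
import statistics

# Hypothetical daily step counts for three weeks (made-up numbers,
# not an actual Apple Watch export).
daily_steps = [
    9000, 8500, 9200, 8800, 9100, 7000, 7500,  # week 1
    8900, 8700, 9000, 8600, 9300, 7200, 7400,  # week 2
    5000, 4800, 5200, 4500, 5100, 3000, 3500,  # week 3
]


def weekly_means(values, days_per_week=7):
    """Average the daily values in consecutive 7-day blocks."""
    return [
        statistics.mean(values[i:i + days_per_week])
        for i in range(0, len(values), days_per_week)
    ]


def flag_drops(means, threshold=0.20):
    """Return indices of weeks whose mean fell by more than `threshold`
    (as a fraction) compared with the previous week."""
    return [
        i for i in range(1, len(means))
        if means[i] < means[i - 1] * (1 - threshold)
    ]


means = weekly_means(daily_steps)
print("weekly means:", [round(m) for m in means])
print("weeks with a drop:", flag_drops(means))
```

A simple week-over-week comparison like this is only a starting point for exploration; seasonality, holidays, and missing days would all need attention on real data.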
Unsupervised Learning: A Case Study on Spotify
Music
● Noticing trends and patterns within the data
○ Combines all features; the data is usually unlabelled
● Using the Spotify API to create a recommender based on a distance metric
● Ability to create clusters within our data
○ Recommend other products or use cases based on customer preference
■ Examples include Recommended videos on YouTube, Customers also bought on
Amazon, Daily Mix on Spotify
How does it work?
● Nearest Neighbor Algorithm
● The algorithm builds a feature space from all inputs
● Inputs include:
○ Danceability
○ Loudness
○ Tempo
○ Category
○ And more
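The nearest-neighbor idea above can be sketched as follows. This is a minimal illustration, not the code from the talk: the track names and feature values are made up, the categorical "Category" input is left out, and min-max scaling with Euclidean distance is an assumed choice of distance metric.

```python
import math

# Hypothetical tracks with made-up feature values (not real Spotify data).
TRACKS = {
    "track_a": {"danceability": 0.80, "loudness": -5.0, "tempo": 120.0},
    "track_b": {"danceability": 0.78, "loudness": -5.5, "tempo": 118.0},
    "track_c": {"danceability": 0.30, "loudness": -12.0, "tempo": 70.0},
    "track_d": {"danceability": 0.55, "loudness": -8.0, "tempo": 100.0},
}
FEATURES = ["danceability", "loudness", "tempo"]


def min_max_scale(tracks, features):
    """Rescale each feature to [0, 1] so no single feature dominates the distance."""
    scaled = {name: {} for name in tracks}
    for f in features:
        values = [t[f] for t in tracks.values()]
        lo, hi = min(values), max(values)
        span = (hi - lo) or 1.0  # avoid division by zero for constant features
        for name, t in tracks.items():
            scaled[name][f] = (t[f] - lo) / span
    return scaled


def recommend(seed, tracks, features, k=2):
    """Return the k tracks closest to `seed` by Euclidean distance in feature space."""
    scaled = min_max_scale(tracks, features)

    def dist(a, b):
        return math.sqrt(sum((scaled[a][f] - scaled[b][f]) ** 2 for f in features))

    others = [name for name in tracks if name != seed]
    return sorted(others, key=lambda name: dist(seed, name))[:k]


print(recommend("track_a", TRACKS, FEATURES, k=2))
```

Scaling matters here because tempo is measured in beats per minute while danceability lies between 0 and 1; without rescaling, tempo alone would largely decide which tracks count as neighbors.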
Examples of recommender at work
After some research I found that a lot of these songs were very similar in nature!
I know hip hop best, so I can say many of these were good recommendations. Some interesting picks were Don’t Wanna Know and
Summertime Sadness.
Questions to consider
How can we provide a seamless music experience for users?
Can we understand a user’s musical taste to maximize their daily workout regimen?
How can we effectively utilize 3rd party data to benefit our product?
Supervised Learning: A case study on Churn Rate
● Can we accurately predict when a customer will stop using a
service/business?
● Binary Classification problem using customer information including:
○ Tenure
○ Total Charges
○ Monthly Charges
○ Gender
○ Utilization of Phone Services?
○ And more...
● Models used:
○ Gradient Boosting
○ Logistic Regression
● Things to consider: Data Preprocessing, Data leakage, Class Imbalance
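Two of the "things to consider" on this slide, class imbalance and data leakage, can be illustrated with a short sketch. The labels and charges below are made up, and the variable names are assumptions for the example; it is not the preprocessing used in the talk.

```python
from collections import Counter
import statistics

# Hypothetical labels: 1 = churned, 0 = stayed (made up, imbalanced on purpose).
train_labels = [0, 0, 0, 0, 0, 0, 1, 1]
test_labels = [0, 0, 0, 1]

# Hypothetical monthly charges for the same customers.
train_charges = [20.0, 35.0, 50.0, 45.0, 30.0, 25.0, 90.0, 85.0]
test_charges = [40.0, 22.0, 55.0, 95.0]

# Class imbalance: always predicting the majority class sets the accuracy floor
# that any real model has to beat.
majority_class, _ = Counter(train_labels).most_common(1)[0]
baseline_accuracy = sum(y == majority_class for y in test_labels) / len(test_labels)

# Leakage guard: the scaling statistics come from the training split only
# and are then reused, unchanged, on the test split.
mean = statistics.mean(train_charges)
std = statistics.pstdev(train_charges)
train_scaled = [(x - mean) / std for x in train_charges]
test_scaled = [(x - mean) / std for x in test_charges]

print(f"majority class: {majority_class}, baseline accuracy: {baseline_accuracy:.2f}")
```

Comparing a model's accuracy against this majority-class baseline gives a more honest picture than the raw accuracy figure alone, since an imbalanced data set can make a trivial predictor look respectable.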
Variable Importance gathered by Gradient Boosting
Exploratory Analysis on Churn Data Set (three figure slides)
Final Results on Churn Data Set
● Gradient Boosting: 76% Accuracy (CV)
● Logistic Regression: 77% Accuracy (CV)
● Not high, but we can still gain insight
○ Variable Importance for GB
○ Coefficients for variables for LR
● Customers with Month-to-Month Contracts most likely to Churn
● Neural Networks? Careful of Black Box Models
● Collect more data!
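The "(CV)" accuracies above refer to cross-validation. As a rough sketch of the idea, the example below runs k-fold cross-validation by hand on made-up data, using a deliberately simple one-feature rule in place of Gradient Boosting or Logistic Regression. The data, the function names, and the choice of 4 folds are all assumptions for illustration.

```python
import random

# Hypothetical customers: (has_month_to_month_contract, churned). Made-up data.
DATA = [(1, 1), (1, 1), (1, 0), (0, 0), (0, 0), (0, 1),
        (1, 1), (0, 0), (1, 0), (0, 0), (1, 1), (0, 0)]


def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and split them into k roughly equal folds."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def fit_rule(rows):
    """'Train' a one-feature rule: for each contract type, predict its most common label."""
    rule = {}
    for value in (0, 1):
        labels = [y for x, y in rows if x == value]
        # Fall back to "no churn" if a contract type is absent from the training fold.
        rule[value] = int(sum(labels) * 2 > len(labels)) if labels else 0
    return rule


def cross_val_accuracy(data, k=4):
    """Hold out each fold once, fit on the rest, and average the held-out accuracy."""
    scores = []
    for held_out in k_fold_indices(len(data), k):
        held_out_set = set(held_out)
        train = [row for i, row in enumerate(data) if i not in held_out_set]
        test = [data[i] for i in held_out]
        rule = fit_rule(train)
        correct = sum(rule[x] == y for x, y in test)
        scores.append(correct / len(test))
    return sum(scores) / len(scores)


print(f"mean CV accuracy: {cross_val_accuracy(DATA):.2f}")
```

Because every customer is held out exactly once, the averaged score is a less optimistic estimate than accuracy measured on the same rows the model was fitted to.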
Results
● “All models are wrong but some are useful” - George Box
● ~77% accuracy for both Logistic Regression and Gradient Boosting: not
groundbreaking, but it can still give insight. As a rough rule of thumb, 90% accuracy
is a good start, though the right target depends on the problem
● Key Takeaway: Data Science/Machine Learning is a life cycle, not a one-and-done
procedure.
● Iterations are key; if the model and data didn’t produce the wanted results, collect more
data. Work with key stakeholders to decide what data should be collected and how it
should be collected.
Questions to consider
How can we integrate Customer Reviews into our Machine Learning process?
What other covariates can we consider when creating our models?
Are we collecting the right data?
Which model can give us the most insight into our data without being too
computationally expensive?
Q&A
● If you have any questions or would like to contribute to these projects email
me: raul.eulogio@inertia7.com
● Check out inertia7.com if you want to learn all things Machine Learning and
Data Science
Resources
● Overview of Machine Learning using scikit-learn
● Introduction to Gradient Boosting
● Book Recommender (Inspired Spotify Recommender)
● Github Repo with Source code for presentation
● Logistic Regression with Scikit-learn

Machine Learning - Startup weekend UCSB 2018
