This document provides an introduction to data science, including:
- Why data science has gained popularity due to advances in AI research and commoditized hardware.
- Examples of where data science is applied, such as e-commerce, healthcare, and marketing.
- Definitions of data science, data scientists, and their roles.
- Overviews of machine learning techniques such as supervised learning, unsupervised learning, and deep learning, with examples of their applications.
- How data science can be used by businesses to understand customers, create personalized experiences, and optimize processes.
10. WHAT IS A DATA SCIENTIST
A Data Scientist is someone with deliberate dual personality who can first build a
curious business case defined with a telescopic vision and can then dive deep with
microscopic lens to sift through DATA to reach the goal while defining and
executing all the intermittent tasks.
http://www.datasciencecentral.com/profiles/blogs/are-you-a-data-scientist
20. Some definitions...
Machine Learning
“Gives computers the ability to learn without being explicitly programmed”
- Arthur Samuel, 1959
Artificial Intelligence
“Creation of intelligent machines that work and react like humans”
- John McCarthy, 1956
21. Types of Machine Learning
Supervised Learning
● Data with clearly defined output is given
● Direct feedback is given
● Predicts outcome/future
● Resolves classification and regression problems
Unsupervised Learning
● Machine understands the data (clustering, association rules)
● Evaluation is qualitative or indirect
● Does not predict/find anything specific
Reinforcement Learning
● Intelligent agent that learns how to act in a certain environment, based on maximizing rewards
● Used to optimize goals
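To make the contrast between supervised and unsupervised learning concrete, here is a minimal sketch using only the Python standard library. The customer data, the "low"/"high" labels, and the function names `predict_1nn` and `kmeans` are all hypothetical and chosen for illustration; nearest-neighbour classification and k-means are just two simple representatives of each family, not the only options.

```python
import math

# Hypothetical toy data: (monthly_visits, avg_basket_value) per customer.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
# Known outputs, used only by the supervised example.
labels = ["low", "low", "low", "high", "high", "high"]


def predict_1nn(train_points, train_labels, query):
    """Supervised: return the label of the closest labeled example."""
    nearest = min(range(len(train_points)),
                  key=lambda i: math.dist(train_points[i], query))
    return train_labels[nearest]


def kmeans(data, centroids, iterations=10):
    """Unsupervised: group points around centroids; no labels are used."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in data:
            idx = min(range(len(centroids)),
                      key=lambda i: math.dist(p, centroids[i]))
            clusters[idx].append(p)
        # Move each centroid to the mean of its cluster (keep it if empty).
        centroids = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return clusters, centroids


prediction = predict_1nn(points, labels, (7.5, 8.5))
clusters, centroids = kmeans(points, centroids=[points[0], points[3]])
print("Predicted label:", prediction)
print("Cluster sizes:", [len(c) for c in clusters])
```

The supervised function needs `labels` to make a prediction, while the clustering function only sees `points` and returns groups whose meaning still has to be interpreted by a person.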
27. Machine Learning Quiz
Supervised Learning
● Classification (C)
● Regression (R)
● Recommender Systems (RS)
Unsupervised Learning
● Clustering (CL)
● Association Rules (A)
1. Which products should I offer to a customer?
2. What will sales be next month?
3. Which customers are prone to churn?
4. Which products are commonly bought together (market basket)?
5. Is a transaction fraudulent?
6. How can my customers be segmented for targeting?
7. How can I personalize search results for user context?
8. Which product is this (based on a picture)?
9. What are the main topics of messages from a chatbot?
10. What will a company’s stock price be at the end of the day?
11. Which customers should I offer a product to?
28. Machine Learning Quiz (answers)
Supervised Learning
● Classification (C)
● Regression (R)
● Recommender Systems (RS)
Unsupervised Learning
● Clustering (CL)
● Association Rules (A)
1. Which products should I offer to a customer? (RS)
2. What will sales be next month? (R)
3. Which customers are prone to churn? (C)
4. Which products are commonly bought together (market basket)? (A)
5. Is a transaction fraudulent? (C)
6. How can my customers be segmented for targeting? (CL)
7. How can I personalize search results for user context? (RS)
8. Which product is this (based on a picture)? (C)
9. What are the main topics of messages from a chatbot? (CL)
10. What will a company’s stock price be at the end of the day? (R)
11. Which customers should I offer a product to? (RS)
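The market basket question (association rules) can be illustrated with a short sketch. The baskets below are invented, and `support` and `confidence` are hypothetical helper names; the two measures themselves are the standard ones used to rank association rules.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions: each set is one customer's basket.
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "jam"},
    {"bread", "milk"},
    {"butter", "milk"},
]

# Count how often each item and each (alphabetically ordered) pair appears.
item_counts = Counter(item for basket in baskets for item in basket)
pair_counts = Counter(
    pair for basket in baskets for pair in combinations(sorted(basket), 2)
)


def support(item_a, item_b):
    """Fraction of all baskets that contain both items."""
    pair = tuple(sorted((item_a, item_b)))
    return pair_counts[pair] / len(baskets)


def confidence(antecedent, consequent):
    """Of the baskets containing `antecedent`, the fraction that also contain `consequent`."""
    pair = tuple(sorted((antecedent, consequent)))
    return pair_counts[pair] / item_counts[antecedent]


print("support(bread, butter) =", support("bread", "butter"))
print("confidence(bread -> butter) =", confidence("bread", "butter"))
```

No labels are involved: the rules come from co-occurrence counts alone, which is why this question falls under unsupervised learning in the quiz.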
33. Cognitive & Advanced Analytics
(Diagram labels) UX, Business, Machine Learning, Big Data
Advanced Analytics, Customer Centric, Company Centric, Cognitive
UNDERSTAND YOUR CUSTOMER
CREATE PROACTIVE EXPERIENCES
OPTIMIZE YOUR PROCESSES
36. Cognitive & Adv. Analytics - Quiz (E-commerce)
● Cognitive (C)
● Data Science / Advanced Analytics
○ Descriptive (ADes)
○ Diagnostic (ADia)
○ Predictive (APred)
○ Prescriptive (APres)
1. How many products were sold last month? (ADes)
2. Which products were commonly bought together? (ADia)
3. How can customers be segmented based on purchases? (ADia)
4. How many products will I sell next month for a customer segment? (APred)
5. Which products with cross-sell opportunity should I offer to each customer segment? (APres)
6. During a user journey, which products could I recommend automatically, based on the user's historical behaviour? (C)
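As a small illustration of the difference between descriptive and predictive analytics, the sketch below answers "how many were sold last month?" by lookup and "how many will I sell next month?" by extrapolating a straight-line trend. The sales figures and both function names are hypothetical, and a linear trend is only one simple assumption; a real forecast would usually need a richer model.

```python
# Hypothetical monthly unit sales, oldest first; the last entry is "last month".
monthly_sales = [100, 110, 120, 130]


def describe_last_month(sales):
    """Descriptive: report what already happened."""
    return sales[-1]


def forecast_next_month(sales):
    """Predictive: fit a least-squares line and extrapolate one month ahead."""
    n = len(sales)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(sales) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, sales))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return intercept + slope * n  # x = n is the next, unseen month


last_month = describe_last_month(monthly_sales)
next_month = forecast_next_month(monthly_sales)
print("Sold last month:", last_month)
print("Forecast for next month:", next_month)
```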
42. Cognitive Lifecycle Overview
1. Identify Opportunity: Objectives & pain points to build the use case
2. Data Ingestion and Cleansing: Implements the ETL pipeline from multiple data sources
3. Data Exploration: Data availability and analysis to support use case
4. Modeling: Select algorithms, features, train and evaluate models
5. Offline evaluation: Value demonstration based on results of best model
6. Experiments Design: Design of the experiments to evaluate with real users
7. Development & Deployment: Development of related systems and deployment of systems and model
8. Monitor Model: Evaluate results and retrain or rebuild when performance degrades
9. Online evaluation: A/B testing results analysis
10. Human in the Loop: Acquire new data based on feedback loops from users, reviewers, etc.
Roles shown on the slide: Data Scientist / Business, Data Engineer, Data Scientist, ML Engineer
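The online evaluation step (A/B testing results analysis) could be sketched as a two-proportion z-test. This is one common way to compare conversion rates, not necessarily the method the authors use; the visitor and conversion counts are invented and `two_proportion_z_test` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare conversion rates of variant A (control) and variant B (new model).

    Returns the z statistic and the two-sided p-value under the null
    hypothesis that both variants convert at the same rate.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical experiment: 1000 users per variant.
z, p_value = two_proportion_z_test(conv_a=100, n_a=1000, conv_b=130, n_b=1000)
print(f"z = {z:.3f}, p-value = {p_value:.4f}")
```

A small p-value suggests that the difference between variants is unlikely to be due to chance alone; the significance threshold should be fixed during experiment design (step 6), before looking at the results.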
47. DATA SCIENCE COURSES
• Fundamentos de AI e Machine Learning (Udacity)
• Data Science at Scale (Univ. of Washington)
• Data Science specialization (Johns Hopkins)
• Machine Learning (Stanford)
• Statistical Learning (Stanford)