Data science - An Introduction
1. Data Science – An Introduction
Ravishankar Rajagopalan, Ph.D.
Founder, CourseBricks
2. The Buckeye Background
2000: B.E. Mechanical Engineering (India)
Machine Learning methods for Industrial Layout Design
2004: M.S. Optimization (University of Alabama)
Optimization for Industrial Layout Design
2009: Ph.D. in Applied Statistics and Optimization (The Ohio State University)
Applied Statistics and Optimization for Industrial Applications (Welding, Casting, etc.)
3. Data Science in the Industry
Overall 11+ years of industry experience working with the business to understand the pain points and solve them using AI and Machine Learning
Years on the timeline: 2009, 2010, 2013, 2017, 2019, 2020
Mu Sigma: Market Research, Product Development
GE Energy: Text Analytics
[24]7.ai: ML Pipelines, Intent Prediction, Speech/Chat NLP
CourseBricks: Computer Vision and Text Analytics
United Health Group (Optum): Deep Learning on Healthcare Data
Petronas: Data Science for Digital Procurement
4. CourseBricks Core Areas
CourseBricks Labs focuses on next-generation products/services in Data Science/AI
1. Core Areas
Next Gen Research in Computer Vision and Text Analytics
Platform for rapid development of Computer Vision models
High-end capabilities with Deep Learning for Image and Video Analytics
2. Clients
Clients served in:
Healthcare
Technology/Startups
Media/Television
Manufacturing
Oil & Gas
5. Agenda
1. Data Science - Introduction
2. Data Science - Skills Required, and Project Lifecycle
3. Unusual Data Science Applications in Real World (2 use cases)
4. Data Science Team Aspects and Soft Skills
5. Data Science Interviews and Preparation
6. Data Science Resources
6. Data Science in Day-to-Day Life
Amazon, Netflix, Flipkart (Recommender Systems)
Uber/Ola Routing (Optimization)
Alexa/Siri (Speech Recognition)
Digital Advertising
7. What is Data Science?
Data Science is the Art and Science of using Algorithms on Data to generate actionable insights, make predictions and prescribe actions
Data, Algorithms, Technology
Descriptive: Generate Insights
Predictive: Predict Future
Prescriptive: Prescribe Optimal Actions
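A toy illustration of the three kinds of analytics, written as a hedged sketch rather than a real workflow. The monthly demand figures, the candidate order quantities and the holding/shortage costs are all made-up values chosen only to keep the arithmetic easy to follow.

```python
from statistics import mean

# Hypothetical monthly demand for one product (units sold).
demand = [100, 110, 120, 130, 140, 150]

# Descriptive: summarize what has already happened.
average_demand = mean(demand)

# Predictive: fit a least-squares trend line and extrapolate one month ahead.
months = list(range(len(demand)))
m_bar, d_bar = mean(months), mean(demand)
slope = (
    sum((m - m_bar) * (d - d_bar) for m, d in zip(months, demand))
    / sum((m - m_bar) ** 2 for m in months)
)
intercept = d_bar - slope * m_bar
forecast = intercept + slope * len(demand)


# Prescriptive: pick the action (order quantity) with the lowest expected cost.
def cost(order_qty, expected_demand, holding=2.0, shortage=5.0):
    """Assumed cost model: pay for leftover stock and for unmet demand."""
    leftover = max(order_qty - expected_demand, 0)
    unmet = max(expected_demand - order_qty, 0)
    return holding * leftover + shortage * unmet


candidates = range(100, 201, 10)  # hypothetical order sizes
best_order = min(candidates, key=lambda q: cost(q, forecast))

print("Descriptive  - average demand:", average_demand)
print("Predictive   - next month forecast:", forecast)
print("Prescriptive - recommended order:", best_order)
```

Each step feeds the next: the summary describes the past, the trend line predicts the next value, and the cost function turns that prediction into a recommended action.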
8. Where is Data Science Applied?
Banking and Finance: Risk Models, Fraud Detection, Algorithmic Trading
Healthcare: Medical Imaging, Drug Discovery, Disease Prevention
eCommerce: Personalization, Dynamic Pricing, Recommender Systems
Manufacturing: IoT, Predictive Maintenance, Demand Forecasting, Inventory Management, Warranty Analysis
Retail: Market Basket Analysis, Price Optimization, Inventory Management, Store Location Optimization
Marketing: Customer Segmentation, Market Mix Models, Campaign Optimization, Lead Scoring
Travel: Dynamic Pricing, Demand Forecasting, Personalized Recommendations, Trip Planning
Advertising: Bid Pricing, Customer Segmentation, Attribution, Fraud Detection
9. Agenda
2. Data Science - Project Lifecycle and Skills Required
11. Core Tech and Science Skills for Data Scientists
SCIENCE
Fundamentals: Probability/Statistics, Machine Learning
Unstructured Data: Deep Learning, Natural Language Processing, Image Processing, Video Analytics, Speech Recognition
Prescriptive: Optimization, Simulation
TECHNOLOGY
Programming: R/Python/Java/Scala/Julia
Big Data: Databases (SQL), Hadoop, Spark
Visualization: Tableau, R/Python
12. Data Science Lifecycle
1. Problem Definition: Work with the business to identify the pain point and an appropriate solution
2. Data Extraction: Extract and curate data from various sources
3. Exploratory Data Analysis: Explore and understand the extracted data (visualization driven)
4. Feature Engineering: Create enriched features from existing columns
5. Model Building: Build Machine Learning/Deep Learning models
6. Model Validation: Validate the model with real-world data
7. Model Deployment: Deploy the model for integration with a product
8. Model Performance Monitoring: Monitor the model's performance to know when to retrain it
13. Agenda
3. Unusual Data Science Applications in Real World
15. What was the Business Pain Point?
Millions of $$$ in Spare Parts Inventory
✓ Locked up capital
✓ Loss or Damage
✓ Storage Costs
16. When do we need
spare parts?
Only when the machine
breaks down and the part
needs to be replaced
17. Use Data Science to
predict when the parts
would fail
Order parts as per the
expected failure rate
18. How was Data Science Used?
✓ Reliability Models – Predict when a spare part would fail
✓ Monte Carlo Simulation – Simulate from the failure distribution and calculate the expected number of failures
✓ Optimization – Recommend optimal inventory levels
Millions of dollars of reduction in inventory!!!
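The three steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the model used in the project: the Weibull failure distribution, its shape and scale, the fleet size, the planning horizon and the 95% service level are all hypothetical, and a simple service-level percentile stands in for the optimization step.

```python
import math
import random


def simulate_failures(n_machines, horizon_months, scale, shape, rng):
    """Count part failures across the fleet for one simulated horizon.

    Assumes each failed part is replaced at once by a new part whose
    lifetime is drawn from the same Weibull distribution.
    """
    failures = 0
    for _ in range(n_machines):
        t = rng.weibullvariate(scale, shape)  # time of the first failure
        while t <= horizon_months:
            failures += 1
            t += rng.weibullvariate(scale, shape)  # life of the replacement
    return failures


def recommend_stock(n_machines, horizon_months, scale, shape,
                    n_runs=5000, service_level=0.95, seed=42):
    """Return (expected failures, stock level meeting the service level)."""
    rng = random.Random(seed)
    counts = sorted(
        simulate_failures(n_machines, horizon_months, scale, shape, rng)
        for _ in range(n_runs)
    )
    expected = sum(counts) / n_runs
    # Stock that covers demand in roughly `service_level` of simulated runs.
    idx = min(n_runs - 1, max(0, math.ceil(service_level * n_runs) - 1))
    return expected, counts[idx]


# Hypothetical fleet: 20 machines, 12-month horizon, Weibull(scale=30, shape=1.5).
expected, stock = recommend_stock(20, 12, scale=30.0, shape=1.5)
print(f"Expected failures over the horizon: {expected:.1f}")
print(f"Parts to stock for a 95% service level: {stock}")
```

Stocking to a percentile of simulated demand, instead of a fixed large buffer, is what frees up the locked capital: the inventory level follows the expected failure rate.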
20. What was the Business Pain Point?
Reaching a customer service rep through Interactive Voice Response (IVR) is painful
✓ Long time to reach the correct customer service rep (2-3 mins)
✓ Poor Customer Satisfaction Ratings
21. Can Speech Recognition replace IVR?
Customer states their request over the phone: "Want to increase my internet bandwidth"
Predict the customer’s intent in Real Time
☓ Mobile
☓ Digital TV
✓ Broadband
☓ Bundle Offers
Mobile, Broadband, Digital TV, Offers
22. How was Data Science Used?
Speech → Speech Recognition → Text → Natural Language Processing/Text Mining/Machine Learning → Intent
Access Time improved from 2-3 minutes to 1-2 seconds
Improved Ratings for Customer Service
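The text-to-intent step can be illustrated with a small classifier. This is a minimal sketch and not the production system: it assumes the speech has already been transcribed to text by an upstream recognizer, the training utterances are invented for illustration, and a hand-rolled Naive Bayes model with add-one smoothing stands in for whatever NLP/ML models were actually used.

```python
import math
from collections import Counter

# Hypothetical transcribed utterances, grouped by intent.
TRAINING = {
    "Mobile": ["my mobile phone has no signal",
               "change my mobile plan",
               "sim card is not working"],
    "Broadband": ["my internet is slow",
                  "increase my broadband speed",
                  "upgrade internet bandwidth"],
    "Digital TV": ["tv channels are missing",
                   "set top box is not working",
                   "add sports channels to my tv"],
    "Offers": ["any bundle offers available",
               "tell me about discounts",
               "new offers on bundle packages"],
}


def tokenize(text):
    return text.lower().split()


def train(training):
    """Count words per intent and collect the overall vocabulary."""
    word_counts = {intent: Counter() for intent in training}
    doc_counts = {intent: len(utts) for intent, utts in training.items()}
    vocab = set()
    for intent, utterances in training.items():
        for utterance in utterances:
            tokens = tokenize(utterance)
            word_counts[intent].update(tokens)
            vocab.update(tokens)
    return word_counts, doc_counts, vocab


def predict(text, model):
    """Return the intent with the highest Naive Bayes log-probability."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    tokens = [tok for tok in tokenize(text) if tok in vocab]  # skip unseen words
    scores = {}
    for intent, counts in word_counts.items():
        total_words = sum(counts.values())
        score = math.log(doc_counts[intent] / total_docs)  # prior
        for tok in tokens:
            # Add-one (Laplace) smoothing avoids zero probabilities.
            score += math.log((counts[tok] + 1) / (total_words + len(vocab)))
        scores[intent] = score
    return max(scores, key=scores.get)


model = train(TRAINING)
print(predict("Want to increase my internet bandwidth", model))  # prints Broadband
```

Because the prediction is a single lookup over word counts, it returns in well under a second, which is the property that lets intent prediction replace a multi-level IVR menu.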
23. Agenda
4. Data Science - Soft Skills and Career Progression
24. Data Science is Teamwork
Projects involving Data Science typically involve multidisciplinary teams drawn from various stakeholders:
Data Science
Data Engineering
Business Stakeholders
Product Management
Architecture
Project Management
QA
Engineering (Product Development)
25. Data Science requires several soft skills to navigate projects
01 Science
02 Technology
03 Business/Domain Knowledge: Understand data from a business context
04 Problem Solving: Ability to think on your feet and come up with solutions
05 Critical Thinking: Ability to ask the right questions
06 Story Telling: Visualization and Presentation
07 Team Work: Ability to work as part of cross-functional teams
08 Emotional Intelligence: Ability to handle stressful situations
09 Stakeholder Management: Influencing skills when dealing with upper management
10 Networking: Networking with professionals inside/outside the organization
26. Data Science Career Ladder
Technical skills are sufficient to get started as a Data Scientist, but soft skills take precedence as one grows towards an Expert Data Scientist
Beginner Data Scientist: Ability to use Data Science tools to extract, explore and build basic ML models with supervision
Intermediate Data Scientist: Ability to use ML algorithms to solve specific business problems (without supervision)
Advanced Data Scientist: Ability to choose appropriate models, modify them as required to suit the business problem and defend the choice
Expert Data Scientist: Ability to invent new algorithms as required and be a thought leader in driving company-level initiatives
27. Agenda
5. Data Science Interviews and Preparation
28. Data Science Interviews
Data Science interviews focus on six major areas:
• Problem Solving: Puzzles, Business Cases and Open Ended Problems
• Deep Learning
• Machine Learning
• Data Structures and Algorithms
• SQL
• Probability & Statistics
29. Data Science Interviews – Do’s and Don’ts
Do’s
1. Company Research: Research the company, the team and their work in Data Science
2. Interviewer Profile: Get to know the interviewer's profile in advance (LinkedIn)
3. Ask Questions: Express interest in the position by asking relevant questions
Don’ts
1. Failure to Focus on Fundamentals: Learning advanced topics without learning the fundamentals
2. Giving up on a problem: Several problems are open ended and have multiple answers
3. Too focused on Packages (Python): Not knowing what goes behind the commands
30. Agenda
6. Data Science Resources
31. How do you learn and Practice Data Science?
Learn
✓ Coursera
✓ Udacity
✓ edX
✓ NPTEL
✓ DataCamp
✓ Khan Academy
✓ Udemy
✓ Pluralsight
Practice
✓ Kaggle
✓ DrivenData
✓ CrowdAI
✓ Tianchi
✓ Analytics Vidhya
✓ Data Science Society
Apply
✓ Hackathons
✓ Internships
✓ Contribution to Open Source
32. Practice Data Structures/Algorithms/SQL?
Programming/Algorithms
✓ HackerRank
✓ LeetCode
✓ HackerEarth
✓ TopCoder
✓ CoderByte
✓ HackerTrail
✓ CodeChef
✓ InterviewBit
✓ TestDome
SQL
• Learn
✓ Mode Analytics
✓ Codecademy
✓ LearnSQL.com
✓ SQLZoo
✓ SQLBolt
• Practice
✓ HackerRank
✓ SQL Fiddle
✓ LearnSQL.com