This document outlines an agenda for a data science training presentation. The agenda includes sections on why data science, what data science is, who a data scientist is, what they do, how to solve problems in data science, data science tools, and a demo. Key points are that data science uses tools, algorithms and machine learning to discover patterns in raw data and gain insights. It involves tasks like processing, cleaning, mining and modeling data, as well as communicating results. The problem solving process involves discovery, preparation, planning, building, operationalizing and communicating models.
Data Science For Beginners | Who Is A Data Scientist? | Data Science Tutorial Using R | Edureka
Agenda
Why Data Science?
What is Data Science?
Who is a Data Scientist?
What does a Data Scientist do?
How to solve a problem in Data Science?
Data Science Tools
Demo
www.edureka.co/data-science | Data Science Certification Course using R
Why Data Science?
You can make better decisions, reduce your production costs by finding more efficient ways of working, and give your customers what they actually want!
➢ Cost Reduction
➢ Faster & Better Decision Making
➢ Improved Services and Products
➢ Risk Detection
Data Science can help prevent fraudulent transactions by using advanced Machine Learning algorithms, and so avoid large monetary losses.
What is Data Science?
Data Science is a blend of various tools, algorithms, and machine learning principles with the goal of discovering hidden patterns in raw data.
[Figure: "DATA SCIENCE" surrounded by the terms Analysis, Structure, Algorithm, Process, Programming, Insight]
It is an inter-disciplinary field deploying scientific methods, processes and systems to gain insight from data in various forms.
Tell us something we don’t know already.
[Figure labels: Statistics, Code, Business]
How is this different from what statisticians have been doing for years?
[Figure: Business Analyst compared with Data Scientist across Business Administration, Exploratory Data Analysis, Machine Learning & Advanced Algorithms, and Data Product Engineering]
What does a Data Scientist do?
➢ Processing & Cleansing Data
➢ Data Mining
➢ Building Prediction Models
➢ Extending Data
➢ Optimizing and building classifiers using Machine Learning
How to solve a problem in Data Science?
1. Discovery
2. Data Preparation
3. Model Planning
4. Model Building
5. Operationalize
6. Communicating Results
Phase 1: Discovery
➢ Discovery involves acquiring data from all identified internal and external sources that can help with a business solution.
➢ You assess whether you have the required resources in terms of people, technology, time and data to support the project.
Phase 2: Data Preparation
➢ In this phase, you need an analytical sandbox in which you can perform analytics for the entire duration of the project.
➢ ETLT means Extract, Transform, Load and Transform.
[Figure: what a sandbox is supposed to look like. Steps shown: Preparing the Analytics Sandbox, Performing ETLT, Data Conditioning, Survey & Visualize]
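The ETLT sequence can be sketched in a few lines. This is only an illustration under stated assumptions: the `RAW_CSV` data, the `sales` table and all function names are hypothetical, and an in-memory SQLite database stands in for the analytics sandbox. The course itself uses R; Python is used here purely for the sketch.

```python
import csv
import io
import sqlite3

# Hypothetical raw export; in practice this would come from a source system.
RAW_CSV = """customer_id,amount,country
1, 120.50 ,in
2,,us
3, 87.00 ,IN
"""


def extract(text):
    """Extract: read the raw rows as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """First transform: trim whitespace, normalise case, drop rows with no amount."""
    cleaned = []
    for row in rows:
        amount = row["amount"].strip()
        if not amount:
            continue  # conditioning step: skip incomplete records
        cleaned.append(
            (int(row["customer_id"]), float(amount), row["country"].strip().upper())
        )
    return cleaned


def load(conn, rows):
    """Load: write the cleaned rows into the sandbox database."""
    conn.execute("CREATE TABLE sales (customer_id INTEGER, amount REAL, country TEXT)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()


def transform_in_sandbox(conn):
    """Second transform: aggregate inside the sandbox, after loading."""
    cursor = conn.execute(
        "SELECT country, SUM(amount) FROM sales GROUP BY country ORDER BY country"
    )
    return dict(cursor.fetchall())


conn = sqlite3.connect(":memory:")  # stand-in for the analytics sandbox
load(conn, transform(extract(RAW_CSV)))
totals = transform_in_sandbox(conn)
print(totals)
```

The second transform runs inside the sandbox, which is what distinguishes ETLT from plain ETL.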
Phase 3: Model Planning
➢ You apply Exploratory Data Analysis (EDA) using various statistical formulas and visualization tools.
Common Tools for Model Planning: R, SAS/ACCESS, SQL Analysis Services
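As a rough illustration of what a first EDA pass can look like, the sketch below computes summary statistics and a crude text histogram. The `order_values` sample and both function names are made up for this example, and Python's standard library stands in for the tools listed on the slide.

```python
import statistics

# Hypothetical sample: order values for one customer segment.
order_values = [12.0, 15.0, 14.0, 10.0, 95.0, 13.0, 16.0]


def summarize(values):
    """Return basic descriptive statistics for a list of numbers."""
    ordered = sorted(values)
    return {
        "count": len(ordered),
        "min": ordered[0],
        "max": ordered[-1],
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
        "stdev": statistics.stdev(ordered),  # sample standard deviation
    }


def text_histogram(values, bin_width):
    """Count values per bin of the given integer width and draw one row per bin."""
    bins = {}
    for value in values:
        start = int(value // bin_width) * bin_width
        bins[start] = bins.get(start, 0) + 1
    return [
        f"{start:>3}-{start + bin_width - 1:<3} {'#' * count}"
        for start, count in sorted(bins.items())
    ]


summary = summarize(order_values)
print(summary)
for line in text_histogram(order_values, 10):
    print(line)
```

The gap between the mean and the median here hints at an outlier, which is the kind of observation EDA is meant to surface before a model is chosen.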
Phase 4: Model Building
➢ In this phase, you develop datasets for training and testing purposes.
Common Tools for Model Building: SAS Miner, WEKA, SPSS, MATLAB, Alpine Miner, Statistica
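One common way to develop training and testing datasets is a random hold-out split. The sketch below is an assumption about how that might be done, not the course's own method; the `train_test_split` helper, the 25% test fraction and the toy `records` are all hypothetical.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=42):
    """Shuffle a copy of the rows and hold out a fraction of them for testing."""
    if not 0 < test_fraction < 1:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Toy records: (feature, label) pairs.
records = [(i, i % 2) for i in range(20)]
train, test = train_test_split(records)
print(len(train), len(test))
```

The model is fitted on `train` only, and `test` is kept aside to estimate how well it generalises.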
Phase 5: Operationalize
➢ In this phase, you deliver final reports, briefings, code and technical documents.
➢ In addition, a pilot project is sometimes implemented in a real-time production environment.
➢ This gives you a clear picture of the performance and other related constraints on a small scale before full deployment.
Phase 6: Communicate
➢ You do the following things in this phase:
1. Identify all the key findings
2. Communicate them to the stakeholders
3. Look for performance constraints, if any
4. Determine whether the results of the project are a success or a failure
How to Choose an Algorithm in Data Science?
Is it A or B? → Classification Algorithm
Is this weird? → Anomaly Detection Algorithm
How much / How many? → Regression Algorithm
How is this organised? → Clustering Algorithm
What should I do next? → Reinforcement Learning
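To make one row of this mapping concrete, here is a minimal sketch of the "Is this weird?" case. It uses a simple z-score rule, which is only one of many anomaly detection techniques and is chosen here for brevity; the `find_anomalies` helper, the threshold of 2.0 and the `readings` data are assumptions for illustration.

```python
import statistics


def find_anomalies(values, threshold=2.0):
    """Return the values lying more than `threshold` standard deviations from the mean."""
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)  # population standard deviation
    if stdev == 0:
        return []  # all values identical, nothing stands out
    return [v for v in values if abs(v - mean) / stdev > threshold]


# Hypothetical sensor readings with one obvious outlier.
readings = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 25.0, 10.1]
anomalies = find_anomalies(readings)
print(anomalies)
```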
What is Machine Learning?
It is a type of Artificial Intelligence that makes computers capable of learning on their own, i.e. without being explicitly programmed. With machine learning, a model adjusts its own parameters from data when it comes across a new situation, instead of having new rules written for it by hand.
Categories of Algorithm
1. Supervised Learning
Supervised Learning is a type of machine learning algorithm that uses a known dataset to make predictions.
2. Unsupervised Learning
Unsupervised Learning is a type of machine learning algorithm that uses input datasets without labelled responses to draw inferences.
3. Reinforcement Learning
Reinforcement Learning is a type of algorithm inspired by behaviourist psychology, concerned with taking actions to maximise reward.
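The difference between the first two categories can be sketched with two small functions. This is an illustrative sketch, not the course's demo: a nearest-neighbour rule stands in for supervised learning and a two-cluster k-means on one-dimensional data stands in for unsupervised learning. All names and data are hypothetical, and reinforcement learning is not covered here.

```python
def nearest_neighbour_predict(labelled, point):
    """Supervised: predict the label of the closest labelled example."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    _, label = min(labelled, key=lambda example: squared_distance(example[0], point))
    return label


def two_means_1d(values, iterations=10):
    """Unsupervised: split unlabelled 1-D values into two clusters (k-means, k=2)."""
    low_centre, high_centre = min(values), max(values)
    if low_centre == high_centre:
        return sorted(values), []  # nothing to separate
    for _ in range(iterations):
        # Assign each value to the nearer centre, then move the centres.
        low = [v for v in values if abs(v - low_centre) <= abs(v - high_centre)]
        high = [v for v in values if abs(v - low_centre) > abs(v - high_centre)]
        low_centre = sum(low) / len(low)
        high_centre = sum(high) / len(high)
    return sorted(low), sorted(high)


# Known dataset: features paired with labelled responses.
labelled = [
    ((1.0, 1.0), "low"),
    ((1.5, 2.0), "low"),
    ((8.0, 9.0), "high"),
    ((9.0, 8.5), "high"),
]
print(nearest_neighbour_predict(labelled, (2.0, 1.5)))

# Input data without labelled responses.
unlabelled = [1.0, 1.2, 0.8, 9.0, 9.5, 8.7]
print(two_means_1d(unlabelled))
```

The supervised function needs the labels to answer; the unsupervised one finds the grouping from the values alone.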