Data science has become an essential business tool. With access to
incredible amounts of data—thanks to advanced computing and the
“Internet of things”—companies are now able to measure every aspect
of their operations in granular detail.
Introduction
There are no shortcuts for data exploration. If you believe that
machine learning can sail you away from every data storm, trust me,
it won’t. At some point, you’ll realize that you are struggling to
improve your model’s accuracy. In such situations, data exploration
techniques will come to your rescue.
Steps of Data Exploration and
Preparation
Below are the steps involved in understanding, cleaning, and preparing
your data for building a predictive model:
Variable Identification
Univariate Analysis
Bi-variate Analysis
Missing values treatment
Outlier treatment
Variable transformation
Variable creation
Variable Identification
First, identify the predictor (input) and target (output) variables.
Next, identify the data type and category of each variable.
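
As a rough illustration of this step, the sketch below labels each column of a small in-memory dataset with its role (predictor or target) and a coarse category (continuous or categorical). The column names, the sample values, and the helper identify_variables are hypothetical and chosen only for the example. Treating every numeric column as continuous is a simplifying assumption, since numeric codes can also stand for categories.

```python
# Hypothetical sample: each dict is one record; names and values are made up.
rows = [
    {"age": 34, "city": "Pune", "income": 52000.0, "defaulted": "no"},
    {"age": 45, "city": "Delhi", "income": 61000.0, "defaulted": "yes"},
    {"age": 29, "city": "Pune", "income": 48000.0, "defaulted": "no"},
]
TARGET = "defaulted"  # assumption: the column we want to predict


def identify_variables(rows, target):
    """Label each column with its role, a coarse category, and its Python type."""
    summary = {}
    for name in rows[0]:
        values = [row[name] for row in rows]
        # bool is a subclass of int, so exclude it from the numeric check.
        is_numeric = all(
            isinstance(v, (int, float)) and not isinstance(v, bool)
            for v in values
        )
        summary[name] = {
            "role": "target" if name == target else "predictor",
            "category": "continuous" if is_numeric else "categorical",
            "dtype": type(values[0]).__name__,
        }
    return summary


variable_summary = identify_variables(rows, TARGET)
for name, info in variable_summary.items():
    print(name, info)
```

In practice, a column's category often has to be confirmed by someone who knows the data.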
Univariate Analysis
At this stage, we explore variables one by one. The method used for
univariate analysis depends on whether the variable is categorical
or continuous.
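
A minimal sketch of that split, using only the Python standard library: a continuous variable is summarised by its centre and spread, and a categorical one by the share of each level. The sample values and the two helper names are assumptions made for illustration.

```python
from collections import Counter
from statistics import mean, median, stdev


def describe_continuous(values):
    """Summarise a continuous variable by its range, centre, and spread."""
    return {
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
        "median": median(values),
        "stdev": stdev(values),  # sample standard deviation
    }


def describe_categorical(values):
    """Summarise a categorical variable by the share of each level."""
    counts = Counter(values)
    total = len(values)
    return {level: count / total for level, count in counts.items()}


# Hypothetical sample values.
incomes = [48000, 52000, 61000, 55000, 49000]
cities = ["Pune", "Delhi", "Pune", "Mumbai", "Pune"]

income_summary = describe_continuous(incomes)
city_shares = describe_categorical(cities)
print(income_summary)
print(city_shares)
```

Histograms and box plots for continuous variables, and bar charts for categorical ones, are common visual counterparts to these summaries.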
Bi-variate Analysis
Bi-variate analysis examines the relationship between two variables.
Here, we look for association or dissociation between variables at a
pre-defined significance level.
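
For two continuous variables, one possible way to do this is to compute the Pearson correlation and compare a p-value against the chosen significance level. The sketch below estimates that p-value with a permutation test. This choice is an assumption made so the example needs only the standard library, not a method named in the slides. The variable names, the data, and the 0.05 threshold are hypothetical.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def permutation_p_value(xs, ys, n_permutations=2000, seed=0):
    """Share of random shuffles of ys whose |r| is at least the observed |r|."""
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if abs(pearson_r(xs, shuffled)) >= observed:
            extreme += 1
    # Add one to both counts so the estimate is never exactly zero.
    return (extreme + 1) / (n_permutations + 1)


# Hypothetical data: advertising spend and sales for eight periods.
ad_spend = [1, 2, 3, 4, 5, 6, 7, 8]
sales = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2, 13.9, 16.1]

ALPHA = 0.05  # assumed pre-defined significance level
r = pearson_r(ad_spend, sales)
p_value = permutation_p_value(ad_spend, sales)
print(f"r = {r:.3f}, p = {p_value:.4f}, associated: {p_value < ALPHA}")
```

Pairs that involve a categorical variable call for a different statistic, such as a chi-square test on a two-way table of counts.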
Missing Value Treatment
Missing data in the training data set can reduce the power or fit of
a model, or can lead to a biased model, because the behavior of a
variable and its relationship with other variables have not been
analysed correctly. This can lead to wrong predictions or
classifications.
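
A small sketch of how missing values might be measured and then filled: it reports the share of missing entries in a column and replaces them with the mean of the observed values. Mean imputation is used here only as one simple, common option, which is an assumption and not a treatment prescribed in the slides. The sample column and the helper names are hypothetical, and missing entries are assumed to be encoded as None.

```python
from statistics import mean


def missing_share(values):
    """Fraction of entries that are missing (encoded here as None)."""
    return sum(v is None for v in values) / len(values)


def impute_mean(values):
    """Replace each missing entry with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


# Hypothetical column with two missing entries.
ages = [34, None, 29, 45, None, 38]

share = missing_share(ages)
filled = impute_mean(ages)
print(f"missing: {share:.0%}")
print(filled)
```

Filling with the mean keeps every row but shrinks the variable's variance, so it is worth comparing against dropping the affected rows.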
We have looked at why treating missing values in a dataset matters.
Now, let’s identify why these missing values occur. They may arise at
two stages:
Data Extraction
Data Collection
Outlier treatment
Outlier is a term commonly used by analysts and data scientists.
Outliers need close attention; otherwise they can result in wildly
wrong estimates. Outliers can be of two types: univariate and
multivariate.
Outliers can drastically change the results of the data analysis and
statistical modeling.
They increase the error variance and reduce the power of statistical
tests.
If the outliers are non-randomly distributed, they can decrease
normality.
They can bias or influence estimates that may be of substantive
interest.
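
To make these effects concrete, the sketch below flags univariate outliers with the common 1.5 × IQR rule and compares the mean and standard deviation with and without them. The rule, the multiplier, and the sample values are assumptions chosen for illustration. The slides do not name a specific detection method, and multivariate outliers are not covered here.

```python
from statistics import mean, quantiles, stdev


def iqr_outliers(values, k=1.5):
    """Split values into (kept, outliers) using the k * IQR fence rule."""
    q1, _, q3 = quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    kept = [v for v in values if lower <= v <= upper]
    outliers = [v for v in values if v < lower or v > upper]
    return kept, outliers


# Hypothetical measurements with one extreme value.
values = [10, 12, 11, 13, 12, 11, 10, 95]

kept, outliers = iqr_outliers(values)
print("outliers:", outliers)
print(f"mean with / without:  {mean(values):.2f} / {mean(kept):.2f}")
print(f"stdev with / without: {stdev(values):.2f} / {stdev(kept):.2f}")
```

A single extreme value is enough to pull the mean and inflate the spread, which is why flagged points deserve a closer look before they are kept, capped, or removed.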
Working of Data Analysis
A working knowledge of data
science can help leaders turn
analytics into genuine insight. It
can also save them from making
decisions based on faulty
assumptions. “When analytics
goes bad,”
How can leaders learn to distinguish
between good and bad analytics?
It all starts with understanding the data-generation process. You
cannot judge the quality of the analytics if you don’t have a very
clear idea of where the data came from.
Guide to data analytics
