Guide to data analytics


Debashish Jana

A Leader’s Guide to Data Analytics.


Data science has become an essential business tool. With access to
incredible amounts of data—thanks to advanced computing and the
“Internet of things”—companies are now able to measure every aspect
of their operations in granular detail.

Introduction
There are no shortcuts in data exploration. If you believe that machine
learning alone can carry you through every data storm, trust me, it
won’t. Sooner or later you will find yourself struggling to improve your
model’s accuracy. In that situation, data exploration techniques will
come to your rescue.

Steps of Data Exploration and Preparation
The steps below help you understand, clean and prepare your data for
building a predictive model:
Variable identification
Univariate analysis
Bi-variate analysis
Missing value treatment
Outlier treatment
Variable transformation
Variable creation

Variable Identification
First, identify the predictor (input) and target (output) variables.
Next, identify the data type and category of each variable.
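As a rough illustration of this step, the sketch below labels each column of a small dataset with a role (predictor or target) and a kind (continuous or categorical). The dataset, the column names, the choice of `churned` as the target and the helper functions are all hypothetical; the rule "numeric means continuous" is a simplifying assumption, not a method stated in the slides.

```python
# Hypothetical toy dataset: each dict is one record; column names are illustrative.
rows = [
    {"age": 34, "income": 52000.0, "city": "Pune", "churned": "no"},
    {"age": 45, "income": 61000.5, "city": "Delhi", "churned": "yes"},
    {"age": 29, "income": 48000.0, "city": "Pune", "churned": "no"},
]

TARGET = "churned"  # assumption: the column we want to predict


def variable_kind(values):
    """Label a column 'continuous' if every value is numeric, else 'categorical'."""
    numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
    )
    return "continuous" if numeric else "categorical"


def identify_variables(rows, target):
    """Return {column: (role, kind)} for every column in the dataset."""
    summary = {}
    for column in rows[0]:
        role = "target" if column == target else "predictor"
        kind = variable_kind([row[column] for row in rows])
        summary[column] = (role, kind)
    return summary


for name, (role, kind) in identify_variables(rows, TARGET).items():
    print(f"{name}: {role}, {kind}")
```

In practice a numeric column can still be categorical (for example, a numeric region code), so the automatic label is only a starting point for manual review.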

Univariate Analysis
At this stage, we explore variables one by one. The method used for
univariate analysis depends on whether the variable is categorical or
continuous.
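To make the distinction concrete, here is a minimal sketch that summarizes a continuous variable with measures of central tendency and spread, and a categorical variable with counts and shares. The sample values and function names are hypothetical, and the choice of summary statistics is an assumption about what such an analysis typically includes.

```python
import statistics
from collections import Counter


def describe_continuous(values):
    """Central tendency and spread for a continuous variable."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
        "min": min(values),
        "max": max(values),
    }


def describe_categorical(values):
    """Count and share of each category."""
    counts = Counter(values)
    total = len(values)
    return {category: (n, n / total) for category, n in counts.items()}


# Hypothetical sample data
ages = [29, 34, 45, 52, 38]
cities = ["Pune", "Delhi", "Pune", "Pune", "Delhi"]

print(describe_continuous(ages))
print(describe_categorical(cities))
```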

Bi-variate Analysis
Bi-variate analysis examines the relationship between two variables.
Here, we look for association, or the lack of it, between variables at
a pre-defined significance level.
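For two continuous variables, one possible way to check association at a chosen significance level is sketched below: a Pearson correlation coefficient combined with a permutation test. The slides do not name a specific test, so the technique, the significance level `ALPHA = 0.05`, the sample data and the function names are all assumptions made for illustration.

```python
import math
import random

ALPHA = 0.05  # assumption: pre-defined significance level


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def permutation_p_value(xs, ys, n_shuffles=2000, seed=0):
    """Estimate how often shuffled data shows a correlation at least as strong."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        if abs(pearson_r(xs, shuffled)) >= observed:
            extreme += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (extreme + 1) / (n_shuffles + 1)


# Hypothetical data: y grows roughly linearly with x
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2, 13.9, 16.1]

r = pearson_r(xs, ys)
p = permutation_p_value(xs, ys)
print(f"r = {r:.3f}, p = {p:.4f}, associated at alpha={ALPHA}: {p < ALPHA}")
```

Other variable pairings call for other tools (for example, a contingency table for two categorical variables); this sketch covers only the continuous case.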

Missing Value Treatment
Missing data in the training data set can reduce the power or fit of a
model, or lead to a biased model, because the behavior of a variable and
its relationships with other variables have not been analysed correctly.
This can lead to wrong predictions or classifications.

We have looked at why treating missing values in a dataset matters.
Now, let’s identify why these missing values occur. They may arise at
two stages:
Data Extraction
Data Collection
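Whatever the cause, a first practical step is usually to count the missing values and then decide how to treat them. The sketch below counts missing entries per column and fills them with the mean (continuous) or the mode (categorical). Mean and mode imputation are swapped in here as simple, commonly used options; the slides do not prescribe a treatment method, and the data, column names and helpers are hypothetical.

```python
import statistics

# Hypothetical dataset; None marks a missing value.
rows = [
    {"age": 34, "city": "Pune"},
    {"age": None, "city": "Delhi"},
    {"age": 29, "city": None},
    {"age": 45, "city": "Pune"},
]


def missing_counts(rows):
    """Number of missing values in each column."""
    return {col: sum(1 for r in rows if r[col] is None) for col in rows[0]}


def impute(rows, column, strategy):
    """Return new rows with missing values in `column` filled in.

    strategy: "mean" for continuous columns, "mode" for categorical ones.
    """
    observed = [r[column] for r in rows if r[column] is not None]
    if strategy == "mean":
        fill = statistics.mean(observed)
    else:
        fill = statistics.mode(observed)
    # Build new dicts so the original rows stay untouched.
    return [{**r, column: fill if r[column] is None else r[column]} for r in rows]


print(missing_counts(rows))
filled = impute(impute(rows, "age", "mean"), "city", "mode")
print(filled)
```

Simple imputation like this can itself distort relationships between variables, so it is worth comparing model results with and without the filled values.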

Outlier treatment
“Outlier” is a term commonly used by analysts and data scientists,
because outliers need close attention; left unchecked, they can result
in wildly wrong estimates. Outliers can be of two types: univariate and
multivariate.

Outliers can drastically change the results of data analysis and
statistical modeling:
They increase the error variance and reduce the power of statistical
tests.
If the outliers are non-randomly distributed, they can decrease
normality.
They can bias or influence estimates that may be of substantive
interest.
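The effect on estimates is easy to see on a small sample. The sketch below flags univariate outliers with the 1.5 × IQR rule and compares the mean and standard deviation with and without them. The IQR rule and the factor `k=1.5` are a common convention assumed here, not a method given in the slides; the data and the function name are hypothetical.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical measurements with one extreme value
values = [21, 22, 23, 24, 25, 26, 27, 300]

outliers = iqr_outliers(values)
clean = [v for v in values if v not in outliers]

print("outliers:", outliers)
print("mean with / without:", statistics.mean(values), statistics.mean(clean))
print("stdev with / without:", statistics.stdev(values), statistics.stdev(clean))
```

Flagging a value is not the same as deciding to drop it: whether an outlier is an error or a genuine observation depends on how the data was generated.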

Working of Data Analysis
A working knowledge of data science can help leaders turn analytics into
genuine insight. It can also save them from making decisions based on
faulty assumptions. “When analytics goes bad,”

How can leaders learn to distinguish between good and bad analytics?
It all starts with understanding the data-generation process. You cannot
judge the quality of the analytics if you don’t have a very clear idea of
where the data came from.
