SlideShare a Scribd company logo
1 of 25
Download to read offline
R for Statistics
Prof. Ashwini Mathur
Descriptive Statistics ..
Overview of Descriptive analysis ...
Descriptive statistics are used to summarize data in a way that provides insight
into the information contained in the data.
For Example:
Examining the mean or median of numeric data or the frequency of
observations for nominal data. Plots can be created that show the data and
indicating summary statistics.
Descriptive statistics for different types of data
Choosing which summary statistics are appropriate depend on the type of
variable being examined.
Different statistics should be used for interval/ratio, ordinal, and nominal
data.
These statistical terms are also called as Measurement scales.
Nominal Data
Nominal scales are used for labeling variables, without any quantitative value. “Nominal” scales
could simply be called “labels.”
Ordinal Data
With ordinal scales, the order of the values is what’s important and significant,
but the differences between each one is not really known.
For example, is the difference between “OK” and “Unhappy” the same as the
difference between “Very Happy” and “Happy?” We can’t say.
Ordinal scales are typically measures of non-numeric concepts like
satisfaction, happiness, discomfort, etc.
Ordinal Data
Interval Data
Interval scales are numeric scales in which we know both the order and the
exact differences between the values.
The classic example of an interval scale is Celsius temperature because the
difference between each value is the same.
For example, the difference between 60 and 50 degrees is a measurable 10
degrees, as is the difference between 80 and 70 degrees.
Interval
Ratio
Ratio scales are the mostly used when it comes to data measurement scales
because they tell us about the order, they tell us the exact value between
units, AND they also have an absolute zero–which allows for a wide range of
both descriptive and inferential statistics to be applied
Ratio scales provide a wealth of possibilities when it comes to statistical
analysis.
These variables can be meaningfully added, subtracted, multiplied, divided
(ratios).
Central tendency can be measured by mode, median, or mean; measures of
dispersion, such as standard deviation and coefficient of variation can also be
calculated from ratio scales.
Summary
Describing or examining data :
Location is also called central tendency. It is a measure of the values of the
data.
For example, are the values close to 10 or 100 or 1000?
Measures of location include mean and median, as well as somewhat more exotic
statistics like M-estimators or Winsorized means.
Cont..
Variation is also called dispersion. It is a measure of how far the data points
lie from one another. Common statistics include standard deviation and
coefficient of variation. For data that aren’t normally-distributed, percentiles or
the interquartile range might be used.
Shape refers to the distribution of values. The best tools to evaluate the
shape of data are histograms and related plots. Statistics include skewness and
kurtosis, though they are less useful than visual inspection. We can describe data
shape as normally-distributed, log-normal, uniform, skewed, bi-modal, and
others.
Descriptive statistics for interval/ratio data
For this example, imagine that Ren and Stimpy have
each held eight workshops to educating the public
about water conservation at home.
They are interested in how many people showed up
to the workshops.
Because the data formatted in a data frame, we can use the convention Data$Attendees to access the
variable Attendees
within the data frame Data.
Input = ("
Instructor Location Attendees
Ren North 7
Ren North 22
Ren North 6
Ren North 15
Ren South 12
Ren South 13
Ren South 14
Ren South 16
Stimpy North 18
Stimpy North 17
Stimpy North 15
Stimpy North 9
Stimpy South 15
Stimpy South 11
Stimpy South 19
Stimpy South 23
")
R Syntax :
Data = read.table(textConnection(Input),header=TRUE)
Data ### Will output data frame called Data
str(Data) ### but shows the structure of the data
frame
summary(Data) ### but summarizes variables in the data
frame
Functions sum and length
The sum of a variable can be found with the sum function, and the number of
observations can be found with the length function.
sum(Data$Attendees)
232
length(Data$Attendees)
16
Statistics of location for interval/ratio data
Mean
The mean is the arithmetic average, and is a common statistic used with
interval/ratio data. It is simply the sum of the values divided by the number of
values. The mean function in R will return the mean.
sum(Data$Attendees) / length(Data$Attendees)
14.5
mean(Data$Attendees)
14.5
Important :
Caution should be used when reporting mean values with skewed data, as the mean
may not be representative of the center of the data.
For example, imagine a town with 10 families, nine of whom have an income of less than $50, 000 per
year, but with one family with an income of $2,000,000 per year. The mean income for families in the
town would be $233,000, but this may not be a reasonable way to summarize the income of the
town.
Income = c(49000, 44000, 25000, 18000, 32000, 47000, 37000, 45000, 36000,
2000000)
mean(Income)
233300
Median
The median is defined as the value below which are 50% of the observations. To find this value
manually, you would order the observations, and separate the lowest 50% from the highest 50%. For data
sets with an odd number of observations, the median is the middle value. For data sets with an even
number of observations, the median falls half-way between the two middle values.
The median is a robust statistic in that it is not affected by adding extreme values. For example, if we
changed Stimpy’s last Attendees value from 23 to 1000, it would not affect the median.
median(Data$Attendees)
15
### Note that in this case the mean and median are close in value
### to one another. The mean and median will be more different
### the more the data are skewed.
The median is appropriate for either skewed or unskewed data. The median income for the town
discussed above is $40,500. Half the families in the town have an income above this amount, and half
have an income below this amount.
Income = c(49000, 44000, 25000, 18000, 32000, 47000, 37000, 45000, 36000,
2000000)
median(Income)
40500
Note that medians are sometimes reported as the “average person” or “typical family”. Saying, “The
average American family earned $54,000 last year” means that the median income for families was
$54,000. The “average family” is that one with the median income.
Mode
The mode is a summary statistic that is used rarely in practice, but is normally included in any
discussion of mean and medians. When there are discrete values for a variable, the mode is simply the
value which occurs most frequently. For example, in the Statistics Learning Center video in the
Required Readings below, Dr. Nic gives an example of counting the number of pairs of shoes each student
owns. The most common answer was 10, and therefore 10 is the mode for that data set.
For our Ren and Stimpy example, the value 15 occurs three times and so is the mode.
The Mode function can be found in the package DescTools.
library(DescTools)
Mode(Data$Attendees)
15
DOUBT ???

More Related Content

What's hot

Das20502 chapter 1 descriptive statistics
Das20502 chapter 1 descriptive statisticsDas20502 chapter 1 descriptive statistics
Das20502 chapter 1 descriptive statistics
Rozainita Rosley
 
Fernandos Statistics
Fernandos StatisticsFernandos Statistics
Fernandos Statistics
teresa_soto
 
General Statistics boa
General Statistics boaGeneral Statistics boa
General Statistics boa
raileeanne
 

What's hot (20)

Statistical Analysis with R -I
Statistical Analysis with R -IStatistical Analysis with R -I
Statistical Analysis with R -I
 
Areas In Statistics
Areas In StatisticsAreas In Statistics
Areas In Statistics
 
Das20502 chapter 1 descriptive statistics
Das20502 chapter 1 descriptive statisticsDas20502 chapter 1 descriptive statistics
Das20502 chapter 1 descriptive statistics
 
Basic statisctis -Anandh Shankar
Basic statisctis -Anandh ShankarBasic statisctis -Anandh Shankar
Basic statisctis -Anandh Shankar
 
Panel slides
Panel slidesPanel slides
Panel slides
 
Is the Data Scaled, Ordinal, or Nominal Proportional?
Is the Data Scaled, Ordinal, or Nominal Proportional?Is the Data Scaled, Ordinal, or Nominal Proportional?
Is the Data Scaled, Ordinal, or Nominal Proportional?
 
Statistics Class 10 CBSE
Statistics Class 10 CBSE Statistics Class 10 CBSE
Statistics Class 10 CBSE
 
Statistics
StatisticsStatistics
Statistics
 
Exploratory data analysis project
Exploratory data analysis project Exploratory data analysis project
Exploratory data analysis project
 
Major types of statistics terms that you should know
Major types of statistics terms that you should knowMajor types of statistics terms that you should know
Major types of statistics terms that you should know
 
Data Analysis and Statistics
Data Analysis and StatisticsData Analysis and Statistics
Data Analysis and Statistics
 
Fernandos Statistics
Fernandos StatisticsFernandos Statistics
Fernandos Statistics
 
Descriptive Statistics
Descriptive StatisticsDescriptive Statistics
Descriptive Statistics
 
statistic
statisticstatistic
statistic
 
General Statistics boa
General Statistics boaGeneral Statistics boa
General Statistics boa
 
Statistics for data scientists
Statistics for  data scientistsStatistics for  data scientists
Statistics for data scientists
 
PG STAT 531 lecture 1 introduction about statistics and collection, compilati...
PG STAT 531 lecture 1 introduction about statistics and collection, compilati...PG STAT 531 lecture 1 introduction about statistics and collection, compilati...
PG STAT 531 lecture 1 introduction about statistics and collection, compilati...
 
STATISTICAL PROCEDURES (Discriptive Statistics).pptx
STATISTICAL PROCEDURES (Discriptive Statistics).pptxSTATISTICAL PROCEDURES (Discriptive Statistics).pptx
STATISTICAL PROCEDURES (Discriptive Statistics).pptx
 
Statistical analysis and its applications
Statistical analysis and its applicationsStatistical analysis and its applications
Statistical analysis and its applications
 
Statistical table
Statistical tableStatistical table
Statistical table
 

Similar to R for statistics session 1

B409 W11 Sas Collaborative Stats Guide V4.2
B409 W11 Sas Collaborative Stats Guide V4.2B409 W11 Sas Collaborative Stats Guide V4.2
B409 W11 Sas Collaborative Stats Guide V4.2
marshalkalra
 
initial postWhat are the characteristics, uses, advantages, and di.docx
initial postWhat are the characteristics, uses, advantages, and di.docxinitial postWhat are the characteristics, uses, advantages, and di.docx
initial postWhat are the characteristics, uses, advantages, and di.docx
JeniceStuckeyoo
 
Graphicalrepresntationofdatausingstatisticaltools2019_210902_105156.pdf
Graphicalrepresntationofdatausingstatisticaltools2019_210902_105156.pdfGraphicalrepresntationofdatausingstatisticaltools2019_210902_105156.pdf
Graphicalrepresntationofdatausingstatisticaltools2019_210902_105156.pdf
Himakshi7
 

Similar to R for statistics session 1 (20)

Data science 101 statistics overview
Data science 101 statistics overviewData science 101 statistics overview
Data science 101 statistics overview
 
Data science Course: Statistics overview
Data science Course: Statistics overviewData science Course: Statistics overview
Data science Course: Statistics overview
 
B409 W11 Sas Collaborative Stats Guide V4.2
B409 W11 Sas Collaborative Stats Guide V4.2B409 W11 Sas Collaborative Stats Guide V4.2
B409 W11 Sas Collaborative Stats Guide V4.2
 
MMW (Data Management)-Part 1 for ULO 2 (1).pptx
MMW (Data Management)-Part 1 for ULO 2 (1).pptxMMW (Data Management)-Part 1 for ULO 2 (1).pptx
MMW (Data Management)-Part 1 for ULO 2 (1).pptx
 
initial postWhat are the characteristics, uses, advantages, and di.docx
initial postWhat are the characteristics, uses, advantages, and di.docxinitial postWhat are the characteristics, uses, advantages, and di.docx
initial postWhat are the characteristics, uses, advantages, and di.docx
 
Machine learning session1
Machine learning   session1Machine learning   session1
Machine learning session1
 
STATISTICS.pptx
STATISTICS.pptxSTATISTICS.pptx
STATISTICS.pptx
 
Presentation1
Presentation1Presentation1
Presentation1
 
Quatitative Data Analysis
Quatitative Data Analysis Quatitative Data Analysis
Quatitative Data Analysis
 
EDUCATIONAL STATISTICS_Unit_I.ppt
EDUCATIONAL STATISTICS_Unit_I.pptEDUCATIONAL STATISTICS_Unit_I.ppt
EDUCATIONAL STATISTICS_Unit_I.ppt
 
Graphicalrepresntationofdatausingstatisticaltools2019_210902_105156.pdf
Graphicalrepresntationofdatausingstatisticaltools2019_210902_105156.pdfGraphicalrepresntationofdatausingstatisticaltools2019_210902_105156.pdf
Graphicalrepresntationofdatausingstatisticaltools2019_210902_105156.pdf
 
Topic 2 Measures of Central Tendency.pptx
Topic 2   Measures of Central Tendency.pptxTopic 2   Measures of Central Tendency.pptx
Topic 2 Measures of Central Tendency.pptx
 
Statistics-24-04-2021-20210618114031.ppt
Statistics-24-04-2021-20210618114031.pptStatistics-24-04-2021-20210618114031.ppt
Statistics-24-04-2021-20210618114031.ppt
 
Statistics-24-04-2021-20210618114031.ppt
Statistics-24-04-2021-20210618114031.pptStatistics-24-04-2021-20210618114031.ppt
Statistics-24-04-2021-20210618114031.ppt
 
Statistics-24-04-2021-20210618114031.ppt
Statistics-24-04-2021-20210618114031.pptStatistics-24-04-2021-20210618114031.ppt
Statistics-24-04-2021-20210618114031.ppt
 
Statistics as a discipline
Statistics as a disciplineStatistics as a discipline
Statistics as a discipline
 
Business statistics (Basics)
Business statistics (Basics)Business statistics (Basics)
Business statistics (Basics)
 
DATA UNIT-3.pptx
DATA UNIT-3.pptxDATA UNIT-3.pptx
DATA UNIT-3.pptx
 
Descriptive and Inferential Statistics.docx
Descriptive and Inferential Statistics.docxDescriptive and Inferential Statistics.docx
Descriptive and Inferential Statistics.docx
 
statistics class 11
statistics class 11statistics class 11
statistics class 11
 

More from Ashwini Mathur

Linear programming optimization in r
Linear programming optimization in r Linear programming optimization in r
Linear programming optimization in r
Ashwini Mathur
 
Introduction to python along with the comparitive analysis with r
Introduction to python   along with the comparitive analysis with r Introduction to python   along with the comparitive analysis with r
Introduction to python along with the comparitive analysis with r
Ashwini Mathur
 
Example 2 summerization notes for descriptive statistics using r
Example   2    summerization notes for descriptive statistics using r Example   2    summerization notes for descriptive statistics using r
Example 2 summerization notes for descriptive statistics using r
Ashwini Mathur
 
Descriptive analytics in r programming language
Descriptive analytics in r programming languageDescriptive analytics in r programming language
Descriptive analytics in r programming language
Ashwini Mathur
 

More from Ashwini Mathur (19)

S3 classes and s4 classes
S3 classes and s4 classesS3 classes and s4 classes
S3 classes and s4 classes
 
Summerization notes for descriptive statistics using r
Summerization notes for descriptive statistics using r Summerization notes for descriptive statistics using r
Summerization notes for descriptive statistics using r
 
Rpy2 demonstration
Rpy2 demonstrationRpy2 demonstration
Rpy2 demonstration
 
Rpy package
Rpy packageRpy package
Rpy package
 
R programming lab 3 - jupyter notebook
R programming lab   3 - jupyter notebookR programming lab   3 - jupyter notebook
R programming lab 3 - jupyter notebook
 
R programming lab 2 - jupyter notebook
R programming lab   2 - jupyter notebookR programming lab   2 - jupyter notebook
R programming lab 2 - jupyter notebook
 
R programming lab 1 - jupyter notebook
R programming lab   1 - jupyter notebookR programming lab   1 - jupyter notebook
R programming lab 1 - jupyter notebook
 
R for statistics 2
R for statistics 2R for statistics 2
R for statistics 2
 
Play with matrix in r
Play with matrix in rPlay with matrix in r
Play with matrix in r
 
Object oriented programming in r
Object oriented programming in r Object oriented programming in r
Object oriented programming in r
 
Linear programming optimization in r
Linear programming optimization in r Linear programming optimization in r
Linear programming optimization in r
 
Introduction to python along with the comparitive analysis with r
Introduction to python   along with the comparitive analysis with r Introduction to python   along with the comparitive analysis with r
Introduction to python along with the comparitive analysis with r
 
Example 2 summerization notes for descriptive statistics using r
Example   2    summerization notes for descriptive statistics using r Example   2    summerization notes for descriptive statistics using r
Example 2 summerization notes for descriptive statistics using r
 
Descriptive statistics assignment
Descriptive statistics assignment Descriptive statistics assignment
Descriptive statistics assignment
 
Descriptive analytics in r programming language
Descriptive analytics in r programming languageDescriptive analytics in r programming language
Descriptive analytics in r programming language
 
Data analysis for covid 19
Data analysis for covid 19Data analysis for covid 19
Data analysis for covid 19
 
Correlation and linear regression
Correlation and linear regression Correlation and linear regression
Correlation and linear regression
 
Calling c functions from r programming unit 5
Calling c functions from r programming    unit 5Calling c functions from r programming    unit 5
Calling c functions from r programming unit 5
 
Anova (analysis of variance) test
Anova (analysis of variance) test Anova (analysis of variance) test
Anova (analysis of variance) test
 

Recently uploaded

CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
amitlee9823
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Riyadh +966572737505 get cytotec
 
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night StandCall Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
amitlee9823
 
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night StandCall Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
amitlee9823
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
ZurliaSoop
 
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
amitlee9823
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
amitlee9823
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
amitlee9823
 
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
only4webmaster01
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
amitlee9823
 
CHEAP Call Girls in Rabindra Nagar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Rabindra Nagar  (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Rabindra Nagar  (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Rabindra Nagar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
karishmasinghjnh
 

Recently uploaded (20)

CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
 
Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Research
 
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night StandCall Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
 
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night StandCall Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
 
Anomaly detection and data imputation within time series
Anomaly detection and data imputation within time seriesAnomaly detection and data imputation within time series
Anomaly detection and data imputation within time series
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
 
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24  Building Real-Time Pipelines With FLaNKDATA SUMMIT 24  Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
 
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
 
Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -
Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -
Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
 
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
 
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
 
CHEAP Call Girls in Rabindra Nagar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Rabindra Nagar  (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Rabindra Nagar  (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Rabindra Nagar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
 
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
 

R for statistics session 1

  • 1. R for Statistics Prof. Ashwini Mathur
  • 3. Overview of Descriptive analysis ... Descriptive statistics are used to summarize data in a way that provides insight into the information contained in the data. For Example: Examining the mean or median of numeric data or the frequency of observations for nominal data. Plots can be created that show the data and indicating summary statistics.
  • 4. Descriptive statistics for different types of data Choosing which summary statistics are appropriate depend on the type of variable being examined. Different statistics should be used for interval/ratio, ordinal, and nominal data. These statistical terms are also called as Measurement scales.
  • 5. Nominal Data Nominal scales are used for labeling variables, without any quantitative value. “Nominal” scales could simply be called “labels.”
  • 6. Ordinal Data With ordinal scales, the order of the values is what’s important and significant, but the differences between each one is not really known. For example, is the difference between “OK” and “Unhappy” the same as the difference between “Very Happy” and “Happy?” We can’t say. Ordinal scales are typically measures of non-numeric concepts like satisfaction, happiness, discomfort, etc.
  • 8. Interval Data Interval scales are numeric scales in which we know both the order and the exact differences between the values. The classic example of an interval scale is Celsius temperature because the difference between each value is the same. For example, the difference between 60 and 50 degrees is a measurable 10 degrees, as is the difference between 80 and 70 degrees.
  • 10. Ratio Ratio scales are the mostly used when it comes to data measurement scales because they tell us about the order, they tell us the exact value between units, AND they also have an absolute zero–which allows for a wide range of both descriptive and inferential statistics to be applied
  • 11. Ratio scales provide a wealth of possibilities when it comes to statistical analysis. These variables can be meaningfully added, subtracted, multiplied, divided (ratios). Central tendency can be measured by mode, median, or mean; measures of dispersion, such as standard deviation and coefficient of variation can also be calculated from ratio scales.
  • 13. Describing or examining data : Location is also called central tendency. It is a measure of the values of the data. For example, are the values close to 10 or 100 or 1000? Measures of location include mean and median, as well as somewhat more exotic statistics like M-estimators or Winsorized means.
  • 14. Cont.. Variation is also called dispersion. It is a measure of how far the data points lie from one another. Common statistics include standard deviation and coefficient of variation. For data that aren’t normally-distributed, percentiles or the interquartile range might be used. Shape refers to the distribution of values. The best tools to evaluate the shape of data are histograms and related plots. Statistics include skewness and kurtosis, though they are less useful than visual inspection. We can describe data shape as normally-distributed, log-normal, uniform, skewed, bi-modal, and others.
  • 15. Descriptive statistics for interval/ratio data
  • 16. For this example, imagine that Ren and Stimpy have each held eight workshops to educating the public about water conservation at home. They are interested in how many people showed up to the workshops.
  • 17. Because the data formatted in a data frame, we can use the convention Data$Attendees to access the variable Attendees within the data frame Data. Input = (" Instructor Location Attendees Ren North 7 Ren North 22 Ren North 6 Ren North 15 Ren South 12 Ren South 13 Ren South 14 Ren South 16 Stimpy North 18 Stimpy North 17 Stimpy North 15 Stimpy North 9 Stimpy South 15 Stimpy South 11 Stimpy South 19 Stimpy South 23 ")
  • 18. R Syntax : Data = read.table(textConnection(Input),header=TRUE) Data ### Will output data frame called Data str(Data) ### but shows the structure of the data frame summary(Data) ### but summarizes variables in the data frame
  • 19. Functions sum and length The sum of a variable can be found with the sum function, and the number of observations can be found with the length function. sum(Data$Attendees) 232 length(Data$Attendees) 16
  • 20. Statistics of location for interval/ratio data Mean The mean is the arithmetic average, and is a common statistic used with interval/ratio data. It is simply the sum of the values divided by the number of values. The mean function in R will return the mean. sum(Data$Attendees) / length(Data$Attendees) 14.5 mean(Data$Attendees) 14.5
  • 21. Important : Caution should be used when reporting mean values with skewed data, as the mean may not be representative of the center of the data. For example, imagine a town with 10 families, nine of whom have an income of less than $50, 000 per year, but with one family with an income of $2,000,000 per year. The mean income for families in the town would be $233,000, but this may not be a reasonable way to summarize the income of the town. Income = c(49000, 44000, 25000, 18000, 32000, 47000, 37000, 45000, 36000, 2000000) mean(Income) 233300
  • 22. Median The median is defined as the value below which are 50% of the observations. To find this value manually, you would order the observations, and separate the lowest 50% from the highest 50%. For data sets with an odd number of observations, the median is the middle value. For data sets with an even number of observations, the median falls half-way between the two middle values. The median is a robust statistic in that it is not affected by adding extreme values. For example, if we changed Stimpy’s last Attendees value from 23 to 1000, it would not affect the median. median(Data$Attendees) 15 ### Note that in this case the mean and median are close in value ### to one another. The mean and median will be more different ### the more the data are skewed.
  • 23. The median is appropriate for either skewed or unskewed data. The median income for the town discussed above is $40,500. Half the families in the town have an income above this amount, and half have an income below this amount. Income = c(49000, 44000, 25000, 18000, 32000, 47000, 37000, 45000, 36000, 2000000) median(Income) 40500 Note that medians are sometimes reported as the “average person” or “typical family”. Saying, “The average American family earned $54,000 last year” means that the median income for families was $54,000. The “average family” is that one with the median income.
  • 24. Mode The mode is a summary statistic that is used rarely in practice, but is normally included in any discussion of mean and medians. When there are discrete values for a variable, the mode is simply the value which occurs most frequently. For example, in the Statistics Learning Center video in the Required Readings below, Dr. Nic gives an example of counting the number of pairs of shoes each student owns. The most common answer was 10, and therefore 10 is the mode for that data set. For our Ren and Stimpy example, the value 15 occurs three times and so is the mode. The Mode function can be found in the package DescTools. library(DescTools) Mode(Data$Attendees) 15