Introduction to
Descriptive Statistics I
Sanju Rusara Seneviratne MBPsS
Overview of Intro to Descriptive Statistics I
This lecture will cover the following topics:
• Definition and Types of Descriptive Statistics
• Mean, Median, Mode and Range
• Skewness and Kurtosis
• Normality Curve
• Variance and Standard Deviation
• Quartiles
• Percentiles
• Using Excel for Descriptive Statistics
Defining Descriptive Statistics
Descriptive statistics is the analysis of data that helps describe, show or summarize
data in a meaningful way such that, for example, patterns
might emerge from the data.
Descriptive statistics do not, however, allow us to draw conclusions beyond
the data we have analyzed or reach conclusions regarding
any hypotheses we might have made.
Descriptive vs. Inferential:
Descriptive statistics are used to describe our samples and
inferential statistics are used to generalize from our samples to
the wider population.
Types of Descriptive Statistics
1. Measures of central tendency:
These are ways of describing the central position of a
frequency distribution for a group of data.
  - We can describe this central position using a number of statistics,
including the mode, median, and mean.
2. Measures of spread:
These are ways of summarizing a group of data by
describing how spread out the scores are.
  - To describe this spread, a number of statistics are available
to us, including the range, quartiles, absolute deviation, variance
and standard deviation.
Summarizing Descriptive Statistics
When we use descriptive statistics it is useful to summarize
our group of data using a combination of:
• tabulated description (i.e., tables)
• graphical description (i.e., graphs and charts)
• statistical commentary (i.e., a discussion of the results)
Mean, Median, Mode and Range
• Mean - The mean is the average of all numbers and is sometimes
called the arithmetic mean. To calculate the mean, add all of the numbers
in a set and then divide the sum by the total count of numbers.
• Median - The statistical median is the middle number in a sequence
of numbers. To find the median, organize the numbers in order by
size; the number in the middle is the median. With an even count of
numbers, the median is the mean of the two middle numbers.
• Mode - The mode is the number that occurs most often within a set
of numbers.
• Range - The range is the difference between the highest and lowest
values within a set of numbers. To calculate the range, subtract the
smallest number from the largest number in the set.
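As a minimal sketch of the four calculations above, assuming Python's standard statistics module and a made-up set of scores (the variable names and values are illustrative and not part of the lecture):

```python
import statistics

# Hypothetical data set used only for illustration.
scores = [2, 3, 3, 5, 7, 10]

mean = statistics.mean(scores)            # sum of the values divided by their count
median = statistics.median(scores)        # middle value of the sorted data
mode = statistics.mode(scores)            # most frequently occurring value
value_range = max(scores) - min(scores)   # largest value minus smallest value

print(mean, median, mode, value_range)
```

Because this example has six values, the median is the mean of the two middle values (3 and 5).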
Skewness and Kurtosis
• Skewness - a measure of symmetry, or more precisely,
the lack of symmetry. A distribution, or data set, is
symmetric if it looks the same to the left and right of the
center point.
• Kurtosis - a measure of whether the data are heavy-tailed
or light-tailed relative to a normal distribution. That
is, data sets with high kurtosis tend to have heavy tails, or
outliers. Data sets with low kurtosis tend to have light tails,
or a lack of outliers. A uniform distribution would be the
extreme case.
• The histogram is an effective graphical technique for
showing both the skewness and kurtosis of a data set.
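To make the two measures concrete, here is a small sketch using the simple moment-based formulas. This is one common convention, assumed here for illustration; statistical packages often apply small-sample corrections, so their values may differ slightly. The function names and data are hypothetical.

```python
def central_moment(data, k):
    """k-th central moment: the average of (x - mean) ** k."""
    m = sum(data) / len(data)
    return sum((x - m) ** k for x in data) / len(data)

def skewness(data):
    """Moment-based skewness; 0 for perfectly symmetric data."""
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5

def excess_kurtosis(data):
    """Kurtosis minus 3, so that a normal distribution scores about 0."""
    return central_moment(data, 4) / central_moment(data, 2) ** 2 - 3

symmetric = [1, 2, 3, 4, 5]          # evenly spread, no tails
right_skewed = [1, 1, 2, 2, 3, 10]   # one large value pulls the right tail out

print(skewness(symmetric), skewness(right_skewed), excess_kurtosis(symmetric))
```

The evenly spread data set has zero skewness and negative excess kurtosis (light tails), while the data set with one large value has positive skewness.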
Normality Curve
• The normal distribution is the most important and most widely used
distribution in statistics. It is sometimes called the "bell curve" or the
"Gaussian curve".
Seven Features of Normal Distributions
1. Normal distributions are symmetric around their mean.
2. The mean, median, and mode of a normal distribution are
equal.
3. The area under the normal curve is equal to 1.0.
4. Normal distributions are denser in the center and less dense in
the tails.
5. Normal distributions are defined by two parameters, the mean
(μ) and the standard deviation (σ).
6. Approximately 68% of the area of a normal distribution is within one standard
deviation of the mean.
7. Approximately 95% of the area of a normal distribution is
within two standard deviations of the mean.
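Features 6 and 7 can be checked numerically. The sketch below assumes Python 3.8 or later, where the standard library provides statistics.NormalDist; the helper name area_within is made up for this example.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def area_within(k):
    """Area under the normal curve between mean - k*sigma and mean + k*sigma."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

within_one = area_within(1)   # roughly 0.68
within_two = area_within(2)   # roughly 0.95

print(round(within_one, 4), round(within_two, 4))
```

The same proportions hold for any mean and standard deviation, because the area depends only on the number of standard deviations from the mean.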
Variance and Standard Deviation
• Variance: measures how far a data set is spread out. The
technical definition is "the average of the squared
differences from the mean," which gives
a general idea of the spread of your data, expressed in squared units.
  - A value of zero means that there is no variability: all the
numbers in the data set are the same.
• Standard Deviation: the square root of the variance.
Because it is expressed in the same units as the data, the
standard deviation is easier to interpret than the variance: it
describes the typical distance of scores from the mean.
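A short sketch of both measures, assuming Python's standard statistics module and an invented data set. Note the distinction, not covered in the slide, between the population versions (divide by n, matching the "average of the squared differences" definition) and the sample versions (divide by n - 1).

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]   # illustrative values with a mean of 5

population_variance = statistics.pvariance(data)   # squared differences averaged over n
population_sd = statistics.pstdev(data)            # square root of the population variance
sample_variance = statistics.variance(data)        # squared differences divided by n - 1
sample_sd = statistics.stdev(data)                 # square root of the sample variance

print(population_variance, population_sd, sample_variance, sample_sd)
```

Here the squared differences from the mean sum to 32, so the population variance is 32 / 8 = 4 and the population standard deviation is 2.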
Quartiles
• Quartiles in statistics are values that divide your data into
quarters. They divide your data into four segments
according to where the numbers fall on the number line.
• The four quarters of a data set are:
  - The lowest 25% of numbers.
  - The next lowest 25% of numbers (up to the median).
  - The second highest 25% of numbers (above the median).
  - The highest 25% of numbers.
Percentiles
• The most common definition of a percentile is a number where a certain
percentage of scores fall below that number.
  - The 25th percentile is also called the first quartile.
  - The 50th percentile is generally the median (depending on which definition of percentile is used; see below).
  - The 75th percentile is also called the third quartile.
  - The difference between the third and first quartiles is the interquartile range.
• Percentile Rank:
  - Alternative definition: the nth percentile is the lowest score that is greater than a certain
percentage ("n") of the scores.
  - Alternative definition: the nth percentile is the smallest score that is greater than or equal to a
certain percentage of the scores.
  - The percentile rank of an observation is the percentage of
data that falls at or below that observation.
• A percentile range is the difference between two specified percentiles.
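A minimal sketch of these ideas, under stated assumptions: percentile_rank follows the "percentage of data at or below an observation" wording, and nearest_rank_percentile is one possible reading of the "greater than or equal to" definition (often called the nearest-rank method). Both function names and the scores are hypothetical.

```python
import math

def percentile_rank(scores, observation):
    """Percentage of scores that fall at or below the observation."""
    at_or_below = sum(1 for s in scores if s <= observation)
    return 100 * at_or_below / len(scores)

def nearest_rank_percentile(scores, p):
    """Smallest score with at least p% of the scores at or below it."""
    ordered = sorted(scores)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]

scores = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]

rank_of_80 = percentile_rank(scores, 80)
p25 = nearest_rank_percentile(scores, 25)
p75 = nearest_rank_percentile(scores, 75)
percentile_range = p75 - p25   # difference between two specified percentiles

print(rank_of_80, p25, p75, percentile_range)
```

Other definitions interpolate between neighbouring scores, so different software can report slightly different percentiles for the same data.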
Conducting Descriptive Analysis in Excel
• Step 1: Type your data into Excel, in a single column. For
example, if you have ten items in your data set, type them
into cells A1 through A10.
• Step 2: Click the "Data" tab and then click "Data
Analysis" in the Analysis group.
• Step 3: Highlight "Descriptive Statistics" in the pop-up
Data Analysis window.
• Step 4: Type an input range into the "Input Range"
text box. For this example, type "A1:A10" into the box.
Conducting Descriptive Analysis in Excel
• Step 5: Check the "Labels in first row" check box if you
have titled the column in row 1; otherwise leave the box
unchecked.
• Step 6: Type a cell location into the "Output Range"
box. For example, type "C1". Make sure that the two adjacent
columns do not have data in them.
• Step 7: Click the "Summary Statistics" check box and
then click "OK" to display Excel descriptive statistics. A list
of descriptive statistics will be returned in the column you
selected as the Output Range.
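For readers without Excel, the steps above can be approximated in Python. The sketch below is an assumption-laden stand-in rather than Excel's own method: the function name summary_statistics, the labels, and the ten values are invented for illustration, and the skewness and kurtosis rows that Excel also reports are omitted.

```python
import math
import statistics

def summary_statistics(values):
    """Return a dict loosely mirroring a spreadsheet summary-statistics table."""
    n = len(values)
    sd = statistics.stdev(values)   # sample standard deviation (n - 1)
    return {
        "Mean": statistics.mean(values),
        "Standard Error": sd / math.sqrt(n),
        "Median": statistics.median(values),
        "Mode": statistics.mode(values),
        "Standard Deviation": sd,
        "Sample Variance": statistics.variance(values),
        "Range": max(values) - min(values),
        "Minimum": min(values),
        "Maximum": max(values),
        "Sum": sum(values),
        "Count": n,
    }

# Ten hypothetical values standing in for cells A1:A10.
column_a = [12, 15, 15, 18, 20, 22, 22, 22, 25, 29]
report = summary_statistics(column_a)

for label, value in report.items():
    print(f"{label}: {value}")
```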
Introduction to
Descriptive Statistics II
Sanju Rusara Seneviratne MBPsS
Overview of Intro to Descriptive Statistics II
This lecture will cover the following topics:
• Bar Charts
• Pie Charts
• Histograms
• Box-Plots
• Scatter Plots
Bar Charts
• A bar graph (also known as a bar chart or bar diagram) is
a visual tool that uses bars to compare data among
categories. A bar graph may run horizontally or vertically.
The important thing to know is that the longer the bar, the
greater its value.
• Bar graphs consist of two axes.
  - On a vertical bar graph, the horizontal axis (or x-axis)
shows the data categories.
  - The vertical axis (or y-axis) is the scale.
Bar Charts
• Bar graphs have three key attributes:
1. A bar diagram makes it easy to compare sets of data
between different groups at a glance.
2. The graph represents categories on one axis and a
discrete value on the other. The goal is to show the
relationship between the two axes.
3. Bar charts can also show big changes in data over
time.
Examples of Bar Charts
Pie Charts
• A pie chart is a circular graph that shows the relative
contribution of different categories to an
overall total.
• A wedge of the circle represents each category's
contribution, such that the graph resembles a pie that
has been cut into different sized slices.
• Each 1% that a category contributes to the
total corresponds to a slice with an angle of 3.6 degrees.
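The 3.6 degree rule follows from dividing the full circle of 360 degrees into 100 percentage points. A small sketch, using an invented survey and a hypothetical helper named slice_angles:

```python
def slice_angles(counts):
    """Map each category to its slice angle in degrees (360 times its share of the total)."""
    total = sum(counts.values())
    return {name: 360 * value / total for name, value in counts.items()}

# Hypothetical survey of pet ownership; the counts happen to sum to 100.
pets = {"dog": 50, "cat": 30, "fish": 15, "other": 5}
angles = slice_angles(pets)

for name, angle in angles.items():
    print(f"{name}: {angle} degrees")
```

In this example "other" makes up 5% of the total, so its slice spans 5 x 3.6 = 18 degrees, and all the slices together add up to 360 degrees.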
Pie Charts
• Pie charts are a visual way of displaying data that might
otherwise be given in a small table.
• Pie charts are useful for displaying data that are classified
into nominal or ordinal categories.
  - Nominal data are categorised according to descriptive or
qualitative information such as county of birth or type of
pet owned.
  - Ordinal data are similar but the different categories can
also be ranked; for example, in a survey people may be
asked to say whether they classed something as very poor,
poor, fair, good, or very good.
Pie Charts
• Pie charts are generally used to show percentage or
proportional data, and usually the percentage represented
by each category is provided next to the corresponding
slice of pie.
• Pie charts are good for displaying data for around 6
categories or fewer. When there are more categories it is
difficult for the eye to distinguish between the relative
sizes of the different sectors and so the chart becomes
difficult to interpret.
Examples of Pie Charts
Histograms
• A histogram is a plot that lets you discover, and show, the
underlying frequency distribution (shape) of a set
of continuous data. This allows the inspection of the data
for its underlying distribution (e.g., normal distribution),
outliers, skewness, etc.
• It is the area of the bar that indicates the frequency of
occurrences for each bin. This means that the height of
the bar does not necessarily indicate how many
occurrences of scores there were within each individual
bin. It is the product of the height and the width of
the bin that indicates the frequency of occurrences within
that bin.
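The relationship between height, width and frequency can be sketched as follows. The bins and frequencies are invented for illustration, and the bar height is taken to be the frequency divided by the bin width (often called frequency density).

```python
# Each bin: (lower edge, upper edge, frequency). Hypothetical ages with unequal bin widths.
bins = [(0, 10, 20), (10, 20, 30), (20, 40, 30), (40, 80, 20)]

# Bar height is the frequency spread over the width of the bin.
heights = [freq / (hi - lo) for lo, hi, freq in bins]

for (lo, hi, freq), height in zip(bins, heights):
    area = height * (hi - lo)   # area of the bar recovers the frequency
    print(f"{lo}-{hi}: height={height:.2f}, area={area:.0f}, frequency={freq}")
```

The 10-20 and 20-40 bins hold the same number of scores, yet the wider bin is drawn half as tall; only the areas of the two bars are equal.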
Histograms
• The height of the bars is often mistaken for the frequency because
many histograms have equally spaced bars (bins), and under these
circumstances the height of each bar does reflect the
frequency.
• The major difference between a histogram and a bar chart is that
a histogram is only used to
plot the frequency of score occurrences in a continuous
data set that has been divided into classes, called bins. Bar
charts, on the other hand, can be used for many
other types of variables, including ordinal and nominal
data sets.
Histograms
A histogram showing frequencies of
different age groups in a sample.
Thinking Point:
What can you infer about the
normal distribution of this data
from this chart?
Box-Plots
• A boxplot is a standardized way of displaying the
distribution of data based on a five number summary
("minimum", first quartile (Q1), median, third quartile (Q3),
and "maximum").
• It can tell you about your outliers and what their values
are.
• It can also tell you if your data is symmetrical, how tightly
your data is grouped, and if and how your data is skewed.
Example of a Box-Plot
See next slide for description of this box-plot.
Elements of a Box-Plot
• A boxplot is a graph that gives you a good indication of
how the values in the data are spread out.
  - median (Q2/50th Percentile): the middle value of the dataset.
  - first quartile (Q1/25th Percentile): the middle number between
the smallest number (not the "minimum") and the median of the
dataset.
  - third quartile (Q3/75th Percentile): the middle value between the
median and the highest value (not the "maximum") of the dataset.
  - interquartile range (IQR): 25th to the 75th percentile.
  - whiskers (shown in blue)
  - outliers (shown as green circles)
  - "maximum": Q3 + 1.5*IQR
  - "minimum": Q1 - 1.5*IQR
Scatter Plots
• A scatter plot is a two-dimensional data visualization that
uses dots to represent the values obtained for two
different variables - one plotted along the x-axis and the
other plotted along the y-axis.
• Scatter plots are used when you want to show the
relationship between two variables. Scatter plots are
sometimes called correlation plots because they show
how two variables are correlated.
• However, not all relationships are linear.
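To illustrate why linearity matters, the sketch below computes the Pearson correlation coefficient (a measure of linear association, introduced here as an assumption since the slide does not name a specific statistic) for a straight-line relationship and for a U-shaped one. The function name and data are made up.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    co_variation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return co_variation / (spread_x * spread_y)

xs = [-3, -2, -1, 0, 1, 2, 3]
linear = [2 * x + 1 for x in xs]   # straight-line relationship
curved = [x ** 2 for x in xs]      # U-shaped relationship

r_linear = pearson_r(xs, linear)
r_curved = pearson_r(xs, curved)

print(round(r_linear, 3), round(r_curved, 3))
```

The U-shaped data are perfectly predictable from x, yet their linear correlation is zero, which is why it helps to inspect the scatter plot before relying on a correlation coefficient.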
Examples of Scatter Plots
A scatterplot showing the relationship between weight
(in lb) and height (in inches) in children.
This demonstrates a positive linear relationship.
References and Further Reading
Books:
• Dancey, C., & Reidy, J. (2017). Statistics without Maths
for Psychology, 7th Edition. New York: Pearson.
• Howitt, D., & Cramer, D. (2017). Statistics in psychology
using SPSS. New York: Pearson.
Articles:
• Bickel, P. J., & Lehmann, E. L. (1975). Descriptive Statistics
for Nonparametric Models I. Introduction. The Annals of
Statistics, 3(5), 1038-1044. doi:10.1214/aos/1176343239 |
https://link.springer.com/content/pdf/10.1007/978-1-4614-1412-4_42.pdf

Introduction to Descriptive Statistics

  • 1. Introduction to Descriptive Statistics I Sanju Rusara Seneviratne MBPsS
  • 2. Overview of Intro to Descriptive Statistics I This lecture will cover the following topics: ļµ Definition and Types of Descriptive Statistics ļµ Mean, Median, Mode and Range ļµ Skewness and Kurtosis ļµ Normality Curve ļµ Variance and Standard Deviation ļµ Quartiles ļµ Percentiles ļµ Using Excel for Descriptive Statistics
  • 3. Defining Descriptive Statistics The analysis of data that helps describe, show or summarize data in a meaningful way such that, for example, patterns might emerge from the data. They do not, however, allow us to make conclusions beyond the data we have analyzed or reach conclusions regarding any hypotheses we might have made. Descriptive vs. Inferential: Descriptive statistics are used to describe our samples and inferential statistics are used to generalize from our samples to the wider population.
  • 4. Types of Descriptive Statistic 1. Measures of central tendency: These are ways of describing the central position of a frequency distribution for a group of data. ļµ We can describe this central position using a number of statistics, including the mode, median, and mean. 2. Measures of spread: These are ways of summarizing a group of data by describing how spread out the scores are. ļµ Measures of spread help us to summarize how spread out data are. To describe this spread, a number of statistics are available us, including the range, quartiles, absolute deviation, variance and standard deviation.
  • 5. Summarizing Descriptive Statistics When we use descriptive statistics it is useful to summarize our group of data using a combination of: ā€¢ tabulated description (i.e., tables) ā€¢ graphical description (i.e., graphs and charts) ā€¢ statistical commentary (i.e., a discussion of the results)
  • 6. Mean, Median, Mode and Range ā€¢ Mean - The mean is the average of all numbers and is sometimes called the arithmetic mean. To calculate mean, add all of the in a set and then divide the sum by the total count of numbers. ā€¢ Median - The statistical median is the middle number in a sequence of numbers. To find the median, organize each number in order by size; the number in the middle is the median. ā€¢ Mode - The mode is the number that occurs most often within a set of numbers. ā€¢ Range - The range is the difference between the highest and lowest values within a set of numbers. To calculate range, subtract the smallest number from the largest number in the set.
  • 7. Skewness and Kurtosis ā€¢ Skewness - a measure of symmetry, or more precisely, the lack of symmetry. A distribution, or data set, is symmetric if it looks the same to the left and right of the center point. ā€¢ Kurtosis - a measure of whether the data are heavy- tailed or light-tailed relative to a normal distribution. That is, data sets with high kurtosis tend to have heavy tails, or outliers. Data sets with low kurtosis tend to have light or lack of outliers. A uniform distribution would be the extreme case. ā€¢ The histogram is an effective graphical technique for showing both the skewness and kurtosis of data set.
  • 8. Normality Curve ā€¢ The normal distribution is the most important and most widely used distribution in statistics. It is sometimes called the "bell curveā€ and the "Gaussian curveā€.
  • 9. Seven Features of Normal Distributions 1. Normal distributions are symmetric around their mean. 2. The mean, median, and mode of a normal distribution are equal. 3. The area under the normal curve is equal to 1.0. 4. Normal distributions are denser in the center and less dense in the tails. 5. Normal distributions are defined by two parameters, the mean (Ī¼) and the standard deviation (Ļƒ). 6. 68% of the area of a normal distribution is within one standard deviation of the mean. 7. Approximately 95% of the area of a normal distribution is within two standard deviations of the mean.
  • 10. Variance and Standard Deviation ā€¢ Variance: measures how far a data set is spread out. The technical definition is ā€œThe average of the squared differences from the mean,ā€ but all it really does is to give you a very general idea of the spread of your data. ļµ A value of zero means that there is no variability; All the numbers in the data set are the same. ā€¢ Standard Deviation: the square root of the variance. While variance gives you a rough idea of spread, the standard deviation is more concrete, giving you exact distances from the mean.
  • 11. Quartiles ā€¢ Quartiles in statistics are values that divide your data into quarters. They divide your data into four segments according to where the numbers fall on the number line. ā€¢ The four quarters that divide a data set into quartiles are: ļµ The lowest 25% of numbers. ļµ The next lowest 25% of numbers (up to the median). ļµ The second highest 25% of numbers (above the median). ļµ The highest 25% of numbers.
  • 12. Percentiles ā€¢ The most common definition of a percentile is a number where a certain percentage of scores fall below that number. ļµ The 25th percentile is also called the first quartile. ļµ The 50th percentile is generally the median (if youā€™re using the third definitionā€” see below). ļµ The 75th percentile is also called the third quartile. ļµ The difference between the third and first quartiles is the interquartile range. ā€¢ Percentile Rank: ļµ The nth percentile is the lowest score that is greater than a certain percentage (ā€œnā€) of the scores. ļµ The nth percentile is the smallest score that is greater than or equal to a certain percentage of the scores. To rephrase this, itā€™s the percentage of data that falls at or below a certain observation. ā€¢ A percentile range is the difference between two specified percentiles.
  • 13. Conducting Descriptive Analysis in Excel ā€¢ Step 1: Type your data into Excel, in a single column. For example, if you have ten items in your data set, type them into cells A1 through A10. ā€¢ Step 2: Click the ā€œDataā€ tab and then click ā€œData Analysisā€ in the Analysis group. ā€¢ Step 3: Highlight ā€œDescriptive Statisticsā€ in the pop-up Data Analysis window. ā€¢ Step 4: Type an input range into the ā€œInput Rangeā€ text box. For this example, type ā€œA1:A10ā€ into the box.
  • 14. Conducting Descriptive Analysis in Excel ā€¢ Step 5: Check the ā€œLabels in first rowā€ check box if you have titled the column in row 1, otherwise leave the box unchecked. ā€¢ Step 6: Type a cell location into the ā€œOutput Rangeā€ box. For example, type ā€œC1.ā€ Make sure that two adjacent columns do not have data in them. ā€¢ Step 7: Click the ā€œSummary Statisticsā€ check box and then click ā€œOKā€ to display Excel descriptive statistics. A of descriptive statistics will be returned in the column you selected as the Output Range.
  • 15. Introduction to Descriptive Statistics II Sanju Rusara Seneviratne MBPsS
  • 16. Overview of Intro to Descriptive Statistics II This lecture will cover the following topics: ◦ Bar Charts ◦ Pie Charts ◦ Histograms ◦ Box-Plots ◦ Scatter Plots
  • 17. Bar Charts • A bar graph (also known as a bar chart or bar diagram) is a visual tool that uses bars to compare data among categories. A bar graph may run horizontally or vertically. The important thing to know is that the longer the bar, the greater its value. • Bar graphs consist of two axes. ◦ On a vertical bar graph, the horizontal axis (or x-axis) shows the data categories. ◦ The vertical axis (or y-axis) is the scale.
  • 18. Bar Charts • Bar graphs have three key attributes: 1. A bar diagram makes it easy to compare sets of data between different groups at a glance. 2. The graph represents categories on one axis and a discrete value on the other. The goal is to show the relationship between the two axes. 3. Bar charts can also show big changes in data over time.
  • 19. Examples of Bar Charts
  • 20. Examples of Bar Charts
  • 21. Pie Charts • A pie chart is a circular graph that shows the relative contribution that different categories make to an overall total. • A wedge of the circle represents each category's contribution, such that the graph resembles a pie that has been cut into different sized slices. • Every 1% that a category contributes to the total corresponds to a slice with an angle of 3.6 degrees.
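The 3.6 degrees per 1% rule follows from dividing the 360 degrees of a circle by 100. A short sketch, using invented survey counts, shows how each category's share translates into a slice angle:

```python
# Hypothetical survey counts (type of pet owned), invented for illustration.
counts = {"Dog": 12, "Cat": 9, "Fish": 6, "Other": 3}
total = sum(counts.values())

# Each slice's angle is its share of the full 360-degree circle,
# which is the same as 3.6 degrees for every 1% of the total.
angles = {category: 360 * count / total for category, count in counts.items()}

for category, angle in angles.items():
    percent = 100 * counts[category] / total
    print(f"{category}: {percent:.0f}% of total, slice angle {angle:.0f} degrees")
```

The slice angles always add up to 360 degrees, just as the percentages add up to 100.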
  • 22. Pie Charts • Pie charts are a visual way of displaying data that might otherwise be given in a small table. • Pie charts are useful for displaying data that are classified into nominal or ordinal categories. ◦ Nominal data are categorised according to descriptive or qualitative information such as county of birth or type of pet owned. ◦ Ordinal data are similar but the different categories can also be ranked, for example in a survey people may be asked to say whether they classed something as very poor, poor, fair, good, very good.
  • 23. Pie Charts • Pie charts are generally used to show percentage or proportional data and usually the percentage represented by each category is provided next to the corresponding slice of pie. • Pie charts are good for displaying data for around 6 categories or fewer. When there are more categories it is difficult for the eye to distinguish between the relative sizes of the different sectors and so the chart becomes difficult to interpret.
  • 24. Examples of Pie Charts
  • 25. Examples of Pie Charts
  • 26. Histograms • A histogram is a plot that lets you discover, and show, the underlying frequency distribution (shape) of a set of continuous data. This allows the inspection of the data for its underlying distribution (e.g., normal distribution), outliers, skewness, etc. • It is the area of the bar that indicates the frequency of occurrences for each bin. This means that the height of the bar does not necessarily indicate how many occurrences of scores there were within each individual bin. It is the product of height multiplied by the width of the bin that indicates the frequency of occurrences within that bin.
  • 27. Histograms • One reason the height of the bars, rather than their area, is often mistaken for the frequency is that many histograms have equally spaced bars (bins), and under these circumstances the height of the bar does reflect the frequency. • The major difference between a histogram and a bar chart is that a histogram is only used to plot the frequency of score occurrences in a continuous data set that has been divided into classes, called bins. Bar charts, on the other hand, can be used for a great deal of other types of variables including ordinal and nominal data sets.
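The area rule for histograms can be checked with a small sketch. The bins and frequencies below are invented for illustration and deliberately have unequal widths, so that bar height (frequency divided by bin width, sometimes called frequency density) and frequency are visibly different things.

```python
# Hypothetical age bins as (lower edge, upper edge, frequency).
bins = [(0, 10, 20), (10, 20, 30), (20, 40, 30), (40, 80, 20)]

heights = []
for lower, upper, frequency in bins:
    width = upper - lower
    height = frequency / width  # bar height when area represents frequency
    heights.append(height)
    # Multiplying height by width (the bar's area) recovers the frequency.
    print(f"{lower}-{upper}: width {width}, height {height}, area {height * width}")
```

The second and third bins hold the same number of scores, yet the third bar is half as tall because it is twice as wide. With equal-width bins this distinction disappears, which is why height is so often read as frequency.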
  • 28. Histograms A histogram showing frequencies of different age groups in a sample. Thinking Point: What can you infer about the normal distribution of this data from this chart?
  • 29. Box-Plots • A boxplot is a standardized way of displaying the distribution of data based on a five number summary ("minimum", first quartile (Q1), median, third quartile (Q3), and "maximum"). • It can tell you about your outliers and what their values are. • It can also tell you if your data is symmetrical, how tightly your data is grouped, and if and how your data is skewed.
  • 30. Example of a Box-Plot See next slide for description of this box-plot.
  • 31. Elements of a Box-Plot • A boxplot is a graph that gives you a good indication of how the values in the data are spread out. ◦ median (Q2/50th Percentile): the middle value of the dataset. ◦ first quartile (Q1/25th Percentile): the middle number between the smallest number (not the "minimum") and the median of the dataset. ◦ third quartile (Q3/75th Percentile): the middle value between the median and the highest value (not the "maximum") of the dataset. ◦ interquartile range (IQR): 25th to the 75th percentile. ◦ whiskers (shown in blue) ◦ outliers (shown as green circles) ◦ "maximum": Q3 + 1.5*IQR ◦ "minimum": Q1 - 1.5*IQR
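The box-plot elements listed above can be computed directly. The following is a sketch under stated assumptions: the data are invented (with one deliberately extreme value), and the quartiles use the `inclusive` method of Python's `statistics.quantiles`, which is only one of several quartile conventions.

```python
import statistics

# Hypothetical scores; 40 is planted as an obvious outlier.
data = [2, 4, 4, 5, 7, 8, 9, 11, 12, 40]

q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

# The quoted "minimum" and "maximum" of the box plot, as defined on the slide.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Points beyond either fence are drawn individually as outliers.
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print("Q1, median, Q3:", q1, median, q3)
print("IQR:", iqr)
print("fences:", lower_fence, upper_fence)
print("outliers:", outliers)
```

Note that the "minimum" here is negative even though no score is, which illustrates why the slide puts "minimum" and "maximum" in quotation marks: they are calculated limits, not the smallest and largest observations.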
  • 32. Scatter Plots • A scatter plot is a two-dimensional data visualization that uses dots to represent the values obtained for two different variables - one plotted along the x-axis and the other plotted along the y-axis. • Scatter plots are used when you want to show the relationship between two variables. Scatter plots are sometimes called correlation plots because they show how two variables are correlated. • However, not all relationships are linear.
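To illustrate why the caveat about non-linear relationships matters, the sketch below computes Pearson's correlation coefficient by hand for two invented data sets. The helper name `pearson_r` and the data are hypothetical. Pearson's r measures linear association only, which is an assumption worth stating, since the slide does not name a specific correlation measure.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squared deviations from the means.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / math.sqrt(ss_x * ss_y)

xs = [-2, -1, 0, 1, 2]
linear = [2 * x + 1 for x in xs]  # points fall exactly on a straight line
curved = [x ** 2 for x in xs]     # perfectly related to x, but along a curve

r_linear = pearson_r(xs, linear)
r_curved = pearson_r(xs, curved)

print("linear relationship r:", r_linear)
print("curved relationship r:", r_curved)
```

The curved data follow an exact rule, yet their linear correlation is essentially zero. A scatter plot would reveal the U-shaped pattern immediately, which is one reason to plot the data before relying on a correlation coefficient.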
  • 33. Examples of Scatter Plots A scatterplot showing the relationship between weight (in lb) and height (in inches) in children. This demonstrates a positive linear relationship.