This presentation covers statistics: its importance, applications and branches, the basic concepts used in statistics, data sampling, types of sampling, types of data and collection of data.
2. Learning Objectives
Definition of Statistics
Importance of Statistics
Applications of Statistics
Branches of Statistics
Population and Sample
Data Sampling
Types of Sampling
Types of Data
Scales of Measurement
Collection of Data
3. Definition of Statistics
Statistics is a branch of science which deals with collection,
presentation, analysis and interpretation of data.
It provides methods for analyzing and assessing the
significance of data.
Statistics enables the transformation of data into information
that can then serve as the basis for decision-making.
4. Importance of Statistics
Presents facts and figures in a definite form.
Helps to condense the data.
Gives an idea of the shape, spread and symmetry of the data.
Facilitates comparison.
Measures the relationship between two or more variables.
Helps in estimation and prediction.
Helps in formulating and testing the hypothesis or a new
theory.
Helps in planning, controlling and decision making.
5. Applications of Statistics
Statistical methods are used in almost all fields at several
phases. Some of the fields are listed below.
• Business and Industry
• Agriculture
• Commerce
• Demography
• Economics
• Education
• Social Sciences
• Biological Sciences
• Medical Sciences
6. Branches of Statistics
There are two main branches of statistics:
1. Descriptive Statistics:
• Organizes, describes and summarizes the characteristics of
data.
• It includes construction of graphs, charts and tables, and the
calculation of various numeric measures such as mean,
median, standard deviation, percentiles, etc.
• It does not involve generalizing beyond the data at hand.
Examples: a batsman wants to find his batting average for the
past 12 months; a politician wants to know the average
number of votes he received in the past 3 years; the average daily
temperature of Pune city.
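The numeric measures listed above can be computed with Python's standard `statistics` module. This is a minimal sketch; the `runs` list is made-up illustrative data, not taken from the presentation, and the mean here is simply runs per innings.

```python
import statistics

# Hypothetical runs scored by a batsman in 8 innings (illustrative data only).
runs = [45, 12, 78, 33, 0, 56, 91, 21]

mean_runs = statistics.mean(runs)      # arithmetic mean
median_runs = statistics.median(runs)  # middle value of the sorted data
stdev_runs = statistics.stdev(runs)    # sample standard deviation

print(f"mean={mean_runs}, median={median_runs}, stdev={stdev_runs:.2f}")
```

Every value printed describes only these 8 innings; nothing is generalized beyond the data at hand.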
7. Branches of Statistics
2. Inferential Statistics
• Is concerned with drawing conclusions or making predictions about a
population from the analysis of a random sample drawn from
that population.
• It includes methods such as:
• Point estimation
• Interval estimation
• Hypothesis testing
Examples: a politician wants to estimate, from pre-election
opinion polls, his chance of winning the upcoming election; a
researcher wants to
determine whether treatment A is better than treatment B.
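As a rough illustration of point and interval estimation, the sketch below estimates a candidate's vote share from a poll. The poll numbers are hypothetical, the poll is assumed to be a simple random sample, and the interval uses the normal approximation for a proportion, which is only one of several possible methods.

```python
import math
from statistics import NormalDist

# Hypothetical opinion poll: 540 of 1000 respondents favour the candidate.
n = 1000
in_favour = 540

# Point estimate of the population proportion.
p_hat = in_favour / n

# 95% interval estimate using the normal approximation (an assumption).
z = NormalDist().inv_cdf(0.975)              # about 1.96
std_error = math.sqrt(p_hat * (1 - p_hat) / n)
low, high = p_hat - z * std_error, p_hat + z * std_error

print(f"point estimate: {p_hat:.3f}")
print(f"95% interval:   ({low:.3f}, {high:.3f})")
```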
8. Population and Sample
Population: An aggregate of objects or individuals under study.
Sample: Any part of the population under study.
Example: We want to study the industrial development of XYZ city.
There are a total of 500 industries in this city. All these 500 industries
constitute the population. If we randomly choose 100 industries from the
total of 500 industries, these 100 industries will constitute a sample.
Parameters: Numerical values that summarize
characteristics of a population under investigation. Parameter
values are typically unknown.
Statistics: Numerical values that summarize characteristics
of a sample, which can then be used to estimate parameters.
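The difference between a parameter and a statistic can be sketched with the 500-industry example. The employee counts below are randomly generated placeholder data; in a real study the population values, and hence the parameter, would usually not be available.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical population: employee counts of 500 industries.
population = [random.randint(10, 1000) for _ in range(500)]

# Parameter: computed from the whole population (usually unknown in practice).
population_mean = statistics.mean(population)

# Sample: 100 industries chosen at random without replacement.
sample = random.sample(population, 100)

# Statistic: computed from the sample and used to estimate the parameter.
sample_mean = statistics.mean(sample)

print(f"parameter (population mean): {population_mean:.1f}")
print(f"statistic (sample mean):     {sample_mean:.1f}")
```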
10. Data Sampling
What is Data Sampling?
Sampling is a statistical technique for obtaining a sample of data that
is representative of the population, so that inferences based on
the sample hold true for the population as well.
Why do we sample?
When it is not possible to measure every item in the population,
or the population is infinite.
When the results are needed urgently.
When the area of study is wide.
When the element is destroyed during investigation.
11. Benefits of Sampling
Sampling reduces processing time. Results can be obtained
quickly due to time saved in data collection and further analysis.
Sampling reduces the expenses incurred in collecting and
analyzing data, so it is economical.
Because the volume of work is reduced, data collection and analysis
can be carried out carefully by well-trained staff using
sophisticated equipment, which can improve the accuracy of results.
12. Key concepts used in Sampling
Sampling Units: Members or elements of the population.
Sample Size: The number of units in a sample.
Sampling Frame: A list of all members or elements of the
population.
13. Types of Sampling
The following sampling methods are commonly used:
Simple Random Sampling
Stratified Random Sampling
Systematic Sampling
Cluster Sampling
14. Simple Random Sampling
Each element of a population has an equal chance of being
selected in the sample.
Simple random samples are obtained either by sampling with
replacement or by sampling without replacement.
Sampling with replacement: a population element can be
selected more than once.
Sampling without replacement: a population element can be
selected only once.
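The two variants map onto two functions in Python's standard `random` module. This is a small sketch on a made-up population of ten labelled units.

```python
import random

random.seed(1)  # fixed seed so the sketch is reproducible

population = list(range(1, 11))  # units labelled 1..10 (illustrative)

# With replacement: the same unit may be drawn more than once.
with_replacement = random.choices(population, k=5)

# Without replacement: every drawn unit is distinct.
without_replacement = random.sample(population, k=5)

print("with replacement:   ", with_replacement)
print("without replacement:", without_replacement)
```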
15. Simple Random Sampling
Generally, simple random sampling is conducted without
replacement because it is more convenient and gives more
precise results.
Simple random sampling is effective if the population is
homogeneous, i.e. the population has no differentiated sections or
classes.
Example: In order to conduct a socio-economic survey of a
particular village, we can randomly select a sample of families
and estimate the per capita income of the village.
17. Stratified Random Sampling
In this method, the entire population is divided into different
non-overlapping homogeneous groups called strata, and then a
simple random sample of a suitable size is selected from each
stratum to form the sample.
The strata are formed according to some criterion such as
geographic location, age, gender, religion or income.
Example: To estimate annual income per family, we divide the
population into homogeneous groups such as families with
yearly income below Rs. 50,000; between Rs. 50,000 and Rs. 1
lakh; between Rs. 1 lakh and Rs. 1.5 lakh; and above Rs. 1.5 lakh.
Then we use stratified random sampling, taking the above groups
as strata.
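A minimal sketch of the income example follows. The stratum sizes and family IDs are invented, and the helper `stratified_sample` is hypothetical. It assumes proportional allocation (the same fraction from every stratum), which is one common way to choose a "suitable size" but not the only one.

```python
import random

random.seed(2)  # fixed seed so the sketch is reproducible

# Hypothetical strata: family IDs grouped by yearly income band.
strata = {
    "below Rs. 50,000": list(range(0, 400)),
    "Rs. 50,000 to 1 lakh": list(range(400, 700)),
    "Rs. 1 lakh to 1.5 lakh": list(range(700, 900)),
    "above Rs. 1.5 lakh": list(range(900, 1000)),
}

def stratified_sample(strata, fraction):
    """Draw a simple random sample of the given fraction from every stratum."""
    sample = {}
    for name, units in strata.items():
        size = max(1, round(len(units) * fraction))  # at least one unit per stratum
        sample[name] = random.sample(units, size)
    return sample

sample = stratified_sample(strata, fraction=0.10)
sizes = {name: len(units) for name, units in sample.items()}
print(sizes)
```

Because every stratum contributes to the sample, no income band can be missed by chance, which can happen with a plain simple random sample.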
19. Systematic Sampling
This method involves the selection of elements from an
ordered sampling frame.
To draw a systematic sample of size n:
Number the sampling units from 1 to N, where N is the
population size.
Calculate the sampling interval k = N/n, where N is the population
size and n is the sample size.
Select a random number j from 1 to k (the sampling interval) and
thereafter select every kth element: j+k, j+2k, etc.
Thus the systematic sample of size n will include the jth, (j+k)th, (j+2k)th,
..., (j+(n-1)k)th observations.
Only the first unit is selected at random; it determines the entire
sample.
20. Systematic Sampling
Suppose a committee of n=6 students is to be selected from a
class of N=60 students.
To draw a systematic sample of size n = 6:
Number the students from 1 to 60 using their roll numbers.
Calculate the sampling interval k = 60/6 = 10.
Select a random number from 1 to 10 (the sampling interval); suppose it is 5.
Since the 5th student is selected first, the systematic sample will include students
with roll numbers 5, 15, 25, 35, 45, 55.
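The procedure can be sketched in a few lines. The function name `systematic_sample` and its `start` parameter are illustrative choices, and the sketch assumes N is an exact multiple of n, as in the committee example.

```python
import random

def systematic_sample(N, n, start=None):
    """Return the 1-based unit numbers of a systematic sample of size n.

    Assumes N is an exact multiple of n.
    """
    k = N // n                        # sampling interval
    if start is None:
        start = random.randint(1, k)  # random start j with 1 <= j <= k
    return [start + i * k for i in range(n)]

# Committee example: N = 60 students, n = 6, random start j = 5.
print(systematic_sample(60, 6, start=5))  # [5, 15, 25, 35, 45, 55]
```

Calling `systematic_sample(60, 6)` without `start` picks the first unit at random, after which the remaining five units are fully determined.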
22. Cluster Sampling
This method is used when the population is large and consists of
several groups. These groups are called clusters.
In this method, the cluster is the sampling unit. We
select a simple random sample of clusters. All observations in
the selected clusters are included in the sample.
For a given sample size, smaller clusters generally give more precise results.
Example: In a health survey of a state, the state can be divided into
villages (clusters). A simple random sample of villages is
selected first, and then information about each individual in the
selected villages is collected.
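The village example can be sketched as follows. The number of villages, the residents and the helper `cluster_sample` are all hypothetical. The key point is that randomness enters only in the choice of clusters; every unit in a chosen cluster is kept.

```python
import random

random.seed(3)  # fixed seed so the sketch is reproducible

# Hypothetical state: 20 villages (clusters), each with 50 residents.
villages = {
    f"village_{v}": [f"v{v}_person_{p}" for p in range(50)]
    for v in range(20)
}

def cluster_sample(clusters, n_clusters):
    """Pick clusters by simple random sampling and keep every unit in them."""
    chosen = random.sample(sorted(clusters), n_clusters)
    units = [unit for name in chosen for unit in clusters[name]]
    return chosen, units

chosen_villages, surveyed = cluster_sample(villages, n_clusters=4)
print(chosen_villages)
print(len(surveyed), "individuals surveyed")
```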
25. Types of Data
Data: Facts or observations collected together for
reference or analysis and used as a basis for decision
making.
Variable:
Any characteristic whose value can change.
Examples: height, weight, sex, marital status, eye color.
Variables can be classified as Qualitative or Quantitative.
Constant:
A characteristic which does not change its value or nature.
Example: height of a person after 25 years of age.
26. Types of Data
Qualitative Data:
It is non-numerical data that can be arranged into categories.
This data is also called categorical data.
Examples: gender of an individual, nationality of a player, grade in
examination.
Quantitative Data:
It is numerical data that consists of counts or measurements.
Examples: weight of person, examination marks, profit of a
salesman.
27. Quantitative Data
Quantitative data can further be classified as discrete and
continuous data.
Discrete Data: takes on only a finite or countable number of
values. These are usually whole numbers.
Examples: population of a country, number of cases of a certain
disease, number of students in a class.
Continuous Data: can take any value in a certain range and
thus has an infinite number of possible values. This data does not contain
any gaps, breaks or jumps.
Examples: height of a person, temperature at a certain place,
agricultural production.
28. Scales of Measurement
Variables can also be classified based on their scales of
measurement.
S. S. Stevens introduced four types of scales of measurement:
nominal, ordinal, interval and ratio scales.
There are two scales of measurement for categorical
variables: nominal and ordinal.
There are two scales of measurement for quantitative
variables: interval and ratio.
29. Scales of Measurement
Nominal Scale:
Consists of two or more named categories into which objects are
classified.
Data at this level can't be ordered in a meaningful way.
Examples: Classification of individuals by blood group;
classification of individuals by sex, caste or nationality.
Ordinal Scale:
Similar to the nominal scale; however, data at this level can be
ordered in a meaningful way, but differences between data values
either cannot be determined or are meaningless.
Examples: Groups of individuals according to income, such as
poor, middle class, rich; groups of students according to grades
in examination, such as fail, second class, first class, first class
with distinction.
30. Scales of Measurement
Interval Scale:
Data from an interval scale can be rank-ordered and has a
sensible spacing of observations such that differences between
measurements are meaningful.
Ratios between values on an interval scale are not meaningful
because there is no true zero point.
Example: Temperature on the Fahrenheit and Celsius scales.
Ratio Scale:
Data on a ratio scale has all the features of an interval scale and,
in addition, a true zero point, so ratios between measurements
are meaningful.
It is the most informative scale of measurement and is widely used.
Examples: monthly income, height in cm, weight in kg.
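A quick way to see why the true zero point matters is to change units and check whether a ratio survives. The sketch below compares temperature (interval scale) with weight (ratio scale); the values are arbitrary and the kg-to-pound factor is approximate.

```python
def celsius_to_fahrenheit(c):
    return c * 9 / 5 + 32

def kg_to_pounds(kg):
    return kg * 2.20462  # approximate conversion factor

# Interval scale: the ratio depends on the unit, so it carries no meaning.
ratio_c = 20 / 10                                                # 2.0
ratio_f = celsius_to_fahrenheit(20) / celsius_to_fahrenheit(10)  # 68 / 50 = 1.36

# Ratio scale: the ratio is the same whichever unit is used.
ratio_kg = 80 / 40
ratio_lb = kg_to_pounds(80) / kg_to_pounds(40)

print(f"temperature ratio: {ratio_c:.2f} in Celsius, {ratio_f:.2f} in Fahrenheit")
print(f"weight ratio:      {ratio_kg:.2f} in kg, {ratio_lb:.2f} in pounds")
```

So 20 °C is not "twice as hot" as 10 °C, whereas 80 kg is twice as heavy as 40 kg in any unit. Differences, by contrast, remain meaningful on both scales.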
31. Collection of Data
There are two types of data according to the method of collection:
Primary Data:
This is original data collected by the investigator himself/herself for
a specific purpose.
This type of data is generally fresh and collected for the first
time.
This data can be collected by the following methods:
• Observation Method
• Interview Method
• Survey method
• Questionnaire method
Examples: Population census, Data collected by a researcher for
his/her project.
32. Collection of Data
Secondary Data:
Data collected by someone else prior to and for a purpose other
than the current project.
This is processed or finished data.
Secondary data is data that is being reused.
It involves less cost, time and effort.
Examples: Data taken from sources such as office records; reports
already compiled by some other agency; data available
on the Internet; data from books; data from magazines.
Note: Data which are primary for one investigator may be secondary for
another.