Chapter 13 Data Analysis Inferential Methods and Analysis of Time Series
1. Chapter 13: Data Analysis: Inferential Methods and Analysis of Time Series
2. What are Inferential Statistics?
• Inferential statistics provide ways of testing the reliability of the findings of a study and ‘inferring’
characteristics from a small group of participants or people (your sample) onto much larger groups of
people (the population).
• The focus of inferential statistics is on how to generalise the statistics obtained from a sample as accurately
as possible to represent the population.
• To use inferential statistical methods to analyse your survey data, you should ensure that the
following three conditions are met:
– You should have a complete list of the members of the population.
– You should draw a random sample from this population.
– You should use a pre-established formula to determine that your sample size is large enough to
represent the population.
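The three conditions above can be sketched in a few lines of Python. This is only an illustration: the slides do not name the 'pre-established formula', so Cochran's formula with a finite population correction is assumed here, and the population list, the 5% margin of error and the 95% confidence level (z = 1.96) are hypothetical choices.

```python
import math
import random


def cochran_sample_size(population_size, margin_of_error=0.05, z=1.96, p=0.5):
    """Sample size from Cochran's formula with finite population correction.

    Assumption: Cochran's formula stands in for the unnamed
    'pre-established formula'; p=0.5 is the most conservative proportion.
    """
    n0 = (z ** 2) * p * (1 - p) / margin_of_error ** 2
    n = n0 / (1 + (n0 - 1) / population_size)
    return math.ceil(n)


# Condition 1: a complete list of population members (hypothetical frame).
population = [f"member_{i}" for i in range(1, 1001)]

# Condition 3: decide how large the sample must be.
required_n = cochran_sample_size(len(population))

# Condition 2: draw a simple random sample without replacement.
rng = random.Random(42)  # fixed seed so the draw is reproducible
sample = rng.sample(population, required_n)

print(required_n, sample[:3])
```

For a population of 1,000 this gives a required sample of 278 members under the stated assumptions.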
3. What are Inferential Statistics? (Contd.)
• Univariate statistics—one-sample hypothesis test: You use this method when your aim is to (a) compare
responses of respondents of your study/programme on a pre- and post-test and (b) determine if the
implemented programme had an impact on one particular outcome.
• Univariate statistics—confidence interval: This is used to determine a value/score in a population based on
the score of the participants in your sample.
• Bivariate statistics—contingency tables and chi-square statistics: You use this method to (a) analyse two
categorical variables and (b) find out whether they are related and how strong the relationship is.
• Bivariate statistics—t-test or ANOVA: You can use this method when you (a) have a categorical variable and a
continuous variable and (b) want to compare the mean scores of two or more groups.
• Bivariate statistics—Pearson correlation: You can use Pearson correlation method when you have a
continuous independent variable and a continuous dependent variable.
4. What are Inferential Statistics? (Contd.)
• Bivariate statistics—regression analysis: Regression analysis is used when you have a continuous
independent variable and a continuous dependent (outcome) variable.
• Multivariate statistics—elaborated chi-square statistics: This method is used when you have more than
one independent categorical variable and one dependent categorical variable.
• Multivariate statistics—multivariate regression: You can use multivariate regression when you have more
than one independent (causal) variable and one dependent (effect/outcome) variable.
• Limitations of inferential statistics: The most important limitation, which is inherent in all inferential
statistics, is that you are drawing conclusions about a population that you have not fully measured, using
data from only a part of it. Therefore, you can never be completely sure that the values/statistics you
have calculated are correct.
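As an illustration of the contingency-table method listed above, the chi-square statistic can be computed by hand as follows. This is a minimal sketch: the counts are made up, and Cramér's V is assumed here as one common way to express the strength of the relationship (the slides do not name a specific measure).

```python
import math


def chi_square_statistic(table):
    """Return (chi-square, degrees of freedom) for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were unrelated.
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, dof


# Hypothetical counts: rows = group (A, B), columns = response (yes, no).
observed = [[30, 20],
            [20, 30]]

chi2, dof = chi_square_statistic(observed)

# Cramér's V (assumed strength measure): 0 = no association, 1 = perfect.
n = sum(sum(row) for row in observed)
k = min(len(observed), len(observed[0])) - 1
cramers_v = math.sqrt(chi2 / (n * k))

print(chi2, dof, cramers_v)
```

With one degree of freedom, a chi-square value above 3.841 is significant at the 5% level, so these hypothetical counts (chi-square of 4.0) would suggest a weak but statistically significant relationship.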
5. Data Analysis—Inferential Statistics
The most commonly used statistical methods to analyse bivariate and multivariate data (inferential statistics)
include: correlation, linear regression and ANOVA.
• Correlation: Correlation measures the relationship between two variables. It is the most commonly used
statistical technique to identify and determine the relationship between two continuous variables.
• Broadly speaking, correlations can be classified into seven types: positive, negative, strong, weak, zero,
perfectly positive correlation and perfectly negative correlation.
• Correlation and causation: Even a high degree of correlation between two variables does not necessarily
imply causation or a functional relationship between the variables, although causation generally
implies correlation. A high degree of correlation between the variables may be due to mutual
dependence, the influence of a third variable or pure chance.
6. Data Analysis—Inferential Statistics (Contd.)
• Correlations are used for prediction, validity, reliability and verification:
Prediction: An important use of correlation is prediction. If two variables are known to have correlated in
the past, then we can assume they will continue to correlate in the future and use one to predict the other.
Validity: A new test, for example a new test of intelligence, is validated by correlating its scores with
those of an established measure.
Reliability: We can use correlations to determine the reliability of some measurement process. If the
correlation is high, the test is reliable. If it is low, it is not.
Theory verification: There are several psychological theories which make specific predictions about the
relationship between two variables.
7. Data Analysis—Inferential Statistics (Contd.)
• For correlations, the effect size is called the ‘coefficient of determination’ and is defined as r². The value of
the coefficient of determination can be anywhere from 0 to 1.00. It shows the proportion of variation in the
scores of one variable that can be predicted from its relationship with the other variable.
• When there exists some relationship between two variables, we have to measure the degree of
relationship. This measure is called the measure of correlation or correlation coefficient, and it is denoted
by r.
• Karl Pearson’s coefficient of correlation: This is the most widely used method for measuring the magnitude
of the linear relationship between two variables. It is also known as the Pearsonian coefficient of correlation.
• Spearman’s rank correlation coefficient: We use Spearman’s rank correlation when we have two ranked
variables and want to see whether the two variables co-vary, that is, whether, as one variable increases, the
other variable tends to increase or decrease.
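The correlation coefficient r, the coefficient of determination r² and Spearman's rank correlation can be computed as in the sketch below. The data are hypothetical, and computing Spearman's coefficient as Pearson's r on average ranks is one standard approach, assumed here because it also handles tied values.

```python
import math


def pearson_r(x, y):
    """Pearson's coefficient of correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """Rank values from 1 (smallest); tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson's r applied to the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical data: hours of study and test scores for five students.
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]

r = pearson_r(hours, scores)
r_squared = r ** 2  # coefficient of determination
rho = spearman_rho(hours, scores)

print(round(r, 4), round(r_squared, 4), round(rho, 4))
```

Here r² is 0.6, which would be read as 60% of the variation in the scores being predictable from hours of study in this made-up sample.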
8. Regression Analysis
• Regression analysis is a statistical tool widely used for exploring relationships between variables.
• Regression analysis with a single explanatory variable is termed simple regression.
Simple linear regression: In simple linear regression, a single independent variable is used to predict the
value of a dependent variable.
Multiple linear regression: Multiple regression is a highly advanced statistical tool and it is very powerful
when we are trying to develop a ‘model’ for predicting a wide variety of outcomes. It allows us to examine
how multiple independent variables are related to a dependent variable.
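A minimal sketch of simple linear regression, assuming an ordinary least-squares fit and using hypothetical data, might look like this. Multiple regression follows the same idea with several independent variables and is usually done with a statistics package, so it is not shown.

```python
def fit_simple_linear_regression(x, y):
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    slope = (
        sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
        / sum((a - mean_x) ** 2 for a in x)
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: x is the independent variable, y the dependent variable.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

slope, intercept = fit_simple_linear_regression(x, y)

# Use the fitted line to predict the dependent variable for a new x value.
predicted = intercept + slope * 6

print(slope, intercept, predicted)
```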
9. Analysis of Variance
• ANOVA is a relatively sophisticated hypothesis-testing technique widely used in research studies. It is used to
evaluate mean differences between two or more populations.
• Like all other inferential statistical methods, ANOVA uses sample data as the basis for drawing overall
conclusions about populations. The key merit of ANOVA is that we can use this method to compare two or more
populations.
• In other words, ANOVA provides researchers with much greater flexibility in designing experiments and
interpreting results.
• ANOVA is used to compare several means. It is important to keep in mind that a t-test is used to test differences
between two means, that is, the mean of the experiment group versus control group. An ANOVA test, on the
other hand, is indicated when there are three or more means or populations to be examined.
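To show how ANOVA compares several means, here is a sketch of the one-way ANOVA F statistic computed by hand. The three groups are hypothetical, and a one-way design with independent groups is assumed.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    all_values = [v for group in groups for v in group]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(group) / len(group) for group in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(group) * (mean - grand_mean) ** 2
        for group, mean in zip(groups, group_means)
    )
    # Variation of individual scores around their own group mean.
    ss_within = sum(
        (v - mean) ** 2
        for group, mean in zip(groups, group_means)
        for v in group
    )

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical scores for three groups (for example, three teaching methods).
groups = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]

f_ratio, df_between, df_within = one_way_anova_f(groups)
print(f_ratio, df_between, df_within)
```

A large F ratio means the group means differ by more than the spread within the groups would lead us to expect. The F value is then compared against the F distribution with the two degrees of freedom to judge significance.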
10. Analysis of Time Series
• Time series modelling is a dynamic research area. The main aim of time series analysis is to carefully collect and
rigorously study the past observations of a time series to develop an appropriate model which describes the
inherent structure of the series.
• An arrangement of data by successive time periods is called a time series. The analysis of time series is extremely
useful to an educational planner and researcher in planning future operations and in assessing the effect of an
intervention in the system.
• A typical time series has four types of movements: secular trend or long-term movement (T), seasonal
movements or variations (S), cyclical movements or fluctuations (C) and irregular, accidental or random
movements (I).
• The analysis of time series comprises the description, measurement and isolation of the various components
present in the series. This analysis helps economists, businessmen, researchers, planners and so on.
11. Analysis of Time Series (Contd.)
The methods commonly used for measuring trends are:
• Free hand curve or graphic method: This method makes use of a graph on which the data points are plotted,
with the time units (years, months and so on) along the X-axis and the value of the time series
variable along the Y-axis.
• Semi-averages method: In this method, the data are divided into two equal parts (in case the number of
values is odd, either the middle value is ignored or the series is divided unequally). The averages for each
part are calculated and placed against the centre of each part. The averages are plotted and joined by a
line. The line is extended to cover the whole data.
• Method of moving averages: In this method, you find the simple average of a specific number of
consecutive values, moving forward one value at a time.
• Least squares method: The straight line obtained by this method is the line of ‘best fit’ that approximates
the given time series data.
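The semi-averages and moving-averages methods described above can be sketched as follows. The enrolment figures are hypothetical, and for an odd number of values the sketch takes the option of ignoring the middle value.

```python
def moving_average(values, window):
    """Simple averages of `window` consecutive values, moving one step at a time."""
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


def semi_averages(values):
    """Averages of the first and second halves of the series.

    With an odd number of values, the middle value is ignored.
    """
    half = len(values) // 2
    if half == 0:
        raise ValueError("need at least two values")
    first_half = values[:half]
    second_half = values[-half:]
    return sum(first_half) / half, sum(second_half) / half


# Hypothetical yearly enrolment figures (in thousands) for seven years.
enrolment = [10, 12, 14, 13, 15, 17, 16]

trend_3yr = moving_average(enrolment, 3)
first_avg, second_avg = semi_averages(enrolment)

print(trend_3yr)              # smoothed trend values
print(first_avg, second_avg)  # two points that define the semi-average trend line
```

Each moving-average value is placed against the middle period of its window, and the two semi-averages are placed against the centre of their halves and joined by a straight line, as the slide describes.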