Application of Statistical and Mathematical Equations in Chemistry
Part 2
Accuracy
Precision
Propagation of Error
Confidence Limits
F-Test Values
Student’s t-test
Paired Sample t-test
Q test
Least Squares Method
Correlation Coefficient
2. Awad Nasser Albalwiـــــــــــــــــــــــــــــــــــــــــــــــــــــــــــــــــــــــــــــــــــــــــــــــــــــــــــــــــــــــــــــــــــــــ
Accuracy theory:
Accuracy refers to the agreement between experimental data and a known value. You
can think of it in terms of a bullseye in which the target is hit close to the center, yet
the marks in the target aren't necessarily close to each other.
Absolute Error Formula
Absolute error is defined as the magnitude of the difference between the actual (true)
value and an individual measured value of the quantity in question.
Relative Error or fractional error
It is defined as the ratio of the mean absolute error to the mean value of the measured
quantity:
$\delta a = \dfrac{\text{mean absolute error}}{\text{mean value}} = \dfrac{\Delta a_{mean}}{a_m}$
Percentage Error
It is the relative error expressed as a percentage:
$\text{Percentage error} = \dfrac{\text{mean absolute error}}{\text{mean value}} \times 100 = \dfrac{\Delta a_{mean}}{a_m} \times 100$
Accuracy theory equation:
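Absolute error = $x - \mu$
% Relative error = $\dfrac{x - \mu}{\mu} \times 100\%$
Relative error in parts per thousand = $\dfrac{x - \mu}{\mu} \times 1000$ ppt
where $x$ is the measured value and $\mu$ is the true (accepted) value.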
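As a minimal sketch of the accuracy equations above, the following Python snippet computes the absolute error, the percent relative error, and the relative error in parts per thousand. The function name `accuracy_errors` and the sample numbers are illustrative assumptions, not values from the source.

```python
def accuracy_errors(x, mu):
    """Return (absolute error, % relative error, relative error in ppt).

    x  : measured (experimental) value
    mu : true (accepted) value; must be non-zero
    """
    absolute = x - mu
    relative_percent = (absolute / mu) * 100.0
    relative_ppt = (absolute / mu) * 1000.0
    return absolute, relative_percent, relative_ppt


# Hypothetical example: a measured value of 10.2 against an accepted 10.0
abs_err, rel_pct, rel_ppt = accuracy_errors(10.2, 10.0)
print(abs_err, rel_pct, rel_ppt)
```

A positive result means the measurement is higher than the accepted value; a negative result means it is lower.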
The Precision Theory
Precision refers to how well experimental values agree with each other. If you hit a
bullseye precisely, then you are able to hit the same spot on the target each time, even
though that spot may be distant from the center.
The standard deviation (SD, σ) shows how much variation or dispersion from the
average exists. A low standard deviation indicates that the data points tend to be very
close to the mean (also called expected value); a high standard deviation indicates
that the data points are spread out over a large range of values.
The Precision equation:
Absolute average deviation (a.d.) = $\dfrac{\sum_i |x_i - \bar{X}|}{N}$
% Relative average deviation = $\dfrac{\text{a.d.}}{\bar{X}} \times 100\%$
Standard deviation (S) = $\sqrt{\dfrac{\sum_i (x_i - \bar{X})^2}{N - 1}}$
Standard deviation of the mean (S, mean) = $\dfrac{S}{\sqrt{N}}$
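The precision equations above can be sketched in Python as follows. The function name `precision_stats` and the data set are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math


def precision_stats(values):
    """Compute the precision measures for a list of replicate results.

    Requires at least two values so that N - 1 is non-zero.
    """
    n = len(values)
    mean = sum(values) / n
    # Absolute average deviation: mean of |x_i - mean|
    avg_dev = sum(abs(v - mean) for v in values) / n
    # Relative average deviation as a percentage of the mean
    rel_avg_dev = avg_dev / mean * 100.0
    # Sample standard deviation with N - 1 in the denominator
    std_dev = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    # Standard deviation of the mean
    std_mean = std_dev / math.sqrt(n)
    return {
        "mean": mean,
        "avg_dev": avg_dev,
        "rel_avg_dev_percent": rel_avg_dev,
        "std_dev": std_dev,
        "std_mean": std_mean,
    }


# Hypothetical replicate measurements
stats = precision_stats([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
print(stats)
```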
Propagation of Error
Propagation of error (or propagation of uncertainty) describes how the uncertainties of
the measured variables affect the uncertainty of a function computed from them. It is a
calculus-derived statistical calculation that combines the uncertainties of multiple
variables to give a reliable estimate of the uncertainty in the result.
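The source gives no explicit propagation formulas. As a sketch under the common assumption that the individual errors are independent (uncorrelated), the standard first-order results are that absolute uncertainties add in quadrature for sums and differences, and relative uncertainties add in quadrature for products and quotients. The function names and numbers below are illustrative only.

```python
import math


def propagate_sum(s_a, s_b):
    """Uncertainty of q = a + b or q = a - b, assuming independent errors."""
    return math.sqrt(s_a ** 2 + s_b ** 2)


def propagate_product(q, a, s_a, b, s_b):
    """Uncertainty of q = a * b or q = a / b, assuming independent errors."""
    return abs(q) * math.sqrt((s_a / a) ** 2 + (s_b / b) ** 2)


# Hypothetical: two volumes with uncertainties 0.03 and 0.04 are added
s_total = propagate_sum(0.03, 0.04)

# Hypothetical: a = 2.0 +/- 0.06 multiplied by b = 5.0 +/- 0.2
s_product = propagate_product(2.0 * 5.0, 2.0, 0.06, 5.0, 0.2)
print(s_total, s_product)
```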
Definition of F-Test Values
In analysis of variance, an F-test is used to test group variance against a null
hypothesis, and is often used to determine whether any group of trials differs
significantly from an expected value. For example, the null hypothesis could be set as
the variances of two sample groups being equal, $\sigma_1^2 = \sigma_2^2$. To test
whether $\sigma_1^2 > \sigma_2^2$ (sample 1 has significantly more variance than sample 2), take the
ratio $S_1^2 / S_2^2$ and compare it to an F value in a table of pre-computed critical
values. To look up the critical value, find the degrees of freedom of each sample and
the desired confidence level. If the calculated ratio is less than the table value,
retain the null hypothesis that the variances are not significantly different.
Formula
$F = \dfrac{S_1^2}{S_2^2}$
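A minimal sketch of the F ratio, using Python's standard `statistics.variance` (sample variance with N - 1 in the denominator). The helper names, the data, and the use of a caller-supplied table value are assumptions for illustration. The critical value 19.00 is the tabulated one-tailed F value for 2 and 2 degrees of freedom at the 95% level.

```python
import statistics


def f_ratio(sample_1, sample_2):
    """Return (F, degrees of freedom of sample 1, degrees of freedom of sample 2)."""
    var_1 = statistics.variance(sample_1)
    var_2 = statistics.variance(sample_2)
    return var_1 / var_2, len(sample_1) - 1, len(sample_2) - 1


def variances_differ(f_value, f_critical):
    """True if the calculated ratio exceeds the table value."""
    return f_value > f_critical


# Hypothetical data sets
f_value, df_1, df_2 = f_ratio([1.0, 3.0, 5.0], [2.0, 3.0, 4.0])

# Table value looked up by the user for (df_1, df_2) = (2, 2) at 95%
significant = variances_differ(f_value, 19.00)
print(f_value, df_1, df_2, significant)
```

Here the ratio is below the table value, so the null hypothesis of equal variances is retained.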
Student’s t-test
Student’s t-test is a statistical method for testing hypotheses about the mean of a
small sample drawn from a normally distributed population when the
population standard deviation is unknown.
In 1908 William Sealy Gosset, an Englishman publishing under the pseudonym
Student, developed the t-test and t distribution. The t distribution is a family of curves
in which the number of degrees of freedom (the number of independent observations
in the sample minus one) specifies a particular curve. As the sample size (and thus the
degrees of freedom) increases, the t distribution approaches the bell shape of the
standard normal distribution. In practice, for tests involving the mean of a sample of
size greater than 30, the normal distribution is usually applied.
It is usual first to formulate a null hypothesis, which states that there is no effective
difference between the observed sample mean and the hypothesized or stated
population mean—i.e., that any measured difference is due only to chance. In an
agricultural study, for example, the null hypothesis could be that an application of
fertilizer has had no effect on crop yield, and an experiment would be performed to
test whether it has increased the harvest. In general, a t-test may be either two-sided
(also termed two-tailed), stating simply that the means are not equivalent, or one-
sided, specifying whether the observed mean is larger or smaller than the
hypothesized mean. The test statistic t is then calculated. If the observed t-statistic is
more extreme than the critical value determined by the appropriate reference
distribution, the null hypothesis is rejected. The appropriate reference distribution for
the t-statistic is the t distribution. The critical value depends on the significance level
of the test (the probability of erroneously rejecting the null hypothesis).
For example, suppose a researcher wishes to test the hypothesis that a sample of
size n = 25 with mean x̄ = 79 and standard deviation s = 10 was drawn at random from
a population with mean μ = 75 and unknown standard deviation. Using the formula
for the t-statistic,
$t = \dfrac{\bar{x} - \mu}{s / \sqrt{n}} = \dfrac{79 - 75}{10 / \sqrt{25}} = 2$
the calculated t equals 2. For a two-sided test at a common level of
significance α = 0.05, the critical values from the t distribution on 24 degrees of
freedom are −2.064 and 2.064. The calculated t does not exceed these values, hence
the null hypothesis cannot be rejected with 95 percent confidence. (The confidence
level is 1 − α.)
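The worked example above can be reproduced with a few lines of Python. The function name `one_sample_t` is an illustrative assumption; the numbers and the critical value 2.064 come from the example itself.

```python
import math


def one_sample_t(sample_mean, mu, s, n):
    """t-statistic for a sample mean against a hypothesized population mean."""
    return (sample_mean - mu) / (s / math.sqrt(n))


# Values from the example: n = 25, mean = 79, s = 10, mu = 75
t_value = one_sample_t(79, 75, 10, 25)

# Two-sided critical value for 24 degrees of freedom at alpha = 0.05
t_critical = 2.064
reject_null = abs(t_value) > t_critical
print(t_value, reject_null)
```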
A second application of the t distribution tests the hypothesis that two independent
random samples have the same mean. The t distribution can also be used to construct
confidence intervals for the true mean of a population (the first application) or for the
difference between two sample means (the second application).
Paired Sample t-test
A paired sample t-test is used to determine whether there is a significant difference
between the average values of the same measurement made under two different
conditions. Both measurements are made on each unit in a sample, and the test is
based on the paired differences between these two values. The usual null hypothesis
is that the difference in the mean values is zero. For example, the yield of two strains
of barley is measured in successive years in twenty different plots of agricultural land
(the units) to investigate whether one crop gives a significantly greater yield than the
other, on average.
The null hypothesis for the paired sample t-test is
H0: d = µ1 - µ2 = 0
where d is the mean value of the difference.
This null hypothesis is tested against one of the following alternative hypotheses,
depending on the question posed:
H1: d ≠ 0
H1: d > 0
H1: d < 0
The paired sample t-test is a more powerful alternative to a two sample procedure,
such as the two sample t-test, but can only be used when we have matched samples.
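Formula
For the two sample procedure with independent samples, the pooled standard deviation
and the test statistic are:
$S_p = \sqrt{\dfrac{\sum_i (x_{i1} - \bar{X}_1)^2 + \sum_i (x_{i2} - \bar{X}_2)^2 + \dots + \sum_i (x_{ik} - \bar{X}_k)^2}{N - K}}$
$t = \dfrac{\bar{X}_1 - \bar{X}_2}{S_p} \sqrt{\dfrac{N_1 N_2}{N_1 + N_2}}$
where $N$ is the total number of measurements, $K$ is the number of samples, and
$N_1$ and $N_2$ are the sizes of the two samples being compared.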
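As a sketch of the pooled formulas above for the case of exactly two independent samples (K = 2, so N - K = N1 + N2 - 2), the following Python function returns t, the pooled standard deviation, and the degrees of freedom. The function name and the data are hypothetical.

```python
import math


def pooled_t(sample_1, sample_2):
    """Two-sample t-statistic using the pooled standard deviation."""
    n1, n2 = len(sample_1), len(sample_2)
    mean_1 = sum(sample_1) / n1
    mean_2 = sum(sample_2) / n2
    # Sums of squared deviations from each sample's own mean
    ss_1 = sum((x - mean_1) ** 2 for x in sample_1)
    ss_2 = sum((x - mean_2) ** 2 for x in sample_2)
    dof = n1 + n2 - 2
    s_p = math.sqrt((ss_1 + ss_2) / dof)
    t = (mean_1 - mean_2) / s_p * math.sqrt(n1 * n2 / (n1 + n2))
    return t, s_p, dof


# Hypothetical results from two methods
t_pooled, s_pooled, dof_pooled = pooled_t([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
print(t_pooled, s_pooled, dof_pooled)
```

The absolute value of t is then compared with the tabulated critical value for the returned degrees of freedom.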
t test with multiple samples
In a t-test you are able to test one sample mean vs. a single value or two sample means
against each other. Analysis of Variance allows you to test if multiple means are equal
to each other.
In other words, the null hypothesis of the t-test is: μ1 = μ2
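Formula
For paired measurements, where $D_i$ is the difference within pair $i$, $\bar{D}$ is the
mean difference, and $N$ is the number of pairs:
$S_d = \sqrt{\dfrac{\sum_i (D_i - \bar{D})^2}{N - 1}}$
$t = \dfrac{\bar{D}}{S_d} \sqrt{N}$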
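The difference-based formulas above (standard deviation of the differences, then t) can be sketched as follows for paired measurements. The function name `paired_t` and the data are illustrative assumptions.

```python
import math


def paired_t(first, second):
    """t-statistic from paired differences D_i = second_i - first_i."""
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    d_mean = sum(diffs) / n
    # Standard deviation of the differences (N - 1 in the denominator)
    s_d = math.sqrt(sum((d - d_mean) ** 2 for d in diffs) / (n - 1))
    return d_mean / s_d * math.sqrt(n), n - 1


# Hypothetical measurements on the same four units under two conditions
t_paired, dof_paired = paired_t([10.0, 12.0, 9.0, 11.0],
                                [11.0, 14.0, 12.0, 11.0])
print(t_paired, dof_paired)
```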
Definition of Q test
The Q-test is a simple statistical test to determine if a data point that appears to be
very different from the rest of the data points in a set may be discarded. Only
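one data point in a set may be rejected using the Q-test. The Q-test is:
$Q = \dfrac{|x_{suspect} - x_{nearest}|}{x_{max} - x_{min}}$
that is, the gap between the suspect value and its nearest neighbor divided by the
range of the data set.
The value of Q is compared to a critical value, Qc.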
Table of Q critical values (90% confidence)
N     Qc
3     0.94
4     0.76
5     0.64
6     0.56
7     0.51
8     0.47
9     0.44
10    0.41
If Q is larger than Qc the outlier can be discarded with 90% confidence.
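A minimal sketch of the Q-test using the 90% critical values tabulated above. It assumes the usual definition of Q as the gap between the suspect value and its nearest neighbor divided by the range, and it tests whichever end of the sorted data has the larger gap. The function name and the data are hypothetical.

```python
# Critical values Qc at 90% confidence, as tabulated in the text
Q_CRITICAL_90 = {3: 0.94, 4: 0.76, 5: 0.64, 6: 0.56,
                 7: 0.51, 8: 0.47, 9: 0.44, 10: 0.41}


def q_test(values):
    """Return (suspect value, Q, True if the suspect may be discarded)."""
    data = sorted(values)
    n = len(data)
    if n not in Q_CRITICAL_90:
        raise ValueError("table covers only N = 3 to 10")
    spread = data[-1] - data[0]
    if spread == 0:
        raise ValueError("all values are identical")
    gap_low = data[1] - data[0]
    gap_high = data[-1] - data[-2]
    # Test the extreme value that lies farther from its neighbor
    if gap_high >= gap_low:
        suspect, gap = data[-1], gap_high
    else:
        suspect, gap = data[0], gap_low
    q = gap / spread
    return suspect, q, q > Q_CRITICAL_90[n]


# Hypothetical replicate results with one suspicious high value
suspect, q_value, discard = q_test([10.1, 10.2, 10.3, 10.2, 12.0])
print(suspect, q_value, discard)
```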
Definition of 'Least Squares Method'
A statistical technique to determine the line of best fit for a model. The least squares
method fits an equation with adjustable parameters to observed data by minimizing the
sum of the squared differences between the observed and predicted values. This
method is extensively used in regression analysis and estimation.
Formula
y = mx + b
$s = \sum_i (y_i - \hat{y}_i)^2 = \sum_i (y_i - (m x_i + b))^2$
$m = \dfrac{\sum x_i y_i - \dfrac{\sum x_i \sum y_i}{n}}{\sum x_i^2 - \dfrac{(\sum x_i)^2}{n}}$
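The slope formula above can be sketched in Python as follows. The intercept expression b = (Σy - mΣx) / n is not given in the source; it is the standard companion result and is included here as an assumption. The function name and the data are illustrative.

```python
def least_squares_line(xs, ys):
    """Return slope m and intercept b of the best-fit line y = m*x + b."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    # Slope from the formula in the text
    m = (sum_xy - sum_x * sum_y / n) / (sum_xx - sum_x ** 2 / n)
    # Intercept (assumed standard result, not stated in the source)
    b = (sum_y - m * sum_x) / n
    return m, b


# Hypothetical calibration points lying exactly on y = 2x + 1
m, b = least_squares_line([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])
print(m, b)
```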
correlation coefficient theory:
A correlation coefficient is a statistical measure of the degree to which changes to the
value of one variable predict change to the value of another. In positively correlated
variables, the value increases or decreases in tandem. In negatively correlated
variables, the value of one increases as the value of the other decreases.
Correlation coefficients are expressed as values between +1 and -1. A coefficient of +1
indicates a perfect positive correlation: A change in the value of one variable will
predict a change in the same direction in the second variable. A coefficient of -1
indicates a perfect negative correlation: A change in the value of one variable predicts
a change in the opposite direction in the second variable. Lesser degrees of
correlation are expressed as non-zero decimals. A coefficient of zero indicates there is
no discernable relationship between fluctuations of the variables.
equation:
$r = \dfrac{n \sum x_i y_i - \sum x_i \sum y_i}{\sqrt{\left(n \sum x_i^2 - (\sum x_i)^2\right)\left(n \sum y_i^2 - (\sum y_i)^2\right)}}$
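As a sketch, the correlation coefficient can be computed directly from the sums in the equation above. The function name and the two small data sets are hypothetical, chosen to show the limiting values +1 and -1.

```python
import math


def correlation_coefficient(xs, ys):
    """Correlation coefficient r computed from raw sums."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    sum_yy = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_xx - sum_x ** 2) *
                            (n * sum_yy - sum_y ** 2))
    return numerator / denominator


# Perfect positive and perfect negative linear relationships
r_positive = correlation_coefficient([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])
r_negative = correlation_coefficient([1.0, 2.0, 3.0, 4.0], [9.0, 7.0, 5.0, 3.0])
print(r_positive, r_negative)
```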