2. Sample Space
The possible outcomes of a random experiment
are called the basic outcomes, and the set of all
basic outcomes is called the sample space. The
symbol S will be used to denote the sample
space.
3. Sample Space
- An Example -
What is the sample space for a roll of a
single six-sided die?
S = [1, 2, 3, 4, 5, 6]
4. Mutually Exclusive
If the events A and B have no common basic outcomes,
they are called mutually exclusive, and their intersection
A ∩ B is the empty set, indicating that A ∩ B cannot
occur.
More generally, the K events E1, E2, . . . , EK are
said to be mutually exclusive if every pair of them is a
pair of mutually exclusive events.
5. Venn Diagrams
Venn Diagrams are drawings, usually using
geometric shapes, used to depict basic
concepts in set theory and the outcomes of
random experiments.
6. Intersection of Events A and B
[Venn diagrams: (a) A ∩ B is the striped area; (b) A and B are mutually exclusive.]
7. Collectively Exhaustive
Given the K events E1, E2, . . ., EK in the
sample space S. If E1 ∪ E2 ∪ . . . ∪ EK = S,
these events are said to be collectively
exhaustive.
8. Complement
Let A be an event in the sample space S. The
set of basic outcomes of a random experiment
belonging to S but not to A is called the
complement of A and is denoted by Ā.
10. Unions, Intersections, and
Complements
A die is rolled. Let A be the event “Number rolled is even”
and B be the event “Number rolled is at least 4.” Then
A = [2, 4, 6] and B = [4, 5, 6]
Ā = [1, 3, 5] and B̄ = [1, 2, 3]
A ∩ B = [4, 6]
A ∪ B = [2, 4, 5, 6]
A ∪ Ā = [1, 2, 3, 4, 5, 6] = S
11. Classical Probability
The classical definition of probability is the
proportion of times that an event will occur,
assuming that all outcomes in a sample space are
equally likely to occur. The probability of an
event is determined by counting the number of
outcomes in the sample space that satisfy the
event and dividing by the number of outcomes in
the sample space.
12. Classical Probability
The probability of an event A is
P(A) = \frac{N_A}{N}
where N_A is the number of outcomes that satisfy the
condition of event A and N is the total number of outcomes
in the sample space. The important idea here is that one
can develop a probability from fundamental reasoning
about the process.
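To make the counting concrete, here is a minimal Python sketch of the classical calculation, using the die and the even-number event from the earlier slides:

```python
# Classical probability: count the outcomes satisfying the event
# and divide by the size of the sample space.
S = {1, 2, 3, 4, 5, 6}            # sample space for one six-sided die
A = {x for x in S if x % 2 == 0}  # event: number rolled is even

p_A = len(A) / len(S)             # N_A / N
print(p_A)                        # 0.5
```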
13. Combinations
The counting process can be generalized by
using the following equation to count
the number of combinations of n things
taken k at a time:
C_k^n = \frac{n!}{k!(n − k)!}
where 0! = 1.
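As a quick check of the formula, the factorial expression can be compared against Python's built-in math.comb; the values n = 5 and k = 2 are arbitrary:

```python
from math import comb, factorial

# Combinations of n things taken k at a time, computed two ways:
# the factorial formula from the slide, and the library routine.
n, k = 5, 2
print(factorial(n) // (factorial(k) * factorial(n - k)))  # 10
print(comb(n, k))                                         # 10
```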
14. Relative Frequency
The relative frequency definition of probability is
the limit of the proportion of times that an
event A occurs in a large number of trials, n:
P(A) = \frac{n_A}{n}
where n_A is the number of A outcomes and n is
the total number of trials or outcomes in the
population. The probability is the limit as n
becomes large.
15. Subjective Probability
The subjective definition of probability
expresses an individual’s degree of belief about
the chance that an event will occur. These
subjective probabilities are used in certain
management decision procedures.
16. Probability Postulates
Let S denote the sample space of a random experiment, Oi, the
basic outcomes, and A, an event. For each event A of the
sample space S, we assume that a number P(A) is defined
and we have the postulates
1. If A is any event in the sample space S, then
0 ≤ P(A) ≤ 1
2. Let A be an event in S, and let Oi denote the basic outcomes.
Then
P(A) = \sum_{O_i \in A} P(O_i)
where the notation implies that the summation extends over
all the basic outcomes in A.
3. P(S) = 1
17. Probability Rules
Let A be an event and Ā its complement.
Then the complement rule is:
P(Ā) = 1 − P(A)
18. Probability Rules
The Addition Rule of Probabilities:
Let A and B be two events. The probability of
their union is
P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
19. Probability Rules
Venn Diagram for the Addition Rule:
P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
[Venn diagrams: the area of A ∪ B equals the area of A plus the area of B minus the double-counted area of A ∩ B.]
20. Probability Rules
Conditional Probability:
Let A and B be two events. The conditional probability of
event A, given that event B has occurred, is denoted by the
symbol P(A | B) and is found to be
P(A | B) = \frac{P(A ∩ B)}{P(B)}
provided that P(B) > 0.
21. Probability Rules
Conditional Probability:
Let A and B be two events. The conditional probability of
event B, given that event A has occurred, is denoted by the
symbol P(B | A) and is found to be
P(B | A) = \frac{P(A ∩ B)}{P(A)}
provided that P(A) > 0.
22. Probability Rules
The Multiplication Rule of Probabilities:
Let A and B be two events. The probability of
their intersection can be derived from the
conditional probability as
P(A ∩ B) = P(A | B) P(B)
Also,
P(A ∩ B) = P(B | A) P(A)
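A short Python sketch ties the conditional-probability and multiplication rules back to the die example of slide 10 (A = even, B = at least 4):

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}   # event: number rolled is even
B = {4, 5, 6}   # event: number rolled is at least 4

def P(event):
    """Classical probability: favorable outcomes over total outcomes."""
    return Fraction(len(event), len(S))

# Conditional probability: P(A | B) = P(A ∩ B) / P(B)
p_A_given_B = P(A & B) / P(B)
print(p_A_given_B)                      # 2/3

# Multiplication rule: P(A ∩ B) = P(A | B) P(B)
assert P(A & B) == p_A_given_B * P(B)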
23. Statistical Independence
Let A and B be two events. These events are said to be
statistically independent if and only if
P(A ∩ B) = P(A) P(B)
From the multiplication rule it also follows that
P(A | B) = P(A) (if P(B) > 0)
P(B | A) = P(B) (if P(A) > 0)
More generally, the events E1, E2, . . ., EK are mutually
statistically independent if and only if
P(E1 ∩ E2 ∩ . . . ∩ EK) = P(E1) P(E2) . . . P(EK)
25. Joint and Marginal Probabilities
In the context of bivariate probabilities, the
intersection probabilities P(Ai ∩ Bj) are called joint
probabilities. The probabilities for individual events
P(Ai) and P(Bj) are called marginal probabilities.
Marginal probabilities are at the margin of a
bivariate table and can be computed by summing the
corresponding row or column.
26. Probabilities for the Television
Viewing and Income Example
Viewing Frequency   High Income   Middle Income   Low Income   Totals
Regular             0.04          0.13            0.04         0.21
Occasional          0.10          0.11            0.06         0.27
Never               0.13          0.17            0.22         0.52
Totals              0.27          0.41            0.32         1.00
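A small Python sketch shows how the marginal probabilities in the table arise as row and column sums of the joint probabilities (expect minor floating-point noise in the printed values):

```python
# Joint probabilities from the television viewing / income table;
# rows are viewing frequency, columns are income level.
joint = {
    "Regular":    [0.04, 0.13, 0.04],
    "Occasional": [0.10, 0.11, 0.06],
    "Never":      [0.13, 0.17, 0.22],
}
income_levels = ["High", "Middle", "Low"]

# Marginal probabilities: sum each row and each column.
row_marginals = {r: sum(p) for r, p in joint.items()}
col_marginals = [sum(col) for col in zip(*joint.values())]

print(row_marginals)                            # Regular 0.21, Occasional 0.27, Never 0.52
print(dict(zip(income_levels, col_marginals)))  # High 0.27, Middle 0.41, Low 0.32
```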
28. Probability Rules
Rule for Determining the Independence of Attributes
Let A and B be a pair of attributes, each broken into
mutually exclusive and collectively exhaustive event
categories denoted by labels A1, A2, . . ., Ah and
B1, B2, . . ., Bk. If every Ai is statistically independent of
every event Bj, then the attributes A and B are
independent.
29. Bayes’ Theorem
Let A and B be two events. Then Bayes’ Theorem states
that:
P(B | A) = \frac{P(A | B) P(B)}{P(A)}
and
P(A | B) = \frac{P(B | A) P(A)}{P(B)}
30. Bayes’ Theorem
(Alternative Statement)
Let E1, E2, . . . , Ek be mutually exclusive and collectively
exhaustive events and let A be some other event. The
conditional probability of Ei given A can be expressed as
Bayes’ Theorem:
P(E_i | A) = \frac{P(A | E_i) P(E_i)}{P(A | E_1) P(E_1) + P(A | E_2) P(E_2) + . . . + P(A | E_K) P(E_K)}
31. Bayes’ Theorem
- Solution Steps -
1. Define the subset events from the
problem.
2. Define the probabilities for the events
defined in step 1.
3. Compute the complements of the
probabilities.
4. Apply Bayes’ theorem to compute the
probability for the problem solution.
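The four steps can be illustrated with a small Python sketch; the events and all probabilities below are hypothetical numbers chosen for illustration, not values from the slides:

```python
# Step 1: define the events. E = "part comes from supplier 1",
#         A = "part is defective". (Hypothetical scenario.)
# Step 2: define the probabilities for those events (assumed values).
P_E            = 0.30   # P(E): part from supplier 1
P_A_given_E    = 0.05   # P(A | E): defective rate for supplier 1
P_A_given_notE = 0.02   # P(A | not E): defective rate elsewhere

# Step 3: compute the complement.
P_notE = 1 - P_E

# Step 4: apply Bayes' theorem, with E and not-E mutually
# exclusive and collectively exhaustive.
P_A = P_A_given_E * P_E + P_A_given_notE * P_notE
P_E_given_A = P_A_given_E * P_E / P_A
print(round(P_E_given_A, 4))   # 0.5172
```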
33. Random Variables
A random variable is a variable that takes on
numerical values determined by the outcome
of a random experiment.
34. Discrete Random Variables
A random variable is discrete if it can
take on no more than a countable
number of values.
35. Discrete Random Variables
(Examples)
1. The number of defective items in a sample of twenty
items taken from a large shipment.
2. The number of customers arriving at a check-out
counter in an hour.
3. The number of errors detected in a corporation’s
accounts.
4. The number of claims on a medical insurance policy
in a particular year.
37. Continuous Random Variables
(Examples)
1. The income in a year for a family.
2. The amount of oil imported into the U.S. in a
particular month.
3. The change in the price of a share of IBM common
stock in a month.
4. The time that elapses between the installation of a new
computer and its failure.
5. The percentage of impurity in a batch of chemicals.
38. Discrete Probability Distributions
The probability distribution function, P(x),
of a discrete random variable expresses the
probability that X takes the value x, as a
function of x. That is,
P(x) = P(X = x), for all values of x.
39. Discrete Probability Distributions
Graph the probability distribution function for
the roll of a single six-sided die.
[Graph: P(x) = 1/6 for each x = 1, 2, 3, 4, 5, 6.]
40. Required Properties of Probability
Distribution Functions of Discrete
Random Variables
Let X be a discrete random variable with
probability distribution function, P(x). Then
- P(x) ≥ 0 for any value of x
- The individual probabilities sum to 1; that is
\sum_x P(x) = 1
where the notation indicates summation
over all possible values x.
41. Cumulative Probability Function
The cumulative probability function, F(x0), of a
random variable X expresses the probability
that X does not exceed the value x0, as a
function of x0. That is
F(x0) = P(X ≤ x0)
where the function is evaluated at all values x0.
42. Derived Relationship Between Probability
Function and Cumulative Probability
Function
Let X be a random variable with probability function
P(x) and cumulative probability function F(x0). Then it
can be shown that
F(x0) = \sum_{x ≤ x0} P(x)
where the notation implies that summation is over all
possible values x that are less than or equal to x0.
43. Derived Properties of Cumulative
Probability Functions for Discrete
Random Variables
Let X be a discrete random variable with a
cumulative probability function, F(x0).
Then we can show that
- 0 ≤ F(x0) ≤ 1 for every number x0
- If x0 and x1 are two numbers with x0 < x1,
then F(x0) ≤ F(x1)
44. Expected Value
The expected value, E(X), of a discrete random
variable X is defined
E(X) = \sum_x x P(x)
where the notation indicates that summation extends
over all possible values x.
The expected value of a random variable is called its
mean and is denoted µ_X.
45. Expected Value: Functions of
Random Variables
Let X be a discrete random variable with
probability function P(x) and let g(X) be some
function of X. Then the expected value, E[g(X)],
of that function is defined as
E[g(X)] = \sum_x g(x) P(x)
46. Variance and Standard Deviation
Let X be a discrete random variable. The expectation
of the squared discrepancies about the mean, (X − µ_X)^2,
is called the variance, denoted σ_X^2, and is given by
σ_X^2 = E[(X − µ_X)^2] = \sum_x (x − µ_X)^2 P(x)
The standard deviation, σ_X, is the positive square root
of the variance.
47. Variance
(Alternative Formula)
The variance of a discrete random variable X can be
expressed as
σ_X^2 = E(X^2) − µ_X^2
= \sum_x x^2 P(x) − µ_X^2
48. Expected Value and Variance for
Discrete Random Variable Using
Microsoft Excel
Sales x   P(x)   x·P(x)   (x − µ)^2·P(x)
0         0.15   0.00     0.570375
1         0.30   0.30     0.270750
2         0.20   0.40     0.000500
3         0.20   0.60     0.220500
4         0.10   0.40     0.420250
5         0.05   0.25     0.465125
Totals           1.95     1.947500
Expected Value = 1.95, Variance = 1.9475
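The same mean and variance can be reproduced with a few lines of Python instead of a spreadsheet:

```python
# Recomputing the sales table: E(X) = sum of x*P(x), and
# Var(X) = sum of (x - mu)^2 * P(x).
x_vals = [0, 1, 2, 3, 4, 5]
probs  = [0.15, 0.30, 0.20, 0.20, 0.10, 0.05]

mean = sum(x * p for x, p in zip(x_vals, probs))
variance = sum((x - mean) ** 2 * p for x, p in zip(x_vals, probs))

print(mean)      # 1.95
print(variance)  # 1.9475
```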
49. Summary of Properties for Linear
Function of a Random Variable
Let X be a random variable with mean µ_X and variance σ_X^2,
and let a and b be any constant fixed numbers. Define the
random variable Y = a + bX. Then the mean and variance
of Y are
µ_Y = E(a + bX) = a + bµ_X
and
σ_Y^2 = Var(a + bX) = b^2 σ_X^2
so that the standard deviation of Y is
σ_Y = |b|σ_X
50. Summary Results for the Mean and
Variance of Special Linear Functions
- Let b = 0 in the linear function W = a + bX. Then W = a
(for any constant a):
E(a) = a and Var(a) = 0
If a random variable always takes the value a, it will have a
mean a and a variance 0.
- Let a = 0 in the linear function W = a + bX. Then W = bX:
E(bX) = bµ_X and Var(bX) = b^2 σ_X^2
51. Mean and Variance of Z
Let a = −µ_X/σ_X and b = 1/σ_X in the linear function
Z = a + bX. Then
Z = a + bX = \frac{X − µ_X}{σ_X}
so that
E(Z) = −\frac{µ_X}{σ_X} + \frac{1}{σ_X} µ_X = 0
and
Var(Z) = \frac{1}{σ_X^2} σ_X^2 = 1
52. Bernoulli Distribution
A Bernoulli distribution arises from a random experiment
which can give rise to just two possible outcomes. These
outcomes are usually labeled as either “success” or
“failure.” If π denotes the probability of a success and the
probability of a failure is (1 − π), then the Bernoulli
probability function is
P(0) = 1 − π and P(1) = π
53. Mean and Variance of a Bernoulli
Random Variable
The mean is:
µ_X = E(X) = \sum_x x P(x) = (0)(1 − π) + (1)(π) = π
And the variance is:
σ_X^2 = E[(X − µ_X)^2] = \sum_x (x − µ_X)^2 P(x)
= (0 − π)^2 (1 − π) + (1 − π)^2 π = π(1 − π)
54. Sequences of x Successes in n
Trials
The number of sequences with x successes in n independent
trials is:
C_x^n = \frac{n!}{x!(n − x)!}
where n! = n × (n − 1) × (n − 2) × . . . × 1 and 0! = 1.
These C_x^n sequences are mutually exclusive,
since no two of them can occur at the same time.
55. Binomial Distribution
Suppose that a random experiment can result in two possible mutually
exclusive and collectively exhaustive outcomes, “success” and “failure,”
and that π is the probability of a success resulting in a single trial. If n
independent trials are carried out, the distribution of the resulting
number of successes “x” is called the binomial distribution. Its
probability distribution function for the binomial random
variable X = x is:
P(x) = P(x successes in n independent trials) = \frac{n!}{x!(n − x)!} π^x (1 − π)^(n−x)
for x = 0, 1, 2, . . . , n
56. Mean and Variance of a Binomial
Probability Distribution
Let X be the number of successes in n independent trials,
each with probability of success π. Then X follows a
binomial distribution with mean
µ_X = E(X) = nπ
and variance
σ_X^2 = E[(X − µ_X)^2] = nπ(1 − π)
57. Binomial Probabilities
- An Example -
An insurance broker has five contracts, and he believes
that for each contract the probability of making a sale is
0.40.
What is the probability that he makes at most one sale?
P(at most one sale) = P(X ≤ 1) = P(X = 0) + P(X = 1)
= 0.078 + 0.259 = 0.337
since
P(no sales) = P(0) = \frac{5!}{0!5!} (0.4)^0 (0.6)^5 = 0.078
P(1 sale) = P(1) = \frac{5!}{1!4!} (0.4)^1 (0.6)^4 = 0.259
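The broker's probabilities can be checked with a short Python sketch built on the binomial formula above:

```python
from math import comb

def binomial_pmf(x, n, pi):
    """P(x successes in n independent trials with success probability pi)."""
    return comb(n, x) * pi**x * (1 - pi)**(n - x)

# The broker example: n = 5 contracts, P(sale) = 0.40.
n, pi = 5, 0.40
p_at_most_one = binomial_pmf(0, n, pi) + binomial_pmf(1, n, pi)
print(round(p_at_most_one, 3))   # 0.337
```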
59. Poisson Probability Distribution
Assume that an interval is divided into a very large number of
subintervals so that the probability of the occurrence of an
event in any subinterval is very small. The assumptions of
a Poisson probability distribution are:
1) The probability of an occurrence of an event is constant for
all subintervals.
2) There can be no more than one occurrence in each
subinterval.
3) Occurrences are independent; that is, the numbers of
occurrences in any non-overlapping intervals are
independent of one another.
60. Poisson Probability Distribution
The random variable X is said to follow the Poisson
probability distribution if it has the probability function:
P(x) = \frac{e^{−λ} λ^x}{x!}, for x = 0, 1, 2, . . .
where
P(x) = the probability of x successes over a given period of
time or space, given λ
λ = the expected number of successes per time or space
unit; λ > 0
e = 2.71828 (the base for natural logarithms)
The mean and variance of the Poisson probability distribution are:
µ_X = E(X) = λ and σ_X^2 = E[(X − µ_X)^2] = λ
62. Poisson Approximation to the
Binomial Distribution
Let X be the number of successes resulting from n independent
trials, each with a probability of success, π. The distribution of the
number of successes X is binomial, with mean nπ. If the number of
trials n is large and nπ is of only moderate size (preferably nπ ≤ 7),
this distribution can be approximated by the Poisson distribution
with λ = nπ. The probability function of the approximating
distribution is then:
P(x) = \frac{e^{−nπ} (nπ)^x}{x!}, for x = 0, 1, 2, . . .
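A brief Python sketch compares exact binomial probabilities with their Poisson approximation; the values n = 100 and π = 0.03 are an arbitrary illustration satisfying nπ ≤ 7:

```python
from math import comb, exp, factorial

# Compare exact binomial probabilities with the Poisson
# approximation using lambda = n * pi.
n, pi = 100, 0.03
lam = n * pi   # 3.0

for x in range(4):
    binom   = comb(n, x) * pi**x * (1 - pi)**(n - x)
    poisson = exp(-lam) * lam**x / factorial(x)
    print(x, round(binom, 4), round(poisson, 4))  # the two columns are close
```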
63. Covariance
Let X be a random variable with mean µ_X, and let Y be a
random variable with mean µ_Y. The expected value of
(X − µ_X)(Y − µ_Y) is called the covariance between X and Y,
denoted Cov(X, Y).
For discrete random variables
Cov(X, Y) = E[(X − µ_X)(Y − µ_Y)] = \sum_x \sum_y (x − µ_X)(y − µ_Y) P(x, y)
An equivalent expression is
Cov(X, Y) = E(XY) − µ_X µ_Y = \sum_x \sum_y x y P(x, y) − µ_X µ_Y
64. Correlation
Let X and Y be jointly distributed random variables.
The correlation between X and Y is:
ρ = Corr(X, Y) = \frac{Cov(X, Y)}{σ_X σ_Y}
65. Covariance and Statistical
Independence
If two random variables are statistically
independent, the covariance between them is 0.
However, the converse is not necessarily true.
66. Portfolio Analysis
The random variable X is the price for stock A and the
random variable Y is the price for stock B. The market
value, W, for the portfolio is given by the linear function,
W = aX + bY
where a is the number of shares of stock A and b is the
number of shares of stock B.
67. Portfolio Analysis
The mean value for W is
µ_W = E[W] = E[aX + bY] = aµ_X + bµ_Y
The variance for W is
σ_W^2 = a^2 σ_X^2 + b^2 σ_Y^2 + 2ab Cov(X, Y)
or, using the correlation,
σ_W^2 = a^2 σ_X^2 + b^2 σ_Y^2 + 2ab Corr(X, Y) σ_X σ_Y
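A minimal Python sketch of the portfolio formulas; the share counts, means, variances, and covariance below are hypothetical numbers for illustration:

```python
# Portfolio W = a*X + b*Y: mean and variance from the formulas above.
a, b = 10, 5                  # shares of stock A and stock B (assumed)
mu_X, mu_Y = 25.0, 40.0       # mean prices (assumed)
var_X, var_Y = 81.0, 121.0    # price variances (assumed)
cov_XY = 40.0                 # Cov(X, Y) (assumed)

mean_W = a * mu_X + b * mu_Y
var_W = a**2 * var_X + b**2 * var_Y + 2 * a * b * cov_XY
print(mean_W, var_W)          # 450.0 15125.0
```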
70. Cumulative Distribution Function
The cumulative distribution function, F(x), for a
continuous random variable X expresses the
probability that X does not exceed the value x, as
a function of x:
F(x) = P(X ≤ x)
72. Cumulative Distribution Function
Let X be a continuous random variable with a
cumulative distribution function F(x), and let a and
b be two possible values of X, with a < b. The
probability that X lies between a and b is
P(a < X < b) = F(b) − F(a)
73. Probability Density Function
Let X be a continuous random variable, and let x be any number lying in
the range of values this random variable can take. The probability density
function, f(x), of the random variable is a function with the following
properties:
- f(x) > 0 for all values of x
- The area under the probability density function f(x) over all values of the
random variable X is equal to 1.0
- Suppose this density function is graphed. Let a and b be two possible
values of the random variable X, with a < b. Then the probability that X lies
between a and b is the area under the density function between these points.
- The cumulative distribution function F(x0) is the area under the probability
density function f(x) up to x0:
F(x0) = \int_{x_m}^{x0} f(x) dx
where x_m is the minimum value of the random variable X.
74. Shaded Area is the Probability That
X is Between a and b
76. Areas Under Continuous Probability
Density Functions
Let X be a continuous random variable with the
probability density function f(x) and cumulative
distribution F(x). Then the following properties
hold:
- The total area under the curve f(x) is 1.
- The area under the curve f(x) to the left of x0 is
F(x0), where x0 is any value that the random
variable can take.
77. Properties of the Probability Density
Function
[Figure: uniform probability density function, f(x) = 1 for 0 < x < 1. The total area under the uniform probability density function is 1.]
78. Properties of the Probability Density
Function
[Figure: the area under the uniform probability density function to the left of x0 is F(x0), which is equal to x0 for this uniform distribution because f(x) = 1.]
79. Rationale for Expectations of
Continuous Random Variables
Suppose that a random experiment leads to an
outcome that can be represented by a continuous
random variable. If N independent replications of
this experiment are carried out, then the expected
value of the random variable is the average of the
values taken, as the number of replications becomes
infinitely large. The expected value of a random
variable is denoted by E(X).
80. Rationale for Expectations of
Continuous Random Variables
(continued)
Similarly, if g(x) is any function of the random
variable, X, then the expected value of this function is
the average value taken by the function over repeated
independent trials, as the number of trials becomes
infinitely large. This expectation is denoted E[g(X)].
By using calculus we can define expected values for
continuous random variables similarly to that used for
discrete random variables.
E[g(X)] = \int_x g(x) f(x) dx
81. Mean, Variance, and Standard
Deviation
Let X be a continuous random variable. There are two important expected values
that are used routinely to define continuous probability distributions.
- The mean of X, denoted µ_X, is defined as the expected value of X:
µ_X = E(X)
- The variance of X, denoted σ_X^2, is defined as the expectation of the
squared deviation, (X − µ_X)^2, of the random variable from its mean:
σ_X^2 = E[(X − µ_X)^2]
An alternative expression can be derived:
σ_X^2 = E(X^2) − µ_X^2
- The standard deviation of X, σ_X, is the square root of the variance.
82. Linear Functions of Variables
Let X be a continuous random variable with mean µ_X and
variance σ_X^2, and let a and b be any constant fixed numbers.
Define the random variable W as
W = a + bX
Then the mean and variance of W are
µ_W = E(a + bX) = a + bµ_X
and
σ_W^2 = Var(a + bX) = b^2 σ_X^2
and the standard deviation of W is
σ_W = |b|σ_X
83. Linear Functions of Variables
(continued)
An important special case of the previous results is the
standardized random variable
Z = \frac{X − µ_X}{σ_X}
which has a mean 0 and variance 1.
84. Reasons for Using the Normal
Distribution
1. The normal distribution closely approximates the
probability distributions of a wide range of random
variables.
2. Distributions of sample means approach a normal
distribution given a “large” sample size.
3. Computations of probabilities are direct and
elegant.
4. The normal probability distribution has led to good
business decisions for a number of applications.
86. Probability Density Function of
the Normal Distribution
The probability density function for a normally
distributed random variable X is
f(x) = \frac{1}{\sqrt{2πσ^2}} e^{−(x − µ)^2 / 2σ^2} for −∞ < x < ∞
where µ and σ^2 are any numbers such that −∞ < µ < ∞
and 0 < σ^2 < ∞, and where e and π are physical
constants, e = 2.71828. . . and π = 3.14159. . .
87. Properties of the Normal
Distribution
Suppose that the random variable X follows a normal distribution with
parameters µ and σ^2. Then the following properties hold:
- The mean of the random variable is µ:
E(X) = µ
- The variance of the random variable is σ^2:
E[(X − µ)^2] = σ^2
- The shape of the probability density function is a symmetric bell-shaped
curve centered on the mean µ, as shown in Figure 6.8.
- By knowing the mean and variance we can define the normal
distribution by using the notation
X ~ N(µ, σ^2)
88. Effects of µ on the Probability Density
Function of a Normal Random Variable
[Figure: two normal densities with the same variance and means 5 and 6; increasing µ shifts the curve along the x-axis without changing its shape.]
89. Effects of σ2 on the Probability Density
Function of a Normal Random Variable
[Figure: two normal densities with the same mean; the curve with variance 0.0625 is tall and narrow, while the curve with variance 1 is flatter and more spread out.]
90. Cumulative Distribution Function
of the Normal Distribution
Suppose that X is a normal random variable with mean
µ and variance σ 2 ; that is X~N(µ, σ 2). Then the
cumulative distribution function is
F(x0) = P(X ≤ x0)
This is the area under the normal probability density
function to the left of x0, as illustrated in Figure 6.10. As
for any proper density function, the total area under the
curve is 1; that is F(∞) = 1.
91. Shaded Area is the Probability that X
does not Exceed x0 for a Normal
Random Variable
92. Range Probabilities for Normal
Random Variables
Let X be a normal random variable with cumulative
distribution function F(x), and let a and b be two
possible values of X, with a < b. Then
P(a < X < b) = F(b) − F(a)
The probability is the area under the corresponding
probability density function between a and b.
94. The Standard Normal Distribution
Let Z be a normal random variable with mean 0 and
variance 1; that is
Z ~ N (0,1)
We say that Z follows the standard normal distribution.
Denote its cumulative distribution function by F(z), and let a
and b be two numbers with a < b; then
P(a < Z < b) = F(b) − F(a)
96. Finding Range Probabilities for Normally
Distributed Random Variables
Let X be a normally distributed random variable with mean µ
and variance σ 2. Then the random variable Z = (X - µ)/σ has a
standard normal distribution: Z ~ N(0, 1)
It follows that if a and b are any numbers with a < b, then
P(a < X < b) = P\left(\frac{a − µ}{σ} < Z < \frac{b − µ}{σ}\right) = F\left(\frac{b − µ}{σ}\right) − F\left(\frac{a − µ}{σ}\right)
where Z is the standard normal random variable and F(z) denotes
its cumulative distribution function.
97. Computing Normal Probabilities
A very large group of students obtains test scores that are
normally distributed with mean 60 and standard deviation 15.
What proportion of the students obtained scores between 85
and 95?
P(85 < X < 95) = P\left(\frac{85 − 60}{15} < Z < \frac{95 − 60}{15}\right)
= P(1.67 < Z < 2.33)
= F(2.33) − F(1.67)
= 0.9901 − 0.9525 = 0.0376
That is, 3.76% of the students obtained scores in the range 85 to 95.
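The calculation can be verified in Python using a standard normal CDF built from math.erf; the exact answer is closer to 0.0380, since the slide rounds the z-values to 1.67 and 2.33 before consulting the table:

```python
from math import erf, sqrt

def norm_cdf(x, mu, sigma):
    """Normal CDF via the error function: F(x) = 0.5 * (1 + erf(z / sqrt(2)))."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

# Test scores: X ~ N(60, 15^2); P(85 < X < 95).
p = norm_cdf(95, 60, 15) - norm_cdf(85, 60, 15)
print(round(p, 4))   # 0.038 (unrounded z-values; the table gives 0.0376)
```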
98. Approximating Binomial Probabilities
Using the Normal Distribution
Let X be the number of successes from n independent Bernoulli
trials, each with probability of success π. The number of successes,
X, is a Binomial random variable and if nπ(1 - π) > 9 a good
approximation is
P(a < X < b) = P\left(\frac{a − nπ}{\sqrt{nπ(1 − π)}} ≤ Z ≤ \frac{b − nπ}{\sqrt{nπ(1 − π)}}\right)
Or, if 5 < nπ(1 − π) < 9, we can use the continuity correction factor to
obtain
P(a ≤ X ≤ b) = P\left(\frac{a − 0.5 − nπ}{\sqrt{nπ(1 − π)}} ≤ Z ≤ \frac{b + 0.5 − nπ}{\sqrt{nπ(1 − π)}}\right)
where Z is a standard normal variable.
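A short Python sketch of the continuity correction; the numbers n = 40 and π = 0.15 are an arbitrary illustration chosen so that nπ(1 − π) = 5.1 falls in the 5-to-9 range mentioned above:

```python
from math import erf, sqrt

def std_normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

# Hypothetical case: n = 40 trials, pi = 0.15, so n*pi*(1 - pi) = 5.1.
# Approximate P(4 <= X <= 8) with the continuity correction.
n, pi = 40, 0.15
mu = n * pi                      # 6.0
sd = sqrt(n * pi * (1 - pi))     # about 2.258

z_lo = (4 - 0.5 - mu) / sd
z_hi = (8 + 0.5 - mu) / sd
print(round(std_normal_cdf(z_hi) - std_normal_cdf(z_lo), 4))  # about 0.732
```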
99. Covariance
Let X and Y be a pair of continuous random variables,
with respective means µ_X and µ_Y. The expected value of
(X − µ_X)(Y − µ_Y) is called the covariance between X and Y.
That is
Cov(X, Y) = E[(X − µ_X)(Y − µ_Y)]
An alternative but equivalent expression can be derived as
Cov(X, Y) = E(XY) − µ_X µ_Y
If the random variables X and Y are independent, then the
covariance between them is 0. However, the converse is
not true.
100. Correlation
Let X and Y be jointly distributed random variables. The
correlation between X and Y is
ρ = Corr(X, Y) = \frac{Cov(X, Y)}{σ_X σ_Y}
101. Sums of Random Variables
Let X1, X2, . . ., XK be K random variables with means µ1, µ2, . . .,
µK and variances σ1^2, σ2^2, . . ., σK^2. The following properties
hold:
- The mean of their sum is the sum of their means; that is
E(X1 + X2 + . . . + XK) = µ1 + µ2 + . . . + µK
- If the covariance between every pair of these random
variables is 0, then the variance of their sum is the sum of
their variances; that is
Var(X1 + X2 + . . . + XK) = σ1^2 + σ2^2 + . . . + σK^2
However, if the covariances between pairs of random
variables are not 0, the variance of their sum is
Var(X1 + X2 + . . . + XK) = σ1^2 + σ2^2 + . . . + σK^2 + 2\sum_{i=1}^{K−1} \sum_{j=i+1}^{K} Cov(Xi, Xj)
102. Differences Between a Pair of
Random Variables
Let X and Y be a pair of random variables with means µ_X and µ_Y and
variances σ_X^2 and σ_Y^2. The following properties hold:
- The mean of their difference is the difference of their means; that is
E(X − Y) = µ_X − µ_Y
- If the covariance between X and Y is 0, then the variance of their
difference is
Var(X − Y) = σ_X^2 + σ_Y^2
- If the covariance between X and Y is not 0, then the variance of their
difference is
Var(X − Y) = σ_X^2 + σ_Y^2 − 2Cov(X, Y)
103. Linear Combinations of Random
Variables
The linear combination of two random variables, X and Y, is
W = aX + bY
where a and b are constant numbers.
The mean for W is
µ_W = E[W] = E[aX + bY] = aµ_X + bµ_Y
The variance for W is
σ_W^2 = a^2 σ_X^2 + b^2 σ_Y^2 + 2ab Cov(X, Y)
or, using the correlation,
σ_W^2 = a^2 σ_X^2 + b^2 σ_Y^2 + 2ab Corr(X, Y) σ_X σ_Y
If both X and Y are jointly normally distributed random variables,
then the resulting random variable, W, is also normally distributed,
with mean and variance derived above.