2. Overview
• Definitions and Introduction to DPNs
• Learning from complete data
• Experimental Results
• Applications
3. Regular probabilistic networks (Bayesian networks) are
well established for representing probabilistic
relationships among many random variables.
Dynamic Probabilistic Networks (DPNs), however,
extend this representation to the modeling of
stochastic evolution of a set of random variables over
time.
(Think “probabilistic state machines”)
4. Notation
• Capital letters (X, Y, Z) - sets of variables
• Xi - a random variable in the set X
• Val(Xi) - finite set of values of Xi
• |Xi| - size of Val(Xi)
• Lowercase italic (x, y, z) - set instantiations
5. DPNs extend the common Bayesian network
representation to the case where the probability
distribution changes over time according to
some stochastic process.
Assume that X is a set of variables in a
PN whose values vary over time.
Then Xi[t] is the value of the variable Xi at time t,
and X[t] is the collection of such variables at time t.
6. For simplicity’s sake, we assume that the stochastic
process governing transitions is Markovian:
P(X[t+1] | X[0...t]) = P(X[t+1] | X[t])
That is, the probability of a certain instantiation is
dependent only upon its immediate predecessor.
7. We also assume the process is stationary, i.e.,
P(X[t+1] | X[t]) is independent of t.
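The two assumptions can be illustrated with a toy sketch (a hypothetical two-state weather chain, not from the slides): one fixed transition table plays the role of P(X[t+1] | X[t]) at every step (stationarity), and sampling the next state looks only at the current one (Markov property).

```python
import random

# Hypothetical two-state stationary Markov chain: the same table
# P(X[t+1] | X[t]) is used at every t (stationarity).
TRANSITION = {
    "rain": {"rain": 0.7, "sun": 0.3},
    "sun":  {"rain": 0.2, "sun": 0.8},
}

def step(state, rng):
    """Sample X[t+1] given only X[t]; earlier history is never consulted."""
    r = rng.random()
    acc = 0.0
    for nxt, p in TRANSITION[state].items():
        acc += p
        if r < acc:
            return nxt
    return nxt  # guard against floating-point rounding

rng = random.Random(0)
trajectory = ["rain"]
for t in range(5):
    trajectory.append(step(trajectory[-1], rng))
print(trajectory)
```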
8. Given these two assumptions, we can describe a DPN
representing the joint distribution over all possible
trajectories of a process using two parts:
A prior network B0 that specifies a distribution over
the initial states X[0]; and
A transition network B-> over the variables
X[0] ∪ X[1]
which specifies the transition probability
P(X[t+1] | X[t]) for all t.
9. A prior network (left) and transition network (right)
for a dynamic probabilistic network
10. In light of this structure, the joint distribution over the
history of the DPN up to time T is given as
PB(x[0...T]) =
PB0(x[0]) ∏(t=0...T-1) PB->(x[t+1] | x[t])
in other words, the prior probability of the initial state
times the product of the transition probabilities along
the trajectory.
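This factorization can be sketched directly in code. The prior and conditional tables below are assumed toy values for a single binary variable (they stand in for B0 and B->, which in general are full networks):

```python
import math

# Hypothetical tables for a single binary variable.
PRIOR = {0: 0.6, 1: 0.4}              # plays the role of P_B0(x[0])
TRANS = {0: {0: 0.9, 1: 0.1},         # plays the role of P_B->(x[t+1] | x[t])
         1: {0: 0.5, 1: 0.5}}

def joint_log_prob(x):
    """log P_B(x[0...T]) = log P_B0(x[0]) + sum_t log P_B->(x[t+1] | x[t])."""
    lp = math.log(PRIOR[x[0]])
    for t in range(len(x) - 1):
        lp += math.log(TRANS[x[t]][x[t + 1]])
    return lp

p = math.exp(joint_log_prob([0, 0, 1]))  # 0.6 * 0.9 * 0.1 = 0.054
```

Working in log space, as above, is the usual choice, since a long trajectory's probability is a product of many small factors.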
12. Common traditional methods:
search algorithms using scoring methods (BIC, BDe)
(given a dataset D)
DPN methods:
search algorithms using scoring methods!
(given a dataset D, consisting of Nseq observations)
13. So each entry in our dataset consists of an observation
of a set of variables over time.
The mth such sequence has length Nm and specifies
values for the variable set
Xm[0...Nm]
We then have Nseq instances of the initial state, and
N = ∑m Nm instances of transitions. We can use these
to learn the structure of the prior network and the
transition network, respectively.
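The split described above can be sketched as follows (the trajectories are made-up binary sequences; real data would assign values to a whole variable set at each time step):

```python
# Hypothetical observed trajectories of a single binary variable.
sequences = [
    [0, 0, 1, 1],
    [1, 0, 0],
]

# Nseq instances of the initial state -> training data for the prior network.
initial_states = [seq[0] for seq in sequences]

# N = sum_m Nm adjacent pairs -> training data for the transition network.
transitions = [(seq[t], seq[t + 1])
               for seq in sequences
               for t in range(len(seq) - 1)]

print(len(initial_states), len(transitions))
```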