1. This document presents Probabilistic Soft Logic (PSL), a probabilistic modeling language based on logic.
2. PSL uses rules to capture dependencies and constraints between continuous random variables represented as atoms. A PSL program consists of rules, sets, constraints, and atoms.
3. PSL has a mathematical foundation in constrained continuous Markov random fields and a logical foundation in generalized annotated logic programs. It supports collective probabilistic inference and learning over relational domains.
3. Probabilistic Soft Logic
Ontologies
[Diagram: example ontology of an IT organization. The Organization provides Services & Products (Software, Hardware, IT Services; Networks, Process Optim., ERP Systems) and sells to Customers, who buy and interact with it; Employees (Developer, Sales Person, Staff, Accountant) work for the organization and help customers. Edges denote sub-concept and relationship links; instance data not shown.]
4. Probabilistic Soft Logic
Ontologies
[Diagram: the same IT-organization ontology presented as a database schema; instances not shown.]
5. Probabilistic Soft Logic
Multiple Ontologies
[Diagram: two ontologies describing the same domain. One uses Organization, Service & Products, Customers, and Employees (Developer, Sales Person, Staff); the other uses Company, Products & Services, Customer, and Employee (Technician, Sales, Accountant), with relations such as develop, works for, sells, buys, interacts with, and helps.]
6. Probabilistic Soft Logic
Ontology Alignment [3]
[Diagram: the two ontologies from the previous slide, to be aligned.]
7. Probabilistic Soft Logic
Ontology Alignment
[Diagram: the two ontologies with candidate correspondences. Question: Match, Don't Match?]
8. Probabilistic Soft Logic
Ontology Alignment
[Diagram: the two ontologies with candidate correspondences. Question: Similar to what extent?]
9. Probabilistic Soft Logic
Personalized Medicine [2]
Example: Diagnosis and Treatment of Prostate Cancer
Joe Black
- Age: 51
- BMI: 27
- Diet: high in fat
- Rectal exam: no signs
- PSA (blood test): 5.2
- Mutations on: LMTK2, KLK3, JAZF1
- Discomfort when urinating
10. Probabilistic Soft Logic
Bob Black Joe Black
Died at age 79 Age: 51
Never diagnosed with father BMI: 27
prostate cancer Diet: high in fat
PSA levels: 3.2-8.9 Rectal exam: no signs
BMI: 23 PSA (blood test): 5.2
Mutations on: LMTK2, KLK3,
JAZF1
Frank Black Discomfort when urinating
Age: 48
BMI: 24
PSA: 3.1, 4.2, 4.9, 55 Mary Black
Biopsy: 8/12 positive brother wife Age: 45
Grade P1: 2-3, 60/40 BMI: 32
Grade P2: 4-5, 90/10 Diet: high in fat
Mutations on: Diagnosed with
LMTK2, KLK3, JAZF1, breast cancer,
CDH13 XMRV virus
detected
11. Probabilistic Soft Logic
Support Medical Decision Making
[Diagram: the same family history (Joe Black; Bob Black, father; Frank Black, brother; Mary Black, wife) used to support medical decision making.]
21. Probabilistic Soft Logic
Why PSL?
- Continuous Random Variables
- Mathematical Foundation
- Logic Foundation
- Inference & Learning
- Sets and Aggregators
- Extensible
- High Performance
22. Probabilistic Soft Logic
What is PSL?
Declarative language based on logic to express collective probabilistic inference problems
- Predicate = relationship or property
- (Ground) Atom = (continuous) random variable
- Rule = captures a dependency or constraint
- Set = defines aggregates
PSL Program = Rules, Sets, Constraints, Atoms
23. Probabilistic Soft Logic
Ontology Alignment
[Diagram: the two ontologies annotated with PSL atoms.]
- similar(A,B), written A≈B
  e.g. similar(Customer,Customers), written Customer≈Customers
- domain(C,D)
  e.g. domain(work for, Employees)
24. Probabilistic Soft Logic
Ontology Alignment
[Diagram: the two ontologies, with relations R and T and their domain concepts A and B highlighted.]
A≈B ← domainOf(R,A) ∧ domainOf(T,B) ∧ R≈T : 0.7
25. Probabilistic Soft Logic
Ontology Alignment
{A.subConcept} ≈ {B.subConcept} ← A≈B ∧ type(A,concept) ∧ type(B,concept) : 0.8
[Diagram: the two ontologies; the similarity of two concepts propagates to the sets of their sub-concepts.]
26. Probabilistic Soft Logic
Ontology Alignment
Constraints on the similarity relation:
- similar := partial-functional
- similar := inverse partial-functional
[Diagram: the two ontologies; each concept may align with at most one concept on the other side.]
29. Probabilistic Soft Logic
CCMRF
Constrained Continuous Markov Random Field
- Markov Random Field: undirected, entropy-maximizing
- Continuous (random variables)
- Constrained (domain)
30. Probabilistic Soft Logic
CMRF
Random variables, their ranges, and the domain of the MRF:
X = {X_1, …, X_n}, with D_i ⊂ ℝ and D = ×_{i=1}^n D_i
Feature (compatibility) kernels and parameters:
φ = {φ_1, …, φ_m}, φ_j : D → [0, M];  Λ = {λ_1, …, λ_m}
The probability measure P over X is defined through the density function
f(x) = (1 / Z(Λ)) · exp[ −Σ_{j=1}^m λ_j φ_j(x) ]
with partition function
Z(Λ) = ∫_D exp[ −Σ_{j=1}^m λ_j φ_j(x) ] dx
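As a toy illustration of these definitions, the unnormalized density and a grid approximation of the partition function can be sketched in Python. The two kernels, their weights, and the domain [0,1]^2 are invented for illustration and are not taken from the slides:

```python
import math

def unnorm_density(x, kernels, lambdas):
    """Unnormalized CCMRF density: exp(-sum_j lambda_j * phi_j(x))."""
    return math.exp(-sum(lam * phi(x) for lam, phi in zip(lambdas, kernels)))

# Toy example (assumption): two kernels over D = [0,1]^2 mapping into [0, M].
kernels = [lambda x: abs(x[0] - x[1]),        # penalize disagreement of X1, X2
           lambda x: max(0.0, 0.5 - x[0])]    # penalize X1 below 0.5
lambdas = [2.0, 1.0]

# Approximate the partition function Z(Lambda) by a Riemann sum over a grid.
N = 100
cell = (1.0 / N) ** 2
Z = sum(unnorm_density(((i + 0.5) / N, (j + 0.5) / N), kernels, lambdas) * cell
        for i in range(N) for j in range(N))

def density(x):
    return unnorm_density(x, kernels, lambdas) / Z

# States incurring less penalty have higher density:
assert density((0.9, 0.9)) > density((0.1, 0.9))
```

In practice the integral defining Z(Λ) is intractable in high dimensions, which is exactly why the deck later turns to MAP inference and sampling.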
31. Probabilistic Soft Logic
CCMRF: Constraints [5]
Equality constraints:
A(x) = a, where A : D → ℝ^{k_A}, a ∈ ℝ^{k_A}
Inequality constraints:
B(x) ≤ b, where B : D → ℝ^{k_B}, b ∈ ℝ^{k_B}
Restricted domain:
D̃ = D ∩ {x | A(x) = a ∧ B(x) ≤ b}
Adjusted CCMRF:
f(x) = 0 for all x ∉ D̃
35. Probabilistic Soft Logic
Rules Ground Atoms
[3 ]
H1. ... Hm ô B1 , B2 ,... Bn
h
Atoms are real valued
- Interpretation I, atom A: I(A) [0,1]
- We will omit the interpretation and write A [0,1]
h is a combination function
- Arbitrary T-norms: [0,1]n Ø [0,1]
Based on the theory of Generalized Annotated
Logic Programs (GAP) [Kifer Subrahmanian ‘92]
- But restricted to real values
35
36. Probabilistic Soft Logic
Rules
H_1, …, H_m ←_h B_1, B_2, …, B_n
h is a combination function
- Lukasiewicz T-norm
⊕(h_1, h_2) = min(1, h_1 + h_2)
⊗(h_1, h_2) = max(0, h_1 + h_2 − 1)
We use the Lukasiewicz T-norm in the following.
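A minimal sketch of these two operators in Python; the n-ary forms follow from associativity of the Lukasiewicz operators:

```python
def luk_or(*vals):
    """Lukasiewicz t-conorm (disjunction): min(1, h1 + ... + hn)."""
    return min(1.0, sum(vals))

def luk_and(*vals):
    """Lukasiewicz t-norm (conjunction): max(0, h1 + ... + hn - (n - 1))."""
    return max(0.0, sum(vals) - (len(vals) - 1))

# Conjunction only exceeds 0 when the summed truth "budget" is large enough:
assert abs(luk_and(0.7, 0.8) - 0.5) < 1e-9
assert luk_and(0.2, 0.3) == 0.0
# Disjunction saturates at 1:
assert luk_or(0.7, 0.8) == 1.0
```

These operators are piecewise linear in their arguments, which is what later makes MAP inference a convex optimization problem.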
40. Probabilistic Soft Logic
Rule Weights
R: H_1, …, H_m ←_w B_1, B_2, …, B_n
Weighted distance to satisfaction:
- d(R,I) = w · max(⊗(B_1, …, B_n) − ⊕(H_1, …, H_m), 0)
41. Probabilistic Soft Logic
Rule Weights
R: H_1, …, H_m ←_w B_1, B_2, …, B_n
Weighted distance to satisfaction:
- d(R,I) = w · max(⊗(B_1, …, B_n) − ⊕(H_1, …, H_m), 0)
Every ground rule R in a PSL program P contributes a compatibility kernel φ_R = d(R,I) to the CCMRF associated with P.
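Under the Lukasiewicz operators, the weighted distance to satisfaction can be sketched as follows; the rule representation (lists of head and body truth values) is a toy encoding, not PSL's actual API:

```python
def luk_or(*vals):
    return min(1.0, sum(vals))

def luk_and(*vals):
    return max(0.0, sum(vals) - (len(vals) - 1))

def distance(w, heads, bodies):
    """Weighted distance to satisfaction:
    d(R,I) = w * max(AND(bodies) - OR(heads), 0)."""
    return w * max(luk_and(*bodies) - luk_or(*heads), 0.0)

# A rule whose body is nearly true but whose head is mostly false incurs a
# large distance; raising the head's truth value drives the distance to zero.
assert abs(distance(2.0, heads=[0.3], bodies=[0.9, 0.8]) - 0.8) < 1e-9
assert distance(2.0, heads=[0.8], bodies=[0.9, 0.8]) == 0.0
```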
46. Probabilistic Soft Logic
MAP Inference [3]
Most Probable Interpretation
- Most likely truth value assignment given some facts.
argmax_I f(I | P)  ⇔  argmin_I d(P, I)
47. Probabilistic Soft Logic
MAP Inference Theory
Exact PSL inference in polynomial time
- Convex optimization problem, due to our choices of combination functions
O(n^3.5) inference
- Second-Order Cone Program
- n = number of (active) ground rules
- Efficient commercial optimization packages
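In full generality this is a second-order cone program handed to an off-the-shelf solver. For intuition only, here is a one-variable sketch showing that the MAP objective (the sum of weighted distances to satisfaction) is piecewise linear and convex, minimized here by a simple grid scan; the two ground rules and their weights are invented for illustration:

```python
# Toy MAP inference sketch (assumption: a single free atom x, all other
# atoms fixed by evidence). This is not PSL's actual SOCP solver.

def objective(x):
    # Rule 1 (weight 0.8): body fully true, head is x -> d = 0.8 * max(1 - x, 0)
    # Rule 2 (weight 0.3): prior pushing x toward 0   -> d = 0.3 * max(x, 0)
    return 0.8 * max(1.0 - x, 0.0) + 0.3 * max(x, 0.0)

# Minimize the convex, piecewise-linear objective by scanning [0, 1].
grid = [i / 1000 for i in range(1001)]
x_map = min(grid, key=objective)
assert x_map == 1.0  # the heavier rule wins
```

A grid scan obviously does not scale; the point is only that each max(..., 0) term is convex, so the sum is convex and the minimizer is the MAP state.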
48. Probabilistic Soft Logic
Inference Algorithm
Each ground rule constitutes a linear or conic constraint, introducing a rule-specific "dissatisfaction" variable that is added to the objective function.
49. Probabilistic Soft Logic
Inference Algorithm
Conservative grounding: most rules trivially have satisfaction distance 0. Save time and space by not grounding them out in the first place.
Don't reason about it if you don't absolutely have to!
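A hedged sketch of the activity criterion behind this idea: given current truth values, only ground rules whose distance to satisfaction can be nonzero are kept; everything else is never materialized. The (heads, bodies) tuple representation is invented for illustration:

```python
def luk_or(*vals):
    return min(1.0, sum(vals))

def luk_and(*vals):
    return max(0.0, sum(vals) - (len(vals) - 1))

def is_active(heads, bodies):
    """A ground rule is active iff its distance to satisfaction is nonzero,
    i.e. the body's truth exceeds the head's under current values."""
    return luk_and(*bodies) - luk_or(*heads) > 0.0

# Candidate ground rules as (heads, bodies) truth values (toy example):
groundings = [
    ([0.2], [0.9, 0.9]),  # violated           -> ground it out
    ([0.9], [0.3, 0.4]),  # trivially satisfied -> skip
    ([1.0], [1.0, 1.0]),  # satisfied           -> skip
]
active = [g for g in groundings if is_active(*g)]
assert len(active) == 1
```

The real system applies this test before materializing groundings at all, rather than filtering an already-grounded list as done here.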
50. Probabilistic Soft Logic
Parallelizing MAP Inference [4]
MAP inference is O(n^3.5)
- Limited scalability
Achieve scalability by dividing the inference problem into smaller "chunks"
- Allows for parallelization and distribution of workload
- Similar to message passing, but on entire subgraphs of the factor graph
52. Probabilistic Soft Logic
Factor Graph
[Diagram: factor graph over the atoms vote(Mary,Dem), vote(Jane,Dem), and vote(John,Dem), with the ground rules
vote(John,Dem) ← vote(Mary,Dem) ∧ spouse(John,Mary) : 0.8
vote(John,Dem) ← vote(Jane,Dem) ∧ friend(John,Jane) : 0.3]
Idea: Partition the dependency graph into strongly connected components and solve MAP on each independently
53. Probabilistic Soft Logic
Approximate Algorithm
1. Ground out the factor graph conservatively
2. Partition the dependency graph using a modularity-maximizing clustering algorithm
- Inspired by Blondel et al. [06]
- Aggregate rule weights
3. Compute MAP on each cluster, fixing the confidence values of outside atoms
4. Go to 1 until the change in I ≤ Θ
54. Probabilistic Soft Logic
[Diagram: example social network spanning several institutions (UMD CS, UC CS, UMD Physics, Universita Calabria in Italy, SDU Odense Physics and Social Science in Denmark, University MD Social Science), people (e.g. Prof Jones, Prof Baneri, Prof Calero, Prof Dooley, Prof Roma, Prof Smith, Prof Olsen, Prof Lund, Prof Larsen, Jamie Lock, Karl Oede, John Doe, students S2 and S3), papers ("ABC", "HIJ", "UVW", "XYZ"), and events (ASONAM 09, KPLLC 09), connected by relations such as author, member, faculty, dean, department, friends, colleagues, attended, presented, submitted, accepted, organized, visited, student of, comment, and collaborates.]
55. Probabilistic Soft Logic
Scalability
[Chart: Exact vs. Approximate Algorithm Running Times. X-axis: # compatibility kernels in graph (0-80,000); Y-axis: time in seconds (0-16,000). Series: Exact Algorithm; Approximate Algorithm with Parameters A.]
56. Probabilistic Soft Logic
Accuracy
[Chart: Relative Error compared to Exact Inference. X-axis: number of compatibility kernels (0-80,000); Y-axis: percentage relative error (0-7%). Series: Parameters A through E.]
57. Probabilistic Soft Logic
Runtime
[Chart: Running Time Comparison of Approximate Algorithm. X-axis: number of compatibility kernels (0-80,000); Y-axis: time in seconds (0-500). Series: Parameters A through E.]
58. Probabilistic Soft Logic
Accuracy on very large Graphs
[Chart: Relative Error Comparison. X-axis: number of compatibility kernels (3.5E+05 to 5.6E+06, log scale); Y-axis: percentage relative error (0-6%). Series: Parameters B through E.]
59. Probabilistic Soft Logic
Runtime on very large Graphs
[Chart: Runtime Comparison, log-log scale. X-axis: number of compatibility kernels (3.5E+05 to 5.6E+06); Y-axis: time in seconds (400-40,000). Series: Parameters A through E. Callout: 2M edges in 48 min.]
60. Probabilistic Soft Logic
Computing Marginals [5]
For a subset of RVs X′ ⊂ X (RV = atom)
- In our case X′ = {X_i}
Compute the marginal density function
f_{X′}(x′) = ∫ f(x′, y) dy, integrating y over ×D_i for all i such that X_i ∉ X′ (restricted to D̃)
[Plot: marginal density f of the atom Technician≈Developer over [0,1].]
62. Probabilistic Soft Logic
Computing Marginals in Theory
Computing the marginal probability density function for a subset X′ ⊂ X under the probability measure defined by a CCMRF is #P-hard in the worst case.
- Related to volume computation of polytopes, based on [Broecheler et al. '09]
63. Probabilistic Soft Logic
Sampling Scheme
Approximate the marginal distributions using an MCMC sampling scheme restricted to the convex polytope defined by D̃
- Again, inspired by work on volume computation
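A minimal hit-and-run sketch restricted to a convex polytope (box constraints plus linear inequalities a·x ≤ b). The polytope, starting point, and step count are toy choices, and this samples the uniform distribution over the polytope rather than the CCMRF density the real scheme targets:

```python
import random

def hit_and_run(x, ineqs, steps, rng):
    """Uniform hit-and-run inside [0,1]^n intersected with {x : a.x <= b}."""
    samples = []
    for _ in range(steps):
        d = [rng.gauss(0.0, 1.0) for _ in x]      # random direction
        lo, hi = float("-inf"), float("inf")
        # Chord limits from the box 0 <= x_i + t*d_i <= 1
        for xi, di in zip(x, d):
            if di > 1e-12:
                lo, hi = max(lo, -xi / di), min(hi, (1.0 - xi) / di)
            elif di < -1e-12:
                lo, hi = max(lo, (1.0 - xi) / di), min(hi, -xi / di)
        # Chord limits from each inequality a.x <= b
        for a, b in ineqs:
            ad = sum(ai * di for ai, di in zip(a, d))
            slack = b - sum(ai * xi for ai, xi in zip(a, x))
            if ad > 1e-12:
                hi = min(hi, slack / ad)
            elif ad < -1e-12:
                lo = max(lo, slack / ad)
        t = rng.uniform(lo, hi)                   # uniform point on the chord
        x = [xi + t * di for xi, di in zip(x, d)]
        samples.append(x)
    return samples

rng = random.Random(0)
ineqs = [([1.0, 1.0], 1.0)]  # toy constraint: x1 + x2 <= 1
samples = hit_and_run([0.25, 0.25], ineqs, 200, rng)
assert all(s[0] + s[1] <= 1 + 1e-9 for s in samples)
```

Every step stays inside the feasible polytope by construction, which is the property the slide's restricted MCMC scheme relies on.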
72. Probabilistic Soft Logic
Sampling in Theory
Theorem: The complexity of computing an approximate distribution σ* using the hit-and-run sampling scheme, such that the total variation distance of σ* and P is less than ε, is
O*( ñ³ (k_B + ñ + m) )
where ñ = n − k_A, under the assumption that we start from an initial distribution σ such that the density function dσ/dP is bounded by M except on a set S with σ(S) ≤ ε/s.
[Lovasz & Vempala '04]
73. Probabilistic Soft Logic
Sampling in Practice
Starting distribution = MAP state
How do we get out of corners?
74. Probabilistic Soft Logic
Sampling in Practice
How do we get out of corners?
Use the relaxation method [Agmon '54] to solve a system of linear inequalities and find a feasible direction d:
d_{i+1} = d_i + 2 · (z_k − W_k^T d_i) / ‖W_k‖² · W_k
[Diagram: reflection step off a violated constraint, with tolerances ε_1 and ε_2.]
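A hedged sketch of that reflection update for a system of inequalities W_k·d ≥ z_k; the system below is invented, and the factor 2 makes each step a reflection across the violated hyperplane:

```python
def relax(d, rows, iters=100, tol=1e-9):
    """Agmon-style relaxation: repeatedly reflect d across the most
    violated constraint W_k . d >= z_k until all are satisfied."""
    for _ in range(iters):
        # Most violated constraint (largest positive residual z_k - W_k . d)
        k, resid = max(
            ((i, z - sum(w * x for w, x in zip(W, d)))
             for i, (W, z) in enumerate(rows)),
            key=lambda p: p[1])
        if resid <= tol:
            return d  # feasible direction found
        W, _ = rows[k]
        norm2 = sum(w * w for w in W)
        # d_{i+1} = d_i + 2 * (z_k - W_k.d_i) / ||W_k||^2 * W_k
        d = [x + 2.0 * resid / norm2 * w for x, w in zip(d, W)]
    return d

# Toy system: d1 >= 0.5 and d2 >= 0.5, starting from the corner (0, 0).
rows = [([1.0, 0.0], 0.5), ([0.0, 1.0], 0.5)]
d = relax([0.0, 0.0], rows)
assert all(sum(w * x for w, x in zip(W, d)) >= z - 1e-9 for W, z in rows)
```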
81. Probabilistic Soft Logic
OAEI comparison [3]
[Chart: F1 scores (0-1) on the OAEI benchmark for PSL and other systems.]
Other results as reported by the benchmark participants.
82. Probabilistic Soft Logic
Attribute Similarity Functions
A≈B ← A.name ≈x B.name
Maximum flexibility for attribute similarity
Customization to particular problem domains
- Camel case is common in web ontologies
Users can define arbitrary similarity functions ≈x to be integrated into PSL
- e.g. string similarity measures such as Levenshtein
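As a sketch, one such ≈x could map Levenshtein edit distance into a [0,1] similarity; the normalization by the longer string is a common choice, not necessarily the one PSL ships with:

```python
def levenshtein(s, t):
    """Classic dynamic-programming edit distance."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, 1):
        cur = [i]
        for j, ct in enumerate(t, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (cs != ct)))   # substitution
        prev = cur
    return prev[-1]

def name_similarity(a, b):
    """Map edit distance into [0,1] (assumed normalization)."""
    longest = max(len(a), len(b), 1)
    return 1.0 - levenshtein(a, b) / longest

# "Customer" vs "Customers": one insertion, so the names are highly similar.
assert levenshtein("Customer", "Customers") == 1
assert name_similarity("Customer", "Customers") > 0.8
```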
83. Probabilistic Soft Logic
Sets in PSL
{A.subConcept} ≈ {B.subConcept} ← A≈B ∧ type(A,concept) ∧ type(B,concept) : 0.8
[Diagram: the two ontologies; the similarity of two concepts propagates to the sets of their sub-concepts.]
84. Probabilistic Soft Logic
Explicit Set Treatment
A≈B ← {A.subConcept} ≈{} {B.subConcept}
Reason about the similarity of sets of entities
Allows integrating aggregate measures
Default set equality measure (Jaccard-type):
X ≈ Y = 2 · Σ_{x∈X} Σ_{y∈Y} (x≈y) / (|X| + |Y|)
- Allows users to define alternative set equalities
• Based on inference engine
• Initially, PSL provides some predefined set overlap measures
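The default measure can be sketched as follows; the clamp to [0,1] is my assumption (the raw sum can exceed 1 for large overlapping sets), and atom_sim stands in for the pairwise x≈y truth values:

```python
def set_similarity(X, Y, atom_sim):
    """Jaccard-type soft set similarity:
    2 * sum_{x in X, y in Y} (x ~ y) / (|X| + |Y|), clamped to [0,1]."""
    if not X and not Y:
        return 1.0  # two empty sets: treat as fully similar (assumption)
    total = sum(atom_sim(x, y) for x in X for y in Y)
    return min(1.0, 2.0 * total / (len(X) + len(Y)))

# With a crisp 0/1 pairwise similarity this reduces to counting matches:
exact = lambda x, y: 1.0 if x == y else 0.0
assert set_similarity({"Developer", "Staff"}, {"Technician", "Staff"}, exact) == 0.5
assert set_similarity({"Staff"}, {"Staff"}, exact) == 1.0
```

Plugging in a soft pairwise measure (such as the Levenshtein-based one above the rule) yields graded set similarities instead of hard match counts.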
85. Probabilistic Soft Logic
Support for Sets
Using relational syntax…
- X.name, X.father, X.friend (a friend)
- Binary predicates only
…makes it easier to specify sets
- {X.friend}: all friends
- {X.friend.friend}: all second-level friends
Inverse of a binary relation
- X.knows(inv) (who knows X?)
Union, Intersection
- {X.knows} ∪ {X.knows(inv)} = {Y.knows} ∪ {Y.knows(inv)}
86. Probabilistic Soft Logic
Utility of Sets in PSL
Compare set vs. non-set versions of rules on a synthetic ontology alignment benchmark
A≈B ← {A.subConcept} ≈{} {B.subConcept}
vs.
A≈B ← A.subConcept ≈ B.subConcept
89. Probabilistic Soft Logic
Probabilistic Query Analysis
[Diagram: query types on a continuum. At one end, Marginal Distribution over 1 atom (decision-level perspective; examples: collective classification, link prediction; interesting: constraints); in between, Most Probable Sub-World; at the other end, Most Probable World over the entire world (system-level perspective; examples: image denoising, complex system configuration, Ising model).]
90. Probabilistic Soft Logic
Probabilistic Query Analysis
[Diagram: the same query-type continuum, with the marginal-distribution end labeled "Decision-driven".]
91. Probabilistic Soft Logic
Probabilistic Query Analysis
[Diagram: the same query-type continuum; the region between the two ends is marked "Cannot be meaningfully analyzed".]
92. Probabilistic Soft Logic
Probabilistic Query Analysis
[Diagram: the same query-type continuum, contrasting decision-driven modeling at the marginal-distribution end with modeling at the most-probable-world end.]
93. Probabilistic Soft Logic
Decision-Driven Modeling (DDM) [2]
Predicates are typed as probability distributions
- e.g. Bernoulli distributions, parameterized by p ∈ [0,1]
Atoms are RVs over parameterized distributions
This defines a second-order probability distribution via a CCMRF
Allows integration of external classifiers
- Important, e.g., in personalized medicine
Aggregation of evidence
- Can handle sets and other continuous aggregations
94. Probabilistic Soft Logic
Experiments: Wikipedia [3]
Wikipedia Category Prediction
- 2460 featured documents
- Links, talks
- Predict: category(D,C): Bernoulli
- 2 setups: seed, split
[Diagram: documents connected by link and talk edges.]
95. Probabilistic Soft Logic
Wikipedia Rules
hasCat(A,C) ← hasCat(B,C) ∧ A!=B ∧ unknown(A) ∧ document(A,T) ∧ document(B,U) ∧ similarText(T,U)
hasCat(A,C) ← hasCat(B,C) ∧ unknown(A) ∧ link(A,B) ∧ A!=B
hasCat(D,C) ← talk(D,A) ∧ talk(E,A) ∧ hasCat(E,C) ∧ unknown(D) ∧ D!=E
96. Probabilistic Soft Logic
Wikipedia: External Classifier
[Chart: F1 (0.6-0.8) vs. number of training documents (250-750). Series: Attributes Only; Attributes + Links; Attributes + Links + Talks.]
97. Probabilistic Soft Logic
Wikipedia: Seed Classification
[Chart: F1 (0.2-0.7) vs. percentage of seed documents: 0.15 (220), 0.2 (290), 0.25 (370), 0.3 (440). Series: Attributes only; Attributes + Links; Attributes + Links + Talks.]
98. Probabilistic Soft Logic
Confidence Analysis [5]
Analyze the confidence in a prediction by computing its marginal density function in the second-order probability distribution
- What does the density function look like around the MAP state?
A novel aspect in SRL
[Plot: two contrasting marginal density functions f of Category(Doc1,Theory) over [0,1].]
101. Probabilistic Soft Logic
PSL Implementation
Implemented in Java / Groovy
Declarative model definition and imperative model interaction
~40k LOC, but still alpha
Performance oriented
- Database backend
- Memory-efficient data structures
- High-performance solver integration
102. Probabilistic Soft Logic
System Overview
[Diagram: system architecture.
Input Model: probabilistic rules and similarity (e.g. A≈B ← similarID(A.name,B.name); {A.subClass}≈{B.subClass} ← A≈B), constraints (e.g. partial functional: ≈), and similarity functions (e.g. similarID(A,B) = new SimFun(){})
Input Data: graph, preprocessed and stored in an RDBMS
Groovy PSL programming environment: logic grounding into a factor graph
Optimization toolbox: reasoner + learning, similarity functions
Analysis and evaluation tools, framework
Output: inference result]
105. Probabilistic Soft Logic
Conclusion
Simple and expressive formalism to reason about similarity and uncertainty collectively
- Sets, aggregates, external functions
Scalable due to a continuous rather than combinatorial formulation
Future: structure learning, extending the framework, additional use cases.
109. Probabilistic Soft Logic
Presented Work
[5] Computing marginal distributions over continuous Markov networks for statistical relational learning, Matthias Broecheler and Lise Getoor, Advances in Neural Information Processing Systems (NIPS) 2010
[4] A Scalable Framework for Modeling Competitive Diffusion in Social Networks, Matthias Broecheler, Paulo Shakarian, and V.S. Subrahmanian, International Conference on Social Computing (SocialCom) 2010, Symposium Section
[3] Probabilistic Similarity Logic, Matthias Broecheler, Lilyana Mihalkova, and Lise Getoor, Conference on Uncertainty in Artificial Intelligence 2010
[2] Decision-Driven Models with Probabilistic Soft Logic, Stephen H. Bach, Matthias Broecheler, Stanley Kok, and Lise Getoor, NIPS Workshop on Predictive Models in Personalized Medicine 2010
[1] Probabilistic Similarity Logic, Matthias Broecheler and Lise Getoor, International Workshop on Statistical Relational Learning 2009
This presentation also covers joint work with Paulo Shakarian and Dr. V.S. Subrahmanian.
110. Probabilistic Soft Logic
References
Introduction to Statistical Relational Learning, Lise Getoor and Ben Taskar, MIT Press, 2007
Theory of generalized annotated logic programming and its applications, Michael Kifer and V.S. Subrahmanian, Journal of Logic Programming, Volume 12, Issue 4, April 1992
Using Histograms to Better Answer Queries to Probabilistic Logic Programs, Matthias Broecheler, Gerardo I. Simari, and V.S. Subrahmanian, International Conference on Logic Programming 2009
Hit-and-run from a corner, L. Lovasz and S. Vempala, ACM Symposium on Theory of Computing, 2004