Probabilistic Soft Logic
Matthias Bröcheler, Lilyana Mihalkova, Stephen Bach, Stanley Kok, and Lise Getoor
(Title image © woodleywonderworks)

Applications
 1.  Ontology Alignment
 2.  Personalized Medicine
 3.  Diffusion Modeling

Ontologies
[Figure: an example ontology. Concepts: Organization, Service & Products, Customers, Employees. Relationships: provides, work for, buys, interacts, develops, sells to, helps. Sub-concepts: Software, Hardware, IT Services, Developer, Sales Person, Staff, Networks, Process Optim., ERP Systems, Accountant. Instance data not shown.]

Ontologies
[Figure: the same ontology, with the added label "Database Schema". Instances not shown.]

Multiple Ontologies
[Figure: two ontologies, one above the other.
 First ontology. Concepts: Organization, Service & Products, Customers, Employees. Relationships: provides, work for, buys, interacts, develops, helps, sells to. Sub-concepts: Software, Hardware, IT Services, Developer, Sales Person, Staff.
 Second ontology. Concepts: Company, Products & Services, Customer, Employee. Relationships: develop, works for, buys, interacts with, sells, helps. Sub-concepts: Software Dev, Hardware, Consulting, Technician, Sales, Accountant.]

Ontology Alignment [3]
[Figure: the two ontologies from the previous slide (Organization / Service & Products / Customers / Employees and Company / Products & Services / Customer / Employee).]

Ontology Alignment
Match, Don’t Match?
[Figure: the same two ontologies.]

Ontology Alignment
Similar to what extent?
[Figure: the same two ontologies.]

Personalized Medicine [2]
Example: Diagnosis and Treatment of Prostate Cancer
   Joe Black
    -  Age: 51
    -  BMI: 27
    -  Diet: high in fat
    -  Rectal exam: no signs
    -  PSA (blood test): 5.2
    -  Mutations on: LMTK2, KLK3, JAZF1
    -  Discomfort when urinating

   Joe Black
    -  Age: 51
    -  BMI: 27
    -  Diet: high in fat
    -  Rectal exam: no signs
    -  PSA (blood test): 5.2
    -  Mutations on: LMTK2, KLK3, JAZF1
    -  Discomfort when urinating
   Bob Black (father)
    -  Died at age 79
    -  Never diagnosed with prostate cancer
    -  PSA levels: 3.2-8.9
    -  BMI: 23
   Frank Black (brother)
    -  Age: 48
    -  BMI: 24
    -  PSA: 3.1, 4.2, 4.9, 55
    -  Biopsy: 8/12 positive
    -  Grade P1: 2-3, 60/40
    -  Grade P2: 4-5, 90/10
    -  Mutations on: LMTK2, KLK3, JAZF1, CDH13
   Mary Black (wife)
    -  Age: 45
    -  BMI: 32
    -  Diet: high in fat
    -  Diagnosed with breast cancer, XMRV virus detected

Support Medical Decision Making
[Figure: the same four records (Joe Black, his father Bob, his brother Frank, and his wife Mary), overlaid with the caption above.]

Diffusion in Social Networks [4]
  Diffusion is a widely studied dynamic of social networks
     -  Epidemiology
        •  SIR Disease Model
     -  Marketing
        •  Viral Marketing
     -  Health
        •  Obesity Study
     -  Campaign Management
        •  Opinion Leaders
(Image © Christakis, Fowler)

Data is available
     -  500 million users
     -  50M tweets / day
(Image © Ludwig Gatzke)

Voter Opinion Modeling
[Figure: a voter whose opinion is unknown ("?").]
[Figure, next build: the same voter with the labels "Status update", "Tweet", and "$".]
[Figure, next build: a social network whose edges are labeled spouse, colleague, and friend.]

What’s the commonality?
   Collective Probabilistic Reasoning in Relational Domains
   Statistical Relational Learning [Getoor & Taskar ’07]

SRL Alphabet Soup
   PSL?

Why PSL?
     Continuous Random Variables
        Mathematical Foundation
        Logic Foundation
        Inference & Learning
     Sets and Aggregators
     Extensible
     High Performance

What is PSL?
 Declarative language based on logic to express collective probabilistic inference problems
   -  Predicate = relationship or property
   -  (Ground) Atom = (continuous) random variable
   -  Rule = capture dependency or constraint
   -  Set = define aggregates
 PSL Program = Rules, Sets, Constraints, Atoms

Ontology Alignment
[Figure: the two ontologies, annotated with example predicates and atoms.]
   -  similar(A,B), written [A≈B]; for example similar(Customer,Customers), written [Customer≈Customers]
   -  domain(C,D); for example domain(work for, Employees)

Ontology Alignment
[Figure: the two ontologies, annotated with the rule below.]
     R≈T ← domainOf(R,A) ∧ domainOf(T,B) ∧ A≈B ∧ R≠T : 0.7

Ontology Alignment
[Figure: the two ontologies, annotated with a set-based rule of weight 0.8. Rule text as laid out on the slide:]
     {A.subConcept}≈{B.subConcept} ← A≠B
     A≈B   type(A,concept)   type(B,concept) : 0.8

Ontology Alignment
[Figure: the two ontologies, annotated with the constraints below.]
     similar := partial-functional
             := inverse partial-functional

Voter Opinion Modeling
[Figure: a social network whose edges are labeled spouse, colleague, and friend.]
     vote(A,P) ∧ friend(B,A) → vote(B,P) : 0.3
     vote(A,P) ∧ spouse(B,A) → vote(B,P) : 0.8

Mathematical Foundation

CCMRF
 Constrained Continuous Markov Random Field
      Markov Random Field
       -  Undirected
       -  Entropy-maximizing
      Continuous (Random Variables)
      Constrained (Domain)

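CMRF
   Random variables: $X = \{X_1, \ldots, X_n\}$
   Range of each variable: $X_i : D_i \subset \mathbb{R}$
   Domain of the MRF: $D = \times_{i=1}^{n} D_i$
   Feature or compatibility kernels: $\phi = \{\phi_1, \ldots, \phi_m\}$ with $\phi_j : D \to [0, M]$
   Parameters: $\Lambda = \{\lambda_1, \ldots, \lambda_m\}$
 Probability measure P over X defined through
   Density function: $f(x) = \frac{1}{Z(\Lambda)} \exp\left[-\sum_{j=1}^{m} \lambda_j \phi_j(x)\right]$
   Partition function: $Z(\Lambda) = \int_{D} \exp\left[-\sum_{j=1}^{m} \lambda_j \phi_j(x)\right] dx$
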
CCMRF: Constraints [5]
   Equality constraints: $A(x) = a$ where $A : D \to \mathbb{R}^{k_A}$, $a \in \mathbb{R}^{k_A}$
   Inequality constraints: $B(x) \le b$ where $B : D \to \mathbb{R}^{k_B}$, $b \in \mathbb{R}^{k_B}$
   Restricted domain: $\tilde{D} = D \cap \{x \mid A(x) = a \wedge B(x) \le b\}$
   Adjusted CCMRF: $f(x) = 0 \;\; \forall x \notin \tilde{D}$

Geometric Intuition
[Figure: 3-D plot with axes X1, X2, X3, each marked from 0 to 1.]
   X = {X1, X2, X3}
   Constraint: x1 + x3 ≤ 1
   φ1(x) = x1
   φ2(x) = max(0, x1 − x2)
   φ3(x) = max(0, x2 − x3)
   Λ = {1, 2, 1}

Geometric Intuition
[Figure: 3-D plot with axes X1, X2, X3, each marked from 0 to 1, with the region labeled "Highest Probability".]
   X = {X1, X2, X3}
   Constraint: x1 + x3 ≤ 1
   φ1(x) = x1
   φ2(x) = max(0, x1 − x2)
   φ3(x) = max(0, x2 − x3)
   Λ = {1, 2, 1}

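The three-variable example can be checked numerically. The following is a minimal sketch, not the authors' implementation: the kernels, weights, and constraint are copied from the slide, while the function names, the grid resolution, and the brute-force search are assumptions made for illustration. The partition function Z(Λ) is left out, so the values are only proportional to the density.

```python
import itertools
import math

# Kernels and weights copied from the slide; everything else is illustrative.
KERNELS = [
    lambda x: x[0],                    # phi1(x) = x1
    lambda x: max(0.0, x[0] - x[1]),   # phi2(x) = max(0, x1 - x2)
    lambda x: max(0.0, x[1] - x[2]),   # phi3(x) = max(0, x2 - x3)
]
WEIGHTS = [1.0, 2.0, 1.0]              # Lambda = {1, 2, 1}


def feasible(x):
    """Restricted domain: x in [0,1]^3 and x1 + x3 <= 1 (small float tolerance)."""
    return all(0.0 <= v <= 1.0 for v in x) and x[0] + x[2] <= 1.0 + 1e-12


def energy(x):
    """Weighted kernel sum: sum_j lambda_j * phi_j(x)."""
    return sum(w * phi(x) for w, phi in zip(WEIGHTS, KERNELS))


def unnormalized_density(x):
    """exp(-energy) inside the restricted domain, 0 outside it (Z is omitted)."""
    return math.exp(-energy(x)) if feasible(x) else 0.0


def grid_argmax(steps=10):
    """Brute-force search over a regular grid; only sensible for a few variables."""
    grid = [i / steps for i in range(steps + 1)]
    return max(itertools.product(grid, repeat=3), key=unnormalized_density)


best = grid_argmax()
print(best, unnormalized_density(best))
```

The maximizer is not unique here: every feasible point with x1 = 0 and x2 ≤ x3 has zero energy, so the search simply reports the first such grid point it meets.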
Logic Foundation
- Syntax & Semantics -

Rules [3]
     H1, ..., Hm ← B1, B2, ..., Bn     (combination function h; the Hi and Bj are ground atoms)

       Atoms are real valued
        -  Interpretation I, atom A: I(A) ∈ [0,1]
        -  We will omit the interpretation and write A ∈ [0,1]
       h is a combination function
        -  Arbitrary T-norms: [0,1]^n → [0,1]
       Based on the theory of Generalized Annotated Logic Programs (GAP) [Kifer & Subrahmanian ’92]
        -  But restricted to real values

Rules
     H1, ..., Hm ← B1, B2, ..., Bn     (combination function h)

       h is a combination function
         -  Lukasiewicz T-norm
            ⊕(h1, h2) = min(1, h1 + h2)
            ⊗(h1, h2) = max(0, h1 + h2 − 1)

     We use the Lukasiewicz T-norm in the following.

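The two Lukasiewicz operators are easy to state in code. The sketch below uses the standard definitions, with ⊗ as max(0, h1 + h2 − 1), and extends them to any number of arguments; the function names `luk_or` and `luk_and` are hypothetical and not part of any PSL API.

```python
def luk_or(*values):
    """Lukasiewicz t-co-norm (soft OR): min(1, sum of the inputs)."""
    return min(1.0, sum(values))


def luk_and(*values):
    """Lukasiewicz t-norm (soft AND): max(0, sum of the inputs - (n - 1))."""
    return max(0.0, sum(values) - (len(values) - 1))


# On crisp inputs the operators agree with Boolean OR and AND.
print(luk_and(1.0, 1.0), luk_and(1.0, 0.0))
print(luk_or(0.0, 0.0), luk_or(1.0, 0.0))

# On soft inputs they interpolate linearly and clip to [0, 1].
print(luk_and(0.7, 0.8), luk_or(0.7, 0.8))
```

Both functions are piecewise linear in their arguments, which is the property the later slides rely on when they call MAP inference a convex optimization problem.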
Satisfaction
     H1, ..., Hm ← B1, B2, ..., Bn
       Establish Satisfaction
         -  ⊕(H1,..,Hm) ≥ ⊗(B1,..,Bn)

     Example (interpretation implicit):
           R≈T:? ← A≈B:0.7 ∧ D≈E:0.8
     The body evaluates to ⊗(0.7, 0.8) = max(0, 0.7 + 0.8 − 1) = 0.5, so the rule is satisfied when the head is at least 0.5:
           R≈T:≥0.5 ← A≈B:0.7 ∧ D≈E:0.8

Distance to Satisfaction
     H1, ..., Hm ← B1, B2, ..., Bn
       Distance to Satisfaction
        -  max(⊗(B1,..,Bn) − ⊕(H1,..,Hm), 0)

        R≈T:0.7 ← A≈B:0.7 ∧ D≈E:0.8      distance 0.0
        R≈T:0.2 ← A≈B:0.7 ∧ D≈E:0.8      distance 0.3

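The two example distances can be reproduced directly from the definition. This is a small sketch with hypothetical helper names; it assumes the standard Lukasiewicz t-norm max(0, sum − (n − 1)) for the body and the t-co-norm min(1, sum) for the head.

```python
def luk_and(*values):
    """Lukasiewicz t-norm, applied to the body atoms."""
    return max(0.0, sum(values) - (len(values) - 1))


def luk_or(*values):
    """Lukasiewicz t-co-norm, applied to the head atoms."""
    return min(1.0, sum(values))


def distance_to_satisfaction(head, body):
    """max(body truth - head truth, 0); zero means the ground rule is satisfied."""
    return max(0.0, luk_and(*body) - luk_or(*head))


body = [0.7, 0.8]                 # A≈B: 0.7, D≈E: 0.8, so the body is 0.5
for head_value in (0.7, 0.2):     # the two values of R≈T used on the slide
    d = distance_to_satisfaction([head_value], body)
    print(f"R≈T = {head_value}: distance {round(d, 6)}")
```

With the head at 0.7 the rule is already satisfied, and with the head at 0.2 it falls short of the body value 0.5 by 0.3, matching the slide.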
Rule Weights
     R: H1, ..., Hm ← B1, B2, ..., Bn     (weight w)

      Weighted Distance to Satisfaction
       -  d(R,I) = w * max(⊗(B1,..,Bn) − ⊕(H1,..,Hm), 0)

      Every ground rule R in a PSL program P contributes a compatibility kernel φR = d(R,I) to the CCMRF associated with P.

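Geometric Intuition
[Figure: plane with axes R1 and R2, each marked from 0 to 1, with the distances d(R1,I) and d(R2,I) marked on them.]
[Figure, next build: the same plane with the axes marked w1 and w2 and the distances d(R1) and d(R2); label "norm = d(P,I)".]
   $P(I \mid P) = \frac{1}{Z(w)} \exp\left[-d(P, I)\right]$
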
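To make the link between weighted rule distances and P(I | P) concrete, here is a hedged sketch. It assumes that d(P, I) is the plain sum (L1 norm) of the weighted distances, which the slide does not state; the truth values in `I` are invented for illustration, and Z(w) is omitted, so the final score is only proportional to the probability.

```python
import math


def luk_and(*values):
    """Lukasiewicz t-norm for rule bodies."""
    return max(0.0, sum(values) - (len(values) - 1))


def luk_or(*values):
    """Lukasiewicz t-co-norm for rule heads."""
    return min(1.0, sum(values))


def weighted_distance(weight, head, body):
    """d(R, I) = w * max(body truth - head truth, 0)."""
    return weight * max(0.0, luk_and(*body) - luk_or(*head))


def program_distance(ground_rules):
    """Assumed L1 combination: the sum of all weighted distances."""
    return sum(weighted_distance(w, h, b) for w, h, b in ground_rules)


# Hypothetical interpretation I: one truth value per ground atom.
I = {
    "vote(Mary,Dem)": 0.9, "spouse(John,Mary)": 1.0,
    "vote(Jane,Dem)": 0.4, "friend(John,Jane)": 1.0,
    "vote(John,Dem)": 0.5,
}

# Ground rules as (weight, head values, body values), weights 0.8 and 0.3 as in the deck.
ground_rules = [
    (0.8, [I["vote(John,Dem)"]], [I["vote(Mary,Dem)"], I["spouse(John,Mary)"]]),
    (0.3, [I["vote(John,Dem)"]], [I["vote(Jane,Dem)"], I["friend(John,Jane)"]]),
]

d = program_distance(ground_rules)
score = math.exp(-d)   # proportional to P(I | P)
print(round(d, 6), round(score, 6))
```

Only the spouse rule is violated in this interpretation (body 0.9 against head 0.5), so it alone contributes to d(P, I).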
Geometric Intuition
[Figure repeated: the three-variable example with X = {X1, X2, X3}, kernels φ1(x) = x1, φ2(x) = max(0, x1 − x2), φ3(x) = max(0, x2 − x3), weights Λ = {1, 2, 1}, constraint x1 + x3 ≤ 1, and the region labeled "Highest Probability".]

Inference
- MAP & Marginals -

MAP Inference [3]
      Most Probable Interpretation
       -  Most likely truth value assignment given some facts.

                   argmax_I P(I | P)  ⇔  argmin_I d(P, I)

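The equivalence between the two problems follows in one line from the density given earlier in the deck, P(I | P) = exp[−d(P, I)] / Z(w). The derivation below is a sketch that uses only that formula:

```latex
\[
\arg\max_{I} P(I \mid P)
  \;=\; \arg\max_{I} \frac{1}{Z(w)} \exp\left[-d(P, I)\right]
  \;=\; \arg\max_{I} \exp\left[-d(P, I)\right]
  \;=\; \arg\min_{I} d(P, I)
\]
```

The first step drops Z(w), which is positive and does not depend on I. The second uses the fact that exp(−t) is strictly decreasing in t, so maximizing it is the same as minimizing t.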
MAP Inference Theory
      Exact PSL inference in polynomial time
       -  Convex optimization problem due to our choices in combination functions

      O(n^3.5) inference
       -  Second Order Cone Program
       -  n = number of (active) ground rules
       -  Efficient commercial optimization packages

Inference Algorithm
      Each ground rule constitutes a linear or conic constraint, introducing a rule-specific “dissatisfaction” variable that is added to the objective function.

      Conservative Grounding: Most rules trivially have satisfaction distance = 0. Save time and space by not grounding them out in the first place.

      Don’t reason about it if you don’t absolutely have to!

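Conservative grounding can be illustrated on the voter rule vote(A,P) ∧ friend(B,A) → vote(B,P). The sketch below is a simplification under stated assumptions: atoms missing from the dictionaries are treated as having truth value 0, truth values are fixed rather than updated during inference, and all names and data are hypothetical. A ground rule whose body evaluates to 0 has distance 0 whatever the head is, so it is skipped.

```python
from itertools import product


def luk_and(*values):
    """Lukasiewicz t-norm for the rule body."""
    return max(0.0, sum(values) - (len(values) - 1))


def ground_conservatively(people, parties, vote, friend):
    """Ground vote(A,P) ∧ friend(B,A) → vote(B,P), keeping only non-trivial groundings."""
    grounded = []
    for a, b, p in product(people, people, parties):
        if a == b:
            continue
        body = luk_and(vote.get((a, p), 0.0), friend.get((b, a), 0.0))
        if body > 0.0:   # a zero body can never be violated, so skip it
            grounded.append((("vote", a, p), ("friend", b, a), ("vote", b, p)))
    return grounded


# Hypothetical data: only a few atoms have non-zero truth values.
people = ["John", "Mary", "Jane", "Bob"]
parties = ["Dem"]
vote = {("Mary", "Dem"): 0.9, ("Jane", "Dem"): 0.4}
friend = {("John", "Jane"): 1.0, ("Bob", "Mary"): 1.0}

grounded = ground_conservatively(people, parties, vote, friend)
print(len(grounded), "of", len(people) * (len(people) - 1) * len(parties), "groundings kept")
```

In a full system the set of active atoms would grow as inference raises truth values, so grounding and inference would alternate; this sketch shows a single pass only.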
Parallelizing MAP Inference [4]
  MAP inference is O(n^3.5)
     -  Limited scalability
  Achieve scalability by dividing the inference problem into smaller “chunks”
     -  Allows for parallelization and distribution of workload
     -  Similar to message-passing but on entire subgraphs of the factor graph

Factor Graph
[Figure: factor graph with atom nodes vote(Mary,Dem), vote(Jane,Dem), and vote(John,Dem), connected through the two ground rules below.]
     vote(Mary,Dem) ∧ spouse(John,Mary) → vote(John,Dem) : 0.8
     vote(Jane,Dem) ∧ friend(John,Jane) → vote(John,Dem) : 0.3

     Idea: Partition the dependency graph into strongly connected components and solve MAP on each independently

Approximate Algorithm
 1.  Ground out factor graph conservatively
 2.  Partition dependency graph using a modularity-maximizing clustering algorithm
     -  Inspired by Blondel et al. [06]
     -  Aggregate rule weights
 3.  Compute MAP on each cluster, fixing confidence values of outside atoms
 4.  Go to 1 until change in I < Θ

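The loop in steps 3 and 4 can be sketched as block-coordinate minimization. This is an illustration under several assumptions, not the algorithm from the talk: the partition is written by hand instead of being produced by modularity clustering, each cluster is solved by brute-force grid search instead of a conic solver, the grounding is fixed instead of being redone on every pass, and the rules, weights, and atom names are hypothetical.

```python
import itertools


def luk_and(*values):
    """Lukasiewicz t-norm for rule bodies."""
    return max(0.0, sum(values) - (len(values) - 1))


# Hypothetical ground rules: (weight, body atoms, head atom).
RULES = [
    (0.8, ["vote(Mary)"], "vote(John)"),
    (0.5, ["vote(John)"], "vote(Jane)"),
    (0.3, ["vote(Jane)"], "vote(Bob)"),
]


def total_distance(I):
    """Sum of weighted distances to satisfaction over all ground rules."""
    return sum(w * max(0.0, luk_and(*(I[b] for b in body)) - I[head])
               for w, body, head in RULES)


def solve_cluster(I, cluster, steps=10):
    """Grid-search MAP over one cluster, with every outside atom held fixed."""
    grid = [i / steps for i in range(steps + 1)]

    def cost(values):
        return total_distance({**I, **dict(zip(cluster, values))})

    best = min(itertools.product(grid, repeat=len(cluster)), key=cost)
    return dict(zip(cluster, best))


def partitioned_map(I, clusters, theta=1e-6, max_sweeps=50):
    """Sweep over the clusters until the total change in I drops below theta."""
    I = dict(I)
    for _ in range(max_sweeps):
        change = 0.0
        for cluster in clusters:
            update = solve_cluster(I, cluster)
            change += sum(abs(update[a] - I[a]) for a in cluster)
            I.update(update)
        if change < theta:
            break
    return I


# vote(Mary) is observed and appears in no cluster, so it stays fixed.
start = {"vote(Mary)": 1.0, "vote(John)": 0.0, "vote(Jane)": 0.0, "vote(Bob)": 0.0}
clusters = [["vote(John)"], ["vote(Jane)", "vote(Bob)"]]
result = partitioned_map(start, clusters)
print(result, total_distance(result))
```

On this small chain the sweeps reach an interpretation with zero total distance. In general, fixing outside atoms makes each pass approximate, which is why the loop repeats until the change falls below the threshold.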
[Figure: example relational graph. Node labels: USA, Italy, UMD CS, UC CS, Social Science, Prof Jones, Prof Baneri, Prof Calero, Prof Dooley, Paper “ABC”. Edge labels: dean, author, member, in, comment, faculty, friends, department in, presented, attended.]
                                                                                                                                                                                                                                                                                                                                                                                                                                                           department
                                                 University
                                                    MD




                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         Universita
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          Calabria
                                                                                          department in




                                                                                                                                                                                                                                                                                                                                                                                                            dean


                                                                                                                                                                                                                                                                                                               ASONAM
                                                                                                                                                                                                                                                                                                                 09




                                                                                                                                                                                                    attended




     faculty


                                                                                                                                                                                                                                                                                                                                                                          submitted
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          Prof
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         Roma




                                                                                                                                                                                                                                                                       author



                                                                                                                                         UMD
                                                                                                                                        Physics



                                                                                                                                                                                                                                                                                                                                                                                                                                       author




                                                          member




                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        visited

                                                                                                                                                                           organized




                                                                                                                                                                                       accepted                                                                                                                                                                                                                                                                                                                   friends




                                                                                                                    author




                                                                                                                                                                                                                                                                                 KPLLC                                                                                                                                                                            Paper
                                                                                                                                                                                                                                                                                  09                                                                                                                                                                             “UVW”


                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         S3



                                                                            Prof
                                                                           Smith




                                                                                                                                                                                                                                                                                                                                                                                      Paper
                                                                                                                                                                                                                                                                                                                                                                                      “HIJ”


                                                                                                                                                                                                                                                                                         submitted




                                                                                                                                                                                                    Paper
                                                                                                                                                                                                    “XYZ”




                                                                                                                                                                                                                                                                                                                                                                                                                             comment




                                                                                                                                                                                                                                            attended




                                                                                                                                                                                                            comment




                                    student of




S2




                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            student of
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          Prof
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          Olsen




                                                                                                                                                  collaborates



                                                                                                                                                                                                                                                                                                                                                 Prof
                                                                                                                                                                                                                                                                                                                                                 Lund
                                                                                                                                                                                                                                                                                                                                                                                                   member




                                                                                                                                                                                                                                                                                                                                                                                                                                                                        dean



                                                                                                                                                                                                                                                                    Prof
                                                                                                                                                                                                                                                                   Larsen




                                                                                                                                                                                                                                                                                                     faculty



                                                                   Jamie
                                                                    Lock
                                                                                                                                                                                           member




                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  Karl
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  Oede




                                                                                                                                                                                                                                                                                                                                                                                                                                                                                Social
                                                                                                                                                                                                                                                                                                                                                                                                                                                                               Science


                                                                                                                                                                 visited




                                                                                                                                                                                                                                  Odense                                                                                                                 SDU
                                                                                                                                                                                                                                  Physics                                                                                                               Odense



     colleagues
                                                                                                                             John
                                                                                                                             Doe
                                                                                                                                                                                                                                                                    department




                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              Denmark


 Scalability
 [Figure: Exact vs Approximate Algorithm Running Times. x-axis: # Compatibility Kernels in Graph (0 to 80000); y-axis: Time in Seconds (0 to 16000); series: Exact Algorithm, Approximate Algorithm with Parameters A]






 Accuracy
 [Figure: Relative Error compared to Exact Inference. x-axis: Number of Compatibility Kernels (0 to 80000); y-axis: Percentage Relative Error (0% to 7%); series: Parameters A, B, C, D, E]





 Runtime
 [Figure: Running Time Comparison of Approximate Algorithm. x-axis: Number of Compatibility Kernels (0 to 80000); y-axis: Time in Seconds (0 to 500); series: Parameters A, B, C, D, E]




 Accuracy on very large Graphs
 [Figure: Relative Error Comparison. x-axis: Number of Compatibility Kernels (3.5E+05 to 5.6E+06, log scale); y-axis: Percentage Relative Error (0% to 6%); series: Parameters B, C, D, E]


 Runtime on very large Graphs
 [Figure: Runtime Comparison. x-axis: Number of Compatibility Kernels (3.5E+05 to 5.6E+06); y-axis: Time in Seconds (400 to 40000); log-log scale; series: Parameters A, B, C, D, E; annotation: 2M edges in 48 min]


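 Computing Marginals [5]
  For a subset of RVs $X' \subset X$ (RV = atom)
    -  In our case $X' = \{X_i\}$
  Compute the marginal density function

     $f_{X'}(x') = \int_{y \in \times \tilde{D}_i \ \text{s.t.}\ X_i \notin X'} f(x', y)\, dy$

 [Figure: density f plotted over [0, 1] for the atom Technician≈Developer]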
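
 As a rough illustration of the marginalization above, the following sketch integrates a toy two-variable density over the unit square on a midpoint grid. The density `toy_density`, its `weight` parameter, and the helper `marginal_on_grid` are hypothetical names introduced only for this example; they are not taken from the PSL implementation.

```python
import numpy as np

def marginal_on_grid(unnormalized_density, n=200):
    """Approximate the marginal of the first variable on [0, 1].

    Uses a midpoint grid on the unit square, so an integral over [0, 1]
    is approximated by the mean of the grid values.
    """
    pts = (np.arange(n) + 0.5) / n
    x, y = np.meshgrid(pts, pts, indexing="ij")
    f = unnormalized_density(x, y)
    f = f / f.mean()              # normalize: integral over the square is 1
    return pts, f.mean(axis=1)    # integrate the second variable out

def toy_density(x, y, weight=5.0):
    # Hypothetical hinge-style penalty: density drops when x exceeds y.
    return np.exp(-weight * np.maximum(0.0, x - y))

grid, marginal_x = marginal_on_grid(toy_density)
```

 Grid integration like this only works for a handful of variables, which is one way to see why the later slides turn to sampling.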


 Geometric Intuition
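 [Figure: feasible region over the axes X1, X2, X3 (each from 0 to 1), with the density f plotted over [0, 1] and the probability P(0.4 ≤ X2 ≤ 0.6) marked]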



 Computing Marginals in Theory
 Computing the marginal probability density function for a subset $X' \subset X$
   under the probability measure defined by a CCMRF is #P-hard in the worst case.
     -  Related to volume computation of polytopes; based on [Broecheler et al., ‘09]




 Sampling Scheme
  Approximate the marginal distributions using an MCMC sampling scheme
    restricted to the convex polytope defined by $\tilde{D}$
     -  Again, inspired by work on volume computation






 Histogram Sampling

 [Figure: histogram over the axis Xi]


 Random Ball Walk

 Need to sample from the density restricted to the ball: difficult

 [Figure: ball of radius r around the current point p, with candidate points q1 and q2]


 Hit-and-Run

 [Figure, shown over three slides: current point p, direction d through p, and next point q on the line along d]

 Compute the density function induced on the line and sample from it: easy
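
 The hit-and-run step can be sketched as follows. This is a simplified illustration, not the PSL sampler: it assumes the target is the uniform distribution over a bounded polytope {x : A x ≤ b}, so the density induced on the line is uniform on a segment. For a CCMRF density, the draw along the segment would have to follow the induced one-dimensional density instead. The function name `hit_and_run` and the unit-cube example are hypothetical.

```python
import numpy as np

def hit_and_run(A, b, x0, n_samples, rng):
    """Uniform hit-and-run inside the bounded polytope {x : A x <= b}.

    x0 must be strictly inside the polytope.
    """
    x = np.asarray(x0, dtype=float).copy()
    samples = np.empty((n_samples, x.size))
    for i in range(n_samples):
        # Pick a direction uniformly at random on the unit sphere.
        d = rng.normal(size=x.size)
        d /= np.linalg.norm(d)
        # x + t*d stays feasible iff t * (A d) <= b - A x for every row.
        slack = b - A @ x
        rate = A @ d
        pos = rate > 1e-12
        neg = rate < -1e-12
        t_hi = np.min(slack[pos] / rate[pos])   # first constraint hit forward
        t_lo = np.max(slack[neg] / rate[neg])   # first constraint hit backward
        # Uniform target: sample the next point uniformly on the segment.
        x = x + rng.uniform(t_lo, t_hi) * d
        samples[i] = x
    return samples

# Unit cube in three dimensions: x <= 1 and -x <= 0.
A = np.vstack([np.eye(3), -np.eye(3)])
b = np.concatenate([np.ones(3), np.zeros(3)])
rng = np.random.default_rng(0)
samples = hit_and_run(A, b, np.full(3, 0.5), 2000, rng)

# Histogram estimate of one marginal and of P(0.4 <= X2 <= 0.6).
hist, edges = np.histogram(samples[:, 1], bins=10, range=(0.0, 1.0), density=True)
p_mid = np.mean((samples[:, 1] >= 0.4) & (samples[:, 1] <= 0.6))
```

 The histogram over one coordinate of the samples is the kind of marginal estimate illustrated on the Histogram Sampling slides.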






 Sampling in Theory
 Theorem:
  The complexity of computing an approximate distribution σ* using the
   hit-and-run sampling scheme such that the total variation distance
   of σ* and P is less than ε is

     $O^*\left(\tilde{n}^3 (k_B + \tilde{n} + m)\right)$

   where $\tilde{n} = n - k_A$, under the assumptions that we start from an
   initial distribution σ such that the density function dσ/dP is bounded
   by M except on a set S with σ(S) ≤ ε/2
                                     [Lovász & Vempala ‘04]


 Sampling in Practice
  Starting distribution = MAP state
  How do we get out of corners?




   Use the relaxation method [Agmon ‘54] to solve the system of
    linear inequalities and find a feasible direction d:

       $d_{i+1} = d_i + 2 \, \frac{z_k - W_k d_i}{\|W_k\|^2} \, W_k^T$

 [Figure: illustration with labels ε1 and ε2]
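
 A minimal sketch of the relaxation update follows. It assumes the inequalities are written as W d ≤ z and that the most violated row k is chosen at each step; both are choices made for this example, since the slide does not state the sign convention or the selection rule. The function name `relaxation_feasible_point` is hypothetical.

```python
import numpy as np

def relaxation_feasible_point(W, z, d0, max_iter=1000, tol=1e-9):
    """Search for d with W d <= z by reflecting across violated hyperplanes."""
    d = np.asarray(d0, dtype=float).copy()
    norms_sq = np.sum(W * W, axis=1)
    for _ in range(max_iter):
        violation = W @ d - z
        # Most violated constraint, measured as distance to its hyperplane.
        k = int(np.argmax(violation / np.sqrt(norms_sq)))
        if violation[k] <= tol:
            return d   # every inequality holds
        # Update from the slide: d <- d + 2 (z_k - W_k d) / ||W_k||^2 * W_k^T
        d = d + 2.0 * (z[k] - W[k] @ d) / norms_sq[k] * W[k]
    raise RuntimeError("no feasible point found within max_iter")

# Example system: d1 >= 1, d2 >= 1, d1 + d2 <= 4, written as W d <= z.
W = np.array([[-1.0, 0.0], [0.0, -1.0], [1.0, 1.0]])
z = np.array([-1.0, -1.0, 4.0])
d_feasible = relaxation_feasible_point(W, z, np.zeros(2))
```

 The factor 2 makes each step a reflection across the violated hyperplane rather than a projection onto it, so the new point lands strictly on the feasible side of that constraint.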




 Algorithm Convergence
 [Figure: KL Divergence by Sample Size. x-axis: Number of Samples (30000 to 3000000); y-axis: KL Divergence (0.05 to 5); series: Average KL Divergence, Lowest Quartile KL Divergence, Highest Quartile KL Divergence]

 Averaged over 30 randomly snow-ball sampled folds
 Lowest Quartile = 322-413 atoms; Highest Quartile = 174-224 atoms


 Algorithm Performance
 [Figure: Runtime for 1000 Samples. x-axis: Number of Compatibility Kernels (0 to 10000); y-axis: Time in sec (0 to 35)]








Weight Learning


 Weight Learning [3]
       Given: Rules + Training Instance | Want: Weights

           $w^* = \arg\max_w \; P(I_T \mid P) - \|w\|^2$

       Approach: Maximize likelihood of observation by
         optimizing weights
       -  Plus prior on weights
       Problem: Cannot compute partition function Z
         tractably
       Workaround: Use MAP state to approximate Z
       -  Invoke reasoner during gradient computation
       -  BFGS and Perceptron implementations




Similarity Reasoning


 Experiments: Ontology Alignment
  OAEI Ontology Alignment
   Benchmark (2008)
     -  Real world ontologies (300s)
     -  Synthetic ontology pairs
     -  Approx 100 entities
     -  21 rules, modified standard string
         similarity measures





 OAEI comparison [3]

 [Figure: F1 Score (axis from 0 to 1) comparison on the OAEI benchmark]

 Other results as reported by the benchmark participants.


 Attribute Similarity Functions
     A≈B ⇐ A.name ≈x B.name
       Maximum flexibility for attribute similarity
       Customization to particular problem domains
        -  Camel-case common in web-ontologies
       Users can define arbitrary similarity functions
        ≈x to be integrated into PSL
        -  e.g. String similarity measures such as Levenshtein
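
As an illustration of a user-defined similarity function ≈x, here is a minimal sketch of a Levenshtein-based string similarity scaled to [0,1]. The function names and the normalization by the longer string's length are assumptions for illustration; this is not PSL's own similarity API.

```python
def levenshtein_distance(a: str, b: str) -> int:
    """Classic edit distance (insertions, deletions, substitutions)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def levenshtein_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1]: 1.0 for identical strings, 0.0 for maximally different ones."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # assumption: two empty names count as identical
    return 1.0 - levenshtein_distance(a, b) / longest


score = levenshtein_similarity("kitten", "sitting")  # edit distance 3 over length 7
```

A domain-specific variant could, for example, split camel-case names into tokens before comparing them.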





 Sets in PSL
         {A.subConcept}≈{B.subConcept} ⇐ A≠B ∧
            A≈B ∧ type(A,concept) ∧ type(B,concept) : 0.8
 [Diagram: two ontologies to be aligned.
  Ontology 1: Organization; Service & Products, Customers, Employees; sub-concepts Software, Hardware, IT Services, Developer, Sales Person, Staff; relations provides, work for, buys, interacts, develops, helps, sells to.
  Ontology 2: Company; Products & Services, Customer, Employee; sub-concepts Software Dev, Hardware, Consulting, Technician, Sales, Accountant; relations develop, works for, buys, interacts with, sells, helps.]


 Explicit Set Treatment
     A≈B ⇐ {A.subConcept} ≈{} {B.subConcept}

       Reason about the similarity of sets of entities
       Allows integrating aggregate measures
       Default set equality measure: Jaccard-type
              $X \approx Y = \frac{2 \sum_{x \in X} \sum_{y \in Y} (x \approx y)}{|X| + |Y|}$
        -  Allow users to define alternative set equalities
           •  Based on inference engine
           •  Initially, PSL provides some predefined set overlap measures
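
The default set measure can be sketched as follows, reading the formula as twice the summed pairwise similarity divided by |X| + |Y|. The function name, the pluggable `sim` callback, and the value returned for two empty sets are assumptions for illustration.

```python
from typing import Callable, Collection, TypeVar

T = TypeVar("T")


def set_similarity(
    xs: Collection[T],
    ys: Collection[T],
    sim: Callable[[T, T], float],
) -> float:
    """Twice the summed pairwise similarity, divided by |X| + |Y|."""
    total_size = len(xs) + len(ys)
    if total_size == 0:
        return 0.0  # assumption: undefined case mapped to 0
    pairwise = sum(sim(x, y) for x in xs for y in ys)
    return 2.0 * pairwise / total_size


def exact_match(x: str, y: str) -> float:
    """Crisp element similarity: 1.0 for equal elements, 0.0 otherwise."""
    return 1.0 if x == y else 0.0


overlap = set_similarity({"Software", "Hardware"}, {"Hardware", "Consulting"}, exact_match)
identical = set_similarity({"Software", "Hardware"}, {"Software", "Hardware"}, exact_match)
```

With crisp element similarity the value stays in [0,1]. With soft element similarities the raw sum can exceed 1 when many cross pairs are similar, so a real implementation may need to clip or normalize differently.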


 Support for Sets
       Using relational syntax…
        -  X.name, X.father, X.friend (a friend)
        -  Binary predicates only
       …makes it easier to specify sets
        -  {X.friend} - all friends
        -  {X.friend.friend} - all second level friends
       Inverse of binary relation
        -  X.knows(inv) (who knows X?)
       Union, Intersection
        -  {X.knows} ∪ {X.knows(inv)} = {Y.knows} ∪ {Y.knows(inv)}
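
The set expressions above can be mimicked over a plain binary relation. This is a minimal sketch with hypothetical names (`BinaryRelation`, `of`, `inv`) and toy data; it only illustrates what {X.friend}, {X.friend.friend}, and X.knows(inv) denote, not how PSL evaluates them.

```python
from collections import defaultdict
from typing import DefaultDict, Iterable, Set, Tuple


class BinaryRelation:
    """A binary predicate stored as forward and inverse adjacency sets."""

    def __init__(self, pairs: Iterable[Tuple[str, str]]) -> None:
        self._forward: DefaultDict[str, Set[str]] = defaultdict(set)
        self._inverse: DefaultDict[str, Set[str]] = defaultdict(set)
        for subject, obj in pairs:
            self._forward[subject].add(obj)
            self._inverse[obj].add(subject)

    def of(self, subjects: Iterable[str]) -> Set[str]:
        """{X.rel}: everything related to any of the given subjects."""
        result: Set[str] = set()
        for subject in subjects:
            result |= self._forward.get(subject, set())
        return result

    def inv(self, objects: Iterable[str]) -> Set[str]:
        """{X.rel(inv)}: everything that relates to any of the given objects."""
        result: Set[str] = set()
        for obj in objects:
            result |= self._inverse.get(obj, set())
        return result


knows = BinaryRelation([("ann", "bob"), ("bob", "cat"), ("cat", "ann"), ("dan", "ann")])

first_level = knows.of({"ann"})             # {ann.knows}
second_level = knows.of(knows.of({"ann"}))  # {ann.knows.knows}
who_knows_ann = knows.inv({"ann"})          # {ann.knows(inv)}
neighborhood = first_level | who_knows_ann  # {ann.knows} ∪ {ann.knows(inv)}
```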




 Utility of Sets in PSL
  Compare set vs non-set version
    of rules on synthetic ontology
    alignment benchmark

 A≈B ⇐ {A.subConcept} ≈{} {B.subConcept}
                     vs
 A≈B ⇐ A.subConcept ≈ B.subConcept




  Ontology Set Comparison
 [Two plots: F1 (0.4 to 1) vs. attribute noise (0 to 0.8) for Complete PSL and setFree PSL, one at structural noise 0.2 and one at structural noise 0.4]




Decision Making






         Probabilistic Query Analysis
 Decision-driven
 Query Type: continuum from Marginal Distribution (1 Atom) to Most Probable World (Entire World)
 Marginal Distribution (Decision Level Perspective)
    Examples: Collective Classification, Link Prediction
    Interesting: Constraints
 Middle of the continuum: Cannot be meaningfully analyzed
 Most Probable World / Most Probable Sub-World (System Level Perspective)
    Examples: Image Denoising, Complex System Configuration, Ising Model


         Probabilistic Query Analysis
 Query Type: continuum from Marginal Distribution (1 Atom) to Most Probable World (Entire World)
 Marginal Distribution (Decision Level Perspective)
    Examples: Collective Classification, Link Prediction
    Interesting: Constraints
 Decision-driven Modeling (between the two ends of the continuum)
 Most Probable World / Most Probable Sub-World (System Level Perspective)
    Examples: Image Denoising, Complex System Configuration, Ising Model


 Decision Driven Modeling (DDM) [2]
  Predicates are typed as probability distributions
     -  e.g. Bernoulli distributions, parameterized by p ∈ [0,1]

  Atoms are RVs over parameterized distributions
  Yields a second-order probability distribution
    defined by a CCMRF
  Allows integration of external classifiers
     -  Important, e.g. in personalized medicine
  Aggregation of evidence
     -  Can handle sets and other continuous aggregations



 Experiments: Wikipedia [3]
  Wikipedia Category Prediction
     -  2460 featured documents
     -  Links, talks
     -  Predict: category(D,C): Bernoulli
     -  2 setups: seed & split
  [Diagram: documents connected by "link" and "talk" edges]


 Wikipedia Rules
 hasCat(A,C) ⇐ hasCat(B,C) ∧ A!=B ∧ unknown(A) ∧ document(A,T) ∧
               document(B,U) ∧ similarText(T,U)

 hasCat(A,C) ⇐ hasCat(B,C) ∧ unknown(A) ∧ link(A,B) ∧ A!=B

 hasCat(D,C) ⇐ talk(D,A) ∧ talk(E,A) ∧ hasCat(E,C) ∧ unknown(D) ∧ A!=B
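
To make the link rule concrete, the sketch below enumerates the substitutions (A, B, C) for which the body of "hasCat(A,C) ⇐ hasCat(B,C) ∧ unknown(A) ∧ link(A,B) ∧ A!=B" holds on a toy dataset. The function name, the document identifiers, and the crisp (true/false) treatment of atoms are assumptions for illustration; PSL itself works with soft truth values and its own grounding framework.

```python
from typing import Dict, List, Set, Tuple


def ground_link_rule(
    has_cat: Dict[str, Set[str]],
    unknown: Set[str],
    links: Set[Tuple[str, str]],
) -> List[Tuple[str, str, str]]:
    """Return all (A, B, C) with link(A,B), unknown(A), hasCat(B,C), and A != B."""
    groundings: List[Tuple[str, str, str]] = []
    for a, b in sorted(links):
        if a == b or a not in unknown:
            continue  # body requires A != B and unknown(A)
        for category in sorted(has_cat.get(b, set())):
            groundings.append((a, b, category))
    return groundings


# Toy data (hypothetical): doc1 has no known category, doc2 and doc3 do.
has_cat = {"doc2": {"Theory"}, "doc3": {"Systems", "Theory"}}
unknown = {"doc1"}
links = {("doc1", "doc2"), ("doc1", "doc3"), ("doc2", "doc3"), ("doc1", "doc1")}

groundings = ground_link_rule(has_cat, unknown, links)
```

Each grounding is one piece of evidence that document A may share category C with the linked document B.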



 Wikipedia – External Classifier
 [Plot: F1 (0.6 to 0.8) vs. number of training documents (250 to 750) for three settings: Attributes Only, Attributes + Links, Attributes + Links + Talks]




 Wikipedia – Seed Classification
 [Plot: F1 (0.2 to 0.7) vs. percentage of seed documents (number of documents): 0.15 (220), 0.2 (290), 0.25 (370), 0.3 (440); series: Attributes only, Attributes + Links, Attributes + Links + Talks]




 Confidence Analysis [5]
  Analyze the confidence in a prediction by
    computing its marginal density function in the
    second order probability distribution
         -  What does the density function look like around the
            MAP state?
  Novel aspect in SRL
  [Figure: two marginal density functions f over [0,1] for Category(Doc1,Theory), shown side by side for comparison]



 Experiments: Confidence Analysis
   Split predictions into S+, S-
   Compute avg standard deviations σ+, σ− for each
   Compare: $\Delta(\sigma) = 2 \, \frac{\sigma_- - \sigma_+}{\sigma_+ + \sigma_-}$
   Hypothesis: ∆(σ) > 0

      Folds    P(Null Hypothesis)   Relative Difference Δ(σ)
       20          1.95E-09                  38.3%
       25          2.40E-13                  41.2%
       30          1.00E-16                  43.5%
       35          4.54E-08                  39.0%
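
The relative difference can be computed as in the following sketch. The function name and the sample standard deviations are made up for illustration; only the formula, twice the difference of the average standard deviations divided by their sum, comes from the slide.

```python
from statistics import mean
from typing import Sequence


def relative_difference(sigma_plus: Sequence[float], sigma_minus: Sequence[float]) -> float:
    """Delta(sigma) = 2 * (avg sigma- minus avg sigma+) / (avg sigma+ plus avg sigma-)."""
    avg_plus = mean(sigma_plus)
    avg_minus = mean(sigma_minus)
    return 2.0 * (avg_minus - avg_plus) / (avg_plus + avg_minus)


# Hypothetical per-prediction standard deviations for the two groups S+ and S-.
s_plus = [0.10, 0.12, 0.08]
s_minus = [0.14, 0.16, 0.15]

delta = relative_difference(s_plus, s_minus)
```

A positive value means the marginal densities of predictions in S- are, on average, more spread out than those in S+.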






PSL System Overview


PSL Implementation
 Implemented in Java / Groovy
 Declarative model definition and
   imperative model interaction
 ~40k LOC but still alpha
 Performance oriented
  -  Database backend
  -  Memory efficient data structures
  -  High performance solver integration
 System Overview
 [Diagram components]
  Input Data: Graph, RDBMS, Preprocessing
  Input Model (Groovy PSL Programming Environment): Probabilistic Similarity Logic
   Rules
     A≈B ⇐ similarID(A.name,B.name)
     {A.subClass}≈{B.subClass} ⇐ A≈B
   Constraints
     Partial functional: ≈
   Similarity Functions
     similarID(A,B) = new SimFun(){}
  Grounding Framework, Factor Graph
  Reasoner + Learning, Optimization Toolbox, Similarity Functions
  Analysis, Evaluation Tools
  Inference Result


Defining the Model


Interacting with the Model


 Conclusion
  Simple and expressive formalism to
    reason about similarity and
    uncertainty collectively
      -  Sets & aggregates, external functions
  Scalable due to continuous rather than
    combinatorial formulation
  Future: Structure learning, extend
    framework, additional use cases.




psl.umiacs.umd.edu




Questions?
Comments?




References & Bibliography


 Presented Work
 [5] Computing marginal distributions over continuous Markov networks for
     statistical relational learning, Matthias Broecheler and Lise Getoor,
     Advances in Neural Information Processing Systems (NIPS) 2010
 [4] A Scalable Framework for Modeling Competitive Diffusion in Social Networks,
     Matthias Broecheler, Paulo Shakarian, and V.S. Subrahmanian, International
     Conference on Social Computing (SocialCom) 2010, Symposium Section
 [3] Probabilistic Similarity Logic, Matthias Broecheler, Lilyana Mihalkova, and Lise
     Getoor, Conference on Uncertainty in Artificial Intelligence 2010
 [2] Decision-Driven Models with Probabilistic Soft Logic, Stephen H. Bach,
     Matthias Broecheler, Stanley Kok, and Lise Getoor, NIPS Workshop on Predictive
     Models in Personalized Medicine 2010
 [1] Probabilistic Similarity Logic, Matthias Broecheler and Lise Getoor,
     International Workshop on Statistical Relational Learning 2009


 This presentation also covers joint work with Paulo Shakarian and
   Dr. V.S. Subrahmanian.


 References
   Introduction to Statistical Relational Learning, Lise Getoor and Ben Taskar,
     MIT Press, 2007
   Theory of generalized annotated logic programming and its applications,
     Michael Kifer and V.S. Subrahmanian, Journal of Logic Programming, Volume
     12 Issue 4, April 1992
   Using Histograms to Better Answer Queries to Probabilistic Logic Programs,
    Matthias Broecheler, Gerardo I. Simari, and V.S. Subrahmanian, International
     Conference on Logic Programming 2009
   Hit-and-run from a corner, L. Lovasz and S. Vempala, ACM Symposium on
     Theory of computing, 2004





Probabilistic Soft Logic for Personalized Medicine

  • 1. Probabilistic Soft Logic. Matthias Bröcheler, Lilyana Mihalkova, Stephen Bach, Stanley Kok, and Lise Getoor. (Title image © woodleywonderworks.)
  • 2. Applications: 1. Ontology Alignment; 2. Personalized Medicine; 3. Diffusion Modeling.
  • 3. Ontologies. [Diagram: an ontology rooted at Organization with the concepts Service & Products, Customers, and Employees; sub-concepts Software, Hardware, IT Services, Developer, Sales Person, Staff, Networks, Process Optim., ERP Systems, and Accountant; relationships provides, work for, buys, interacts, develops, sells to, and helps. Legend: a dedicated edge style denotes the sub-concept relationship.] Instance data not shown!
  • 4. Ontologies. [Same diagram as slide 3, annotated "Database Schema".] Instances not shown!
  • 5. Multiple Ontologies. [Diagram: the ontology of slide 3 next to a second ontology rooted at Company with the concepts Products & Services, Customer, and Employee; sub-concepts Software Dev, Hardware, Consulting, Technician, Sales, and Accountant; relationships develop, works for, buys, interacts with, sells, and helps.]
  • 6. Ontology Alignment [3]. [Diagram: the two ontologies of slide 5 side by side.]
  • 7. Ontology Alignment. [Diagram: the two ontologies of slide 5 side by side.] Match, don’t match?
  • 8. Ontology Alignment. [Diagram: the two ontologies of slide 5 side by side.] Similar to what extent?
  • 9. Personalized Medicine [2]. Example: Diagnosis and Treatment of Prostate Cancer. Joe Black: Age: 51; BMI: 27; Diet: high in fat; Rectal exam: no signs; PSA (blood test): 5.2; Mutations on: LMTK2, KLK3, JAZF1; Discomfort when urinating.
  • 10. Joe Black's record (Age: 51; BMI: 27; Diet: high in fat; Rectal exam: no signs; PSA (blood test): 5.2; Mutations on: LMTK2, KLK3, JAZF1; Discomfort when urinating) together with the records of his relatives.
      Bob Black (father): Died at age 79; Never diagnosed with prostate cancer; PSA levels: 3.2-8.9; BMI: 23.
      Frank Black (brother): Age: 48; BMI: 24; PSA: 3.1, 4.2, 4.9, 55; Biopsy: 8/12 positive; Grade P1: 2-3, 60/40; Grade P2: 4-5, 90/10; Mutations on: LMTK2, KLK3, JAZF1, CDH13; XMRV virus detected.
      Mary Black (wife): Age: 45; BMI: 32; Diet: high in fat; Diagnosed with breast cancer.
  • 11. The same four records as on slide 10, with the goal: Support Medical Decision Making.
  • 12. Diffusion in Social Networks [4]. Diffusion is a widely studied dynamic of social networks: Epidemiology (SIR Disease Model); Marketing (Viral Marketing); Health (Obesity Study); Campaign Management (Opinion Leaders). (Image © Christakis, Fowler.)
  • 13. 500 million users; 50M tweets / day. Data is available. (Image © Ludwig Gatzke.)
  • 14. Voter Opinion Modeling: ?
  • 15. Voter Opinion Modeling: ? Evidence: Status update, Tweet.
  • 16. Voter Opinion Modeling. [Diagram: voters connected by spouse, colleague, and friend relationships.]
  • 17. What’s the commonality? Collective Probabilistic Reasoning in Relational Domains.
  • 18. What’s the commonality? Collective Probabilistic Reasoning in Relational Domains: Statistical Relational Learning [Getoor & Taskar ’07].
  • 19. SRL Alphabet Soup.
  • 20. SRL Alphabet Soup: PSL?
  • 21. Why PSL? Continuous Random Variables; Mathematical Foundation; Logic Foundation; Inference & Learning; Sets and Aggregators; Extensible; High Performance.
  • 22. What is PSL? Declarative language based on logics to express collective probabilistic inference problems. Predicate = relationship or property; (Ground) Atom = (continuous) random variable; Rule = capture dependency or constraint; Set = define aggregates. PSL Program = Rules, Sets, Constraints, Atoms.
  • 23. Ontology Alignment. Predicates: similar(A,B), written [A≈B], e.g. similar(Customer,Customers), written [Customer≈Customers]; domain(C,D), e.g. domain(work for, Employees). [Diagram: the two ontologies side by side.]
  • 24. Ontology Alignment. Rule: R≈T ⇐ domainOf(R,A) ∧ domainOf(T,B) ∧ A≈B ∧ R≠T : 0.7. [Diagram: the two ontologies side by side.]
  • 25. Ontology Alignment. Rule: {A.subConcept}≈{B.subConcept} ⇐ A≈B ∧ type(A,concept) ∧ type(B,concept) ∧ A≠B : 0.8. [Diagram: the two ontologies side by side.]
  • 26. Ontology Alignment. Constraints: similar := partial-functional and inverse partial-functional. [Diagram: the two ontologies side by side.]
  • 27. Voter Opinion Modeling. Rules: vote(A,P) ∧ friend(B,A) ⇒ vote(B,P) : 0.3; vote(A,P) ∧ spouse(B,A) ⇒ vote(B,P) : 0.8. [Diagram: voters connected by spouse, colleague, and friend relationships.]
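As an illustration of how the two weighted voter rules might be turned into ground rules, here is a minimal Python sketch. It is not the PSL grounding code: the function `ground_rules` and the dictionary layout are hypothetical, and the people and relations are borrowed from the factor-graph example later in the deck (John, Mary, Jane, party Dem).

```python
from itertools import product

# Rule templates from the slide, stored as (relation, weight). Each one reads
# vote(A,P) AND relation(B,A) => vote(B,P).
RULES = [("friend", 0.3), ("spouse", 0.8)]

# Observed relation(B, A) facts, taken from the deck's factor-graph example.
RELATIONS = {
    "friend": {("John", "Jane")},
    "spouse": {("John", "Mary")},
}
PARTIES = ["Dem"]


def ground_rules(rules, relations, parties):
    """Instantiate every rule template for every matching relation fact."""
    grounded = []
    for (relation, weight), party in product(rules, parties):
        for b, a in sorted(relations[relation]):
            grounded.append({
                "body": [("vote", a, party), (relation, b, a)],
                "head": ("vote", b, party),
                "weight": weight,
            })
    return grounded


ground = ground_rules(RULES, RELATIONS, PARTIES)
for rule in ground:
    print(rule["body"], "=>", rule["head"], ":", rule["weight"])
```

Each ground rule links the atoms in its body and head, which is the structure a factor graph over ground atoms would capture.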
  • 29. CCMRF: Constrained Continuous Markov Random Field. Markov Random Field: undirected, entropy-maximizing. Continuous (random variables). Constrained (domain).
  • 30. CMRF. Random variables: $X = \{X_1, \ldots, X_n\}$. Range of RVs: $D_i \subset \mathbb{R}$. Domain of MRF: $D = \times_{i=1}^{n} D_i$. Feature or compatibility kernels: $\phi = \{\phi_1, \ldots, \phi_m\}$ with $\phi_j : D \to [0, M]$. Parameters: $\Lambda = \{\lambda_1, \ldots, \lambda_m\}$. The probability measure $P$ over $X$ is defined through the density function $f(x) = \frac{1}{Z(\Lambda)} \exp\left[-\sum_{j=1}^{m} \lambda_j \phi_j(x)\right]$ with the partition function $Z(\Lambda) = \int_{D} \exp\left[-\sum_{j=1}^{m} \lambda_j \phi_j(x)\right] dx$.
  • 31. CCMRF: Constraints [5]. Equality constraints: $A(x) = a$ where $A : D \to \mathbb{R}^{k_A}$, $a \in \mathbb{R}^{k_A}$. Inequality constraints: $B(x) \le b$ where $B : D \to \mathbb{R}^{k_B}$, $b \in \mathbb{R}^{k_B}$. Restricted domain: $\tilde{D} = D \cap \{x \mid A(x) = a \wedge B(x) \le b\}$. Adjusted CCMRF: $f(x) = 0 \;\; \forall x \notin \tilde{D}$.
  • 32. Geometric Intuition. Example: $X = \{X_1, X_2, X_3\}$ with kernels $\phi_1(x) = x_1$, $\phi_2(x) = \max(0, x_1 - x_2)$, $\phi_3(x) = \max(0, x_2 - x_3)$, parameters $\Lambda = \{1, 2, 1\}$, and the constraint $x_1 + x_3 \le 1$. [Figure: the constrained domain drawn over the axes X1, X2, X3, each from 0 to 1.]
  • 33. Geometric Intuition. The same example as on slide 32, with the region of highest probability marked.
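The three-variable example can be evaluated numerically. The sketch below computes the unnormalized density exp[−Σ λj φj(x)] for the kernels and parameters on the slide and returns 0 outside the constrained domain. It assumes each variable ranges over [0, 1], as the figure axes suggest. The function names are illustrative, and the partition function Z is deliberately not computed.

```python
import math

LAMBDA = (1.0, 2.0, 1.0)  # parameters from the slide: Lambda = {1, 2, 1}


def kernels(x):
    """Compatibility kernels phi_1, phi_2, phi_3 from the slide."""
    x1, x2, x3 = x
    return (x1, max(0.0, x1 - x2), max(0.0, x2 - x3))


def in_domain(x, tol=1e-12):
    """Unit cube (assumed from the figure) intersected with x1 + x3 <= 1."""
    in_cube = all(-tol <= xi <= 1.0 + tol for xi in x)
    return in_cube and x[0] + x[2] <= 1.0 + tol


def unnormalized_density(x):
    """exp(-sum_j lambda_j * phi_j(x)) inside the domain, 0 outside."""
    if not in_domain(x):
        return 0.0
    energy = sum(lam * phi for lam, phi in zip(LAMBDA, kernels(x)))
    return math.exp(-energy)


for point in [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.6, 0.5, 0.6)]:
    print(point, unnormalized_density(point))
```

Points where all three kernels vanish (for example the origin) receive the largest unnormalized density, which matches the "highest probability" region marked in the figure.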
  • 34. Logic Foundation: Syntax, Semantics.
  • 35. Rules [3]. Form: H1 ∨ ... ∨ Hm ⇐ B1, B2, ..., Bn, where the Hi and Bj are ground atoms and h is the combination function attached to the rule. Atoms are real valued: for an interpretation I and atom A, I(A) ∈ [0,1]; we will omit the interpretation and write A ∈ [0,1]. h is a combination function: arbitrary T-norms, [0,1]^n → [0,1]. Based on the theory of Generalized Annotated Logic Programs (GAP) [Kifer Subrahmanian ‘92], but restricted to real values.
  • 36. Rules. Form: H1 ∨ ... ∨ Hm ⇐ B1, B2, ..., Bn with combination function h. Here h is the Lukasiewicz T-norm: ⊕(h1, h2) = min(1, h1 + h2) and ⊗(h1, h2) = max(0, h1 + h2 − 1). We use the Lukasiewicz T-norm in the following.
  • 37. Satisfaction. A rule H1 ∨ ... ∨ Hm ⇐ B1, B2, ..., Bn is satisfied when ⊕(H1,..,Hm) ≥ ⊗(B1,..,Bn). Example: R≈T : ? ⇐ A≈B : 0.7 ∧ D≈E : 0.8. Interpretation implicit!
  • 38. Satisfaction. A rule H1 ∨ ... ∨ Hm ⇐ B1, B2, ..., Bn is satisfied when ⊕(H1,..,Hm) ≥ ⊗(B1,..,Bn). Example: R≈T : ≥0.5 ⇐ A≈B : 0.7 ∧ D≈E : 0.8. Interpretation implicit!
  • 39. Distance to Satisfaction. For a rule H1 ∨ ... ∨ Hm ⇐ B1, B2, ..., Bn the distance to satisfaction is max(⊗(B1,..,Bn) − ⊕(H1,..,Hm), 0). Examples: R≈T : 0.7 ⇐ A≈B : 0.7 ∧ D≈E : 0.8 has distance 0.0; R≈T : 0.2 ⇐ A≈B : 0.7 ∧ D≈E : 0.8 has distance 0.3.
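The two worked examples on this slide can be reproduced with a few lines of Python. This is a sketch that assumes the standard Łukasiewicz operators, with the t-norm extended to n arguments as max(0, Σ − (n − 1)) and the t-conorm as min(1, Σ). The function names are illustrative and are not part of PSL.

```python
def t_norm(*values):
    """n-ary Lukasiewicz t-norm (soft conjunction of the body atoms)."""
    return max(0.0, sum(values) - (len(values) - 1))


def t_conorm(*values):
    """n-ary Lukasiewicz t-conorm (soft disjunction of the head atoms)."""
    return min(1.0, sum(values))


def distance_to_satisfaction(head, body):
    """max(body truth - head truth, 0); zero means the rule is satisfied."""
    return max(t_norm(*body) - t_conorm(*head), 0.0)


# Body: A~B = 0.7 and D~E = 0.8, so the body truth value is about 0.5.
print(round(t_norm(0.7, 0.8), 6))
# Head R~T = 0.7 is at least the body value: the rule is satisfied.
print(round(distance_to_satisfaction([0.7], [0.7, 0.8]), 6))
# Head R~T = 0.2 falls short of the body value by about 0.3.
print(round(distance_to_satisfaction([0.2], [0.7, 0.8]), 6))
```

Multiplying the result by a rule weight w gives the weighted distance that enters the model as a compatibility kernel.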
  • 40. Rule Weights. A rule R: H1 ∨ ... ∨ Hm ⇐ B1, B2, ..., Bn carries a weight w. Weighted distance to satisfaction: d(R,I) = w · max(⊗(B1,..,Bn) − ⊕(H1,..,Hm), 0).
  • 41. Rule Weights. A rule R: H1 ∨ ... ∨ Hm ⇐ B1, B2, ..., Bn carries a weight w. Weighted distance to satisfaction: d(R,I) = w · max(⊗(B1,..,Bn) − ⊕(H1,..,Hm), 0). Every ground rule R in a PSL program P contributes a compatibility kernel ϕR = d(R,I) to the CCMRF associated with P.
  • 42. Geometric Intuition. [Figure: axes R1 and R2, each from 0 to 1; an interpretation I is located by its distances d(R1,I) and d(R2,I).]
  • 43. Geometric Intuition. $P(I \mid P) = \frac{1}{Z(w)} \exp\left[-d(P, I)\right]$. [Figure: axes R1 and R2 with weights w1 and w2; d(P,I) is the norm of the vector of distances d(R1), d(R2).]
  • 44. Geometric Intuition. The CCMRF example again: $X = \{X_1, X_2, X_3\}$, $\phi_1(x) = x_1$, $\phi_2(x) = \max(0, x_1 - x_2)$, $\phi_3(x) = \max(0, x_2 - x_3)$, $\Lambda = \{1, 2, 1\}$, constraint $x_1 + x_3 \le 1$, with the region of highest probability marked.
  • 45. Inference: MAP, Marginals.
  • 46. MAP Inference [3]. Most Probable Interpretation: the most likely truth value assignment given some facts. $\arg\max_I P(I \mid P) \Leftrightarrow \arg\min_I d(P, I)$.
  • 47. MAP Inference Theory. Exact PSL inference in polynomial time: a convex optimization problem due to our choices in combination functions. $O(n^{3.5})$ inference: Second Order Cone Program; n = number of (active) ground rules; efficient commercial optimization packages.
  • 48. Inference Algorithm. Each ground rule constitutes a linear or conic constraint introducing a rule specific “dissatisfaction” variable that is added to the objective function.
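To see why a ground rule can yield a linear constraint, the following derivation may help. It is a sketch under two assumptions not stated on the slide: the Łukasiewicz operators are extended to n arguments in the usual way, and the weighted distances are summed linearly in the objective. The slack variable $s_R$ is a hypothetical name for the "dissatisfaction" variable; the conic case mentioned on the slide is not derived here.

```latex
% Ground rule R:  H_1 \lor \dots \lor H_m \Leftarrow B_1, \dots, B_n, with all atoms in [0,1].
\begin{align*}
\otimes(B_1,\dots,B_n) &= \max\Bigl(0,\ \sum_{i=1}^{n} B_i - (n-1)\Bigr), &
\oplus(H_1,\dots,H_m) &= \min\Bigl(1,\ \sum_{j=1}^{m} H_j\Bigr).
\end{align*}
% Because 0 <= \otimes(B) <= 1 and \sum_j H_j >= 0, the nested max/min collapses to one hinge:
\begin{align*}
\max\bigl(\otimes(B_1,\dots,B_n) - \oplus(H_1,\dots,H_m),\ 0\bigr)
  = \max\Bigl(\sum_{i=1}^{n} B_i - (n-1) - \sum_{j=1}^{m} H_j,\ 0\Bigr).
\end{align*}
% Minimizing the weighted sum of hinges is then a linear program with one slack per ground rule:
\begin{align*}
\min_{I,\ s}\ \sum_{R} w_R\, s_R
\quad \text{s.t.} \quad
s_R \ \ge\ \sum_{i=1}^{n_R} B_i - (n_R - 1) - \sum_{j=1}^{m_R} H_j, \qquad
s_R \ \ge\ 0, \qquad
I(A) \in [0,1]\ \ \forall A.
\end{align*}
```

In this form each ground rule adds exactly one inequality and one objective term, which is consistent with the slide's description of rule-specific dissatisfaction variables.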
  • 49. Inference Algorithm. Conservative Grounding: most rules trivially have satisfaction distance = 0. Save time and space by not grounding them out in the first place. Don’t reason about it if you don’t absolutely have to!
  • 50. Parallelizing MAP Inference [4]. MAP inference is $O(n^{3.5})$: limited scalability. Achieve scalability by dividing the inference problem into smaller “chunks”: this allows for parallelization and distribution of workload, and is similar to message-passing but on entire subgraphs of the factor graph.
  • 51. Factor Graph. Ground rules: vote(Mary,Dem) ∧ spouse(John,Mary) ⇒ vote(John,Dem) : 0.8; vote(Jane,Dem) ∧ friend(John,Jane) ⇒ vote(John,Dem) : 0.3. [Figure: factor graph connecting the atoms vote(Mary,Dem), vote(Jane,Dem), and vote(John,Dem) through the two ground rules.]
  • 52. Factor Graph. Ground rules: vote(Mary,Dem) ∧ spouse(John,Mary) ⇒ vote(John,Dem) : 0.8; vote(Jane,Dem) ∧ friend(John,Jane) ⇒ vote(John,Dem) : 0.3. Idea: partition the dependency graph into strongly connected components and solve MAP on each independently.
  • 53. Approximate Algorithm. 1. Ground out the factor graph conservatively. 2. Partition the dependency graph using a modularity maximizing clustering algorithm (inspired by Blondel et al [06]; aggregate rule weights). 3. Compute MAP on each cluster, fixing the confidence values of outside atoms. 4. Go to 1 until the change in I drops below the threshold Θ.
  • 54. [Figure: example relational network linking professors, students, papers, departments, universities, conferences, and countries (e.g. Prof Jones, Prof Smith, Paper “ABC”, UMD CS, ASONAM 09, USA, Italy, Denmark) through relationships such as author, member, faculty, dean, department, in, friends, colleagues, collaborates, student of, attended, presented, submitted, accepted, organized, visited, and comment.]
  • 55. Scalability. [Plot: "Exact vs Approximate Algorithm Running Times"; x-axis: # Compatibility Kernels in Graph (0 to 80000); y-axis: Time in Seconds (0 to 16000); series: Exact Algorithm, Approximate Algorithm with Parameters A.]
  • 56. Accuracy. [Plot: "Relative Error compared to Exact Inference"; x-axis: Number of Compatibility Kernels (0 to 80000); y-axis: Percentage Relative Error (0% to 7%); series: Parameters A, B, C, D, E.]
  • 57. Runtime. [Plot: "Running Time Comparison of Approximate Algorithm"; x-axis: Number of Compatibility Kernels (0 to 80000); y-axis: Time in Seconds (0 to 500); series: Parameters A, B, C, D, E.]
  • 58. Accuracy on very large Graphs. [Plot: "Relative Error Comparison"; x-axis: Number of Compatibility Kernels, log-scale (3.5E+05 to 5.6E+06); y-axis: Percentage Relative Error (0% to 6%); series: Parameters B, C, D, E.]
  • 59. Runtime on very large Graphs. [Plot: "Runtime Comparison", log-log-scale; x-axis: Number of Compatibility Kernels (3.5E+05 to 5.6E+06); y-axis: Time in Seconds (400 to 40000); series: Parameters A, B, C, D, E. Annotation: 2M edges in 48 min.]
  • 60. Computing Marginals [5]. For a subset of RVs $X' \subset X$ (RV = atom; in our case $X' = \{X_i\}$), compute the marginal density function $f_{X'}(x') = \int_{y \in \times \tilde{D}_i \ \text{s.t.}\ X_i \notin X'} f(x', y)\, dy$. [Figure: a density f over [0, 1] for the atom Technician≈Developer.]
  • 61. Geometric Intuition. [Figure: the domain over X1, X2, X3 and the marginal density f of X2 on [0, 1], highlighting P(0.4 ≤ X2 ≤ 0.6).]
  • 62. Computing Marginal in Theory. Computing the marginal probability density function for a subset $X' \subset X$ under the probability measure defined by a CCMRF is #P hard in the worst case. Related to volume computation of polytopes, based on [Broecheler et al, ‘09].
  • 63. Sampling Scheme. Approximate the marginal distributions using an MCMC sampling scheme restricted to the convex polytope defined by $\tilde{D}$. Again, inspired by work on volume computation.
  • 64-67. Histogram Sampling. [Animation over four slides: samples of Xi are accumulated into a histogram.]
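A histogram estimate of a marginal can be sketched as follows. The code assumes that samples of a single atom Xi in [0, 1] are already available (for instance from an MCMC sampler). The bin count, the sample values, and the function names are illustrative assumptions, not taken from the slides.

```python
def histogram_density(samples, bins=10):
    """Estimate a density on [0, 1] from samples using equal-width bins."""
    counts = [0] * bins
    for s in samples:
        index = min(int(s * bins), bins - 1)  # a sample at 1.0 goes in the last bin
        counts[index] += 1
    width = 1.0 / bins
    # Divide by (number of samples * bin width) so the estimate integrates to 1.
    return [c / (len(samples) * width) for c in counts]


def interval_probability(samples, low, high):
    """Estimate P(low <= X_i <= high) as the fraction of samples in the interval."""
    return sum(low <= s <= high for s in samples) / len(samples)


# Made-up sample values, for illustration only.
xi_samples = [0.05, 0.15, 0.45, 0.55, 0.5]
print(histogram_density(xi_samples))
print(interval_probability(xi_samples, 0.4, 0.6))
```

The second function corresponds to queries such as P(0.4 ≤ X2 ≤ 0.6) shown in the geometric-intuition figure.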
  • 68. Random Ball Walk. Need to sample from the density restricted to the ball → difficult. [Figure labels: current point p, radius r, candidate points q1 and q2.]
  • 69. Hit-and-Run. [Figure labels: current point p, direction d.]
  • 70. Hit-and-Run. Compute the density function induced on the line and sample from it → easy. [Figure labels: current point p, direction d, new point q.]
  • 71. Hit-and-Run. [Figure labels: current point p, direction d, new point q.]
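The hit-and-run step can be sketched in Python on the three-variable example from the Geometric Intuition slides (kernels φ1..φ3, Λ = {1, 2, 1}, constraint x1 + x3 ≤ 1, variables assumed to lie in [0, 1]). This is a simplified illustration, not the authors' implementation: the density along the chord is approximated on a discrete grid instead of being sampled exactly, and the walk starts from an interior point chosen for convenience. All function names are hypothetical.

```python
import math
import random

LAMBDA = (1.0, 2.0, 1.0)


def energy(x):
    """Weighted kernel sum; the density is proportional to exp(-energy)."""
    x1, x2, x3 = x
    return (LAMBDA[0] * x1 + LAMBDA[1] * max(0.0, x1 - x2)
            + LAMBDA[2] * max(0.0, x2 - x3))


# Polytope as pairs (a, b) meaning a . x <= b: unit cube plus x1 + x3 <= 1.
CONSTRAINTS = [((1.0, 0.0, 1.0), 1.0)]
for i in range(3):
    unit = [0.0, 0.0, 0.0]
    unit[i] = 1.0
    CONSTRAINTS.append((tuple(unit), 1.0))                # x_i <= 1
    CONSTRAINTS.append((tuple(-v for v in unit), 0.0))    # -x_i <= 0


def chord(p, d):
    """Return (t_min, t_max) such that p + t*d stays inside the polytope."""
    t_min, t_max = -math.inf, math.inf
    for a, b in CONSTRAINTS:
        ad = sum(ai * di for ai, di in zip(a, d))
        slack = b - sum(ai * pi for ai, pi in zip(a, p))
        if abs(ad) < 1e-12:
            continue  # the line runs parallel to this face
        if ad > 0:
            t_max = min(t_max, slack / ad)
        else:
            t_min = max(t_min, slack / ad)
    return t_min, t_max


def hit_and_run_step(p, rng, grid=200):
    """Pick a random direction, then sample along the chord by density."""
    d = [rng.gauss(0.0, 1.0) for _ in p]
    norm = math.sqrt(sum(v * v for v in d))
    d = [v / norm for v in d]
    t_min, t_max = chord(p, d)
    # Grid approximation of the density induced on the line segment.
    ts = [t_min + (t_max - t_min) * (k + 0.5) / grid for k in range(grid)]
    points = [[pi + t * di for pi, di in zip(p, d)] for t in ts]
    weights = [math.exp(-energy(q)) for q in points]
    return rng.choices(points, weights=weights, k=1)[0]


def sample(n, start=(0.2, 0.5, 0.2), seed=0):
    """Run n hit-and-run steps from an interior start and return all points."""
    rng = random.Random(seed)
    p, out = list(start), []
    for _ in range(n):
        p = hit_and_run_step(p, rng)
        out.append(tuple(p))
    return out


draws = sample(200)
print(draws[-1])
```

Because every candidate point lies on the chord inside the polytope, the walk never leaves the constrained domain; the grid resolution trades accuracy of the line density against cost per step.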
  • 72. Sampling in Theory. Theorem: The complexity of computing an approximate distribution σ* using the hit-and-run sampling scheme such that the total variation distance of σ* and P is less than ε is $O^*\left(\tilde{n}^3 (k_B + \tilde{n} + m)\right)$, where $\tilde{n} = n - k_A$, under the assumption that we start from an initial distribution σ such that the density function dσ/dP is bounded by M except on a set S with σ(S) ≤ ε/2 [Lovasz Vempala ‘04].
  • 73. Sampling in Practice. Starting distribution = MAP state. How do we get out of corners?
  • 74. Sampling in Practice. How do we get out of corners? Use the relaxation method [Agmon ‘54] to solve a system of linear inequalities to find a feasible direction d: $d_{i+1} = d_i + 2\, \frac{z_k - W_k^T d_i}{\|W_k\|^2}\, W_k$. [Figure labels: ε1, ε2.]
  • 75. Algorithm Convergence. [Plot: "KL Divergence by Sample Size"; x-axis: Number of Samples (30000 to 3000000); y-axis: KL Divergence (0.05 to 5); series: Average KL Divergence, Lowest Quartile KL Divergence, Highest Quartile KL Divergence.] Averaged over 30 randomly snow-ball sampled folds. Lowest Quartile = 322-413 atoms; Highest Quartile = 174-224 atoms.
  • 76. Algorithm Performance. [Plot: "Runtime for 1000 Samples"; x-axis: Number of Compatibility Kernels (0 to 10000); y-axis: Time in sec (0 to 35).]
  • 78. Weight Learning [3]. Given: rules + training instance. Want: weights, $w^* = \arg\max_w P(I_T \mid P) - \|w\|^2$. Approach: maximize the likelihood of the observation by optimizing the weights, plus a prior on the weights. Problem: cannot compute the partition function Z tractably. Workaround: use the MAP state to approximate Z; invoke the reasoner during gradient computation; BFGS and Perceptron implementations.
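The MAP workaround can be made concrete with a short derivation. This is a sketch under assumptions that the slide does not spell out: the likelihood is taken in log form, $\delta_R(I)$ is a hypothetical symbol for the unweighted distance to satisfaction of ground rule $R$ (so that $d(R,I) = w_R\,\delta_R(I)$), and $\eta$ is an assumed step size. The prior term is left out for brevity.

```latex
\begin{align*}
\log P(I_T \mid P) &= -\sum_{R} w_R\, \delta_R(I_T) \;-\; \log Z(w),
\qquad Z(w) = \int \exp\Bigl[-\sum_{R} w_R\, \delta_R(I)\Bigr]\, dI, \\[4pt]
\frac{\partial}{\partial w_R} \log P(I_T \mid P)
  &= -\,\delta_R(I_T) \;+\; \mathbb{E}_{w}\bigl[\delta_R(I)\bigr]
  \;\approx\; \delta_R\bigl(I_{\mathrm{MAP}}\bigr) - \delta_R(I_T), \\[4pt]
w_R &\leftarrow w_R + \eta\,\bigl(\delta_R(I_{\mathrm{MAP}}) - \delta_R(I_T)\bigr).
\end{align*}
```

The approximation replaces the intractable expectation by the value at the MAP state, which is why the reasoner has to be invoked at every gradient step; the last line is the resulting perceptron-style update.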
  • 80. Experiments: Ontology Alignment. OAEI Ontology Alignment Benchmark (2008): real world ontologies (300s); synthetic ontology pairs; approx. 100 entities; 21 rules, modified standard string similarity measures.
  • 81. OAEI comparison [3]. [Bar chart: F1 Score (0 to 1) per system.] Other results as reported by the benchmark participants.
  • 82. Attribute Similarity Functions. Rule: A≈B ⇐ A.name ≈_x B.name. Maximum flexibility for attribute similarity. Customization to particular problem domains (camel-case is common in web-ontologies). Users can define arbitrary similarity functions ≈_x to be integrated into PSL, e.g. string similarity measures such as Levenshtein.
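As an example of a user-defined attribute similarity, here is a minimal Python sketch of a Levenshtein-based name similarity scaled to [0, 1]. The normalization by the longer string length and the function names are assumptions made for illustration; the slides do not say how PSL normalizes string distances.

```python
def levenshtein(a, b):
    """Edit distance between strings a and b (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def name_similarity(a, b):
    """Map the edit distance to a truth value in [0, 1]; 1 means identical."""
    if not a and not b:
        return 1.0
    return 1.0 - levenshtein(a, b) / max(len(a), len(b))


print(name_similarity("Customer", "Customers"))
print(name_similarity("Employee", "Employees"))
print(name_similarity("Technician", "Developer"))
```

A function of this shape produces the soft truth value for an atom such as A.name ≈_x B.name, which the rule then propagates to A≈B.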
  • 83. Sets in PSL. Rule: {A.subConcept}≈{B.subConcept} ⇐ A≈B ∧ type(A,concept) ∧ type(B,concept) ∧ A≠B : 0.8. [Diagram: the two ontologies side by side.]
  • 84. Explicit Set Treatment. Rule: A≈B ⇐ {A.subConcept} ≈{} {B.subConcept}. Reason about the similarity of sets of entities. Allows aggregate measures to be integrated. Default set equality measure (Jaccard-type): $X \approx Y = \frac{2 \sum_{x \in X} \sum_{y \in Y} (x \approx y)}{|X| + |Y|}$. Users can define alternative set equalities, based on the inference engine; initially, PSL provides some predefined set overlap measures.
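The default set measure can be tried out with a small Python sketch. It assumes the formula reads 2 · Σ (x≈y) / (|X| + |Y|) with the sum running over all pairs, takes the element-level similarity as a plain function, and returns 0.0 for two empty sets (an assumption; the slide does not cover that case). The names are illustrative.

```python
def set_similarity(xs, ys, similar):
    """Jaccard-type set similarity: 2 * sum of pair similarities / (|X| + |Y|)."""
    xs, ys = set(xs), set(ys)
    if not xs and not ys:
        return 0.0  # assumption: two empty sets carry no evidence
    total = sum(similar(x, y) for x in xs for y in ys)
    return 2.0 * total / (len(xs) + len(ys))


def exact(x, y):
    """Crisp element similarity, used here only to keep the example simple."""
    return 1.0 if x == y else 0.0


print(set_similarity({"Software", "Hardware"}, {"Software", "Hardware"}, exact))
print(set_similarity({"Software", "Hardware"}, {"Hardware", "Consulting"}, exact))
```

With soft element similarities the value can exceed 1 unless each element is similar to at most one counterpart, which is presumably what the partial-functional constraints on similarity are meant to guarantee.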
  • 85. Support for Sets. Using relational syntax (X.name, X.father, X.friend (a friend); binary predicates only) makes it easier to specify sets: {X.friend} = all friends; {X.friend.friend} = all second level friends. Inverse of a binary relation: X.knows(inv) (who knows X?). Union, intersection: {X.knows} ∪ {X.knows(inv)} = {Y.knows} ∪ {Y.knows(inv)}.
  • 86. Utility of Sets in PSL. Compare the set vs non-set version of rules on a synthetic ontology alignment benchmark: A≈B ⇐ {A.subConcept} ≈{} {B.subConcept} vs A≈B ⇐ A.subConcept ≈ B.subConcept.
  • 87. Ontology Set Comparison. [Two plots, for Structural Noise 0.2 and 0.4; x-axis: Attribute Noise (0 to 0.8); y-axis: F1 (0.4 to 1); series: Complete PSL, setFree PSL.]
  • 89. Probabilistic Query Analysis. Query types form a continuum from Marginal Distribution (1 atom) through Most Probable Sub-World to Most Probable World (entire world). Decision Level Perspective, examples: Collective Classification, Link Prediction; interesting: Constraints. System Level Perspective, examples: Image Denoising, Complex System Configuration, Ising Model.
  • 90. Probabilistic Query Analysis. The same continuum as on slide 89, with the annotation "Decision-driven".
  • 91. Probabilistic Query Analysis. The same continuum as on slide 89, with the annotations "Decision-driven" and "Cannot be meaningfully analyzed".
  • 92. Probabilistic Query Analysis. The same continuum as on slide 89, with the annotation "Decision-driven Modeling".
  • 93. Decision Driven Modeling (DDM) [2]. Predicates are typed as probability distributions, e.g. Bernoulli distributions, parameterized by p ∈ [0,1]. Atoms are RVs over parameterized distributions. Defines a second-order probability distribution defined by a CCMRF. Allows integration of external classifiers (important, e.g., in personalized medicine). Aggregation of evidence: can handle sets and other continuous aggregations.
  • 94. Experiments: Wikipedia [3]. Wikipedia Category Prediction: 2460 featured documents; links, talks; predict category(D,C): Bernoulli; 2 setups: seed, split. [Diagram: documents connected by link and talk edges.]
  • 95. Wikipedia Rules.
      hasCat(A,C) ⇐ hasCat(B,C) ∧ A!=B ∧ unknown(A) ∧ document(A,T) ∧ document(B,U) ∧ similarText(T,U)
      hasCat(A,C) ⇐ hasCat(B,C) ∧ unknown(A) ∧ link(A,B) ∧ A!=B
      hasCat(D,C) ⇐ talk(D,A) ∧ talk(E,A) ∧ hasCat(E,C) ∧ unknown(D) ∧ A!=B
  • 96. Wikipedia – External Classifier. [Plot: x-axis: Number of Training Documents (250 to 750); y-axis: F1 (0.6 to 0.8); series: Attributes Only, Attributes + Links, Attributes + Links + Talks.]
  • 97. Wikipedia – Seed Classification. [Plot: x-axis: Percentage of Seed Document (# Documents): 0.15 (220), 0.2 (290), 0.25 (370), 0.3 (440); y-axis: F1 (0.2 to 0.7); series: Attributes only, Attributes + Links, Attributes + Links + Talks.]
  • 98. Confidence Analysis [5]. Analyze the confidence in a prediction by computing its marginal density function in the second order probability distribution: what does the density function look like around the MAP state? Novel aspect in SRL. [Figure: two contrasting densities f over [0, 1] for Category(Doc1,Theory).]
  • 99. Experiments: Confidence Analysis. Split predictions into S+, S−. Compute the avg standard deviations σ+, σ− for each. Compare: $\Delta(\sigma) = 2\, \frac{\sigma_- - \sigma_+}{\sigma_+ + \sigma_-}$. Hypothesis: Δ(σ) > 0.
      Folds   P(Null Hypothesis)   Relative Difference Δ(σ)
      20      1.95E-09             38.3%
      25      2.40E-13             41.2%
      30      1.00E-16             43.5%
      35      4.54E-08             39.0%
  • 100. PSL System Overview.
  • 101. PSL Implementation. Implemented in Java / Groovy. Declarative model definition and imperative model interaction. ~40k LOC but still alpha. Performance oriented: database backend; memory efficient data structures; high performance solver integration.
  • 102. System Overview. Input Model: Probabilistic Rules, e.g. A≈B ⇐ similarID(A.name,B.name) and {A.subClass}≈{B.subClass} ⇐ A≈B; Constraints, e.g. partial functional: ≈; Similarity Functions, e.g. similarID(A,B) = new SimFun(){}. Input Data: Graph, Preprocessing Logic, RDBMS. Groovy PSL Programming Environment: Grounding Framework, Factor Graph, Reasoner + Learning, Optimization Toolbox, Similarity Functions, Analysis, Evaluation Tools. Output: Inference Result.
  • 105. Conclusion. Simple and expressive formalism to reason about similarity and uncertainty collectively: sets, aggregates, external functions. Scalable due to continuous rather than combinatorial formulation. Future: structure learning, extending the framework, additional use cases.
  • 107. Questions? Comments?
  • 109. Presented Work.
      [5] Computing marginal distributions over continuous Markov networks for statistical relational learning, Matthias Broecheler and Lise Getoor, Advances in Neural Information Processing Systems (NIPS) 2010.
      [4] A Scalable Framework for Modeling Competitive Diffusion in Social Networks, Matthias Broecheler, Paulo Shakarian, and V.S. Subrahmanian, International Conference on Social Computing (SocialCom) 2010, Symposium Section.
      [3] Probabilistic Similarity Logic, Matthias Broecheler, Lilyana Mihalkova, and Lise Getoor, Conference on Uncertainty in Artificial Intelligence 2010.
      [2] Decision-Driven Models with Probabilistic Soft Logic, Stephen H. Bach, Matthias Broecheler, Stanley Kok, and Lise Getoor, NIPS Workshop on Predictive Models in Personalized Medicine 2010.
      [1] Probabilistic Similarity Logic, Matthias Broecheler and Lise Getoor, International Workshop on Statistical Relational Learning 2009.
      This presentation also covers joint work with Paulo Shakarian and Dr. V.S. Subrahmanian.
  • 110. References.
      Introduction to Statistical Relational Learning, Lise Getoor and Ben Taskar, MIT Press, 2007.
      Theory of generalized annotated logic programming and its applications, Michael Kifer and V.S. Subrahmanian, Journal of Logic Programming, Volume 12 Issue 4, April 1992.
      Using Histograms to Better Answer Queries to Probabilistic Logic Programs, Matthias Broecheler, Gerardo I. Simari, and V.S. Subrahmanian, International Conference on Logic Programming 2009.
      Hit-and-run from a corner, L. Lovasz and S. Vempala, ACM Symposium on Theory of Computing, 2004.