SlideShare une entreprise Scribd logo
1  sur  31
Télécharger pour lire hors ligne
Iterative Methods for Network Alignment

                                                                       David F. Gleich
                                                                    Computer Science
                                                                     Purdue University
                                                                                     
                                                                                         with
                                                                                             
                                                                    Arif Khan, Alex Pothen !
                                                        Purdue University, Computer Science
Work supported by DOE CSCAPES Institute grant (DE-
FC02-08ER25864), NSF CAREER grant 1149756-CCF,                  Mahantesh Halappanavar!
and the Center for Adaptive Super Computing Software        Pacific Northwest National Labs
Multithreaded Architectures (CASS-MT) at PNNL.                Mohsen Bayati, Amin Saberi!
Stanford’s CADS grant from the Library of Congress.
PNNL is operated by Battelle Memorial Institute under
                                                                           Stanford University
contract DE-AC06-76RL01830
                                               Ying Wang Google




                                                                                             1
Network alignment"
What is the best way of matching "
graph A to B?

                                        w
            v
                                s
    r




                t                   u


        A                           B




                                            2
the    Figure 2. The NetworkBLAST local network alignment algorithm. Given two input
s) or
odes
 lem
         Network alignment"
         networks, a network alignment graph is constructed. Nodes in this graph correspond
         to pairs of sequence-similar proteins, one from each species, and edges correspond to
         conserved interactions. A search algorithm identifies highly similar subnetworks that


         
         follow a prespecified interaction pattern. Adapted from Sharan and Ideker.30
n the
 ent;
nied
 ped
 lem
         
  net-
 one
 one
plest
 ying
 eins
ome
  the
  be-

d as
  aph
ever,
   ap-           From Sharan and Ideker, Modeling cellular machinery through biological
rked             network comparison. Nat. Biotechnol. 24, 4 (Apr. 2006), 427–433. 
 , we            
         Figure 3. Performance comparison of computational approaches.




                                                                                                 3
mon-
Gleich (Purdue)   Network alignment"
d F. Gleich (Purdue)
                             Motivation                  DavidINFORMS (Purdue)
                                                               F. Gleich Seminar

                                                       Network alignment
                                                                                   8 / 40        Motivation              INFORMS Semin

                                                                                                                     INFORMS Sem
                   
H/Wikipedia: Simple alignment fa
mm                     40   60                80       100 mm                120 40              60           80     100

  
 mm
40
                             40                         6040                                80
                                                                                                 j
                                                                                                                   100
                                    r                                                        s



60                                                              60



                                                                            t
80  40                                             t            80
                                          A                       L                          B
     A                  LCSH                   297,266 vertices, 248,230 edges
     B                  Wikipedia              205,948 vertices, 382,353 edges
     L                  links                  4,971,629 edges
    60




                                                                                                                            4
Sometimes small procedure on the labels of the two
                            data
he size of these datasets is reported in Table 6.5. For this a
ome from a text-matching
      becomes big …
                            Table 6.5
 size of non-Web datasets. The product graph is never formed explicitly

           Dataset                      Size         Nonzeros
           LCSH-2                    59,849            227,464
           WC-3                      70,509            403,960
           Product graph      4,219,893,141    91,886,357,440


 periment, we do not investigate all the issues involved in using
      … Ananth has some better techniques to work with these large problems …
 d problem and focus on the performance of the inner-outer al
 nking context. Without any parameter optimization (i.e., usi
 ), the inner-outer scheme shows a significant performance adv
d in Table 6.6.




                                                                         5
Network alignment"
What is the best way of matching "
graph A to B using only edges in L?

                                       w
            v
                               s
    r




                        wtu
                t                  u


        A           L              B




                                           6
Network alignment"
Matching? 1-1 relationship"
Best? highest weight and overlap

                                            w
            v
                    Overlap         s
    r




                              wtu
                t                       u


        A                L              B




                                                7
Network alignment"
… is NP-hard"
… has no approximation algorithm
                                        w
        v
r               Overlap         s           •    Computer Vision
                                            •    Ontology matching
                                            •    Database matching
                          wtu               •    Bioinformatics
            t                       u


    A                L              B


objective = α matching + βoverlap




                                                                      8
Network alignment"
via mathematical programming

                                                maximize       ↵wT x + 2 xT Sx
                                            w
            v
                                    s
                                                subject to Ax  e, xi 2 {0, 1}
    r               Overlap




                                                 Let xi be an indicator over edges in L
                              wtu
                                                 
                t                       u        If A is the node-edge incidence matrix
                                                 for L, then x is a 1-1 matching
        A                L              B

    Find a 1-1 matching between vertices
    with as many overlaps as possible.




                                                                                       9
Network alignment"
via mathematical programming

                                                maximize       ↵wT x + 2 xT Sx
                                            w
            v
                                    s
                                                subject to Ax  e, xi 2 {0, 1}
    r               Overlap




                                                 Let xi be an indicator over edges in L
                              wtu
                                                 
                t                       u        Let Sij = 1 when xi and xj overlap, then
                                                 xTSx is twice the overlapped count.
        A                L              B

    Find a 1-1 matching between vertices
    with as many overlaps as possible.




                                                                                        10
Our contributions
A new belief propagation method (Bayati et al. 2009, 2013)"
Outperformed state-of-the-art PageRank and optimization-
based heuristic methods

High performance C++ implementations (Khan et al. 2012)"
40 times faster (C++ ~ 3, complexity ~ 2, threading ~ 8)"
5 million edge alignments ~ 10 sec"

www.cs.purdue.edu/~dgleich/codes/netalignmc




                                                              11
Iterative methods for network alignment
Iterative methods "
    for network alignment
  
Each iteration involves
                                     Let x[i] be the score for
Matrix-vector-ish computations       each pair-wise match in L
with a sparse matrix, e.g. sparse
matrix vector products in a semi-    for i=1 to ...
ring, dot-products, axpy, etc. 
       update x[i] to y[i]
Bipartite max-weight matching          compute a
using a different weight vector at       max-weight match
                                         with y
each iteration
                                       update y[i] to x[i]
"                                        (using match in MR)
No “convergence” "
100-1000 iterations




                                                                 13
Open question 1!

Any sort of property of "
these methods beyond ... 
(i) Principled derivation and "
(ii) “David and Ananth say they work”?




                                      14
David F. Gleich (Purdue)        Algorithms                      INFORMS Seminar   25 / 40



 Belief propagation methods
Belief propagation: Our algorithm
          mm                    40   60                 80        100              120
Summary                                    History
 …   Construct a probability                      …   BP used for computing
     model where the most
          40                                          marginal probabilities and
     likely state is the solution!                    maximum aposterori
 …   Locally update information                       probability
 …   Like a generalized dynamic                   …   Wildly successful at solving
     program
         60                                           satisfiability problems
                                                  …   Convergent algorithm for
 …   It works                                         max-weight matching

 …   Most likely, it won’t
        80
     converge




                                                                                                 15
                                                                        Bayati et al. 2005;
David F. Gleich (Purdue)                     Algorithms                              INFORM

      Belief propagation for
   The neighbor operation used to define the left-hand vector x@fi is implicitly defined b
           NetAlign factor graph: Loopy BP
   the set of variables used on the right-hand side of the equation. In words, the functio
      network alignment
   node fi (gi0 ) enforces the matching constraint at i (i0 )
   Another type of function nodes check the validity of squares. For each square ii0 ⇤ j
                               mm
   define a function node hii0 jj 0 : {0, 1}40|+|S| ! R: 60
                                                     |EL                             80                100
                                        (
                                           1 xii0 jj 0 = xii0 xjj 0 Functions
                                                       Variables
               hii0 jj 0 x@hii0 jj0 =                                        for all (ii0 , jj 0 ) 2 VS .
                            A          B 0 otherwise                         ƒ1
                                                       11 0
   In other words, 1 ii0 jj 0 40
                         h                                                   ƒ2
                                   guarantees that xii0 jj 0 = 1 if and only if xii0 = xjj 0 = 1.
                                           10          12 0

  The edges of the factor graph are simply0 connecting each g0               function node to the variab
                                                                               1
                                                       22
nodes it acts on. For example each fi is connected to all variable nodes ii0 2 EL and eac
                         2                 20 0 0                              0
                                0     0
hii0 jj 0 is connected to ii , jj and ii jj in EL      23 0 [ V . Thereforeg2 factor graph is bipartit
                                                                              the
                                                               S
                                                                             g0
  Figure 3 shows an example of a graph pair A, B and their factor-graph representation a
                                                                               3
described above.                  60       30
                                                 110 220
  Now define the following probability distribution                           h110 220
                          2                                                        3
                            n                 m
                        1 4Y                 Y               Y                             T         T
      p(xL , xS ) =               fi (x@fi )     gj (x@gj )        hijrs (x@hijrs )5 e↵w xL + 2 1|S| xS    (4
                        Z i=1
                                 80          j=1           ijrs2VS

where Z is just a normalization term to make p(xL , xS ) a probability distribution. I




                                                                                                               16
             Note It’s pretty hairy to put all the stuff I should put here on a single slide. Most of it is in the pap
particular,  The rest is just “turning the crank” with standard tricks in BP algorithms.
                            2                                                               3
Algorithms                                INFORMS Seminar   26 / 40




                      M !j { = s} =
                        Y
             i              Mj0 ! { = s}
60               80                    100              120
                      j0 2{N(   )j}

             j
      variable tells function j what it thinks
      about being in state s. This is just the
      product of what all the other functions tell
       about being in state s.

             i        Mj! {            = s} = m xim m
ns                                        y:all possible choices




                                                                      17
                                           for variables 0 2N(j)
variable tells function j what it thinks
      about being in state s. This is just the
      product of what all the other functions tell
       about being in state s.

            i       Mj! {     = s} = m xim m
 ns                                 y:all possible choices
                                     for variables 0 2N(j)
                       2                               3
es          j                  Y
                       6                     0          7
                       4ƒj (y)   M 0 !j {        = y 0 }5
                            0 2{N(j)   }


      function j tells variable what it thinks
      about being in state s. This means that we
l     have to locally maxamize ƒj among all
      possible choices. Note y = s always (too
      cumbersome to include in notation.)




                                                             18
Belief propagation for
   network alignment
    For t         1, the messages in iteration t are obtained from the messages in
iteration t       1 recursively. In particular for all ii 0 2 EL
                          ✓              h                i ◆+
   (t)                             (t 1)
  mii 0 !fi = ↵wii 0          max mki 0 !g 0
                               k 6=i                  i

                                                               X                 ✓                                           ◆
                                                                                                           (t 1)
                                                       +                   min           , max(0,       + mjj 0 !h 0        ) . (1)
                                                                                     2              2             ii jj   0
                                                           ii 0 jj 0 2VS

                     (t)
The update rule for mii 0 !g 0 is similar, and
                                          i


                              ✓               h                i ◆+         ✓            h          i ◆+
   (t)                                 (t 1)                                  (t 1)
  mii 0 !h 0 0 = ↵wii 0           max mki 0 !g 0                   max mik 0 !fi
          ii jj                        k6=i               i        k 0 6=i 0
                                                          X       ✓                               ◆
                                                                                    (t 1)
                                                  +           min        , max(0, mkk 0 !h 0 0 + ) . (2)
                                                          0 0
                                                                    2                     ii kk 2
                                                          kk 6=jj
                                                      ii 0 kk 0 2VS
Synthetic evaluation of
       network alignment

                                                        1


                                                       0.8




                                    fraction correct
                                                       0.6


                                                       0.4

                                                             MR
                                                       0.2   BP
                                                             BPSC
                                                             IsoRank
                                                        0
      10              15       20                        0          5           10              15       20
degree of noise in L (p ⋅ n)                                     expected degree of noise in L (p ⋅ n)
Open question 2!

When could we hope to solve such
synthetic problems in asymptotic
regimes?




                                   21
Does it work?
                LCSH – Library of Congress subject headings
 Network Alignment                                                                                        :25
                   Rameau – French National Library subject headings
Table IV. The alignment results for LCSH and Rameau. The first set of results shows the statistics of the known
                     Manually matched
alignment and the results from the max-weight matching algorithm. Next we show results from our algorithms for
three objective parameters. The columns are: objective parameters, algorithms, matching weight, matching edge
overlap, time, total correct, recall, precision, and matching triangle overlap.
   Obj.           Alg.       Weight      Overlap    Time (s)    Correct    Rec.        Prec.       Triangles
                  Sol.       36332.42    39847      —           57645      100%        100%        2073
                  MWM        93279.0     16990      29.6        29098      50.5%       23.3%       350
   ↵ = 1,   =1   BP"
                  MP         84622.0     46400      23522.0     32585      56.5%       27.6%       1515
                 BP++"
                  MP++       85810.1     46942      27115.6     32857      57.0%       27.4%       1548
                 MR
                  MR         87588.6     48367      33366.9     33225      57.6%       27.0%       1617
   ↵ = 1,   =2   BP"
                  MP         81752.6     46569      23427.1     31724      55.0%       27.6%       1483
                 BP++"
                  MP++       84615.7     46656      26673.1     31952      55.4%       26.7%       1531
                 MR
                  MR         85438.4     48934      56961.6     32303      56.0%       26.3%       1604
   ↵ = 0,   =1   BP"
                  MP         60617.9     45247      14284.8     24794      43.0%       23.2%       1467
                 BP++"
                  MP++       60502.8     41592      13979.5     24498      42.5%       23.0%       1484
                  MR
                 MR
         65994.2     46163      10384.4     25455      44.2%       21.5%       1602




                                                                                                           22
 protein-protein interaction networks and ontologies. In the future, we envision applications
 of these techniques in mapping large social network structure.
Open question 3!

How can we evaluate alignments? "
What are possible null-models? 




                                    23
David F. Gleich (Purdue)           Results                     INFORMS Seminar   35 / 40




Matching results: A little too hot!
        mm                    40     60
                                   LCSH     WC     80         100               120
  Science fiction television series          Science fiction television programs
                       Turing test          Turing test
                Machine learning            Machine learning
                         Hot tubs           Hot dog
          40




          60




          80




                                                                                              24
Higher-order "
    }     This proposal is for match-
    network alignment
 using tensor
          ing triangles
          methods:
  k in                             0
          j       Triangle          j
g this        i
                                     k
                                       i
                                       0
                                           0

          k
 euris-
 using
            A          L              B
 istics
        If xi , xj , and xk are
  algo- indicators associated with




                                               25
volves the edges (i, i0 ), (j, j 0 ), and
Network alignment"
          A            L      B
     This proposal is for programming
       via mathematical match-
     ing triangles using tensor
       
     methods:
n                                             maximize   ↵wT x + 2 xT Sx
      j           Triangle   j0               subject to Ax  e, xi 2 {0, 1}
s             i
                              k   0
                                      i   0

      k
 -
g
          A            L      B
s
     If xi , xj , and xk are
 -   indicators1-1 matching between vertices with
         Find a associated with
s    theas many overlaps0 ), and
          edges (i, i0 ), (j, j as possible.




                                                                           26
o    (k, k 0 ), then we want to
Triangle alignment"
           A            L      B
     This proposal is for programming
       via mathematical match-
     ing triangles using tensor
       
     methods:
n                                                    ↵wT x + 2 xT Sx
                                                       X
       j           Triangle   j0         maximize    +    Tijk xi xj xk
s              i
                               k0
                                    i0                  ijk
       k
                                         subject to Ax  e, xi 2 {0, 1}
 -
g
           A            L      B
s
     If xi , xj , and xk are
 -   indicators1-1 matching between vertices with
         Find a associated with
s         edges (i, i0 ), (j, j and triangles as possible.
     theas many overlaps0 ), and




                                                                          27
o    (k, k 0 ), then we want to
Tensor eigenvalues"
            A             L              B
     This proposal is for match-
       and a power method
     ing triangles using tensor
       
     methods:                                                      P
                                 maximize                              ijk   Tijk xi xj xk
n                                                   subject to kxk2 = 1
       j            Triangle            j0
s               i
                                         k0
                                              i0
                                                   [x(next)
                                                            ]i = ⇢ · (
                                                                      X
                                                                        Tijk xj xk + xi )
        k
 -                                                                   jk
                                                       where 𝜌 ensures the 2-norm
g                                                        SSHOPM method due to "
            A             L              B                  Kolda and Mayo
s
     IfHuman,protein interaction networks 48,228 triangles
          xi xj , and xk are
 -   indicatorsinteraction networks with triangles 
       Yeast protein
                       associated nonzeros
257,978
       The tensor T has ~100,000,000,000
s    the 
We work with it i0 ), (j, j 0 ), and
            edges (i, implicitly




                                                                                             28
o    (k, k 0 ), then we want to
Synthetic evaluation of
                         network alignment

                                                                                       1
                    1

                                                                                      0.8
                   0.8




                                                                   fraction correct
fraction correct




                   0.6                                                                0.6

                   0.4
                                                                                      0.4

                   0.2   Eigen                                                              MR
                         Teigen                                                       0.2   BP
                         Iso
                    0                                                                       BPSC
                     0          5           10              15     20
                             expected degree of noise in L (p n)                            IsoRank
                                                                                       0
      10              15                               20                               0          5           10              15       20
degree of noise in L (p ⋅ n)                                                                    expected degree of noise in L (p ⋅ n)
Open question 4!

When do we need triangles?




                              30
Iterative methods for network alignment

Contenu connexe

En vedette

Anti-differentiating Approximation Algorithms: PageRank and MinCut
Anti-differentiating Approximation Algorithms: PageRank and MinCutAnti-differentiating Approximation Algorithms: PageRank and MinCut
Anti-differentiating Approximation Algorithms: PageRank and MinCutDavid Gleich
 
Tall and Skinny QRs in MapReduce
Tall and Skinny QRs in MapReduceTall and Skinny QRs in MapReduce
Tall and Skinny QRs in MapReduceDavid Gleich
 
A history of PageRank from the numerical computing perspective
A history of PageRank from the numerical computing perspectiveA history of PageRank from the numerical computing perspective
A history of PageRank from the numerical computing perspectiveDavid Gleich
 
Tall-and-skinny QR factorizations in MapReduce architectures
Tall-and-skinny QR factorizations in MapReduce architecturesTall-and-skinny QR factorizations in MapReduce architectures
Tall-and-skinny QR factorizations in MapReduce architecturesDavid Gleich
 
How does Google Google: A journey into the wondrous mathematics behind your f...
How does Google Google: A journey into the wondrous mathematics behind your f...How does Google Google: A journey into the wondrous mathematics behind your f...
How does Google Google: A journey into the wondrous mathematics behind your f...David Gleich
 
Relaxation methods for the matrix exponential on large networks
Relaxation methods for the matrix exponential on large networksRelaxation methods for the matrix exponential on large networks
Relaxation methods for the matrix exponential on large networksDavid Gleich
 
Spacey random walks and higher-order data analysis
Spacey random walks and higher-order data analysisSpacey random walks and higher-order data analysis
Spacey random walks and higher-order data analysisDavid Gleich
 
Fast relaxation methods for the matrix exponential
Fast relaxation methods for the matrix exponential Fast relaxation methods for the matrix exponential
Fast relaxation methods for the matrix exponential David Gleich
 
A dynamical system for PageRank with time-dependent teleportation
A dynamical system for PageRank with time-dependent teleportationA dynamical system for PageRank with time-dependent teleportation
A dynamical system for PageRank with time-dependent teleportationDavid Gleich
 
Vertex neighborhoods, low conductance cuts, and good seeds for local communit...
Vertex neighborhoods, low conductance cuts, and good seeds for local communit...Vertex neighborhoods, low conductance cuts, and good seeds for local communit...
Vertex neighborhoods, low conductance cuts, and good seeds for local communit...David Gleich
 
MapReduce for scientific simulation analysis
MapReduce for scientific simulation analysisMapReduce for scientific simulation analysis
MapReduce for scientific simulation analysisDavid Gleich
 
Higher-order organization of complex networks
Higher-order organization of complex networksHigher-order organization of complex networks
Higher-order organization of complex networksDavid Gleich
 
Personalized PageRank based community detection
Personalized PageRank based community detectionPersonalized PageRank based community detection
Personalized PageRank based community detectionDavid Gleich
 
Recommendation and graph algorithms in Hadoop and SQL
Recommendation and graph algorithms in Hadoop and SQLRecommendation and graph algorithms in Hadoop and SQL
Recommendation and graph algorithms in Hadoop and SQLDavid Gleich
 
Localized methods for diffusions in large graphs
Localized methods for diffusions in large graphsLocalized methods for diffusions in large graphs
Localized methods for diffusions in large graphsDavid Gleich
 
Sparse matrix computations in MapReduce
Sparse matrix computations in MapReduceSparse matrix computations in MapReduce
Sparse matrix computations in MapReduceDavid Gleich
 
Massive MapReduce Matrix Computations & Multicore Graph Algorithms
Massive MapReduce Matrix Computations & Multicore Graph AlgorithmsMassive MapReduce Matrix Computations & Multicore Graph Algorithms
Massive MapReduce Matrix Computations & Multicore Graph AlgorithmsDavid Gleich
 
Big data matrix factorizations and Overlapping community detection in graphs
Big data matrix factorizations and Overlapping community detection in graphsBig data matrix factorizations and Overlapping community detection in graphs
Big data matrix factorizations and Overlapping community detection in graphsDavid Gleich
 
Localized methods in graph mining
Localized methods in graph miningLocalized methods in graph mining
Localized methods in graph miningDavid Gleich
 
Graph libraries in Matlab: MatlabBGL and gaimc
Graph libraries in Matlab: MatlabBGL and gaimcGraph libraries in Matlab: MatlabBGL and gaimc
Graph libraries in Matlab: MatlabBGL and gaimcDavid Gleich
 

En vedette (20)

Anti-differentiating Approximation Algorithms: PageRank and MinCut
Anti-differentiating Approximation Algorithms: PageRank and MinCutAnti-differentiating Approximation Algorithms: PageRank and MinCut
Anti-differentiating Approximation Algorithms: PageRank and MinCut
 
Tall and Skinny QRs in MapReduce
Tall and Skinny QRs in MapReduceTall and Skinny QRs in MapReduce
Tall and Skinny QRs in MapReduce
 
A history of PageRank from the numerical computing perspective
A history of PageRank from the numerical computing perspectiveA history of PageRank from the numerical computing perspective
A history of PageRank from the numerical computing perspective
 
Tall-and-skinny QR factorizations in MapReduce architectures
Tall-and-skinny QR factorizations in MapReduce architecturesTall-and-skinny QR factorizations in MapReduce architectures
Tall-and-skinny QR factorizations in MapReduce architectures
 
How does Google Google: A journey into the wondrous mathematics behind your f...
How does Google Google: A journey into the wondrous mathematics behind your f...How does Google Google: A journey into the wondrous mathematics behind your f...
How does Google Google: A journey into the wondrous mathematics behind your f...
 
Relaxation methods for the matrix exponential on large networks
Relaxation methods for the matrix exponential on large networksRelaxation methods for the matrix exponential on large networks
Relaxation methods for the matrix exponential on large networks
 
Spacey random walks and higher-order data analysis
Spacey random walks and higher-order data analysisSpacey random walks and higher-order data analysis
Spacey random walks and higher-order data analysis
 
Fast relaxation methods for the matrix exponential
Fast relaxation methods for the matrix exponential Fast relaxation methods for the matrix exponential
Fast relaxation methods for the matrix exponential
 
A dynamical system for PageRank with time-dependent teleportation
A dynamical system for PageRank with time-dependent teleportationA dynamical system for PageRank with time-dependent teleportation
A dynamical system for PageRank with time-dependent teleportation
 
Vertex neighborhoods, low conductance cuts, and good seeds for local communit...
Vertex neighborhoods, low conductance cuts, and good seeds for local communit...Vertex neighborhoods, low conductance cuts, and good seeds for local communit...
Vertex neighborhoods, low conductance cuts, and good seeds for local communit...
 
MapReduce for scientific simulation analysis
MapReduce for scientific simulation analysisMapReduce for scientific simulation analysis
MapReduce for scientific simulation analysis
 
Higher-order organization of complex networks
Higher-order organization of complex networksHigher-order organization of complex networks
Higher-order organization of complex networks
 
Personalized PageRank based community detection
Personalized PageRank based community detectionPersonalized PageRank based community detection
Personalized PageRank based community detection
 
Recommendation and graph algorithms in Hadoop and SQL
Recommendation and graph algorithms in Hadoop and SQLRecommendation and graph algorithms in Hadoop and SQL
Recommendation and graph algorithms in Hadoop and SQL
 
Localized methods for diffusions in large graphs
Localized methods for diffusions in large graphsLocalized methods for diffusions in large graphs
Localized methods for diffusions in large graphs
 
Sparse matrix computations in MapReduce
Sparse matrix computations in MapReduceSparse matrix computations in MapReduce
Sparse matrix computations in MapReduce
 
Massive MapReduce Matrix Computations & Multicore Graph Algorithms
Massive MapReduce Matrix Computations & Multicore Graph AlgorithmsMassive MapReduce Matrix Computations & Multicore Graph Algorithms
Massive MapReduce Matrix Computations & Multicore Graph Algorithms
 
Big data matrix factorizations and Overlapping community detection in graphs
Big data matrix factorizations and Overlapping community detection in graphsBig data matrix factorizations and Overlapping community detection in graphs
Big data matrix factorizations and Overlapping community detection in graphs
 
Localized methods in graph mining
Localized methods in graph miningLocalized methods in graph mining
Localized methods in graph mining
 
Graph libraries in Matlab: MatlabBGL and gaimc
Graph libraries in Matlab: MatlabBGL and gaimcGraph libraries in Matlab: MatlabBGL and gaimc
Graph libraries in Matlab: MatlabBGL and gaimc
 

Similaire à Iterative methods for network alignment

Skew-symmetric matrix completion for rank aggregation
Skew-symmetric matrix completion for rank aggregationSkew-symmetric matrix completion for rank aggregation
Skew-symmetric matrix completion for rank aggregationDavid Gleich
 
20131019 生物物理若手 Journal Club
20131019 生物物理若手 Journal Club20131019 生物物理若手 Journal Club
20131019 生物物理若手 Journal ClubMed_KU
 
Spectral clustering with motifs and higher-order structures
Spectral clustering with motifs and higher-order structuresSpectral clustering with motifs and higher-order structures
Spectral clustering with motifs and higher-order structuresDavid Gleich
 
sequence alignment
sequence alignmentsequence alignment
sequence alignmentammar kareem
 
Financial Networks VI - Correlation Networks
Financial Networks VI - Correlation NetworksFinancial Networks VI - Correlation Networks
Financial Networks VI - Correlation NetworksKimmo Soramaki
 
IRJET- Comparative Study of Radial and Ring Type Distribution System
IRJET-  	  Comparative Study of Radial and Ring Type Distribution SystemIRJET-  	  Comparative Study of Radial and Ring Type Distribution System
IRJET- Comparative Study of Radial and Ring Type Distribution SystemIRJET Journal
 
M.E Computer Science Image Processing Projects
M.E Computer Science Image Processing ProjectsM.E Computer Science Image Processing Projects
M.E Computer Science Image Processing ProjectsVijay Karan
 
Adithya Rajan_Jan_2016
Adithya Rajan_Jan_2016Adithya Rajan_Jan_2016
Adithya Rajan_Jan_2016Adithya Rajan
 
M.Phil Computer Science Image Processing Projects
M.Phil Computer Science Image Processing ProjectsM.Phil Computer Science Image Processing Projects
M.Phil Computer Science Image Processing ProjectsVijay Karan
 
M.Phil Computer Science Image Processing Projects
M.Phil Computer Science Image Processing ProjectsM.Phil Computer Science Image Processing Projects
M.Phil Computer Science Image Processing ProjectsVijay Karan
 
Quantum persistent k cores for community detection
Quantum persistent k cores for community detectionQuantum persistent k cores for community detection
Quantum persistent k cores for community detectionColleen Farrelly
 
Computation and System Biology Assignment Help
Computation and System Biology Assignment HelpComputation and System Biology Assignment Help
Computation and System Biology Assignment HelpNursing Assignment Help
 
Use of eigenvalues and eigenvectors to analyze bipartivity of network graphs
Use of eigenvalues and eigenvectors to analyze bipartivity of network graphsUse of eigenvalues and eigenvectors to analyze bipartivity of network graphs
Use of eigenvalues and eigenvectors to analyze bipartivity of network graphscsandit
 
Eli plots visualizing innumerable number of correlations
Eli plots   visualizing innumerable number of correlationsEli plots   visualizing innumerable number of correlations
Eli plots visualizing innumerable number of correlationsLeonardo Auslender
 
IEEE Datamining 2016 Title and Abstract
IEEE  Datamining 2016 Title and AbstractIEEE  Datamining 2016 Title and Abstract
IEEE Datamining 2016 Title and Abstracttsysglobalsolutions
 

Similaire à Iterative methods for network alignment (20)

Skew-symmetric matrix completion for rank aggregation
Skew-symmetric matrix completion for rank aggregationSkew-symmetric matrix completion for rank aggregation
Skew-symmetric matrix completion for rank aggregation
 
Protein Threading
Protein ThreadingProtein Threading
Protein Threading
 
Seq alignment
Seq alignment Seq alignment
Seq alignment
 
20131019 生物物理若手 Journal Club
20131019 生物物理若手 Journal Club20131019 生物物理若手 Journal Club
20131019 生物物理若手 Journal Club
 
Spectral clustering with motifs and higher-order structures
Spectral clustering with motifs and higher-order structuresSpectral clustering with motifs and higher-order structures
Spectral clustering with motifs and higher-order structures
 
As 7
As 7As 7
As 7
 
sequence alignment
sequence alignmentsequence alignment
sequence alignment
 
Financial Networks VI - Correlation Networks
Financial Networks VI - Correlation NetworksFinancial Networks VI - Correlation Networks
Financial Networks VI - Correlation Networks
 
IRJET- Comparative Study of Radial and Ring Type Distribution System
IRJET-  	  Comparative Study of Radial and Ring Type Distribution SystemIRJET-  	  Comparative Study of Radial and Ring Type Distribution System
IRJET- Comparative Study of Radial and Ring Type Distribution System
 
M.E Computer Science Image Processing Projects
M.E Computer Science Image Processing ProjectsM.E Computer Science Image Processing Projects
M.E Computer Science Image Processing Projects
 
Adithya Rajan_Jan_2016
Adithya Rajan_Jan_2016Adithya Rajan_Jan_2016
Adithya Rajan_Jan_2016
 
Sequence Alignment
Sequence AlignmentSequence Alignment
Sequence Alignment
 
M.Phil Computer Science Image Processing Projects
M.Phil Computer Science Image Processing ProjectsM.Phil Computer Science Image Processing Projects
M.Phil Computer Science Image Processing Projects
 
M.Phil Computer Science Image Processing Projects
M.Phil Computer Science Image Processing ProjectsM.Phil Computer Science Image Processing Projects
M.Phil Computer Science Image Processing Projects
 
Quantum persistent k cores for community detection
Quantum persistent k cores for community detectionQuantum persistent k cores for community detection
Quantum persistent k cores for community detection
 
Computation and System Biology Assignment Help
Computation and System Biology Assignment HelpComputation and System Biology Assignment Help
Computation and System Biology Assignment Help
 
Use of eigenvalues and eigenvectors to analyze bipartivity of network graphs
Use of eigenvalues and eigenvectors to analyze bipartivity of network graphsUse of eigenvalues and eigenvectors to analyze bipartivity of network graphs
Use of eigenvalues and eigenvectors to analyze bipartivity of network graphs
 
Eli plots visualizing innumerable number of correlations
Eli plots   visualizing innumerable number of correlationsEli plots   visualizing innumerable number of correlations
Eli plots visualizing innumerable number of correlations
 
F14 lec12graphs
F14 lec12graphsF14 lec12graphs
F14 lec12graphs
 
IEEE Datamining 2016 Title and Abstract
IEEE  Datamining 2016 Title and AbstractIEEE  Datamining 2016 Title and Abstract
IEEE Datamining 2016 Title and Abstract
 

Plus de David Gleich

Engineering Data Science Objectives for Social Network Analysis
Engineering Data Science Objectives for Social Network AnalysisEngineering Data Science Objectives for Social Network Analysis
Engineering Data Science Objectives for Social Network AnalysisDavid Gleich
 
Correlation clustering and community detection in graphs and networks
Correlation clustering and community detection in graphs and networksCorrelation clustering and community detection in graphs and networks
Correlation clustering and community detection in graphs and networksDavid Gleich
 
Non-exhaustive, Overlapping K-means
Non-exhaustive, Overlapping K-meansNon-exhaustive, Overlapping K-means
Non-exhaustive, Overlapping K-meansDavid Gleich
 
Using Local Spectral Methods to Robustify Graph-Based Learning
Using Local Spectral Methods to Robustify Graph-Based LearningUsing Local Spectral Methods to Robustify Graph-Based Learning
Using Local Spectral Methods to Robustify Graph-Based LearningDavid Gleich
 
Spacey random walks and higher order Markov chains
Spacey random walks and higher order Markov chainsSpacey random walks and higher order Markov chains
Spacey random walks and higher order Markov chainsDavid Gleich
 
PageRank Centrality of dynamic graph structures
PageRank Centrality of dynamic graph structuresPageRank Centrality of dynamic graph structures
PageRank Centrality of dynamic graph structuresDavid Gleich
 
Iterative methods with special structures
Iterative methods with special structuresIterative methods with special structures
Iterative methods with special structuresDavid Gleich
 
Fast matrix primitives for ranking, link-prediction and more
Fast matrix primitives for ranking, link-prediction and moreFast matrix primitives for ranking, link-prediction and more
Fast matrix primitives for ranking, link-prediction and moreDavid Gleich
 
Matrix methods for Hadoop
Matrix methods for HadoopMatrix methods for Hadoop
Matrix methods for HadoopDavid Gleich
 

Plus de David Gleich (9)

Engineering Data Science Objectives for Social Network Analysis
Engineering Data Science Objectives for Social Network AnalysisEngineering Data Science Objectives for Social Network Analysis
Engineering Data Science Objectives for Social Network Analysis
 
Correlation clustering and community detection in graphs and networks
Correlation clustering and community detection in graphs and networksCorrelation clustering and community detection in graphs and networks
Correlation clustering and community detection in graphs and networks
 
Non-exhaustive, Overlapping K-means
Non-exhaustive, Overlapping K-meansNon-exhaustive, Overlapping K-means
Non-exhaustive, Overlapping K-means
 
Using Local Spectral Methods to Robustify Graph-Based Learning
Using Local Spectral Methods to Robustify Graph-Based LearningUsing Local Spectral Methods to Robustify Graph-Based Learning
Using Local Spectral Methods to Robustify Graph-Based Learning
 
Spacey random walks and higher order Markov chains
Spacey random walks and higher order Markov chainsSpacey random walks and higher order Markov chains
Spacey random walks and higher order Markov chains
 
PageRank Centrality of dynamic graph structures
PageRank Centrality of dynamic graph structuresPageRank Centrality of dynamic graph structures
PageRank Centrality of dynamic graph structures
 
Iterative methods with special structures
Iterative methods with special structuresIterative methods with special structures
Iterative methods with special structures
 
Fast matrix primitives for ranking, link-prediction and more
Fast matrix primitives for ranking, link-prediction and moreFast matrix primitives for ranking, link-prediction and more
Fast matrix primitives for ranking, link-prediction and more
 
Matrix methods for Hadoop
Matrix methods for HadoopMatrix methods for Hadoop
Matrix methods for Hadoop
 

Dernier

UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfUiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfDianaGray10
 
Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024D Cloud Solutions
 
Building AI-Driven Apps Using Semantic Kernel.pptx
Building AI-Driven Apps Using Semantic Kernel.pptxBuilding AI-Driven Apps Using Semantic Kernel.pptx
Building AI-Driven Apps Using Semantic Kernel.pptxUdaiappa Ramachandran
 
UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8DianaGray10
 
Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.YounusS2
 
UiPath Studio Web workshop series - Day 7
UiPath Studio Web workshop series - Day 7UiPath Studio Web workshop series - Day 7
UiPath Studio Web workshop series - Day 7DianaGray10
 
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPA
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPAAnypoint Code Builder , Google Pub sub connector and MuleSoft RPA
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPAshyamraj55
 
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...Aggregage
 
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...Will Schroeder
 
Comparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and IstioComparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and IstioChristian Posta
 
Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1DianaGray10
 
Linked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond OntologiesLinked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond OntologiesDavid Newbury
 
OpenShift Commons Paris - Choose Your Own Observability Adventure
OpenShift Commons Paris - Choose Your Own Observability AdventureOpenShift Commons Paris - Choose Your Own Observability Adventure
OpenShift Commons Paris - Choose Your Own Observability AdventureEric D. Schabell
 
20230202 - Introduction to tis-py
20230202 - Introduction to tis-py20230202 - Introduction to tis-py
20230202 - Introduction to tis-pyJamie (Taka) Wang
 
Cybersecurity Workshop #1.pptx
Cybersecurity Workshop #1.pptxCybersecurity Workshop #1.pptx
Cybersecurity Workshop #1.pptxGDSC PJATK
 
Igniting Next Level Productivity with AI-Infused Data Integration Workflows
Igniting Next Level Productivity with AI-Infused Data Integration WorkflowsIgniting Next Level Productivity with AI-Infused Data Integration Workflows
Igniting Next Level Productivity with AI-Infused Data Integration WorkflowsSafe Software
 
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDEADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDELiveplex
 
Bird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystemBird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystemAsko Soukka
 
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCostKubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCostMatt Ray
 
Salesforce Miami User Group Event - 1st Quarter 2024
Salesforce Miami User Group Event - 1st Quarter 2024Salesforce Miami User Group Event - 1st Quarter 2024
Salesforce Miami User Group Event - 1st Quarter 2024SkyPlanner
 

Dernier (20)

UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfUiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
 
Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024
 
Building AI-Driven Apps Using Semantic Kernel.pptx
Building AI-Driven Apps Using Semantic Kernel.pptxBuilding AI-Driven Apps Using Semantic Kernel.pptx
Building AI-Driven Apps Using Semantic Kernel.pptx
 
UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8
 
Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.
 
UiPath Studio Web workshop series - Day 7
UiPath Studio Web workshop series - Day 7UiPath Studio Web workshop series - Day 7
UiPath Studio Web workshop series - Day 7
 
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPA
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPAAnypoint Code Builder , Google Pub sub connector and MuleSoft RPA
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPA
 
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
 
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
 
Comparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and IstioComparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and Istio
 
Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1
 
Linked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond OntologiesLinked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond Ontologies
 
OpenShift Commons Paris - Choose Your Own Observability Adventure
OpenShift Commons Paris - Choose Your Own Observability AdventureOpenShift Commons Paris - Choose Your Own Observability Adventure
OpenShift Commons Paris - Choose Your Own Observability Adventure
 
20230202 - Introduction to tis-py
20230202 - Introduction to tis-py20230202 - Introduction to tis-py
20230202 - Introduction to tis-py
 
Cybersecurity Workshop #1.pptx
Cybersecurity Workshop #1.pptxCybersecurity Workshop #1.pptx
Cybersecurity Workshop #1.pptx
 
Igniting Next Level Productivity with AI-Infused Data Integration Workflows
Igniting Next Level Productivity with AI-Infused Data Integration WorkflowsIgniting Next Level Productivity with AI-Infused Data Integration Workflows
Igniting Next Level Productivity with AI-Infused Data Integration Workflows
 
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDEADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
 
Bird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystemBird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystem
 
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCostKubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
 
Salesforce Miami User Group Event - 1st Quarter 2024
Salesforce Miami User Group Event - 1st Quarter 2024Salesforce Miami User Group Event - 1st Quarter 2024
Salesforce Miami User Group Event - 1st Quarter 2024
 

Iterative methods for network alignment

  • 1. Iterative Methods for Network Alignment David F. Gleich Computer Science Purdue University with Arif Khan, Alex Pothen ! Purdue University, Computer Science Work supported by DOE CSCAPES Institute grant (DE- FC02-08ER25864), NSF CAREER grant 1149756-CCF, Mahantesh Halappanavar! and the Center for Adaptive Super Computing Software Pacific Northwest National Labs Multithreaded Architectures (CASS-MT) at PNNL. Mohsen Bayati, Amin Saberi! Stanford’s CADS grant from the Library of Congress. PNNL is operated by Battelle Memorial Institute under Stanford University contract DE-AC06-76RL01830 Ying Wang Google 1
  • 2. Network alignment" What is the best way of matching " graph A to B? w v s r t u A B 2
  • 3. the Figure 2. The NetworkBLAST local network alignment algorithm. Given two input s) or odes lem Network alignment" networks, a network alignment graph is constructed. Nodes in this graph correspond to pairs of sequence-similar proteins, one from each species, and edges correspond to conserved interactions. A search algorithm identifies highly similar subnetworks that follow a prespecified interaction pattern. Adapted from Sharan and Ideker.30 n the ent; nied ped lem net- one one plest ying eins ome the be- d as aph ever, ap- From Sharan and Ideker, Modeling cellular machinery through biological rked network comparison. Nat. Biotechnol. 24, 4 (Apr. 2006), 427–433. , we Figure 3. Performance comparison of computational approaches. 3 mon-
  • 4. Gleich (Purdue) Network alignment" d F. Gleich (Purdue) Motivation DavidINFORMS (Purdue) F. Gleich Seminar Network alignment 8 / 40 Motivation INFORMS Semin INFORMS Sem H/Wikipedia: Simple alignment fa mm 40 60 80 100 mm 120 40 60 80 100 mm 40 40 6040 80 j 100 r s 60 60 t 80 40 t 80 A L B A LCSH 297,266 vertices, 248,230 edges B Wikipedia 205,948 vertices, 382,353 edges L links 4,971,629 edges 60 4
  • 5. Sometimes small procedure on the labels of the two data he size of these datasets is reported in Table 6.5. For this a ome from a text-matching becomes big … Table 6.5 size of non-Web datasets. The product graph is never formed explicitly Dataset Size Nonzeros LCSH-2 59,849 227,464 WC-3 70,509 403,960 Product graph 4,219,893,141 91,886,357,440 periment, we do not investigate all the issues involved in using … Ananth has some better techniques to work with these large problems … d problem and focus on the performance of the inner-outer al nking context. Without any parameter optimization (i.e., usi ), the inner-outer scheme shows a significant performance adv d in Table 6.6. 5
  • 6. Network alignment" What is the best way of matching " graph A to B using only edges in L? w v s r wtu t u A L B 6
  • 7. Network alignment" Matching? 1-1 relationship" Best? highest weight and overlap w v Overlap s r wtu t u A L B 7
  • 8. Network alignment" … is NP-hard" … has no approximation algorithm w v r Overlap s •  Computer Vision •  Ontology matching •  Database matching wtu •  Bioinformatics t u A L B objective = α matching + βoverlap 8
  • 9. Network alignment" via mathematical programming maximize ↵wT x + 2 xT Sx w v s subject to Ax  e, xi 2 {0, 1} r Overlap Let xi be an indicator over edges in L wtu t u If A is the node-edge incidence matrix for L, then x is a 1-1 matching A L B Find a 1-1 matching between vertices with as many overlaps as possible. 9
  • 10. Network alignment" via mathematical programming maximize ↵wT x + 2 xT Sx w v s subject to Ax  e, xi 2 {0, 1} r Overlap Let xi be an indicator over edges in L wtu t u Let Sij = 1 when xi and xj overlap, then xTSx is twice the overlapped count. A L B Find a 1-1 matching between vertices with as many overlaps as possible. 10
  • 11. Our contributions A new belief propagation method (Bayati et al. 2009, 2013)" Outperformed state-of-the-art PageRank and optimization- based heuristic methods High performance C++ implementations (Khan et al. 2012)" 40 times faster (C++ ~ 3, complexity ~ 2, threading ~ 8)" 5 million edge alignments ~ 10 sec" www.cs.purdue.edu/~dgleich/codes/netalignmc 11
  • 13. Iterative methods " for network alignment Each iteration involves Let x[i] be the score for Matrix-vector-ish computations each pair-wise match in L with a sparse matrix, e.g. sparse matrix vector products in a semi- for i=1 to ... ring, dot-products, axpy, etc. update x[i] to y[i] Bipartite max-weight matching compute a using a different weight vector at max-weight match with y each iteration update y[i] to x[i] " (using match in MR) No “convergence” " 100-1000 iterations 13
  • 14. Open question 1! Any sort of property of " these methods beyond ... (i) Principled derivation and " (ii) “David and Ananth say they work”? 14
  • 15. David F. Gleich (Purdue) Algorithms INFORMS Seminar 25 / 40 Belief propagation methods Belief propagation: Our algorithm mm 40 60 80 100 120 Summary History … Construct a probability … BP used for computing model where the most 40 marginal probabilities and likely state is the solution! maximum aposterori … Locally update information probability … Like a generalized dynamic … Wildly successful at solving program 60 satisfiability problems … Convergent algorithm for … It works max-weight matching … Most likely, it won’t 80 converge 15 Bayati et al. 2005;
  • 16. David F. Gleich (Purdue) Algorithms INFORM Belief propagation for The neighbor operation used to define the left-hand vector x@fi is implicitly defined b NetAlign factor graph: Loopy BP the set of variables used on the right-hand side of the equation. In words, the functio network alignment node fi (gi0 ) enforces the matching constraint at i (i0 ) Another type of function nodes check the validity of squares. For each square ii0 ⇤ j mm define a function node hii0 jj 0 : {0, 1}40|+|S| ! R: 60 |EL 80 100 ( 1 xii0 jj 0 = xii0 xjj 0 Functions Variables hii0 jj 0 x@hii0 jj0 = for all (ii0 , jj 0 ) 2 VS . A B 0 otherwise ƒ1 11 0 In other words, 1 ii0 jj 0 40 h ƒ2 guarantees that xii0 jj 0 = 1 if and only if xii0 = xjj 0 = 1. 10 12 0 The edges of the factor graph are simply0 connecting each g0 function node to the variab 1 22 nodes it acts on. For example each fi is connected to all variable nodes ii0 2 EL and eac 2 20 0 0 0 0 0 hii0 jj 0 is connected to ii , jj and ii jj in EL 23 0 [ V . Thereforeg2 factor graph is bipartit the S g0 Figure 3 shows an example of a graph pair A, B and their factor-graph representation a 3 described above. 60 30 110 220 Now define the following probability distribution h110 220 2 3 n m 1 4Y Y Y T T p(xL , xS ) = fi (x@fi ) gj (x@gj ) hijrs (x@hijrs )5 e↵w xL + 2 1|S| xS (4 Z i=1 80 j=1 ijrs2VS where Z is just a normalization term to make p(xL , xS ) a probability distribution. I 16 Note It’s pretty hairy to put all the stuff I should put here on a single slide. Most of it is in the pap particular, The rest is just “turning the crank” with standard tricks in BP algorithms. 2 3
  • 17. Algorithms INFORMS Seminar 26 / 40 M !j { = s} = Y i Mj0 ! { = s} 60 80 100 120 j0 2{N( )j} j variable tells function j what it thinks about being in state s. This is just the product of what all the other functions tell about being in state s. i Mj! { = s} = m xim m ns y:all possible choices 17 for variables 0 2N(j)
  • 18. variable tells function j what it thinks about being in state s. This is just the product of what all the other functions tell about being in state s. i Mj! { = s} = m xim m ns y:all possible choices for variables 0 2N(j) 2 3 es j Y 6 0 7 4ƒj (y) M 0 !j { = y 0 }5 0 2{N(j) } function j tells variable what it thinks about being in state s. This means that we l have to locally maxamize ƒj among all possible choices. Note y = s always (too cumbersome to include in notation.) 18
  • 19. Belief propagation for network alignment For t 1, the messages in iteration t are obtained from the messages in iteration t 1 recursively. In particular for all ii 0 2 EL ✓ h i ◆+ (t) (t 1) mii 0 !fi = ↵wii 0 max mki 0 !g 0 k 6=i i X ✓ ◆ (t 1) + min , max(0, + mjj 0 !h 0 ) . (1) 2 2 ii jj 0 ii 0 jj 0 2VS (t) The update rule for mii 0 !g 0 is similar, and i ✓ h i ◆+ ✓ h i ◆+ (t) (t 1) (t 1) mii 0 !h 0 0 = ↵wii 0 max mki 0 !g 0 max mik 0 !fi ii jj k6=i i k 0 6=i 0 X ✓ ◆ (t 1) + min , max(0, mkk 0 !h 0 0 + ) . (2) 0 0 2 ii kk 2 kk 6=jj ii 0 kk 0 2VS
  • 20. Synthetic evaluation of network alignment 1 0.8 fraction correct 0.6 0.4 MR 0.2 BP BPSC IsoRank 0 10 15 20 0 5 10 15 20 degree of noise in L (p ⋅ n) expected degree of noise in L (p ⋅ n)
  • 21. Open question 2! When could we hope to solve such synthetic problems in asymptotic regimes? 21
  • 22. Does it work? LCSH – Library of Congress subject headings Network Alignment :25 Rameau – French National Library subject headings Table IV. The alignment results for LCSH and Rameau. The first set of results shows the statistics of the known Manually matched alignment and the results from the max-weight matching algorithm. Next we show results from our algorithms for three objective parameters. The columns are: objective parameters, algorithms, matching weight, matching edge overlap, time, total correct, recall, precision, and matching triangle overlap. Obj. Alg. Weight Overlap Time (s) Correct Rec. Prec. Triangles Sol. 36332.42 39847 — 57645 100% 100% 2073 MWM 93279.0 16990 29.6 29098 50.5% 23.3% 350 ↵ = 1, =1 BP" MP 84622.0 46400 23522.0 32585 56.5% 27.6% 1515 BP++" MP++ 85810.1 46942 27115.6 32857 57.0% 27.4% 1548 MR MR 87588.6 48367 33366.9 33225 57.6% 27.0% 1617 ↵ = 1, =2 BP" MP 81752.6 46569 23427.1 31724 55.0% 27.6% 1483 BP++" MP++ 84615.7 46656 26673.1 31952 55.4% 26.7% 1531 MR MR 85438.4 48934 56961.6 32303 56.0% 26.3% 1604 ↵ = 0, =1 BP" MP 60617.9 45247 14284.8 24794 43.0% 23.2% 1467 BP++" MP++ 60502.8 41592 13979.5 24498 42.5% 23.0% 1484 MR MR 65994.2 46163 10384.4 25455 44.2% 21.5% 1602 22 protein-protein interaction networks and ontologies. In the future, we envision applications of these techniques in mapping large social network structure.
  • 23. Open question 3! How can we evaluate alignments? " What are possible null-models? 23
  • 24. David F. Gleich (Purdue) Results INFORMS Seminar 35 / 40 Matching results: A little too hot! mm 40 60 LCSH WC 80 100 120 Science fiction television series Science fiction television programs Turing test Turing test Machine learning Machine learning Hot tubs Hot dog 40 60 80 24
  • 25. Higher-order " } This proposal is for match- network alignment using tensor ing triangles methods: k in 0 j Triangle j g this i k i 0 0 k euris- using A L B istics If xi , xj , and xk are algo- indicators associated with 25 volves the edges (i, i0 ), (j, j 0 ), and
  • 26. Network alignment" A L B This proposal is for programming via mathematical match- ing triangles using tensor methods: n maximize ↵wT x + 2 xT Sx j Triangle j0 subject to Ax  e, xi 2 {0, 1} s i k 0 i 0 k - g A L B s If xi , xj , and xk are - indicators1-1 matching between vertices with Find a associated with s theas many overlaps0 ), and edges (i, i0 ), (j, j as possible. 26 o (k, k 0 ), then we want to
  • 27. Triangle alignment" A L B This proposal is for programming via mathematical match- ing triangles using tensor methods: n ↵wT x + 2 xT Sx X j Triangle j0 maximize + Tijk xi xj xk s i k0 i0 ijk k subject to Ax  e, xi 2 {0, 1} - g A L B s If xi , xj , and xk are - indicators1-1 matching between vertices with Find a associated with s edges (i, i0 ), (j, j and triangles as possible. theas many overlaps0 ), and 27 o (k, k 0 ), then we want to
  • 28. Tensor eigenvalues" A L B This proposal is for match- and a power method ing triangles using tensor methods: P maximize ijk Tijk xi xj xk n subject to kxk2 = 1 j Triangle j0 s i k0 i0 [x(next) ]i = ⇢ · ( X Tijk xj xk + xi ) k - jk where 𝜌 ensures the 2-norm g SSHOPM method due to " A L B Kolda and Mayo s IfHuman,protein interaction networks 48,228 triangles xi xj , and xk are - indicatorsinteraction networks with triangles Yeast protein associated nonzeros 257,978 The tensor T has ~100,000,000,000 s the We work with it i0 ), (j, j 0 ), and edges (i, implicitly 28 o (k, k 0 ), then we want to
  • 29. Synthetic evaluation of network alignment 1 1 0.8 0.8 fraction correct fraction correct 0.6 0.6 0.4 0.4 0.2 Eigen MR Teigen 0.2 BP Iso 0 BPSC 0 5 10 15 20 expected degree of noise in L (p n) IsoRank 0 10 15 20 0 5 10 15 20 degree of noise in L (p ⋅ n) expected degree of noise in L (p ⋅ n)
  • 30. Open question 4! When do we need triangles? 30