Iterative Methods for Network Alignment

David F. Gleich
Computer Science, Purdue University

Arif Khan, Alex Pothen (Purdue University, Computer Science)
Mahantesh Halappanavar (Pacific Northwest National Labs)
Mohsen Bayati, Amin Saberi (Stanford University)
Ying Wang (Google)

Work supported by DOE CSCAPES Institute grant (DE-FC02-08ER25864), NSF CAREER grant 1149756-CCF, and the Center for Adaptive Super Computing Software Multithreaded Architectures (CASS-MT) at PNNL. Stanford’s CADS grant from the Library of Congress. PNNL is operated by Battelle Memorial Institute under contract DE-AC06-76RL01830.

Network alignment
What is the best way of matching graph A to B?

[Figure: graphs A and B, with a vertex t in A and a vertex u in B]

Network alignment

[Figure 2. The NetworkBLAST local network alignment algorithm. Given two input networks, a network alignment graph is constructed. Nodes in this graph correspond to pairs of sequence-similar proteins, one from each species, and edges correspond to conserved interactions. A search algorithm identifies highly similar subnetworks that follow a prespecified interaction pattern. Adapted from Sharan and Ideker.]

From Sharan and Ideker, Modeling cellular machinery through biological network comparison. Nat. Biotechnol. 24, 4 (Apr. 2006), 427–433.

LCSH/Wikipedia: Simple alignment fa…

     Graph    Dataset      Size
     A        LCSH         297,266 vertices, 248,230 edges
     B        Wikipedia    205,948 vertices, 382,353 edges
     L        links        4,971,629 edges

David F. Gleich (Purdue), Network alignment: Motivation, INFORMS Seminar

Sometimes small becomes big …

Table 6.5: size of non-Web datasets. The product graph is never formed explicitly.

           Dataset                      Size         Nonzeros
           LCSH-2                    59,849            227,464
           WC-3                      70,509            403,960
           Product graph      4,219,893,141    91,886,357,440

… Ananth has some better techniques to work with these large problems …
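The product-graph row of Table 6.5 can be checked with a few lines of arithmetic. The sketch below assumes the product graph is the Kronecker product of the two adjacency matrices, so that sizes and nonzero counts simply multiply; the function name is illustrative and not from the talk.

```python
def product_graph_size(n_a, nnz_a, n_b, nnz_b):
    """Dimension and nonzero count of the Kronecker product of two sparse
    adjacency matrices (assumed here to be how the product graph is defined)."""
    return n_a * n_b, nnz_a * nnz_b


# LCSH-2 and WC-3 rows from Table 6.5
size, nonzeros = product_graph_size(59_849, 227_464, 70_509, 403_960)
print(f"{size:,} vertices, {nonzeros:,} nonzeros")
```

Under that assumption the result reproduces the table's last row, which is why the product graph is never formed explicitly: two graphs with a few hundred thousand nonzeros each yield roughly 92 billion.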

Network alignment
What is the best way of matching graph A to B using only edges in L?

[Figure: graphs A and B joined by the candidate edges L, with t in A and u in B]

Network alignment
Matching? 1-1 relationship
Best? highest weight and overlap

[Figure: graphs A, L, B with vertices t, u, s and an overlap marked]

Network alignment
… is NP-hard
… has no approximation algorithm

•    Computer Vision
•    Ontology matching
•    Database matching
•    Bioinformatics

[Figure: graphs A, L, B with vertices r, s, t, u, the edge weight $w_{tu}$, and an overlap marked]

objective = α matching + β overlap

Network alignment
via mathematical programming

[Figure: graphs A, L, B with vertices r, t, u and an overlap marked]

    maximize    $\alpha w^T x + \frac{\beta}{2} x^T S x$
    subject to  $A x \le e$, $x_i \in \{0, 1\}$

Let $x_i$ be an indicator over edges in L.
If A is the node-edge incidence matrix for L, then $A x \le e$ makes x a 1-1 matching.

Find a 1-1 matching between vertices with as many overlaps as possible.
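To make the constraint concrete, here is a minimal sketch of the incidence structure for a toy L and a check of $Ax \le e$. The edge list and helper names are made up for illustration; vertices of each graph are assumed to be numbered from 0.

```python
def incidence_rows(edges_L, n_a, n_b):
    """Rows of the node-edge incidence matrix of the bipartite graph L.

    Row v holds the indices of the L-edges touching vertex v. The first n_a
    rows are vertices of graph A, the next n_b rows are vertices of graph B.
    """
    rows = [[] for _ in range(n_a + n_b)]
    for e, (a, b) in enumerate(edges_L):
        rows[a].append(e)
        rows[n_a + b].append(e)
    return rows


def is_matching(x, edges_L, n_a, n_b):
    """Check Ax <= e for a 0/1 indicator vector x over the edges of L."""
    return all(sum(x[e] for e in row) <= 1
               for row in incidence_rows(edges_L, n_a, n_b))


# Toy L: each pair is (vertex of A, vertex of B)
edges_L = [(0, 0), (0, 1), (1, 1), (2, 2)]
ok = is_matching([1, 0, 1, 1], edges_L, 3, 3)   # every vertex used at most once
bad = is_matching([1, 1, 0, 0], edges_L, 3, 3)  # vertex 0 of A is used twice
print(ok, bad)
```

Each row of the incidence matrix sums the indicators of the edges at one vertex, so bounding every row by 1 is exactly the 1-1 requirement.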

Network alignment
via mathematical programming

[Figure: graphs A, L, B with vertices r, t, u and an overlap marked]

    maximize    $\alpha w^T x + \frac{\beta}{2} x^T S x$
    subject to  $A x \le e$, $x_i \in \{0, 1\}$

Let $x_i$ be an indicator over edges in L.
Let $S_{ij} = 1$ when $x_i$ and $x_j$ overlap, then $x^T S x$ is twice the overlapped count.

Find a 1-1 matching between vertices with as many overlaps as possible.
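A small sketch of the overlap matrix and the objective follows. It assumes that two L-edges (i, i') and (j, j') overlap when (i, j) is an edge of A and (i', j') is an edge of B; the toy graphs and all function names are illustrative only.

```python
def overlap_matrix(edges_L, edges_A, edges_B):
    """S[p][q] = 1 when L-edges p = (i, i') and q = (j, j') overlap, i.e.
    (i, j) is an edge of A and (i', j') is an edge of B (assumed definition)."""
    E_A = {frozenset(e) for e in edges_A}
    E_B = {frozenset(e) for e in edges_B}
    m = len(edges_L)
    S = [[0] * m for _ in range(m)]
    for p, (i, ip) in enumerate(edges_L):
        for q, (j, jp) in enumerate(edges_L):
            if frozenset((i, j)) in E_A and frozenset((ip, jp)) in E_B:
                S[p][q] = 1
    return S


def quadratic_form(x, S):
    """x^T S x for dense list-of-lists S."""
    n = len(x)
    return sum(x[p] * S[p][q] * x[q] for p in range(n) for q in range(n))


def objective(x, w, S, alpha, beta):
    """alpha * w^T x + (beta / 2) * x^T S x."""
    return alpha * sum(wi * xi for wi, xi in zip(w, x)) + beta / 2 * quadratic_form(x, S)


# Toy instance: A and B are both the path 0 - 1 - 2
edges_A = [(0, 1), (1, 2)]
edges_B = [(0, 1), (1, 2)]
edges_L = [(0, 0), (1, 1), (2, 2), (0, 1)]
w = [1.0, 1.0, 1.0, 1.0]
x = [1, 1, 1, 0]  # the identity matching

S = overlap_matrix(edges_L, edges_A, edges_B)
twice_overlaps = quadratic_form(x, S)
value = objective(x, w, S, alpha=1.0, beta=1.0)
print(twice_overlaps, value)
```

The identity matching preserves both path edges, so the quadratic form counts each of the two overlaps twice, and the factor 1/2 in the objective turns that back into a plain overlap count.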

Our contributions
A new belief propagation method (Bayati et al. 2009, 2013)
Outperformed state-of-the-art PageRank and optimization-based heuristic methods

High performance C++ implementations (Khan et al. 2012)
40 times faster (C++ ~ 3, complexity ~ 2, threading ~ 8)
5 million edge alignments ~ 10 sec

Iterative methods for network alignment

Each iteration involves
    Matrix-vector-ish computations with a sparse matrix, e.g. sparse matrix vector products in a semi-ring, dot-products, axpy, etc.
    Bipartite max-weight matching using a different weight vector at each iteration

Let x[i] be the score for each pair-wise match in L
for i=1 to ...
    update x[i] to y[i]
    compute a max-weight match with y
    update y[i] to x[i] (using match in MR)

No “convergence”
100-1000 iterations
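The loop structure above can be sketched on a toy problem. This is not the BP or MR update from the talk: the score update used here (y = αw + βSx, then rescaling) is a simple stand-in chosen for illustration, and the exhaustive matching routine only works for tiny inputs. All names and the toy data are assumptions.

```python
def max_weight_matching(weights, edges_L):
    """Exact max-weight bipartite matching by exhaustive search (tiny inputs only)."""
    best_total, best_edges = 0.0, []

    def extend(e, used_a, used_b, total, chosen):
        nonlocal best_total, best_edges
        if e == len(edges_L):
            if total > best_total:
                best_total, best_edges = total, chosen[:]
            return
        extend(e + 1, used_a, used_b, total, chosen)  # skip edge e
        a, b = edges_L[e]
        if weights[e] > 0 and a not in used_a and b not in used_b:
            chosen.append(e)  # take edge e
            extend(e + 1, used_a | {a}, used_b | {b}, total + weights[e], chosen)
            chosen.pop()

    extend(0, frozenset(), frozenset(), 0.0, [])
    return [1 if e in best_edges else 0 for e in range(len(edges_L))]


def objective(x, w, S, alpha, beta):
    n = len(x)
    quad = sum(x[p] * S[p][q] * x[q] for p in range(n) for q in range(n))
    return alpha * sum(wi * xi for wi, xi in zip(w, x)) + beta / 2 * quad


def iterate_and_round(w, S, edges_L, alpha=1.0, beta=1.0, iters=20):
    """Generic iterate-then-round loop; keeps the best rounded matching seen."""
    n = len(w)
    x = list(w)
    best_x, best_obj = None, float("-inf")
    for _ in range(iters):
        # "matrix-vector-ish" step (stand-in update, not the talk's BP/MR rule)
        y = [alpha * w[p] + beta * sum(S[p][q] * x[q] for q in range(n)) for p in range(n)]
        # rounding step: max-weight matching with this iteration's weight vector
        match = max_weight_matching(y, edges_L)
        obj = objective(match, w, S, alpha, beta)
        if obj > best_obj:
            best_x, best_obj = match, obj
        scale = max(y) or 1.0
        x = [v / scale for v in y]
    return best_x, best_obj


edges_L = [(0, 0), (1, 1), (2, 2), (0, 1)]
w = [1.0, 1.0, 1.0, 1.0]
S = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 0]]  # overlaps among L-edges
best_x, best_obj = iterate_and_round(w, S, edges_L)
print(best_x, best_obj)
```

Because the scores need not converge, the sketch tracks the best rounded matching across iterations instead of returning the last one, which matches the "no convergence" remark on the slide.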

Open question 1

Any sort of property of these methods beyond ...
(i) Principled derivation and
(ii) “David and Ananth say they work”?


Belief propagation methods
Belief propagation: Our algorithm

Summary
 …   Construct a probability model where the most likely state is the solution!
 …   Locally update information
 …   Like a generalized dynamic program
 …   It works
 …   Most likely, it won’t

History
 …   BP used for computing marginal probabilities and maximum a posteriori probability
 …   Wildly successful at solving satisfiability problems
 …   Convergent algorithm for max-weight matching (Bayati et al. 2005)

Belief propagation for network alignment
NetAlign factor graph: Loopy BP

The neighbor operation used to define the left-hand vector $x_{\partial f_i}$ is implicitly defined by the set of variables used on the right-hand side of the equation. In words, the function node $f_i$ ($g_{i'}$) enforces the matching constraint at $i$ ($i'$).

Another type of function nodes check the validity of squares. For each square $ii'jj'$ define a function node $h_{ii'jj'} : \{0, 1\}^{|E_L| + |S|} \to \mathbb{R}$:

$$h_{ii'jj'}\left(x_{\partial h_{ii'jj'}}\right) = \begin{cases} 1 & x_{ii'jj'} = x_{ii'} x_{jj'} \\ 0 & \text{otherwise} \end{cases} \qquad \text{for all } (ii', jj') \in V_S.$$

In other words, $h_{ii'jj'}$ guarantees that $x_{ii'jj'} = 1$ if and only if $x_{ii'} = x_{jj'} = 1$.

The edges of the factor graph are simply connecting each function node to the variable nodes it acts on. For example each $f_i$ is connected to all variable nodes $ii' \in E_L$ and each $h_{ii'jj'}$ is connected to $ii'$, $jj'$ and $ii'jj'$ in $E_L \cup V_S$. Therefore the factor graph is bipartite. Figure 3 shows an example of a graph pair A, B and their factor-graph representation as described above.

[Figure: a graph pair A, B and its factor graph, with variable nodes and function nodes f, g, and h]

Now define the following probability distribution

$$p(x_L, x_S) = \frac{1}{Z} \left[ \prod_{i=1}^{n} f_i(x_{\partial f_i}) \prod_{j=1}^{m} g_j(x_{\partial g_j}) \prod_{ijrs \in V_S} h_{ijrs}(x_{\partial h_{ijrs}}) \right] e^{\alpha w^T x_L + \frac{\beta}{2} \mathbf{1}_{|S|}^T x_S} \qquad (4)$$

where Z is just a normalization term to make $p(x_L, x_S)$ a probability distribution. In particular, …

Note: It’s pretty hairy to put all the stuff I should put here on a single slide. Most of it is in the paper. The rest is just “turning the crank” with standard tricks in BP algorithms.

$$M_{i \to j}\{x_i = s\} = \prod_{j' \in N(i) \setminus j} M_{j' \to i}\{x_i = s\}$$

Variable i tells function j what it thinks about being in state s. This is just the product of what all the other functions tell i about being in state s.

$$M_{j \to i}\{x_i = s\} = \max_{y:\ \text{all possible choices for variables } i' \in N(j)} \left[ f_j(y) \prod_{i' \in N(j) \setminus i} M_{i' \to j}\{x_{i'} = y_{i'}\} \right]$$

Function j tells variable i what it thinks about being in state s. This means that we have to locally maximize $f_j$ among all possible choices. Note $y_i = s$ always (too cumbersome to include in notation).
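The two message rules can be tried on a tiny factor graph. The sketch below uses a made-up tree with two binary variables and three nonnegative factors (none of this is the NetAlign factor graph); on a tree, the product of incoming function messages at a variable should equal the max-marginal, which the code compares against brute force.

```python
from itertools import product
from math import prod

# Hypothetical factor graph: a -- x0 -- b -- x1 -- c
# Each factor: (tuple of variables it touches, function of their states)
factors = {
    "a": ((0,), lambda x0: [1.0, 2.0][x0]),
    "b": ((0, 1), lambda x0, x1: 3.0 if x0 != x1 else 1.0),
    "c": ((1,), lambda x1: [2.0, 1.0][x1]),
}
states = (0, 1)
variables = (0, 1)
neighbors = {i: [j for j, (vs, _) in factors.items() if i in vs] for i in variables}


def run_max_product(rounds=5):
    v2f = {(i, j): {s: 1.0 for s in states} for i in variables for j in neighbors[i]}
    f2v = {(j, i): {s: 1.0 for s in states} for i in variables for j in neighbors[i]}
    for _ in range(rounds):
        # function j -> variable i: maximize f_j over the other variables
        new_f2v = {}
        for (j, i) in f2v:
            vs, f = factors[j]
            msg = {}
            for s in states:
                best = 0.0  # factors are nonnegative here
                for y in product(states, repeat=len(vs)):
                    if y[vs.index(i)] != s:  # enforce y_i = s
                        continue
                    incoming = prod(v2f[(k, j)][y[vs.index(k)]] for k in vs if k != i)
                    best = max(best, f(*y) * incoming)
                msg[s] = best
            new_f2v[(j, i)] = msg
        f2v = new_f2v
        # variable i -> function j: product of what the other functions said
        v2f = {(i, j): {s: prod(f2v[(jp, i)][s] for jp in neighbors[i] if jp != j)
                        for s in states}
               for (i, j) in v2f}
    # belief at each variable: product of all incoming function messages
    return {i: {s: prod(f2v[(j, i)][s] for j in neighbors[i]) for s in states}
            for i in variables}


def brute_force_max_marginals():
    out = {i: {s: 0.0 for s in states} for i in variables}
    for x in product(states, repeat=len(variables)):
        value = prod(f(*(x[v] for v in vs)) for vs, f in factors.values())
        for i in variables:
            out[i][x[i]] = max(out[i][x[i]], value)
    return out


beliefs = run_max_product()
exact = brute_force_max_marginals()
print(beliefs, exact)
```

On graphs with cycles, such as the network alignment factor graph, the same updates are run as "loopy" BP and this exactness guarantee no longer applies.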

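Belief propagation for network alignment

For $t \ge 1$, the messages in iteration t are obtained from the messages in iteration $t - 1$ recursively. In particular for all $ii' \in E_L$

$$m^{(t)}_{ii' \to f_i} = \alpha w_{ii'} - \left( \max_{k \ne i} \left[ m^{(t-1)}_{ki' \to g_{i'}} \right] \right)^{+} + \sum_{ii'jj' \in V_S} \min\left( \frac{\beta}{2}, \max\left( 0, \frac{\beta}{2} + m^{(t-1)}_{jj' \to h_{ii'jj'}} \right) \right). \qquad (1)$$

The update rule for $m_{ii' \to g_{i'}}$ is similar, and

$$m^{(t)}_{ii' \to h_{ii'jj'}} = \alpha w_{ii'} - \left( \max_{k \ne i} \left[ m^{(t-1)}_{ki' \to g_{i'}} \right] \right)^{+} - \left( \max_{k' \ne i'} \left[ m^{(t-1)}_{ik' \to f_i} \right] \right)^{+} + \sum_{\substack{kk' \ne jj' \\ ii'kk' \in V_S}} \min\left( \frac{\beta}{2}, \max\left( 0, m^{(t-1)}_{kk' \to h_{ii'kk'}} + \frac{\beta}{2} \right) \right). \qquad (2)$$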
Synthetic evaluation of network alignment

[Figure: fraction correct vs. expected degree of noise in L (p ⋅ n), from 0 to 20; legend entry visible: BP]
Open question 2

When could we hope to solve such synthetic problems in asymptotic …

Does it work?
LCSH – Library of Congress subject headings
Rameau – French National Library subject headings
Manually matched

Table IV. The alignment results for LCSH and Rameau. The first set of results shows the statistics of the known alignment and the results from the max-weight matching algorithm. Next we show results from our algorithms for three objective parameters. The columns are: objective parameters, algorithms, matching weight, matching edge overlap, time, total correct, recall, precision, and matching triangle overlap.

   Obj.            Alg.    Weight      Overlap    Time (s)    Correct    Rec.     Prec.    Triangles
                   Sol.    36332.42    39847      —           57645      100%     100%     2073
                   MWM     93279.0     16990      29.6        29098      50.5%    23.3%    350
   α = 1, β = 1    BP
                   MP      84622.0     46400      23522.0     32585      56.5%    27.6%    1515
                   MP++    85810.1     46942      27115.6     32857      57.0%    27.4%    1548
                   MR      87588.6     48367      33366.9     33225      57.6%    27.0%    1617
   α = 1, β = 2    BP
                   MP      81752.6     46569      23427.1     31724      55.0%    27.6%    1483
                   MP++    84615.7     46656      26673.1     31952      55.4%    26.7%    1531
                   MR      85438.4     48934      56961.6     32303      56.0%    26.3%    1604
   α = 0, β = 1    BP
                   MP      60617.9     45247      14284.8     24794      43.0%    23.2%    1467
                   MP++    60502.8     41592      13979.5     24498      42.5%    23.0%    1484
                   MR      65994.2     46163      10384.4     25455      44.2%    21.5%    1602
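The recall column of Table IV can be reproduced from the "Correct" column, assuming recall is the number of correct matches divided by the 57645 matches in the known solution (the "Sol." row). The helper name and the dictionary keys below are illustrative.

```python
TOTAL_KNOWN = 57_645  # "Correct" entry of the known solution row (Sol.)


def recall_percent(correct, total=TOTAL_KNOWN):
    """Recall as a percentage rounded to one decimal (assumed definition:
    correct matches divided by the size of the known alignment)."""
    return round(100 * correct / total, 1)


# "Correct" values copied from Table IV
correct_counts = {
    "MWM": 29_098,
    "MP, alpha=1, beta=1": 32_585,
    "MR, alpha=1, beta=1": 33_225,
    "MP++, alpha=0, beta=1": 24_498,
}
recalls = {name: recall_percent(c) for name, c in correct_counts.items()}
for name, r in recalls.items():
    print(f"{name}: {r}%")
```

Under this definition the computed values agree with the table, so the best methods recover a little over half of the manually matched pairs.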

… protein-protein interaction networks and ontologies. In the future, we envision applications of these techniques in mapping large social network structure.
Open question 3

How can we evaluate alignments?
What are possible null-models?


Matching results: A little too hot!

   LCSH                                 WC
   Science fiction television series    Science fiction television programs
   Turing test                          Turing test
   Machine learning                     Machine learning
   Hot tubs                             Hot dog



Higher-order network alignment

[Figure: graphs A, L, B with a triangle on vertices i, j, k marked]

This proposal is for matching triangles using tensor …
If $x_i$, $x_j$, and $x_k$ are indicators associated with the edges $(i, i')$, $(j, j')$, and …
Network alignment
via mathematical programming

[Figure: graphs A, L, B with a triangle i, j in A and i', j', k' in B]

    maximize    $\alpha w^T x + \frac{\beta}{2} x^T S x$
    subject to  $A x \le e$, $x_i \in \{0, 1\}$

Find a 1-1 matching between vertices with as many overlaps as possible.
Triangle alignment
via mathematical programming

[Figure: graphs A, L, B with a triangle i, j in A and i', j' in B]

    maximize    $\alpha w^T x + \frac{\beta}{2} x^T S x + \sum_{ijk} T_{ijk} x_i x_j x_k$
    subject to  $A x \le e$, $x_i \in \{0, 1\}$

Find a 1-1 matching between vertices with as many overlaps and triangles as possible.
Tensor eigenvalues
and a power method

    maximize    $\sum_{ijk} T_{ijk} x_i x_j x_k$
    subject to  $\|x\|_2 = 1$

    $[x^{\text{new}}]_i = \rho \cdot \left( \sum_{jk} T_{ijk} x_j x_k + x_i \right)$
    where ρ ensures the 2-norm constraint holds

SSHOPM method due to Kolda and Mayo

Human protein interaction networks: 48,228 triangles
Yeast protein interaction networks: … triangles
The tensor T has ~100,000,000,000 nonzeros
We work with it implicitly
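A minimal sketch of a shifted power iteration of this form, on a toy symmetric tensor built from two triangles, is shown below. The tensor, the helper names, and the `shift` parameter are assumptions for illustration; the shift is set to 10 here because a sufficiently large shift is what guarantees a nondecreasing objective for a symmetric tensor, whereas the slide's update as extracted shows a plain `+ x_i`.

```python
from itertools import permutations
from math import sqrt


def symmetric_tensor(triangles):
    """Sparse symmetric 3-way tensor with T[i, j, k] = 1 for every ordering
    of each listed triangle, stored as a dict keyed by (i, j, k)."""
    T = {}
    for tri in triangles:
        for p in permutations(tri):
            T[p] = 1.0
    return T


def tensor_apply(T, x):
    """y[i] = sum over j, k of T[i, j, k] * x[j] * x[k], using only stored entries."""
    y = [0.0] * len(x)
    for (i, j, k), t in T.items():
        y[i] += t * x[j] * x[k]
    return y


def shifted_power_method(T, n, shift=1.0, iters=200):
    """Iterate x <- rho * (T x x + shift * x), with rho normalizing to unit 2-norm.
    Returns the final x and the objective sum T[i,j,k] x_i x_j x_k per iteration."""
    x = [1.0 / sqrt(n)] * n
    history = []
    for _ in range(iters):
        y = tensor_apply(T, x)
        history.append(sum(xi * yi for xi, yi in zip(x, y)))
        z = [yi + shift * xi for xi, yi in zip(x, y)]
        rho = 1.0 / sqrt(sum(v * v for v in z))
        x = [rho * v for v in z]
    return x, history


# Toy tensor: triangles (0, 1, 2) and (1, 2, 3) on four indices
T = symmetric_tensor([(0, 1, 2), (1, 2, 3)])
x, history = shifted_power_method(T, n=4, shift=10.0)
print([round(v, 4) for v in x], round(history[-1], 6))
```

Only the stored nonzeros are touched in `tensor_apply`, which is the small-scale analogue of working with the tensor implicitly. In the toy result, indices 1 and 2, which sit in both triangles, receive more weight than indices 0 and 3.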
Synthetic evaluation of network alignment

[Figure: two panels of fraction correct vs. expected degree of noise in L (p ⋅ n), from 0 to 20; legend entries: Eigen, Teigen, MR, BP, BPSC, IsoRank]
Open question 4

When do we need triangles?

Iterative methods for network alignment

Contenu connexe

En vedette

Anti-differentiating Approximation Algorithms: PageRank and MinCut
Anti-differentiating Approximation Algorithms: PageRank and MinCutAnti-differentiating Approximation Algorithms: PageRank and MinCut
Anti-differentiating Approximation Algorithms: PageRank and MinCutDavid Gleich
Tall and Skinny QRs in MapReduce
Tall and Skinny QRs in MapReduceTall and Skinny QRs in MapReduce
Tall and Skinny QRs in MapReduceDavid Gleich
A history of PageRank from the numerical computing perspective
A history of PageRank from the numerical computing perspectiveA history of PageRank from the numerical computing perspective
A history of PageRank from the numerical computing perspectiveDavid Gleich
Tall-and-skinny QR factorizations in MapReduce architectures
Tall-and-skinny QR factorizations in MapReduce architecturesTall-and-skinny QR factorizations in MapReduce architectures
Tall-and-skinny QR factorizations in MapReduce architecturesDavid Gleich
How does Google Google: A journey into the wondrous mathematics behind your f...
How does Google Google: A journey into the wondrous mathematics behind your f...How does Google Google: A journey into the wondrous mathematics behind your f...
How does Google Google: A journey into the wondrous mathematics behind your f...David Gleich
Relaxation methods for the matrix exponential on large networks
Relaxation methods for the matrix exponential on large networksRelaxation methods for the matrix exponential on large networks
Relaxation methods for the matrix exponential on large networksDavid Gleich
Spacey random walks and higher-order data analysis
Spacey random walks and higher-order data analysisSpacey random walks and higher-order data analysis
Spacey random walks and higher-order data analysisDavid Gleich
Fast relaxation methods for the matrix exponential
Fast relaxation methods for the matrix exponential Fast relaxation methods for the matrix exponential
Fast relaxation methods for the matrix exponential David Gleich
A dynamical system for PageRank with time-dependent teleportation
A dynamical system for PageRank with time-dependent teleportationA dynamical system for PageRank with time-dependent teleportation
A dynamical system for PageRank with time-dependent teleportationDavid Gleich
Vertex neighborhoods, low conductance cuts, and good seeds for local communit...
Vertex neighborhoods, low conductance cuts, and good seeds for local communit...Vertex neighborhoods, low conductance cuts, and good seeds for local communit...
Vertex neighborhoods, low conductance cuts, and good seeds for local communit...David Gleich
MapReduce for scientific simulation analysis
MapReduce for scientific simulation analysisMapReduce for scientific simulation analysis
MapReduce for scientific simulation analysisDavid Gleich
Higher-order organization of complex networks
Higher-order organization of complex networksHigher-order organization of complex networks
Higher-order organization of complex networksDavid Gleich
Personalized PageRank based community detection
Personalized PageRank based community detectionPersonalized PageRank based community detection
Personalized PageRank based community detectionDavid Gleich
Recommendation and graph algorithms in Hadoop and SQL
Recommendation and graph algorithms in Hadoop and SQLRecommendation and graph algorithms in Hadoop and SQL
Recommendation and graph algorithms in Hadoop and SQLDavid Gleich
Localized methods for diffusions in large graphs
Localized methods for diffusions in large graphsLocalized methods for diffusions in large graphs
Localized methods for diffusions in large graphsDavid Gleich
Sparse matrix computations in MapReduce
Sparse matrix computations in MapReduceSparse matrix computations in MapReduce
Sparse matrix computations in MapReduceDavid Gleich
Massive MapReduce Matrix Computations & Multicore Graph Algorithms
Massive MapReduce Matrix Computations & Multicore Graph AlgorithmsMassive MapReduce Matrix Computations & Multicore Graph Algorithms
Massive MapReduce Matrix Computations & Multicore Graph AlgorithmsDavid Gleich
Big data matrix factorizations and Overlapping community detection in graphs
Big data matrix factorizations and Overlapping community detection in graphsBig data matrix factorizations and Overlapping community detection in graphs
Big data matrix factorizations and Overlapping community detection in graphsDavid Gleich
Localized methods in graph mining
Localized methods in graph miningLocalized methods in graph mining
Localized methods in graph miningDavid Gleich
Graph libraries in Matlab: MatlabBGL and gaimc
Graph libraries in Matlab: MatlabBGL and gaimcGraph libraries in Matlab: MatlabBGL and gaimc
Graph libraries in Matlab: MatlabBGL and gaimcDavid Gleich

En vedette (20)

Anti-differentiating Approximation Algorithms: PageRank and MinCut
Anti-differentiating Approximation Algorithms: PageRank and MinCutAnti-differentiating Approximation Algorithms: PageRank and MinCut
Anti-differentiating Approximation Algorithms: PageRank and MinCut
Tall and Skinny QRs in MapReduce
Tall and Skinny QRs in MapReduceTall and Skinny QRs in MapReduce
Tall and Skinny QRs in MapReduce
A history of PageRank from the numerical computing perspective
A history of PageRank from the numerical computing perspectiveA history of PageRank from the numerical computing perspective
A history of PageRank from the numerical computing perspective
Tall-and-skinny QR factorizations in MapReduce architectures
Tall-and-skinny QR factorizations in MapReduce architecturesTall-and-skinny QR factorizations in MapReduce architectures
Tall-and-skinny QR factorizations in MapReduce architectures
How does Google Google: A journey into the wondrous mathematics behind your f...
How does Google Google: A journey into the wondrous mathematics behind your f...How does Google Google: A journey into the wondrous mathematics behind your f...
How does Google Google: A journey into the wondrous mathematics behind your f...
Relaxation methods for the matrix exponential on large networks
Relaxation methods for the matrix exponential on large networksRelaxation methods for the matrix exponential on large networks
Relaxation methods for the matrix exponential on large networks
Spacey random walks and higher-order data analysis
Spacey random walks and higher-order data analysisSpacey random walks and higher-order data analysis
Spacey random walks and higher-order data analysis
Fast relaxation methods for the matrix exponential
Fast relaxation methods for the matrix exponential Fast relaxation methods for the matrix exponential
Fast relaxation methods for the matrix exponential
A dynamical system for PageRank with time-dependent teleportation
A dynamical system for PageRank with time-dependent teleportationA dynamical system for PageRank with time-dependent teleportation
A dynamical system for PageRank with time-dependent teleportation
Vertex neighborhoods, low conductance cuts, and good seeds for local communit...
Vertex neighborhoods, low conductance cuts, and good seeds for local communit...Vertex neighborhoods, low conductance cuts, and good seeds for local communit...
Vertex neighborhoods, low conductance cuts, and good seeds for local communit...
MapReduce for scientific simulation analysis
MapReduce for scientific simulation analysisMapReduce for scientific simulation analysis
MapReduce for scientific simulation analysis
Higher-order organization of complex networks
Higher-order organization of complex networksHigher-order organization of complex networks
Higher-order organization of complex networks
Personalized PageRank based community detection
Personalized PageRank based community detectionPersonalized PageRank based community detection
Personalized PageRank based community detection
Recommendation and graph algorithms in Hadoop and SQL
Recommendation and graph algorithms in Hadoop and SQLRecommendation and graph algorithms in Hadoop and SQL
Recommendation and graph algorithms in Hadoop and SQL
Localized methods for diffusions in large graphs
Localized methods for diffusions in large graphsLocalized methods for diffusions in large graphs
Localized methods for diffusions in large graphs
Sparse matrix computations in MapReduce
Sparse matrix computations in MapReduceSparse matrix computations in MapReduce
Sparse matrix computations in MapReduce
Massive MapReduce Matrix Computations & Multicore Graph Algorithms
Massive MapReduce Matrix Computations & Multicore Graph AlgorithmsMassive MapReduce Matrix Computations & Multicore Graph Algorithms
Massive MapReduce Matrix Computations & Multicore Graph Algorithms
Big data matrix factorizations and Overlapping community detection in graphs
Big data matrix factorizations and Overlapping community detection in graphsBig data matrix factorizations and Overlapping community detection in graphs
Big data matrix factorizations and Overlapping community detection in graphs
Localized methods in graph mining
Localized methods in graph miningLocalized methods in graph mining
Localized methods in graph mining
Graph libraries in Matlab: MatlabBGL and gaimc
Graph libraries in Matlab: MatlabBGL and gaimcGraph libraries in Matlab: MatlabBGL and gaimc
Graph libraries in Matlab: MatlabBGL and gaimc

Iterative methods for network alignment

  • 1. Iterative Methods for Network Alignment. David F. Gleich, Computer Science, Purdue University. With Arif Khan and Alex Pothen (Purdue University, Computer Science); Mahantesh Halappanavar (Pacific Northwest National Labs); Mohsen Bayati and Amin Saberi (Stanford University); Ying Wang (Google). Work supported by DOE CSCAPES Institute grant (DE-FC02-08ER25864), NSF CAREER grant 1149756-CCF, and the Center for Adaptive Super Computing Software Multithreaded Architectures (CASS-MT) at PNNL. Stanford’s CADS grant from the Library of Congress. PNNL is operated by Battelle Memorial Institute under contract DE-AC06-76RL01830.
  • 2. Network alignment. What is the best way of matching graph A to B? [Figure: graphs A and B, with vertex labels r, s, t, u, v, w.]
  • 3. Network alignment. Figure 2. The NetworkBLAST local network alignment algorithm. Given two input networks, a network alignment graph is constructed. Nodes in this graph correspond to pairs of sequence-similar proteins, one from each species, and edges correspond to conserved interactions. A search algorithm identifies highly similar subnetworks that follow a prespecified interaction pattern. Adapted from Sharan and Ideker. From Sharan and Ideker, Modeling cellular machinery through biological network comparison. Nat. Biotechnol. 24, 4 (Apr. 2006), 427–433.
  • 4. Network alignment. Motivation: aligning LCSH (A) with Wikipedia (B) through candidate links (L). A: LCSH, 297,266 vertices and 248,230 edges. B: Wikipedia, 205,948 vertices and 382,353 edges. L: links, 4,971,629 edges.
  • 5. Sometimes small becomes big … The product graph is never formed explicitly. Table 6.5: size of non-Web datasets.
        Dataset         Size            Nonzeros
        LCSH-2          59,849          227,464
        WC-3            70,509          403,960
        Product graph   4,219,893,141   91,886,357,440
    … Ananth has some better techniques to work with these large problems …
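The product-graph row can be checked with plain arithmetic. The sketch below assumes that "Size" counts vertices and "Nonzeros" counts adjacency-matrix entries, and that the product graph is the Kronecker (tensor) product of the two graphs, so both quantities multiply. The variable names are illustrative only.

```python
# Sizes reported on the slide for the two datasets.
lcsh_size, lcsh_nnz = 59_849, 227_464
wc_size, wc_nnz = 70_509, 403_960

# Under the Kronecker-product assumption, dimensions and nonzero counts multiply.
product_size = lcsh_size * wc_size
product_nnz = lcsh_nnz * wc_nnz

print(f"{product_size:,}")  # 4,219,893,141
print(f"{product_nnz:,}")   # 91,886,357,440
```

Both products reproduce the table, which is why two modest graphs lead to a product with roughly 92 billion nonzeros that is impractical to store.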
  • 6. Network alignment. What is the best way of matching graph A to B using only edges in L? [Figure: graphs A and B joined by the candidate edges L, with vertex labels r, s, t, u, v, w and an edge weight $w_{tu}$.]
  • 7. Network alignment. Matching? A 1-1 relationship. Best? Highest weight and overlap. [Figure: graphs A and B joined by L, with one overlap highlighted.]
  • 8. Network alignment … is NP-hard … has no approximation algorithm. Applications: computer vision, ontology matching, database matching, bioinformatics. objective = α · matching + β · overlap. [Figure: graphs A and B joined by L, with one overlap highlighted.]
  • 9. Network alignment via mathematical programming. maximize $\alpha w^T x + \frac{\beta}{2} x^T S x$ subject to $Ax \le e$, $x_i \in \{0, 1\}$. Let $x_i$ be an indicator over edges in L. If A is the node-edge incidence matrix for L, then the constraint $Ax \le e$ makes x a 1-1 matching. Find a 1-1 matching between vertices with as many overlaps as possible.
  • 10. Network alignment via mathematical programming. maximize $\alpha w^T x + \frac{\beta}{2} x^T S x$ subject to $Ax \le e$, $x_i \in \{0, 1\}$. Let $x_i$ be an indicator over edges in L. Let $S_{ij} = 1$ when $x_i$ and $x_j$ overlap; then $x^T S x$ is twice the overlapped count. Find a 1-1 matching between vertices with as many overlaps as possible.
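A small worked instance may make the objective concrete. The sketch below is an illustration under stated assumptions, not the authors' code: the toy graphs, the helper names (`overlap_matrix`, `is_matching`, `objective`), and the choice α = 1, β = 2 are all hypothetical. Two edges of L are taken to overlap when their endpoints are joined by an edge in A and by an edge in B.

```python
def overlap_matrix(edges_l, edges_a, edges_b):
    """S[p][q] = 1 when L-edges p=(i,i') and q=(j,j') close a square:
    (i,j) is an edge of A and (i',j') is an edge of B."""
    ea = {frozenset(e) for e in edges_a}
    eb = {frozenset(e) for e in edges_b}
    m = len(edges_l)
    S = [[0] * m for _ in range(m)]
    for p, (i, ip) in enumerate(edges_l):
        for q, (j, jp) in enumerate(edges_l):
            if frozenset((i, j)) in ea and frozenset((ip, jp)) in eb:
                S[p][q] = 1
    return S

def is_matching(x, edges_l):
    """Check the 1-1 constraint: no vertex of A or B is used twice."""
    used_a, used_b = set(), set()
    for xi, (a, b) in zip(x, edges_l):
        if xi:
            if a in used_a or b in used_b:
                return False
            used_a.add(a)
            used_b.add(b)
    return True

def objective(x, w, S, alpha, beta):
    """alpha * w^T x + (beta / 2) * x^T S x."""
    weight = sum(wi * xi for wi, xi in zip(w, x))
    quad = sum(x[p] * S[p][q] * x[q] for p in range(len(x)) for q in range(len(x)))
    return alpha * weight + 0.5 * beta * quad

# Toy data (hypothetical): A and B are single edges, L has three candidate edges.
edges_a = [(0, 1)]
edges_b = [(0, 1)]
edges_l = [(0, 0), (1, 1), (0, 1)]
w = [1.0, 1.0, 2.0]

S = overlap_matrix(edges_l, edges_a, edges_b)
x = [1, 1, 0]
print(S)                            # [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
print(is_matching(x, edges_l))      # True
print(objective(x, w, S, 1.0, 2.0)) # 4.0
```

Here the matching {(0,0'), (1,1')} has weight 2 and one overlap, so $x^T S x = 2$ and the objective is 2 + (2/2)·2 = 4.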
  • 11. Our contributions. A new belief propagation method (Bayati et al. 2009, 2013): outperformed state-of-the-art PageRank and optimization-based heuristic methods. High-performance C++ implementations (Khan et al. 2012): 40 times faster (C++ ~ 3, complexity ~ 2, threading ~ 8); 5 million edge alignments in ~ 10 sec.
  • 12.
  • 13. Iterative methods for network alignment. Let x[i] be the score for each pair-wise match in L.
        for i = 1 to ...
          update x[i] to y[i]
          compute a max-weight match with y
          update y[i] to x[i] (using match in MR)
    Each iteration involves (1) matrix-vector-ish computations with a sparse matrix, e.g. sparse matrix-vector products in a semi-ring, dot-products, axpy, etc., and (2) a bipartite max-weight matching using a different weight vector at each iteration. No “convergence”. 100-1000 iterations.
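The loop structure on this slide can be sketched as follows. This is a deliberately simple stand-in, not the BP or MR method from the talk: the score update (`alpha * w + beta * S x`), the brute-force matching, and all function names are assumptions chosen only to show the pattern of a matrix-vector step followed by a max-weight matching used for rounding.

```python
from itertools import product

def objective(x, w, S, alpha, beta):
    """alpha * w^T x + (beta / 2) * x^T S x."""
    weight = sum(wi * xi for wi, xi in zip(w, x))
    quad = sum(x[p] * S[p][q] * x[q] for p in range(len(x)) for q in range(len(x)))
    return alpha * weight + 0.5 * beta * quad

def max_weight_matching(y, edges):
    """Brute-force bipartite max-weight matching (tiny instances only)."""
    best, best_val = None, float("-inf")
    for x in product((0, 1), repeat=len(edges)):
        left = [a for xi, (a, _) in zip(x, edges) if xi]
        right = [b for xi, (_, b) in zip(x, edges) if xi]
        if len(set(left)) < len(left) or len(set(right)) < len(right):
            continue  # some vertex is matched twice
        val = sum(xi * yi for xi, yi in zip(x, y))
        if val > best_val:
            best, best_val = list(x), val
    return best

def align(w, S, edges, alpha=1.0, beta=2.0, iters=5):
    m = len(edges)
    x = [1.0] * m  # initial scores (assumed)
    best_x, best_obj = None, float("-inf")
    for _ in range(iters):
        # Matrix-vector step: new scores from weights and overlaps.
        y = [alpha * w[p] + beta * sum(S[p][q] * x[q] for q in range(m)) for p in range(m)]
        # Rounding step: max-weight matching with this iteration's scores.
        match = max_weight_matching(y, edges)
        obj = objective(match, w, S, alpha, beta)
        if obj > best_obj:  # no convergence guarantee, so keep the best iterate
            best_x, best_obj = match, obj
        x = [float(v) for v in match]
    return best_x, best_obj

# Hypothetical toy instance: three candidate edges, one overlapping pair.
edges = [(0, 0), (1, 1), (0, 1)]
w = [1.0, 1.0, 2.0]
S = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
best_x, best_obj = align(w, S, edges)
print(best_x, best_obj)  # [1, 1, 0] 4.0
```

Tracking the best rounded solution across iterations reflects the slide's remark that these methods need not converge. A real implementation would replace the brute-force matching with a proper bipartite matching routine.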
  • 14. Open question 1. Any sort of property of these methods beyond … (i) principled derivation and (ii) “David and Ananth say they work”?
  • 15. Belief propagation: our algorithm. Summary: Construct a probability model where the most likely state is the solution! Locally update information. Like a generalized dynamic program. It works. Most likely, it won’t converge. History: BP is used for computing marginal probabilities and maximum a posteriori probability. Wildly successful at solving satisfiability problems. Convergent algorithm for max-weight matching. Bayati et al. 2005; …
  • 16. Belief propagation for network alignment. NetAlign factor graph: loopy BP. [Figure: factor graph for a pair of graphs A and B, with function nodes f, g, h on one side and variable nodes on the other.] From the paper: the function node $f_i$ ($g_{i'}$) enforces the matching constraint at $i$ ($i'$). Another type of function node checks the validity of squares. For each square $ii'jj'$, define a function node $h_{ii'jj'} : \{0, 1\}^{|E_L| + |S|} \to \mathbb{R}$ with $h_{ii'jj'}(x_{\partial h_{ii'jj'}}) = 1$ if $x_{ii'jj'} = x_{ii'} x_{jj'}$ and $0$ otherwise, for all $(ii', jj') \in V_S$. In other words, $h_{ii'jj'}$ guarantees that $x_{ii'jj'} = 1$ if and only if $x_{ii'} = x_{jj'} = 1$. The edges of the factor graph simply connect each function node to the variable nodes it acts on. For example, each $f_i$ is connected to all variable nodes $ii' \in E_L$, and each $h_{ii'jj'}$ is connected to $ii'$, $jj'$, and $ii'jj'$ in $E_L \cup V_S$. Therefore the factor graph is bipartite. Figure 3 shows an example of a graph pair A, B and their factor-graph representation as described above. Now define the following probability distribution: $p(x_L, x_S) = \frac{1}{Z} \left[ \prod_{i=1}^{n} f_i(x_{\partial f_i}) \prod_{j=1}^{m} g_j(x_{\partial g_j}) \prod_{ijrs \in V_S} h_{ijrs}(x_{\partial h_{ijrs}}) \right] e^{\alpha w^T x_L + \frac{\beta}{2} 1^T x_S}$ (4), where Z is just a normalization term to make $p(x_L, x_S)$ a probability distribution. Note: It’s pretty hairy to put all the stuff I should put here on a single slide. Most of it is in the paper. The rest is just “turning the crank” with standard tricks in BP algorithms.
  • 17. $M_{i \to j}\{x_i = s\} = \prod_{j' \in N(i) \setminus \{j\}} M_{j' \to i}\{x_i = s\}$. Variable i tells function j what it thinks about being in state s. This is just the product of what all the other functions tell i about being in state s. The slide then begins the function-to-variable message $M_{j \to i}\{x_i = s\}$, a maximum over y, all possible choices for the variables $i' \in N(j)$.
  • 18. $M_{j \to i}\{x_i = s\} = \max_{y} \left[ f_j(y) \prod_{i' \in N(j) \setminus \{i\}} M_{i' \to j}\{x_{i'} = y_{i'}\} \right]$, where y ranges over all possible choices for the variables $i' \in N(j)$. Function j tells variable i what it thinks about being in state s. This means that we have to locally maximize $f_j$ among all possible choices. Note $y_i = s$ always (too cumbersome to include in notation).
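The two message types can be sketched on a toy factor graph. Everything below is a hypothetical illustration, not the NetAlign factor graph: two 0/1 variables `a` and `b`, unary functions `ua` and `ub` that reward choosing each variable, and a constraint function `c` that forbids choosing both. Because this graph is a tree, one sweep of messages gives exact max-product beliefs.

```python
from itertools import product
from math import prod

# Hypothetical toy factor graph.
scope = {"ua": ["a"], "ub": ["b"], "c": ["a", "b"]}   # variables each function touches
neighbors = {"a": ["ua", "c"], "b": ["ub", "c"]}       # functions each variable touches
table = {
    "ua": lambda y: 2.0 if y["a"] else 1.0,            # reward for picking a
    "ub": lambda y: 3.0 if y["b"] else 1.0,            # reward for picking b
    "c": lambda y: 0.0 if y["a"] and y["b"] else 1.0,  # at most one of a, b
}

def var_to_func(var, func, f2v):
    """Product of what all the *other* functions tell `var`, per state s."""
    return [prod(f2v[(g, var)][s] for g in neighbors[var] if g != func) for s in (0, 1)]

def func_to_var(func, var, v2f):
    """Max over all choices y of the function's variables with y[var] = s."""
    others = [v for v in scope[func] if v != var]
    out = []
    for s in (0, 1):
        best = 0.0  # all values here are nonnegative
        for choice in product((0, 1), repeat=len(others)):
            y = dict(zip(others, choice))
            y[var] = s
            incoming = prod(v2f[(v, func)][y[v]] for v in others)
            best = max(best, table[func](y) * incoming)
        out.append(best)
    return out

# One sweep: leaves first, then through the constraint node.
f2v = {("ua", "a"): func_to_var("ua", "a", {}), ("ub", "b"): func_to_var("ub", "b", {})}
v2f = {("a", "c"): var_to_func("a", "c", f2v), ("b", "c"): var_to_func("b", "c", f2v)}
f2v[("c", "a")] = func_to_var("c", "a", v2f)
f2v[("c", "b")] = func_to_var("c", "b", v2f)

# Belief of each variable: product of all incoming function messages.
belief = {v: [prod(f2v[(g, v)][s] for g in neighbors[v]) for s in (0, 1)] for v in neighbors}
print(belief)  # {'a': [3.0, 2.0], 'b': [2.0, 3.0]}
```

The beliefs prefer a = 0 and b = 1, which is the most likely joint state (value 3) under the constraint. On graphs with cycles, such as the network alignment factor graph, the same updates are iterated without a convergence guarantee.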
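  • 19. Belief propagation for network alignment. For $t \ge 1$, the messages in iteration t are obtained from the messages in iteration $t - 1$ recursively. In particular, for all $ii' \in E_L$: $m^{(t)}_{ii' \to f_i} = \alpha w_{ii'} - \left( \max_{k \ne i} \left[ m^{(t-1)}_{ki' \to g_{i'}} \right] \right)^{+} + \sum_{ii'jj' \in V_S} \min\left( \frac{\beta}{2}, \max\left(0, \frac{\beta}{2} + m^{(t-1)}_{jj' \to h_{ii'jj'}}\right) \right)$ (1). The update rule for $m^{(t)}_{ii' \to g_{i'}}$ is similar, and $m^{(t)}_{ii' \to h_{ii'jj'}} = \alpha w_{ii'} - \left( \max_{k \ne i} \left[ m^{(t-1)}_{ki' \to g_{i'}} \right] \right)^{+} - \left( \max_{k' \ne i'} \left[ m^{(t-1)}_{ik' \to f_i} \right] \right)^{+} + \sum_{kk' \ne jj',\ ii'kk' \in V_S} \min\left( \frac{\beta}{2}, \max\left(0, m^{(t-1)}_{kk' \to h_{ii'kk'}} + \frac{\beta}{2}\right) \right)$ (2).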
  • 20. Synthetic evaluation of network alignment. [Figure: fraction correct (0 to 1) versus expected degree of noise in L, p ⋅ n (0 to 20), for MR, BP, BPSC, and IsoRank.]
  • 21. Open question 2. When could we hope to solve such synthetic problems in asymptotic regimes?
  • 22. Does it work? LCSH: Library of Congress subject headings. Rameau: French National Library subject headings. Table IV. The alignment results for LCSH and Rameau. The first set of results shows the statistics of the known (manually matched) alignment and the results from the max-weight matching algorithm. Next we show results from our algorithms for three objective parameters. The columns are: objective parameters, algorithms, matching weight, matching edge overlap, time, total correct, recall, precision, and matching triangle overlap.
        Obj.          Alg.         Weight    Overlap  Time (s)  Correct  Rec.    Prec.   Triangles
                      Sol.         36332.42  39847    -         57645    100%    100%    2073
                      MWM          93279.0   16990    29.6      29098    50.5%   23.3%   350
        α = 1, β = 1  BP (MP)      84622.0   46400    23522.0   32585    56.5%   27.6%   1515
                      BP++ (MP++)  85810.1   46942    27115.6   32857    57.0%   27.4%   1548
                      MR           87588.6   48367    33366.9   33225    57.6%   27.0%   1617
        α = 1, β = 2  BP (MP)      81752.6   46569    23427.1   31724    55.0%   27.6%   1483
                      BP++ (MP++)  84615.7   46656    26673.1   31952    55.4%   26.7%   1531
                      MR           85438.4   48934    56961.6   32303    56.0%   26.3%   1604
        α = 0, β = 1  BP (MP)      60617.9   45247    14284.8   24794    43.0%   23.2%   1467
                      BP++ (MP++)  60502.8   41592    13979.5   24498    42.5%   23.0%   1484
                      MR           65994.2   46163    10384.4   25455    44.2%   21.5%   1602
    … protein-protein interaction networks and ontologies. In the future, we envision applications of these techniques in mapping large social network structure.
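The recall column can be cross-checked from the counts on this slide. The sketch assumes recall is the number of correct matches divided by the size of the known alignment (57,645, the "Sol." row), a definition the slide does not spell out. The dictionary below copies the α = 1, β = 1 rows and the MWM row.

```python
# Size of the known, manually matched alignment ("Sol." row, Correct column).
known = 57_645

# Correct matches reported for MWM and for the alpha = 1, beta = 1 rows.
correct = {"MWM": 29_098, "BP": 32_585, "BP++": 32_857, "MR": 33_225}

# Assumed definition: recall = correct / known, as a percentage.
recall = {alg: round(100 * c / known, 1) for alg, c in correct.items()}
print(recall)  # {'MWM': 50.5, 'BP': 56.5, 'BP++': 57.0, 'MR': 57.6}
```

The computed values agree with the Rec. column, which supports reading "Correct" as the number of matched pairs that also appear in the known alignment.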
  • 23. Open question 3. How can we evaluate alignments? What are possible null-models?
  • 24. Matching results: a little too hot! LCSH ↔ WC: Science fiction television series ↔ Science fiction television programs; Turing test ↔ Turing test; Machine learning ↔ Machine learning; Hot tubs ↔ Hot dog.
  • 25. Higher-order network alignment. This proposal is for matching triangles using tensor methods. If $x_i$, $x_j$, and $x_k$ are indicators associated with the edges $(i, i')$, $(j, j')$, and $(k, k')$, then we want to … [Figure: a triangle i, j, k in A matched through L to a triangle i', j', k' in B.]
  • 26. Network alignment via mathematical programming. maximize $\alpha w^T x + \frac{\beta}{2} x^T S x$ subject to $Ax \le e$, $x_i \in \{0, 1\}$. Find a 1-1 matching between vertices with as many overlaps as possible.
  • 27. Triangle alignment via mathematical programming. maximize $\alpha w^T x + \frac{\beta}{2} x^T S x + \sum_{ijk} T_{ijk} x_i x_j x_k$ subject to $Ax \le e$, $x_i \in \{0, 1\}$. Find a 1-1 matching between vertices with as many overlaps and triangles as possible.
  • 28. Tensor eigenvalues and a power method. maximize $\sum_{ijk} T_{ijk} x_i x_j x_k$ subject to $\|x\|_2 = 1$. $[x^{(\text{next})}]_i = \rho \cdot \left( \sum_{jk} T_{ijk} x_j x_k + x_i \right)$, where ρ is chosen so that the 2-norm constraint holds. SSHOPM method due to Kolda and Mayo. Human protein interaction network: 48,228 triangles. Yeast protein interaction network: 257,978 triangles. The tensor T has ~100,000,000,000 nonzeros. We work with it implicitly.
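The power iteration on this slide can be sketched for a small sparse tensor. This is a minimal illustration under assumptions, not the authors' implementation: the tensor is stored as a dictionary of its nonzero entries (so it is never formed densely), the `shift` multiplying $x_i$ is a hypothetical parameter set to 1, and the two-entry example tensor is made up.

```python
import math

def sshopm(entries, n, shift=1.0, iters=100, x0=None):
    """Shifted power iteration for max sum_ijk T_ijk x_i x_j x_k with ||x||_2 = 1.

    entries: dict {(i, j, k): value} listing every nonzero of T.
    shift:   coefficient on x_i in the update (an assumed parameter).
    """
    x = list(x0) if x0 is not None else [1.0 / math.sqrt(n)] * n
    for _ in range(iters):
        # y_i = sum_jk T_ijk x_j x_k + shift * x_i, using only stored nonzeros.
        y = [shift * xi for xi in x]
        for (i, j, k), t in entries.items():
            y[i] += t * x[j] * x[k]
        # rho = 1 / ||y||_2 restores the unit 2-norm.
        norm = math.sqrt(sum(v * v for v in y))
        x = [v / norm for v in y]
    value = sum(t * x[i] * x[j] * x[k] for (i, j, k), t in entries.items())
    return x, value

# Hypothetical 2 x 2 x 2 tensor with two nonzeros on the diagonal.
entries = {(0, 0, 0): 2.0, (1, 1, 1): 1.0}
x, value = sshopm(entries, n=2, x0=[0.8, 0.6])
print([round(v, 6) for v in x], round(value, 6))  # [1.0, 0.0] 2.0
```

From this starting point the iterates approach the unit vector (1, 0), where the objective equals 2. Looping over stored nonzeros is one simple way to work with the tensor implicitly; at the scale quoted on the slide the entries would be generated on the fly rather than stored.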
  • 29. Synthetic evaluation of network alignment. [Figure: two panels of fraction correct (0 to 1) versus expected degree of noise in L, p ⋅ n (0 to 20). The legends list MR, BP, BPSC, and IsoRank, and Eigen, Teigen, and Iso.]
  • 30. Open question 4. When do we need triangles?