SlideShare a Scribd company logo
1 of 60
Download to read offline
The Spectre of the Spectrum
                                                                   An empirical study of the




                        ?
                                                                   spectra of large networks


                                                                                  David F. Gleich
                           
                                                                               Purdue University

                                                                              Indiana University
  Thanks to Ali Pinar, Jaideep Ray,Tammy Kolda,                                 Bloomington, IN
  C. Seshadhri, Rich Lehoucq @ Sandia
  and                                                          Supported by Sandia’s John von Neumann
  Jure Leskovec and Michael Mahoney @ Stanford                     postdoctoral fellowship and the DOE
  for helpful discussions.                                            Office of Science’s Graphs project.



David Gleich (Purdue)                             IU Seminar                                         1/51
Spectral density plots




                                                     Log(Density) ~ Count
Eigenvalue




                                                                            Eigenvalues

                     Eigenvalue index           0                                         2
                   Scree Plot                         Empirical Spectral
                                                          Measure
   David Gleich (Purdue)                IU Seminar                                        2/51
There’s information inside the spectra
    Log(Density) ~ Count




                            Eigenvalues


   0                                                              2
                 Words in dictionary definitions                                     Internet router network
                  111k vertices, 2.7M edges                                         192k vetices, 1.2M edges
                           These figures show the normalized Laplacian. Banerjee and Jost (2009) also noted such shapes in the spectra.
David Gleich (Purdue)                                             IU Seminar                                                       3/51
Overview
Graphs and their matrices

Data for our experiments

Issues with computing spectra

Many examples of graph spectra

A curious property around the eigenvalue one

Computing spectra for large networks

Ongoing studies

David Gleich (Purdue)       IU Seminar         4/51
Why are we interested in the spectra?
Modeling

Properties
  Moments of the adjacency

Regularities

Anomalies

Network Comparison
  Fay et al. 2010 – Weighted Spectral Density
  The network is as19971108 from Jure’s snap collect (a few thousand nodes) and we insert random connections from 50 nodes
David Gleich (Purdue)                                  IU Seminar                                                     5/51
Matrices from graphs
Adjacency matrix              Random walk matrix
                                              
                          
       if                     Modularity matrix
                                          
                                  
                                                        

Laplacian matrix
                                          Not covered
                                          Signless Laplacian matrix
                
                                          Incidence matrix
                                           (It is incidentally discussed)

                              Seidel matrix
Normalized Laplacian matrix
                              Heat Kernel
                                                  
                    
                             Everything is undirected. Mostly connected components only too.
David Gleich (Purdue)    IU Seminar                                                     6/51
Erdős–Rényi Semi-circles
Based on Wigner’s
  semi-circle law.

The eigenvalues of the




                                              Count
adjacency matrix for
n=1000, averaged
over 10 trials
                                                                                Warning
                                                                                Adjacency
Semi-circle with outlier                                                         matrix!
if average degree is
large enough.
                                                             Eigenvalue

                        Observed by Farkas and in the book “Network Alignment” edited by Brandes (Chapter 14)
David Gleich (Purdue)                     IU Seminar                                                     7/51
Previous results
Farkas et al. Significant deviation from the semi-
   circle law for the adjacency matrix

Mihail and Papadimitriou Leading eigenvalues of the
  adjacency matrix obey a power-law based on
  the degree-sequence

Chung et al. Normalized Laplacian still
  obeys a semi-circle law if min-degree large

Banerjee and Jost Study of types of patterns that
   emerge in evolving graph models – explain
   many features of the spectra
David Gleich (Purdue)          IU Seminar             8/51
In comparison to other empiric studies
We use “exact” computation of spectra,
 instead of approximation.

We study “all” of the standard matrices
 over a range of large networks.

Our “large” is bigger.

We look at a few random graph models
 preferential attachment
 random powerlaw
 copying model
 forest fire model
David Gleich (Purdue)       IU Seminar    9/51
Why
                                          you
                        ISSUES WITH       should
                                          be
                        COMPUTING         very
                            SPECTRA       careful
                                          with
                                          eigenvalues.




David Gleich (Purdue)            IU Seminar              10
Matlab!
Always a great starting point.
  My desktop has 24GB of RAM (less than $2500 now!)

24GB/8 bytes (per double) = 3 billion numbers
  ~ 50,000-by-50,000 matrix

Possibilities
  D = eig(A) – needs twice the memory for A,D
  [V,D] = eig(A) – needs three times the memory for A,D,V

These limit us to ~38000 and ~31000 respectively.


David Gleich (Purdue)      IU Seminar                       11/51
Bugs – Matlab


eig(A)
Returns incorrect eigenvectors




                           Seems to be the result of a bug in Intel’s MKL library. Fixed in R2011a
David Gleich (Purdue)      IU Seminar                                                       12/51
Bug – ScaLAPACK default

sudo apt-get install scalapack-openmpi
Allocate 36000x36000 local matrix
Run on 4 processors

Code crashes




                                         Fixed in upcoming Debian libscalapack
David Gleich (Purdue)      IU Seminar                                    13/51
Bug – LAPACK

Scalapack MRRR
Compare standard lapack/blas to atlas performance
Result: correct output from atlas
Result: incorrect output from lapack
Hypothesis: lapack contains a known bug that’s apparently in
  the default ubuntu lapack




David Gleich (Purdue)      IU Seminar                    14/51
Moral


Always test your software.
Extensively.




David Gleich (Purdue)        IU Seminar   15/51
David Gleich (Purdue)   IU Seminar   16
EXAMPLES




David Gleich (Purdue)         IU Seminar   17
Data sources
 SNAP                   Various                 100s-100,000s
 SNAP-p2p               Gnutella Network        5-60k, ~30 inst.
 SNAP-as-733            Autonomous Sys.         ~5,000, 733 inst.
 SNAP-caida             Router networks         ~20,000, ~125 inst.
 Pajek                  Various                 100s-100,000s
 Models                 Copying Model           1k-100k 9 inst. 324 gs
                        Pref. Attach            1k-100k 9 inst. 164 gs
                        Forest Fire             1k-100k 9 inst. 324 gs
 Mine                   Various                 2k-500k
 Newman                 Various
 Arenas                 Various
 Porter                 Facebook                100 schools, 5k-60k
 IsoRank, Natalie       Protein-Protein         <10k , 4 graphs
                                                   Thanks to all who make data available
David Gleich (Purdue)              IU Seminar                                      18/51
Big graphs
Arxiv                   86376             1035126                          Co-author
Dblp                    93156             356290                           Co-author
Dictionary(*)           111982            2750576                          Word defns.
Internet(*)             124651            414428                           Routers
Itdk0304                190914            1215220                          Routers
p2p-gnu(*)              62561             295756                           Peer-to-peer
Patents(*)              230686            1109898                          Citations
Roads                   126146            323900                           Roads
Wordnet(*)              75606             240036                           Word relation
web-nb.edu(*)           325729            2994268                          Web



                                 (*) denotes that this is a weakly connected component of a directed graph.
David Gleich (Purdue)                IU Seminar                                                      19/51
A $8,000 matrix computation




                        925 nodes and 7400 processors on Redsky for 10 hours normalized Laplacian matrix
David Gleich (Purdue)                 IU Seminar                                                   20/51
Indiana’s Facebook Network



                        Warning                                                Warning
                        Adjacency                                              Laplacian
                         marix!                                                 marix!




                                    Data from Mason Porter. Aka, the start of a $50,000,000,000 graph.
David Gleich (Purdue)               IU Seminar                                                   21/51
Yes!


                        These are cases where we have multiple instances of the same graph.
David Gleich (Purdue)   IU Seminar                                                   22/51
Already known?




                                     Just the facebook spectra.
David Gleich (Purdue)   IU Seminar                       23/51
Already known?




                        I soon realized I was searching for “spectre” instead of spectrum, oops.
David Gleich (Purdue)   IU Seminar                                                        24/51
Spikes?                                    Repeated rows
                                           Identical rows grow the null-space.

                                           Banerjee and Jost
Unit eigenvalue          Motif doubling and joining small
                                          repeated
                         graphs will tend to cause
                                           eigenvalues and null vectors.




                        Banerjee and Jost explained how evolving graphs should produce repeated eigenvalues
David Gleich (Purdue)                  IU Seminar                                                    25/51
Combining Eigenvalues
If A has an eigenvector
   with a zero
   component, then



“A + B” (as in the figure)
  has the same
  eigenvalue with
  eigenvector extended
  with zeros on B.


                                    Bannerjee and Jost observed this for the normalized Laplacian.
David Gleich (Purdue)        IU Seminar                                                     26/51
Spikes!                              1.33 (two!)

                                                   1.5, 0.5




                                                                  1.5


                        0.565741
               1.833    1.767592

                                            0.725708
                                            1.607625          1.5 (two)




David Gleich (Purdue)        IU Seminar                                   27/51
Classic spectra




                        From NASA: http://imagine.gsfc.nasa.gov/docs/science/how_l1/spectral_what.html
David Gleich (Purdue)             IU Seminar                                                    28/51
Random power law                       Random power law
                         12500 vertices, 500 (2.*) / 400 (1.8) min degree



Generate a power law
degree distribution.

Produce a random
graph with a
prescribed degree
distribution using the
Bayati-Kim-Saberi
procedure.




David Gleich (Purdue)             IU Seminar                            29/51
Preferential Attachment
Start graph with a k-node clique. Add a new node and
connect to k random nodes, chosen proportional to degree.




                        Semi-circle in log-space!
David Gleich (Purdue)                 IU Seminar        30/51
Copying model
Start graph with a k-node clique. Add a new node and pick a
parent uniformly at random. Copy edges of parent and make
an error with probability    




                        Obvious follow up here: does a random sample with the same degree distribution show the same thing?
David Gleich (Purdue)                                  IU Seminar                                                    31/51
Forest Fire models
Start graph with a k-node clique. Add a new node and pick a
  parent uniformly at random. Do a random “bfs’/”forest
  fire” and link to all nodes “burned”




David Gleich (Purdue)      IU Seminar                    32/51
Reality vs. graph models




David Gleich (Purdue)   IU Seminar   33/51
Where is this going?
We can compute
spectra for large
networks if
needed.

Study relationship
with known power-
laws in spectra

Eigenvector
localization

Directed Laplacians

David Gleich (Purdue)   IU Seminar   34/51
Just the degree distribution? No




David Gleich (Purdue)   IU Seminar   35/51
Facebook is not a copying model




David Gleich (Purdue)   IU Seminar   36/51
Why is one separated sometimes?




David Gleich (Purdue)   IU Seminar   37/51
Separation in copying, forest-fire




                        Note, the axes on these fiures aren’t comparable – different plotting scales – but the shapes are.
David Gleich (Purdue)                             IU Seminar                                                        38/51
Forest-fire plots




David Gleich (Purdue)   IU Seminar   39/51
Strong separation in random powerlaw




David Gleich (Purdue)   IU Seminar   40/51
Not edge-density alone




David Gleich (Purdue)   IU Seminar   41/51
Not edge-density alone




David Gleich (Purdue)   IU Seminar   42/51
COMPUTING
                 SPECTRA OF
            LARGE NETWORKS




David Gleich (Purdue)    IU Seminar   43
Redsky, Hopper I, Hopper II, and a Cielo testbed. Details if time.
David Gleich (Purdue)   IU Seminar                                                       44/51
Eigenvalues with ScaLAPACK
Mostly the same approach as in LAPACK

1.  Reduce to tridiagonal form
    (most time consuming part)
2.  Distribute tridiagonals to
     all processors
3.  Each processor finds
    all eigenvalues
4.  Each processor computes a
    subset of eigenvectors                          ScaLAPACK’s 2d block cyclic storage

I’m actually using the MRRR algorithm,
where steps 3 and 4 are better and faster
                        MRRR due to Parlett and Dhillon; implemented in ScaLAPACK by Christof Vomel.
David Gleich (Purdue)            IU Seminar                                                    45/51
Estimating the density directly
                                                                      and      have the same
                                                                   eigenvalue inertia if     is non-
                                                                   singular.

                                                                   Eigenvalue inertia = (p,n,z)
                                                                   Positive eigenvalues
                                                                   Negative eigenvalues
                                                                   Zero eigenvalues
                                                                   If       is diagonal, inertia is
                                                                   easy to compute

                                                                       has inertia (n-1,0,1)
                                                                          has inertia
                                                                   (sum(      ), sum(      ),…)



                        This is an old trick in linear algebra. I know that Fay et al. used it in their weighted spectral density.
David Gleich (Purdue)                                 IU Seminar                                                            46/51
Alternatives
Use ARPACK to get extrema

Use ARPACK to get interior around      via the folded
   spectrum
                    




                                                                      Large nearly
                                                                      repeated sets of
                                                                      eigenvalues will
                                                                      make this tricky.
                        Farkas et al. used this approach. Figure from somewhere on the web… sorry!
David Gleich (Purdue)          IU Seminar                                                    47/51
Adding MPI tasks vs. using threads
Most math libraries have threaded versions
   (Intel MKL, AMD ACML)
Is it better to use threads or MPI tasks?

It depends.

                                                                     Cray libsci
                        Intel MKL
                                                     Threads          Ranks             Time
Threads Ranks               Time-T   Time-E
                                                     1                64                1412.5
1                36         1271.4   339.0
                                                     4                16                1881.4
4                9          1058.1   456.6
                                                     16               4                 Omitted.

                                          Normalized Laplacian for 36k-by-36k co-author graph of CondMat
David Gleich (Purdue)                  IU Seminar                                                  48/51
Weak Parallel Scaling
                                                   Time               
Time in hours




                                                   Good strong
                                                     scaling up to
                                                     325,000 vertices

                                                   Estimated time for
                                                     500,000 nodes 9
                                                     hours with 925
                                                     nodes (7400
                                                     procs)
                                   
   David Gleich (Purdue)              IU Seminar                  49/51
Code will be available eventually. Image from good financial cents.
David Gleich (Purdue)                                 IU Seminar                              50 of <Total>
Same density, but somewhat different




                                     Both have a mean degree of 3.8
David Gleich (Purdue)   IU Seminar                            51/51
GRAPHS        As well as things we
                        AND THEIR       already know about
                                        graph spectra.
                         MATRICES




David Gleich (Purdue)          IU Seminar                      52
Spectral bounds from Gerschgorin

$$-d_{max} le lambda(mA) le d_{max}$$

$$0 le lambda(mL) le 2 d_{max}$$

$$0 le lambda(mat{tilde{L}}) le 2$$
  (from a slightly different approach)




David Gleich (Purdue)       IU Seminar        53/51
ScaLAPACK
LAPACK with distributed memory dense matrices
Scalapack uses a 2d block-cyclic dense matrix distribution




David Gleich (Purdue)       IU Seminar                       54/51
usroads




                                     Connected component from the us road network
David Gleich (Purdue)   IU Seminar                                          55/51
Movies of spectra…
As these models evolve, what do the spectra look like?




David Gleich (Purdue)       IU Seminar                   56/51
Gnutella large vs. resampled




David Gleich (Purdue)   IU Seminar   57/51
David Gleich (Purdue)   IU Seminar   58 of <Total>
Models
Preferential Attachment
Start graph with a k-node clique. Add a new node and
  connect to k random nodes, chosen proportional to degree.

Copying model
Start graph with a k-node clique. Add a new node and pick a
  parent uniformly at random. Copy edges of parent and
  make an error with probability $$alpha$$

Forest Fire
Start graph with a k-node clique. Add a new node and pick a
  parent uniformly at random. Do a random “bfs’/”forest fire”
  and link to all nodes “burned”
David Gleich (Purdue)      IU Seminar                    59/51
Nullspaces of the adjacency matrix
                                         

So unit eigenvalues of the normalized Laplacian are null-
  vectors of the adjacency matrix.




David Gleich (Purdue)       IU Seminar                      60/51

More Related Content

Viewers also liked

present trans
present transpresent trans
present transPTF
 
Lattice Energy LLC-LENRs ca 1950s-Sternglass Expts-Einstein & Bethe-Nov 25 2011
Lattice Energy LLC-LENRs ca 1950s-Sternglass Expts-Einstein & Bethe-Nov 25 2011Lattice Energy LLC-LENRs ca 1950s-Sternglass Expts-Einstein & Bethe-Nov 25 2011
Lattice Energy LLC-LENRs ca 1950s-Sternglass Expts-Einstein & Bethe-Nov 25 2011Lewis Larsen
 
Melisa rueda blandon ciencias politicas
Melisa rueda blandon ciencias politicasMelisa rueda blandon ciencias politicas
Melisa rueda blandon ciencias politicasMelisaRueda
 
Bilden till helga
Bilden till helgaBilden till helga
Bilden till helgadavvaman1
 
Capítulo 9 libro zacatollan una hist... copia
Capítulo 9 libro zacatollan una hist...   copiaCapítulo 9 libro zacatollan una hist...   copia
Capítulo 9 libro zacatollan una hist... copiaCesar Adame
 
UNIDADES BÁSICAS DE LA FÍSICA
UNIDADES BÁSICAS DE LA FÍSICAUNIDADES BÁSICAS DE LA FÍSICA
UNIDADES BÁSICAS DE LA FÍSICALUISRUBINOS
 
Trastornos específicos del lenguaje
Trastornos específicos del lenguajeTrastornos específicos del lenguaje
Trastornos específicos del lenguajeTami.Marimar
 
Insight from CloverPoint - 3D Asset and Facilities Management
Insight from CloverPoint - 3D Asset and Facilities ManagementInsight from CloverPoint - 3D Asset and Facilities Management
Insight from CloverPoint - 3D Asset and Facilities ManagementCloverpoint
 
Arbol navidadreyesmagos.2011
Arbol navidadreyesmagos.2011Arbol navidadreyesmagos.2011
Arbol navidadreyesmagos.2011MOTRICO
 
Como Ganar Dinero Con Youtube
Como Ganar Dinero Con YoutubeComo Ganar Dinero Con Youtube
Como Ganar Dinero Con YoutubeJesus Castillo
 
Clases de oraciones
Clases de oracionesClases de oraciones
Clases de oracionesgrego234
 
Trabajo en equipo aurissssss b
Trabajo en equipo aurissssss bTrabajo en equipo aurissssss b
Trabajo en equipo aurissssss bcristiarcilacsj
 
Y ahora como hago
Y ahora como hagoY ahora como hago
Y ahora como hagoAndres Duma
 
Finanzas internaconales situación económica ecuador
Finanzas internaconales situación económica ecuadorFinanzas internaconales situación económica ecuador
Finanzas internaconales situación económica ecuadorStalin Miguel Goyes Yandún
 
Datos socioeconómicos de consuegra (2)
Datos socioeconómicos de consuegra (2)Datos socioeconómicos de consuegra (2)
Datos socioeconómicos de consuegra (2)María De La Cruz
 

Viewers also liked (20)

present trans
present transpresent trans
present trans
 
Lattice Energy LLC-LENRs ca 1950s-Sternglass Expts-Einstein & Bethe-Nov 25 2011
Lattice Energy LLC-LENRs ca 1950s-Sternglass Expts-Einstein & Bethe-Nov 25 2011Lattice Energy LLC-LENRs ca 1950s-Sternglass Expts-Einstein & Bethe-Nov 25 2011
Lattice Energy LLC-LENRs ca 1950s-Sternglass Expts-Einstein & Bethe-Nov 25 2011
 
026 daniel
026 daniel026 daniel
026 daniel
 
Presentación1
Presentación1Presentación1
Presentación1
 
Producto final
Producto finalProducto final
Producto final
 
Melisa rueda blandon ciencias politicas
Melisa rueda blandon ciencias politicasMelisa rueda blandon ciencias politicas
Melisa rueda blandon ciencias politicas
 
Bilden till helga
Bilden till helgaBilden till helga
Bilden till helga
 
Capítulo 9 libro zacatollan una hist... copia
Capítulo 9 libro zacatollan una hist...   copiaCapítulo 9 libro zacatollan una hist...   copia
Capítulo 9 libro zacatollan una hist... copia
 
UNIDADES BÁSICAS DE LA FÍSICA
UNIDADES BÁSICAS DE LA FÍSICAUNIDADES BÁSICAS DE LA FÍSICA
UNIDADES BÁSICAS DE LA FÍSICA
 
Trastornos específicos del lenguaje
Trastornos específicos del lenguajeTrastornos específicos del lenguaje
Trastornos específicos del lenguaje
 
Insight from CloverPoint - 3D Asset and Facilities Management
Insight from CloverPoint - 3D Asset and Facilities ManagementInsight from CloverPoint - 3D Asset and Facilities Management
Insight from CloverPoint - 3D Asset and Facilities Management
 
Energia j
Energia jEnergia j
Energia j
 
Arbol navidadreyesmagos.2011
Arbol navidadreyesmagos.2011Arbol navidadreyesmagos.2011
Arbol navidadreyesmagos.2011
 
Como Ganar Dinero Con Youtube
Como Ganar Dinero Con YoutubeComo Ganar Dinero Con Youtube
Como Ganar Dinero Con Youtube
 
Clases de oraciones
Clases de oracionesClases de oraciones
Clases de oraciones
 
Trabajo en equipo aurissssss b
Trabajo en equipo aurissssss bTrabajo en equipo aurissssss b
Trabajo en equipo aurissssss b
 
Y ahora como hago
Y ahora como hagoY ahora como hago
Y ahora como hago
 
Finanzas internaconales situación económica ecuador
Finanzas internaconales situación económica ecuadorFinanzas internaconales situación económica ecuador
Finanzas internaconales situación económica ecuador
 
12 errores con hijos
12 errores con hijos12 errores con hijos
12 errores con hijos
 
Datos socioeconómicos de consuegra (2)
Datos socioeconómicos de consuegra (2)Datos socioeconómicos de consuegra (2)
Datos socioeconómicos de consuegra (2)
 

Similar to The spectre of the spectrum

Spectra of Large Network
Spectra of Large NetworkSpectra of Large Network
Spectra of Large NetworkDavid Gleich
 
Global Modeling of Biodiversity and Climate Change
Global Modeling of Biodiversity and Climate ChangeGlobal Modeling of Biodiversity and Climate Change
Global Modeling of Biodiversity and Climate ChangeSalford Systems
 
Skew-symmetric matrix completion for rank aggregation
Skew-symmetric matrix completion for rank aggregationSkew-symmetric matrix completion for rank aggregation
Skew-symmetric matrix completion for rank aggregationDavid Gleich
 
Collaborative Similarity Measure for Intra-Graph Clustering
Collaborative Similarity Measure for Intra-Graph ClusteringCollaborative Similarity Measure for Intra-Graph Clustering
Collaborative Similarity Measure for Intra-Graph ClusteringWaqas Nawaz
 
Sgg crest-presentation-final
Sgg crest-presentation-finalSgg crest-presentation-final
Sgg crest-presentation-finalmarpierc
 
Simulation Informatics; Analyzing Large Scientific Datasets
Simulation Informatics; Analyzing Large Scientific DatasetsSimulation Informatics; Analyzing Large Scientific Datasets
Simulation Informatics; Analyzing Large Scientific DatasetsDavid Gleich
 
A Methodology Of Forest Monitoring From Hyperspectral Images With Sparse Regu...
A Methodology Of Forest Monitoring From Hyperspectral Images With Sparse Regu...A Methodology Of Forest Monitoring From Hyperspectral Images With Sparse Regu...
A Methodology Of Forest Monitoring From Hyperspectral Images With Sparse Regu...sudare
 
A_Methodology_of_Forest_Monitoring_from_Hyperspectral_Images_with_Sparse_Regu...
A_Methodology_of_Forest_Monitoring_from_Hyperspectral_Images_with_Sparse_Regu...A_Methodology_of_Forest_Monitoring_from_Hyperspectral_Images_with_Sparse_Regu...
A_Methodology_of_Forest_Monitoring_from_Hyperspectral_Images_with_Sparse_Regu...grssieee
 
Two numerical graph algorithms
Two numerical graph algorithmsTwo numerical graph algorithms
Two numerical graph algorithmsDavid Gleich
 
e-Science for the Science Kilometre Array
e-Science for the Science Kilometre Arraye-Science for the Science Kilometre Array
e-Science for the Science Kilometre ArrayJoint ALMA Observatory
 
Overlapping clusters for distributed computation
Overlapping clusters for distributed computationOverlapping clusters for distributed computation
Overlapping clusters for distributed computationDavid Gleich
 
Big Data Analytics with Storm, Spark and GraphLab
Big Data Analytics with Storm, Spark and GraphLabBig Data Analytics with Storm, Spark and GraphLab
Big Data Analytics with Storm, Spark and GraphLabImpetus Technologies
 
Fast Katz and Commuters
Fast Katz and CommutersFast Katz and Commuters
Fast Katz and CommutersDavid Gleich
 
An Effective Rule Miner for Instance Matching in a Web of Data
An Effective Rule Miner for Instance Matching in a Web of DataAn Effective Rule Miner for Instance Matching in a Web of Data
An Effective Rule Miner for Instance Matching in a Web of DataXing Niu
 
Spectral methods for linear systems with random inputs
Spectral methods for linear systems with random inputsSpectral methods for linear systems with random inputs
Spectral methods for linear systems with random inputsDavid Gleich
 
Data-Intensive Text Processing with MapReduce
Data-Intensive Text Processing  with MapReduce Data-Intensive Text Processing  with MapReduce
Data-Intensive Text Processing with MapReduce George Ang
 
Data-Intensive Text Processing with MapReduce
Data-Intensive Text Processing with MapReduceData-Intensive Text Processing with MapReduce
Data-Intensive Text Processing with MapReduceGeorge Ang
 
Neural Networks, Spark MLlib, Deep Learning
Neural Networks, Spark MLlib, Deep LearningNeural Networks, Spark MLlib, Deep Learning
Neural Networks, Spark MLlib, Deep LearningAsim Jalis
 

Similar to The spectre of the spectrum (20)

Spectra of Large Network
Spectra of Large NetworkSpectra of Large Network
Spectra of Large Network
 
Global Modeling of Biodiversity and Climate Change
Global Modeling of Biodiversity and Climate ChangeGlobal Modeling of Biodiversity and Climate Change
Global Modeling of Biodiversity and Climate Change
 
Skew-symmetric matrix completion for rank aggregation
Skew-symmetric matrix completion for rank aggregationSkew-symmetric matrix completion for rank aggregation
Skew-symmetric matrix completion for rank aggregation
 
Collaborative Similarity Measure for Intra-Graph Clustering
Collaborative Similarity Measure for Intra-Graph ClusteringCollaborative Similarity Measure for Intra-Graph Clustering
Collaborative Similarity Measure for Intra-Graph Clustering
 
Sgg crest-presentation-final
Sgg crest-presentation-finalSgg crest-presentation-final
Sgg crest-presentation-final
 
Simulation Informatics; Analyzing Large Scientific Datasets
Simulation Informatics; Analyzing Large Scientific DatasetsSimulation Informatics; Analyzing Large Scientific Datasets
Simulation Informatics; Analyzing Large Scientific Datasets
 
A Methodology Of Forest Monitoring From Hyperspectral Images With Sparse Regu...
A Methodology Of Forest Monitoring From Hyperspectral Images With Sparse Regu...A Methodology Of Forest Monitoring From Hyperspectral Images With Sparse Regu...
A Methodology Of Forest Monitoring From Hyperspectral Images With Sparse Regu...
 
A_Methodology_of_Forest_Monitoring_from_Hyperspectral_Images_with_Sparse_Regu...
A_Methodology_of_Forest_Monitoring_from_Hyperspectral_Images_with_Sparse_Regu...A_Methodology_of_Forest_Monitoring_from_Hyperspectral_Images_with_Sparse_Regu...
A_Methodology_of_Forest_Monitoring_from_Hyperspectral_Images_with_Sparse_Regu...
 
Two numerical graph algorithms
Two numerical graph algorithmsTwo numerical graph algorithms
Two numerical graph algorithms
 
e-Science for the Science Kilometre Array
e-Science for the Science Kilometre Arraye-Science for the Science Kilometre Array
e-Science for the Science Kilometre Array
 
V. Batagelj - Big data Networks from data bases
V. Batagelj - Big data Networks from data basesV. Batagelj - Big data Networks from data bases
V. Batagelj - Big data Networks from data bases
 
Overlapping clusters for distributed computation
Overlapping clusters for distributed computationOverlapping clusters for distributed computation
Overlapping clusters for distributed computation
 
Lightspeed SIGGRAPH talk
Lightspeed SIGGRAPH talkLightspeed SIGGRAPH talk
Lightspeed SIGGRAPH talk
 
Big Data Analytics with Storm, Spark and GraphLab
Big Data Analytics with Storm, Spark and GraphLabBig Data Analytics with Storm, Spark and GraphLab
Big Data Analytics with Storm, Spark and GraphLab
 
Fast Katz and Commuters
Fast Katz and CommutersFast Katz and Commuters
Fast Katz and Commuters
 
An Effective Rule Miner for Instance Matching in a Web of Data
An Effective Rule Miner for Instance Matching in a Web of DataAn Effective Rule Miner for Instance Matching in a Web of Data
An Effective Rule Miner for Instance Matching in a Web of Data
 
Spectral methods for linear systems with random inputs
Spectral methods for linear systems with random inputsSpectral methods for linear systems with random inputs
Spectral methods for linear systems with random inputs
 
Data-Intensive Text Processing with MapReduce
Data-Intensive Text Processing  with MapReduce Data-Intensive Text Processing  with MapReduce
Data-Intensive Text Processing with MapReduce
 
Data-Intensive Text Processing with MapReduce
Data-Intensive Text Processing with MapReduceData-Intensive Text Processing with MapReduce
Data-Intensive Text Processing with MapReduce
 
Neural Networks, Spark MLlib, Deep Learning
Neural Networks, Spark MLlib, Deep LearningNeural Networks, Spark MLlib, Deep Learning
Neural Networks, Spark MLlib, Deep Learning
 

More from David Gleich

Engineering Data Science Objectives for Social Network Analysis
Engineering Data Science Objectives for Social Network AnalysisEngineering Data Science Objectives for Social Network Analysis
Engineering Data Science Objectives for Social Network AnalysisDavid Gleich
 
Correlation clustering and community detection in graphs and networks
Correlation clustering and community detection in graphs and networksCorrelation clustering and community detection in graphs and networks
Correlation clustering and community detection in graphs and networksDavid Gleich
 
Spectral clustering with motifs and higher-order structures
Spectral clustering with motifs and higher-order structuresSpectral clustering with motifs and higher-order structures
Spectral clustering with motifs and higher-order structuresDavid Gleich
 
Higher-order organization of complex networks
Higher-order organization of complex networksHigher-order organization of complex networks
Higher-order organization of complex networksDavid Gleich
 
Spacey random walks and higher-order data analysis
Spacey random walks and higher-order data analysisSpacey random walks and higher-order data analysis
Spacey random walks and higher-order data analysisDavid Gleich
 
Non-exhaustive, Overlapping K-means
Non-exhaustive, Overlapping K-meansNon-exhaustive, Overlapping K-means
Non-exhaustive, Overlapping K-meansDavid Gleich
 
Using Local Spectral Methods to Robustify Graph-Based Learning
Using Local Spectral Methods to Robustify Graph-Based LearningUsing Local Spectral Methods to Robustify Graph-Based Learning
Using Local Spectral Methods to Robustify Graph-Based LearningDavid Gleich
 
Spacey random walks and higher order Markov chains
Spacey random walks and higher order Markov chainsSpacey random walks and higher order Markov chains
Spacey random walks and higher order Markov chainsDavid Gleich
 
PageRank Centrality of dynamic graph structures
PageRank Centrality of dynamic graph structuresPageRank Centrality of dynamic graph structures
PageRank Centrality of dynamic graph structuresDavid Gleich
 
Iterative methods with special structures
Iterative methods with special structuresIterative methods with special structures
Iterative methods with special structuresDavid Gleich
 
Big data matrix factorizations and Overlapping community detection in graphs
Big data matrix factorizations and Overlapping community detection in graphsBig data matrix factorizations and Overlapping community detection in graphs
Big data matrix factorizations and Overlapping community detection in graphsDavid Gleich
 
Anti-differentiating approximation algorithms: A case study with min-cuts, sp...
Anti-differentiating approximation algorithms: A case study with min-cuts, sp...Anti-differentiating approximation algorithms: A case study with min-cuts, sp...
Anti-differentiating approximation algorithms: A case study with min-cuts, sp...David Gleich
 
Localized methods for diffusions in large graphs
Localized methods for diffusions in large graphsLocalized methods for diffusions in large graphs
Localized methods for diffusions in large graphsDavid Gleich
 
Anti-differentiating Approximation Algorithms: PageRank and MinCut
Anti-differentiating Approximation Algorithms: PageRank and MinCutAnti-differentiating Approximation Algorithms: PageRank and MinCut
Anti-differentiating Approximation Algorithms: PageRank and MinCutDavid Gleich
 
Fast relaxation methods for the matrix exponential
Fast relaxation methods for the matrix exponential Fast relaxation methods for the matrix exponential
Fast relaxation methods for the matrix exponential David Gleich
 
Fast matrix primitives for ranking, link-prediction and more
Fast matrix primitives for ranking, link-prediction and moreFast matrix primitives for ranking, link-prediction and more
Fast matrix primitives for ranking, link-prediction and moreDavid Gleich
 
Gaps between the theory and practice of large-scale matrix-based network comp...
Gaps between the theory and practice of large-scale matrix-based network comp...Gaps between the theory and practice of large-scale matrix-based network comp...
Gaps between the theory and practice of large-scale matrix-based network comp...David Gleich
 
MapReduce Tall-and-skinny QR and applications
MapReduce Tall-and-skinny QR and applicationsMapReduce Tall-and-skinny QR and applications
MapReduce Tall-and-skinny QR and applicationsDavid Gleich
 
Recommendation and graph algorithms in Hadoop and SQL
Recommendation and graph algorithms in Hadoop and SQLRecommendation and graph algorithms in Hadoop and SQL
Recommendation and graph algorithms in Hadoop and SQLDavid Gleich
 
Relaxation methods for the matrix exponential on large networks
Relaxation methods for the matrix exponential on large networksRelaxation methods for the matrix exponential on large networks
Relaxation methods for the matrix exponential on large networksDavid Gleich
 

More from David Gleich (20)

Engineering Data Science Objectives for Social Network Analysis
Engineering Data Science Objectives for Social Network AnalysisEngineering Data Science Objectives for Social Network Analysis
Engineering Data Science Objectives for Social Network Analysis
 
Correlation clustering and community detection in graphs and networks
Correlation clustering and community detection in graphs and networksCorrelation clustering and community detection in graphs and networks
Correlation clustering and community detection in graphs and networks
 
Spectral clustering with motifs and higher-order structures
Spectral clustering with motifs and higher-order structuresSpectral clustering with motifs and higher-order structures
Spectral clustering with motifs and higher-order structures
 
Higher-order organization of complex networks
Higher-order organization of complex networksHigher-order organization of complex networks
Higher-order organization of complex networks
 
Spacey random walks and higher-order data analysis
Spacey random walks and higher-order data analysisSpacey random walks and higher-order data analysis
Spacey random walks and higher-order data analysis
 
Non-exhaustive, Overlapping K-means
Non-exhaustive, Overlapping K-meansNon-exhaustive, Overlapping K-means
Non-exhaustive, Overlapping K-means
 
Using Local Spectral Methods to Robustify Graph-Based Learning
Using Local Spectral Methods to Robustify Graph-Based LearningUsing Local Spectral Methods to Robustify Graph-Based Learning
Using Local Spectral Methods to Robustify Graph-Based Learning
 
Spacey random walks and higher order Markov chains
Spacey random walks and higher order Markov chainsSpacey random walks and higher order Markov chains
Spacey random walks and higher order Markov chains
 
PageRank Centrality of dynamic graph structures
PageRank Centrality of dynamic graph structuresPageRank Centrality of dynamic graph structures
PageRank Centrality of dynamic graph structures
 
Iterative methods with special structures
Iterative methods with special structuresIterative methods with special structures
Iterative methods with special structures
 
Big data matrix factorizations and Overlapping community detection in graphs
Big data matrix factorizations and Overlapping community detection in graphsBig data matrix factorizations and Overlapping community detection in graphs
Big data matrix factorizations and Overlapping community detection in graphs
 
Anti-differentiating approximation algorithms: A case study with min-cuts, sp...
Anti-differentiating approximation algorithms: A case study with min-cuts, sp...Anti-differentiating approximation algorithms: A case study with min-cuts, sp...
Anti-differentiating approximation algorithms: A case study with min-cuts, sp...
 
Localized methods for diffusions in large graphs
Localized methods for diffusions in large graphsLocalized methods for diffusions in large graphs
Localized methods for diffusions in large graphs
 
Anti-differentiating Approximation Algorithms: PageRank and MinCut
Anti-differentiating Approximation Algorithms: PageRank and MinCutAnti-differentiating Approximation Algorithms: PageRank and MinCut
Anti-differentiating Approximation Algorithms: PageRank and MinCut
 
Fast relaxation methods for the matrix exponential
Fast relaxation methods for the matrix exponential Fast relaxation methods for the matrix exponential
Fast relaxation methods for the matrix exponential
 
Fast matrix primitives for ranking, link-prediction and more
Fast matrix primitives for ranking, link-prediction and moreFast matrix primitives for ranking, link-prediction and more
Fast matrix primitives for ranking, link-prediction and more
 
Gaps between the theory and practice of large-scale matrix-based network comp...
Gaps between the theory and practice of large-scale matrix-based network comp...Gaps between the theory and practice of large-scale matrix-based network comp...
Gaps between the theory and practice of large-scale matrix-based network comp...
 
MapReduce Tall-and-skinny QR and applications
MapReduce Tall-and-skinny QR and applicationsMapReduce Tall-and-skinny QR and applications
MapReduce Tall-and-skinny QR and applications
 
Recommendation and graph algorithms in Hadoop and SQL
Recommendation and graph algorithms in Hadoop and SQLRecommendation and graph algorithms in Hadoop and SQL
Recommendation and graph algorithms in Hadoop and SQL
 
Relaxation methods for the matrix exponential on large networks
Relaxation methods for the matrix exponential on large networksRelaxation methods for the matrix exponential on large networks
Relaxation methods for the matrix exponential on large networks
 

Recently uploaded

Barangay Council for the Protection of Children (BCPC) Orientation.pptx
Barangay Council for the Protection of Children (BCPC) Orientation.pptxBarangay Council for the Protection of Children (BCPC) Orientation.pptx
Barangay Council for the Protection of Children (BCPC) Orientation.pptxCarlos105
 
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATIONTHEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATIONHumphrey A Beña
 
Full Stack Web Development Course for Beginners
Full Stack Web Development Course  for BeginnersFull Stack Web Development Course  for Beginners
Full Stack Web Development Course for BeginnersSabitha Banu
 
Gas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptxGas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptxDr.Ibrahim Hassaan
 
Judging the Relevance and worth of ideas part 2.pptx
Judging the Relevance  and worth of ideas part 2.pptxJudging the Relevance  and worth of ideas part 2.pptx
Judging the Relevance and worth of ideas part 2.pptxSherlyMaeNeri
 
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...Postal Advocate Inc.
 
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTSGRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTSJoshuaGantuangco2
 
Earth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatEarth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatYousafMalik24
 
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdfAMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdfphamnguyenenglishnb
 
Q4 English4 Week3 PPT Melcnmg-based.pptx
Q4 English4 Week3 PPT Melcnmg-based.pptxQ4 English4 Week3 PPT Melcnmg-based.pptx
Q4 English4 Week3 PPT Melcnmg-based.pptxnelietumpap1
 
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdfLike-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdfMr Bounab Samir
 
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITY
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITYISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITY
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITYKayeClaireEstoconing
 
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxMULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxAnupkumar Sharma
 
ACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdfACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdfSpandanaRallapalli
 
Grade 9 Q4-MELC1-Active and Passive Voice.pptx
Grade 9 Q4-MELC1-Active and Passive Voice.pptxGrade 9 Q4-MELC1-Active and Passive Voice.pptx
Grade 9 Q4-MELC1-Active and Passive Voice.pptxChelloAnnAsuncion2
 
How to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPHow to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPCeline George
 

Recently uploaded (20)

Barangay Council for the Protection of Children (BCPC) Orientation.pptx
Barangay Council for the Protection of Children (BCPC) Orientation.pptxBarangay Council for the Protection of Children (BCPC) Orientation.pptx
Barangay Council for the Protection of Children (BCPC) Orientation.pptx
 
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATIONTHEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
 
Full Stack Web Development Course for Beginners
Full Stack Web Development Course  for BeginnersFull Stack Web Development Course  for Beginners
Full Stack Web Development Course for Beginners
 
Gas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptxGas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptx
 
Judging the Relevance and worth of ideas part 2.pptx
Judging the Relevance  and worth of ideas part 2.pptxJudging the Relevance  and worth of ideas part 2.pptx
Judging the Relevance and worth of ideas part 2.pptx
 
FINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptx
FINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptxFINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptx
FINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptx
 
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdfTataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
 
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
 
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTSGRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
 
YOUVE_GOT_EMAIL_PRELIMS_EL_DORADO_2024.pptx
YOUVE_GOT_EMAIL_PRELIMS_EL_DORADO_2024.pptxYOUVE_GOT_EMAIL_PRELIMS_EL_DORADO_2024.pptx
YOUVE_GOT_EMAIL_PRELIMS_EL_DORADO_2024.pptx
 
Earth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatEarth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice great
 
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdfAMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
 
Q4 English4 Week3 PPT Melcnmg-based.pptx
Q4 English4 Week3 PPT Melcnmg-based.pptxQ4 English4 Week3 PPT Melcnmg-based.pptx
Q4 English4 Week3 PPT Melcnmg-based.pptx
 
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdfLike-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
 
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITY
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITYISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITY
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITY
 
Raw materials used in Herbal Cosmetics.pptx
Raw materials used in Herbal Cosmetics.pptxRaw materials used in Herbal Cosmetics.pptx
Raw materials used in Herbal Cosmetics.pptx
 
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxMULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
 
ACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdfACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdf
 
Grade 9 Q4-MELC1-Active and Passive Voice.pptx
Grade 9 Q4-MELC1-Active and Passive Voice.pptxGrade 9 Q4-MELC1-Active and Passive Voice.pptx
Grade 9 Q4-MELC1-Active and Passive Voice.pptx
 
How to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPHow to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERP
 

The spectre of the spectrum

  • 1. The Spectre of the Spectrum An empirical study of the ? spectra of large networks David F. Gleich     Purdue University Indiana University Thanks to Ali Pinar, Jaideep Ray,Tammy Kolda, Bloomington, IN C. Seshadhri, Rich Lehoucq @ Sandia and Supported by Sandia’s John von Neumann Jure Leskovec and Michael Mahoney @ Stanford postdoctoral fellowship and the DOE for helpful discussions. Office of Science’s Graphs project. David Gleich (Purdue) IU Seminar 1/51
  • 2. Spectral density plots Log(Density) ~ Count Eigenvalue Eigenvalues Eigenvalue index 0 2 Scree Plot Empirical Spectral Measure David Gleich (Purdue) IU Seminar 2/51
  • 3. There’s information inside the spectra Log(Density) ~ Count Eigenvalues 0 2 Words in dictionary definitions Internet router network 111k vertices, 2.7M edges 192k vetices, 1.2M edges These figures show the normalized Laplacian. Banerjee and Jost (2009) also noted such shapes in the spectra. David Gleich (Purdue) IU Seminar 3/51
  • 4. Overview Graphs and their matrices Data for our experiments Issues with computing spectra Many examples of graph spectra A curious property around the eigenvalue one Computing spectra for large networks Ongoing studies David Gleich (Purdue) IU Seminar 4/51
  • 5. Why are we interested in the spectra? Modeling Properties Moments of the adjacency Regularities Anomalies Network Comparison Fay et al. 2010 – Weighted Spectral Density The network is as19971108 from Jure’s snap collect (a few thousand nodes) and we insert random connections from 50 nodes David Gleich (Purdue) IU Seminar 5/51
  • 6. Matrices from graphs Adjacency matrix Random walk matrix                                                    if               Modularity matrix                                                                            Laplacian matrix                        Not covered Signless Laplacian matrix                  Incidence matrix                             (It is incidentally discussed) Seidel matrix Normalized Laplacian matrix Heat Kernel                                                                         Everything is undirected. Mostly connected components only too. David Gleich (Purdue) IU Seminar 6/51
  • 7. Erdős–Rényi Semi-circles Based on Wigner’s semi-circle law. The eigenvalues of the Count adjacency matrix for n=1000, averaged over 10 trials Warning Adjacency Semi-circle with outlier matrix! if average degree is large enough. Eigenvalue Observed by Farkas and in the book “Network Alignment” edited by Brandes (Chapter 14) David Gleich (Purdue) IU Seminar 7/51
  • 8. Previous results Farkas et al. Significant deviation from the semi- circle law for the adjacency matrix Mihail and Papadimitriou Leading eigenvalues of the adjacency matrix obey a power-law based on the degree-sequence Chung et al. Normalized Laplacian still obeys a semi-circle law if min-degree large Banerjee and Jost Study of types of patterns that emerge in evolving graph models – explain many features of the spectra David Gleich (Purdue) IU Seminar 8/51
  • 9. In comparison to other empiric studies We use “exact” computation of spectra, instead of approximation. We study “all” of the standard matrices over a range of large networks. Our “large” is bigger. We look at a few random graph models preferential attachment random powerlaw copying model forest fire model David Gleich (Purdue) IU Seminar 9/51
  • 10. Why you ISSUES WITH should be COMPUTING very SPECTRA careful with eigenvalues. David Gleich (Purdue) IU Seminar 10
  • 11. Matlab! Always a great starting point. My desktop has 24GB of RAM (less than $2500 now!) 24GB/8 bytes (per double) = 3 billion numbers ~ 50,000-by-50,000 matrix Possibilities D = eig(A) – needs twice the memory for A,D [V,D] = eig(A) – needs three times the memory for A,D,V These limit us to ~38000 and ~31000 respectively. David Gleich (Purdue) IU Seminar 11/51
  • 12. Bugs – Matlab eig(A) Returns incorrect eigenvectors Seems to be the result of a bug in Intel’s MKL library. Fixed in R2011a David Gleich (Purdue) IU Seminar 12/51
  • 13. Bug – ScaLAPACK default sudo apt-get install scalapack-openmpi Allocate 36000x36000 local matrix Run on 4 processors Code crashes Fixed in upcoming Debian libscalapack David Gleich (Purdue) IU Seminar 13/51
  • 14. Bug – LAPACK Scalapack MRRR Compare standard lapack/blas to atlas performance Result: correct output from atlas Result: incorrect output from lapack Hypothesis: lapack contains a known bug that’s apparently in the default ubuntu lapack David Gleich (Purdue) IU Seminar 14/51
  • 15. Moral Always test your software. Extensively. David Gleich (Purdue) IU Seminar 15/51
  • 16. David Gleich (Purdue) IU Seminar 16
  • 18. Data sources SNAP Various 100s-100,000s SNAP-p2p Gnutella Network 5-60k, ~30 inst. SNAP-as-733 Autonomous Sys. ~5,000, 733 inst. SNAP-caida Router networks ~20,000, ~125 inst. Pajek Various 100s-100,000s Models Copying Model 1k-100k 9 inst. 324 gs Pref. Attach 1k-100k 9 inst. 164 gs Forest Fire 1k-100k 9 inst. 324 gs Mine Various 2k-500k Newman Various Arenas Various Porter Facebook 100 schools, 5k-60k IsoRank, Natalie Protein-Protein <10k , 4 graphs Thanks to all who make data available David Gleich (Purdue) IU Seminar 18/51
  • 19. Big graphs Arxiv 86376 1035126 Co-author Dblp 93156 356290 Co-author Dictionary(*) 111982 2750576 Word defns. Internet(*) 124651 414428 Routers Itdk0304 190914 1215220 Routers p2p-gnu(*) 62561 295756 Peer-to-peer Patents(*) 230686 1109898 Citations Roads 126146 323900 Roads Wordnet(*) 75606 240036 Word relation web-nb.edu(*) 325729 2994268 Web (*) denotes that this is a weakly connected component of a directed graph. David Gleich (Purdue) IU Seminar 19/51
  • 20. A $8,000 matrix computation 925 nodes and 7400 processors on Redsky for 10 hours normalized Laplacian matrix David Gleich (Purdue) IU Seminar 20/51
  • 21. Indiana’s Facebook Network Warning Warning Adjacency Laplacian marix! marix! Data from Mason Porter. Aka, the start of a $50,000,000,000 graph. David Gleich (Purdue) IU Seminar 21/51
  • 22. Yes! These are cases where we have multiple instances of the same graph. David Gleich (Purdue) IU Seminar 22/51
  • 23. Already known? Just the facebook spectra. David Gleich (Purdue) IU Seminar 23/51
  • 24. Already known? I soon realized I was searching for “spectre” instead of spectrum, oops. David Gleich (Purdue) IU Seminar 24/51
  • 25. Spikes? Repeated rows Identical rows grow the null-space. Banerjee and Jost Unit eigenvalue Motif doubling and joining small                                           repeated graphs will tend to cause eigenvalues and null vectors. Banerjee and Jost explained how evolving graphs should produce repeated eigenvalues David Gleich (Purdue) IU Seminar 25/51
  • 26. Combining Eigenvalues If A has an eigenvector with a zero component, then “A + B” (as in the figure) has the same eigenvalue with eigenvector extended with zeros on B. Bannerjee and Jost observed this for the normalized Laplacian. David Gleich (Purdue) IU Seminar 26/51
  • 27. Spikes! 1.33 (two!) 1.5, 0.5 1.5 0.565741 1.833 1.767592 0.725708 1.607625 1.5 (two) David Gleich (Purdue) IU Seminar 27/51
  • 28. Classic spectra From NASA: http://imagine.gsfc.nasa.gov/docs/science/how_l1/spectral_what.html David Gleich (Purdue) IU Seminar 28/51
  • 29. Random power law Random power law 12500 vertices, 500 (2.*) / 400 (1.8) min degree Generate a power law degree distribution. Produce a random graph with a prescribed degree distribution using the Bayati-Kim-Saberi procedure. David Gleich (Purdue) IU Seminar 29/51
  • 30. Preferential Attachment Start graph with a k-node clique. Add a new node and connect to k random nodes, chosen proportional to degree. Semi-circle in log-space! David Gleich (Purdue) IU Seminar 30/51
  • 31. Copying model Start graph with a k-node clique. Add a new node and pick a parent uniformly at random. Copy edges of parent and make an error with probability     Obvious follow up here: does a random sample with the same degree distribution show the same thing? David Gleich (Purdue) IU Seminar 31/51
  • 32. Forest Fire models Start graph with a k-node clique. Add a new node and pick a parent uniformly at random. Do a random “bfs’/”forest fire” and link to all nodes “burned” David Gleich (Purdue) IU Seminar 32/51
  • 33. Reality vs. graph models David Gleich (Purdue) IU Seminar 33/51
  • 34. Where is this going? We can compute spectra for large networks if needed. Study relationship with known power- laws in spectra Eigenvector localization Directed Laplacians David Gleich (Purdue) IU Seminar 34/51
  • 35. Just the degree distribution? No David Gleich (Purdue) IU Seminar 35/51
  • 36. Facebook is not a copying model David Gleich (Purdue) IU Seminar 36/51
  • 37. Why is one separated sometimes? David Gleich (Purdue) IU Seminar 37/51
  • 38. Separation in copying, forest-fire Note, the axes on these fiures aren’t comparable – different plotting scales – but the shapes are. David Gleich (Purdue) IU Seminar 38/51
  • 39. Forest-fire plots David Gleich (Purdue) IU Seminar 39/51
  • 40. Strong separation in random powerlaw David Gleich (Purdue) IU Seminar 40/51
  • 41. Not edge-density alone David Gleich (Purdue) IU Seminar 41/51
  • 42. Not edge-density alone David Gleich (Purdue) IU Seminar 42/51
  • 43. COMPUTING SPECTRA OF LARGE NETWORKS David Gleich (Purdue) IU Seminar 43
  • 44. Redsky, Hopper I, Hopper II, and a Cielo testbed. Details if time. David Gleich (Purdue) IU Seminar 44/51
  • 45. Eigenvalues with ScaLAPACK Mostly the same approach as in LAPACK 1.  Reduce to tridiagonal form (most time consuming part) 2.  Distribute tridiagonals to all processors 3.  Each processor finds all eigenvalues 4.  Each processor computes a subset of eigenvectors ScaLAPACK’s 2d block cyclic storage I’m actually using the MRRR algorithm, where steps 3 and 4 are better and faster MRRR due to Parlett and Dhillon; implemented in ScaLAPACK by Christof Vomel. David Gleich (Purdue) IU Seminar 45/51
  • 46. Estimating the density directly    and have the same eigenvalue inertia if     is non- singular. Eigenvalue inertia = (p,n,z) Positive eigenvalues Negative eigenvalues Zero eigenvalues If       is diagonal, inertia is easy to compute     has inertia (n-1,0,1)        has inertia (sum(      ), sum(      ),…) This is an old trick in linear algebra. I know that Fay et al. used it in their weighted spectral density. David Gleich (Purdue) IU Seminar 46/51
  • 47. Alternatives Use ARPACK to get extrema Use ARPACK to get interior around      via the folded spectrum                      Large nearly repeated sets of eigenvalues will make this tricky. Farkas et al. used this approach. Figure from somewhere on the web… sorry! David Gleich (Purdue) IU Seminar 47/51
  • 48. Adding MPI tasks vs. using threads Most math libraries have threaded versions (Intel MKL, AMD ACML) Is it better to use threads or MPI tasks? It depends. Cray libsci Intel MKL Threads Ranks Time Threads Ranks Time-T Time-E 1 64 1412.5 1 36 1271.4 339.0 4 16 1881.4 4 9 1058.1 456.6 16 4 Omitted. Normalized Laplacian for 36k-by-36k co-author graph of CondMat David Gleich (Purdue) IU Seminar 48/51
  • 49. Weak Parallel Scaling Time                Time in hours Good strong scaling up to 325,000 vertices Estimated time for 500,000 nodes 9 hours with 925 nodes (7400 procs)          David Gleich (Purdue) IU Seminar 49/51
  • 50. Code will be available eventually. Image from good financial cents. David Gleich (Purdue) IU Seminar 50 of <Total>
  • 51. Same density, but somewhat different Both have a mean degree of 3.8 David Gleich (Purdue) IU Seminar 51/51
  • 52. GRAPHS As well as things we AND THEIR already know about graph spectra. MATRICES David Gleich (Purdue) IU Seminar 52
  • 53. Spectral bounds from Gerschgorin $$-d_{max} le lambda(mA) le d_{max}$$ $$0 le lambda(mL) le 2 d_{max}$$ $$0 le lambda(mat{tilde{L}}) le 2$$ (from a slightly different approach) David Gleich (Purdue) IU Seminar 53/51
  • 54. ScaLAPACK LAPACK with distributed memory dense matrices Scalapack uses a 2d block-cyclic dense matrix distribution David Gleich (Purdue) IU Seminar 54/51
  • 55. usroads Connected component from the us road network David Gleich (Purdue) IU Seminar 55/51
  • 56. Movies of spectra… As these models evolve, what do the spectra look like? David Gleich (Purdue) IU Seminar 56/51
  • 57. Gnutella large vs. resampled David Gleich (Purdue) IU Seminar 57/51
  • 58. David Gleich (Purdue) IU Seminar 58 of <Total>
  • 59. Models Preferential Attachment Start graph with a k-node clique. Add a new node and connect to k random nodes, chosen proportional to degree. Copying model Start graph with a k-node clique. Add a new node and pick a parent uniformly at random. Copy edges of parent and make an error with probability $$alpha$$ Forest Fire Start graph with a k-node clique. Add a new node and pick a parent uniformly at random. Do a random “bfs’/”forest fire” and link to all nodes “burned” David Gleich (Purdue) IU Seminar 59/51
  • 60. Nullspaces of the adjacency matrix                                           So unit eigenvalues of the normalized Laplacian are null- vectors of the adjacency matrix. David Gleich (Purdue) IU Seminar 60/51