Clique Relaxation Models in Networks: Theory, Algorithms, and Applications
1. Clique Relaxation Models in Networks:
Theory, Algorithms, and Applications
Sergiy Butenko (butenko@tamu.edu)
Department of Industrial and Systems Engineering
Texas A&M University
College Station, TX 77843-3131
1/93
2. Outline
Introduction
Graph Theory Basics
Taxonomy of Clique Relaxations
Theory
Complexity Issues
Analytical bounds
Mathematical programming formulations
Algorithms
Node Deletion Problems with Heredity
Scale Reduction Approaches
Conclusion and References
2/93
3. Outline
Introduction
Graph Theory Basics
Taxonomy of Clique Relaxations
Theory
Complexity Issues
Analytical bounds
Mathematical programming formulations
Algorithms
Node Deletion Problems with Heredity
Scale Reduction Approaches
Conclusion and References
3/93
4. Graphs
Graph is the mathematical term for a “network” - often visualized as
vertices (points, nodes) connected by edges (lines, arcs)
4/93
5. Graph theory basics
The origins of graph theory are attributed to the Seven Bridges of
K¨nigsberg problem solved by Leonhard Euler in 1735.
o
5/93
6. Graph theory basics
A simple, undirected graph is a pair G = (V , E ), where V is a finite set of
vertices and E ⊆ V × V is a set of edges, with each edge defined on a pair
of vertices.
6
5 1
4 2
3
V = {1, 2, 3, 4, 5, 6}
E = {(1, 2), (1, 6), (2, 3), (2, 4), (3, 4), (4, 5), (5, 6)}
6/93
7. Graph theory basics
A graph can be represented in several ways.
We can represent a graph G = (V , E ) by its |V | × |V |
adjacency matrix AG = [aij ], such that
1, if (vi , vj ) ∈ E ,
aij =
0, otherwise.
Another way of representing a graph is by its adjacency lists,
where for each vertex v ∈ V we record the set of vertices
adjacent to it.
7/93
8. Graph theory basics
A multigraph is a graph with repeated edges.
A directed graph, or digraph, is a graph with directions assigned
to its edges.
2
€€ 7
€€
b
€€ I
d q
5 €
d
d I €€
d
d €€
€ q ~
1 E 3
€€
d d
8 E 10
d d
d €€ ‚
I
b
€€
q d
6 €
d
dd
I €€
€€ ‚
~
q
€
4 9
8/93
9. Graph theory basics
If G = (V , E ) is a graph and e = (u, v ) ∈ E , we say that u is
adjacent to v and vice-versa. We also say that u and v are
neighbors.
Neighborhood N(v ) of a vertex v is the set of all its neighbors
in G : N(v ) = {j : (i, j) ∈ E }.
If G = (V , E ) is a graph and e = (u, v ) ∈ E , we say that e is
incident to u and v .
The degree of a vertex v is the number of its incident edges.
9/93
10. Graph theory basics
G = (V , E ) is a simple undirected graph, V = {1, 2, . . . , n}.
G = (V , E ), is the complement graph of G = (V , E ), where
E = {(i, j) | i, j ∈ V , i = j and (i, j) ∈ E }.
/
For S ⊆ V , G (S) = (S, E ∩ S × S) the subgraph induced by S.
10/93
11. Cliques and independent sets
A subset of vertices C ⊆ V is called a clique if G (C ) is a
complete graph.
A subset I ⊆ V is called an independent set (stable set, vertex
packing) if G (I ) has no edges.
¯
C is a clique in G if and only if C is an independent set in G .
11/93
12. Cliques and independent sets
A clique (independent set) is said to be
– maximal, if it is not a subset of any larger clique (independent
set);
– maximum, if there is no larger clique (independent set) in the
graph.
ω(G ) – the clique number of G .
α(G ) – the independence (stability) number of G .
12/93
13. Cliques and independent sets
G 1 1 G
Complement
5 2 5 2
4 3 4 3
{1,2,5} : maximal clique {1,2,5} : maximal
independent set
{1,4} : maximal
{1,4} : maximal clique
independent set
{2,3,4,5} : maximum clique {2,3,4,5} : maximum
independent set
13/93
14. Social networks
A social network is described by G = (V , E ) where V is the set of
“actors” and E is the set of “ties”.
actors are people and a tie exists if two people know each other.
actors are wire transfer database records and a tie exists if two
records have the same matching field.
actors are telephone numbers and a tie exists if calls were made
between them.
14/93
15. Social network analysis
“Popular” social networks:
Kevin Bacon Number
Erd¨s Number
o
Six degrees of separation and small world phenomenon in
acquaintance networks
15/93
16. Cohesive subgroups
Cohesive subgroups are “tightly knit groups” in a social
network.
Social cohesion is often used to explain and develop sociological
theories.
Members of a cohesive subgroup are believed to share
information, have homogeneity of thought, identity, beliefs,
behavior, even food habits and illnesses.
16/93
17. Applications
Acquaintance Networks - criminal network analysis
Wire Transfer Database Networks - detecting money
laundering
Call Networks - organized crime detection
Protein Interaction Networks - predicting protein functions
Gene Co-expression Networks - detecting network motifs
Metabolic Networks - identifying metabolic pathways
Stock Market Networks - stock portfolios
Internet Graphs - information search and retrieval
Wireless and telecommunication networks - clustering and
routing
17/93
18. Random graph models
Uniform random graphs (Erdos Renyi):
G (n, p): n vertices, each edge exists with a probability p;
G (n, M): all graphs with n vertices and M edges are assigned
the same probability
Power-law random graphs:
the probability that a vertex has a degree k is proportional to
k −β
Random graphs with a given degree sequence
18/93
19. Scale-free graphs
Scale-free nature : A property observed in many natural and
man-made networks - biological networks, social networks,
internet graphs
Degree distribution follows a power law - P(k) ∝ k −β
Fundamentally different from classical random graph model of
Erd¨s and Renyi
o
Principle of preferential attachment underlies the growth and
evolution of such networks (i.e. the rich get richer!)
19/93
21. Internet colored by IP addresses
Image source:
http://www.jeffkennedyassociates.com:16080/connections/concept/image.html
21/93
22. H. Pylori - protein interactions
Image generated using Graphviz
22/93
23. Properties of cohesive subgroups
Some desirable properties of a cohesive subgroup are:
Familiarity (degree);
Reachability (distance, diameter);
Robustness (connectivity);
Density (edge density).
Earliest models of cohesive subgroups were cliques (Luce and Perry,
1949), representing an “ideal” model of a cohesive subgroup.
23/93
24. Clique relaxations: k-cliques and k-clubs
A k-clique is a subset of vertices C such that for every i, j ∈ C ,
d(i, j) ≤ k.
A k-club is a subset of vertices D such that diam(G [D]) ≤ k.
6
5 1
4 2
3
24/93
25. Clique relaxations: k-cliques and k-clubs
A k-clique is a subset of vertices C such that for every i, j ∈ C ,
d(i, j) ≤ k.
A k-club is a subset of vertices D such that diam(G [D]) ≤ k.
6 {2,3,4} is a 1-club ... the
“regular” clique
5 1
4 2
3
24/93
26. Clique relaxations: k-cliques and k-clubs
A k-clique is a subset of vertices C such that for every i, j ∈ C ,
d(i, j) ≤ k.
A k-club is a subset of vertices D such that diam(G [D]) ≤ k.
6 {2,3,4} is a 1-club ... the
“regular” clique
5 1
{1,2,4,5,6} is a 2-club
4 2
3
24/93
27. Clique relaxations: k-cliques and k-clubs
A k-clique is a subset of vertices C such that for every i, j ∈ C ,
d(i, j) ≤ k.
A k-club is a subset of vertices D such that diam(G [D]) ≤ k.
6 {2,3,4} is a 1-club ... the
“regular” clique
5 1
{1,2,4,5,6} is a 2-club
{1,2,3,4,5} is a 2-clique but
4 2 NOT a 2-club
3
24/93
28. Clique relaxations: k-cliques and k-clubs
A k-clique is a subset of vertices C such that for every i, j ∈ C ,
d(i, j) ≤ k.
A k-club is a subset of vertices D such that diam(G [D]) ≤ k.
6 {2,3,4} is a 1-club ... the
“regular” clique
5 1
{1,2,4,5,6} is a 2-club
{1,2,3,4,5} is a 2-clique but
4 2 NOT a 2-club
maximality of a 2-club is harder
3 to test
24/93
29. k-plex
Definition
A subset of vertices S is said to be a k-plex if the minimum degree
in the induced subgraph δ(G [S]) ≥ |S| − k
i.e. every vertex in G [S] has degree at least |S| − k.
2 1
6 3
5 4
25/93
30. k-plex
Definition
A subset of vertices S is said to be a k-plex if the minimum degree
in the induced subgraph δ(G [S]) ≥ |S| − k
i.e. every vertex in G [S] has degree at least |S| − k.
2 1 {3,4,5,6} is a 1-plex ... the
“regular” clique
6 3
5 4
25/93
31. k-plex
Definition
A subset of vertices S is said to be a k-plex if the minimum degree
in the induced subgraph δ(G [S]) ≥ |S| − k
i.e. every vertex in G [S] has degree at least |S| − k.
2 1 {3,4,5,6} is a 1-plex ... the
“regular” clique
6 3 {1,3,4,5,6} is a 2-plex
(and NOT a 1-plex)
5 4
25/93
32. k-plex
Definition
A subset of vertices S is said to be a k-plex if the minimum degree
in the induced subgraph δ(G [S]) ≥ |S| − k
i.e. every vertex in G [S] has degree at least |S| − k.
2 1
{3,4,5,6} is a 1-plex ... the
“regular” clique
6 3 {1,3,4,5,6} is a 2-plex
(and NOT a 1-plex)
{1,2,3,4,5,6} is a 3-plex
5 4
(and NOT a 2-plex)
25/93
33. co-k-plex
Definition
A subset of vertices S is a co-k-plex if the maximum degree in the
induced subgraph ∆(G [S]) ≤ k − 1.
i.e. degree of every vertex in G [S] is at most k − 1.
S is a co-k-plex in G if and only if S is a k-plex in the complement
¯
graph G .
2 1 2 1
6 3 6 3
5 4 5 4
3-plex Co-3-plex
26/93
34. Structural Properties of a k-Plex
If G is a k-plex then
1. Every subgraph of G is a k-plex;
n+2
2. If k 2 then diam(G ) ≤ 2;
3. κ(G ) ≥ n − 2k + 2.
k-plexes for “small” k values, guarantee reachability and
connectivity while relaxing familiarity.
27/93
36. Outline
Introduction
Graph Theory Basics
Taxonomy of Clique Relaxations
Theory
Complexity Issues
Analytical bounds
Mathematical programming formulations
Algorithms
Node Deletion Problems with Heredity
Scale Reduction Approaches
Conclusion and References
29/93
37. Alternative clique definitions
Distance-based
Vertices are distance one away from each other.
Diameter-based
Vertices induce a subgraph of diameter one.
Domination-based
Every one vertex forms a dominating set.
30/93
38. Alternative clique definitions
Degree-based
Each vertex is connected to all other vertices.
Connectivity-based
All vertices need to be removed to disconnect the induced subgraph.
Density-based
Vertices induce a subgraph that has all possible edges.
31/93
39. Order of a clique relaxation
We introduce a hierarchy of clique relaxation models by defining
their order as follows.
Clique is the only clique relaxation of order 0.
Clique relaxations of order 1 relax one of the elementary
properties (distance, diameter, ...) used to define clique.
Clique relaxations of order 2 relax two of the elementary
clique-defining properties.
...
32/93
40. Nature of a clique relaxation
Parametric clique relaxations
relax a parameter (“one”, “all”) describing an elementary
clique-defining property.
Partial clique relaxations
require only a (high) fraction of elements to satisfy an elementary
clique-defining property or a parametric clique-relaxation property.
33/93
41. Types of parametric clique relaxations
Parametric clique relaxation of type 1
Replace “one” with “(at most) k”, k 1, in a definition of clique.
Parametric clique relaxation of type 2
Replace “one” with “(at most) all but k”, k 1, in a definition of clique.
Parametric clique relaxation of type 1a
Replace “all” with “(at least) k”, k 1, in a definition of clique.
Parametric clique relaxation of type 2a
Replace “all” with “(at least) all but k”, k 1, in a definition of clique.
34/93
42. First-order parametric
clique relaxations of type 1
Replace “one” with “(at most) k”, k 1.
Distance-based: k-clique
Vertices are distance at most k away from each other.
Diameter-based: k-club
Vertices induce a subgraph of diameter at most k.
Domination-based: k-plex
Every k vertices form a dominating set.
35/93
43. First-order parametric
clique relaxations of type 2
Replace “one” with “(at most) all but k”, k 1
Distance-based: N/A
Vertices in S are distance at most |S| − k away from each other.
Diameter-based: N/A
Vertices induce a subgraph G [S] of diameter at most |S| − k.
Domination-based: N/A
Every |S| − k vertices form a dominating set.
36/93
44. First-order parametric
clique relaxations of type 1a
Replace “all” with “(at least) k”, k 1
Degree-based: k-core
Each vertex is connected to at least k other vertices.
Connectivity-based: k-connected subgraph
At least k vertices need to be removed to disconnect the induced
subgraph.
Density-based: N/A
Vertices induce a subgraph that has at least k edges.
37/93
45. First-order parametric
clique relaxations of type 2a
Replace “all” with “(at least) all but k”, k 1.
Degree-based: k-plex
Each vertex is connected to at least all but k other vertices.
Connectivity-based: N/A
At least all but k vertices need to be removed to disconnect the
induced subgraph.
Density-based: N/A
Vertices induce a subgraph that has at least all but k edges.
38/93
46. Outline
Introduction
Graph Theory Basics
Taxonomy of Clique Relaxations
Theory
Complexity Issues
Analytical bounds
Mathematical programming formulations
Algorithms
Node Deletion Problems with Heredity
Scale Reduction Approaches
Conclusion and References
39/93
47. Basics of the complexity theory
Given a combinatorial optimization problem, a natural question
is:
is this problem “easy” or “hard”?
Methods for easy and hard problems are fundamentally
different.
How do we distinguish between “easy” and “hard” problems?
40/93
48. “Easy” problems
By easy or tractable problems we mean the problems that can
be solved in time polynomial with respect to their size.
We also call such problems polynomially solvable and denote
the class of polynomially solvable problems by P.
Sorting
Minimum weight spanning tree
Linear programming
41/93
49. Defining “hard” problems
How do we define “hard” problems?
How about defining hard problems as all problems that are not
easy, i.e., not in P?
Then some of the problems in such a class could be TOO hard
– we cannot even hope to be able to solve them.
We want to define a class of hard problems that we may be
able to solve, if we are lucky (say, we may be able to guess the
solution and check that it is indeed correct).
42/93
50. Defining “hard” problems
Let us take the maximum clique problem as an example and try
to guess a solution.
We can pick some random clique, compute its size, but how do
we know if this solution is optimal?
Indeed, this is as difficult to verify as to solve the maximum
clique problem – it is unlikely that we can count on luck in this
case...
How about transforming our optimization problem to such a
form, where instead of finding an optimal solution the question
would be “is there a solution with objective value ≥ s, where s
is a given integer?
43/93
51. Three versions of optimization problems
Consider a problem
min f (x) subject to x ∈ X .
Optimization version: find x from X that maximizes f (x);
Answer: x ∗ maximizes f (x)
Evaluation version: find the largest possible f (x);
Answer: the largest possible value for f (x) is f ∗
Recognition version: Given f ∗ , does there exist an x such that
f (x) ≥ f ∗ ?
Answer: “yes” or “no”
44/93
52. Recognition problems
Recognition problems can still be undecidable.
Halting problem: Given a computer program with its
input, will it ever halt?
Recall the “luck factor”: if we pick a random feasible solution
and it happens to give “yes” answer, then we solved the
problem in polynomial time.
Max clique: Randomly pick a solution (a clique). If its
size is ≥ s (which we can verify in polynomial time),
then obviously the answer is “yes”. This clique can be
viewed as a certificate proving that this is indeed a yes
instance of max clique.
45/93
53. Class N P
We only consider problems for which any yes instance there
exists a concise (polynomial-size) certificate that can be verified
in polynomial time.
We call this class of problems nondeterministic polynomial and
denote it by N P.
46/93
54. P vs N P
Note that any problem from P is also in N P (i.e., P ⊆ N P),
so there are easy problems in N P.
So, are there “hard” problems in N P, and if there are, how do
we define them?
We don’t know if P = NP, but “most” people believe that
P = NP.
An “easy” way to make $1,000,000!
http://www.claymath.org/millennium/P vs NP/
We can call a problem hard if the fact that we can solve this
problem would mean that we can solve any other problem in
comparable amount of time.
47/93
55. Polynomial reducibility
Reduce π1 to π2 : if we can solve π2 fast, then we can solve π1
fast, given that the reduction is “easy”.
Polynomial reduction from π1 to π2 requires existence of
polynomial-time algorithms
1. A1 converts an input for π1 into an input for π2 ;
2. A2 converts an output for π2 into output for π1 .
Transitivity: If π1 is polynomially reducible to π2 and π2 is
polynomially reducible to π3 then π1 is polynomially reducible
to π3 .
48/93
56. NP-complete problems
A problem π is called NP-complete if
1. π ∈ NP;
2. Any problem from NP can be reduced to π in polynomial time.
A problem π is called NP-hard if any problem from NP can be
reduced to π in polynomial time. (no π ∈ NP requirement)
Due to transitivity of polynomial reducibility, in order to show
that a problem π is N P-complete, it is sufficient to show that
1. π ∈ NP;
2. There is an NP-complete problem π that can be reduced to π
in polynomial time.
To use this observation, we need to know at least one
N P-complete problem...
49/93
57. Satisfiability (SAT) problem
A Boolean variable x is a variable that can assume only the values
true and false.
Boolean variables can be combined to form Boolean formulas using
the following logical operations:
1. Logical AND (∧ or ·) -conjunction
2. Logical OR (∨ or +) - disjunction
3. Logical NOT (¯)
x
kj
A clause is Cj = yjp , where a literal yjp is xr or xr for some r .
¯
p=1
m
Conjunctive normal form (CNF):F = Cj , where Cj is a clause.
j=1
50/93
58. Satisfiability problem
A CNF F is called satisfiable if there is an assignment of
variables such that F = 1 (TRUE ).
Satisfiability (SAT) problem: Given m clauses C1 , . . . , Cm
involving the variables x1 , . . . , xn , is the CNF
m
F = Cj ,
j=1
satisfiable?
Theorem (Cook, 1971)
SAT is NP-complete.
51/93
59. “Best” approximation algorithms and
heuristics
For some problems there are hardness of approximation results
stating that the problem is hard to approximate within a certain
factor.
For example, the k-center problem is hard to approximate
within a factor better than 2.
Then any polynomial-time algorithm approximating the
k-center problem within the factor of 2 can be considered the
“best” approximation algorithm for this problem.
52/93
60. “Best” approximation algorithms and
heuristics
However, some problems are even harder to approximate. For
example, the maximum clique is hard to approximate within a
factor n1− for any positive .
In this case, by the “best” heuristic we could mean a heuristic
that cannot be provably outperformed by any other
polynomial-time algorithm (unless P = NP).
53/93
61. Recognizing the gap between k-club and
l -club numbers
Theorem
Let positive integer constants k and l, l k be given. The problem
of checking whether ωl (G ) = ωk (G ) is NP-hard.
¯ ¯
Note that
ω(G ) ≤ ∆(G ) + 1 ≤ ωk (G )
¯
and observe that we can easily check whether ω(G ) = ∆(G ) + 1.
Hence, it is NP-hard to check whether ωk (G ) = ∆(G ) + 1.
¯
54/93
62. “Best” heuristics for k-club/clique
Corollary
Let k be a fixed integer, k ≥ 2. Unless P = NP, there cannot be a
polynomial time algorithm that finds a k-club of size greater than
∆(G ) + 1 whenever such a k-club exists in the graph.
55/93
67. Outline
Introduction
Graph Theory Basics
Taxonomy of Clique Relaxations
Theory
Complexity Issues
Analytical bounds
Mathematical programming formulations
Algorithms
Node Deletion Problems with Heredity
Scale Reduction Approaches
Conclusion and References
60/93
68. Size of clique relaxations in a graph
Question: How large could a k-clique (k-club, k-plex) be for a
graph with a small clique number?
61/93
69. Size of clique relaxations in a graph
Question: How large could a k-clique (k-club, k-plex) be for a
graph with a small clique number?
Answers:
k-club: ω(K1,n ) = 2, but ωk (K1,n ) = n + 1, could be arbitrary
¯
large
k-clique: same as k-club
61/93
70. Size of clique relaxations in a graph
Question: How large could a k-clique (k-club, k-plex) be for a
graph with a small clique number?
Answers:
k-club: ω(K1,n ) = 2, but ωk (K1,n ) = n + 1, could be arbitrary
¯
large
k-clique: same as k-club
k-plex:
Lemma
For any graph G and a positive integer k,
ωk (G ) ≤ kω(G ) and αk (G ) ≤ kα(G )
61/93
71. Upper bound on the quasi-clique number
Proposition
The γ-clique number ωγ (G ) of a graph G with n vertices and m
edges satisfies the following inequality:
γ+ γ 2 + 8γm
ωγ (G ) ≤ . (1)
2γ
Moreover, if a graph G is connected then
γ+2+ (γ + 2)2 + 8(m − n)γ
ωγ (G ) ≤ . (2)
2γ
62/93
72. Relation between ωγ (G ) and ω(G )
Proposition
The γ-clique number ωγ (G ) and the clique number ω(G ) of graph
G satisfy the following inequalities:
ω(G ) − 1 ωγ (G ) − 1 1 ω(G ) − 1
≤ ≤ . (3)
ω(G ) ωγ (G ) γ ω(G )
Corollary
1
If γ 1 − ω(G ) then
ω(G )γ
ωγ (G ) ≤ . (4)
1 − ω(G ) + ω(G )γ
63/93
73. Relation between omegaγ (G ) and ω(G )
Table: The value of upper bound (4) on γ-clique number with
γ = 0.95, 0.9, 0.85 for graphs with small clique number.
1
ω(G ) 1 − ω(G ) 0.95 0.9 0.85
3 0.667 3.35 3.86 4.64
4 0.75 4.75 6 8.5
5 0.8 6.33 9 17
6 0.83 8.14 13.5 51
7 0.86 10.23 21 –
8 0.88 12.67 36 –
9 0.89 15.55 81 –
10 0.9 19 – –
64/93
74. Outline
Introduction
Graph Theory Basics
Taxonomy of Clique Relaxations
Theory
Complexity Issues
Analytical bounds
Mathematical programming formulations
Algorithms
Node Deletion Problems with Heredity
Scale Reduction Approaches
Conclusion and References
65/93
75. Math programming formulations
Consider a simple undirected graph G = (V , E ) with n vertices
Let A = [aij ]n
i,j=1 be the adjacency matrix of G
Let x = (x1 , . . . , xn ) be a 0-1 vector with xi = 1 if node i
belongs to Gs , and xi = 0 otherwise.
66/93
76. Math programming formulations
GS is a γ-clique if it has at least γ|GS |(|GS | − 1)/2 edges
n
We have: |GS | = xi
i=1
This number of edges can be expressed in terms of vector x as:
n n n n
1
2γ xi xi − 1 = 1γ
2 xi xj − xi
i=1 i=1 i,j=1 i=1
n n n n
1 1
= 2γ xi xj + xi2 − xi = 2γ xi xj
i,j=1;i=j i=1 i=1 i,j=1;i=j
n
1
The number of edges in GS is 2 aij xi xj
i,j=1
67/93
77. Math programming formulations
We obtain the following 0-1 problem with one quadratic constraint:
n
max xi
i=1
subject to:
n n
aij xi xj ≥ γ xi xj
i,j=1 i,j=1;i=j
68/93
78. Math programming formulations
Define wij = xi xj . The constraint wij = xi xj is equivalent to
wij ≤ xi , wij ≤ xj , wij ≥ xi + xj − 1 .
n
Linearized formulation: max xi
i=1
n n
subjectto : aij wij ≥ γ wij
i,j=1 i,j=1;i=j
wij ≤ xi , wij ≤ xj , wij ≥ xi + xj − 1 .
wij , xi ∈ {0, 1}, ∀i j = 1, . . . , n
O(n2 ) 0-1 variables, O(n2 ) constraints
69/93
79. Outline
Introduction
Graph Theory Basics
Taxonomy of Clique Relaxations
Theory
Complexity Issues
Analytical bounds
Mathematical programming formulations
Algorithms
Node Deletion Problems with Heredity
Scale Reduction Approaches
Conclusion and References
70/93
80. Outline
Introduction
Graph Theory Basics
Taxonomy of Clique Relaxations
Theory
Complexity Issues
Analytical bounds
Mathematical programming formulations
Algorithms
Node Deletion Problems with Heredity
Scale Reduction Approaches
Conclusion and References
71/93
81. Clique relaxation and node deletion problems
Node deletion problems
Clique Node deletion
relaxation problems
problems with heredity
72/93
82. Node deletion problems
A graph property Π is said to be hereditary on induced
subgraphs, if G is a graph with property Π and the deletion of
any subset of vertices does not produce a graph violating Π
Π is said to be nontrivial if it is true for a single vertex graph
and is not satisfied by every graph
Π is said to be interesting if there are arbitrarily large graphs
satisfying Π
73/93
83. Yannakakis theorem
The maximum Π problem is to find the largest order induced
subgraph that does not violate property Π
Theorem (Yannakakis, 1978)
The maximum Π problem for nontrivial, interesting graph properties
that are hereditary on induced subgraphs is NP-hard.
74/93
84. Max weight node deletion problem
Given
a simple, undirected graph G = (V , E ),
positive weights w (v ) for each v ∈ V ,
and a nontrivial, interesting and hereditary (on induced
subgraphs) property Π.
Find
ν(G ) = max{w (P) : P ⊆ V , G [P] satisfies Π},
where w (P) = v ∈P w (v ) and G [P] is the subgraph induced by P.
75/93
85. Generalizing max clique algorithms
Some of the most practically effective algorithms for the maximum
clique problem rely on the fact that clique is hereditary on induced
subgraphs.
Carraghan and Pardalos (1990) - used as DIMACS benchmark
¨
Osterg˚rd (2002)
a
We use this observation to develop a generalized algorithm for the
minimum weight node deletion problem (GAMWNDP).
76/93
86. GAMWNDP
Order vertices V = {v1 , v2 , · · · , vn }, define Si = {vi , vi+1 , · · · , vn }
We compute the function c(i) that is the weight of the maximum
induced subgraph with property Π in G [Si ].
Obviously, c(n) = w (vn ) and c(1) = ν(G ).
For the unweighted case with w (vi ) = 1, i = 1, . . . , n we have
c(i + 1) + 1, if the solution must contain vi
c(i) =
c(i + 1), otherwise
For the weighted case, c(i) c(i + 1) implies that vi belongs to
every optimal solution, and c(i) ≤ c(i + 1) + w (vi ).
77/93
87. GAMWNDP
We compute the value of c(i) starting from c(n) and down to
c(1), and in each major iteration work with a current feasible
solution P, a candidate set C , and an incumbent solution S.
Pruning occurs when
w (C ) + w (P) w (S) or
c(i) + w (P) w (S), where i = min{j : vj ∈ C }.
Problem-specific features:
candidate set generation;
vertex ordering.
78/93
88. Outline
Introduction
Graph Theory Basics
Taxonomy of Clique Relaxations
Theory
Complexity Issues
Analytical bounds
Mathematical programming formulations
Algorithms
Node Deletion Problems with Heredity
Scale Reduction Approaches
Conclusion and References
79/93
89. Outline
Introduction
Graph Theory Basics
Taxonomy of Clique Relaxations
Theory
Complexity Issues
Analytical bounds
Mathematical programming formulations
Algorithms
Node Deletion Problems with Heredity
Scale Reduction Approaches
Conclusion and References
80/93
90. k-Core and k-community
k-Core
A sub-graph G of G such that each node has degree greater than or
equal to k.
k-Community
A subgraph G of G such that each (i, j) ∈ E if (i, j) ∈ E and i, j
have at least k common neighbors in G .
81/93
91. Example - k-Core k-Core example
(a) Original Graph (b) Max 3-core
Each node in the max 3-core is connected to at least 3 other
82/93
vertices.
92. Example - k-Community k-Community example
(c) Original Graph (d) Max 2-community
Any two vertices in the max 2-community are connected only if 83/93
they have at least 2 mutual neighbors.
93. Scale reduction approaches
Reduce problem size by ignoring certain parts of the graph
which we are certain would not be in the graph.
Most commonly used technique: Peeling
1. Remove all the vertices with degree ≤ k − 1.
2. If no vertices removed, stop. Else, redo step 1 using the
obtained graph.
Peeling is nothing but finding the k-core!
84/93
94. Scale reduction approaches
Property 1
A clique of size k is both a (k − 1)-core and (k − 2)-community.
Property 2
If the k-Core of G is empty, then ω(G ) k + 1.
Property 3
If the k-Community of G is empty, then ω(G ) k + 2.
85/93
95. k-Core Upper Bound
Find smallest k1 such that k1 -Core is empty. UB = k1 = 5.k-Core upper bound
(e) Original Graph (f) 4-core
(g) Original Graph (h) 5-core
Anurag Verma, Sergiy Butenko, Texas AM University 10/ 20
Find smallest k1 such that k1 -Core is empty. UB = k1 = 5.
86/93
96. k-Community Upper Bound
k-Community upper bound
Find smallest k2 such that k2 -Comm is empty.
(i) Original Graph (j) Intermediate (k) 2-comm
Find smallest k2 such that k2 -Comm 1 = 4.
3-comm is empty. Thus UB = k2 + is empty. UB = k2 + 1 = 4.
87/93
97. Finding the least k2
A simple binary search strategy:
Step 0 Set ku = n − 2, kl = 0, k2 = n − 2
Step 1 If k2 -Comm(G) is empty, ku = k2 . Else, kl = k2 .
Step 2 If ku − kl ≤ 1, set k2 = ku , STOP. Else, set
k2 = (ku + kl )/2, go to Step 1.
88/93
98. The call graph
Abello, Pardalos, and Resende, 1999:
20-day period: 290 million vertices and 4 billion edges.
one-day call graph: 53,767,087 vertices and over 170 millions of
edges.
3,667,448 connected components; 302,468 of them with more
than 3 vertices.
A giant connected component with 44,989,297 vertices.
89/93
101. Upper bounds on random graphs
Uniform random graphs with 1000 nodes.
Bollobas-Erd¨ s estimate: 2 log(n)/ log(1/p).
o
92/93
102. Outline
Introduction
Graph Theory Basics
Taxonomy of Clique Relaxations
Theory
Complexity Issues
Analytical bounds
Mathematical programming formulations
Algorithms
Node Deletion Problems with Heredity
Scale Reduction Approaches
Conclusion and References
93/93
103. Conclusion
We propose a taxonomy of clique relaxation models
opens new avenues for systematic study of clique relaxations
covers known clique relaxation models
unveils new models of potential interest
An effective algorithm is provided for node deletion problems
that are nontrivial, interesting, and hereditary on induced
subgraphs
state-of-the-art exact approach to max k-plex
can be used to solve many other important problems
A scale-reduction approach based on the concept of
k-community is proposed
outperforms the peeling approach based on k-core
94/93
104. References
B. Balasundaram, S. Butenko, I. V. Hicks.
Clique relaxations in social network analysis: The maximum k-plex
problem.
Operations Research, 59: 133–142, 2011.
S. Trukhanov, B. Balasundaram, and S. Butenko.
Exact algorithms for hard node deletion problems in networks.
Submitted.
S. Butenko, J. Pattillo, and N. Youssef.
A taxonomy of clique relaxation models in networks.
In preparation.
A. Verma and S. Butenko.
A new scale-reduction technique for the maximum clique and related
problems.
In preparation.
95/93
105. References
J. Pattillo, A. Veremyev, S. Butenko, and V. Boginski.
On the maximum quasi-clique problem.
Submitted.
S. Butenko, S. Kahruman, and O. Prokopyev.
On “provably best” heuristics for hard optimization problems.
In preparation.
96/93