A Game Theoretic Framework for Heterogeneous Information Network Clustering
1. Introduction Preliminaries The Bi-clustering Game Framework Reward Functions Experimental Results Conclusion
Faris Alqadah
Johns Hopkins University
Outline
1 Introduction
Motivation
2 Preliminaries
HINs and FCA
Game Theory
3 The Bi-clustering Game
Party-Planners
4 Framework
GHIN
5 Reward Functions
Expected Satisfaction
6 Experimental Results
Real world HINs
7 Conclusion
Motivation
Heterogeneous Information Networks (HINs) are pervasive
in applications ranging from bioinformatics to e-commerce.
Generalization of bi-clustering to pairwise relations as
opposed to tensor spaces.
No unified definition of a HIN-cluster or algorithmic
framework to mine them.
Address shortcomings of ‘pattern’-based approaches.
HINs
Objects derived from distinct domains.
Topology of the network determined by pairwise-binary
relations amongst domains.
Graph representation of a HIN is a multi-partite graph.
Clicking patterns, social networks, gene networks from
different experiments.
Related Work
Three major categories of work
Multi-way clustering [5, 4, 1, 2]: Directly extend
bi-clustering or co-clustering. Mostly hard-clusters.
Information-network [10, 11]: Combine ranking and
clustering using probabilistic generative models, limited by
network topology, hard clustering.
Pattern-based [3, 12, 7]: Formal Concept Analysis,
overlapping clustering, too many clusters, parameter
settings.
Key Idea
For single-edge HIN,
trade-off between number
of nodes in bipartite sets.
Multiple-edge HIN,
competing
cluster-influences.
An ‘ideal’ HIN-cluster
should be an equilibrium
point among all competing
clustering influences.
Nash equilibrium: No one
can do any better
assuming everyone else
retains the same strategy.
Notation
Context $K_{ij} = (G_i, G_j, I_{ij})$: two sets and a relation.
A HIN $G^n = (V, E)$, where $V$ is a set of domains
$\{G_1, \ldots, G_n\}$ and $(G_i, G_j) \in E$ iff $\exists K_{ij}$.
Concepts (maximal bicliques)
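Common neighbors:
$$\psi^j(A_i) = \begin{cases} \{g_j \in G_j \mid g_j \, I_{ij} \, g_i \ \ \forall g_i \in A_i\} & \text{if } (G_i, G_j) \in E, \\ \emptyset & \text{otherwise.} \end{cases}$$
Concept or maximal bi-clique: $(A_i, A_j)$ such that
$\psi^j(A_i) = A_j$ and $\psi^i(A_j) = A_i$.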
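The common-neighbor operator and the concept test can be sketched in a few lines of Python. This is a minimal illustration, not the author's implementation: the toy context `I12` and the helper names `invert`, `psi`, and `is_concept` are assumptions introduced here, and the relation is stored as an adjacency dictionary from objects of G1 to their neighbors in G2.

```python
# Hypothetical toy context K_12: each object of G1 maps to its neighbors in G2.
I12 = {
    "a": {"x", "y"},
    "b": {"x", "y", "z"},
    "c": {"z"},
}


def invert(relation):
    """Build the reverse adjacency (G2 -> G1) from a G1 -> G2 adjacency."""
    inverse = {}
    for g1, neighbors in relation.items():
        for g2 in neighbors:
            inverse.setdefault(g2, set()).add(g1)
    return inverse


def psi(objects, relation, codomain):
    """Common neighbors: members of `codomain` related to every object given."""
    common = set(codomain)
    for g in objects:
        common &= relation.get(g, set())
    return common


def is_concept(a1, a2, relation):
    """True if (a1, a2) is a maximal biclique: psi(a1) == a2 and psi(a2) == a1."""
    g1 = set(relation)
    g2 = set().union(*relation.values())
    inverse = invert(relation)
    return psi(a1, relation, g2) == set(a2) and psi(a2, inverse, g1) == set(a1)


closed_pair = is_concept({"a", "b"}, {"x", "y"}, I12)
not_maximal = is_concept({"a"}, {"x", "y"}, I12)
```

In this toy context `({"a", "b"}, {"x", "y"})` passes the test, while `({"a"}, {"x", "y"})` is a biclique that is not maximal, because `b` is also adjacent to both `x` and `y`.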
FCA-based approaches
Generalize the notion of a concept (several definitions),
and enumerate all such concepts.
Parameter settings not always intuitive.
Substantially different algorithm design for simple change
in definition.
For a suitably defined game, Nash equilibrium points capture
maximal bi-cliques.
Normal form game
A finite, n-player, normal form game, $G$, is a triple $\langle N, (M_i), (r_i) \rangle$
where
$N = \{1, \ldots, n\}$ is the set of players.
$M_i = \{m_{i1}, \ldots, m_{il_i}\}$ is the set of moves available to player $i$,
and $l_i$ is the number of available moves for that player.
$r_i : M_1 \times \cdots \times M_n \to \mathbb{R}$ is the reward function for each
player $i$. It maps a profile of moves to a value.
Each player $i$ selects a strategy from the set of all available
strategies, $P_i = \{p_i : M_i \to [0, 1]\}$.
Nash equilibrium and example
Nash equilibrium: A strategy profile in which no player has an
incentive to unilaterally deviate [8, 6].
$$\forall i \in N,\ p_i \in P_i:\quad r_i(p_1^*, \ldots, p_{i-1}^*, p_i, \ldots, p_n^*) \le r_i(p_1^*, \ldots, p_n^*)$$

                    Player 2 chooses 0  Player 2 chooses 1  Player 2 chooses 2
Player 1 chooses 0  (0,0)               (1,0)               (2,-2)
Player 1 chooses 1  (0,1)               (1,1)               (3,-2)
Player 1 chooses 2  (-2,2)              (-2,3)              (2,2)
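The inequality can be checked cell by cell on the example table when attention is restricted to pure strategies. The sketch below assumes each cell reads (reward to Player 1, reward to Player 2) with rows indexed by Player 1's move; the names `PAYOFFS` and `is_pure_nash` are illustrative, not taken from the slides.

```python
from itertools import product

# Payoff table from the example. Assumption: each cell is
# (reward to Player 1, reward to Player 2); rows are Player 1's moves.
PAYOFFS = [
    [(0, 0), (1, 0), (2, -2)],
    [(0, 1), (1, 1), (3, -2)],
    [(-2, 2), (-2, 3), (2, 2)],
]


def is_pure_nash(payoffs, row, col):
    """True if neither player gains by unilaterally changing its move."""
    r1, r2 = payoffs[row][col]
    # Player 1 deviates by changing the row while the column stays fixed.
    p1_cannot_improve = all(payoffs[r][col][0] <= r1 for r in range(len(payoffs)))
    # Player 2 deviates by changing the column while the row stays fixed.
    p2_cannot_improve = all(payoffs[row][c][1] <= r2 for c in range(len(payoffs[row])))
    return p1_cannot_improve and p2_cannot_improve


pure_equilibria = [
    (row, col)
    for row, col in product(range(3), range(3))
    if is_pure_nash(PAYOFFS, row, col)
]
```

Under that reading of the table, the profile where both players choose 2 is not an equilibrium even though it pays (2,2): Player 1 can move to row 1 and collect 3.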
Party planner game
Two party planners P1 and P2 plan a party by inviting
guests from disjoint sets of clients G1 and G2 .
Party planners receive compensation based on overall
satisfaction of clients.
Client satisfaction is a function of positive and negative
interactions at the party.
P1 and P2 do not cooperate, but are privy to each other's
guest list at any point. Both wish to maximize
compensation.
Satisfaction Reward Function
Let $(A_1, A_2)$ be a party. Define the satisfaction of $g_1 \in A_1$ attending
party $(A_1, A_2)$ as
$$sat_1(g_1, A_2) = \frac{|\psi^2(g_1) \cap A_2| - w \cdot |A_2 \setminus \psi^2(g_1)|}{|A_2|} \tag{1}$$
Overall reward to party planner $i$:
$$r_i^{sat}(A_i, A_j) = \sum_{g_i \in A_i} sat_i(g_i, A_j) \tag{2}$$
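Equations (1) and (2) translate directly into code. The following is a small sketch under stated assumptions: the negative-interaction term is read as the guests in the other list that the object is not related to, an empty guest list is given zero satisfaction (the slides leave that case open), and `I12`, `satisfaction`, and `reward` are hypothetical names.

```python
# Hypothetical toy relation: each G1 object maps to the G2 objects it is related to.
I12 = {
    "a": {"x", "y"},
    "b": {"x", "y", "z"},
    "c": {"z"},
}


def satisfaction(g, other_side, neighbors, w):
    """Equation (1): (positive - w * negative interactions) / |other_side|."""
    if not other_side:
        return 0.0  # assumption: an empty guest list yields zero satisfaction
    liked = neighbors.get(g, set()) & other_side
    disliked = other_side - neighbors.get(g, set())
    return (len(liked) - w * len(disliked)) / len(other_side)


def reward(own_side, other_side, neighbors, w):
    """Equation (2): sum of the satisfaction of the planner's own guests."""
    return sum(satisfaction(g, other_side, neighbors, w) for g in own_side)


tight_party = reward({"a", "b"}, {"x", "y"}, I12, w=1.0)
loose_party = reward({"a", "b"}, {"x", "y", "z"}, I12, w=2.0)
```

With `w = 2.0`, inviting `z` drops the planner's reward from 2.0 to 1.0 because `a` has no interaction with `z`, which illustrates how a larger `w` pushes parties toward exact bicliques.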
Concepts as Nash equilibrium points
Theorem
For any instance of the bi-clustering game $G_{bicluster}$ in which $r_i^{sat}$
is the selected reward function, there exists $w^*$ such that
$\forall w \ge w^*$, if $(A_1^*, A_2^*)$ is a concept of $K = (G_1, G_2, I_{12})$ then
$(A_1^*, A_2^*)$ is a Nash equilibrium point of $G_{bicluster}$.
HIN-clustering game
Extend the bi-clustering game to n party planners and n sets of guests.
Guest interactions are determined by network topology.
Mining HIN-clusters is equivalent to finding
Nash equilibrium points of the HIN-clustering game.
Finding Nash equilibria is non-trivial [9].
Adapt a simple strategy and a key heuristic to enumerate the
Nash equilibrium points.
Strategy and heuristics
          M1         M1,M2      M1,M2,M3   M1,M3      M2         M2,M3      M3
G1        (1,1)      (1,2)      (1**,3**)  (1**,2)    (1,1)      (1**,2)    (1**,1)
G1,G2     (2,1**)    (-1,-1)    (-2,-3)    (-1,-1)    (-4,-2)    (-4,-4)    (-4,-2)
G1,G2,G3  (3**,1**)  (0,0)      (-3,-3)    (-3,-2)    (-3,-1)    (-6,-4)    (-9,-3)
G1,G3     (2,1)      (2**,2**)  (0,0)      (-1,-1)    (2**,1)    (-1,-1)    (-4,-2)
G2        (1,1**)    (-2,-4)    (-3,-9)    (-2,-4)    (-5,-5)    (-5,-10)   (-5,-5)
G2,G3     (2,1**)    (-1,-1)    (-4,-6)    (-4,-4)    (-4,-2)    (-7,-7)    (-10,-5)
G3        (1,1)      (1,2**)    (-1,-3)    (-2,-4)    (1,1)      (-2,-4)    (-5,-5)
1. Mark all second components that are maximal in each row.
2. Mark all first components that are maximal in each column.
3. Any cell that has both components marked is a Nash
equilibrium.
Heuristic: Every Nash equilibrium point is a superset of an
n-concept.
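The three marking steps can be replayed mechanically on the table. The sketch below is illustrative only: `ROWS`, `COLS`, `TABLE`, and `mark_and_intersect` are names chosen here, and the payoffs are copied from the slide with the `**` marks stripped.

```python
ROWS = ["G1", "G1,G2", "G1,G2,G3", "G1,G3", "G2", "G2,G3", "G3"]
COLS = ["M1", "M1,M2", "M1,M2,M3", "M1,M3", "M2", "M2,M3", "M3"]
TABLE = [
    [(1, 1), (1, 2), (1, 3), (1, 2), (1, 1), (1, 2), (1, 1)],
    [(2, 1), (-1, -1), (-2, -3), (-1, -1), (-4, -2), (-4, -4), (-4, -2)],
    [(3, 1), (0, 0), (-3, -3), (-3, -2), (-3, -1), (-6, -4), (-9, -3)],
    [(2, 1), (2, 2), (0, 0), (-1, -1), (2, 1), (-1, -1), (-4, -2)],
    [(1, 1), (-2, -4), (-3, -9), (-2, -4), (-5, -5), (-5, -10), (-5, -5)],
    [(2, 1), (-1, -1), (-4, -6), (-4, -4), (-4, -2), (-7, -7), (-10, -5)],
    [(1, 1), (1, 2), (-1, -3), (-2, -4), (1, 1), (-2, -4), (-5, -5)],
]


def mark_and_intersect(table):
    """Return the (row, col) cells whose two components are both marked."""
    n_rows, n_cols = len(table), len(table[0])
    # Step 1: in each row, mark the cells whose second component is maximal.
    second_marked = set()
    for i in range(n_rows):
        best = max(cell[1] for cell in table[i])
        second_marked |= {(i, j) for j in range(n_cols) if table[i][j][1] == best}
    # Step 2: in each column, mark the cells whose first component is maximal.
    first_marked = set()
    for j in range(n_cols):
        best = max(table[i][j][0] for i in range(n_rows))
        first_marked |= {(i, j) for i in range(n_rows) if table[i][j][0] == best}
    # Step 3: a cell marked in both passes is a pure Nash equilibrium.
    return sorted(first_marked & second_marked)


equilibria = [(ROWS[i], COLS[j]) for i, j in mark_and_intersect(TABLE)]
```

The result agrees with the doubly marked cells in the table: (G1; M1,M2,M3), (G1,G2,G3; M1), and (G1,G3; M1,M2). The table has one row and one column per non-empty subset, so this exhaustive marking grows exponentially with the number of objects.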
GHIN framework
Even with the heuristic, exponential run time is still possible.
Sacrifice completeness, but guarantee correctness.
Attempt to form a Nash equilibrium point with each object
in the HIN.
1. For each object $g_i$ in the seed set, attempt to form a
maximally large n-partite clique in the HIN.
2. Add objects from all domains to the clique while the reward
increases.
3. Remove objects not in the original clique from all domains
while the reward increases.
4. If steps 2 and 3 produce no change, a Nash equilibrium has been
found; otherwise repeat steps 2 and 3.
5. Update the seed set by removing all objects in the cluster.
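A much-simplified, two-domain reading of these steps is sketched below. It is not the GHIN implementation: it assumes the satisfaction reward, takes the initial clique to be the maximal biclique containing the seed, treats a fixed point of the add/remove moves as the stopping condition, and adds a `max_rounds` bound so that the loop always terminates. All function names and the toy data are hypothetical. Because each planner's reward is a sum over its own guests, adding a guest raises that reward exactly when the guest's satisfaction is positive.

```python
def psi(objects, nbrs, codomain):
    """Common neighbors: members of codomain related to every given object."""
    common = set(codomain)
    for g in objects:
        common &= nbrs.get(g, set())
    return common


def sat(g, other, nbrs, w):
    """Satisfaction of guest g with the other planner's guest list."""
    if not other:
        return 0.0
    liked = len(nbrs.get(g, set()) & other)
    return (liked - w * (len(other) - liked)) / len(other)


def grow(own, other, domain, nbrs, w):
    """Step 2: add guests whose satisfaction is positive (the reward rises)."""
    added = [g for g in sorted(domain - own) if sat(g, other, nbrs, w) > 0]
    own.update(added)
    return bool(added)


def shrink(own, other, core, nbrs, w):
    """Step 3: drop guests outside the original clique that lower the reward."""
    dropped = [g for g in sorted(own - core) if sat(g, other, nbrs, w) < 0]
    own.difference_update(dropped)
    return bool(dropped)


def ghin_two_domains(n12, n21, w=1.0, max_rounds=100):
    g1, g2 = set(n12), set(n21)
    seeds = sorted(g1)
    clusters = []
    while seeds:
        seed = seeds[0]
        # Step 1 (assumption): start from the maximal biclique around the seed.
        a2 = psi({seed}, n12, g2)
        a1 = psi(a2, n21, g1) if a2 else {seed}
        core1, core2 = set(a1), set(a2)
        for _ in range(max_rounds):  # bound added so the sketch terminates
            changed = grow(a1, a2, g1, n12, w)
            changed |= grow(a2, a1, g2, n21, w)
            changed |= shrink(a1, a2, core1, n12, w)
            changed |= shrink(a2, a1, core2, n21, w)
            if not changed:  # Step 4: no add or remove move helps any more
                break
        clusters.append((a1, a2))
        # Step 5: objects already clustered are no longer used as seeds.
        seeds = [s for s in seeds if s not in a1]
    return clusters


# Hypothetical toy HIN with two domains; N21 is the reverse adjacency of N12.
N12 = {"a": {"x", "y", "z"}, "b": {"x", "y", "z"}, "c": {"x", "y"}, "d": {"u"}}
N21 = {"x": {"a", "b", "c"}, "y": {"a", "b", "c"}, "z": {"a", "b"}, "u": {"d"}}
clusters = ghin_two_domains(N12, N21, w=0.5)
```

On the toy data with `w = 0.5`, the seed `a` starts from the biclique `({a, b}, {x, y, z})` and then absorbs `c`, which misses only one of the three interactions. This shows how a small `w` tolerates near-cliques.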
Shortcomings of satisfaction reward function
The satisfaction reward function is simple, intuitive, and efficient.
If the matrices in a HIN have significantly different density levels,
then bias occurs.
Use expected satisfaction instead.
Expected satisfaction
Assume all objects are independent.
For a given party $(A_1, \ldots, A_n)$, the expected number of
interactions is the number of successes in $|A_j|$ draws from a finite
population of $|G_j|$ objects.
The number of successes is a hypergeometrically
distributed random variable.
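For reference, the hypergeometric mean and probability mass function are easy to compute. The sketch below assumes that "successes" in the population means the number of objects in $G_j$ that the guest is related to (its degree); the slides do not spell this out, nor how the expectation enters the reward, so the function names and that reading are assumptions.

```python
from math import comb


def hypergeom_pmf(k, population, successes, draws):
    """P[X = k] when `draws` objects are drawn without replacement."""
    lowest = max(0, draws - (population - successes))
    highest = min(draws, successes)
    if k < lowest or k > highest:
        return 0.0
    return (
        comb(successes, k)
        * comb(population - successes, draws - k)
        / comb(population, draws)
    )


def expected_interactions(population, successes, draws):
    """Mean of the hypergeometric distribution: draws * successes / population."""
    return draws * successes / population


# Hypothetical numbers: |G_j| = 10, the guest is related to 4 of them, |A_j| = 5.
mean = expected_interactions(population=10, successes=4, draws=5)
mean_from_pmf = sum(k * hypergeom_pmf(k, 10, 4, 5) for k in range(6))
```

With these numbers a guest would be expected to have 2 interactions at a party of 5 purely by chance, which gives a density-aware baseline against which the observed interactions can be compared.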
Tiring party goers
Incorporate a ‘tiring’ factor to avoid too much overlap. Let $c(g_i)$
denote the number of clusters $g_i$ has appeared in up to the
current time-step, then let
$$t = f(c(g_i))$$
where
$$f : \mathbb{N} \to (0, 1]$$
and $f$ is anti-monotonic. For example:
$$f(x) = \frac{1}{x^2}$$
$$f(x) = \frac{1}{e^x}$$
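The two example tiring functions are shown below as a small sketch. The function names are illustrative, and the inverse-square variant assumes the count starts at 1, since it is undefined at 0. How the factor `t` is combined with the reward is not stated here, so the code stops at computing it.

```python
import math


def tiring_inverse_square(count):
    """f(x) = 1 / x^2; assumes the count starts at 1."""
    if count < 1:
        raise ValueError("inverse-square tiring factor needs count >= 1")
    return 1.0 / count**2


def tiring_exponential(count):
    """f(x) = 1 / e^x; defined for count >= 0."""
    return math.exp(-count)


inverse_square_factors = [tiring_inverse_square(c) for c in range(1, 5)]
exponential_factors = [tiring_exponential(c) for c in range(0, 4)]
```

Both sequences decrease as an object appears in more clusters and stay within (0, 1], as the definition requires.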
HINs and evaluation
HIN name    Description                                              Num domains  Num classes  Total num objects
MER         Newsgroup, Middle East politics and Religion             3            2            24,783
REC         Newsgroup, recreation                                    3            2            26,225
SCI         Newsgroup, science                                       3            5            37,413
PC          Newsgroup, pc and software                               3            5            35,186
PCR         Newsgroup, politics and Christianity                     3            2            24,485
FOUR_AREAS  DBLP subset of database, data mining, AI, and IR papers  4            4            70,517
Extrinsic evaluation, $B^3$ recall and precision:
$$Prec(g, g') = \frac{\min(|C(g) \cap C(g')|,\ |L(g) \cap L(g')|)}{|C(g) \cap C(g')|}$$
$$Rcl(g, g') = \frac{\min(|C(g) \cap C(g')|,\ |L(g) \cap L(g')|)}{|L(g) \cap L(g')|}$$
$$B^3\,Prec = Avg_g\big[Avg_{g' :\, C(g) \cap C(g') \neq \emptyset}[Prec(g, g')]\big]$$
$$B^3\,Rcl = Avg_g\big[Avg_{g' :\, L(g) \cap L(g') \neq \emptyset}[Rcl(g, g')]\big]$$
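The $B^3$ measures can be computed directly from two mappings. The sketch below assumes $C(g)$ is the set of clusters containing $g$ and $L(g)$ the set of ground-truth classes of $g$, lets $g'$ range over all objects including $g$ itself, and skips objects with no qualifying partner; the function name and the toy data are hypothetical.

```python
def bcubed(clusters_of, labels_of):
    """B-cubed precision and recall for possibly overlapping clusterings."""
    objects = sorted(clusters_of)
    precision_terms, recall_terms = [], []
    for g in objects:
        prec, rcl = [], []
        for h in objects:  # h plays the role of g' and may equal g
            shared_clusters = len(clusters_of[g] & clusters_of[h])
            shared_labels = len(labels_of[g] & labels_of[h])
            overlap = min(shared_clusters, shared_labels)
            if shared_clusters:
                prec.append(overlap / shared_clusters)
            if shared_labels:
                rcl.append(overlap / shared_labels)
        if prec:
            precision_terms.append(sum(prec) / len(prec))
        if rcl:
            recall_terms.append(sum(rcl) / len(rcl))
    precision = sum(precision_terms) / len(precision_terms) if precision_terms else 0.0
    recall = sum(recall_terms) / len(recall_terms) if recall_terms else 0.0
    return precision, recall


# Hypothetical toy data: two clusters, with object "c" placed in the wrong one.
clusters_of = {"a": {1}, "b": {1}, "c": {2}, "d": {2}}
labels_of = {"a": {"X"}, "b": {"X"}, "c": {"X"}, "d": {"Y"}}
precision, recall = bcubed(clusters_of, labels_of)
```

On the toy data precision is 0.75 and recall is 2/3: cluster 2 mixes two classes, and class X is split across both clusters.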
Class distributions in clusters
Algorithm    Class  C1         C2          C3         C4
GHIN expsat  DB     0.0601266  0.93633     0.0133188  0.0512748
             DM     0.028481   0.0363608   0.0106007  0.850142
             IR     0.882911   0.0204432   0.133188   0.0339943
             AI     0.028481   0.00686642  0.842892   0.0645892
NetClus      DB     0.0553833  0.450802    0.500074   0.0955971
             DM     0.163934   0.15815     0.128535   0.304584
             IR     0.179553   0.0512035   0.242707   0.112786
             AI     0.60113    0.339844    0.128684   0.487033
MDC          DB     0.186681   0.232455    0.803727   0.000000
             DM     0.261844   0.000000    0.128592   0.161790
             IR     0.003183   0.278748    0.000000   0.75888
             AI     0.548292   0.488797    0.067680   0.079323
Sample Clusters
Terms        Authors                  Conferences
data         Surajit Chaudhuri        VLDB
database     Divesh Srivastava        SIGMOD
queries      H. V. Jagadish           ICDE
databases    Jeffrey F. Naughton      PODS
querys       Michael J. Carey         EDBT
xml          Raghu Ramakrishnan

mining       Jiawei Han               KDD
learning     Christos Faloutsos       PAKDD
data         Wei Wang                 ICDM
frequent     Heikki Mannila           SDM
association  Srinivasan Parthasarathy PKDD
patterns     Ke Wang                  ICML
Applying GHIN to EMAP data
E-MAP (epistatic miniarray profiles) query and target genes.
A genetic interaction score indicates whether a strain is
healthier or sicker than expected (positive or negative).
Negative network derived by using scores ≤ −2.5.
Find Nash points, and use functional enrichment: do we
find small functional classes?
[Figure: two panels, "Functional enrichment by large classes (31−500)" and "Functional enrichment by small classes", each plotting the fraction of patterns enriched against the P-value threshold for "Exp sat tiring" and "Sat".]
Parameter study
Effect of w on extrinsic clustering quality.
[Figure: three panels plotting F0.5 score, F1 score, and F2 score against w for the mer, rec, pcr, pc, sci, and four HINs.]
Effect of w on algorithm operation.
[Figure: three panels plotting the average number of iterations to find Nash, the total number of iterations, and the number of clusters against w for the mer, rec, pcr, pc, sci, and four HINs.]
Conclusion
Novel framework for defining and enumerating
HIN-clusters.
First (as far as I know) connection between information
network clustering and game theory.
Initial experimental results show promise.
Ongoing and future work
Development of reward functions (information-theoretic,
spectral?).
Clustering in biological data: do we find smaller functional
classes compared to other bi-clustering methods?
Extension of framework to weighted HINs.
More algorithmic development.
Compare algorithms with an actual Nash solver.
References
[1] A. Banerjee, S. Basu, and S. Merugu.
Multi-way clustering on relation graphs.
In Proceedings of the SIAM International Conference on
Data Mining, 2007.
[2] R. Bekkerman, R. El-Yaniv, and A. McCallum.
Multi-way distributional clustering via pairwise interactions.
In ICML ’05: Proceedings of the 22nd International
Conference on Machine Learning, pages 41–48, New York,
NY, USA, 2005. ACM.
[3] J. Li, G. Liu, H. Li, and L. Wong.
Maximal biclique subgraphs and closed pattern pairs of the
adjacency matrix: A one-to-one correspondence and
mining algorithms.
IEEE Trans. Knowl. Data Eng., 19(12):1625–1637, 2007.
[4] B. Long, X. Wu, Z. M. Zhang, and P. S. Yu.
Unsupervised learning on k-partite graphs.
In KDD ’06: Proceedings of the 12th ACM SIGKDD
International Conference on Knowledge Discovery and Data
Mining, pages 317–326, New York, NY, USA, 2006. ACM.
[5] B. Long, Z. M. Zhang, X. Wu, and P. S. Yu.
Spectral clustering for multi-type relational data.
In ICML ’06: Proceedings of the 23rd International
Conference on Machine Learning, pages 585–592, New
York, NY, USA, 2006. ACM.
[6] E. Mendelson.
Introducing Game Theory and Its Applications.
Chapman & Hall / CRC, 2004.
[7] M. J. Zaki, M. Peters, I. Assent, and T. Seidl.
Clicks: An effective algorithm for mining subspace clusters
in categorical datasets.
Data and Knowledge Engineering, special issue on
Intelligent Data Mining, 60(2):51–70, 2007.
[8] G. Owen.
Game Theory.
Academic Press, 1995.
[9] R. Porter, E. Nudelman, and Y. Shoham.
Simple search methods for finding a Nash equilibrium.
In Games and Economic Behavior, pages 664–669, 2004.
[10] Y. Sun, J. Han, P. Zhao, Z. Yin, H. Cheng, and T. Wu.
RankClus: Integrating clustering with ranking for
heterogeneous information network analysis.
In Proc. 2009 Int. Conf. on Extending Data Base
Technology (EDBT’09), 2009.
[11] Y. Sun, Y. Yu, and J. Han.
Ranking-based clustering of heterogeneous information
networks with star network schema.
In Proc. 2009 ACM SIGKDD Int. Conf. on Knowledge
Discovery and Data Mining (KDD’09), 2009.
[12] A. Tanay, R. Sharan, and R. Shamir.
Discovering statistically significant biclusters in gene
expression data.
In Proceedings of ISMB 2002, 2002.