Clustering is a method of finding similar objects
that together form a group, i.e., the members of a
particular group have properties most similar to
each other.
In graph-theory terms, clustering identifies sets of
nodes that tend to group together because they are
densely connected to each other, ideally forming
something close to a clique.
When we travel from one node to another, we are
more likely to stay within one cluster than to move
to another.
A random walk over a graph can be taken in two ways:
on an unweighted graph, or on a weighted (possibly
directed) graph.
On an unweighted graph: Start at a vertex,
choose an outgoing edge uniformly at
random, walk along that edge, and repeat.
On a weighted graph: Start at a vertex u,
choose an incident edge e with weight w_e with
probability
w_e / Σ_d w_d
where d ranges over the edges incident to u,
walk along that edge, and repeat.
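The walk described above can be sketched in a few lines of Python. This is only an illustration: the dictionary layout `graph[u][v] = weight`, the function name `random_walk`, and the sample graph are assumptions, not part of the original slides. Setting every weight to 1 gives the unweighted case.

```python
import random

def random_walk(graph, start, steps, rng=random):
    """Walk `steps` edges from `start`; each edge e is chosen with probability w_e / sum_d w_d."""
    path = [start]
    node = start
    for _ in range(steps):
        neighbors = list(graph[node])                  # vertices adjacent to `node`
        weights = [graph[node][v] for v in neighbors]  # weights of the incident edges
        node = rng.choices(neighbors, weights=weights, k=1)[0]
        path.append(node)
    return path

# Hypothetical weighted graph: graph[u][v] is the weight of edge u-v.
graph = {
    "a": {"b": 3.0, "c": 1.0},
    "b": {"a": 3.0},
    "c": {"a": 1.0},
}
print(random_walk(graph, "a", 5, random.Random(0)))
```

From "a" the walker moves to "b" with probability 3/4 and to "c" with probability 1/4, because `random.choices` normalizes the weights internally.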
MCL Algorithm
It was introduced by Stijn Marinus van Dongen in
2000.
The Markov clustering algorithm is based on
random walks, which are modeled as Markov chains
and computed using a transition probability
matrix.
The basic idea is that a random walk that enters a
cluster is unlikely to leave it until many of its
vertices have been visited.
In short, the basic idea here is flow simulation.
(Figure: a graph with two marked nodes, u and w.)
Suppose you start at u.
What is the probability you are at w after 3 steps?
Let v_u be the vector that is 0 everywhere except
at index u, where it is 1.
At step 0, v_u[w] gives the probability you are at
node w.
After 1 step, (T_G v_u)[w] gives the
probability that you are at w, where T_G is the
transition probability matrix of the graph G.
After k steps, the probability that
you are at w is:
(T_G^k v_u)[w]
In other words, T_G^k v_u is a vector giving our
probability of being at any node after taking k steps
starting from u.
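As a rough illustration of T_G^k v_u, the sketch below builds a column-stochastic transition matrix from an adjacency matrix and applies it k times to the indicator vector v_u. The helper names and the three-node path graph are assumptions made for the example; exact fractions are used so the probabilities can be read off directly, and the graph is assumed to have no isolated nodes.

```python
from fractions import Fraction

def transition_matrix(adj):
    """Column-stochastic T: T[w][u] is the probability of stepping from u to w."""
    n = len(adj)
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    return [[Fraction(adj[i][j], col_sums[j]) for j in range(n)] for i in range(n)]

def step_distribution(T, u, k):
    """Return T^k v_u, the distribution over nodes after k steps starting from u."""
    n = len(T)
    v = [Fraction(int(i == u)) for i in range(n)]   # indicator vector v_u
    for _ in range(k):
        v = [sum(T[i][j] * v[j] for j in range(n)) for i in range(n)]
    return v

# Hypothetical path graph 0 - 1 - 2, given as an adjacency matrix.
adj = [[0, 1, 0],
       [1, 0, 1],
       [0, 1, 0]]
T = transition_matrix(adj)
dist = step_distribution(T, u=0, k=3)
print([str(p) for p in dist])
```

On this path graph the walker alternates between the middle node and the two ends, so after 3 steps from node 0 it is at node 1 with probability 1.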
The MCL algorithm can be viewed in two ways:
K-path clustering.
Random walks.
Here, we discuss the simulation of random
walks on graphs to find the clusters in them.
According to van Dongen:
The number of u-v paths of length k is larger if u, v
are in the same dense cluster, and smaller if they
belong to different clusters.
A random walk on the graph won’t leave a dense
cluster until many of its vertices have been visited.
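The path-counting observation can be illustrated with powers of the adjacency matrix: entry (u, v) of A^k counts the length-k routes from u to v. The sketch assumes that "paths" here may revisit vertices (i.e., walks); the graph of two triangles joined by a single edge is a made-up example.

```python
def matmul(A, B):
    """Plain matrix product of two square matrices given as lists of lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def count_walks(adj, k):
    """Return A^k; entry [u][v] is the number of length-k walks from u to v."""
    n = len(adj)
    result = [[int(i == j) for j in range(n)] for i in range(n)]  # identity matrix
    for _ in range(k):
        result = matmul(result, adj)
    return result

# Hypothetical graph: triangles {0, 1, 2} and {3, 4, 5} joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
n = 6
adj = [[0] * n for _ in range(n)]
for u, v in edges:
    adj[u][v] = adj[v][u] = 1

walks = count_walks(adj, 3)
print("same cluster, 0 -> 1:", walks[0][1])        # 3
print("different clusters, 0 -> 4:", walks[0][4])  # 1
```

Nodes 0 and 1 share a triangle and are joined by three walks of length 3, while nodes 0 and 4 sit in different triangles and are joined by only one.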
Random walks therefore help to find where the
flow tends to gather, and hence where the clusters
lie.
MCL works with a Markov process: the probability
of the next state depends only on the current
state and not on the past. The process may change
state or remain in the same state, depending on the
probability distribution.
The number of higher-length paths in G is large
for pairs of vertices lying in the same dense cluster
and small for pairs of vertices belonging to different
clusters.
The two basic operations in MCL are:
Expansion
Inflation
Expansion operator: the expansion operator spreads
flow along longer paths; it is responsible for
allowing flow to connect different regions of the
graph.
• Expansion is done by taking the normal matrix
product of a stochastic matrix with itself.
Inflation operator: the inflation operator is
responsible for both strengthening and weakening
of current: it strengthens strong flows and
eliminates weak ones.
• Inflation is done by taking the Hadamard
(entrywise) power of the matrix and then rescaling
it, such that the resulting matrix is stochastic again.
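A minimal sketch of the two operators, assuming column-stochastic matrices stored as lists of lists; the function names, the default powers, and the 2 x 2 sample matrix are illustrative choices rather than anything fixed by the slides.

```python
def expand(M, e=2):
    """Expansion: the ordinary matrix power M^e (e = 2 squares the matrix)."""
    n = len(M)
    result = M
    for _ in range(e - 1):
        result = [[sum(result[i][k] * M[k][j] for k in range(n)) for j in range(n)]
                  for i in range(n)]
    return result

def inflate(M, r=2):
    """Inflation: raise each entry to the power r, then rescale every column to sum to 1."""
    n = len(M)
    powered = [[M[i][j] ** r for j in range(n)] for i in range(n)]
    col_sums = [sum(powered[i][j] for i in range(n)) for j in range(n)]
    return [[powered[i][j] / col_sums[j] for j in range(n)] for i in range(n)]

# Hypothetical column-stochastic matrix (each column sums to 1).
M = [[0.75, 0.5],
     [0.25, 0.5]]
print(expand(M))
print(inflate(M))
```

In the first column, inflation with r = 2 turns (0.75, 0.25) into (0.9, 0.1): the stronger entry grows and the weaker one shrinks, while a column whose entries are equal is left unchanged.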
The algorithm relies on flow being easier within
dense regions than across sparse boundaries, but on
larger data and in the long run this effect disappears.
o A walker starts from some arbitrary point.
o He successively visits new vertices by arbitrarily
selecting one of the outgoing edges.
(Figure: clustering over a graph. Different node colors
show different clusters and their links to other
clusters.)
Steps of the MCL Algorithm
• The input is an undirected graph.
• Create the associated (adjacency) matrix.
• Add self-loops to avoid getting stuck in
local minima (this is an optional step).
• Normalize the matrix.
• Perform the expansion operation by taking the
nth power of the matrix (this is normal matrix
multiplication; for n = 2, simply squaring the matrix).
• Perform the inflation operation: first take the
Hadamard power of the matrix with inflation
parameter r, then rescale it so that its columns sum to 1.
• Repeat the expansion and inflation operations
in turn until a steady state is reached.
• Once an idempotent matrix is obtained, interpret
the resulting matrix in order to find the
clusters.
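The steps above can be sketched as a short loop. This is a simplified illustration, not a reference implementation: the names `mcl` and `clusters`, the tolerance and iteration limit, and the rule used to read clusters off the final matrix (each row with a positive diagonal entry is treated as an attractor that collects the columns with positive entries in that row) are assumptions made for the example.

```python
def normalize(M):
    """Rescale every column to sum to 1 (column-stochastic)."""
    n = len(M)
    col_sums = [sum(M[i][j] for i in range(n)) for j in range(n)]
    return [[M[i][j] / col_sums[j] for j in range(n)] for i in range(n)]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mcl(adj, e=2, r=2, max_iter=100, tol=1e-9):
    """Run expansion and inflation until the matrix stops changing."""
    n = len(adj)
    # Associated matrix plus self-loops, then column normalization.
    M = normalize([[adj[i][j] + (1 if i == j else 0) for j in range(n)]
                   for i in range(n)])
    for _ in range(max_iter):
        expanded = M
        for _power in range(e - 1):          # expansion: M^e
            expanded = matmul(expanded, M)
        # Inflation: entrywise power r, then rescale the columns.
        inflated = normalize([[x ** r for x in row] for row in expanded])
        change = max(abs(inflated[i][j] - M[i][j])
                     for i in range(n) for j in range(n))
        M = inflated
        if change < tol:                     # steady state reached
            break
    return M

def clusters(M, eps=1e-6):
    """Group each attractor row with the columns it attracts."""
    found = set()
    for i in range(len(M)):
        if M[i][i] > eps:
            found.add(frozenset(j for j in range(len(M)) if M[i][j] > eps))
    return sorted(sorted(c) for c in found)

# Hypothetical input: two separate triangles, {0, 1, 2} and {3, 4, 5}.
n = 6
adj = [[0] * n for _ in range(n)]
for u, v in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5)]:
    adj[u][v] = adj[v][u] = 1

print(clusters(mcl(adj)))  # [[0, 1, 2], [3, 4, 5]]
```

The two disconnected triangles are an easy case that converges immediately; on connected graphs the clusters found depend on the expansion power e and the inflation parameter r.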
Example:
Step 1:
Take an undirected graph as input.
(Figure: undirected graph on nodes 1, 2, 3, 4.)
In the following example we expand the matrix with
power 2 and inflate it with power 2.
Associated (adjacency) matrix of the graph:
0 1 1 1
1 0 0 1
1 0 0 0
1 1 0 0

Adding self-loops (ones on the diagonal):
1 1 1 1
1 1 0 1
1 0 1 0
1 1 0 1

Normalizing the matrix so that each column sums to 1
(it is no longer symmetric):
1/4 1/3 1/2 1/3
1/4 1/3 0 1/3
1/4 0 1/2 0
1/4 1/3 0 1/3
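The normalized matrix can be reproduced with exact fractions. The variable names below are illustrative; the adjacency matrix is the one from the example.

```python
from fractions import Fraction

# Adjacency matrix of the example graph (nodes 1..4 map to indices 0..3).
adjacency = [[0, 1, 1, 1],
             [1, 0, 0, 1],
             [1, 0, 0, 0],
             [1, 1, 0, 0]]
n = len(adjacency)

# Add self-loops by putting ones on the diagonal.
with_loops = [[adjacency[i][j] + (1 if i == j else 0) for j in range(n)]
              for i in range(n)]

# Divide every entry by its column sum so that each column sums to 1.
col_sums = [sum(with_loops[i][j] for i in range(n)) for j in range(n)]
normalized = [[Fraction(with_loops[i][j], col_sums[j]) for j in range(n)]
              for i in range(n)]

for row in normalized:
    print("  ".join(str(x) for x in row))
```

The column sums are 4, 3, 2 and 3, which is where the entries 1/4, 1/3, 1/2 and 1/3 come from.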
Perform the expansion operation by taking the nth
power of the matrix (here n = 2):

1/4 1/3 1/2 1/3
1/4 1/3 0 1/3
1/4 0 1/2 0
1/4 1/3 0 1/3
×
1/4 1/3 1/2 1/3
1/4 1/3 0 1/3
1/4 0 1/2 0
1/4 1/3 0 1/3
=
.35 .31 .38 .31
.23 .31 .13 .31
.19 .08 .38 .08
.23 .31 .13 .31

Expansion operation completed.
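The expansion step can be checked in exact arithmetic. The sketch below squares the normalized matrix of the example; the alias `F` and the variable names are illustrative.

```python
from fractions import Fraction as F

# Normalized matrix of the example (each column sums to 1).
M = [[F(1, 4), F(1, 3), F(1, 2), F(1, 3)],
     [F(1, 4), F(1, 3), F(0),    F(1, 3)],
     [F(1, 4), F(0),    F(1, 2), F(0)],
     [F(1, 4), F(1, 3), F(0),    F(1, 3)]]
n = len(M)

# Expansion with power 2: the ordinary matrix product M x M.
expanded = [[sum(M[i][k] * M[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

for row in expanded:
    print("  ".join(str(x) for x in row))
```

The exact entries are 17/48, 11/36, 3/8 in the first row, 11/48, 11/36, 1/8 in the second and fourth rows, and 3/16, 1/12, 3/8 in the third, which round to the two-decimal values .35, .31, .38, .23, .13, .19 and .08.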
Perform the inflation operation: take the Hadamard
(entrywise) product of the matrix with itself, i.e.,
square each entry:

.35 .31 .38 .31
.23 .31 .13 .31
.19 .08 .38 .08
.23 .31 .13 .31
∘
.35 .31 .38 .31
.23 .31 .13 .31
.19 .08 .38 .08
.23 .31 .13 .31
=
.13 .09 .14 .09
.05 .09 .02 .09
.04 .01 .14 .01
.05 .09 .02 .09
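The inflation step for this example can also be checked exactly. The sketch assumes the expanded matrix in its exact form (17/48 ≈ .35, 11/36 ≈ .31, 3/8 ≈ .38, and so on), squares every entry, and then rescales each column to sum to 1; the variable names are illustrative.

```python
from fractions import Fraction as F

# Exact form of the expanded matrix (the decimals in the example are roundings).
expanded = [[F(17, 48), F(11, 36), F(3, 8), F(11, 36)],
            [F(11, 48), F(11, 36), F(1, 8), F(11, 36)],
            [F(3, 16),  F(1, 12),  F(3, 8), F(1, 12)],
            [F(11, 48), F(11, 36), F(1, 8), F(11, 36)]]
n = len(expanded)

# Hadamard power with r = 2: square each entry.
squared = [[x ** 2 for x in row] for row in expanded]

# Rescale every column so that it sums to 1 again.
col_sums = [sum(squared[i][j] for i in range(n)) for j in range(n)]
inflated = [[squared[i][j] / col_sums[j] for j in range(n)] for i in range(n)]

for row in inflated:
    print("  ".join(f"{float(x):.4f}" for x in row))
```

The first column becomes 17/36, 121/612, 9/68, 121/612 (about .47, .20, .13, .20), and the third column becomes 9/20, 1/20, 9/20, 1/20.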
Inflation operation completed (columns rescaled to
sum to 1):

.47 .33 .45 .33
.20 .33 .05 .33
.13 .02 .45 .02
.20 .33 .05 .33

Repeat expansion and inflation of the matrix until a
steady state is reached. The further resulting
matrices are:

.70 .33 .49 .33
.12 .33 .01 .33
.05 .02 .49 --
.12 .33 .01 .33

.94 .33 .50 .33
.03 .33 -- .33
.01 -- .50 --
.03 .33 -- .33

1 .33 .50 .33
-- .33 -- .33
-- -- .50 --
-- .33 -- .33

Attractors and the elements they attract are swept
together into the same cluster.
In this case: {1}, {2,4}, {3}