GENE EXPRESSION CLUSTERING
GRAPH BASED APPROACHES
A presentation by Govind M (M120432CS)
MTech Computer Science and Engineering
National Institute of Technology Calicut
govindmaheswaran@gmail.com
Outline
• Clustering and Graph Theory
• Using Graphs in Clustering
• Simple Graph Partitioning
• Spectral Graph Partitioning
• Conclusion
Clustering
• Process of grouping a set of data objects by similarity
• Objects in the same cluster are similar; objects in different clusters are dissimilar.
• Widely used in data mining, market analysis, etc.
• Used to make sense of bioinformatics data.
• Two major purposes in bioinformatics
    • Find properties of genes (relationships among genes, deducing the functions of genes, etc.)
    • Predict more relevant factors (e.g. clustering cancerous and non-cancerous
      genes, finding the effect of a medication)
Graphs
• Data Structure
• Used in multiple domains
• Key Terms
   • Edge
   • Vertex
   • Weighted Graph
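Some Graph Theory
• Cut: a set of edges whose removal splits the graph into two disjoint vertex sets; its cost is the total weight of the removed edges.
• Partitioning: dividing the vertex set into disjoint subsets.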
Clustering using Graphs
Involves 3 steps
1.   Preprocessing
     ◦   Convert the data set into a graph
     ◦   Represent it with an adjacency matrix and a degree matrix
     ◦   The similarity between two nodes can be taken as the weight of the edge joining them.

2.   Partitioning
     ◦   Partition the graph

3.   Clustering
     ◦   Repeat the partitioning until the required number of clusters is obtained
     ◦   Alternatively, run extra partitioning iterations and then merge clusters.
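
The preprocessing step can be sketched as follows. This is only an illustration under stated assumptions: the slides do not name a similarity measure, so Pearson correlation between expression profiles is assumed here, negative correlations are treated as "no similarity", and the function name `similarity_graph` and the toy data are hypothetical.

```python
import numpy as np

def similarity_graph(expr, threshold=0.0):
    """Build a weighted graph from a genes x conditions expression matrix.

    Assumption: similarity = Pearson correlation, clipped at zero.
    Returns the adjacency (affinity) matrix W and the degree matrix D.
    """
    corr = np.corrcoef(np.asarray(expr, dtype=float))
    W = np.clip(corr, 0.0, None)      # negative correlation -> no edge
    np.fill_diagonal(W, 0.0)          # no self-loops
    W[W < threshold] = 0.0            # optionally drop weak edges
    D = np.diag(W.sum(axis=1))        # degree = sum of incident edge weights
    return W, D

# Toy data (made up): genes 0 and 1 rise across conditions, genes 2 and 3 fall.
expr = [
    [1, 2, 3, 4, 5],
    [2, 4, 6, 8, 11],
    [5, 4, 3, 2, 1],
    [9, 7, 6, 4, 2],
]
W, D = similarity_graph(expr)
```

Genes with similar profiles end up joined by heavy edges, while genes with opposite profiles are not connected at all.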
Simple Graph Partitioning
• Weight of an edge = similarity between the nodes it joins
• Find the minimum cut
• The lower an edge's weight, the more likely its endpoints belong to different clusters
Simple Graph Partitioning: The Algorithm
Input : Graph G<V,E>, Number of Clusters k
Output: k clusters (subgraphs of G)


Repeat k-1 times
     Low_Val = infinity
     For each edge e of the graph
           Calculate Cut_Cost, the cost of a cut at that edge
           if Cut_Cost < Low_Val
                 Low_Val = Cut_Cost
                 Cut_Edge = e
     Cut the graph at Cut_Edge
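
One way to read the pseudocode above is sketched below. The slide does not define how the cost of a cut at an edge is computed, so this sketch assumes the simplest reading: the cost is the edge's own weight, and the weakest remaining edge is removed until the graph falls apart into k connected components. The helper names `components` and `simple_partition` and the toy graph are hypothetical.

```python
from collections import defaultdict

def components(vertices, edges):
    """Return the connected components of an undirected graph as a list of sets."""
    adj = defaultdict(set)
    for u, v, _ in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, comps = set(), []
    for start in vertices:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:                      # iterative depth-first search
            node = stack.pop()
            if node in comp:
                continue
            comp.add(node)
            stack.extend(adj[node] - comp)
        seen |= comp
        comps.append(comp)
    return comps

def simple_partition(vertices, edges, k):
    """Remove the lowest-weight edge repeatedly until k components exist."""
    remaining = sorted(edges, key=lambda e: e[2])   # ascending similarity
    comps = components(vertices, remaining)
    while len(comps) < k and remaining:
        remaining.pop(0)                            # cut the weakest edge
        comps = components(vertices, remaining)
    return comps

# Toy graph (made up): edges are (u, v, similarity).
vertices = ["a", "b", "c", "d", "e"]
edges = [("a", "b", 0.9), ("b", "c", 0.8), ("c", "d", 0.1), ("d", "e", 0.7)]
clusters = simple_partition(vertices, edges, 2)
```

Here the weak edge between `c` and `d` is cut first, leaving the clusters {a, b, c} and {d, e}.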
Simple Graph Partitioning (cont.)

• Advantages
  • Simple to implement
  • Uses the concept of min cut.
• Disadvantage
  • Ignores intra-cluster similarity: only the cost of the cut is considered.
Spectral Graph Partitioning
• Widely used
• Uses the eigenvectors of the Laplacian matrix
• Recursive algorithm
• Produces good-quality partitions
• Computationally better than Simple Graph Partitioning.
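Some graph theory…
• Degree (row sums of the affinity matrix): d1 = 7, d2 = 3, d3 = 1, d4 = 0

• Affinity matrix:
  $$A = \begin{pmatrix} 0 & 2 & 5 & 0 \\ 0 & 0 & 3 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{pmatrix}$$

• Degree matrix:
  $$D = \begin{pmatrix} 7 & 0 & 0 & 0 \\ 0 & 3 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}$$

• Laplacian matrix (written here as L = A - D; the more common convention L = D - A has the same eigenvectors, with eigenvalues of opposite sign):
  $$L = \begin{pmatrix} -7 & 2 & 5 & 0 \\ 0 & -3 & 3 & 0 \\ 0 & 0 & -1 & 1 \\ 0 & 0 & 0 & 0 \end{pmatrix}$$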
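
The matrices on this slide can be reproduced with a few lines of NumPy. The sketch below first follows the slide literally (each edge stored once in the upper triangle, L = A - D), then shows the symmetric form with L = D - A that is more commonly used for undirected graphs; the variable names are illustrative only.

```python
import numpy as np

# Affinity matrix exactly as on the slide (each edge listed once).
A = np.array([
    [0, 2, 5, 0],
    [0, 0, 3, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
], dtype=float)

degrees = A.sum(axis=1)      # row sums: 7, 3, 1, 0
D = np.diag(degrees)         # degree matrix
L_slide = A - D              # Laplacian with the slide's sign convention

# Common undirected convention: symmetric affinities and L = D - A.
A_sym = A + A.T
D_sym = np.diag(A_sym.sum(axis=1))
L_sym = D_sym - A_sym        # every row sums to zero
```

With the symmetric form each vertex's degree counts all of its incident edges, so the degrees become 7, 5, 9 and 1 rather than the row sums shown on the slide.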
Some more Graph Theory…
• Spectrum: the eigenvectors of the graph, arranged in order of the magnitude of their eigenvalues.
• Eigenvalues of a graph
   •   Calculated as the eigenvalues of the graph's Laplacian matrix
   •   The eigenvectors are obtained correspondingly


• Fiedler theorem
   •   Relates the eigenvectors to properties of the graph
   •   Principal eigenvectors; the k-th principal eigenvector.
   •   Principal eigenvector: centrality of the vertices


• 2nd principal eigenvector: associated with the algebraic connectivity (the second-smallest Laplacian eigenvalue)
   •   Called the Fiedler vector
   •   A vector of positive and negative values
   •   The partition is decided by the sign of each value.
Spectral Graph Partitioning
Input : Graph G<V,E>
Output: Graphs G1<V1,E1>, G2<V2,E2>

 Create the Laplacian matrix L of the graph G.
 Calculate the Fiedler vector F of L.
 for each vertex v_i in G
    if F[i] > 0
          V1.append(v_i)
    else
          V2.append(v_i)
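
The bipartitioning step above might look like this in NumPy. It is a minimal sketch assuming a symmetric weight matrix and the L = D - A convention, with the Fiedler vector taken as the eigenvector of the second-smallest eigenvalue; the function name `spectral_bipartition` and the toy graph are hypothetical.

```python
import numpy as np

def spectral_bipartition(W):
    """Split the vertices in two by the sign of the Fiedler vector.

    W is a symmetric matrix of non-negative edge weights.
    """
    W = np.asarray(W, dtype=float)
    L = np.diag(W.sum(axis=1)) - W                 # Laplacian, L = D - A
    eigenvalues, eigenvectors = np.linalg.eigh(L)  # eigenvalues in ascending order
    fiedler = eigenvectors[:, 1]                   # second-smallest eigenvalue
    V1 = [i for i, x in enumerate(fiedler) if x > 0]
    V2 = [i for i, x in enumerate(fiedler) if x <= 0]
    return V1, V2, fiedler

# Toy graph (made up): two triangles joined by one weak edge.
W = np.zeros((6, 6))
for group in ([0, 1, 2], [3, 4, 5]):
    for i in group:
        for j in group:
            if i != j:
                W[i, j] = 1.0
W[2, 3] = W[3, 2] = 0.1
V1, V2, fiedler = spectral_bipartition(W)
```

The overall sign of an eigenvector is arbitrary, so which triangle lands in `V1` and which in `V2` may differ between runs or libraries; only the grouping itself is meaningful.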
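Spectral Graph Partitioning: Example
• First split, on the full graph (vertices 1 to 6):
  2nd Principal Vector = <0.415, 0.309, 0.069, −0.221, 0.221, −0.794>
  Positive entries: vertices 1, 2, 3, 5. Negative entries: vertices 4, 6.

• Second split, on the subgraph of vertices 1, 2, 3, 5:
  2nd Principal Vector = <0.415, 0.309, -0.190, 0.169>
  Positive entries: vertices 1, 2, 5. Negative entry: vertex 3.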
Spectral Graph Partitioning: Bipartitioning Method (contd.)

• Recursive algorithm
• Better than Simple Graph Partitioning, but still not optimal
• Needs repeated bipartitioning to produce more than two clusters.


• Can be improved by multipartitioning
• Uses more eigenvectors.
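
Repeated bipartitioning can be sketched as below. The slides do not say which cluster to split next, so this sketch assumes the largest current cluster is always chosen; the names `fiedler_split` and `recursive_bisection` and the toy graph are hypothetical.

```python
import numpy as np

def fiedler_split(W, nodes):
    """Split `nodes` in two by the sign of the Fiedler vector of their subgraph."""
    sub = W[np.ix_(nodes, nodes)]
    L = np.diag(sub.sum(axis=1)) - sub          # Laplacian of the subgraph
    _, vecs = np.linalg.eigh(L)                 # eigenvalues in ascending order
    f = vecs[:, 1]
    left = [n for n, x in zip(nodes, f) if x > 0]
    right = [n for n, x in zip(nodes, f) if x <= 0]
    return left, right

def recursive_bisection(W, k):
    """Bipartition repeatedly until k clusters exist (assumed: split the largest)."""
    clusters = [list(range(W.shape[0]))]
    while len(clusters) < k:
        clusters.sort(key=len)
        biggest = clusters.pop()
        if len(biggest) < 2:                    # nothing left to split
            clusters.append(biggest)
            break
        left, right = fiedler_split(W, biggest)
        if not left or not right:               # degenerate split, stop
            clusters.append(biggest)
            break
        clusters += [left, right]
    return clusters

# Toy graph (made up): three triangles chained by weak edges of different strength.
W = np.zeros((9, 9))
for group in ([0, 1, 2], [3, 4, 5], [6, 7, 8]):
    for i in group:
        for j in group:
            if i != j:
                W[i, j] = 1.0
W[2, 3] = W[3, 2] = 0.01
W[5, 6] = W[6, 5] = 0.5
clusters = recursive_bisection(W, 3)
```

The first split separates the most weakly attached triangle, and the second split divides the remaining six vertices. A multipartitioning variant would instead embed the vertices using several eigenvectors at once and cluster them in that space.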
Conclusion
• Graph-based clustering rests on simple concepts of graph theory
• Spectral methods give good-quality results
• Can give better performance than traditional clustering.
• Comes with a preprocessing overhead.
Editor's notes

  1. Centrality: influence