Analysis of
Gene Expression Data
     _______________________

            Jhoirene B. Clemente
       Algorithms and Complexity Lab
     University of the Philippines Diliman
Overview

● Definitions
● Clustering of Gene Expression Data
● Visualizations of Gene Expression Data
Definitions
Gene
Basic unit of heredity in a living organism.
It is normally a stretch of DNA that codes
for a type of protein or for an RNA chain
that has a function in the organism.

Gene Expression Data
The expression levels of genes in an individual,
measured using microarrays.
Definitions
Gene Expression Data

    Gene   Gene Expression
    a
    b
    c
    ...
    n
Definitions
Gene Expression Data (1 sample, n genes)

    Gene   Gene Expression
    a
    b
    c
    ...
    n
Definitions
(n x m) Data Matrix: n genes (rows), m samples (columns)

    Gene   Sample 1   Sample 2   ...   Sample m
    a
    b
    c
    ...
    n
Clustering




Clustering is the unsupervised classification of
patterns (observations, data sets, and feature
vectors) into groups called clusters, such that
objects in the same cluster are similar to each
other while objects in different clusters are as
dissimilar as possible.
Cluster Analysis
1. Preprocessing
 ● Filtering
 ● Normalization
2. Clustering
3. Analysis
Clustering
Partitional
●   K-means Algorithm
●   X-means Algorithm



Hierarchical
Clustering
Given the (n x m) data matrix, we can

●   Cluster the set of genes
●   Cluster the set of samples
●   Cluster the set of genes and samples
    simultaneously.
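The first two options differ only in which axis of the data matrix supplies the points. A minimal sketch, assuming the matrix is stored as a list of rows with one row per gene; the gene names and values are made up for illustration, and `transpose` is a hypothetical helper, not part of the original slides:

```python
# Toy (n x m) expression matrix: n = 3 genes (rows), m = 4 samples (columns).
# All values are made up for illustration.
genes = ["a", "b", "c"]
matrix = [
    [0.9, 0.1, -0.4, -0.8],   # gene a
    [0.8, 0.2, -0.5, -0.7],   # gene b
    [-0.6, -0.1, 0.5, 0.9],   # gene c
]

def transpose(rows):
    """Swap rows and columns of a rectangular list-of-lists matrix."""
    return [list(col) for col in zip(*rows)]

# Clustering genes: each gene is a point in R^m (one row per gene).
gene_points = matrix
# Clustering samples: each sample is a point in R^n (one row per sample).
sample_points = transpose(matrix)
```

Clustering genes and samples simultaneously needs a dedicated method and is not covered by this sketch.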
Data Set
The data set is time-series gene expression data from
a synchronized population of yeast.
Preprocessing
Filtering
 ● Removed genes not involved in cell cycle regulation
 ● Removed genes belonging to more than one group

Normalization
 ● All gene expression values range from -1.0 to 1.0.
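The slides do not say how the values were brought into the range -1.0 to 1.0. One common way to do it is a linear rescale of each gene's profile, sketched below. Per-gene min-max scaling is an assumption here, and `scale_to_unit_range` and the sample profile are hypothetical:

```python
def scale_to_unit_range(values):
    """Linearly rescale a list of numbers so min -> -1.0 and max -> 1.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant profile carries no shape information; map it to zeros.
        return [0.0 for _ in values]
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in values]

# Hypothetical raw expression profile of one gene across 5 time points.
raw_profile = [2.0, 4.0, 6.0, 8.0, 10.0]
scaled = scale_to_unit_range(raw_profile)
```

Applied row by row, this keeps the shape of each gene's time series while putting all genes on the same scale.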
Data Set
Data matrix (384 genes and 17 samples) with 5
classifications.
Groupings are based on cell cycle phase activation.
Data Set
Group 1: Resting Phase
Group 2: First Growth Phase
Group 3: Synthesis Phase
Group 4: Second Growth Phase
Group 5: Cell Division
Clustering of genes
K-means Algorithm

Given n data points in R^d
1. Choose k initial centers, one for each of the k clusters
2. Assign each data point to the cluster with the nearest center
   (Euclidean distance, Manhattan distance, etc.)
3. Recompute each of the k centers from the points assigned to it
4. Repeat steps 2 and 3 until convergence
k = 5, since we want to approximate the 5 classifications.
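The four steps can be sketched in plain Python as follows. This is an illustrative sketch, not the implementation behind the slides: the function names, the toy points, and the rule that an empty cluster keeps its previous center are all assumptions. The toy data uses k = 2; the yeast data set would use k = 5 on 384 points in R^17.

```python
import math

def euclidean(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return [sum(coords) / len(points) for coords in zip(*points)]

def kmeans(points, centers, distance=euclidean, max_iter=100):
    """Run k-means from the given initial centers; return (labels, centers)."""
    centers = [list(c) for c in centers]
    labels = None
    for _ in range(max_iter):
        # Step 2: assign every point to its nearest center.
        new_labels = [
            min(range(len(centers)), key=lambda j: distance(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # Step 4: assignments stopped changing, so we have converged.
        labels = new_labels
        # Step 3: move each center to the mean of its assigned points.
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # assumption: an empty cluster keeps its previous center
                centers[j] = mean_point(members)
    return labels, centers

# Hypothetical toy data: two well-separated groups in R^2, so k = 2 here.
points = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [9.0, 9.0], [9.0, 10.0], [10.0, 9.0]]
labels, centers = kmeans(points, centers=[points[0], points[3]])
```

The `distance` argument lets the assignment step use another metric such as Manhattan distance. The mean update in step 3 is the usual choice for Euclidean distance.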
Clustering of genes
Initialization

1. Choose the first k centers that will maximize the
   distance between the clusters
2. Sort the distances between all the data points
   and then choose the k initial points at constant
   intervals from the sorted list
3. Use the first k points in the data set as the first k
   centers
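The three strategies can be sketched as below. Only the third is fully specified by the slide. The first is rendered here as a greedy farthest-first heuristic. The second is read as "sort the points by distance from the data mean"; that reading is an assumption, since the slide does not say which distances are sorted. All function names and the toy points are hypothetical.

```python
import math

def euclidean(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def init_first_k(points, k):
    """Strategy 3: use the first k points of the data set as centers."""
    return [list(p) for p in points[:k]]

def init_farthest_first(points, k):
    """Greedy reading of strategy 1: spread the centers far apart.

    Start from the first point, then repeatedly add the point whose
    distance to its closest already-chosen center is largest.
    """
    centers = [list(points[0])]
    while len(centers) < k:
        farthest = max(points, key=lambda p: min(euclidean(p, c) for c in centers))
        centers.append(list(farthest))
    return centers

def init_sorted_intervals(points, k):
    """Assumed reading of strategy 2: sort points by distance from the
    data mean, then pick k points at constant intervals in that order.
    Requires k <= len(points)."""
    mean = [sum(coords) / len(points) for coords in zip(*points)]
    ordered = sorted(points, key=lambda p: euclidean(p, mean))
    step = len(ordered) // k
    return [list(ordered[i * step]) for i in range(k)]

# Hypothetical toy points in R^2.
points = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 0.0]]
```

Because k-means only converges to a local optimum, the choice among these strategies can change the final clusters.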
Clustering of genes
Using k-means clustering with k = 5
Clustering of genes
●   Clustering may suggest possible roles for genes
    with unknown functions.
●   Clustering the samples or experiments may shed
    light on new subtypes of diseases.
●   Clustering may help identify which type of
    treatment is suited for a specific type of cancer.
●   Clustering can support building genetic networks.
Visualization
●   Vector Fusion
●   Non-metric Multidimensional Scaling (nMDS)
●   Principal Components Analysis (PCA)
Vector fusion
A visualization technique that uses the Single point
broken line parallel algorithm.
nMDS visualization
Input: dissimilarity matrix $\Delta = [\delta_{ij}]$ (actual distances)
 ● In nMDS, only the rank order of the entries is
   assumed to contain the significant information.
 ● Thus, the purpose of the non-metric MDS
   algorithm is to find a configuration of points
   whose distances reflect as closely as possible
   the rank order of the data.
 ● The transformation uses a nonparametric
   monotone function f (monotone regression):

             $d_{ij} = f(\delta_{ij})$ (pseudo-distance)
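The slide names monotone regression but not an algorithm for it. A standard choice is pool-adjacent-violators, sketched below, which is swapped in here as an assumption. It shows how pseudo-distances can be fitted so that they never decrease as the input dissimilarities increase. The function names and the four example pairs are hypothetical.

```python
def monotone_regression(values):
    """Pool-adjacent-violators: best non-decreasing least-squares fit."""
    blocks = []  # each block is [sum, count]; its fitted value is sum / count
    for v in values:
        blocks.append([v, 1])
        # Merge backwards while the last block's mean is below the previous one.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    return fitted

def pseudo_distances(dissimilarities, distances):
    """Fit pseudo-distances that are monotone in the dissimilarity ranks."""
    order = sorted(range(len(dissimilarities)), key=lambda i: dissimilarities[i])
    fitted = monotone_regression([distances[i] for i in order])
    result = [0.0] * len(distances)
    for rank, i in enumerate(order):
        result[i] = fitted[rank]
    return result

# Hypothetical example: four pairs (i, j), flattened into lists.
delta = [1.0, 2.0, 3.0, 4.0]      # input dissimilarities (already ranked)
config_d = [1.0, 3.0, 2.0, 4.0]   # distances in a trial point configuration
d_hat = pseudo_distances(delta, config_d)
```

In this example the second and third pairs violate the rank order (3.0 before 2.0), so they are pooled to their average, 2.5. A full nMDS procedure would alternate this fitting step with moving the points to reduce the mismatch.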
PCA
[Figure slides: vector fusion visualization; nMDS visualization (7 slides)]
References
Clemente J., Salido J.A. (2010). "Non-Metric Multidimensional
Scaling and Vector Fusion Visualization of Cell Cycle
Independent Gene Expressions for Gene Function Analysis".
Published in the conference proceedings of the National
Conference on Information Technology for Education (NCITE)
2010 and the Philippine IT Journal, Feb 2011 issue.

Clemente J. (2010). "Cluster Analysis for Identifying Genes
Highly Correlated with a Phenotype". Undergraduate thesis,
Department of Computer Science, University of the
Philippines Diliman.
Thank you for listening
