Microarray Dataset: quick mining
and gene profile analysis using
online tools
Dr. Etienne Z. GNIMPIEBA
Sioux Falls, March 2013
Etienne.gnimpieba@usd.edu
Plan
• Gene expression measurement
• Microarray process
• Gene expression data stores
• Data mining / querying
• Data analysis
• Example: ATP13A2 profile in stress conditions
Gene expression measurement
Higher-plex techniques:
SAGE
DNA microarray
Tiling array
RNA-Seq
NGS
Low-to-mid-plex techniques:
Reporter gene
Northern blot
Western blot
Fluorescent in situ hybridization
Reverse transcription PCR
What is a Microarray?
“A DNA microarray is a multiplex technology consisting of thousands of oligonucleotide spots, each containing picomoles of a specific DNA sequence.”
• Used to quantitate mRNA or DNA
• Many applications:
◦ mRNA or DNA levels
◦ SNP identification
◦ ChIP-on-Chip
Hypotheses
• Microarrays are usually hypothesis-generating:
◦ They highlight specific genes or features that are particularly interesting for follow-up experiments
◦ There are many interesting exceptions
• Biomarkers
• Pathway analyses
• This does not reduce the importance of experimental design
◦ The low statistical power of array studies makes good design even more important and very challenging
Microarray process (1/3)
• Image analysis (GenePix)
• Normalization (R)
• Pre-treatment
• Differential expression
• Clustering
• Data mining
• Annotation
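The slide lists normalization as a step but does not say which method is used. As a minimal sketch, assuming a log2 transform followed by per-array median centering (one common choice, not necessarily the one used in this course), with a hypothetical function name and toy intensities:

```python
import math
from statistics import median

def log2_median_center(intensities):
    """Log2-transform the spot intensities of one array, then subtract the array median.

    Assumes every intensity is positive; background-corrected values that are
    zero or negative would need to be handled before calling this.
    """
    logged = [math.log2(x) for x in intensities]
    center = median(logged)
    return [value - center for value in logged]

# Toy intensities for a single array (illustrative numbers, not real measurements)
array_a = [128.0, 256.0, 512.0, 1024.0, 2048.0]
normalized = log2_median_center(array_a)
print(normalized)
```

After centering, the median of each array sits at zero, so arrays with different overall brightness can be compared on the same scale.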
Microarray process (2/3)
Microarray process (3/3)
High-density filters (macroarrays)
Size: 12 cm x 8 cm
• 2,400 clones per membrane
• Radioactive labelling
• 1 experimental condition per membrane
Glass slides (microarrays)
Size: 5.4 cm x 0.9 cm
• 10,000 clones per slide
• Fluorescent labelling
• 2 experimental conditions per slide
Oligonucleotide chips
Size: 1.28 cm x 1.28 cm
• 300,000 oligonucleotides per slide
• Fluorescent labelling
• 1 experimental condition per slide
Gene expression data
management
Database                            Microarray experiment sets  Sample profiles  Date reported
ArrayExpress at EBI                 24,838                      708,914          October 28, 2011
ArrayTrack™                         1,622                       50,953           February 11, 2012
caArray at NCI                      41                          1,741            November 15, 2006
Gene Expression Omnibus - NCBI      25,859                      641,770          October 28, 2011
Genevestigator database             2,500                       65,000           January 2012
MUSC database                       ~45                         555              April 1, 2007
Stanford Microarray database        82,542                      Not reported     October 23, 2011
UNC Microarray database             ~31                         2,093            April 1, 2007
UNC modENCODE Microarray database   ~6                          180              July 17, 2009
UPenn RAD database                  ~100                        ~2,500           September 1, 2007
UPSC-BASE                           ~100                        Not reported     November 15, 2007
SAGE
GEO
GUDMAP (421)
MGI
BIOGPS
Data mining / querying
• Problem specification
• Query
• Extraction
• Storage
• Load
• Pretreat / prepare for analysis
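The query, load and pretreat steps above can be sketched as follows. This is only an illustration: the E-utilities search endpoint and its `db=gds` (GEO DataSets) parameter are real, but the helper names, the search term layout, the missing-value markers and the tab-delimited toy extract are assumptions made for the example, and no network request is sent.

```python
import csv
import io
from urllib.parse import urlencode

# NCBI E-utilities search endpoint; db=gds targets GEO DataSets.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def build_geo_query(gene, keyword, retmax=20):
    """Query step: build (but do not send) a search URL for a gene and a keyword."""
    term = f"{gene}[All Fields] AND {keyword}[All Fields]"
    return EUTILS_ESEARCH + "?" + urlencode({"db": "gds", "term": term, "retmax": retmax})

def load_expression_table(text):
    """Load step: parse a tab-delimited extract into sample names and {probe: values}."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    samples = next(reader)[1:]  # first column holds the probe ID
    table = {}
    for row in reader:
        if row:
            # Missing measurements are kept as None so they can be filtered later
            table[row[0]] = [None if v in ("", "NA", "null") else float(v) for v in row[1:]]
    return samples, table

def pretreat(table):
    """Pretreatment step: keep only probes measured in every sample."""
    return {probe: values for probe, values in table.items() if None not in values}

url = build_geo_query("ATP13A2", "stress")

# Toy extract with hypothetical probe IDs and values (not real data)
extract = "ID_REF\tS1\tS2\tS3\nprobe_1\t7.1\t7.4\t6.9\nprobe_2\t5.0\tNA\t5.2\n"
samples, table = load_expression_table(extract)
clean = pretreat(table)
print(url)
print(samples, clean)
```

Keeping each step as its own small function mirrors the slide's pipeline, so the storage step (writing the extract to disk or a database) can be slotted in between without touching the others.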
Data analysis (1/3)
• Question-Answer
◦ Experimental condition profile: group comparison
◦ Annotation profile: biological systems involved
◦ Clustering profile: co-regulation
◦ Time course profile: time variation
◦ …
• Descriptive
◦ Boxplot (SD, mean, median, …)
◦ Scatter plot
• Predictive / inference (clustering)
• Modeling (machine learning, simulation)
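The descriptive statistics behind a boxplot (together with the mean and SD named on the slide) can be computed with the Python standard library. A minimal sketch, assuming the "inclusive" quartile method (plotting tools differ on this) and using a hypothetical function name and toy values:

```python
from statistics import mean, quantiles, stdev

def boxplot_summary(values):
    """Return the five-number summary plus mean and sample SD for one sample's values."""
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "q1": q1,
        "median": q2,
        "q3": q3,
        "max": max(values),
        "mean": mean(values),
        "sd": stdev(values),
    }

# Toy log2 expression values for one sample (illustrative only)
sample = [6.0, 7.0, 8.0, 9.0, 10.0]
summary = boxplot_summary(sample)
print(summary)
```

Comparing these summaries across samples is a quick quality check: a sample whose median or spread differs strongly from the rest may need normalization or exclusion.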
Data analysis (2/3)
• 3 Questions
◦ What is the right dataset (experimental condition)?
◦ Is the dataset ready for analysis (quality)?
◦ What is the expression profile for a given gene?
◦ Significant differential expression in group comparisons
• Tools
◦ ArrayExpress (EBI)
◦ Boxplot
◦ GEO2R (limma, profile graph, …)
◦ …
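To make the group-comparison question concrete, here is a rough sketch of a two-group test on log2 expression values for one gene. GEO2R relies on limma's moderated t-statistics. The code below is not limma: it computes a plain Welch t statistic and a log2 fold change, with no p-value and no multiple-testing correction. The function names and toy values are assumptions for illustration.

```python
from math import sqrt
from statistics import mean, variance

def log2_fold_change(group_a, group_b):
    """Difference of group means; inputs are assumed to be log2 values already."""
    return mean(group_a) - mean(group_b)

def welch_t(group_a, group_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    na, nb = len(group_a), len(group_b)
    va = variance(group_a) / na  # squared standard error of mean A
    vb = variance(group_b) / nb  # squared standard error of mean B
    t = (mean(group_a) - mean(group_b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df

# Toy log2 values for one gene in two groups (illustrative, not real data)
stress = [9.0, 10.0, 11.0]
control = [7.0, 8.0, 9.0]
lfc = log2_fold_change(stress, control)
t_stat, dof = welch_t(stress, control)
print(lfc, t_stat, dof)
```

With only a few replicates per group the per-gene variance estimate is unstable, which is the motivation for moderated statistics such as limma's and for the slide's emphasis on experimental design.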
Data analysis (3/3)
Boxplot
Example: ATP13A2 profile in stress conditions
• Specification: ATP13A2 profile in stress conditions
• Data querying:
◦ GEO
◦ ArrayExpress
◦ Gene Atlas
• Data analysis:
◦ Online: GEO2R, Genospace, …
◦ Desktop: R, ArrayTrack, …
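Once a dataset has been retrieved, a gene profile reduces to grouping that gene's values by experimental condition. A minimal sketch, assuming the sample annotations and the gene's row are already loaded as dictionaries. The sample IDs, condition labels and numbers are placeholders invented for the example, not measurements of ATP13A2.

```python
from collections import defaultdict
from statistics import mean

def profile_by_condition(values, conditions):
    """Average one gene's expression values within each experimental condition."""
    grouped = defaultdict(list)
    for sample_id, value in values.items():
        grouped[conditions[sample_id]].append(value)
    return {condition: mean(vals) for condition, vals in grouped.items()}

# Placeholder annotations and values (hypothetical, for illustration only)
conditions = {"S1": "control", "S2": "control", "S3": "stress", "S4": "stress"}
gene_values = {"S1": 8.0, "S2": 8.5, "S3": 9.5, "S4": 10.0}
profile = profile_by_condition(gene_values, conditions)
print(profile)
```

In practice `gene_values` would be the row for the ATP13A2 probe in the downloaded dataset, and the per-condition means would feed the group comparison.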
Significant differential expression
!!!
Kerry Bemis slides

Session ii g1 overview genomics and gene expression mmc-good

  • 1. Microarray Dataset: quick mining and gene profile analysis using online tools Dr. Etienne Z. GNIMPIEBA Sioux Falls, March 2013 Etienne.gnimpieba@usd.edu
  • 2. Plan
      • Gene expression measurement
      • Microarray process
      • Gene expression data stores
      • Data mining / querying
      • Data analysis
      • Example: ATP13A2 profile in stress conditions
  • 3. Gene expression measurement
      • Higher-plex techniques: SAGE, DNA microarray, tiling array, RNA-Seq, NGS
      • Low-to-mid-plex techniques: reporter gene, Northern blot, Western blot, fluorescent in situ hybridization, reverse transcription PCR
  • 4. What is a Microarray?
      “A DNA microarray is a multiplex technology consisting of thousands of oligonucleotide spots, each containing picomoles of a specific DNA sequence.”
      • Used to quantitate mRNA or DNA
      • Many applications:
          ◦ mRNA or DNA levels
          ◦ SNP identification
          ◦ ChIP-on-Chip
  • 5. Hypotheses
      • Microarrays are usually hypothesis-generating:
          ◦ They highlight specific genes or features that are particularly interesting for follow-up experiments
          ◦ There are many interesting exceptions
              ▪ Biomarkers
              ▪ Pathway analyses
      • This does not reduce the importance of experimental design
          ◦ The low statistical power of array studies makes good design even more important and very challenging
  • 6. Microarray process (1/3) • Image analysis (genepix) • Normalization (R) • Pre-treatment • Differential expression • Clustering • Data mining • Annotation
  • 8. Microarray process (3/3)
      Format | High-density filters (macroarrays) | Glass slides (microarrays) | Oligonucleotide chips
      Size | 12 cm x 8 cm | 5.4 cm x 0.9 cm | 1.28 cm x 1.28 cm
      Capacity | 2,400 clones per membrane | 10,000 clones per slide | 300,000 oligonucleotides per slide
      Labelling | radioactive | fluorescent | fluorescent
      Experimental conditions | 1 per membrane | 2 per slide | 1 per slide
  • 9. Gene expression data management
      Database | Microarray Experiment Sets | Sample Profiles | Date Reported
      ArrayExpress at EBI | 24,838 | 708,914 | October 28, 2011
      ArrayTrack™ | 1,622 | 50,953 | February 11, 2012
      caArray at NCI | 41 | 1,741 | November 15, 2006
      Gene Expression Omnibus - NCBI | 25,859 | 641,770 | October 28, 2011
      Genevestigator database | 2,500 | 65,000 | January 2012
      MUSC database | ~45 | 555 | April 1, 2007
      Stanford Microarray database | 82,542 | Not reported | October 23, 2011
      UNC Microarray database | ~31 | 2,093 | April 1, 2007
      UNC modENCODE Microarray database | ~6 | 180 | July 17, 2009
      UPenn RAD database | ~100 | ~2,500 | September 1, 2007
      UPSC-BASE | ~100 | Not reported | November 15, 2007
      Also listed: SAGE, GEO, GUDMAP (421), MGI, BIOGPS
  • 10. Data mining / querying
      • Problem specification
      • Query
      • Extraction
      • Storage
      • Load
      • Pretreat / prepare for analysis
  • 11. Data analysis (1/3)
      • Question-Answer
          ◦ Experimental condition profile: group comparison
          ◦ Annotation profile: biological systems involved
          ◦ Clustering profile: co-regulation
          ◦ Time course profile: time variation
          ◦ …
      • Descriptive
          ◦ Boxplot (SD, mean, median)
          ◦ Scatter plot
      • Predictive / inference (clustering)
      • Modeling (machine learning, simulation)
  • 12. Data analysis (2/3)
      • 3 Questions
          ◦ What is the right dataset (experimental condition)?
          ◦ Is the dataset ready for analysis (quality)?
          ◦ What is the expression profile for a given gene?
          ◦ Significant differential expression in group comparisons
      • Tools
          ◦ ArrayExpress (EBI)
          ◦ Boxplot
          ◦ GEO2R (LIMMA, profile graph)
          ◦ …
  • 14. Example: ATP13A2 profile in stress conditions
      • Specification: ATP13A2 profile in stress conditions
      • Data querying:
          ◦ GEO
          ◦ Array Express
          ◦ Gene Atlas
      • Data analysis:
          ◦ Online: GEO2R, Genospace, …
          ◦ Desktop: R, ArrayTrack, …
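
The "group comparison" question on the data analysis slides can be illustrated with a minimal Python sketch. It assumes log2-transformed, already normalized intensities for a single probe, and computes a log2 fold change and a plain Welch t statistic. This is not the moderated statistic that LIMMA (used by GEO2R) reports; the function names and the expression values are hypothetical and chosen only for illustration.

```python
import math
import statistics


def log2_fold_change(control, treated):
    """Difference of mean log2 intensities (treated minus control)."""
    return statistics.mean(treated) - statistics.mean(control)


def welch_t(control, treated):
    """Welch's t statistic for two groups with possibly unequal variances."""
    mean_c, mean_t = statistics.mean(control), statistics.mean(treated)
    var_c, var_t = statistics.variance(control), statistics.variance(treated)
    standard_error = math.sqrt(var_c / len(control) + var_t / len(treated))
    return (mean_t - mean_c) / standard_error


# Hypothetical log2 expression values for one probe (not real ATP13A2 data).
control = [7.0, 7.2, 6.8, 7.0]
stressed = [8.0, 8.2, 7.8, 8.0]

lfc = log2_fold_change(control, stressed)
t_stat = welch_t(control, stressed)
print(f"log2 fold change: {lfc:.2f}, Welch t: {t_stat:.2f}")
```

With only a handful of samples per group, as is typical for array studies, a statistic like this is best read as a ranking aid for follow-up rather than as firm evidence, which matches the hypothesis-generating role described on slide 5.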

Editor's notes

  1. I cannot claim to turn anyone into a statistician in 20 minutes. I will just give you a few pointers for rapid analysis of microarray data.
  2. The following experimental techniques are used to measure gene expression and are listed in roughly chronological order, starting with the older, more established technologies. They are divided into two groups based on their degree of multiplexity.
  5. ArrayTrack™ provides an integrated solution for managing, analyzing, and interpreting microarray gene expression data. Specifically, ArrayTrack™ is MIAME (Minimum Information About A Microarray Experiment)-supportive for storing both microarray data and experiment parameters associated with a pharmacogenomics or toxicogenomics study. Many statistical and visualization tools are available with ArrayTrack™ which provides a rich collection of functional information about genes, proteins, and pathways for biological interpretation.  The primary emphasis of ArrayTrack™ is the direct linking of analysis results with functional information to facilitate the interaction between the choice of analysis methods and the biological relevance of analysis results. Using ArrayTrack™, users can easily select a statistical method applied to stored microarray data to determine a list of differentially expressed genes. The gene list can then be directly linked to pathways and gene ontology for functional analysis.
  6. Boxplots are useful for determining where the majority of the data lies.
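
As a rough illustration of what a boxplot summarizes, the sketch below computes the quartiles, the interquartile range, and the usual 1.5 x IQR outlier fences for one sample. The `boxplot_summary` helper and the intensity values are hypothetical; the quartile method ("inclusive") is one of several conventions, so other tools may report slightly different quartiles.

```python
import statistics


def boxplot_summary(values):
    """Return quartiles, IQR fences, and outliers for one sample's intensities."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    outliers = [v for v in values if v < low_fence or v > high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "outliers": outliers,
    }


# Hypothetical log2 intensities for one array (illustration only).
sample = [6.0, 6.5, 7.0, 7.5, 8.0, 12.0]
summary = boxplot_summary(sample)
print(summary)
```

Comparing these summaries across samples is a quick way to check whether a dataset looks comparably distributed before running a group comparison.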