Querying Chado




Monday, February 16, 2009
Overview

                   • Relational databases
                   • Chado
                   • Writing queries
                   • Saving the results
                   • More examples


Relational database


                   • Data are organised in tables:
                   • The columns of the table represent attributes,
                   • The rows represent entities.




A conventional genomic database

         organism
           organism_id      genus             species       strain
                 1          Staphylococcus    aureus        MRSA252
                 2          Plasmodium        falciparum    3D7
                 3          Schistosoma       mansoni

         gene
           gene_id          organism_id       locus_tag
           (primary key)    (foreign key)     (attribute)
                 1               1            SAR0001
                 2               1            SAR0002
                 3               2            PFA00005
                 4               2            PFB00010
                 5               3            Smp_012250
                 6               3            Smp_152680
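
The two tables above can be tried out with a small sketch. It uses Python's built-in sqlite3 module with an in-memory database as a stand-in for PostgreSQL (an assumption made only so the example runs anywhere); the column types are guesses, and the rows are the ones shown on the slide.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL (assumption, for portability).
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only on request
conn.executescript("""
    create table organism (
        organism_id integer primary key,
        genus       text not null,
        species     text not null,
        strain      text
    );
    create table gene (
        gene_id     integer primary key,
        organism_id integer not null references organism (organism_id),
        locus_tag   text not null
    );
    insert into organism values
        (1, 'Staphylococcus', 'aureus',     'MRSA252'),
        (2, 'Plasmodium',     'falciparum', '3D7'),
        (3, 'Schistosoma',    'mansoni',    null);
    insert into gene values
        (1, 1, 'SAR0001'),    (2, 1, 'SAR0002'),
        (3, 2, 'PFA00005'),   (4, 2, 'PFB00010'),
        (5, 3, 'Smp_012250'), (6, 3, 'Smp_152680');
""")

# Follow the foreign key from gene to organism.
plasmodium_genes = [row[0] for row in conn.execute("""
    select gene.locus_tag
    from gene
    join organism on organism.organism_id = gene.organism_id
    where organism.genus = 'Plasmodium'
    order by gene.gene_id
""")]
print(plasmodium_genes)

# A gene that points at a non-existent organism is rejected.
try:
    conn.execute("insert into gene values (7, 99, 'BOGUS')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)
```

The primary key identifies each row; the foreign key is what lets the join find the organism a gene belongs to.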


A conventional genomic database

         Alongside gene, each further kind of entity has its own table:
         transcript, exon, chromosome, &c.


The core of Chado

                              Organism
                              organism_id
                              genus
                              species

                              Feature
                              feature_id
                              organism_id
                              type_id
                              uniquename
                              name
                              residues

                              CV
                              cv_id
                              name

                              CVterm
                              cvterm_id
                              cv_id
                              name

Connecting to the database


                  • Make sure you have an account on the database,
                  • Log onto pcs4,
                  • Type “chado”,
                  • Enter your database password.



Connecting to the database

            Welcome to psql 8.2.5, the PostgreSQL interactive terminal.

            Type:  \copyright for distribution terms
                   \h for help with SQL commands
                   \? for help with psql commands
                   \g or terminate with semicolon to execute query
                   \q to quit

            malaria_workshop=>



Example queries

                     \d cv
                         \d for ‘describe’

                     select * from cv;
                         * means ‘all columns’
                         cv is the name of the table
                         Queries end with a semicolon
Example queries

                     \d cvterm

                     select name from cvterm
                     where cv_id = 10;

                         Just seeing the terms like this is pretty baffling.
                         If you want to understand the structure of the
                         ontology better, you can download OBO-Edit
                         from oboedit.org, and the sequence ontology
                         from sequenceontology.org
Example queries
                     select name from cvterm
                     where cv_id = 10;

                     select cvterm.name
                     from cvterm
                     join cv on cv.cv_id = cvterm.cv_id
                     where cv.name = 'sequence';

                     select species from organism where
                     genus = 'Staphylococcus';
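
The first two queries above can be compared side by side in a small sketch. It assumes a toy cv and cvterm loaded into an in-memory SQLite database through Python's sqlite3 module (a stand-in for the real PostgreSQL server); the cv_id of 10 for 'sequence' simply mirrors the slide, and the few terms loaded are illustrative only.

```python
import sqlite3

# Toy copies of the cv and cvterm tables (assumption: simplified columns).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table cv (cv_id integer primary key, name text not null unique);
    create table cvterm (
        cvterm_id integer primary key,
        cv_id     integer not null references cv (cv_id),
        name      text not null
    );
    insert into cv values (10, 'sequence'), (11, 'relationship');
    insert into cvterm values
        (1, 10, 'gene'), (2, 10, 'pseudogene'), (3, 10, 'exon'),
        (4, 11, 'part_of');
""")

# Hard-coded numeric id: works only if you already know that 10 is 'sequence'.
by_id = conn.execute(
    "select name from cvterm where cv_id = 10 order by name"
).fetchall()

# Join on cv and look the vocabulary up by name instead.
by_name = conn.execute("""
    select cvterm.name
    from cvterm
    join cv on cv.cv_id = cvterm.cv_id
    where cv.name = ?
    order by cvterm.name
""", ("sequence",)).fetchall()

print(by_name)
```

Both forms return the same rows here. The join is the more portable of the two, since the numeric cv_id may differ from one database to another while the name 'sequence' does not.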

Count the genes in MRSA252
               select count(*)
               from feature gene
               where gene.type_id in (
                  select cvterm.cvterm_id
                  from cvterm
                  join cv on cv.cv_id = cvterm.cv_id
                  where cv.name = 'sequence'
                  and cvterm.name = 'gene'
               )
               and gene.organism_id in (
                  select organism_id
                  from organism
                  where genus = 'Staphylococcus'
                  and species = 'aureus (MRSA252)'
               );




Editing queries


                   • Now type \e (for “edit”),
                   • Change “gene” to “pseudogene”,
                   • The query will run again, and count the
                        pseudogenes.
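
As a rough sketch of the same idea outside psql, the counting query can be wrapped in a function whose feature type is a parameter, so that switching from genes to pseudogenes needs no editing at all. The toy tables, the helper names build_toy_chado and count_features, and the feature called toy_pseudogene_1 are all hypothetical; an in-memory SQLite database stands in for the real Chado instance.

```python
import sqlite3

def build_toy_chado():
    """Create a tiny in-memory imitation of the core Chado tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        create table organism (organism_id integer primary key, genus text, species text);
        create table cv (cv_id integer primary key, name text);
        create table cvterm (cvterm_id integer primary key, cv_id integer, name text);
        create table feature (
            feature_id  integer primary key,
            organism_id integer,
            type_id     integer,
            uniquename  text
        );
        insert into organism values
            (1, 'Staphylococcus', 'aureus (MRSA252)'),
            (2, 'Plasmodium',     'falciparum');
        insert into cv values (10, 'sequence');
        insert into cvterm values (1, 10, 'gene'), (2, 10, 'pseudogene');
        insert into feature values
            (1, 1, 1, 'SAR0001'), (2, 1, 1, 'SAR0002'),
            (3, 1, 2, 'toy_pseudogene_1'),
            (4, 2, 1, 'PFA00005');
    """)
    return conn

def count_features(conn, type_name, genus, species):
    """Count features of one sequence-ontology type in one organism."""
    sql = """
        select count(*)
        from feature f
        where f.type_id in (
            select cvterm.cvterm_id
            from cvterm
            join cv on cv.cv_id = cvterm.cv_id
            where cv.name = 'sequence'
            and cvterm.name = ?
        )
        and f.organism_id in (
            select organism_id
            from organism
            where genus = ? and species = ?
        )
    """
    return conn.execute(sql, (type_name, genus, species)).fetchone()[0]

conn = build_toy_chado()
gene_count = count_features(conn, "gene", "Staphylococcus", "aureus (MRSA252)")
pseudogene_count = count_features(conn, "pseudogene", "Staphylococcus", "aureus (MRSA252)")
print(gene_count, pseudogene_count)
```

The counts refer only to the toy rows loaded above, not to the real MRSA252 annotation.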




More Chado tables
                   • Locations are stored in the table featureloc.

                            Featureloc
                            featureloc_id
                            feature_id        refers to the gene
                            srcfeature_id     refers to the chromosome
                            fmin          }   interbase coordinates
                            fmax          }
                            strand            1 (forward) or -1 (reverse)
                            locgroup      }   both 0 for the principal location
                            rank          }



More Chado tables
                                     Interbase coordinates

                    ACGGTCCATACGGTCCATACGGTCCATCGGTTA

                     A C G G T C
                    0 1 2 3 4 5  etc.

                    Example range: 13–18
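
Interbase coordinates number the gaps between bases rather than the bases themselves, which is the same convention Python uses for slicing. A minimal sketch, using the sequence and the 13–18 range from the slide (the helper names are hypothetical):

```python
sequence = "ACGGTCCATACGGTCCATACGGTCCATCGGTTA"

def interbase_slice(seq, fmin, fmax):
    """Return the bases lying between interbase positions fmin and fmax."""
    return seq[fmin:fmax]

def to_one_based(fmin, fmax):
    """Convert an interbase range to a 1-based inclusive (start, end) pair."""
    return fmin + 1, fmax

region = interbase_slice(sequence, 13, 18)
print(region)                # the bases between positions 13 and 18
print(len(region))           # equals fmax - fmin
print(to_one_based(13, 18))  # the same range in base-numbered coordinates
```

Because the length of a feature is simply fmax - fmin, with no "+ 1" correction, length calculations in SQL stay simple.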



Location example

               Find the mean gene length of MRSA252 genes on the forward strand.

               select avg(geneloc.fmax - geneloc.fmin)
               from feature gene
               join featureloc geneloc
                   on geneloc.feature_id = gene.feature_id
               where gene.type_id in (
                 select cvterm.cvterm_id
                 from cvterm
                 join cv on cv.cv_id = cvterm.cv_id
                 where cv.name = 'sequence'
                 and cvterm.name = 'gene'
               )
               and gene.organism_id in (
                 select organism_id
                 from organism
                 where genus = 'Staphylococcus'
                 and species = 'aureus (MRSA252)'
               )
               and geneloc.locgroup = 0
               and geneloc.rank = 0
               and geneloc.strand = 1;
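
To see what the filters on locgroup, rank and strand do, the query can be run against a handful of made-up rows. This is only a sketch: the feature names and coordinates are invented, the table definitions are cut down to the columns the query touches, and in-memory SQLite (via Python's sqlite3) stands in for PostgreSQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table organism (organism_id integer primary key, genus text, species text);
    create table cv (cv_id integer primary key, name text);
    create table cvterm (cvterm_id integer primary key, cv_id integer, name text);
    create table feature (feature_id integer primary key, organism_id integer,
                          type_id integer, uniquename text);
    create table featureloc (featureloc_id integer primary key, feature_id integer,
                             srcfeature_id integer, fmin integer, fmax integer,
                             strand integer, locgroup integer, rank integer);
    insert into organism values (1, 'Staphylococcus', 'aureus (MRSA252)');
    insert into cv values (10, 'sequence');
    insert into cvterm values (1, 10, 'gene'), (2, 10, 'chromosome');
    insert into feature values
        (1, 1, 2, 'toy_chromosome'),
        (2, 1, 1, 'toy_gene_a'), (3, 1, 1, 'toy_gene_b'), (4, 1, 1, 'toy_gene_c');
    insert into featureloc values
        (1, 2, 1,  100,  400,  1, 0, 0),   -- forward strand, length 300
        (2, 3, 1,  500, 1000,  1, 0, 0),   -- forward strand, length 500
        (3, 3, 1,    0, 9000,  1, 1, 0),   -- secondary location (locgroup 1), ignored
        (4, 4, 1, 2000, 2100, -1, 0, 0);   -- reverse strand, ignored
""")

mean_length = conn.execute("""
    select avg(geneloc.fmax - geneloc.fmin)
    from feature gene
    join featureloc geneloc
        on geneloc.feature_id = gene.feature_id
    where gene.type_id in (
      select cvterm.cvterm_id
      from cvterm
      join cv on cv.cv_id = cvterm.cv_id
      where cv.name = 'sequence'
      and cvterm.name = 'gene'
    )
    and gene.organism_id in (
      select organism_id
      from organism
      where genus = 'Staphylococcus'
      and species = 'aureus (MRSA252)'
    )
    and geneloc.locgroup = 0
    and geneloc.rank = 0
    and geneloc.strand = 1
""").fetchone()[0]
print(mean_length)
```

Only the two principal forward-strand locations contribute, so the mean is taken over lengths 300 and 500.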

Another location example

               How many genes are on each chromosome in Plasmodium falciparum?

               select chromosome.uniquename as chromosome
                    , count(*) as "number of genes"
               from feature gene
               join featureloc geneloc
                   on geneloc.feature_id = gene.feature_id
               join feature chromosome
                   on geneloc.srcfeature_id = chromosome.feature_id
               where gene.type_id in (
                 select cvterm.cvterm_id
                 from cvterm
                 join cv on cv.cv_id = cvterm.cv_id
                 where cv.name = 'sequence'
                 and cvterm.name = 'gene'
               )
               and gene.organism_id in (
                 select organism_id
                 from organism
                 where genus = 'Plasmodium'
                 and species = 'falciparum'
               )
               and geneloc.locgroup = 0
               and geneloc.rank = 0
               group by chromosome.uniquename
               ;
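
The grouping can be checked on a toy data set. In this sketch the chromosome and gene names are invented, the tables hold only the columns the query needs, and in-memory SQLite stands in for PostgreSQL. An "order by" clause has been added to the slide's query purely to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table organism (organism_id integer primary key, genus text, species text);
    create table cv (cv_id integer primary key, name text);
    create table cvterm (cvterm_id integer primary key, cv_id integer, name text);
    create table feature (feature_id integer primary key, organism_id integer,
                          type_id integer, uniquename text);
    create table featureloc (featureloc_id integer primary key, feature_id integer,
                             srcfeature_id integer, fmin integer, fmax integer,
                             strand integer, locgroup integer, rank integer);
    insert into organism values (2, 'Plasmodium', 'falciparum');
    insert into cv values (10, 'sequence');
    insert into cvterm values (1, 10, 'gene'), (2, 10, 'chromosome');
    insert into feature values
        (1, 2, 2, 'toy_chr_1'), (2, 2, 2, 'toy_chr_2'),
        (3, 2, 1, 'toy_gene_a'), (4, 2, 1, 'toy_gene_b'), (5, 2, 1, 'toy_gene_c');
    insert into featureloc values
        (1, 3, 1, 100, 400,  1, 0, 0),   -- toy_gene_a on toy_chr_1
        (2, 4, 1, 500, 900, -1, 0, 0),   -- toy_gene_b on toy_chr_1
        (3, 5, 2, 100, 300,  1, 0, 0);   -- toy_gene_c on toy_chr_2
""")

genes_per_chromosome = conn.execute("""
    select chromosome.uniquename as chromosome
         , count(*) as "number of genes"
    from feature gene
    join featureloc geneloc
        on geneloc.feature_id = gene.feature_id
    join feature chromosome
        on geneloc.srcfeature_id = chromosome.feature_id
    where gene.type_id in (
      select cvterm.cvterm_id
      from cvterm
      join cv on cv.cv_id = cvterm.cv_id
      where cv.name = 'sequence'
      and cvterm.name = 'gene'
    )
    and gene.organism_id in (
      select organism_id
      from organism
      where genus = 'Plasmodium'
      and species = 'falciparum'
    )
    and geneloc.locgroup = 0
    and geneloc.rank = 0
    group by chromosome.uniquename
    order by chromosome.uniquename
""").fetchall()
print(genes_per_chromosome)
```

Note that the feature table is joined twice under different aliases: once as the gene and once as the chromosome the gene is located on.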



Transcripts and exons

                            Feature_relationship
                            subject_id    }   feature
                            object_id     }
                            type_id           cvterm


                     • Each exon is related to a transcript,
                     • Each transcript is related to a gene,
                     • Each polypeptide is related to a transcript,
                     • Annotation is attached to the polypeptide.
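
The chain of relationships in the list above can be walked with two joins through feature_relationship. The sketch below is a simplification under several assumptions: the relationship names 'part_of' and 'derives_from' are not given on the slide, the feature names are invented, cvterm is reduced to a single flat table, and in-memory SQLite stands in for PostgreSQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table cvterm (cvterm_id integer primary key, name text);
    create table feature (feature_id integer primary key, type_id integer, uniquename text);
    create table feature_relationship (
        subject_id integer references feature (feature_id),
        object_id  integer references feature (feature_id),
        type_id    integer references cvterm (cvterm_id)
    );
    insert into cvterm values
        (1, 'gene'), (2, 'mRNA'), (3, 'exon'), (4, 'polypeptide'),
        (5, 'part_of'), (6, 'derives_from');
    insert into feature values
        (1, 1, 'toy_gene'),
        (2, 2, 'toy_gene:mRNA'),
        (3, 3, 'toy_gene:exon:1'),
        (4, 3, 'toy_gene:exon:2'),
        (5, 4, 'toy_gene:pep');
    insert into feature_relationship values
        (3, 2, 5), (4, 2, 5),   -- each exon is related to the transcript
        (2, 1, 5),              -- the transcript is related to the gene
        (5, 2, 6);              -- the polypeptide is related to the transcript
""")

def exons_of_gene(conn, gene_uniquename):
    """Walk gene <- transcript <- exon and return the exon names."""
    rows = conn.execute("""
        select exon.uniquename
        from feature gene
        join feature_relationship transcript_gene
            on transcript_gene.object_id = gene.feature_id
        join feature transcript
            on transcript.feature_id = transcript_gene.subject_id
        join feature_relationship exon_transcript
            on exon_transcript.object_id = transcript.feature_id
        join feature exon
            on exon.feature_id = exon_transcript.subject_id
        join cvterm exon_type
            on exon_type.cvterm_id = exon.type_id
        where gene.uniquename = ?
        and exon_type.name = 'exon'
        order by exon.uniquename
    """, (gene_uniquename,)).fetchall()
    return [row[0] for row in rows]

exons = exons_of_gene(conn, "toy_gene")
print(exons)
```

The subject of each relationship is the smaller or derived feature and the object is the one it relates to, so the joins run from object_id down to subject_id at each step. The filter on the exon type keeps the polypeptide, which also hangs off the transcript, out of the result.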


Annotation

                            Feature_cvterm       Products
                            feature_id
                            cvterm_id

                            Featureprop          Most other things
                            feature_id
                            type_id
                            value




Lots more examples




                   • Live and direct!




Trends In Paid Search: Navigating The Digital Landscape In 2024
 
5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary
 
ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd
 
Getting into the tech field. what next
Getting into the tech field. what next Getting into the tech field. what next
Getting into the tech field. what next
 
Google's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentGoogle's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search Intent
 
How to have difficult conversations
How to have difficult conversations How to have difficult conversations
How to have difficult conversations
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
 
Time Management & Productivity - Best Practices
Time Management & Productivity -  Best PracticesTime Management & Productivity -  Best Practices
Time Management & Productivity - Best Practices
 
The six step guide to practical project management
The six step guide to practical project managementThe six step guide to practical project management
The six step guide to practical project management
 
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
 
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
 
12 Ways to Increase Your Influence at Work
12 Ways to Increase Your Influence at Work12 Ways to Increase Your Influence at Work
12 Ways to Increase Your Influence at Work
 
ChatGPT webinar slides
ChatGPT webinar slidesChatGPT webinar slides
ChatGPT webinar slides
 
More than Just Lines on a Map: Best Practices for U.S Bike Routes
More than Just Lines on a Map: Best Practices for U.S Bike RoutesMore than Just Lines on a Map: Best Practices for U.S Bike Routes
More than Just Lines on a Map: Best Practices for U.S Bike Routes
 

Querying Chado.Key

  • 2. Overview
      • Relational databases
      • Chado
      • Writing queries
      • Saving the results
      • More examples
  • 3. Relational database
      • Data are organised in tables:
      • The columns of the table represent attributes,
      • The rows represent entities.
  • 4–10. A conventional genomic database

      organism
        organism_id   genus            species      strain
        1             Staphylococcus   aureus       MRSA252
        2             Plasmodium       falciparum   3D7
        3             Schistosoma      mansoni

      gene
        gene_id   organism_id   locus_tag
        1         1             SAR0001
        2         1             SAR0002
        3         2             PFA00005
        4         2             PFB00010
        5         3             Smp_012250
        6         3             Smp_152680

      • Primary key: gene_id
      • Foreign key: organism_id
      • Attribute: locus_tag
      • Further tables: transcript, exon, chromosome, &c.
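The primary-key and foreign-key idea on these slides can be tried out without access to the workshop database. The following is a minimal sketch, assuming an in-memory SQLite database (via Python's standard `sqlite3` module) as a stand-in for PostgreSQL; the rows are the ones shown on the slides, and the final rejected insert is an invented example.

```python
import sqlite3

# In-memory SQLite stands in for the real PostgreSQL server (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked
conn.executescript("""
    create table organism (
        organism_id integer primary key,
        genus       text not null,
        species     text not null,
        strain      text
    );
    create table gene (
        gene_id     integer primary key,
        organism_id integer not null references organism (organism_id),
        locus_tag   text not null
    );
    insert into organism values
        (1, 'Staphylococcus', 'aureus', 'MRSA252'),
        (2, 'Plasmodium', 'falciparum', '3D7'),
        (3, 'Schistosoma', 'mansoni', null);
    insert into gene values
        (1, 1, 'SAR0001'), (2, 1, 'SAR0002'),
        (3, 2, 'PFA00005'), (4, 2, 'PFB00010'),
        (5, 3, 'Smp_012250'), (6, 3, 'Smp_152680');
""")

# Follow the foreign key from gene to organism.
rows = conn.execute("""
    select gene.locus_tag, organism.genus, organism.species
    from gene
    join organism on organism.organism_id = gene.organism_id
    where organism.genus = 'Plasmodium'
    order by gene.gene_id
""").fetchall()
print(rows)

# A gene pointing at a non-existent organism (invented row) is rejected.
try:
    conn.execute("insert into gene values (7, 99, 'XYZ0001')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)
```

The join returns the two Plasmodium locus tags, and the last insert fails because organism 99 does not exist, which is the guarantee a foreign key provides.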
  • 11–12. The core of Chado
      Organism:  organism_id, genus, species
      CV:        cv_id, name
      CVterm:    cvterm_id, cv_id, name
      Feature:   feature_id, organism_id, type_id, uniquename, name, residues
  • 13–14. Connecting to the database
      • Make sure you have an account on the database,
      • Log onto pcs4,
      • Type “chado”,
      • Enter your database password.

      Welcome to psql 8.2.5, the PostgreSQL interactive terminal.

      Type:  \copyright for distribution terms
             \h for help with SQL commands
             \? for help with psql commands
             \g or terminate with semicolon to execute query
             \q to quit

      malaria_workshop=>
  • 15–23. Example queries
      \d cv
          \d is for ‘describe’.
      select * from cv;
          * means ‘all columns’; cv is the name of the table; queries end with a semicolon.
      \d cvterm
      select name from cvterm where cv_id = 10;
          Just seeing the terms like this is pretty baffling. If you want to understand the structure of the ontology better, you can download OBO-Edit from oboedit.org, and the sequence ontology from sequenceontology.org.
  • 24–25. Example queries
      select name from cvterm where cv_id = 10;

      select cvterm.name
      from cvterm
      join cv on cv.cv_id = cvterm.cv_id
      where cv.name = 'sequence';

      select species from organism where genus = 'Staphylococcus';
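The two cvterm queries on these slides should return the same terms; the join simply looks the vocabulary up by name instead of relying on a numeric ID. Here is a small sketch of that, assuming an in-memory SQLite database in place of the real Chado instance. The `cv_id` of 10 follows the slides, but the `cvterm_id` values, the second vocabulary, and the handful of terms are made up for illustration, and `order by` is added only to make the output stable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table cv (cv_id integer primary key, name text not null);
    create table cvterm (
        cvterm_id integer primary key,
        cv_id     integer not null references cv (cv_id),
        name      text not null
    );
    -- Hypothetical IDs and terms, for illustration only.
    insert into cv values (10, 'sequence'), (11, 'relationship');
    insert into cvterm values
        (100, 10, 'gene'), (101, 10, 'pseudogene'),
        (102, 10, 'exon'), (200, 11, 'part_of');
""")

# Look the vocabulary up by its numeric ID...
by_id = conn.execute(
    "select name from cvterm where cv_id = 10 order by name"
).fetchall()

# ...or by its name, joining cvterm to cv.
by_name = conn.execute("""
    select cvterm.name
    from cvterm
    join cv on cv.cv_id = cvterm.cv_id
    where cv.name = 'sequence'
    order by cvterm.name
""").fetchall()

print(by_id == by_name, [name for (name,) in by_name])
```

The join form is usually the safer one to copy between databases, since numeric IDs such as 10 need not be the same from one Chado instance to another.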
  • 26. Count the genes in MRSA252
      select count(*)
      from feature gene
      where gene.type_id in (
              select cvterm.cvterm_id
              from cvterm
              join cv on cv.cv_id = cvterm.cv_id
              where cv.name = 'sequence'
              and cvterm.name = 'gene'
      )
      and gene.organism_id in (
              select organism_id
              from organism
              where genus = 'Staphylococcus'
              and species = 'aureus (MRSA252)'
      );
  • 27. Editing queries
      • Now type \e (for “edit”),
      • Change “gene” to “pseudogene”,
      • The query will run again, and count the pseudogenes.
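Editing the query by hand to swap “gene” for “pseudogene” amounts to changing one parameter. As a rough sketch of the same idea outside psql, the counting query can be wrapped in a function with placeholders. This assumes an in-memory SQLite database with a cut-down toy schema; `build_toy_chado`, `count_features` and all the rows are hypothetical names and data, not part of Chado or the slides.

```python
import sqlite3

def build_toy_chado():
    """Create a tiny in-memory stand-in for the Chado core tables (toy data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        create table organism (organism_id integer primary key, genus text, species text);
        create table cv (cv_id integer primary key, name text);
        create table cvterm (cvterm_id integer primary key, cv_id integer, name text);
        create table feature (
            feature_id integer primary key,
            organism_id integer, type_id integer, uniquename text
        );
        insert into organism values
            (1, 'Staphylococcus', 'aureus (MRSA252)'),
            (2, 'Plasmodium', 'falciparum');
        insert into cv values (10, 'sequence');
        insert into cvterm values (100, 10, 'gene'), (101, 10, 'pseudogene');
        insert into feature values
            (1, 1, 100, 'SAR0001'), (2, 1, 100, 'SAR0002'),
            (3, 1, 101, 'toy_pseudogene_1'), (4, 2, 100, 'PFA00005');
    """)
    return conn

# Same shape as the slide's query, with the literals replaced by placeholders.
COUNT_SQL = """
    select count(*)
    from feature
    where feature.type_id in (
        select cvterm.cvterm_id
        from cvterm
        join cv on cv.cv_id = cvterm.cv_id
        where cv.name = 'sequence' and cvterm.name = ?
    )
    and feature.organism_id in (
        select organism_id from organism where genus = ? and species = ?
    )
"""

def count_features(conn, type_name, genus, species):
    """Count features of one sequence-ontology type in one organism."""
    (n,) = conn.execute(COUNT_SQL, (type_name, genus, species)).fetchone()
    return n

conn = build_toy_chado()
print(count_features(conn, "gene", "Staphylococcus", "aureus (MRSA252)"))
print(count_features(conn, "pseudogene", "Staphylococcus", "aureus (MRSA252)"))
```

Passing the type name as a bound parameter avoids retyping the query and sidesteps quoting mistakes.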
  • 28–30. More Chado tables
      • Locations are stored in the table featureloc.

      Featureloc
        featureloc_id
        feature_id       refers to the gene
        srcfeature_id    refers to the chromosome
        fmin, fmax       interbase coordinates
        strand           1 (forward) or -1 (reverse)
        locgroup, rank   both 0 for the principal location

      Interbase coordinates [figure]: ACGGTCCATACGGTCCATACGGTCCATCGGTTA, positions 0 1 2 3 4 5 etc., example range 13–18
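Interbase coordinates count the gaps between bases rather than the bases themselves, which happens to be how Python slices strings. The sketch below uses the sequence from the slide and assumes that the slide's 13–18 stands for fmin = 13, fmax = 18; the helper names are hypothetical.

```python
# Sequence shown on the slide.
SEQ = "ACGGTCCATACGGTCCATACGGTCCATCGGTTA"

def feature_length(fmin, fmax):
    """Length in bases of an interbase interval."""
    return fmax - fmin

def one_based(fmin, fmax):
    """Convert interbase (fmin, fmax) to 1-based inclusive (start, end)."""
    return fmin + 1, fmax

def extract(seq, fmin, fmax, strand=1):
    """Return the bases between fmin and fmax, reverse-complemented if strand is -1."""
    sub = seq[fmin:fmax]
    if strand == -1:
        sub = sub.translate(str.maketrans("ACGT", "TGCA"))[::-1]
    return sub

print(feature_length(13, 18))      # 5
print(one_based(13, 18))           # (14, 18)
print(extract(SEQ, 13, 18))        # TCCAT
print(extract(SEQ, 13, 18, -1))    # ATGGA
```

Because fmax - fmin is the length directly, queries over featureloc need no "+ 1" correction.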
  • 31–32. Location example
      Find the mean gene length of MRSA252 genes on the forward strand.

      select avg(geneloc.fmax - geneloc.fmin)
      from feature gene
      join featureloc geneloc on geneloc.feature_id = gene.feature_id
      where gene.type_id in (
              select cvterm.cvterm_id
              from cvterm
              join cv on cv.cv_id = cvterm.cv_id
              where cv.name = 'sequence'
              and cvterm.name = 'gene'
      )
      and gene.organism_id in (
              select organism_id
              from organism
              where genus = 'Staphylococcus'
              and species = 'aureus (MRSA252)'
      )
      and geneloc.locgroup = 0
      and geneloc.rank = 0
      and geneloc.strand = 1;
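To see what the mean-length query does on data small enough to check by hand, it can be run against a toy database. This is a sketch only: SQLite in memory stands in for PostgreSQL, the schema is cut down to the columns the query touches, and the chromosome, genes and coordinates are all invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table organism (organism_id integer primary key, genus text, species text);
    create table cv (cv_id integer primary key, name text);
    create table cvterm (cvterm_id integer primary key, cv_id integer, name text);
    create table feature (
        feature_id integer primary key,
        organism_id integer, type_id integer, uniquename text
    );
    create table featureloc (
        featureloc_id integer primary key,
        feature_id integer, srcfeature_id integer,
        fmin integer, fmax integer, strand integer,
        locgroup integer, rank integer
    );
    -- Toy rows: one chromosome and three genes (coordinates are invented).
    insert into organism values (1, 'Staphylococcus', 'aureus (MRSA252)');
    insert into cv values (10, 'sequence');
    insert into cvterm values (100, 10, 'gene'), (102, 10, 'chromosome');
    insert into feature values
        (1, 1, 102, 'toy_chromosome'),
        (2, 1, 100, 'toy_gene_a'),
        (3, 1, 100, 'toy_gene_b'),
        (4, 1, 100, 'toy_gene_c');
    insert into featureloc values
        (1, 2, 1,    0,  300,  1, 0, 0),   -- forward, length 300
        (2, 3, 1,  500, 1100,  1, 0, 0),   -- forward, length 600
        (3, 4, 1, 2000, 2900, -1, 0, 0);   -- reverse, excluded by the query
""")

# The query from the slide, unchanged apart from layout.
mean_length = conn.execute("""
    select avg(geneloc.fmax - geneloc.fmin)
    from feature gene
    join featureloc geneloc on geneloc.feature_id = gene.feature_id
    where gene.type_id in (
        select cvterm.cvterm_id
        from cvterm
        join cv on cv.cv_id = cvterm.cv_id
        where cv.name = 'sequence' and cvterm.name = 'gene'
    )
    and gene.organism_id in (
        select organism_id from organism
        where genus = 'Staphylococcus' and species = 'aureus (MRSA252)'
    )
    and geneloc.locgroup = 0
    and geneloc.rank = 0
    and geneloc.strand = 1
""").fetchone()[0]

print(mean_length)  # 450.0
```

Only the two forward-strand genes contribute, so the result is (300 + 600) / 2.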
  • 33–34. Another location example
      How many genes are on each chromosome in Plasmodium falciparum?

      select chromosome.uniquename as chromosome
           , count(*) as "number of genes"
      from feature gene
      join featureloc geneloc on geneloc.feature_id = gene.feature_id
      join feature chromosome on geneloc.srcfeature_id = chromosome.feature_id
      where gene.type_id in (
              select cvterm.cvterm_id
              from cvterm
              join cv on cv.cv_id = cvterm.cv_id
              where cv.name = 'sequence'
              and cvterm.name = 'gene'
      )
      and gene.organism_id in (
              select organism_id
              from organism
              where genus = 'Plasmodium'
              and species = 'falciparum'
      )
      and geneloc.locgroup = 0
      and geneloc.rank = 0
      group by chromosome.uniquename
      ;
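The per-chromosome count joins the feature table to itself: once as the gene and once as the chromosome the gene is located on. The sketch below runs the slide's query on invented data, again assuming SQLite in memory as a stand-in. The chromosome names and `toy_gene_x` are made up, and an `order by` has been added (it is not on the slide) so the output order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table organism (organism_id integer primary key, genus text, species text);
    create table cv (cv_id integer primary key, name text);
    create table cvterm (cvterm_id integer primary key, cv_id integer, name text);
    create table feature (
        feature_id integer primary key,
        organism_id integer, type_id integer, uniquename text
    );
    create table featureloc (
        featureloc_id integer primary key,
        feature_id integer, srcfeature_id integer,
        fmin integer, fmax integer, strand integer,
        locgroup integer, rank integer
    );
    -- Toy rows: two chromosomes, three genes (names and coordinates invented).
    insert into organism values (2, 'Plasmodium', 'falciparum');
    insert into cv values (10, 'sequence');
    insert into cvterm values (100, 10, 'gene'), (102, 10, 'chromosome');
    insert into feature values
        (1, 2, 102, 'toy_chr_1'),
        (2, 2, 102, 'toy_chr_2'),
        (3, 2, 100, 'PFA00005'),
        (4, 2, 100, 'toy_gene_x'),
        (5, 2, 100, 'PFB00010');
    insert into featureloc values
        (1, 3, 1, 100,  400,  1, 0, 0),
        (2, 4, 1, 900, 1500, -1, 0, 0),
        (3, 5, 2, 200,  800,  1, 0, 0);
""")

cursor = conn.execute("""
    select chromosome.uniquename as chromosome
         , count(*) as "number of genes"
    from feature gene
    join featureloc geneloc on geneloc.feature_id = gene.feature_id
    join feature chromosome on geneloc.srcfeature_id = chromosome.feature_id
    where gene.type_id in (
        select cvterm.cvterm_id
        from cvterm
        join cv on cv.cv_id = cvterm.cv_id
        where cv.name = 'sequence' and cvterm.name = 'gene'
    )
    and gene.organism_id in (
        select organism_id from organism
        where genus = 'Plasmodium' and species = 'falciparum'
    )
    and geneloc.locgroup = 0
    and geneloc.rank = 0
    group by chromosome.uniquename
    order by chromosome.uniquename   -- added for a stable output order
""")
columns = [d[0] for d in cursor.description]
counts = cursor.fetchall()
print(columns, counts)
```

The double quotes around "number of genes" are what allow a column alias containing spaces.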
  • 35. Transcripts and exons
      Feature_relationship
        subject_id, object_id   refer to feature
        type_id                 refers to cvterm

      • Each exon is related to a transcript,
      • Each transcript is related to a gene,
      • Each polypeptide is related to a transcript,
      • Annotation is attached to the polypeptide.
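Getting from a gene to its exons therefore takes two hops through feature_relationship: gene to transcript, then transcript to exon. The following is a sketch under several assumptions: SQLite in memory stands in for PostgreSQL; cvterm is simplified to an ID and a name (no cv table); the part is taken to be the subject and the whole the object of each relationship; and the feature names, the 'mRNA' type and the 'part_of' relationship type are toy values chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table cvterm (cvterm_id integer primary key, name text);
    create table feature (feature_id integer primary key, type_id integer, uniquename text);
    create table feature_relationship (
        feature_relationship_id integer primary key,
        subject_id integer, object_id integer, type_id integer
    );
    -- Toy terms and features, for illustration only.
    insert into cvterm values
        (100, 'gene'), (103, 'mRNA'), (104, 'exon'), (200, 'part_of');
    insert into feature values
        (1, 100, 'toy_gene'),
        (2, 103, 'toy_gene:mRNA'),
        (3, 104, 'toy_gene:exon:1'),
        (4, 104, 'toy_gene:exon:2');
    -- Assumed direction: subject is the part, object is the whole.
    insert into feature_relationship values
        (1, 2, 1, 200),   -- transcript part_of gene
        (2, 3, 2, 200),   -- exon part_of transcript
        (3, 4, 2, 200);   -- exon part_of transcript
""")

# Walk gene -> transcript -> exon through two relationship rows.
exons = conn.execute("""
    select exon.uniquename
    from feature gene
    join feature_relationship transcript_gene
        on transcript_gene.object_id = gene.feature_id
    join feature transcript
        on transcript.feature_id = transcript_gene.subject_id
    join feature_relationship exon_transcript
        on exon_transcript.object_id = transcript.feature_id
    join feature exon
        on exon.feature_id = exon_transcript.subject_id
    where gene.uniquename = 'toy_gene'
    and exon.type_id in (select cvterm_id from cvterm where name = 'exon')
    order by exon.uniquename
""").fetchall()

print([name for (name,) in exons])
```

Filtering on the exon's type matters in a real database, because polypeptides are also related to the transcript and would otherwise appear in the result.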
  • 36. Annotation
      Products:           Feature_cvterm (feature_id, cvterm_id)
      Most other things:  Featureprop (feature_id, type_id, value)
  • 37. Lots more examples
      • Live and direct!

Editor's notes