SPARQL UniProt.RDF




    Jerven Bolleman
      Developer
      Swiss-Prot Group
      Swiss Institute of Bioinformatics




Tuesday, December 4, 2012
A few notes before we begin


     • SPARQL 1.0
        – Somewhat useful
        – Standardized in 2008
     • SPARQL 1.1
        – Very useful
        – Currently a W3C Proposed Recommendation

     • Still finding incompatibilities
     • Or not yet implemented features



    © 2012 SIB



Raise your hand if you have questions




Tutorial plan


     • Set up TopBraid Composer
        – Skipped in talk
        – On VM
     • Gather data from the UniProt website
        – Already there.
     • Learn SPARQL
                   You do not need TopBraid Composer
                   to use UniProt RDF data or run SPARQL
                   queries.
                   You can use beta.sparql.uniprot.org
                   as well.
Download and install TopBraid Composer


     • Requirements
        – Sun/Oracle JVM
     • Go to
        – http://www.topquadrant.com/products/TB_download.html
        – Register
        – Select any edition, free is ok for today




Start TopBraid




Setting up a workspace for this tutorial


     • http://www.topquadrant.com/products/TB_download.html




New project
     • File > New Project > General




Gather data from uniprot.org website




                 • In the navigator select the new project you just made.




Gather data from uniprot.org website
  Right click on your new project.
  Select “Import” in the drop down menu




          • Import RDF or OWL file from the web


Using the same process, download core.owl




                 You can see an HTML view of this schema
                 ontology at
                 http://www.uniprot.org/core/




Gather data from uniprot.org website




             You can see an HTML view of this entry at
                http://www.uniprot.org/taxonomy/40674




Gather data from uniprot.org website


     • Open the mammalia.rdf file by double clicking




You get a very helpful dialog.
      Hit yes




It's SPARQLy mammal time!!




Let's look at a single taxon record




Let's navigate to it in TopBraid


     • Type the URI as is, with the angle brackets




Investigate the taxon record




The “Eastern Chipmunk” in Turtle




Turtle is the RDF serialization aligned with
     SPARQL

     • Shorthand to avoid typing so much
        – . ‘dot’ ends a statement
        – ; ‘semi-colon’ repeats the subject
        – , ‘comma’ repeats the subject and predicate
     • prefix
        – the part before ‘:’ abbreviates a URI




Why don’t these queries work on the web?


     • PREFIX
        – TopBraid Composer uses the prefixes defined in the
          file's “overview” tab.
        – On the web you often have to add these yourself.

                   PREFIX :<http://purl.uniprot.org/core/>
                   SELECT ?x
                   FROM <http://purl.uniprot.org/taxonomy/>
                   WHERE {?x a :Taxon}




a = rdf:type = <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>




rdfs:subClassOf
     taxon:45474 is a more specific classification than
     taxon:13712




rank => “The level, for nomenclatural purposes, of
     a taxon in a taxonomic hierarchy”




Why learn SPARQL


     • Standardized formal query language
        – implementation independent
           • SPARQL ➔ SQL (via R2RML)
           • SPARQL ➔ webservice (via SADI)
           • SPARQL ➔ LDAP (e.g. SquirrelRDF)
           • SPARQL ➔ RDF (triplestore e.g. OWLIM-SE)
           • SPARQL ➔ HADOOP/HIVE (e.g. SHARD)
        – How you query is independent of how you store!




Apparently it helps
      kill vampires !!!




Let's learn SPARQL


     • Queries over RDF data.
       – Four basic types
          • SELECT
              – Returns “tab delimited” results
          • CONSTRUCT
              – Makes new triples
          • DESCRIBE
              – Returns all triples mentioning a resource
          • ASK
              – Returns true if anything matches

SPARQL:queries triple pattern




                 taxon:9606 rdf:type core:Taxon .




SPARQL:queries triple pattern




                 ?anyTaxon rdf:type core:Taxon .




SPARQL:queries triple pattern




          SELECT ?anyTaxon
          WHERE {
            ?anyTaxon rdf:type core:Taxon .
          }




SPARQL:queries triple pattern




                 taxon:9606 rdf:type core:Taxon .
                 taxon:9606 core:reviewed "true" .




SPARQL:queries triple pattern




                 ?anyTaxon rdf:type core:Taxon .
                 ?anyTaxon core:reviewed "true" .




SPARQL:queries triple pattern




          SELECT ?anyTaxon
          WHERE {
            ?anyTaxon rdf:type core:Taxon .
            ?anyTaxon core:reviewed "true" .
          }




SPARQL:queries triple pattern




          SELECT ?anyTaxon
          WHERE {
            ?anyTaxon rdf:type core:Taxon .
            ?anyTaxin core:reviewed "true" .
          }




SPARQL:queries triple pattern




          SELECT ?anyTaxon
          WHERE {
            ?anyTaxon rdf:type core:Taxon .
            $anyTaxon core:reviewed "true" .
          }




Let's learn SPARQL




Shorthand a = rdf:type
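
A minimal sketch of the shorthand, assuming the core: namespace from the earlier PREFIX slide:

```sparql
PREFIX core: <http://purl.uniprot.org/core/>

# "a" stands for rdf:type, so this pattern means the same as
#   ?taxon rdf:type core:Taxon .
SELECT ?taxon
WHERE {
  ?taxon a core:Taxon .
}
```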




AND join (default)
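
A minimal sketch of the default join. The property core:scientificName is an assumption about the UniProt core ontology; it does not appear in the slide text.

```sparql
PREFIX core: <http://purl.uniprot.org/core/>

# Two triple patterns in one group are joined with an implicit AND:
# a row is returned only when both match for the same ?taxon.
SELECT ?taxon ?name
WHERE {
  ?taxon a core:Taxon .
  ?taxon core:scientificName ?name .
}
```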




Now you type




Remember ‘;’ shortcut
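
The Turtle shorthand also works inside a SPARQL pattern. A sketch, assuming core:scientificName exists alongside the core:rank property used elsewhere in this deck:

```sparql
PREFIX core: <http://purl.uniprot.org/core/>

# ';' repeats the subject ?taxon for each following predicate.
SELECT ?taxon ?name ?rank
WHERE {
  ?taxon a core:Taxon ;
         core:scientificName ?name ;
         core:rank ?rank .
}
```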




Two variables one output column




Optional


     • When values may be missing
        – yet interesting when they are there
     • Use as a sub query
     • Bound values from outside stay bound inside
        – ?x ?y ?z . OPTIONAL {?x ?b ?c}
           • ?x same variable = same thing
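
The bullets above can be sketched as follows. The properties core:scientificName and core:commonName are assumptions about the UniProt core ontology.

```sparql
PREFIX core: <http://purl.uniprot.org/core/>

# Every taxon with a scientific name is returned.
# ?commonName is filled in when present and left unbound otherwise.
SELECT ?taxon ?name ?commonName
WHERE {
  ?taxon a core:Taxon ;
         core:scientificName ?name .
  OPTIONAL { ?taxon core:commonName ?commonName . }
}
```

?taxon is bound by the outer pattern, so inside OPTIONAL it refers to the same resource.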




UNION


     • Allows you to combine query patterns as an OR
       operation.
     • Joins are still from outer to inner.




UNION
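
A minimal UNION sketch, assuming the properties core:scientificName and core:commonName:

```sparql
PREFIX core: <http://purl.uniprot.org/core/>

# A taxon is listed once for every name found by either branch.
# The outer ?taxon joins with whichever branch matched.
SELECT ?taxon ?name
WHERE {
  ?taxon a core:Taxon .
  { ?taxon core:scientificName ?name . }
  UNION
  { ?taxon core:commonName ?name . }
}
```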




Negation


     • When you do not want a certain category of matches.

                            SELECT ?pet
                            WHERE {
                              ?pet a pets:Friendly .
                            }




Oooops




Not exists (Negation 1)
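
A sketch of this form of negation on the taxonomy data, assuming a core:commonName property:

```sparql
PREFIX core: <http://purl.uniprot.org/core/>

# Keep only taxa for which no common name can be found.
SELECT ?taxon
WHERE {
  ?taxon a core:Taxon .
  FILTER NOT EXISTS { ?taxon core:commonName ?commonName . }
}
```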




Minus (Negation 2)
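
A sketch of the same idea written with MINUS, again assuming a core:commonName property:

```sparql
PREFIX core: <http://purl.uniprot.org/core/>

# Start from all taxa, then subtract those that have a common name.
# The shared variable ?taxon is what links the two sides.
SELECT ?taxon
WHERE {
  ?taxon a core:Taxon .
  MINUS { ?taxon core:commonName ?commonName . }
}
```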




MINUS{} or FILTER (NOT EXISTS{})


     • What's the difference?
       – MINUS subtracts results
       – NOT EXISTS tests if the sub pattern is possible at all.
          • Normally the faster option.




MINUS all data




FILTER (NOT EXISTS{}) no results
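
The difference shows up when the two sides share no variables. This pair is a sketch of that case, not necessarily the queries shown in the talk. With MINUS, all data comes back:

```sparql
# No variable is shared between the two sides, so MINUS removes nothing.
SELECT *
WHERE {
  ?s ?p ?o
  MINUS { ?x ?y ?z }
}
```

With FILTER NOT EXISTS, nothing comes back:

```sparql
# The inner pattern matches whenever the data holds at least one triple,
# so the filter rejects every row.
SELECT *
WHERE {
  ?s ?p ?o
  FILTER NOT EXISTS { ?x ?y ?z }
}
```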




Negation option 3
       SPARQL 1.0

                 SELECT ?subject ?rank
                 WHERE {
                    ?subject core:rank ?rank .
                    OPTIONAL {
                      ?subject core:rank core:Genus .
                      ?subject core:rank ?genus .
                    }
                    FILTER(! BOUND(?genus))
                 }




FILTERS


     • You just saw it twice
        – Once in the !BOUND
        – Once in the NOT EXISTS

     • FILTER prunes a result set by possibly removing values
        – FILTER does not add a value to the result
     • Inside the same graph pattern, FILTER placement is order independent.




Filter




Filter on not in
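
A sketch of a NOT IN filter. core:Genus appears elsewhere in this deck; core:Species is an assumed rank resource.

```sparql
PREFIX core: <http://purl.uniprot.org/core/>

# Drop every row whose rank is one of the listed values.
SELECT ?taxon ?rank
WHERE {
  ?taxon core:rank ?rank .
  FILTER (?rank NOT IN (core:Genus, core:Species))
}
```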




IN
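
A sketch of the IN operator, with core:Species as an assumed rank resource:

```sparql
PREFIX core: <http://purl.uniprot.org/core/>

# Keep only rows whose rank is one of the listed values.
SELECT ?taxon ?rank
WHERE {
  ?taxon core:rank ?rank .
  FILTER (?rank IN (core:Genus, core:Species))
}
```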




FILTER on numbers


     • <
        –        FILTER (1 < 2)
     • >
        –        FILTER (2 > 1)
     • =
        –        FILTER (1 = 1)
     • !=
        –        FILTER (1 != 2)



Filters


     • ?x = ?y does casting (value conversions)
        – "1.0"^^xsd:float = "1"^^xsd:int is true
     • sameTerm(?x, ?y) does not
        – sameTerm("1.0"^^xsd:float, "1"^^xsd:int) is false




FILTER on strings


     • Functions
        – STRLEN
        – SUBSTR
        – UCASE
        – LCASE
        – STRSTARTS
        – STRENDS
        – CONTAINS
        – STRBEFORE
        – STRAFTER
        – ENCODE_FOR_URI
        – CONCAT
        – langMatches
        – REGEX
        – REPLACE
        – IRI

STRLEN == String Length




CONTAINS is case sensitive: is it in there?
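
A sketch of a substring test, assuming a core:commonName property. The search string is illustrative only.

```sparql
PREFIX core: <http://purl.uniprot.org/core/>

# CONTAINS compares case-sensitively, so "Chipmunk" would not match
# "chipmunk". Lower-casing the name first makes the test case-insensitive.
SELECT ?taxon ?name
WHERE {
  ?taxon core:commonName ?name .
  FILTER (CONTAINS(LCASE(?name), "chipmunk"))
}
```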




REGEX, just like java regex
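
A sketch of a REGEX filter, assuming a core:commonName property. SPARQL takes its regular expression syntax from XPath/XQuery, which is close to Java's for simple patterns like this one.

```sparql
PREFIX core: <http://purl.uniprot.org/core/>

# The third argument "i" makes the match case-insensitive.
SELECT ?taxon ?name
WHERE {
  ?taxon core:commonName ?name .
  FILTER (REGEX(STR(?name), "^eastern", "i"))
}
```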




BIND


     • Builds new values
        – Closes the basic graph pattern
                 SELECT ?p WHERE {
                   {
                     ?taxon a :Taxon .
                   }
                   BIND (?taxon AS ?p)
                 }
     • Always declare before use.



BIND can assign any output
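
A sketch of BIND storing a function result, assuming a core:scientificName property:

```sparql
PREFIX core: <http://purl.uniprot.org/core/>

# ?name is bound by the triple pattern first, then used inside BIND.
SELECT ?taxon ?name ?length
WHERE {
  ?taxon core:scientificName ?name .
  BIND (STRLEN(?name) AS ?length)
}
```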




Aggregate functions


     • On the SELECT line
     • Limited in number
         – COUNT
         – SUM
         – AVG
         – MIN
         – MAX
         – GROUP_CONCAT
         – SAMPLE



count
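
A minimal COUNT sketch over the taxonomy data:

```sparql
PREFIX core: <http://purl.uniprot.org/core/>

# Without GROUP BY, the whole result set forms a single group.
SELECT (COUNT(?taxon) AS ?taxa)
WHERE {
  ?taxon a core:Taxon .
}
```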




SAMPLE gives back an arbitrary (not necessarily random) value




Follow the path




Path queries




Finding a grandparent using normal joins
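
A sketch using rdfs:subClassOf, which this deck uses for the taxonomy hierarchy:

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

# Two joined patterns, with ?parent as the intermediate step.
SELECT ?taxon ?grandParent
WHERE {
  ?taxon  rdfs:subClassOf ?parent .
  ?parent rdfs:subClassOf ?grandParent .
}
```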




Finding a grandparent using a path query
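
A sketch of the same lookup as a SPARQL 1.1 property path:

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

# '/' chains two predicates, so no intermediate variable is needed.
SELECT ?taxon ?grandParent
WHERE {
  ?taxon rdfs:subClassOf/rdfs:subClassOf ?grandParent .
}
```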




| is OR for predicate
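
A sketch of predicate alternatives, assuming the properties core:scientificName and core:commonName:

```sparql
PREFIX core: <http://purl.uniprot.org/core/>

# '|' matches either predicate; ?name is bound to whichever value is found.
SELECT ?taxon ?name
WHERE {
  ?taxon core:scientificName|core:commonName ?name .
}
```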




Same result with UNION




Finding any ancestor
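
A sketch using the '+' path operator, starting from taxon:45474 as used on the rdfs:subClassOf slide:

```sparql
PREFIX rdfs:  <http://www.w3.org/2000/01/rdf-schema#>
PREFIX taxon: <http://purl.uniprot.org/taxonomy/>

# '+' follows rdfs:subClassOf one or more times: parent, grandparent, and so on.
SELECT ?ancestor
WHERE {
  taxon:45474 rdfs:subClassOf+ ?ancestor .
}
```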




Can use the variable in a normal join afterwards
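
A sketch that joins the path result with an ordinary triple pattern:

```sparql
PREFIX rdfs:  <http://www.w3.org/2000/01/rdf-schema#>
PREFIX core:  <http://purl.uniprot.org/core/>
PREFIX taxon: <http://purl.uniprot.org/taxonomy/>

# ?ancestor comes from the path and is then used like any other variable.
SELECT ?ancestor ?rank
WHERE {
  taxon:45474 rdfs:subClassOf+ ?ancestor .
  ?ancestor core:rank ?rank .
}
```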




GROUP BY




GROUP BY


     • Needed for aggregate values
     • After closing the WHERE clause
        – ... WHERE {?x ?y ?z} GROUP BY ?x




GROUP BY
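
A sketch that counts taxa per rank, using the core:rank property seen earlier:

```sparql
PREFIX core: <http://purl.uniprot.org/core/>

# One output row per distinct ?rank, with the number of taxa in that group.
SELECT ?rank (COUNT(?taxon) AS ?taxa)
WHERE {
  ?taxon core:rank ?rank .
}
GROUP BY ?rank
```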




HAVING




                            I have carrot !




HAVING


     • FILTER for aggregates
     • After the GROUP BY clause
        – ... GROUP BY ?x HAVING (count(?y) > 2)
        – ... GROUP BY ?x HAVING (min(?y) = 2)
        – etc...




HAVING
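
A sketch that keeps only the groups passing a test on the aggregate:

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

# Parents with more than two direct children.
SELECT ?parent (COUNT(?child) AS ?children)
WHERE {
  ?child rdfs:subClassOf ?parent .
}
GROUP BY ?parent
HAVING (COUNT(?child) > 2)
```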




LIMIT & OFFSET




LIMIT and OFFSET

     • OFFSET skips the first x results
     • LIMIT returns no more than x results




ORDER
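
A sketch combining ORDER BY with LIMIT and OFFSET, assuming a core:scientificName property. Paging is only repeatable when the results are ordered.

```sparql
PREFIX core: <http://purl.uniprot.org/core/>

# Sort by name, skip the first 20 rows, then return at most 10.
SELECT ?taxon ?name
WHERE {
  ?taxon core:scientificName ?name .
}
ORDER BY ?name
LIMIT 10
OFFSET 20
```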




VALUES


     • Super BIND
     • Provide inline data




Examples


     • Parameter lists are between ()


                   VALUES (?annotation) {
                     (core:Disease_Annotation)
                     (core:Disulfide_Bond_Annotation)
                   }




Examples


     • UNDEF means no value at all
        – not bound
                 VALUES (?annotation ?begin) {
                   (core:Disease_Annotation UNDEF)
                   (core:Disulfide_Bond_Annotation 2)
                 }




VALUES


     • After declaring a set of values you can use them in your
       query.

                 SELECT ?comment WHERE {
                   VALUES (?annotation ?begin) {
                     (core:Disease_Annotation UNDEF)
                     (core:Disulfide_Bond_Annotation 2)
                   }
                   ?annotation rdfs:comment ?comment .
                 }


SERVICE: Using other SPARQL endpoints


     • SERVICE <URL of other endpoint>
        – Runs a sub query on the other endpoint and merges it
          back into your query.




“Life is better with friends who understand you.”




SERVICE
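
A sketch of a federated query. The endpoint URL is a hypothetical placeholder and core:scientificName is an assumed property.

```sparql
PREFIX core: <http://purl.uniprot.org/core/>

# The inner pattern is sent to the remote endpoint (placeholder URL);
# its results are joined with the local ?taxon bindings.
SELECT ?taxon ?name
WHERE {
  ?taxon a core:Taxon .
  SERVICE <http://example.org/sparql> {
    ?taxon core:scientificName ?name .
  }
}
```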




SERVICE


     • Useful
        – Quick experimenting with combining multiple
          data sources
        – Quick for queries where not too much data is sent to
          the remote endpoint

     • Slow
        – When you ask for too much data
        – When the remote endpoint is not resourced for your questions



Let's make some triples




Construction


     • CONSTRUCT
        – New triples
           • downloads RDF
        – Does not update store




New triples




Constructing an owl:sameAs between two URIs
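
A sketch of such a CONSTRUCT. The target namespace http://example.org/taxon/ is hypothetical.

```sparql
PREFIX owl:  <http://www.w3.org/2002/07/owl#>
PREFIX core: <http://purl.uniprot.org/core/>

# Build a second URI from the numeric tail of the UniProt taxon URI
# and link the two with owl:sameAs. The store itself is not changed.
CONSTRUCT { ?taxon owl:sameAs ?other }
WHERE {
  ?taxon a core:Taxon .
  FILTER (STRSTARTS(STR(?taxon), "http://purl.uniprot.org/taxonomy/"))
  BIND (IRI(CONCAT("http://example.org/taxon/",
                   STRAFTER(STR(?taxon), "http://purl.uniprot.org/taxonomy/")))
        AS ?other)
}
```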




INSERT


     • Adds data
        – like CONSTRUCT, but the new triples go into the store




Modifies data
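
A sketch of a SPARQL 1.1 Update INSERT, assuming a core:scientificName property. It needs an endpoint that accepts updates.

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX core: <http://purl.uniprot.org/core/>

# For every taxon, store its scientific name as an rdfs:label as well.
INSERT { ?taxon rdfs:label ?name }
WHERE {
  ?taxon a core:Taxon ;
         core:scientificName ?name .
}
```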




DELETE


     • Removes data
        – Matching triples are removed from the data
        – Triples can be bound using a WHERE clause




DELETE
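
A sketch of a DELETE with a WHERE clause, assuming a core:commonName property:

```sparql
PREFIX core: <http://purl.uniprot.org/core/>

# Remove the common names of all taxa whose rank is Genus.
DELETE { ?taxon core:commonName ?name }
WHERE {
  ?taxon core:rank core:Genus ;
         core:commonName ?name .
}
```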




DELETE
     INSERT

     • Single atomic operation.




Atomic operation
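
A sketch of a combined DELETE/INSERT, assuming a core:commonName property:

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX core: <http://purl.uniprot.org/core/>

# Both templates use the same WHERE bindings and are applied as one
# operation: each common name is removed and re-added as an rdfs:label.
DELETE { ?taxon core:commonName ?name }
INSERT { ?taxon rdfs:label ?name }
WHERE {
  ?taxon core:commonName ?name .
}
```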




I’m exhausted now




Questions




Learning sparql 2012 12

  • 1. SPARQL UniProt.RDF Jerven Bolleman Developer Swiss-Prot Group Swiss Institute of Bioinformatics Tuesday, December 4, 2012
  • 2. A few notes before we begin • SPARQL 1 – Some what useful – Standardized in 2008 • SPARQL 1.1 – Very useful – Currently in recommended standard • Still finding incompatibilities • Or not yet implemented features © 2012 SIB Tuesday, December 4, 2012
  • 3. Raise your hand if you have questions © 2012 SIB Tuesday, December 4, 2012
  • 4. Tutorial plan • Set up Topbraid Composer – Skipped in talk – On VM • Gather data from uniprot website – Already there. Text • Learn sparql You do not need Topbraid Composer to use UniProt RDF data or do sparql queries. You can use beta.sparql.uniprot.org as well. © 2012 SIB Tuesday, December 4, 2012
  • 5. Download and install Topbraid composer • Requirements – Sun/Oracle JVM • Go to – http://www.topquadrant.com/products/ TB_download.html – Register – Select any edition, free is ok for today © 2012 SIB Tuesday, December 4, 2012
  • 6. Start Topbraid © 2012 SIB Tuesday, December 4, 2012
  • 7. Setting up a workspace for this tutorial • http://www.topquadrant.com/products/TB_download.html © 2012 SIB Tuesday, December 4, 2012
  • 8. New project • File > New Project > General © 2012 SIB Tuesday, December 4, 2012
  • 9. Gather data from uniprot.org website • In the navigator select the new project you just made. © 2012 SIB Tuesday, December 4, 2012
  • 10. Gather data from uniprot.org website Right click on your new project. Select “Import” in the drop down menu • Import RDF or OWL file from the web © 2012 SIB Tuesday, December 4, 2012
  • 11. Using the same process, download core.owl. You can see an HTML view of this schema ontology at http://www.uniprot.org/core/
  • 12. Gather data from uniprot.org website You can see an HTML view of this entry at http://www.uniprot.org/taxonomy/40674
  • 13. Gather data from uniprot.org website • Open the mammalia.rdf file by double-clicking
  • 14. You get a very helpful dialog. Hit yes
  • 15. It's SPARQLy mammal time!!
  • 16. Let's look at a single taxon record
  • 17. Let's navigate to it in TopBraid • Type the URI as is, with the angle brackets
  • 18. Investigate the taxon record
  • 19. The “Eastern Chipmunk” in Turtle
  • 20. Turtle is the RDF serialization aligned with SPARQL • Shorthand to avoid typing so much – ‘.’ (dot) ends a statement – ‘;’ (semicolon) repeats the subject – ‘,’ (comma) repeats the subject and predicate • prefix – the part before ‘:’ is an abbreviation of a URI
  • 21. Why don’t these queries work on the web? • PREFIX – TopBraid Composer uses the prefixes defined in the file’s “Overview” tab. – On the web you often have to add these yourself. PREFIX : <http://purl.uniprot.org/core/> SELECT ?x FROM <http://purl.uniprot.org/taxonomy/> WHERE { ?x a :Taxon }
  • 22. a = rdf:type = <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
  • 23. rdfs:subClassOf: taxon:45474 is a more specific classification than taxon:13712
  • 24. rank => “The level, for nomenclatural purposes, of a taxon in a taxonomic hierarchy”
  • 25. Why learn SPARQL • Standardized formal query language – implementation independent • SPARQL ➔ SQL (via R2RML) • SPARQL ➔ web service (via SADI) • SPARQL ➔ LDAP (e.g. SquirrelRDF) • SPARQL ➔ RDF (triplestore, e.g. OWLIM-SE) • SPARQL ➔ Hadoop/Hive (e.g. SHARD) – How you query is independent of how you store!
  • 26. Apparently it helps kill vampires!!!
  • 27. Let's learn SPARQL • Queries over RDF data. – Four basic types • SELECT – Returns “tab delimited” results • CONSTRUCT – Makes new triples • DESCRIBE – Returns all triples mentioning a resource • ASK – Returns true if anything matches
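
Of the four query types, ASK is the smallest to try out. The query below is a minimal sketch, not taken from the slides: it assumes the `taxon:` prefix expands to `http://purl.uniprot.org/taxonomy/` and that the taxonomy data is loaded in the store you are querying.

```sparql
PREFIX rdf:   <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX core:  <http://purl.uniprot.org/core/>
PREFIX taxon: <http://purl.uniprot.org/taxonomy/>   # assumed namespace for taxon:

# ASK returns a single boolean: true if at least one match exists.
ASK {
  taxon:9606 rdf:type core:Taxon .
}
```

Swapping `ASK` for `SELECT *` or `DESCRIBE taxon:9606` turns the same idea into one of the other query forms.
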
  • 28. SPARQL queries: triple pattern taxon:9606 rdf:type core:Taxon .
  • 29. SPARQL queries: triple pattern ?anyTaxon rdf:type core:Taxon .
  • 30. SPARQL queries: triple pattern SELECT ?anyTaxon WHERE { ?anyTaxon rdf:type core:Taxon . }
  • 31. SPARQL queries: triple pattern taxon:9606 rdf:type core:Taxon . taxon:9606 core:reviewed "true" .
  • 32. SPARQL queries: triple pattern ?anyTaxon rdf:type core:Taxon . ?anyTaxon core:reviewed "true" .
  • 33. SPARQL queries: triple pattern SELECT ?anyTaxon WHERE { ?anyTaxon rdf:type core:Taxon . ?anyTaxon core:reviewed "true" . }
  • 34. SPARQL queries: triple pattern SELECT ?anyTaxon WHERE { ?anyTaxon rdf:type core:Taxon . ?anyTaxin core:reviewed "true" . }
  • 35. SPARQL queries: triple pattern SELECT ?anyTaxon WHERE { ?anyTaxon rdf:type core:Taxon . $anyTaxon core:reviewed "true" . }
  • 36. Let's learn SPARQL
  • 39. Shorthand: a = rdf:type
  • 40. AND join (default)
  • 41. Now you type
  • 42. Remember the ‘;’ shortcut
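
The two-pattern query from the earlier slides can be written more compactly with the `a` and `;` shorthands. This is a sketch, not the query shown on the slide (which was an image); the literal `"true"` follows the earlier slides, and whether the store holds a plain string or an `xsd:boolean` there depends on the data.

```sparql
PREFIX core: <http://purl.uniprot.org/core/>

SELECT ?anyTaxon
WHERE {
  # 'a' is shorthand for rdf:type; ';' repeats the subject ?anyTaxon.
  ?anyTaxon a core:Taxon ;
            core:reviewed "true" .
}
```
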
  • 43. Two variables, one output column
  • 44. OPTIONAL • When values may be missing – yet interesting when they are there • Use as a sub query • Bound values from outside stay bound inside – ?x ?y ?z . OPTIONAL { ?x ?b ?c } • ?x same variable = same thing
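
A concrete OPTIONAL could look like the sketch below. It reuses `core:rank`, which appears later in the deck; the choice of variables is illustrative.

```sparql
PREFIX core: <http://purl.uniprot.org/core/>

SELECT ?taxon ?rank
WHERE {
  ?taxon a core:Taxon .
  # Taxa without a rank are still returned; ?rank is simply left unbound.
  OPTIONAL { ?taxon core:rank ?rank . }
}
```

Because `?taxon` is bound outside the OPTIONAL block, the inner pattern only looks for ranks of that same taxon.
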
  • 46. UNION • Allows you to combine query patterns as an OR operation. • Joins are still from outer to inner.
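
As a sketch of UNION, the query below returns taxa that are either genera or species. `core:Genus` appears elsewhere in the deck; `core:Species` is assumed here to be the analogous rank resource.

```sparql
PREFIX core: <http://purl.uniprot.org/core/>

SELECT ?taxon
WHERE {
  { ?taxon core:rank core:Genus . }
  UNION
  { ?taxon core:rank core:Species . }   # core:Species is an assumption
}
```
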
  • 47. UNION
  • 48. Negation • When you do not want a certain category of matches. SELECT ?pet WHERE { ?pet a pets:Friendly . }
  • 49. Oooops
  • 50. NOT EXISTS (Negation 1)
  • 51. MINUS (Negation 2)
  • 52. MINUS {} or FILTER (NOT EXISTS {}) • What's the difference? – MINUS subtracts results – NOT EXISTS tests if the sub pattern is possible at all. • Normally the faster option.
  • 53. MINUS: all data
  • 54. FILTER (NOT EXISTS {}): no results
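
The "all data" versus "no results" contrast shows up when the negated pattern shares no variable with the outer pattern. The query below is a sketch of that situation, along the lines of the example in the SPARQL 1.1 specification rather than the exact query on the slide.

```sparql
# No variable is shared between the outer pattern and the negated pattern.
SELECT ?s ?p ?o
WHERE {
  ?s ?p ?o .
  # The inner pattern matches whenever the store holds any triple at all,
  # so every solution is filtered out.
  FILTER NOT EXISTS { ?x ?y ?z }
}
```

On a non-empty store this returns nothing. Replacing the FILTER line with `MINUS { ?x ?y ?z }` returns every triple instead, because MINUS only removes solutions that share at least one variable binding with the subtracted pattern.
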
  • 55. Negation option 3: SPARQL 1.0 SELECT ?subject ?rank WHERE { ?subject core:rank ?rank . OPTIONAL { ?subject core:rank core:Genus . ?subject core:rank ?genus . } FILTER (!BOUND(?genus)) }
  • 57. FILTERS • You just saw it twice – Once in the !BOUND – Once in the NOT EXISTS • A FILTER restricts a result set by possibly removing solutions – A FILTER does not add values to the result • Within the same graph pattern, FILTER placement is order independent.
  • 58. Filter
  • 59. Filter on NOT IN
  • 62. IN
  • 64. FILTER on numbers • < – FILTER (1 < 2) • > – FILTER (2 > 1) • = – FILTER (1 = 1) • != – FILTER (1 != 2)
  • 65. Filters • ?x = ?y compares values, with casting (value conversions) – "1.0"^^xsd:float = "1"^^xsd:int is true • sameTerm(?x, ?y) does not – sameTerm("1.0"^^xsd:float, "1"^^xsd:int) is false
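
The difference between `=` and `sameTerm` can be checked without any data. The sketch below binds two literals and reports both comparisons; the variable names are illustrative, and a standards-compliant engine is expected to report true for the value comparison and false for the term comparison.

```sparql
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

SELECT ?x ?y ?equalValue ?identicalTerm
WHERE {
  BIND ("1.0"^^xsd:float AS ?x)
  BIND ("1"^^xsd:int AS ?y)
  # '=' compares numeric values after type promotion.
  BIND (?x = ?y AS ?equalValue)
  # sameTerm compares lexical form and datatype exactly.
  BIND (sameTerm(?x, ?y) AS ?identicalTerm)
}
```
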
  • 66. FILTER on strings • Functions – STRLEN – ENCODE_FOR_URI – SUBSTR – CONCAT – UCASE – langMatches – LCASE – REGEX – STRSTARTS – REPLACE – STRENDS – CONTAINS – IRI – STRBEFORE – STRAFTER
  • 67. STRLEN == string length
  • 68. CONTAINS is case sensitive: is it in there?
  • 69. REGEX, just like Java regex
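
A few of the string functions combined in one query, as a sketch. It assumes taxa carry a `core:scientificName` literal; the search term is only an example.

```sparql
PREFIX core: <http://purl.uniprot.org/core/>

SELECT ?taxon ?name (STRLEN(?name) AS ?length)
WHERE {
  ?taxon a core:Taxon ;
         core:scientificName ?name .   # property name is an assumption
  # CONTAINS is case sensitive, so lower-case the name before testing.
  FILTER (CONTAINS(LCASE(?name), "tamias"))
}
```

The same test could be written as `FILTER (REGEX(?name, "tamias", "i"))`, where the `"i"` flag makes the match case insensitive.
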
  • 70. BIND • Builds new values – Closes the basic graph pattern SELECT ?p WHERE { { ?taxon a :Taxon . } BIND (?taxon AS ?p) } • Always declare before use.
  • 73. BIND can assign any output
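
BIND is not limited to copying a variable; any expression can be assigned. The sketch below builds a display label from two values and again assumes `core:scientificName` is available on taxa.

```sparql
PREFIX core: <http://purl.uniprot.org/core/>

SELECT ?taxon ?label
WHERE {
  ?taxon core:scientificName ?name ;   # property name is an assumption
         core:rank ?rank .
  # ?name and ?rank are bound above, so they can be used in the expression.
  BIND (CONCAT(?name, " (", STR(?rank), ")") AS ?label)
}
```
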
  • 74. Aggregate functions • On the SELECT line • Limited in number – COUNT – SUM – AVG – MIN – MAX – GROUP_CONCAT – SAMPLE
  • 75. COUNT
  • 76. SAMPLE gives back an arbitrary value from the group (not guaranteed to be random)
  • 77. Follow the path
  • 78. Path queries
  • 79. Finding a grandparent using normal joins
  • 80. Finding a grandparent using a path query
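
A grandparent lookup with a property path might be sketched as follows, using the taxon from the rdfs:subClassOf slide. The `taxon:` namespace is assumed to be `http://purl.uniprot.org/taxonomy/`.

```sparql
PREFIX rdfs:  <http://www.w3.org/2000/01/rdf-schema#>
PREFIX taxon: <http://purl.uniprot.org/taxonomy/>   # assumed namespace for taxon:

SELECT ?grandParent
WHERE {
  # '/' chains two steps, replacing a join through an intermediate ?parent.
  taxon:45474 rdfs:subClassOf/rdfs:subClassOf ?grandParent .
}
```

The normal-join form is `taxon:45474 rdfs:subClassOf ?parent . ?parent rdfs:subClassOf ?grandParent .`
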
  • 81. | is OR for predicates
  • 82. Same result with UNION
  • 83. Finding any ancestor
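
For ancestors at any depth, the `+` path operator follows a predicate one or more times. A sketch under the same assumption about the `taxon:` namespace:

```sparql
PREFIX rdfs:  <http://www.w3.org/2000/01/rdf-schema#>
PREFIX taxon: <http://purl.uniprot.org/taxonomy/>   # assumed namespace for taxon:

SELECT ?ancestor
WHERE {
  # '+' means one or more rdfs:subClassOf steps; '*' would also include the start node.
  taxon:45474 rdfs:subClassOf+ ?ancestor .
}
```
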
  • 84. Can use the variable in a normal join afterwards
  • 85. GROUP BY
  • 86. GROUP BY • Needed for per-group aggregate values • After closing the WHERE clause – ... WHERE { ?x ?y ?z } GROUP BY ?x
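
Put together with an aggregate, GROUP BY could be used like this to count taxa per rank. The output variable name is illustrative.

```sparql
PREFIX core: <http://purl.uniprot.org/core/>

SELECT ?rank (COUNT(?taxon) AS ?taxa)
WHERE {
  ?taxon core:rank ?rank .
}
# One output row per distinct ?rank.
GROUP BY ?rank
```
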
  • 87. GROUP BY
  • 88. HAVING: I have carrot!
  • 89. HAVING • FILTER for aggregates • After the GROUP BY clause – ... GROUP BY ?x HAVING (COUNT(?y) > 2) – ... GROUP BY ?x HAVING (MIN(?y) = 2) – etc.
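
As a sketch, HAVING can restrict a per-rank count to ranks that occur more than twice:

```sparql
PREFIX core: <http://purl.uniprot.org/core/>

SELECT ?rank (COUNT(?taxon) AS ?taxa)
WHERE {
  ?taxon core:rank ?rank .
}
GROUP BY ?rank
# Applied after grouping, so it can test the aggregate value.
HAVING (COUNT(?taxon) > 2)
```
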
  • 90. HAVING
  • 91. LIMIT & OFFSET
  • 92. LIMIT and OFFSET • OFFSET skips the first x results • LIMIT returns no more than x results
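
LIMIT and OFFSET together give simple paging. The sketch below fetches the third page of ten names; it assumes `core:scientificName` is present and adds an ORDER BY, since without a defined order the pages are not guaranteed to be stable between runs.

```sparql
PREFIX core: <http://purl.uniprot.org/core/>

SELECT ?taxon ?name
WHERE {
  ?taxon core:scientificName ?name .   # property name is an assumption
}
ORDER BY ?name
LIMIT 10     # return at most 10 rows
OFFSET 20    # after skipping the first 20
```
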
  • 93. ORDER
  • 97. VALUES • Super BIND • Provide inline data
  • 99. Examples • Parameter lists are between () VALUES (?annotation) { (core:Disease_Annotation) (core:Disulfide_Bond_Annotation) }
  • 100. Examples • UNDEF means no value at all: not bound VALUES (?annotation ?begin) { (core:Disease_Annotation UNDEF) (core:Disulfide_Bond_Annotation 2) }
  • 101. VALUES • After declaring a set of values you can use them in your query. SELECT ?comment WHERE { VALUES (?annotation ?begin) { (core:Disease_Annotation UNDEF) (core:Disulfide_Bond_Annotation 2) } ?annotation rdfs:comment ?comment . }
  • 102. SERVICE: Using other SPARQL endpoints • SERVICE <URL of other endpoint> – Runs a sub query on the other endpoint and merges it back into your query.
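
A federated query could be sketched as below. The endpoint URL is built from the host name mentioned at the start of the tutorial; its exact path, and the `core:scientificName` property, are assumptions.

```sparql
PREFIX core: <http://purl.uniprot.org/core/>

SELECT ?taxon ?name
WHERE {
  # Bind the taxon locally so only a small question is sent to the remote side.
  VALUES ?taxon { <http://purl.uniprot.org/taxonomy/9606> }
  SERVICE <http://beta.sparql.uniprot.org/> {   # assumed endpoint URL
    ?taxon core:scientificName ?name .
  }
}
```
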
  • 103. “Life is better with friends who understand you.”
  • 104. SERVICE
  • 105. SERVICE • Useful – Quick experimenting with combining multiple data sources – Quick for queries where not too much data is sent to the remote endpoint • Slow – When you ask for too much data – When the remote endpoint is not resourced for your questions
  • 106. Let's make some triples
  • 107. Construction • CONSTRUCT – New triples • Downloads RDF – Does not update the store
  • 108. New triples
  • 109. Constructing an owl:sameAs between two URIs
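
One way to construct owl:sameAs links is sketched below. The target namespace `http://example.org/taxon/` is hypothetical and stands in for whatever second dataset you want to link to; the slide's own query was an image and may differ.

```sparql
PREFIX owl:  <http://www.w3.org/2002/07/owl#>
PREFIX core: <http://purl.uniprot.org/core/>

CONSTRUCT {
  ?taxon owl:sameAs ?other .
}
WHERE {
  ?taxon a core:Taxon .
  # Hypothetical mapping: reuse the numeric taxon id under another namespace.
  BIND (IRI(CONCAT("http://example.org/taxon/",
                   STRAFTER(STR(?taxon), "http://purl.uniprot.org/taxonomy/")))
        AS ?other)
}
```

The result is a set of new triples returned to the client; nothing in the queried store changes.
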
  • 110. INSERT • Adds data – like CONSTRUCT
  • 111. Modifies data
  • 112. DELETE • Removes data – Matching triples are removed from the data – Triples can be bound using a WHERE clause
  • 113. DELETE
  • 114. DELETE INSERT • Single atomic operation.
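
A combined DELETE/INSERT might look like the following SPARQL 1.1 Update sketch. The `"false"` to `"true"` change on `core:reviewed` is purely illustrative, and the request needs a store that accepts updates.

```sparql
PREFIX core: <http://purl.uniprot.org/core/>

# Both templates are applied to the same WHERE matches as one operation.
DELETE { ?taxon core:reviewed "false" . }
INSERT { ?taxon core:reviewed "true" . }
WHERE {
  ?taxon a core:Taxon ;
         core:reviewed "false" .
}
```
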
  • 115. Atomic operation
  • 116. I’m exhausted now