An RDF Data Model for the Semantic Web

5th Oracle Life Sciences User Group meeting
               May 16-17, 2005
Agenda

Introduction – 5 min
  –   Susie Stephens
Semantic Web for Life Sciences – 25 min
  –   Susie Stephens
Oracle support of RDF in RDBMS – 25 min
  –   Souripriya Das
Demo of Siderean’s Seamark Navigation Server – 25 min
  –   Mike DiLascio, David LaVigna & Joanne Luciano
Discussion – 10 min
  –   Susie Stephens
Semantic Web for Life Sciences
         Susie Stephens
What is the Semantic Web?
 A machine-readable format that is Web
 compatible
 The Semantic Web adds definition tags to
 information in Web pages
   –   Enables computers to discover data more
       effectively
   –   Allows new associations to form between pieces
       of information
Resource Description Framework
 W3C standard for the common data format
 Based on triples (subject–predicate–object)
 Everything has a URI
 Ontologies used to label the RDF tagged elements




                                    Image Source: W3C
Enterprise Integration Hub




                             Image Source: W3C
Semantic Web Stack




                     Image Source: W3C
Pharma Productivity




                      Source: PhRMA & FDA 2003
Critical Path Initiative




                   Source: Innovation or Stagnation, FDA Report, March 2004
Ontology Frameworks for Integration
 [Figure: ontology graph linking life sciences concepts]
 Concepts: Gene, mRNA, Protein, Localization, Cascade pathway, Bio-process, Disease, Drug, Intervention point, Treatment, Microarray experiment, Target model
 Relationships: <hasProduct>, <transcribes>, <translatesTo>, <participatesIn>, <located>, <influences>, <affectedTissue>, <probeFor>, <targets>, <partOf>, <profiledBy>, <MOA>, <drugInteraction>, <affecting>, <efficacyMarkerFor>
Biological Pathways




                      Image Source: Cytoscape
Beyond the “Dead” Graphical Model




                          Image Source: KEGG
Assigning Trust Values to Data




                            Image Source: SWANS
Inferencing
If Gene G is implicated in Disease D, and its Protein
Product P is a functional component of only Pathway
P2, then Disease D directly perturbs Pathway P2
<rdf:Description>
  <log:is rdf:parseType='Quote'>
    <rdf:Description rdf:about='variable#Gene_G'>
      <hasProduct rdf:resource='variable#Protein_P'/>
      <isImplicatedIn rdf:resource='variable#Disease_D'/>
    </rdf:Description>
    <rdf:Description rdf:about='variable#Protein_P'>
      <inPathway rdf:resource='variable#Pathway_P2'/>
    </rdf:Description>
  </log:is>
  <log:implies rdf:parseType='Quote'>
    <rdf:Description rdf:about='variable#Disease_D'>
      <D_perturbs rdf:resource='variable#Pathway_P2'/>
    </rdf:Description>
  </log:implies>
</rdf:Description>
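The rule on this slide can be sketched in plain Python as a check over a small set of triples. This is only an illustration: the sample data and the names `triples` and `perturbed_pathways` are hypothetical and do not come from the slides.

```python
# Hypothetical triples (subject, predicate, object) mirroring the slide's variables.
triples = {
    ("Gene_G", "hasProduct", "Protein_P"),
    ("Gene_G", "isImplicatedIn", "Disease_D"),
    ("Protein_P", "inPathway", "Pathway_P2"),
}

def perturbed_pathways(triples):
    """Infer (disease, 'D_perturbs', pathway) when a gene implicated in the
    disease has a protein product that takes part in exactly one pathway."""
    inferred = set()
    for gene, pred, disease in triples:
        if pred != "isImplicatedIn":
            continue
        products = {o for s, p, o in triples if s == gene and p == "hasProduct"}
        for protein in products:
            pathways = {o for s, p, o in triples if s == protein and p == "inPathway"}
            if len(pathways) == 1:  # "a functional component of only Pathway P2"
                inferred.add((disease, "D_perturbs", next(iter(pathways))))
    return inferred

print(perturbed_pathways(triples))
```

If Protein_P were added to a second pathway, the "only" condition would fail and nothing would be inferred.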
Why Semantic Web for Life Sciences?
 Heterogeneous data integration using explicit
 semantics
 Expressing well-defined and rich models of
 biological systems
 Annotating findings and interpretations formally and
 sharing with other scientists
 Embedding models and semantics within papers
 Applying logic to infer additional insights and to
 propose and/or capture new hypotheses
Questions & Answers
RDF Support in Oracle RDBMS
            Souripriya Das, Ph.D.
       Consultant Member of Technical Staff
     Oracle New England Development Center
Overview

Three types of database objects
  Model    RDF graph consisting of a set of triples
  Rulebase   Set of (user-defined) rules
  Rule Index  Entailed RDF graph
We discuss the following aspects for each type of object
  DDL
  DML
  Views
  Security
RDF Query (with Inference)
RDF Models
Model: Overview

 Each RDF Model (graph) consists of a set of
 triples
 A triple (statement) consists of three
 components
  –   Subject    URI or blank node
  –   Predicate   URI
  –   Object    URI or literal or blank node
 A statement itself can be a resource (allowing
 nested graphs)
Model: Example
Family:
(:John :brotherOf :Mary)
(:John :age       "16"^^xsd:integer)
(:Mary :parentOf  :Matt)
(:John :name      "John")
(:Mary :name      "Mary")
Reification:
(:John :thinks        _:S1)
(_:S1  rdf:subject    :Sue)
(_:S1  rdf:predicate  :livesIn)
(_:S1  rdf:object     "NYC")
[Figure: graph of the triples above. :John has age 16, is brotherOf :Mary, and thinks the statement (:Sue livesIn NYC); :Mary is parentOf :Matt.]
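As a rough illustration of the family model and its reified statement, the triples can be held as Python tuples. This is a sketch only: the tuple representation and the helper `reified_statement` are assumptions for illustration, not how Oracle stores the data.

```python
# Family triples from the slide, as (subject, predicate, object) tuples.
family = [
    (":John", ":brotherOf", ":Mary"),
    (":John", ":age", 16),
    (":Mary", ":parentOf", ":Matt"),
    (":John", ":name", "John"),
    (":Mary", ":name", "Mary"),
    # Reification: the blank node _:S1 stands for the statement (:Sue :livesIn "NYC").
    (":John", ":thinks", "_:S1"),
    ("_:S1", "rdf:subject", ":Sue"),
    ("_:S1", "rdf:predicate", ":livesIn"),
    ("_:S1", "rdf:object", "NYC"),
]

def reified_statement(graph, node):
    """Rebuild the (subject, predicate, object) statement described by a reification node."""
    parts = {p: o for s, p, o in graph if s == node}
    return (parts["rdf:subject"], parts["rdf:predicate"], parts["rdf:object"])

print(reified_statement(family, "_:S1"))
```

The statement that John thinks about is described but not asserted: the triple (:Sue :livesIn "NYC") does not itself appear in the graph.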
RDF Query
SDO_RDF_MATCH Table Function
 Arguments
   –   Graph pattern
          A sequence of triple patterns
          Triple patterns typically use variables
   –   RDF Data set    a set of models
   –   Filter
   –   Aliases
 …
 FROM TABLE(SDO_RDF_MATCH(
           '(?x :brotherOf ?y) (?y :parentOf ?z)',
           SDO_RDF_Models('family'),
           …
      )) t
 …
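To make the idea of a graph pattern concrete, here is a minimal Python sketch of matching a sequence of triple patterns against a set of triples. It is not Oracle's implementation; the function `match` and the tuple representation are assumptions for illustration.

```python
def match(pattern, graph):
    """Return one dict of variable bindings per way the triple patterns match the graph.
    Pattern terms starting with '?' are variables; everything else must match exactly."""
    solutions = [{}]
    for triple_pattern in pattern:
        extended = []
        for binding in solutions:
            for triple in graph:
                candidate = dict(binding)
                ok = True
                for term, value in zip(triple_pattern, triple):
                    if term.startswith("?"):
                        # Bind the variable, or check it against its earlier binding.
                        if candidate.setdefault(term, value) != value:
                            ok = False
                            break
                    elif term != value:
                        ok = False
                        break
                if ok:
                    extended.append(candidate)
        solutions = extended
    return solutions

family = [
    (":John", ":brotherOf", ":Mary"),
    (":Mary", ":parentOf", ":Matt"),
]
pattern = [("?x", ":brotherOf", "?y"), ("?y", ":parentOf", "?z")]
print(match(pattern, family))
```

Each resulting dict plays the role of one returned row, with one entry per variable in the graph pattern.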
SDO_RDF_MATCH: return
Columns (of type VARCHAR2) in each returned row:
  For each variable ?x in Graph Pattern
    –   x
    –   x$rdfVTYP
          URI, Literal, Blank node
    –   x$rdfLTYP
          Specific literal type (e.g., xsd:integer)
    –   x$rdfCLOB
          Contains actual value, if ?x matches a
          CLOB value
    –   x$rdfLANG
          Language tag, if any (e.g., “en-us”)
  If no variable in Graph Pattern
    –   A dummy column
SDO_RDF_MATCH: matching
Matching multiple representations
 The same point in value space may have
 multiple representations
   –   "10"^^xsd:integer
   –   "10"^^xsd:positiveInteger
   –   "010"^^xsd:integer
   –   "000010"^^xsd:integer
 SDO_RDF_MATCH automatically resolves
 these
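One way to think about this resolution is to map each typed literal to a canonical value before comparing. The slides do not say how SDO_RDF_MATCH does this internally, so the sketch below is only an assumption-laden illustration; `canonical` and `INTEGER_TYPES` are hypothetical names.

```python
# Integer-derived XSD datatypes treated as one value space (illustrative subset).
INTEGER_TYPES = {"xsd:integer", "xsd:positiveInteger"}

def canonical(lexical, datatype):
    """Map a typed literal to a (kind, value) pair so equal values compare equal."""
    if datatype in INTEGER_TYPES:
        return ("integer", int(lexical))
    return (datatype, lexical)  # other types: compare the lexical form as-is

forms = [
    ("10", "xsd:integer"),
    ("10", "xsd:positiveInteger"),
    ("010", "xsd:integer"),
    ("000010", "xsd:integer"),
]
print({canonical(lex, dt) for lex, dt in forms})
```

All four representations from the slide collapse to a single canonical value, so a query for any one of them would match the others.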
RDF Query: Example
 Find salary and hiredate of all the uncles
 SELECT emp.name, emp.salary, emp.hiredate
 FROM emp,
        TABLE(SDO_RDF_MATCH(
                 '(?x :brotherOf ?y)
                  (?y :parentOf ?z)
                  (?x :name      ?name)',
                 SDO_RDF_Models('family'),
        …)) t
 WHERE emp.name=t.name;
 Use of SDO_RDF_MATCH allows embedding a
 graph query in a SQL query
RDF Query: Example 2
 Find pairs of persons residing at the same
 address where the first person rents a truck and
 the second person buys a fertilizer
 SELECT t3.x name1, t3.y name2
 FROM AddrTable t1, AddrTable t2,
      TABLE(SDO_RDF_MATCH(
               '(?x :rents ?a) (?a rdf:type :Truck)
                (?y :buys ?b) (?b rdf:type :Fertilizer)',
               SDO_RDF_Models('Activities'),
      …)) t3
 WHERE t1.name=t3.x and t2.name=t3.y and
       t1.addr=t2.addr;
RDF Rulebases
Rulebase: Overview

 Each RDF rulebase consists of a set of rules
 Each rule consists of
  –   antecedent: graph pattern
  –   filter condition (optional)
  –   consequent: graph pattern
 One or more rulebases may be used with
 relevant RDF models (graphs) to obtain
 entailed graphs
Rulebase: Example

Rules in a rulebase family_rb:
    Antecedent: '(?x :brotherOf ?y) (?y :parentOf ?z)'
    Filter: NULL
    Consequent: '(?x :uncleOf ?z)'

    Antecedent: '(?x :age ?a)'
    Filter: 'a >= 65'
    Consequent: '(?x :ageGroup "Senior")'

    Antecedent: '(?x :parentOf ?y) (?y :parentOf ?z)'
    Filter: NULL
    Consequent: '(?x :grandParentOf ?z)'
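The three rules above can be sketched as a small forward-chaining loop in Python. This is a minimal illustration, not Oracle's rule engine: the functions `match` and `entail`, the use of Python callables for filters, and the sample person :Ann are all assumptions.

```python
def match(patterns, graph, binding=None):
    """Yield variable bindings for a list of triple patterns ('?name' marks a variable)."""
    binding = binding or {}
    if not patterns:
        yield binding
        return
    first, rest = patterns[0], patterns[1:]
    for triple in graph:
        new = dict(binding)
        if all(
            new.setdefault(t, v) == v if t.startswith("?") else t == v
            for t, v in zip(first, triple)
        ):
            yield from match(rest, graph, new)

# (antecedent, filter, consequent), following the rulebase family_rb on the slide.
RULES = [
    ([("?x", ":brotherOf", "?y"), ("?y", ":parentOf", "?z")], None, ("?x", ":uncleOf", "?z")),
    ([("?x", ":age", "?a")], lambda b: b["?a"] >= 65, ("?x", ":ageGroup", "Senior")),
    ([("?x", ":parentOf", "?y"), ("?y", ":parentOf", "?z")], None, ("?x", ":grandParentOf", "?z")),
]

def entail(graph, rules):
    """Apply the rules until no new triple appears; return only the inferred triples."""
    known = set(graph)
    while True:
        new = set()
        for antecedent, rule_filter, consequent in rules:
            for b in match(antecedent, known):
                if rule_filter is None or rule_filter(b):
                    new.add(tuple(b.get(t, t) for t in consequent))
        if new <= known:
            return known - set(graph)
        known |= new

# Sample data; :Ann is a hypothetical addition so that every rule fires once.
graph = {
    (":John", ":brotherOf", ":Mary"),
    (":Mary", ":parentOf", ":Matt"),
    (":John", ":age", 16),
    (":Ann", ":age", 70),
    (":Ann", ":parentOf", ":Mary"),
}
print(entail(graph, RULES))
```

The returned set corresponds to what the deck calls the entailed graph: triples that follow from the model and the rulebase but were never inserted directly.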
RDF Rule Indexes
Rule Index: Overview

 A rule index represents an entailed graph
 A rule index is created on an RDF dataset
 (consisting of a set of RDF models and a set
 of RDF rulebases)
Rule Index: Example

 A rule index may be created on a dataset
 consisting of
  –   family RDF data, and
  –   family_rb rulebase (shown earlier)
 The rule index will contain inferred triples
 showing uncleOf and ageGroup information
RDF Query with Inference
SDO_RDF_MATCH with Rulebases
 Arguments
   –   Graph pattern
           A sequence of triples (with variables)
   –   RDF Data set
           a set of models
           a set of rulebases
   –   Filter
   –   Aliases
 …
 FROM TABLE(SDO_RDF_MATCH(
           '(?x :uncleOf ?y)',
           SDO_RDF_Models('family'),
           SDO_RDF_Rulebases ('rdfs', 'family_rb')
           …
      )) t
 …
RDF Query w/ Inference: Example
 Find salary and hiredate of all the uncles
 SELECT emp.name, emp.salary, emp.hiredate
 FROM emp,
      TABLE(SDO_RDF_MATCH(
              '(?x :uncleOf ?y) (?x :name ?name)',
              SDO_RDF_Models('family'),
              SDO_RDF_Rulebases('rdfs', 'family_rb'),
      …)) t
 WHERE emp.name=t.name;
RDF Query w/ Inference: Example 2
 Find pairs of persons residing at the same
 address where the first person rents a truck and
 the second person buys a fertilizer
 SELECT t3.x name1, t3.y name2
 FROM AddrTable t1, AddrTable t2,
      TABLE(SDO_RDF_MATCH(
               '(?x :rents ?a) (?a rdf:type :Truck)
                (?y :buys ?b) (?b rdf:type :Fertilizer)',
               SDO_RDF_Models('Activities'),
               SDO_RDF_Rulebases('rdfs'),
      …)) t3
 WHERE t1.name=t3.x and t2.name=t3.y and
       t1.addr=t2.addr;
RDF Models
Model: DDL

 Procedures provided as part of the API may be used
 to
   –   Create a model
   –   Drop a model
 When a user creates a model, a database view gets
 created automatically
   –   rdfm_family
 A model corresponds to a column of type
 SDO_RDF_TRIPLE_S in a base table
 Each model has exactly one base table associated
 with it
Model: DDL (Creating a Model)
  Create an Application Table
CREATE TABLE family_table (
  id NUMBER, family_triple SDO_RDF_TRIPLE_S);
  Create a Model
EXEC SDO_RDF.CREATE_RDF_MODEL(
  'family', 'family_table', 'family_triple');
  Automatically creates the following database
  view
rdfm_family (…)
Loading RDF Data into Oracle

 Java API provided to load N-Triples into the Network Data Model (NDM)

 Sample XSLs provided
  –   To convert RDF to N-Triples
  –   To convert RDF to INSERT statements
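The conversion to INSERT statements can be illustrated with a short Python sketch. It stands in for the sample XSLs, which are not shown in the deck; the function `to_insert` and the regular expression are hypothetical, and the generated statement simply follows the SDO_RDF_TRIPLE_S INSERT form used elsewhere in this deck for `family_table`.

```python
import re

# Matches an N-Triples line whose three terms are all URIs, e.g. <s> <p> <o> .
# Literals and blank nodes are deliberately not handled in this sketch.
URI_TRIPLE = re.compile(r"^\s*(<[^>]*>)\s+(<[^>]*>)\s+(<[^>]*>)\s*\.\s*$")

def to_insert(line, row_id, model="family", table="family_table"):
    """Turn one URI-only N-Triples line into an INSERT statement string,
    or return None if the line does not match."""
    m = URI_TRIPLE.match(line)
    if m is None:
        return None
    # Double any single quotes so the terms are safe inside SQL string literals.
    s, p, o = (term.replace("'", "''") for term in m.groups())
    return (
        f"INSERT INTO {table} VALUES ({row_id}, "
        f"SDO_RDF_TRIPLE_S('{model}', '{s}', '{p}', '{o}'));"
    )

line = ("<http://example.org/family/John> "
        "<http://example.org/family/brotherOf> "
        "<http://example.org/family/Mary> .")
print(to_insert(line, 1))
```

Generating SQL text is convenient for a one-off load script; for repeated loads from an application, bind variables would usually be a safer choice than string building.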
Model: DML

 SQL DML statements on a model's base table insert,
 delete, and update triples in the corresponding
 model

 Insert Triples
 INSERT INTO family_table VALUES (1,
           SDO_RDF_TRIPLE_S('family',
                   '<http://example.org/family/John>',
                   '<http://example.org/family/brotherOf>',
                   '<http://example.org/family/Mary>'));
Model: Security

 The creator of the base table corresponding to a
 model can grant privileges to other users
 To perform DML to a model, a user must have DML
 privileges for the corresponding base table
 The creator of a model can grant QUERY privileges
 on the corresponding database view to other users
 A user can query only those models for which s/he
 has QUERY privileges on the corresponding database views
 Only the creator of a model can drop the model
Model: Views

 Database views corresponding to the models
RDF Rulebases
Rulebase: DDL

 Procedures provided as part of the API may
 be used to
  –   Create a rulebase
      create_rulebase('family_rb');
  –   Drop a rulebase
      drop_rulebase('family_rb');
 When a user creates a rulebase, a database
 view gets created automatically
  –   rdfr_family_rb (rule_name,
          antecedent, filter, consequent, aliases)
Rulebase: DML

 SQL DML commands may be used on the
 database view corresponding to a target
 rulebase to insert, delete, and update rules
 insert into mdsys.rdfr_family_rb values(
   'uncle_rule',
   '(?x :brotherOf ?y) (?y :parentOf ?z)',
   NULL,
   '(?x :uncleOf ?z)',
   SDO_RDF_Aliases(…));
Rulebase: Security

 Creator of a rulebase can grant privileges to
 the corresponding database view to other
 users
 Performing DML operations requires invoker
 to have appropriate privileges on the
 database view
 Only the creator of a rulebase can drop the
 rulebase
Rulebase: Views

 RDF_RULEBASE_INFO
  –   Contains the list of rulebases
  –   For each rulebase, contains additional
      information (such as creator, view name, etc.)
 Content of each rulebase is available from the
 corresponding database view
RDF Rule Indexes
Rule Index: DDL

 Procedures provided as part of the API may be used
 to
   –   Create a rule index
       create_rules_index ('family_rb_rix_family',
           SDO_RDF_Models('family'),
           SDO_RDF_Rulebases('rdfs','family_rb'));
   –   Drop a rule index
       drop_rules_index ('family_rb_rix_family');
 When a user creates a rule index, a database view
 gets created automatically
   –   rdfi_family_rb_rix_family (…)
Rule Index: Security

 To create a rule index on an RDF dataset
 (models and rulebases), user needs to have
 QUERY privileges on those models and
 rulebases
 Creator of a rule index holds QUERY privilege
 on the rule index and may grant this privilege
 to other users
 Only the creator of a rule index can drop it
Rule Index: Views

 RDF_RULEINDEX_INFO
  –   Contains the list of rule indexes
  –   For each rule index, contains additional
      information (such as creator, status, etc.)
 RDF_RULEINDEX_DATASETS
  –   For every rule index, stores the names of its
      models and rulebases
Rule Index: Dependencies

 Content of a rule index depends upon the
 content of each element of its dataset
  –   Any modification to the models or rulebases in its
      dataset invalidates the rule index
  –   Dropping a model or rulebase will drop
      dependent rule indexes automatically.
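The dependency between a rule index and its dataset can be sketched as a cache that goes stale when the model changes. This is a conceptual illustration only: the class `RuleIndex`, its `valid` flag, and the `rebuild` method are hypothetical, and the slides do not describe how Oracle tracks or refreshes an invalidated rule index.

```python
class RuleIndex:
    """Cache of the triples entailed from a model by an inference function."""

    def __init__(self, model, infer):
        self.model = model    # set of asserted triples
        self.infer = infer    # function: set of triples -> set of inferred triples
        self.inferred = infer(model)
        self.valid = True

    def insert(self, triple):
        """Changing the model makes the cached entailment stale."""
        self.model.add(triple)
        self.valid = False

    def rebuild(self):
        """Recompute the entailed triples from the current model."""
        self.inferred = self.infer(self.model)
        self.valid = True

def uncle_rule(model):
    """(?x :brotherOf ?y) (?y :parentOf ?z) => (?x :uncleOf ?z)"""
    return {
        (x, ":uncleOf", z)
        for x, p1, y in model if p1 == ":brotherOf"
        for y2, p2, z in model if p2 == ":parentOf" and y2 == y
    }

index = RuleIndex(
    {(":John", ":brotherOf", ":Mary"), (":Mary", ":parentOf", ":Matt")},
    uncle_rule,
)
index.insert((":Mary", ":parentOf", ":Sue"))  # the index is now invalid
index.rebuild()
print(sorted(index.inferred))
```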
Summary

 RDF Data Model
  –   Models (Graphs)
  –   RDF Query using SDO_RDF_MATCH Table Function
 RDF Data Model with (user-defined) Rules
  –   Models (Graphs)
  –   Rulebases
  –   Rule Indexes
  –   RDF Query on entailed RDF graphs
 Management (DDL, DML, Security, …)
  –   Models, Rulebases, and Rule Indexes
RDF Data Model Demo
Demo: Family Schema
Demo: Family Schema 2
Demo: Family Model Data
Demo: Family Model Data (Alt)
Demo: Query without Inference
select m from TABLE(SDO_RDF_MATCH(
    '(?m rdf:type :Male)',
    SDO_RDF_Models('family'),
    null,
    SDO_RDF_Aliases(
      SDO_RDF_Alias('', 'http://www.example.org/family/')),
    null));
M
--------------------------------------------------------------------------------
http://www.example.org/family/Jack
http://www.example.org/family/Tom
Demo: Query w/ RDFS Inference
select m from TABLE(SDO_RDF_MATCH(
    '(?m rdf:type :Male)',
    SDO_RDF_Models('family'),
    SDO_RDF_Rulebases('RDFS'),
    SDO_RDF_Aliases(
      SDO_RDF_Alias('', 'http://www.example.org/family/')),
    null));
M
--------------------------------------------------------------------------------
http://www.example.org/family/Jack
http://www.example.org/family/Tom
http://www.example.org/family/John
http://www.example.org/family/Matt
http://www.example.org/family/Sammy
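One plausible reason the RDFS query returns more rows is the RDFS domain rule: if a property has a declared domain, every subject using that property is inferred to belong to that class. The demo's schema slides are images, so the `:fatherOf` property and the triples below are assumptions for illustration, not the actual demo data.

```python
def rdfs_domain_types(graph):
    """RDFS rule rdfs2: if (p rdfs:domain C) and (s p o) then (s rdf:type C)."""
    domains = {(s, o) for s, p, o in graph if p == "rdfs:domain"}
    return {
        (s, "rdf:type", cls)
        for prop, cls in domains
        for s, p, o in graph
        if p == prop
    }

# Hypothetical data: only Jack and Tom are asserted to be :Male.
graph = {
    (":Jack", "rdf:type", ":Male"),
    (":Tom", "rdf:type", ":Male"),
    (":fatherOf", "rdfs:domain", ":Male"),
    (":John", ":fatherOf", ":Matt"),
    (":Matt", ":fatherOf", ":Tom"),
    (":Sammy", ":fatherOf", ":Jack"),
}

def males(g):
    """Subjects typed as :Male, sorted for stable output."""
    return sorted(s for s, p, o in g if p == "rdf:type" and o == ":Male")

print(males(graph))                             # asserted triples only
print(males(graph | rdfs_domain_types(graph)))  # asserted plus inferred
```

Under these assumed triples, the first call lists only the two asserted males, while the second also lists John, Matt, and Sammy, mirroring the difference between the two demo queries.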
Demo: Family Rulebase
  Antecedent: '(?x :parentOf ?y) (?y :parentOf ?z)'
  Filter: NULL
  Consequent: '(?x :grandParentOf ?z)'
Demo: Query w/ Family and RDFS Inference
select x, y from TABLE(SDO_RDF_MATCH(
    '(?x :grandParentOf ?y) (?x rdf:type :Male)',
    SDO_RDF_Models('family'),
    SDO_RDF_Rulebases('RDFS','family_rb'),
    SDO_RDF_Aliases(
      SDO_RDF_Alias('','http://www.example.org/family/')),
    null));
X                                                      Y
------------------------------------------------------ -----------------------------------------------------
http://www.example.org/family/John                     http://www.example.org/family/Cindy
http://www.example.org/family/John                     http://www.example.org/family/Tom
http://www.example.org/family/John                     http://www.example.org/family/Jack
http://www.example.org/family/John                     http://www.example.org/family/Cathy
Questions & Answers
Demo of Siderean’s Seamark Navigation Server
 Mike DiLascio & Joanne Luciano
Agenda

 About Siderean Software & Predictive
 Medicine, Inc.
 Introducing Seamark Navigation Server v.3.6
 Seamark & Oracle 10g RDF Data Model
 Demonstration of Seamark / Oracle 10g
 integration
 Lessons Learned / Q&A
About Siderean Software

   Aggregate, organize and navigate information the way users think,
   to improve analysis and decision making.



  Founded in 2001 and based in El Segundo, CA
  Venture-backed in 2004
  Delivering RDF-centric navigation and analysis capabilities
  for end users (a.k.a. “the last mile”)
  Active W3C member leveraging Semantic Web standards
  Demonstrating integrated Seamark navigation layer over
  Oracle 10g RDF Data Model in collaboration with
  Predictive Medicine, Inc.
Current solutions
 Users:
  –   “50,000 results!!! Now what?”
  –   “I give up! Hello? Get me an apple!”
  –   “Why do I get oranges when I’m looking for apples?”
 IT: “As soon as I fix his, hers stops working.”
 Content producer: “I just produced three apples last week!”
 Enterprise search: a brute force approach
 Knowledge management: breathtakingly expensive
Introducing Seamark Navigation Server
 Users:
  –   “I can see the big picture!”
  –   “No more staring at a blank text box.”
  –   “I can drill down quickly to what I want.”
 IT: “I can take my coffee break now.”
 Content producer: “I knew we had an apple in here somewhere.”
 Seamark: layering organization to deliver pinpoint navigation
How it works: process
 Metadata about data and content is aggregated…
 Organized into a unified information architecture…
 Analyzed to generate on-demand views…
 Providing pinpoint navigation across the data and content
 [Figure labels: Term, Text, Person, Place, Event, View]
How it works: architecture
 [Figure: architecture diagram]
 Sources
  –   Unstructured Content and Data Feeds
  –   Search Engines
  –   Structured Content Sources
 Seamark components
  –   Metadata Aggregator
  –   Navigation Metadata
  –   Navigation Web Services
 Delivery
  –   User Navigation and User Tagging
  –   Web Browsers & Portals
  –   User Alerts
  –   Feed Aggregators
Seamark/Oracle integration architecture: Phase 1
 [Figure: architecture diagram]
 Oracle 10g RDF Data Model for scalable persistence of metadata
 Batch RDFMatch Query issued from Seamark at index time
 Cached Navigation Metadata
 Navigation Web Services
 Delivery
  –   User Navigation and User Tagging
  –   Web Browsers & Portals
  –   User Alerts
  –   Feed Aggregators
Seamark/Oracle integration architecture: Phase 2
 [Figure: architecture diagram]
 Oracle 10g RDF Data Model for scalable persistence of metadata
 Federated RDFMatch Queries issued from Seamark at query time
 Dynamic Navigation Metadata
 Navigation Web Services
 Delivery
  –   User Navigation and User Tagging
  –   Web Browsers & Portals
  –   User Alerts
  –   Feed Aggregators
Seamark Demo: Background & Concepts
  Life Sciences demonstration premise
     RDF offers high value during early stage research

  Leveraging strengths of Oracle 10g & Seamark v3.6
     Oracle – large datasets / scalability
     Seamark – useful subsets / flexible navigation & insights

  Project elapsed time - about one week
    Locating and identifying data sources represented the
   greatest time element
    Data sources in RDF required minimal integration time
    Non-RDF data sources required transformation and linking
   values (non-trivial but straightforward)
Seamark Demonstration: Identification of new drug candidates
 Steps
  1. Differentiate different forms of disease
  2. Identify patient subgroups
  3. Identify top biomarkers
  4. Identify function
  5. Identify biological and chemical properties and disease associations of biomarker
  6. Identify documents
  7. Identify role in metabolic pathways
  8. Identify compounds that interact
  9. Identify and compare function in other organisms
  10. Identify any prior art
 [Figure: graph of integrated data sources and the entities that link them]
 Data sources: Keywords.rdf, GO2Keyword.rdf, ProbeSet.rdf, GO2UniProt.rdf, GO2OMIM.rdf, OMIM.rdf, IntAct.rdf, GO.rdf, GO2Enzyme.rdf, UniProt.rdf, Taxonomy.rdf, PubMed.xml, Enzymes.rdf, KEGG.rdf
 Linked entities: Keyword, Probe, Protein, Gene, MIM Id, Enzyme, Organism, Citation, Compound, Pathway
Live Seamark Life Sciences Demonstration: Sample Screenshots
Seamark application start page shows integration of OMIM, GO, KEGG, UniProt and NCBI
Select: Probe Set ID: “M18255_cds2_s_at”
Results: 9 Matches on “M18255_cds2_s_at” to the Gene Ontology


 Cytoplasm: 1st of 9 matches (cellular location via Gene Ontology); page scroll
 Plasma Membrane, …: 2nd of 9 matches (cellular location via Gene Ontology)
 Page scroll for more results, etc.
Start Page: Optionally search across entire collection based upon
keywords from the integrated data sources
Seamark Lessons Learned
 RDF offers multiple unconstrained views of
 data/relationships
 – Provides  maximum flexibility during early stage research
 – Later stages can leverage OWL to constrain known
   relationships

 Data providers – Timing is right to publish in RDF format
 – Cut your customer’s integration costs
 – Speed discovery time

 Even with one week of effort…
 – Proof of Concept demonstrates value of broad & deep
   integration
 – Additional value in extending POC in customer pilot initiatives
Siderean Seamark Conclusion

 Getting the precise
 information we need from
 today’s data glut is
 profoundly difficult
 Solving this problem
 requires a solution that
 works the way you think
 Siderean is the world’s first
 turnkey navigation server
 for the enterprise and
 people at large
Thank You!
To arrange a demonstration of Seamark or for more information, please contact:

             Mike DiLascio
             Office: +1 781 652 0339
             Mobile: +1 781 354 7663
             mdilascio@siderean.com


             Siderean Software, Inc.
             390 North Sepulveda Blvd., Suite 2070
             El Segundo, CA 90245-4475 USA
             http://www.siderean.com
Bio it 2005_rdf_workshop05

Contenu connexe

Tendances

SWT Lecture Session 9 - RDB2RDF direct mapping
SWT Lecture Session 9 - RDB2RDF direct mappingSWT Lecture Session 9 - RDB2RDF direct mapping
SWT Lecture Session 9 - RDB2RDF direct mapping
Mariano Rodriguez-Muro
 

Tendances (12)

SWT Lecture Session 9 - RDB2RDF direct mapping
SWT Lecture Session 9 - RDB2RDF direct mappingSWT Lecture Session 9 - RDB2RDF direct mapping
SWT Lecture Session 9 - RDB2RDF direct mapping
 
CDM SynPuf OMOP CDM library(rodbc) library(ggplot2) library(jsonlite) 180403
CDM SynPuf OMOP CDM library(rodbc) library(ggplot2) library(jsonlite) 180403CDM SynPuf OMOP CDM library(rodbc) library(ggplot2) library(jsonlite) 180403
CDM SynPuf OMOP CDM library(rodbc) library(ggplot2) library(jsonlite) 180403
 
Syntax Reuse: XSLT as a Metalanguage for Knowledge Representation Languages
Syntax Reuse: XSLT as a Metalanguage for Knowledge Representation LanguagesSyntax Reuse: XSLT as a Metalanguage for Knowledge Representation Languages
Syntax Reuse: XSLT as a Metalanguage for Knowledge Representation Languages
 
Ontologies in RDF-S/OWL
Ontologies in RDF-S/OWLOntologies in RDF-S/OWL
Ontologies in RDF-S/OWL
 
SWT Lecture Session 11 - R2RML part 2
SWT Lecture Session 11 - R2RML part 2SWT Lecture Session 11 - R2RML part 2
SWT Lecture Session 11 - R2RML part 2
 
SWT Lecture Session 10 R2RML Part 1
SWT Lecture Session 10 R2RML Part 1SWT Lecture Session 10 R2RML Part 1
SWT Lecture Session 10 R2RML Part 1
 
SWT Lecture Session 3 - SPARQL
SWT Lecture Session 3 - SPARQLSWT Lecture Session 3 - SPARQL
SWT Lecture Session 3 - SPARQL
 
UKOUG Tech14 - Getting Started With JSON in the Database
UKOUG Tech14 - Getting Started With JSON in the DatabaseUKOUG Tech14 - Getting Started With JSON in the Database
UKOUG Tech14 - Getting Started With JSON in the Database
 
5 rdfs
5 rdfs5 rdfs
5 rdfs
 
JSON Array Indexes in MySQL
JSON Array Indexes in MySQLJSON Array Indexes in MySQL
JSON Array Indexes in MySQL
 
RDF data model
RDF data modelRDF data model
RDF data model
 
RDF, SPARQL and Semantic Repositories
RDF, SPARQL and Semantic RepositoriesRDF, SPARQL and Semantic Repositories
RDF, SPARQL and Semantic Repositories
 

Similaire à Bio it 2005_rdf_workshop05

Finding knowledge, data and answers on the Semantic Web
Finding knowledge, data and answers on the Semantic WebFinding knowledge, data and answers on the Semantic Web
Finding knowledge, data and answers on the Semantic Web
ebiquity
 
Triplestore and SPARQL
Triplestore and SPARQLTriplestore and SPARQL
Triplestore and SPARQL
Lino Valdivia
 
Querying the Semantic Web with SPARQL
Querying the Semantic Web with SPARQLQuerying the Semantic Web with SPARQL
Querying the Semantic Web with SPARQL
Emanuele Della Valle
 

Similaire à Bio it 2005_rdf_workshop05 (20)

Jena Programming
Jena ProgrammingJena Programming
Jena Programming
 
Rdf data-model-and-storage
Rdf data-model-and-storageRdf data-model-and-storage
Rdf data-model-and-storage
 
Semantic Web(Web 3.0) SPARQL
Bio it 2005_rdf_workshop05

  • 2. An RDF Data Model for the Semantic Web 5th Oracle Life Sciences User Group meeting May 16-17, 2005
  • 3. Agenda Introduction – 5 min – Susie Stephens Semantic Web for Life Sciences – 25 min – Susie Stephens Oracle support of RDF in RDBMS – 25 min – Souripriya Das Demo of Siderean’s Seamark Navigation Server – 25 min – Mike DiLascio, David LaVigna & Joanne Luciano Discussion – 10 min – Susie Stephens
  • 4. Semantic Web for Life Sciences Susie Stephens
  • 5. What is the Semantic Web? A machine-readable format that is Web compatible The Semantic Web adds definition tags to information in Web pages – Enables computers to discover data more effectively – Allows new associations to form between pieces of information
  • 6. Resource Description Framework W3C standard for the common data format Based on triples (subject–predicate–object) Everything has a URI Ontologies used to label the RDF tagged elements Image Source: W3C
  • 8. Enterprise Integration Hub Image Source: W3C
  • 9. Semantic Web Stack Image Source: W3C
  • 10. Pharma Productivity Source: PhRMA & FDA 2003
  • 11. Critical Path Initiative Source: Innovation or Stagnation, FDA Report, March 2004
  • 12. Ontology Frameworks for Integration [Diagram: ontology graph. Nodes: Gene, mRNA, Protein, Localization, Cascade pathway, Disease, Bio-process, Intervention point, Drug, Target model, Treatment, Microarray experiment. Edge labels: <hasProduct>, <participatesIn>, <transcribes>, <translatesTo>, <located>, <influences>, <affectedTissue>, <probeFor>, <partOf>, <targets>, <profiledBy>, <MOA>, <drugInteraction>, <affecting>, <efficacyMarkerFor>]
  • 13. Biological Pathways Image Source: Cytoscape
  • 14. Beyond the “Dead” Graphical Model Image Source: KEGG
  • 15. Assigning Trust Values to Data Image Source: SWANS
  • 16. Inferencing If Gene G is implicated in Disease D, and its Protein Product P is a functional component of only Pathway P2 -> then Disease D directly perturbs Pathway P2 <rdf:Description> <log:is rdf:parseType='Quote'> <rdf:Description rdf:about='variable#Gene_G'> <hasProduct rdf:resource='variable#Protein_P'/> <isImplicatedIn rdf:resource='variable#Disease_D'/> </rdf:Description> <rdf:Description rdf:about='variable#Protein_P'> <inPathway rdf:resource='variable#Pathway_P2'/> </rdf:Description> </log:is> <log:implies rdf:parseType='Quote'> <rdf:Description rdf:about='variable#Disease_D'> <D_perturbs rdf:resource='variable#Pathway_P2'/> </rdf:Description> </log:implies> </rdf:Description>
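The rule on this slide can be sketched procedurally. The following is a minimal Python illustration, not the N3 rule engine the slide's markup implies: the predicate names come from the slide, while the sample graph, the second gene, and the function name `perturbed_pathways` are invented for the example. It also makes the "only Pathway P2" condition explicit, which the slide's markup does not encode.

```python
# Sketch only: predicate names follow the slide; the graph data is invented.
def perturbed_pathways(triples):
    """Infer (disease, 'D_perturbs', pathway) when a gene implicated in the
    disease has a protein product that participates in exactly one pathway."""
    inferred = set()
    for gene, pred, disease in triples:
        if pred != "isImplicatedIn":
            continue
        products = {o for s, p, o in triples if s == gene and p == "hasProduct"}
        for protein in products:
            pathways = {o for s, p, o in triples
                        if s == protein and p == "inPathway"}
            if len(pathways) == 1:  # the "only Pathway P2" condition
                inferred.add((disease, "D_perturbs", next(iter(pathways))))
    return inferred


graph = {
    ("Gene_G", "hasProduct", "Protein_P"),
    ("Gene_G", "isImplicatedIn", "Disease_D"),
    ("Protein_P", "inPathway", "Pathway_P2"),
    ("Gene_H", "hasProduct", "Protein_Q"),
    ("Gene_H", "isImplicatedIn", "Disease_E"),
    ("Protein_Q", "inPathway", "Pathway_P2"),
    ("Protein_Q", "inPathway", "Pathway_P3"),  # two pathways: rule must not fire
}
result = perturbed_pathways(graph)
```

Only Disease_D is linked to a pathway here, because Protein_Q takes part in two pathways and so fails the "only" test.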
  • 17. Why Semantic Web for Life Sciences? Heterogeneous data integration using explicit semantics Expressing well-defined and rich models of biological systems Annotating findings and interpretations formally and sharing with other scientists Embedding models and semantics within papers Applying logic to infer additional insights and to propose and/or capture new hypotheses
  • 18. Questions and Answers
  • 19. RDF Support in Oracle RDBMS Souripriya Das, Ph.D. Consultant Member of Technical Staff Oracle New England Development Center
  • 20. Overview Three types of database objects: Model (RDF graph consisting of a set of triples), Rulebase (set of user-defined rules), Rule Index (entailed RDF graph) We discuss the following aspects for each type of object: DDL, DML, Views, Security RDF Query (with Inference)
  • 22. Model: Overview Each RDF Model (graph) consists of a set of triples A triple (statement) consists of three components – Subject: URI or blank node – Predicate: URI – Object: URI, literal, or blank node A statement itself can be a resource (allowing nested graphs)
  • 23. Model: Example Family: (:John :brotherOf :Mary) (:John :age "16"^^xsd:integer) (:Mary :parentOf :Matt) (:John :name "John") (:Mary :name "Mary") Reification: (:John :thinks _:S1) (_:S1 rdf:subject :Sue) (_:S1 rdf:predicate :livesIn) (_:S1 rdf:object "NYC") [Diagram: graph of the triples above, with nodes :John, :Mary, :Matt, :Sue, 16, NYC and edges age, brotherOf, parentOf, thinks, livesIn]
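To make the example concrete, the slide's triples can be written as plain tuples. This is a minimal Python sketch under stated assumptions: the tuple encoding, the `(lexical form, datatype)` pair for typed literals, and the helper name `reify` are all illustrative and say nothing about how Oracle stores triples.

```python
# Hypothetical in-memory encoding of the slide's family graph; plain tuples
# stand in for stored triples and are not Oracle's storage format.
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"

family = {
    (":John", ":brotherOf", ":Mary"),
    (":John", ":age", ("16", XSD_INTEGER)),  # typed literal: (lexical form, datatype)
    (":Mary", ":parentOf", ":Matt"),
    (":John", ":name", "John"),
    (":Mary", ":name", "Mary"),
}


def reify(node, subject, predicate, obj):
    """Describe a statement as a resource via rdf:subject/predicate/object."""
    return {
        (node, "rdf:subject", subject),
        (node, "rdf:predicate", predicate),
        (node, "rdf:object", obj),
    }


# "John thinks that Sue lives in NYC": the inner statement is described by
# blank node _:S1 and is not itself asserted in the graph.
family |= {(":John", ":thinks", "_:S1")}
family |= reify("_:S1", ":Sue", ":livesIn", "NYC")
```

The point of reification shows up in the last step: the graph records what John thinks without claiming that Sue actually lives in NYC.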
  • 25. SDO_RDF_MATCH Table Function Arguments – Graph pattern: a sequence of triple patterns; triple patterns typically use variables – RDF data set: a set of models – Filter – Aliases … FROM TABLE(SDO_RDF_MATCH( '(?x :brotherOf ?y) (?y :parentOf ?z)', SDO_RDF_Models('family'), … )) t …
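What it means to match a graph pattern against a model can be sketched in a few lines. The Python below is an illustrative, naive matcher and not how SDO_RDF_MATCH is implemented: the function name `match`, the tuple encoding, and the sample data are assumptions. Each triple pattern extends the set of variable bindings produced by the previous one.

```python
def match(pattern, triples):
    """Return a list of variable bindings (dicts) satisfying every triple
    pattern. Terms that start with '?' are variables."""
    def unify(pat, triple, binding):
        extended = dict(binding)
        for p, t in zip(pat, triple):
            if isinstance(p, str) and p.startswith("?"):
                # Bind the variable, or check it agrees with an earlier binding.
                if extended.setdefault(p, t) != t:
                    return None
            elif p != t:
                return None
        return extended

    bindings = [{}]
    for pat in pattern:
        bindings = [b2 for b in bindings for t in triples
                    if (b2 := unify(pat, t, b)) is not None]
    return bindings


family = [
    (":John", ":brotherOf", ":Mary"),
    (":Mary", ":parentOf", ":Matt"),
    (":Mary", ":name", "Mary"),
]
rows = match([("?x", ":brotherOf", "?y"), ("?y", ":parentOf", "?z")], family)
```

Each binding plays the role of one returned row, with one column per variable, which is how the table function exposes results to SQL.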
  • 26. SDO_RDF_MATCH: return Columns (of type VARCHAR2) in each returned row: For each variable ?x in Graph Pattern – x – x$rdfVTYP: URI, Literal, or Blank node – x$rdfLTYP: specific literal type (e.g., xsd:integer) – x$rdfCLOB: contains actual value, if ?x matches a CLOB value – x$rdfLANG: language tag, if any (e.g., "en-us") If no variable in Graph Pattern – A dummy column
  • 27. SDO_RDF_MATCH: matching Matching multiple representations The same point in value space may have multiple representations – "10"^^xsd:integer – "10"^^xsd:positiveInteger – "010"^^xsd:integer – "000010"^^xsd:integer SDO_RDF_MATCH automatically resolves these
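One way to picture this resolution is to map each literal to a canonical value before comparing. The slide does not say how Oracle does it internally, so the Python below is only a sketch of the idea; the function name `canonical` and the small set of handled datatypes are assumptions.

```python
XSD = "http://www.w3.org/2001/XMLSchema#"
# Only the two integer datatypes from the slide are handled in this sketch.
INTEGER_TYPES = {XSD + "integer", XSD + "positiveInteger"}


def canonical(lexical, datatype):
    """Map a typed literal to a comparable point in value space."""
    if datatype in INTEGER_TYPES:
        return ("integer", int(lexical))  # "010" and "10" both become 10
    return (datatype, lexical)            # other types compared as written


literals = [
    ("10", XSD + "integer"),
    ("10", XSD + "positiveInteger"),
    ("010", XSD + "integer"),
    ("000010", XSD + "integer"),
]
values = {canonical(lex, dt) for lex, dt in literals}
```

All four spellings collapse to a single value, so a pattern written with any one of them would match data stored with any other.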
  • 28. RDF Query: Example Find salary and hiredate of all the uncles SELECT emp.name, emp.salary, emp.hiredate FROM emp, TABLE(SDO_RDF_MATCH( '(?x :brotherOf ?y) (?y :parentOf ?z) (?x :name ?name)', SDO_RDF_Models('family'), …)) t WHERE emp.name=t.name; Use of SDO_RDF_MATCH allows embedding a graph query in a SQL query
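The join between graph results and a relational table can be imitated with SQLite. This Python sketch is an analogy rather than Oracle code: a temporary table `t` stands in for the rows the table function would return, and the employee data and the match row are invented for the example.

```python
import sqlite3

# Rows a graph match might return for
# (?x :brotherOf ?y) (?y :parentOf ?z) (?x :name ?name); invented sample data.
match_rows = [(":John", ":Mary", ":Matt", "John")]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (name TEXT, salary INTEGER, hiredate TEXT)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?)",
    [("John", 52000, "2001-03-15"), ("Mary", 61000, "1999-07-01")],
)

# The temp table plays the role of TABLE(SDO_RDF_MATCH(...)) t in the slide.
conn.execute("CREATE TEMP TABLE t (x TEXT, y TEXT, z TEXT, name TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", match_rows)

uncles = conn.execute(
    "SELECT emp.name, emp.salary, emp.hiredate "
    "FROM emp JOIN t ON emp.name = t.name"
).fetchall()
```

Because the graph results look like an ordinary row source, the rest of the query is plain SQL.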
  • 29. RDF Query: Example 2 Find pairs of persons residing at the same address where the first person rents a truck and the second person buys a fertilizer SELECT t3.x name1, t3.y name2 FROM AddrTable t1, AddrTable t2, TABLE(SDO_RDF_MATCH( '(?x :rents ?a) (?a rdf:type :Truck) (?y :buys ?b) (?b rdf:type :Fertilizer)', SDO_RDF_Models('Activities'), …)) t3 WHERE t1.name=t3.x and t2.name=t3.y and t1.addr=t2.addr;
  • 31. Rulebase: Overview Each RDF rulebase consists of a set of rules Each rule consists of – antecedent: graph-pattern – filter condition (optional) – consequent: graph-pattern One or more rulebases may be used with relevant RDF models (graphs) to obtain entailed graphs
  • 32. Rulebase: Example Rules in a rulebase family_rb: Antecedent: '(?x :brotherOf ?y) (?y :parentOf ?z)' Filter: NULL Consequent: '(?x :uncleOf ?z)' Antecedent: '(?x :age ?a)' Filter: 'a >= 65' Consequent: '(?x :ageGroup "Senior")' Antecedent: '(?x :parentOf ?y) (?y :parentOf ?z)' Filter: NULL Consequent: '(?x :grandParentOf ?z)'
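The three family_rb rules can be applied with a small forward-chaining loop. The Python below is a sketch of the antecedent / filter / consequent idea only: `match` and `entail` are hypothetical helpers, filters are written as Python callables rather than the slide's string syntax, ages are plain integers, and the sample data is invented.

```python
def match(pattern, triples):
    """Naive graph-pattern matcher; terms starting with '?' are variables."""
    def unify(pat, triple, binding):
        extended = dict(binding)
        for p, t in zip(pat, triple):
            if isinstance(p, str) and p.startswith("?"):
                if extended.setdefault(p, t) != t:
                    return None
            elif p != t:
                return None
        return extended

    bindings = [{}]
    for pat in pattern:
        bindings = [b2 for b in bindings for t in triples
                    if (b2 := unify(pat, t, b)) is not None]
    return bindings


# (antecedent, filter or None, consequent), mirroring the slide's rulebase.
FAMILY_RB = [
    ([("?x", ":brotherOf", "?y"), ("?y", ":parentOf", "?z")],
     None, ("?x", ":uncleOf", "?z")),
    ([("?x", ":age", "?a")],
     lambda b: b["?a"] >= 65, ("?x", ":ageGroup", "Senior")),
    ([("?x", ":parentOf", "?y"), ("?y", ":parentOf", "?z")],
     None, ("?x", ":grandParentOf", "?z")),
]


def entail(triples, rules):
    """Apply rules until nothing new appears; return only inferred triples."""
    known = set(triples)
    while True:
        new = set()
        for antecedent, flt, consequent in rules:
            for b in match(antecedent, known):
                if flt is None or flt(b):
                    new.add(tuple(b.get(term, term) for term in consequent))
        if new <= known:
            return known - set(triples)
        known |= new


base = {
    (":John", ":brotherOf", ":Mary"),
    (":Mary", ":parentOf", ":Matt"),
    (":Matt", ":parentOf", ":Cindy"),
    (":John", ":age", 16),
    (":Mary", ":age", 70),
}
inferred = entail(base, FAMILY_RB)
```

The returned set holds only the derived triples, which corresponds to what the deck calls the entailed graph.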
  • 34. Rule Index: Overview A rule index represents an entailed graph A rule index is created on an RDF dataset (consisting of a set of RDF models and a set of RDF rulebases)
  • 35. Rule Index: Example A rule index may be created on a dataset consisting of – family RDF data, and – family_rb rulebase (shown earlier) The rule index will contain inferred triples showing uncleOf and ageGroup information
  • 36. RDF Query with Inference
  • 37. SDO_RDF_MATCH with Rulebases Arguments – Graph pattern: a sequence of triples (with variables) – RDF data set: a set of models and a set of rulebases – Filter – Aliases … FROM TABLE(SDO_RDF_MATCH( '(?x :uncleOf ?y)', SDO_RDF_Models('family'), SDO_RDF_Rulebases('rdfs', 'family_rb') … )) t …
  • 38. RDF Query w/ Inference: Example Find salary and hiredate of all the uncles SELECT emp.name, emp.salary, emp.hiredate FROM emp, TABLE(SDO_RDF_MATCH( '(?x :uncleOf ?y) (?x :name ?name)', SDO_RDF_Models('family'), SDO_RDF_Rulebases('rdfs', 'family_rb'), …)) t WHERE emp.name=t.name;
  • 39. RDF Query w/ Inference: Example 2 Find pairs of persons residing at the same address where the first person rents a truck and the second person buys a fertilizer SELECT t3.x name1, t3.y name2 FROM AddrTable t1, AddrTable t2, TABLE(SDO_RDF_MATCH( '(?x :rents ?a) (?a rdf:type :Truck) (?y :buys ?b) (?b rdf:type :Fertilizer)', SDO_RDF_Models('Activities'), SDO_RDF_Rulebases('rdfs'), …)) t3 WHERE t1.name=t3.x and t2.name=t3.y and t1.addr=t2.addr;
  • 41. Model: DDL Procedures provided as part of the API may be used to – Create a model – Drop a model When a user creates a model, a database view gets created automatically – rdfm_family A model corresponds to a column of type SDO_RDF_TRIPLE_S in a base table Each model has exactly one base table associated with it
  • 42. Model: DDL Creating a Model Create an Application Table CREATE TABLE family_table ( id NUMBER, family_triple SDO_RDF_TRIPLE_S); Create a Model EXEC SDO_RDF.CREATE_RDF_MODEL( 'family', 'family_table', 'family_triple'); Automatically creates the following database view rdfm_family (…)
  • 43. Loading RDF Data into Oracle Java API provided to load N-Triples data into NDM (Oracle's Network Data Model) Sample XSLs provided – To convert RDF to N-Triples – To convert RDF to INSERT statements
  • 44. Model: DML SQL DML commands may be used to do DML operations on a base table to effect DML (i.e., triple insert, delete, and update) on the corresponding model Insert Triples INSERT INTO family_table VALUES (1, SDO_RDF_TRIPLE_S('family', '<http://example.org/family/John>', '<http://example.org/family/brotherOf>', '<http://example.org/family/Mary>'));
  • 45. Model: Security The creator of the base table corresponding to a model can grant privileges to other users To perform DML on a model, a user must have DML privileges for the corresponding base table The creator of a model can grant QUERY privileges on the corresponding database view to other users A user can query only those models for which s/he has QUERY privileges on the corresponding database views Only the creator of a model can drop the model
  • 46. Model: Views Database views corresponding to the models
  • 48. Rulebase: DDL Procedures provided as part of the API may be used to – Create a rulebase create_rulebase('family_rb'); – Drop a rulebase – drop_rulebase('family_rb'); When a user creates a rulebase, a database view gets created automatically – rdfr_family_rb (rule_name, antecedent, filter, consequent, aliases)
  • 49. Rulebase: DML SQL DML commands may be used on the database view corresponding to a target rulebase to insert, delete, and update rules insert into mdsys.rdfr_family_rb values( 'uncle_rule', '(?x :brotherOf ?y) (?y :parentOf ?z)', NULL, '(?x :uncleOf ?z)', SDO_RDF_Aliases(…));
  • 50. Rulebase: Security Creator of a rulebase can grant privileges on the corresponding database view to other users Performing DML operations requires the invoker to have appropriate privileges on the database view Only the creator of a rulebase can drop the rulebase
  • 51. Rulebase: Views RDF_RULEBASE_INFO – Contains the list of rulebases – For each rulebase, contains additional information (such as creator, view name, etc.) Content of each rulebase is available from the corresponding database view
  • 53. Rule Index: DDL Procedures provided as part of the API may be used to – Create a rule index create_rules_index ('family_rb_rix_family', SDO_RDF_Models('family'), SDO_RDF_Rulebases('rdfs','family_rb')); – Drop a rule index drop_rules_index ('family_rb_rix_family'); When a user creates a rule index, a database view gets created automatically – rdfi_family_rb_rix_family (…)
  • 54. Rule Index: Security To create a rule index on an RDF dataset (models and rulebases), user needs to have QUERY privileges on those models and rulebases Creator of a rule index holds QUERY privilege on the rule index and may grant this privilege to other users Only the creator of a rule index can drop it
  • 55. Rule Index: Views RDF_RULEINDEX_INFO – Contains the list of rule indexes – For each rule index, contains additional information (such as creator, status, etc.) RDF_RULEINDEX_DATASETS – For every rule index, stores the names of its models and rulebases
  • 56. Rule Index: Dependencies Content of a rule index depends upon the content of each element of its dataset – Any modification to the models or rulebases in its dataset invalidates the rule index – Dropping a model or rulebase will drop dependent rule indexes automatically.
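The dependency between a rule index and its dataset can be pictured as a cache that is marked stale whenever its model changes. The Python below is only a sketch of that idea: the class names, the `VALID` / `INVALID` status strings, the explicit `rebuild` step, and the single hard-coded grandparent rule are assumptions, not Oracle's mechanism.

```python
class Model:
    """A set of triples plus the rule indexes that depend on it."""

    def __init__(self):
        self.triples = set()
        self.dependents = []

    def insert(self, triple):
        self.triples.add(triple)
        for index in self.dependents:
            index.status = "INVALID"  # any modification invalidates the index


class RuleIndex:
    """Cached entailed triples for one model and one derivation function."""

    def __init__(self, model, derive):
        self.model = model
        self.derive = derive
        self.inferred = set()
        self.status = "INVALID"
        model.dependents.append(self)
        self.rebuild()

    def rebuild(self):
        self.inferred = self.derive(self.model.triples)
        self.status = "VALID"


def grandparents(triples):
    """Stand-in rulebase: parentOf followed by parentOf gives grandParentOf."""
    parent = [(s, o) for s, p, o in triples if p == ":parentOf"]
    return {(a, ":grandParentOf", c)
            for a, b in parent for b2, c in parent if b == b2}


family = Model()
family.insert((":Mary", ":parentOf", ":Matt"))
index = RuleIndex(family, grandparents)
status_after_create = index.status

family.insert((":Matt", ":parentOf", ":Cindy"))
status_after_insert = index.status

index.rebuild()
```

After the second insert the cached result is stale until the index is rebuilt, which is the behaviour the slide describes as invalidation.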
  • 57. Summary RDF Data Model – Models (Graphs) – RDF Query using SDO_RDF_MATCH Table Function RDF Data Model with (user-defined) Rules – Models (Graphs) – Rulebases – Rule Indexes – RDF Query on entailed RDF graphs Management (DDL, DML, Security, …) – Models, Rulebases, and Rule Indexes
  • 62. Demo: Family Model Data (Alt)
  • 63. Demo: Query without Inference select m from TABLE(SDO_RDF_MATCH( '(?m rdf:type :Male)', SDO_RDF_Models('family'), null, SDO_RDF_Aliases( SDO_RDF_Alias('', 'http://www.example.org/family/')), null)); M -------------------------------------------------------------------------------- http://www.example.org/family/Jack http://www.example.org/family/Tom
  • 64. Demo: Query w/ RDFS Inference select m from TABLE(SDO_RDF_MATCH( '(?m rdf:type :Male)', SDO_RDF_Models('family'), SDO_RDF_Rulebases('RDFS'), SDO_RDF_Aliases( SDO_RDF_Alias('', 'http://www.example.org/family/')), null)); M -------------------------------------------------------------------------------- http://www.example.org/family/Jack http://www.example.org/family/Tom http://www.example.org/family/John http://www.example.org/family/Matt http://www.example.org/family/Sammy
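The extra rows in this result are the kind of thing RDFS subclass reasoning produces. The demo's family data is not reproduced in the transcript, so the Python sketch below uses invented triples and assumed class names (:Father, :Grandfather) purely to show the rdfs:subClassOf rule: a resource typed with a subclass is also an instance of the superclass.

```python
def rdfs_types(triples):
    """Return the triples plus rdf:type triples entailed via rdfs:subClassOf.
    Iterating to a fixpoint also covers chains of subclasses."""
    known = set(triples)
    while True:
        sub = {(s, o) for s, p, o in known if p == "rdfs:subClassOf"}
        new = {(x, "rdf:type", d)
               for x, p, c in known if p == "rdf:type"
               for c2, d in sub if c == c2}
        if new <= known:
            return known
        known |= new


# Invented data: the class hierarchy here is an assumption for illustration.
data = {
    (":Jack", "rdf:type", ":Male"),
    (":John", "rdf:type", ":Father"),
    (":Sammy", "rdf:type", ":Grandfather"),
    (":Father", "rdfs:subClassOf", ":Male"),
    (":Grandfather", "rdfs:subClassOf", ":Father"),
}

males_plain = sorted(s for s, p, o in data
                     if p == "rdf:type" and o == ":Male")
males_rdfs = sorted(s for s, p, o in rdfs_types(data)
                    if p == "rdf:type" and o == ":Male")
```

Without inference only the directly typed resource is found; with the subclass rule applied, the resources typed through subclasses appear as well.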
  • 65. Demo: Family Rulebase Antecedent: ‘(?x :parentOf ?y) (?y :parentOf ?z)’ Filter: NULL Consequent: ‘(?x :grandParentOf ?z)’
  • 66. Demo: Query w/ Family and RDFS Inference select x, y from TABLE(SDO_RDF_MATCH( '(?x :grandParentOf ?y) (?x rdf:type :Male)', SDO_RDF_Models('family'), SDO_RDF_Rulebases('RDFS','family_rb'), SDO_RDF_Aliases( SDO_RDF_Alias('','http://www.example.org/family/')), null)); X Y ------------------------------------------------------ ----------------------------------------------------- http://www.example.org/family/John http://www.example.org/family/Cindy http://www.example.org/family/John http://www.example.org/family/Tom http://www.example.org/family/John http://www.example.org/family/Jack http://www.example.org/family/John http://www.example.org/family/Cathy
  • 67. Questions and Answers
  • 68. Demo of Siderean’s Seamark Navigation Server Mike DiLascio & Joanne Luciano
  • 69. Agenda About Siderean Software & Predictive Medicine, Inc. Introducing Seamark Navigation Server v.3.6 Seamark & Oracle 10g RDF Data Model Demonstration of Seamark / Oracle 10g integration Lessons Learned / Q&A
  • 70. About Siderean Software Aggregate, organize and navigate information the way users think, to improve analysis and decision making. Founded in 2001 and based in El Segundo, CA Venture backed in 2004 Delivering RDF-centric navigation and analysis capabilities for end users (a.k.a. "the last mile") Active W3C member leveraging Semantic Web standards Demonstrating integrated Seamark navigation layer over Oracle 10g RDF Data Model in collaboration with Predictive Medicine, Inc.
  • 71. Current solutions "50,000 results!!! Now what?" "I give up! Hello? Get me an apple!" "Why do I get oranges when I'm looking for apples?" IT: "As soon as I fix his, hers stops working." CONTENT PRODUCER: "I just produced three apples last week!" Enterprise search – a brute force approach Knowledge management – breathtakingly expensive
  • 72. Introducing Seamark Navigation Server "I can see the big picture!" "No more staring at a blank text box." "I can drill down quickly to what I want." IT: "I can take my coffee break now." CONTENT PRODUCER: "I knew we had an apple in here somewhere." Seamark – layering organization to deliver pinpoint navigation
  • 73. How it works: process Metadata about data and content is aggregated… Organized into a unified information architecture… Analyzed to generate on-demand views… Providing pinpoint navigation across the data and content [Diagram labels: Term, Person, Text, Place, Event, View]
  • 74. How it works: architecture [Diagram labels: User Navigation and User Tagging; Web Browsers & Portals; Search Engines; User Alerts; Unstructured Content and Data Feeds; Structured Content Sources; Metadata Aggregator; Navigation Metadata; Navigation Web Services; Feed Aggregators]
  • 75. Seamark/Oracle integration architecture: Phase 1 Batch RDFMatch Query issued from Seamark at index time Oracle 10g RDF Data Model for scalable persistence of metadata [Diagram labels: User Navigation and User Tagging; Web Browsers & Portals; User Alerts; Cached Navigation Metadata; Navigation Web Services; Feed Aggregators]
  • 76. Seamark/Oracle integration architecture: Phase 2 Federated RDFMatch Queries issued from Seamark at query time Oracle 10g RDF Data Model for scalable persistence of metadata [Diagram labels: User Navigation and User Tagging; Web Browsers & Portals; User Alerts; Dynamic Navigation Metadata; Navigation Web Services; Feed Aggregators]
  • 77. Seamark Demo: Background & Concepts Life Sciences demonstration premise RDF offers high value during early stage research Leveraging strengths of Oracle 10g & Seamark v3.6 Oracle – large datasets / scalability Seamark – useful subsets / flexible navigation & insights Project elapsed time - about one week Locating and identifying data sources represented the greatest time element Data sources in RDF required minimal integration time Non-RDF data sources required transformation and linking values (non-trivial but straightforward)
  • 78. Seamark Demonstration: Identification of new drug candidates 1. Differentiate different forms of disease 2. Identify patient subgroups 3. Identify top biomarkers 4. Identify function 5. Identify biological and chemical properties and disease associations of biomarker 6. Identify documents 7. Identify role in metabolic pathways 8. Identify compounds that interact 9. Identify and compare function in other organisms 10. Identify any prior art [Diagram: data sources GO2Keyword.rdf, Keywords.rdf, ProbeSet.rdf, GO2UniProt.rdf, GO2OMIM.rdf, OMIM.rdf, IntAct.rdf, GO.rdf, GO2Enzyme.rdf, UniProt.rdf, Taxonomy.rdf, PubMed.xml, Enzymes.rdf, KEGG.rdf; linked entities Keyword, Probe, Protein, Gene, MIM Id, Enzyme, Organism, Citation, Compound, Pathway]
  • 79. Live Seamark Life Sciences Demonstration: Sample Screenshots
  • 80. Seamark application start page shows integration of OMIM, GO, KEGG, UniProt and NCBI
  • 81. Select: Probe Set ID: “M18255_cds2_s_at”
  • 82. Results: 9 Matches on “M18255_cds2_s_at” to the Gene Ontology Cytoplasm 1st of 9 Matches Cellular Location Via Gene Ontology
  • 83. Cytoplasm 1st of 9 Matches Page Scroll
  • 84. Cytoplasm 1st of 9 Matches Page Scroll Plasma Membrane, …, 2nd of 9 Matches Cellular Location Via Gene Ontology Page Scroll for more results, etc.
  • 85. Start Page: Optionally search across entire collection based upon keywords from the integrated data sources
  • 86. Seamark Lessons Learned RDF offers multiple unconstrained views of data/relationships – Provides maximum flexibility during early stage research – Later stages can leverage OWL to constrain known relationships Data providers – Timing is right to publish in RDF format – Cut your customer’s integration costs – Speed discovery time Even with one week of effort… – Proof of Concept demonstrates value of broad & deep integration – Additional value in extending POC in customer pilot initiatives
  • 87. Siderean Seamark Conclusion Getting the precise information we need from today’s data glut is profoundly difficult Solving this problem requires a solution that works the way you think Siderean is the world’s first turnkey navigation server for the enterprise and people at large
  • 88. Thank You! To arrange a demonstration of Seamark or for more information please contact: Mike DiLascio Office: +1 781 652 0339 Mobile: +1 781 354 7663 mdilascio@siderean.com Siderean Software, Inc. 390 North Sepulveda Blvd., Suite 2070 El Segundo, CA 90245-4475 USA http://www.siderean.com