My presentation to the IEEE Asia Pacific Services Computing Conference 2009 - Semantic Web Services In Practice (SWSIP 09) track. This show introduces the SADI (Semantic Automated Discovery and Integration) Framework - the replacement for our earlier explorations with the BioMoby project. In this slideshow I explore what SADI is, and why it is able to generate such interesting (and useful!) Semantic-Webby behaviours from Web Services. I also discuss our current research activities around how we are trying to exploit the SADI system to create much more natural query interfaces for Cardiovascular researchers.
13. Can we define frameworks and/or best practices that make Web Services âtransparentâ to the Semantic Web?
14. Semantic Web? An information system where machines can receive information from one source, re-interpret it, and correctly use it for a purpose that the source had not anticipated.
15. Semantic Web? If we cannot achieve those two things, then IMO we donât have a âsemantic webâ, we only have a distributed (??), linked database⊠and that isnât particularly excitingâŠ
16. What do we need to do? Make Web Services exhibit the same âbehaviorâ as GET Add rich Semantics to Web Services at every level of the WS âstackâ
17. What do we have to work with?(Without inventing new technologies or standardsâŠ) Data Interface annotations
19. History of SADIBased on observations of the usage and behaviour of BioMoby Semantic Web Servicessince 2002
20. SADI Observation #1: What does âGETâ really do? What âbehaviourâ of GET do we need to mimic in a Web Service?
21. SADI Observation #1: GET guarantees the response relates to the input URI in a very precise and predictable way POST does not⊠Web Services have a fundamentally different semantic behaviour than the Semantic Web
22. SADI Observation #1: We can fix that! (without breaking any existing rules or standards!)
23. SADI Observation #2: All Web Services in Bioinformatics create implicitbiologicalrelationships between their input and output
26. SADI: Core Proposition Web Service output must be Triples SUBJECT URI of the output graph is the same URI as the SUBJECT URI of the input graphthat was POSTedto the Web Service Define OWL classes that describe your input and output triples
27. Consequence(i.e. why SADI works!) The Semantics of the triples from a Web Service are now explicit and identical to the Semantics of GET
28. How do we exploit this? Current Web Service definition systems such as WSDL, and OWL-S are missing a crucial annotation element⊠âŠthe semantic relationship between input and output
29. How do we exploit this? With SADI services, this relationship can be automatically derived by âdiff-ingâ the input and output OWL classes âŠbecause the SUBJECT node of the input and output graphs is guaranteed to be the same!
30. How do we exploit this? We have constructed a simple prototype Web Services registry that provides discovery based on annotation of these biological relationships
31. Demo #1 a plug-in to the IO-Informatics Knowledge Explorer
32. The Sentient Knowledge Explorerfrom IO InformaticsA Semantic Integration, Visualization and Search Tool
33. SADI Plug-in to the Sentient Knowledge Explorer The Knowledge Explorer has a plug-ins API that we have utilized to pass data directly from SADI to the KE visualization and exploration system Data elements in KE are queried for their âtypeâ (rdf:type); this is used to discover additional data properties by querying the SADI Web Service registry
49. SADI Demo #2Imagine there is a âvirtual graphâ connecting every conceivable input for every conceivable Web Service to their respective outputs How do we query that graph?
52. What pathways does UniProt protein P47989 belong to? PREFIX pred: <http://sadiframework.org/ontologies/predicates.owl#> PREFIX ont: <http://ontology.dumontierlab.com/> PREFIX uniprot: <http://lsrn.org/UniProt:> SELECT ?gene ?pathway WHERE { uniprot:P47989 pred:isEncodedBy ?gene . ?gene ont:isParticipantIn ?pathway . }
53.
54.
55.
56. Recapwhat we just saw A standard SPARQL query was entered into SHARE â a SADI-based query engine The query was interpreted to extract the properties being queried and these were passed to SADI for Web Service discovery SADI searched-for, found, and accessed the databases and/or analytical tools capable of generating those properties
57. Recapwhat we just saw We posed, and answered a complex database query WITHOUT A DATABASE (in fact, the data didnât even have to exist...)
58. RDF graph browsing and querying is useful⊠âŠbut whereâs the semantics??Letâs bring in some ontologies!
59. My Definition of Ontology (for this talk) Ontologies explicitly define the things that exist in âthe worldâ based on what propertieseach kind of thing must have
60. Ontology Spectrum Frames (Properties) Thesauri ânarrower termâ relation Selected Logical Constraints (disjointness, inverse, âŠ) Catalog/ ID Formal is-a Informal is-a Formal instance General Logical constraints Terms/ glossary Value Restrs.
61. Semantic Web? An information system where machines can receive information from one source, re-interpret it, and correctly use it for a purpose that the source had not anticipated.
66. Benefit of late binding Data is amenable to constant re-interpretation
67. How do we achieve this? DO NOT PRE-CLASSIFY DATA! just hang properties on it
68. These are axioms that enable classification Frames (Properties) Thesauri ânarrower termâ relation Selected Logical Constraints (disjointness, inverse, âŠ) Catalog/ ID Formal is-a Informal is-a Formal instance General Logical constraints Terms/ glossary Value Restrs. These are indexing/annotation systems
69. Design OWL-DL ontologies describing cardiovascular concepts Ontologies are in the âFrames+â area of the Ontology spectrum, and therefore can leverage SADI and be âexecutedâ as workflows
70. What theâŠ??? Did you just say âexecute ontologies as workflows?!â
76. As OWL AxiomsHomologousMutantImage is owl:equivalentTo { Gene Q hasImage image P Gene Q hasSequence Sequence Q Gene R hasSequence Sequence R Sequence Q similarTo Sequence R Gene R = âmy gene of interestâ }
79. Demo #3 Discover instances of OWL classes from data that doesnât existâŠ
80. Show me patients whose creatinine level is increasing over time, along with their latest BUN and creatinine levels. PREFIX regress: <http://sadiframework.org/examples/regression.owl#> PREFIX patients: <http://sadiframework.org/ontologies/patients.owl#> PREFIX pred: <http://sadiframework.org/ontologies/predicates.owl#> SELECT ?patient ?bun ?creat FROM <http://sadiframework.org/ontologies/patients.rdf> WHERE { ?patient patients:creatinineLevels ?collection . ?collection regress:hasRegressionModel ?model . ?model regress:slope ?slope FILTER (?slope > 0) . ?patient pred:latestBUN ?bun . ?patient pred:latestCreatinine ?creat . }
81.
82.
83.
84. Show me patients whose creatinine level is increasing over time, along with their latest BUN and creatininelevels(now as an OWL class ï ElevatedCreatininePatient) PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> PREFIX patients: <http://sadiframework.org/ontologies/patients.owl#> PREFIX pred: <http://sadiframework.org/ontologies/predicates.owl#> SELECT ?patient ?bun ?creat FROM <http://sadiframework.org/ontologies/patients.rdf> WHERE { ?patient rdf:typepatients:ElevatedCreatininePatient . ?patient pred:latestBUN ?bun . ?patient pred:latestCreatinine ?creat . }
86. Regression models have features like slopes and intercepts⊠and so onThe class is completely decomposed until a set of required Services are discoveredcapable of creating all these necessary properties
87. Successful decomposition of the OWL class to discover the need for a LinearRegression Web Service, and so on
89. Demo #3 Discover instances of OWL classes from data that doesnât existâŠ
90. SADI and CardioSHARE OWL Class restrictions converted into workflows SPARQL queries converted into workflows Reasoning happening at the same time as queries are being executed
93. âReckoningâDynamic discovery of instances of OWL classes through synthesis and invocation of a Web Service workflow capable of generating data described by the OWL Class restrictions, followed by reasoning to classify the data into that ontology
94. But OWL Classes can be (and often are) completely âmade-upâWhat does that mean in the context of SADI and CardioSHARE?
95. Current Research We believe that ontologies and hypothesesare, in some ways, the same âthingââŠâŠsimply assertions about individuals that may or may not existIf an instance of a class exists, then the hypothesis is âvalidatedâ
96. Current Research Constructing OWL classes that represent patient categories mimicking clinical outcomes research experiments done on the BC Cardiac RegistryThe original experiments have already been published, and therefore act as a gold-standardWe attempt to use âReckoningâ to automatically discover instances of these patient Classes and compare our results with those of the clinical researchers
97.
98. Hypothesis Ischemia SADI Web âagentsâ Hypertension Blood Pressure CardioSHARE architecture: Increasingly complex ontological layers organize data into richer concepts, even hypotheses Database 1 Database 2 AnalysisAlgorithmXX
99. Recap SADI Semantic Web Services generate triples; the predicates of those triples are indexed... Period!! For a given query, determine which properties are available, and which need to be discovered/generated Discovery of services through index queries + on-the-fly âclassificationâ of local data against the service interface OWL Classes
100. Semantic Web An information system where machines can receive information from one source, re-interpret it, and correctly use it for a purpose that the source had not anticipated.
101. What SADI + SHARE supports Re-interpretation : The SADI data-store simply collects properties, and matches them up with OWL Classes in a SPARQL query and/or from individual service providerâs Web Service interface
102. What SADI + SHARE supports Novel re-use: Because we donât pre-classify, there is no way for the provider to dictate how their data should be used. They simply add their properties into the âcloudâ and those properties are used in whatever way is appropriate for me.
103. Important âwinsâ Data remains distributed â no warehouse! Data is not âexposedâ as a SPARQL endpoint ï greater provider-control over computational resources Yet data appears to be a SPARQL endpoint⊠no modification of SPARQL or reasoner required.
104. And all this because we simply require that the input URI is the same as the output URI
105. Join us! SADI and CardioSHARE are Open-Source projects Come join us â weâre having a lot of fun!! http://sadiframework.org
106. Credits Benjamin VanderValk (SADI & CardioSHARE) Luke McCarthy (SADI & CardioSHARE) SoroushSamadian (CardioSHARE) IO Informatics (Knowledge Explorer API) microsoftResearch Fin