1. Research… this time it’s Personal! Mark Wilkinson, PI Bioinformatics, Heart + Lung Institute @ St. Paul’s Hospital Vancouver, BC, Canada
2. DEMO...to give you incentive to listen to the rest of the presentation ;-)
3. Show me patients with elevated creatinine along with their latest BUN and creatinine levels PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> PREFIX patients: <http://sadiframework.org/ontologies/patients.owl#> PREFIX pred: <http://sadiframework.org/ontologies/predicates.owl#> SELECT ?patient ?bun ?creat FROM <http://sadiframework.org/ontologies/patients.rdf> WHERE { ?patient rdf:typepatients:ElevatedCreatininePatient . ?patient pred:latestBUN ?bun . ?patient pred:latestCreatinine ?creat . }
16. >1000 X more data in the “Deep Web” than in Web pages
17. Accessing these databases and analytical algorithms “transparently”, based on an individual researcher’s ideas, beliefs, and preferenceswill help us personalize medical research
18. Mark Butler (2003) Is the semantic web hype? Hewlett Packard laboratories presentation at MMU, 2003-03-12
19. Semantic Web?(my definition) An information system where machines can receive information from one source, re-interpret it, and correctly use it for a purpose that the source had not anticipated.
29. SADI Best Practice #1 Make the implicit explicit… A Web Service should create “triples” linking the input data to the output data, thus explicitly describing the semantic relationship between them
30. SADI Best Practice #1 This is what bioinformatics Web Services implicitly do anyway! Easy to implement this as a best-practice
31. SADI Observation #2:HTTP GET and POST GET guarantees the response relates to the request URI in a very precise and predictable way POST does not…
32. SADI Observation #2:GET and POST That’s why Web Services have a fundamentally different behaviour than the Semantic Web
33. SADI Observation #2:GET and POST We can fix that! (without breaking any existing rules or standards!)
34. SADI Best Practice #2 SUBJECT URI of the output graph (triples) is the sameas the SUBJECT URI of the input graph (triples) (the output is “about” the input... Now explicitly!)
35. Consequence The “Semantics” of our interaction with the Web Service are now explicitandidentical to the “Semantics” of GET
36. SADI Web Service Interfaces Service Interfaces defined by two OWL classes:
39. SADI Web Service Interfaces My Service consumes OWL Individuals of Class #1and returns OWL Individuals of Class #2 …but the URI of those two individuals is the same!(see best practice #2)
40. How do we discover services? Since input and output are about the same “thing”, we can automatically determine what a service doesby comparing the Input and Output OWL classes
41. How do we discover services? Automatically index services in a registry based on what properties (predicates) Services add to their respective input data
42. EXAMPLE Input Data: BRCA1 rdf:type Gene ID Output Data: BRCA1 hasDNASequence AGCTTAGCCA… Registry Index: Service provides “hasDNASequence” property to Gene IDs
43. Now we can answer questions like “what is the DNA sequence of BRCA1?” Discover a SADI Web Service that generates the DNA Sequence property for gene identifiers
59. …but I’m supposed to be personalizing research… Let’s make this a little more personal by bringing in Ontologies
60. My Definition of Ontology (for this talk) Ontologies explicitly define the things that exist in “the world” based on what propertieseach kind of thing must have
61. Ontology Spectrum Frames (Properties) Thesauri “narrower term” relation Selected Logical Constraints (disjointness, inverse, …) Catalog/ ID Formal is-a Informal is-a Formal instance General Logical constraints Terms/ glossary Value Restrs.
62. Demo #2 Discover instances of OWL classes from data that doesn’t exist…
63. Show me patients whose creatinine level is increasing over time, along with their latest BUN and creatinine levels. PREFIX regress: <http://sadiframework.org/examples/regression.owl#> PREFIX patients: <http://sadiframework.org/ontologies/patients.owl#> PREFIX pred: <http://sadiframework.org/ontologies/predicates.owl#> SELECT ?patient ?bun ?creat FROM <http://sadiframework.org/ontologies/patients.rdf> WHERE { ?patient patients:creatinineLevels ?collection . ?collection regress:hasRegressionModel ?model . ?model regress:slope ?slope FILTER (?slope > 0) . ?patient pred:latestBUN ?bun . ?patient pred:latestCreatinine ?creat . }
64.
65.
66.
67. Show me patients with elevated creatinine along with their latest BUN and creatinine levels PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> PREFIX patients: <http://sadiframework.org/ontologies/patients.owl#> PREFIX pred: <http://sadiframework.org/ontologies/predicates.owl#> SELECT ?patient ?bun ?creat FROM <http://sadiframework.org/ontologies/patients.rdf> WHERE { ?patient rdf:typepatients:ElevatedCreatininePatient . ?patient pred:latestBUN ?bun . ?patient pred:latestCreatinine ?creat . }
69. Regression models have features like slopes and intercepts… and so onThe class is completely decomposed until a set of required Services are discoveredcapable of creating all these necessary properties
70. Successful decomposition of the OWL class to discover the need for a LinearRegression Web Service, and so on
72. OWL Class restrictions converted into workflows SPARQL queries converted into workflows Reasoning happening in parallel with query executionData fulfilling OWL models is discovered, or generated through running analytical tools SADI and CardioSHARE
74. Show me patients whose creatinine level is increasing over time, along with their latest BUN and creatinine SELECT ?patient ?bun ?creat FROM <http://sadiframework.org/ontologies/patients.rdf> WHERE { ?patient rdf:typepatients:ElevatedCreatininePatient . ?patient pred:latestBUN ?bun . ?patient pred:latestCreatinine ?creat . }
75. I created a small ontologydescribing my definition ofan Elevated Creatinine Patient
86. What’s a fancy word for a world-view that you make-up? Hypothesis
87. Current Research We believe that ontologies and hypothesesare, in some ways, the same “thing”……simply assertions about individuals that may or may not existFuture SADI client applications will supportdata-driven hypothesis generation and resolution
88. Recap SADI Semantic Web Services generate triples; the predicates of those triples are indexed... Period. For a given query, determine which properties are available, and which need to be discovered/generated Find services that generate the properties we need
89. Semantic Web An information system where machines can receive information from one source, re-interpret it, and correctly use it for a purpose that the source had not anticipated. My Purpose!!
90. What SADI + SHARE supports Re-interpretation We constantly compare the collection of properties, gathered from third-parties worldwide, to whatever world-model (query/ontology) we wish to view it through. MY world model
91. What SADI + SHARE supports Novel re-use There is no way for the provider to dictate how their data should be used, or how it should be interpreted. They simply add their properties into the “data cloud” and those properties are used in whatever way is appropriate forME.
92. And all this because SADI simply requires that the input URI is the same as the output URI
93. Semi-automated SADI service writing and deployment Taverna Semantically-guided SADI service discovery and pipelining SADI Plug-ins
94. Simple and Open WINS! Join us! We have recently received funding from CANARIEto assist and train service providersin deploying their own SADI Semantic Web Services Come join us – we’re having a lot of fun!! http://sadiframework.org http://twitter.com/sadiframework
95. Credits Benjamin VanderValk (SADI & CardioSHARE) Luke McCarthy (SADI & CardioSHARE) SoroushSamadian (CardioSHARE) Microsoft Research Fin This presentation available on SlideShare: keywords ‘wilkinson’ ‘BioIT-2010’
96. Credits Benjamin VanderValk(SADI & CardioSHARE) Luke McCarthy (SADI & CardioSHARE) SoroushSamadian(CardioSHARE)