Tools for Next Generation of CMS: XML, RDF, & GRDDL



       Chimezie Ogbuji (chee-meh)
       Cleveland Clinic Foundation
       Cardiothoracic Surgery Research
       ogbujic@ccf.org / chimezie@gmail.com
Background (CT Research Roadmap)

●   A large, relational registry for Cardiothoracic
    procedures
●   Relatively small research department with very little
    software engineering experience
●   Traditional CMS and DBMS were insufficient
●   Initiated a large effort to convert to a metadata-
    driven XML / RDF repository (SemanticDB)
●   Need to replace a productive, integrated research
    pipeline
     –   Data entry, clinical Q&A, patient follow-up, concurrent
         study management,...
     –   100+ research papers per year
Background (Institute of Medicine Proposal)
●   The Computer-Based Patient Record: An Essential
    Technology for Health Care
     – ISBN: 0309055326
●   Old but very relevant set of requirements by the
    IOM (still unfulfilled).
●   A comprehensive attempt to address all the
    requirements: technological, clinical, procedural,
    etc.
●   Can be (completely) addressed with Semantic Web
    architecture, document processing, and “Web 2.0”
    architecture.
CPR: Functional Requirements
●   Uniform, extensible record content
●   (Standard) record formats
●   System performance
●   Linkages
●   Intelligence
●   Reporting Capabilities
●   Security
●   Multi-views
●   Accessibility
Definitions: KR / CMS
●   What is Knowledge Representation (KR)?
●   What is a Knowledge Base (KB)?:
     – A database system which facilitates
       deductive reasoning over a KR
     – Commonly called Rule-based Systems
●   What are Expert Systems?
●   What is a Content Management System
    (CMS)?
Knowledge Representation




●   Older ideas at corners, newer ideas along sides
    (Credit: Conrad Barski, M.D.)
Content Management System: The What
●   The terms CMS and Content Repository are
    essentially interchangeable
●   Modern content repositories are best characterized
    by JSR 170 / 283
●   “.. a high-level information management system that
    is a superset of traditional data repositories”
●   Integrated support for the XPath data model is the
    most prominent feature (native document
    management)
Content Repository Feature Set

●   Modern CMS standards cover document
    management effectively
    –   Read/write access
    –   Versioning
    –   Event monitoring
    –   Document-level access control
    –   Concurrent access
    –   Cross-linking
    –   Profiles and Document Types
Anatomy of a JSR 170 Implementation

●   Apache Jackrabbit
●   Component-based
    –   Content Applications
    –   Content Repository API
    –   Implementation
Knowledge Bases and CMS

●   What of the requirements that Expert Systems
    meet?
●   Document management and knowledge
    management systems are historically isolated from
    each other
●   XML & RDF are contemporary manifestations of
    these methodologies
●   They have remained as isolated as their
    predecessors
●   They typically only coincide with regard to syntax
XML & RDF: Eating and Having your Cake
●   Classic example of where the document-oriented
    approach falls short:
    –   Modern EHRs cannot facilitate dynamic research
●   Unified infrastructure for document and
    knowledge management is needed
●   One of the earliest examples:
    –   4Suite Server version 0.10.0 (December 2000)
●   Current state of the art (GRDDL):
    –   Gleaning Resource Descriptions from Dialects of Language
GRDDL: The Elevator Pitch
●   Provides a way to normalize RDF concrete
    syntaxes
●   The problem:
    –   Many RDF concrete syntaxes (RDF/XML, TriX, RDFa, ...)
    –   The authoritative concrete syntax is not without issues
●   The solution:
    –   Define mappings from XML dialects to RDF graphs
    –   Use Turing-complete XML pipelines
●   English as a second language analogy
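A minimal sketch of the "mapping from an XML dialect to an RDF graph" idea. GRDDL transformations are typically XSLT; here a plain Python function stands in for the transformation. The `<patient>` dialect, the `http://example.org/` base URI, and the `vocab#` property names are all invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML dialect; not taken from the original deck.
SOURCE = """<patient id="p1">
  <name>Jane Doe</name>
  <procedure code="CABG"/>
</patient>"""

BASE = "http://example.org/"  # assumed base URI for illustration


def glean(xml_text):
    """Map one patient document to a set of (subject, predicate, object) triples."""
    root = ET.fromstring(xml_text)
    subject = BASE + "patient/" + root.get("id")
    # One triple for the patient's name.
    triples = {(subject, BASE + "vocab#name", root.findtext("name"))}
    # One triple per procedure element.
    for proc in root.findall("procedure"):
        obj = BASE + "procedure/" + proc.get("code")
        triples.add((subject, BASE + "vocab#underwent", obj))
    return triples


triples = glean(SOURCE)
```

Whatever dialect the source uses, the consumer sees only the resulting set of triples, which is the normalization the slide describes.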
The GRDDL Picture
GRDDL: The Components
●   Faithful Rendition
    –   “By specifying a GRDDL transformation, the author of a
        document states that the transformation will provide a
        faithful rendition in RDF of information (or some portion
        of the information) expressed through the XML dialect
        used in the source document.”
●   Various mechanisms for nominating transformations:
    –   Specific XML attribute, XML Namespaces, HTML
        Profiles, and XHTML links
●   GRDDL-aware agents compute GRDDL results
    (RDF graphs)
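As a rough illustration of the first nomination mechanism (a specific XML attribute), the sketch below reads the `grddl:transformation` attribute from a document's root element. The GRDDL namespace URI is the one defined by the W3C specification; the `<record>` document and the stylesheet URLs are hypothetical. A real GRDDL-aware agent would go on to fetch and apply each transformation, which this sketch does not attempt.

```python
import xml.etree.ElementTree as ET

GRDDL_NS = "http://www.w3.org/2003/g/data-view#"

# Hypothetical source document nominating two transformations.
DOC = """<record xmlns:grddl="http://www.w3.org/2003/g/data-view#"
        grddl:transformation="http://example.org/to-rdf.xsl http://example.org/extra.xsl">
  <title>Example</title>
</record>"""


def nominated_transformations(xml_text):
    """Return the transformation IRIs nominated on the root element, if any."""
    root = ET.fromstring(xml_text)
    # ElementTree exposes namespaced attributes as "{namespace}localname".
    value = root.get("{%s}transformation" % GRDDL_NS, "")
    # The attribute value is a whitespace-separated list of IRIs.
    return value.split()


transformations = nominated_transformations(DOC)
```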
The CMS Alternative: “Dual Representation”
●   Persist XML in synchrony with its faithful rendition
    –   Changes to the XML trigger calculation and storage of
        corresponding RDF
●   “Dual Representation”
●   Implemented by 4Suite Server Document
    Definitions
●   The basis of how we capture patient records with
    maximum syntactic and semantic expressivity
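The synchrony described above can be sketched as a small in-memory store: every write of an XML document recomputes and replaces the RDF derived from it. This is not the 4Suite API; `DualStore`, `title_mapping`, and the example URIs are hypothetical names, and the mapping function plays the role of a document definition.

```python
import xml.etree.ElementTree as ET


class DualStore:
    """Keeps each XML document together with the triples derived from it."""

    def __init__(self, mapping):
        self.mapping = mapping  # callable: (uri, root element) -> iterable of triples
        self.xml = {}           # uri -> XML source text
        self.graphs = {}        # uri -> set of triples derived from that document

    def put(self, uri, xml_text):
        root = ET.fromstring(xml_text)  # raises ParseError on malformed XML
        self.xml[uri] = xml_text
        # Replace the old graph wholesale so the RDF never lags behind the XML.
        self.graphs[uri] = set(self.mapping(uri, root))


def title_mapping(uri, root):
    """Hypothetical document definition: expose the <title> as dc:title."""
    yield (uri, "http://purl.org/dc/elements/1.1/title", root.findtext("title"))


store = DualStore(title_mapping)
store.put("http://example.org/doc/1", "<doc><title>Draft</title></doc>")
store.put("http://example.org/doc/1", "<doc><title>Final</title></doc>")
```

After the second `put`, the graph for the document holds only the triple derived from the latest XML.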
Document Definition




●   The document definition is the mapping
    –   Usually an XSLT document
Content Repository Architecture
Overlap between Content Repository APIs
Dual Representation: Advantages
●   Maximum expressiveness and versatility of content
●   Unified naming convention and access control
    (more on this later)
●   Uniform, concrete RDF syntaxes
    –   For systems which speak XML fluently (XForms, POX
        over HTTP, WS-*, etc.)
●   Cheap support for XML & RDF content negotiation
●   Use of RDF as a semantic index for XML
Document Definition: Similarities
●   GRDDL
●   RDDL
    –   Resource Directory Description Language
    –   Human-readable descriptive material about a target
    –   A directory of individual resources related to a target
          ● Nature and Purpose

          ● Schema, stylesheet, etc.

    –   Lives at a namespace URI
●   WXS's targetNamespace
●   Common theme is a set of definitions for a
    document or a class of documents
Registering a Document to a Class

●   Namespace registration works well for the web
    (preferred approach of W3C TAG)
●   What if you don't control the content served from
    the namespace of an existing vocabulary?
    –   Atom, Docbook, etc.
●   A CMS is better suited for a 'closed' / 'controlled'
    approach
    –   Persist membership metadata in the CMS
SemanticDB and Dual Representation
Document and Graph Granularity

●   Tying documents to graphs normalizes the content
    granularity
●   Documents and their RDF graphs can be treated
    uniformly:
    –   Naming convention
    –   Targeted querying
    –   Access control management
JSR Fine-Grained Control
'Controlled' Naming Convention
Controlled Naming Convention: Continued
●   RDF Dataset (from SPARQL):
    –   A collection of named graphs
●   The RDF is stored in a graph with the same URI as
    the XML source document
●   When RDF is used as the primary cross-document
    'index' you can:
    –   SELECT ?graph WHERE { GRAPH ?graph { ... } }
    –   document($graph)/.. XPath ..
●   The space compromise (of dual representation) can
    be further mitigated by only extracting a minimal
    RDF graph
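The two-step pattern above (select graph names from the RDF, then open the XML document with the same URI) can be sketched without a SPARQL engine. The dictionaries below stand in for the RDF dataset and the XML store; all URIs, element names, and the `vocab#status` property are hypothetical.

```python
import xml.etree.ElementTree as ET

# XML documents keyed by URI (hypothetical content).
documents = {
    "http://example.org/doc/1": "<record><patient>p1</patient><note>Stable</note></record>",
    "http://example.org/doc/2": "<record><patient>p2</patient><note>Follow-up due</note></record>",
}

# Named graphs keyed by the same URIs as their source documents.
dataset = {
    "http://example.org/doc/1": {
        ("http://example.org/patient/p1", "http://example.org/vocab#status", "stable"),
    },
    "http://example.org/doc/2": {
        ("http://example.org/patient/p2", "http://example.org/vocab#status", "pending"),
    },
}


def graphs_matching(predicate, obj):
    """Rough analogue of SELECT ?graph WHERE { GRAPH ?graph { ?s <predicate> obj } }."""
    return sorted(
        name
        for name, triples in dataset.items()
        if any(p == predicate and o == obj for _, p, o in triples)
    )


def notes_for(predicate, obj):
    """Use the RDF as an index, then read a path from each matching XML document."""
    return [
        ET.fromstring(documents[graph]).findtext("./note")
        for graph in graphs_matching(predicate, obj)
    ]


pending_notes = notes_for("http://example.org/vocab#status", "pending")
```

Because the graph name and the document URI are identical, no separate lookup table is needed to get from a query hit back to the source XML.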
Uniform Access Control for XML/RDF CMS
●   Traditionally, Access Control Lists are associated
    with an object
    –   Example: a file or directory in a filesystem
●   Assign document / graph ACLs to a single URI
    –   Certain users / groups can query the RDF but cannot
        read the XML
    –   De-identification of EHR: HIPAA
●   The 4Suite repository supports unified XML/RDF
    ACL
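One way to picture a single ACL attached to a shared document/graph URI is sketched below. It does not reproduce the 4Suite ACL model; the `Acl` class, the permission names `read-xml` and `query-rdf`, and the principals are assumptions made for illustration.

```python
class Acl:
    """Access control entries keyed by the URI shared by a document and its graph."""

    def __init__(self):
        self._grants = {}  # uri -> {permission -> set of principals}

    def grant(self, uri, permission, principal):
        self._grants.setdefault(uri, {}).setdefault(permission, set()).add(principal)

    def allowed(self, uri, permission, principal):
        return principal in self._grants.get(uri, {}).get(permission, set())


acl = Acl()
uri = "http://example.org/doc/1"

# Clinicians may read the identifiable XML and query the RDF.
acl.grant(uri, "read-xml", "clinician")
acl.grant(uri, "query-rdf", "clinician")
# Researchers may only query the RDF extracted from the record.
acl.grant(uri, "query-rdf", "researcher")
```

With this arrangement a researcher can query the RDF but is refused the XML, which matches the de-identification scenario on the slide, provided the extracted graph itself omits identifying fields.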
Going Forward
●   The SPARQL RDF dataset needs to be generalized
    –   There is a long list of representation problems solved by
        a formal named graph specification
●   RDF graphs need to be first-class objects in CMS
●   Build a common Content Repository API for XML /
    RDF on the JSR 170 / 283 foundation
●   Where do the 4Suite Repository API and JSR 170 /
    283 overlap?
●   How do we generalize Document Definitions?
A Proposal for XML/RDF CMS
Primary Takeaways
●   We need to stop thinking of XML & RDF as mutually
    exclusive solutions to similar problems
●   CMS standards are needed for the next generation
    of semantic / rich web applications
●   These standards can preemptively level the
    landscape of toolkits in this space
References
●   D. Nuescheler et al., JSR 170: Content Repository for Java
     – http://jcp.org/en/jsr/detail?id=170
●   D. Connolly, Gleaning Resource Descriptions from Dialects of Language
     – http://www.w3.org/TR/grddl/
●   J. Borden, T. Bray, Resource Directory Description Language
     – http://www.rddl.org/
●   E. Prud'hommeaux, A. Seaborne, SPARQL Query Language for RDF
     – http://www.w3.org/TR/rdf-sparql-query/
●   Fourthought Inc., 4Suite
     –   http://4Suite.org

Tools for Next Generation of CMS: XML, RDF, & GRDDL

  • 1. Tools for Next Generation of CMS: XML, RDF, & GRDDL Chimezie Ogbuji (chee-meh) Cleveland Clinic Foundation Cardiothoracic Surgery Research ogbujic@ccf.org / chimezie@gmail.com
  • 2. Background (CT Research Roadmap) ● A large, relational registry for Cardiothoracic procedures ● Relatively small research department with very little software engineering experience ● Traditional CMS and DBMS were insufficient ● Initiated a large effort to convert to a metadata- driven XML / RDF repository (SemanticDB) ● Need to replace a productive, integrated research pipeline – Data entry, clinical Q&A, patient follow-up, concurrent study management,... – 100+ research papers per year
  • 3. Background (Institute of Medicine Proposal) ● The Computer-Based Patient Record: An Essential Technology for Health Care – ISBN: 0309055326 ● Old but very relevant set of requirements by the IOM (still unfulfilled). ● A comprehensive attempt to address all the requirements: technological, clinical, procedural, etc.. ● Can be (completely) addressed with Semantic Web architecture, document processing, and “Web 2.0” architecture.
  • 4. CPR: Functional Requirements ● Uniform, extensible record content ● (Standard) record formats ● System performance ● Linkages ● Intelligence ● Reporting Capabilities ● Security ● Multi-views ● Accessiblity
  • 5. Definitions: KR / CMS ● What is Knowledge Representation (KR)? ● What is a Knowledge Base (KB)?: – A database system which facilitates deductive reasoning over a KR – Commonly called Rule-based Systems ● What are Expert Systems? ● What is a Content Management System (CMS)?
  • 6. Knowledge Representation ● Older ideas at corners, newer ideas along sides (Credit: Conrad Barski, M.D.)
  • 7. Content Management System: The What ● The terms CMS and Content Repository are essentially interchangeable ● Modern content repositories are best characterized by JSR 170 / 283 ● “.. a high-level information management system that is a superset of traditional data repositories” ● Integrated support for the XPath data model is the most prominent feature (native document management)
  • 8. Content Repository Feature Set ● Modern CMS standards cover document management effectively – Read/write access – Versioning – Event monitoring – Document-level access control – Concurrent access – Cross-linking – Profiles and Document Types
  • 9. Anatomy of a JSR 170 Implementation ● Jack Rabbit ● Component-based – Content Applications – Content Repository API – Implementation
  • 10. Knowledge Bases and CMS ● What of the requirements that Expert Systems meet? ● Document management and knowledge management systems are historically isolated from each other ● XML & RDF are contemporary manifestations of these methodologies ● They have remained as isolated as their predecessors ● They typically only coincide with regards to syntax
  • 11. XML & RDF: Eating and Having your Cake ● Classic example of where the document-oriented approach falls short: – Modern EHR cannot facilitate dynamic research ● Unified infrastructure for document and knowledge management is needed ● One of the earliest examples: – 4Suite Server version 0.10.0 (December 2000) ● Current state of the art (GRDDL): – Gleaning Resource Descriptions from Dialects of Language
  • 12. GRDDL: The Elevator Pitch ● Provides a way to normalize RDF concrete syntaxes ● The problem: – Many RDF concrete syntaxes (RDF/XML, TriX, RDFa, ...) – The authoritative concrete syntax is not without issues ● The solution: – Define mappings from XML dialects to RDF graphs – Use Turing-complete XML pipelines ● English as a second language analogy
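The idea of mapping an XML dialect to an RDF graph can be sketched in a few lines. This is a minimal illustration only: a Python function stands in for the XSLT transformation that GRDDL would normally use, and the `<patient>` dialect, the `EX` vocabulary, and the `glean` function are all hypothetical names invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabulary namespace, invented for illustration.
EX = "http://example.org/vocab#"

# Hypothetical XML dialect for a patient record.
SOURCE = """<patient id="p1">
  <name>Jane Doe</name>
  <procedure code="CABG"/>
</patient>"""

def glean(xml_text, base_uri):
    """Map the hypothetical <patient> dialect to a set of (s, p, o) triples."""
    root = ET.fromstring(xml_text)
    subject = base_uri + "#" + root.get("id")
    triples = {(subject, EX + "name", root.findtext("name"))}
    for proc in root.findall("procedure"):
        triples.add((subject, EX + "procedure", proc.get("code")))
    return triples

graph = glean(SOURCE, "http://example.org/records/p1.xml")
for triple in sorted(graph):
    print(triple)
```

Whatever concrete syntax the source document uses, consumers only ever see the resulting triples, which is the normalization the slide describes.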
  • 14. GRDDL: The Components ● Faithful Rendition – “By specifying a GRDDL transformation, the author of a document states that the transformation will provide a faithful rendition in RDF of information (or some portion of the information) expressed through the XML dialect used in the source document.” ● Various mechanisms for nominating transformations: – Specific XML attribute, XML Namespaces, HTML Profiles, and XHTML links ● GRDDL-aware agents compute GRDDL results (RDF graphs)
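As a rough sketch of the first nomination mechanism (the XML attribute), a GRDDL-aware agent could read the `grddl:transformation` attribute from the root element and split it into a list of transformation URIs. The sample document, the stylesheet URLs, and the `nominated_transformations` helper are assumptions made for illustration; only the GRDDL namespace URI and attribute name come from the specification.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the GRDDL specification.
GRDDL_NS = "http://www.w3.org/2003/g/data-view#"

# Hypothetical source document; the stylesheet URLs are placeholders.
DOC = """<record xmlns:grddl="http://www.w3.org/2003/g/data-view#"
        grddl:transformation="http://example.org/xslt/record2rdf.xsl http://example.org/xslt/extra.xsl">
  <title>Example</title>
</record>"""

def nominated_transformations(xml_text):
    """Return the transformation URIs nominated on the root element."""
    root = ET.fromstring(xml_text)
    value = root.get("{%s}transformation" % GRDDL_NS, "")
    # The attribute holds a whitespace-separated list of URI references.
    return value.split()

print(nominated_transformations(DOC))
```

A full agent would then fetch each transformation, apply it to the document, and merge the resulting graphs; that part is omitted here.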
  • 15. The CMS Alternative: “Dual Representation” ● Persist XML in synchrony with its faithful rendition – Changes to the XML trigger calculation and storage of corresponding RDF ● “Dual Representation” ● Implemented by 4Suite Server Document Definitions ● The basis of how we capture patient records with maximum syntactic and semantic expressivity
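A minimal sketch of the synchrony described here, assuming an in-memory store: every write of the XML recomputes and stores the corresponding triples under the same URI. `DualStore` and `title_definition` are hypothetical names, and a Python callable stands in for the document definition; this is not the 4Suite Server API.

```python
import xml.etree.ElementTree as ET

DC_TITLE = "http://purl.org/dc/elements/1.1/title"

class DualStore:
    """Keeps each XML document and its RDF rendition in step."""

    def __init__(self, definition):
        # definition: callable (uri, xml_text) -> set of (s, p, o) triples
        self.definition = definition
        self.documents = {}
        self.graphs = {}

    def put(self, uri, xml_text):
        # Storing the XML triggers recalculation of the RDF for that URI.
        self.documents[uri] = xml_text
        self.graphs[uri] = self.definition(uri, xml_text)

def title_definition(uri, xml_text):
    """Hypothetical document definition: extract the title as one triple."""
    root = ET.fromstring(xml_text)
    return {(uri, DC_TITLE, root.findtext("title"))}

store = DualStore(title_definition)
uri = "http://example.org/docs/1"
store.put(uri, "<doc><title>Draft</title></doc>")
store.put(uri, "<doc><title>Final</title></doc>")  # update replaces the old graph
print(store.graphs[uri])
```

Replacing the whole graph on each write is the simplest way to keep the two representations consistent; a real repository would also need to handle deletion and transactional failure.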
  • 16. Document Definition ● The document definition is the mapping – Usually an XSLT document
  • 18. Overlap between Content Repository APIs
  • 19. Dual Representation: Advantages ● Maximum expressiveness and versatility of content ● Unified naming convention and access control (more on this later) ● Uniform, concrete RDF syntaxes – For systems which speak XML fluently (XForms, POX over HTTP, WS-*, etc.) ● Cheap support for XML & RDF content negotiation ● Use of RDF as a semantic index for XML
  • 20. Document Definition: Similarities ● GRDDL ● RDDL – Resource Directory Description Language – Human-readable descriptive material about a target – A directory of individual resources related to a target ● Nature and Purpose ● Schema, stylesheet, etc. – Lives at a namespace URI ● WXS's targetNamespace ● Common theme is a set of definitions for a document or a class of documents
  • 21. Registering a Document to a Class ● Namespace registration works well for the web (preferred approach of W3C TAG) ● What if you don't control the content served from the namespace of an existing vocabulary? – Atom, DocBook, etc. ● A CMS is better suited for a 'closed' / 'controlled' approach – Persist membership metadata in the CMS
  • 22. SemanticDB and Dual Representation
  • 23. Document and Graph Granularity ● Tying documents to graphs normalizes the content granularity ● Documents and their RDF graphs can be treated uniformly: – Naming convention – Targeted querying – Access control management
  • 26. Controlled Naming Convention: Continued ● RDF Dataset (from SPARQL): – A collection of named graphs ● The RDF is stored in a graph with the same URI as the XML source document ● When RDF is used as the primary cross-document 'index' you can: – SELECT ?graph WHERE { GRAPH ?graph { ... } } – document($graph)/.. XPath .. ● The space compromise (of dual representation) can be further mitigated by only extracting a minimal RDF graph
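The two-step pattern (find graphs by querying the RDF, then open the XML document with the same URI) can be sketched without a SPARQL engine. In this illustration a dictionary keyed by graph URI stands in for the RDF dataset, a simple filter stands in for the `GRAPH ?graph { ... }` pattern, and ElementTree lookups stand in for `document($graph)` plus XPath. All URIs, element names, and function names are hypothetical.

```python
import xml.etree.ElementTree as ET

PROC = "http://example.org/vocab#procedure"  # hypothetical predicate

# XML source documents, keyed by URI.
documents = {
    "http://example.org/rec/1":
        "<patient><name>Ann</name><procedure>CABG</procedure></patient>",
    "http://example.org/rec/2":
        "<patient><name>Bob</name><procedure>Valve repair</procedure></patient>",
}

# Minimal RDF "index": one named graph per document, under the same URI.
dataset = {
    uri: {(uri, PROC, ET.fromstring(xml).findtext("procedure"))}
    for uri, xml in documents.items()
}

def graphs_matching(dataset, predicate, obj):
    """Analogue of SELECT ?graph WHERE { GRAPH ?graph { ?s <predicate> obj } }."""
    return sorted(
        graph_uri
        for graph_uri, triples in dataset.items()
        if any(p == predicate and o == obj for _, p, o in triples)
    )

def names_for(procedure):
    """Use the RDF index to pick documents, then read details from the XML."""
    names = []
    for graph_uri in graphs_matching(dataset, PROC, procedure):
        root = ET.fromstring(documents[graph_uri])  # analogue of document($graph)
        names.append(root.findtext("name"))
    return names

print(names_for("CABG"))
```

Note that the index holds only the procedure triple while the name stays in the XML, which mirrors the point about extracting a minimal RDF graph to limit the space cost.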
  • 27. Uniform Access Control for XML/RDF CMS ● Traditionally, Access Control Lists are associated with an object – Example: a file or directory in a filesystem ● Assign document / graph ACLs to a single URI – Certain users / groups can query the RDF but cannot read the XML – De-identification of EHR: HIPAA ● The 4Suite repository supports unified XML/RDF ACL
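One way to picture a single ACL covering both representations is a permission table keyed by the shared URI, with separate permissions for querying the RDF and reading the XML. The sketch below is an assumption-laden illustration: the `AccessControl` class, the permission names `query-rdf` and `read-xml`, and the group names are invented and do not reflect the 4Suite repository's actual ACL model.

```python
class AccessControl:
    """One ACL per URI, shared by the XML document and its RDF graph."""

    def __init__(self):
        self.acl = {}  # uri -> {permission: set of principals}

    def grant(self, uri, permission, principal):
        self.acl.setdefault(uri, {}).setdefault(permission, set()).add(principal)

    def allowed(self, uri, permission, principal):
        return principal in self.acl.get(uri, {}).get(permission, set())

acl = AccessControl()
uri = "http://example.org/rec/1"

# Researchers may query the (de-identified) RDF but not read the XML record.
acl.grant(uri, "query-rdf", "researchers")
acl.grant(uri, "query-rdf", "clinicians")
acl.grant(uri, "read-xml", "clinicians")

print(acl.allowed(uri, "query-rdf", "researchers"))
print(acl.allowed(uri, "read-xml", "researchers"))
```

Because the document and its graph share one URI, a single lookup answers both questions, which is what makes the de-identification case straightforward to express.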
  • 28. Going Forward ● The SPARQL RDF dataset needs to be generalized – There is a long list of representation problems solved by a formal named graph specification ● RDF graphs need to be first-class objects in CMS ● Build a common Content Repository API for XML / RDF on the JSR 170 / 283 foundation ● Where do the 4Suite Repository API and JSR 170 / 283 overlap? ● How do we generalize Document Definitions?
  • 29. A Proposal for XML/RDF CMS
  • 30. Primary Takeaways ● We need to stop thinking of XML & RDF as mutually exclusive solutions to similar problems ● CMS standards are needed for the next generation of semantic / rich web applications ● These standards can preemptively level the landscape of toolkits in this space
  • 31. References ● D. Nuescheler et al., JSR 170: Content Repository for Java – http://jcp.org/en/jsr/detail?id=170 ● D. Connolly, Gleaning Resource Descriptions from Dialects of Language – http://www.w3.org/TR/grddl/ ● J. Borden, T. Bray, Resource Directory Description Language – http://www.rddl.org/ ● E. Prud'hommeaux, A. Seaborne, SPARQL Query Language for RDF – http://www.w3.org/TR/rdf-sparql-query/ ● Fourthought Inc., 4Suite – http://4Suite.org