GRDDL
The Why, What, How, and Where




                            Chimezie Ogbuji
                            Cleveland Clinic Foundation
GRDDL: The Acronym
 Gleaning
 Resource
 Descriptions (from)
 Dialects (of)
 Languages



   Rather long and intimidating
GRDDL: By Deconstruction

   WordNet Definition of Glean:
    ◦ (gather, as of natural products)
    ◦ Synonyms: reap, harvest.
   Resource Description Framework (RDF)
    ◦ Logical assertions
   Dialects of Language
    ◦ XML document families (XHTML, for instance)
GRDDL: By Analogy
           GRDDL can be thought of
           as a protocol for sowing
           semantics in web content
           for later harvest.
The Why
   Vast amount of latent semantics in markup
        <span>Chimezie Ogbuji</span>
   Web content today is primarily built for
    human consumption
   Text indexing will only get you so far for
    document retrieval
   If machines are meant to harvest RDF from
    documents, reproducible protocols are
    needed
The Why (Cont.)
 Microformats, eRDF, and RDFa
     Specific to a particular family of
      documents
     XHTML and HTML
 If the goal is machine consumption, the
  bar needs to be raised beyond XHTML
The Why (Cont.)
 It seems easy to forget that XHTML is
  indeed an XML dialect
     You would think the (X) would make
      that obvious
 What was needed was a standard way to
  harvest RDF that is applicable to all XML
  dialects
The What
   Faithful rendition
   Transformations
   GRDDL result
   Source documents
   GRDDL-aware Agents
Faithful Rendition
“By specifying a GRDDL transformation, the author of a document
  states that the transformation will provide a faithful rendition in
  RDF of information (or some portion of the information)
  expressed through the XML dialect used in the source document.”

 Licenses an author-certified interpretation of
  an XML document
 A powerful paradigm for messaging
    See David Booth's “RDF and SOA”
        http://www.w3.org/2007/01/wos-papers/booth
GRDDL Transformations
   Functions that take an XML document and
    return an RDF graph
   Transformations can be written in any
    particular language
   The “reference” transformation language is
    XSLT
        “[XSLT1] is the format most widely supported by GRDDL-
         aware agents as of this writing […] is specifically designed to
         express XML to XML transformations and has some good
         safety characteristics”
Other Transformation Languages
   “.. technically Javascript, C, or virtually any
    other programming language may be used to
    express transformations for GRDDL”
   However, these transformations need to be
    deterministic in order to ensure the result is
    a faithful rendition
   Hence, they must be functions
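
A minimal sketch of the "transformation as a function" idea, written in Python rather than XSLT. The <person> dialect, the name_transform function, and the example.org URIs are hypothetical illustrations; only the FOAF name property is a real vocabulary term. The point is that the same input always yields the same set of triples.

```python
import xml.etree.ElementTree as ET

FOAF_NAME = "http://xmlns.com/foaf/0.1/name"

def name_transform(xml_text, base_uri):
    """Hypothetical transformation: XML text in, immutable set of triples out."""
    root = ET.fromstring(xml_text)
    triples = set()
    for person in root.iter("person"):
        ident = person.get("id")
        if ident is None:
            continue  # nothing to name the subject with; skip this element
        subject = base_uri + "#" + ident
        triples.add((subject, FOAF_NAME, person.findtext("name")))
    # No clock, randomness, or network access: the output depends only on the inputs.
    return frozenset(triples)

doc = '<people><person id="co"><name>Chimezie Ogbuji</name></person></people>'
first = name_transform(doc, "http://example.org/people")
second = name_transform(doc, "http://example.org/people")
```

Calling the function twice on the same document gives equal results, which is the property a faithful rendition relies on.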
GRDDL Result
   The result of applying the transformation is
    an RDF serialization
   The RDF graph that corresponds to the
    serialization is a GRDDL result of the
    original document
   The “reference” result format is RDF/XML
   Other formats can be used (Turtle, N3, etc.)
GRDDL Source Documents
   The class of documents for which GRDDL
    defines a way to extract a result graph:
      XML Documents
      XML Namespace Documents
      Valid XHTML
      XHTML Profiles
GRDDL Source Documents
GRDDL: XML Documents
   GRDDL Namespace (grddl prefix)
              http://www.w3.org/2003/g/data-view#


   transformation attribute
    <?xml version="1.0" encoding="UTF-8"?>
    <root
     xmlns:grddl="http://www.w3.org/2003/g/data-view#"
     grddl:transformation=".. path to transform ..">
    … XML content ..
    </root>
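
A GRDDL-aware agent has to read that attribute back out. The sketch below does so with Python's standard library. The function name explicit_transformations, the sample document, and the example.org URIs are assumptions made for illustration; the attribute value is treated as a whitespace-separated list of URI references resolved against the document's base URI.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

GRDDL_NS = "http://www.w3.org/2003/g/data-view#"

def explicit_transformations(xml_text, base_uri):
    """Return absolute transformation URIs named on the root element."""
    root = ET.fromstring(xml_text)
    # ElementTree spells namespaced attributes as "{namespace}localname".
    value = root.get("{%s}transformation" % GRDDL_NS, "")
    # One or more whitespace-separated references, possibly relative.
    return [urljoin(base_uri, ref) for ref in value.split()]

# Hypothetical source document with two transformations.
sample = """<root xmlns:grddl="http://www.w3.org/2003/g/data-view#"
      grddl:transformation="to-rdf.xsl http://example.org/other.xsl">
  <item/>
</root>"""

found = explicit_transformations(sample, "http://example.org/docs/a.xml")
```

A document without the attribute simply yields an empty list, so the caller can fall through to the namespace-based lookup.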
Namespace Documents
“Transformations can be associated not only with individual
   documents but also with whole dialects that share an XML
   namespace”

   A GRDDL source document lives at the
    location of the namespace URI of the root
    element (the namespace document)
   The GRDDL result of the namespace
    document has a statement of the form:
            ?nsDoc grddl:namespaceTransformation ?txDoc
•   txDoc is the location of a transformation
    applicable to such XML documents
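
The namespace lookup can be sketched in two small steps: find the namespace URI of the root element, then scan the namespace document's GRDDL result for matching statements. Triples are modelled here as plain (subject, predicate, object) tuples, and the purchase-order namespace and stylesheet URIs are hypothetical.

```python
import xml.etree.ElementTree as ET

GRDDL_NS = "http://www.w3.org/2003/g/data-view#"
NS_TRANSFORMATION = GRDDL_NS + "namespaceTransformation"

def root_namespace(xml_text):
    """Return the namespace URI of the root element, or None if unqualified."""
    tag = ET.fromstring(xml_text).tag
    # ElementTree writes qualified names as "{namespace}localname".
    if tag.startswith("{"):
        return tag[1:].split("}", 1)[0]
    return None

def namespace_transformations(ns_uri, ns_doc_result):
    """Collect ?txDoc from triples (?nsDoc, grddl:namespaceTransformation, ?txDoc)."""
    return sorted(o for (s, p, o) in ns_doc_result
                  if s == ns_uri and p == NS_TRANSFORMATION)

# Hypothetical instance document and namespace-document result.
doc = '<po:order xmlns:po="http://example.org/po"/>'
ns_result = {
    ("http://example.org/po", NS_TRANSFORMATION, "http://example.org/po2rdf.xsl"),
    ("http://example.org/po", "http://purl.org/dc/elements/1.1/title", "Purchase orders"),
}

ns = root_namespace(doc)
transforms = namespace_transformations(ns, ns_result)
```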
Valid XHTML Documents
    <html xmlns="http://www.w3.org/1999/xhtml">
     <head
      profile="http://www.w3.org/2003/g/data-view">
      <title>Some Document</title>
        <link rel="transformation"
              href=".. path to transformation .. " />
        ...
     </head>
    …
    </html>
   Refers to the GRDDL XHTML profile
      Licenses the interpretation of
       rel=“transformation” links
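
A sketch of how an agent might honour that licence: only when the head's profile attribute lists the GRDDL profile are rel="transformation" links collected. The function name, sample page, and example.org URI are assumptions, and for brevity only link elements are inspected. Real-world XHTML that uses named entities would need a DTD-aware parser instead of ElementTree.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

XHTML = "{http://www.w3.org/1999/xhtml}"
GRDDL_PROFILE = "http://www.w3.org/2003/g/data-view"

def xhtml_transformations(xhtml_text, base_uri):
    """Return transformation URIs licensed by the GRDDL profile, else []."""
    root = ET.fromstring(xhtml_text)
    head = root.find(XHTML + "head")
    if head is None:
        return []
    # profile is a whitespace-separated list of URIs.
    if GRDDL_PROFILE not in head.get("profile", "").split():
        return []
    found = []
    for link in head.iter(XHTML + "link"):
        # rel is also a whitespace-separated list of link types.
        if "transformation" in link.get("rel", "").split() and link.get("href"):
            found.append(urljoin(base_uri, link.get("href")))
    return found

# Hypothetical page; glean.xsl is a placeholder stylesheet name.
page = """<html xmlns="http://www.w3.org/1999/xhtml">
 <head profile="http://www.w3.org/2003/g/data-view">
  <title>Some Document</title>
  <link rel="transformation" href="glean.xsl" />
 </head>
 <body/>
</html>"""

links = xhtml_transformations(page, "http://example.org/page.html")
no_profile = xhtml_transformations(
    page.replace(' profile="http://www.w3.org/2003/g/data-view"', ""),
    "http://example.org/page.html")
```

Without the profile the same link is ignored, since nothing licenses its interpretation.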
XHTML Profiles
“Adding a GRDDL profileTransformation assertion to a profile
  document is much like adding a namespaceTransformation
  assertion to a namespace document”

   A GRDDL source document lives at the
    location of the profile URI of an XHTML
    document
   The GRDDL result of the profile document
    has a statement of the form:
            ?profileDoc grddl:profileTransformation ?txDoc
•   txDoc is the location of a transformation
    applicable to such XML documents
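
The profile case mirrors the namespace case. In this sketch, profile_results is a hypothetical mapping from each profile URI to the triples already gleaned from that profile document; the review profile and stylesheet URIs are invented for illustration.

```python
import xml.etree.ElementTree as ET

XHTML = "{http://www.w3.org/1999/xhtml}"
PROFILE_TRANSFORMATION = "http://www.w3.org/2003/g/data-view#profileTransformation"

def profile_uris(xhtml_text):
    """Return the profile URIs listed on the document's head element."""
    head = ET.fromstring(xhtml_text).find(XHTML + "head")
    return head.get("profile", "").split() if head is not None else []

def profile_transformations(xhtml_text, profile_results):
    """Collect ?txDoc from (?profileDoc, grddl:profileTransformation, ?txDoc)."""
    found = []
    for profile in profile_uris(xhtml_text):
        for s, p, o in sorted(profile_results.get(profile, ())):
            if s == profile and p == PROFILE_TRANSFORMATION:
                found.append(o)
    return found

# Hypothetical page and profile-document result.
page = """<html xmlns="http://www.w3.org/1999/xhtml">
 <head profile="http://example.org/profiles/review">
  <title>Review</title>
 </head>
 <body/>
</html>"""
results = {
    "http://example.org/profiles/review": {
        ("http://example.org/profiles/review", PROFILE_TRANSFORMATION,
         "http://example.org/review2rdf.xsl"),
    },
}

profile_tx = profile_transformations(page, results)
```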
The How
   GRDDL builds on existing XML & RDF
    standards
   An implementation mostly needs to
    orchestrate:
       Parsing of data representations
       Resolving representations from web locations
       The necessary XML processing to peek into and
        harvest RDF from the various sources
       The highly recursive nature of GRDDL 
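
The recursion can be sketched as follows. This is an illustration under stated assumptions, not the control flow of any particular implementation: fetch and apply_tx are hypothetical caller-supplied hooks (so nothing here touches the network or runs XSLT), triples are plain tuples, and only the explicit-attribute and namespace-document routes are covered.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

GRDDL_NS = "http://www.w3.org/2003/g/data-view#"
NS_TRANSFORMATION = GRDDL_NS + "namespaceTransformation"

def glean(uri, fetch, apply_tx, seen=None):
    """Return the merged triples gleaned from the XML document at `uri`.

    fetch(uri) -> XML text; apply_tx(tx_uri, xml_text) -> set of triples.
    """
    seen = set() if seen is None else seen
    if uri in seen:  # guard against namespace documents that refer to themselves
        return set()
    seen.add(uri)

    text = fetch(uri)
    root = ET.fromstring(text)

    # 1. Transformations named directly on the root element.
    refs = root.get("{%s}transformation" % GRDDL_NS, "").split()
    tx_uris = [urljoin(uri, ref) for ref in refs]

    # 2. The namespace document is itself a source document, so it is
    #    gleaned recursively before its assertions can be consulted.
    if root.tag.startswith("{"):
        ns_uri = root.tag[1:].split("}", 1)[0]
        for s, p, o in sorted(glean(ns_uri, fetch, apply_tx, seen)):
            if s == ns_uri and p == NS_TRANSFORMATION:
                tx_uris.append(o)

    result = set()
    for tx in tx_uris:
        result |= apply_tx(tx, text)  # merge the graphs from every transformation
    return result

# Hypothetical in-memory "web" standing in for real retrieval.
docs = {
    "http://example.org/order.xml":
        '<po:order xmlns:po="http://example.org/po"/>',
    "http://example.org/po":
        '<schema xmlns:grddl="http://www.w3.org/2003/g/data-view#" '
        'grddl:transformation="http://example.org/ns2rdf.xsl"/>',
}

def fake_apply(tx_uri, xml_text):
    # Stand-in for running a real transformation engine.
    if tx_uri == "http://example.org/ns2rdf.xsl":
        return {("http://example.org/po", NS_TRANSFORMATION,
                 "http://example.org/po2rdf.xsl")}
    return {("http://example.org/order.xml",
             "http://example.org/gleanedWith", tx_uri)}

triples = glean("http://example.org/order.xml", docs.__getitem__, fake_apply)
```

Here the order document names no transformation itself; the agent discovers one only by first gleaning the namespace document, which is the recursive step.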
Technological Overlap
Anatomy of a GRDDL Implementation: GRDDL.py
   A reference implementation from scratch
   650 LOC
        RDFLib, 4Suite-XML, and Python control logic
   A layered approach
        Core module that handles transformations
        One module per source type stacked on top of the
         core
        A top layer that orchestrates the recursion and
         identification of which ‘class’ a source document
         belongs to
GRDDL.py Core
Component Stack
The Where
   GRDDL services online:
        http://triplr.org/ (Stuff in, triples out)
        http://www.w3.org/2007/08/grddl/ (W3C GRDDL
         Service)
   Primary GRDDL implementations:
        Redland
        GRDDL.py
        Virtuoso
        GRDDL Reader for Jena
   RDFa is the most common GRDDL source
    content format in the wild
Hidden Value Proposition
   Supports separation of concerns:
      XML for messaging, data collection,
       structural validation
      RDF for expressive assertions, inference,
       etc.
   A way to invest in data richness and
    accessibility
GRDDL Use Cases
   Embedding scheduling assertions on
    personal pages
   Using GRDDL for extracting RDF from XML
    medical record documents
      Cleveland Clinic use case (clinical
       research)
   Aggregating web-based product reviews
   Embedding web service descriptions
   Adding semantic assertions to XML schemas
   Embedding semantic assertions in wikis

 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other Frameworks
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 

GRDDL: The Why, What, How, and Where

  • 1. GRDDL: The Why, What, How, and Where
      Chimezie Ogbuji
      Cleveland Clinic Foundation
  • 2. GRDDL: The Acronym
      Gleaning
      Resource
      Descriptions (from)
      Dialects (of)
      Languages
      Rather long and intimidating
  • 3. GRDDL: By Deconstruction
      WordNet definition of "glean":
        ◦ (gather, as of natural products)
        ◦ Synonyms: reap, harvest
      Resource Description Framework (RDF)
        ◦ Logical assertions
      Dialects of Languages
        ◦ XML document families (XHTML, for instance)
  • 4. GRDDL: By Analogy
      GRDDL can be thought of as a protocol for sowing semantics in web content for later harvest.
  • 5. The Why
      Vast amount of latent semantics in markup
        <span>Chimezie Ogbuji</span>
      Web content today is primarily built for human consumption
      Text indexing will only get you so far for document retrieval
      If machines are meant to harvest RDF from documents, reproducible protocols are needed
  • 6. The Why (Cont.)
      Microformats, eRDF, and RDFa
        ◦ Specific to a particular family of documents
        ◦ XHTML and HTML
      If the goal is machine consumption, the bar needs to be raised beyond XHTML
  • 7. The Why (Cont.)
      It seems easy to forget that XHTML is indeed an XML dialect
        ◦ You would think the (X) would make that obvious
      What was needed was a standard way to harvest RDF that is applicable to all XML dialects
  • 8. The What
      Faithful rendition
      Transformations
      GRDDL result
      Source documents
      GRDDL-aware agents
  • 9. Faithful Rendition
      “By specifying a GRDDL transformation, the author of a document states that the transformation will provide a faithful rendition in RDF of information (or some portion of the information) expressed through the XML dialect used in the source document.”
      Licenses an author-certified interpretation of an XML document
      A powerful paradigm for messaging
        ◦ See David Booth's “RDF and SOA”
        ◦ http://www.w3.org/2007/01/wos-papers/booth
  • 10. GRDDL Transformations
      Functions that take an XML document and return an RDF graph
      Transformations can be written in any particular language
      The “reference” transformation language is XSLT
        ◦ “[XSLT1] is the format most widely supported by GRDDL-aware agents as of this writing […] is specifically designed to express XML to XML transformations and has some good safety characteristics”
  • 11. Other Transformation Languages
      “.. technically Javascript, C, or virtually any other programming language may be used to express transformations for GRDDL”
      However, these transformations need to be deterministic in order to ensure the result is a faithful rendition
      Hence, they must be functions
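
A minimal Python sketch of the "transformations must be functions" point: the same source document always yields the same graph. The `<person>` dialect, the `people_to_triples` name, and the example.org URIs are made up for illustration; only the FOAF `name` property URI is a real vocabulary term. A real transformation would typically be XSLT emitting RDF/XML, whereas this sketch models the result graph as a frozenset of plain tuples.

```python
import xml.etree.ElementTree as ET

FOAF_NAME = "http://xmlns.com/foaf/0.1/name"


def people_to_triples(xml_text, base_uri):
    """Hypothetical transformation: one foaf:name triple per <person id="...">.

    It reads nothing but its arguments (no clock, no randomness, no network),
    so it is a function in the mathematical sense.
    """
    root = ET.fromstring(xml_text)
    triples = set()
    for person in root.iter("person"):
        person_id = person.get("id")
        name = person.findtext("name", default="").strip()
        if person_id and name:
            triples.add((base_uri + "#" + person_id, FOAF_NAME, name))
    return frozenset(triples)


# Assumed sample input in the made-up dialect.
SOURCE = '<people><person id="co"><name>Chimezie Ogbuji</name></person></people>'

first = people_to_triples(SOURCE, "http://example.org/people")
second = people_to_triples(SOURCE, "http://example.org/people")
```

Because the output depends only on the input document and base URI, `first` and `second` are equal graphs, which is what lets an author vouch for the result as a faithful rendition.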
  • 12. GRDDL Result
      The result of applying the transformation is an RDF serialization
      The RDF graph that corresponds to the serialization is a GRDDL result of the original document
      The “reference” result format is RDF/XML
      Other formats can be used (Turtle, N3, etc.)
  • 13. GRDDL Source Documents
      The class of documents for which GRDDL defines a way to extract a result graph:
        ◦ XML Documents
        ◦ XML Namespace Documents
        ◦ Valid XHTML
        ◦ XHTML Profiles
  • 15. GRDDL: XML Documents
      GRDDL namespace (grddl prefix): http://www.w3.org/2003/g/data-view#
      transformation attribute
        <?xml version="1.0" encoding="UTF-8"?>
        <root xmlns:grddl="http://www.w3.org/2003/g/data-view#"
              grddl:transformation=".. path to transform ..">
          … XML content ..
        </root>
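
As a rough sketch of how an agent might read the `grddl:transformation` attribute, the Python below uses only the standard library. The `root_transformations` name, the sample document, and the example.org URIs are assumptions for illustration. The attribute value is treated as a whitespace-separated list of URI references, each resolved against the document's base URI.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

GRDDL_NS = "http://www.w3.org/2003/g/data-view#"


def root_transformations(xml_text, base_uri):
    """Return absolute transformation URIs named on the root element."""
    root = ET.fromstring(xml_text)
    # ElementTree spells namespaced attribute names as {namespace}local.
    value = root.get("{%s}transformation" % GRDDL_NS, "")
    return [urljoin(base_uri, ref) for ref in value.split()]


# Assumed sample: one relative and one absolute reference.
DOC = (
    '<root xmlns:grddl="http://www.w3.org/2003/g/data-view#" '
    'grddl:transformation="to-rdf.xsl http://example.org/other.xsl">'
    "<item/></root>"
)

found = root_transformations(DOC, "http://example.org/docs/record.xml")
```

A document without the attribute simply yields an empty list, so the caller can fall through to the other source types.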
  • 16. Namespace Documents
      “Transformations can be associated not only with individual documents but also with whole dialects that share an XML namespace”
      A GRDDL source document lives at the location of the namespace URI of the root element (the namespace document)
      The GRDDL result of the namespace document has a statement of the form:
        ?nsDoc grddl:namespaceTransformation ?txDoc
        ◦ txDoc is the location of a transformation applicable to such XML documents
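
The namespace-document lookup can be sketched in two small steps: find the namespace URI of the root element, then pick the `grddl:namespaceTransformation` statements out of the namespace document's GRDDL result. In this sketch that result is supplied by hand as a set of tuples; fetching and gleaning the namespace document is left out. The function names, the sample data, and the example.org URIs are assumptions.

```python
import xml.etree.ElementTree as ET

NS_TRANSFORMATION = "http://www.w3.org/2003/g/data-view#namespaceTransformation"


def root_namespace(xml_text):
    """Return the namespace URI of the root element, or None if it has none."""
    tag = ET.fromstring(xml_text).tag  # e.g. "{http://example.org/ns}record"
    if tag.startswith("{"):
        return tag[1:].split("}", 1)[0]
    return None


def namespace_transformations(ns_uri, ns_doc_triples):
    """Select ?txDoc from statements of the form ?nsDoc grddl:namespaceTransformation ?txDoc."""
    return sorted(
        o for s, p, o in ns_doc_triples
        if s == ns_uri and p == NS_TRANSFORMATION
    )


# Assumed sample document and a hand-written result for its namespace document.
DOC = '<r:record xmlns:r="http://example.org/ns/record"><r:patient/></r:record>'
NS_DOC_RESULT = {
    ("http://example.org/ns/record", NS_TRANSFORMATION,
     "http://example.org/tx/record.xsl"),
    ("http://example.org/ns/record",
     "http://www.w3.org/2000/01/rdf-schema#label", "Record dialect"),
}

ns = root_namespace(DOC)
tx = namespace_transformations(ns, NS_DOC_RESULT)
```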
  • 17. Valid XHTML Documents
        <html xmlns="http://www.w3.org/1999/xhtml">
          <head profile="http://www.w3.org/2003/g/data-view">
            <title>Some Document</title>
            <link rel="transformation" href=".. path to transformation .." />
            ...
          </head>
          …
        </html>
      Refers to the GRDDL XHTML profile
      Licenses the interpretation of rel="transformation" links
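
A hedged Python sketch of the XHTML case: transformation links are only honoured when the `head` element names the GRDDL profile. The `xhtml_transformations` name, the sample pages, and the example.org URIs are assumptions. The sketch inspects only `link` elements, as on the slide, and uses the standard-library XML parser, so it assumes well-formed XHTML without HTML-only entities.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

XHTML = "{http://www.w3.org/1999/xhtml}"
GRDDL_PROFILE = "http://www.w3.org/2003/g/data-view"


def xhtml_transformations(xhtml_text, base_uri):
    """Return absolute hrefs of rel="transformation" links, if the profile is present."""
    root = ET.fromstring(xhtml_text)
    head = root.find(XHTML + "head")
    # The profile attribute may list several profiles separated by whitespace.
    if head is None or GRDDL_PROFILE not in head.get("profile", "").split():
        return []
    found = []
    for link in root.iter(XHTML + "link"):
        # rel may also carry several tokens, e.g. rel="transformation alternate".
        if "transformation" in link.get("rel", "").split() and link.get("href"):
            found.append(urljoin(base_uri, link.get("href")))
    return found


# Assumed sample pages: one with the profile, one without.
PAGE = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    '<head profile="http://www.w3.org/2003/g/data-view">'
    "<title>Some Document</title>"
    '<link rel="transformation" href="glean.xsl" />'
    "</head><body><p>Hello</p></body></html>"
)
NO_PROFILE = PAGE.replace(' profile="http://www.w3.org/2003/g/data-view"', "")

with_profile = xhtml_transformations(PAGE, "http://example.org/page.html")
without_profile = xhtml_transformations(NO_PROFILE, "http://example.org/page.html")
```

The second call returns nothing even though the link is still there, which reflects the point that the profile is what licenses the interpretation of the link.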
  • 18. XHTML Profiles
      “Adding a GRDDL profileTransformation assertion to a profile document is much like adding a namespaceTransformation assertion to a namespace document”
      A GRDDL source document lives at the location of the profile URI of an XHTML document
      The GRDDL result of the profile document has a statement of the form:
        ?profileDoc grddl:profileTransformation ?txDoc
        ◦ txDoc is the location of a transformation applicable to such XML documents
  • 19. The How
      GRDDL builds on existing XML & RDF standards
      An implementation mostly needs to orchestrate:
        ◦ Parsing of data representations
        ◦ Resolving representations from web locations
        ◦ The necessary XML processing to peek into and harvest RDF from the various sources
        ◦ The highly recursive nature of GRDDL
  • 21. Anatomy of a GRDDL Implementation: GRDDL.py
      A reference implementation from scratch
        ◦ 650 LOC
        ◦ RDFLib, 4Suite-XML, and Python control logic
      A layered approach
        ◦ Core module that handles transformations
        ◦ One module per source type stacked on top of the core
        ◦ A top layer that orchestrates the recursion and identification of which ‘class’ a source document belongs to
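
The layering and recursion described here can be illustrated with a toy Python sketch. This is not the GRDDL.py code: it does not use RDFLib or 4Suite-XML, the "web" is an in-memory dictionary, and transformations are plain Python functions standing in for XSLT. Every name and example.org URI below is hypothetical.

```python
import xml.etree.ElementTree as ET

GRDDL = "http://www.w3.org/2003/g/data-view#"
NS_TX = GRDDL + "namespaceTransformation"


# Hypothetical transformations (stand-ins for XSLT stylesheets).
def ns_doc_transform(root, uri):
    # Reads a made-up <uses-transformation href="..."/> element.
    return {(uri, NS_TX, el.get("href")) for el in root.iter("uses-transformation")}


def record_transform(root, uri):
    return {(uri, "http://example.org/vocab#patient", el.text)
            for el in root.iter("{http://example.org/ns/record}patient")}


TRANSFORMS = {"http://example.org/tx/nsdoc": ns_doc_transform,
              "http://example.org/tx/record": record_transform}

# Hypothetical stand-in for resolving representations from web locations.
WEB = {
    "http://example.org/ns/record": (
        '<doc xmlns:grddl="%s" grddl:transformation="http://example.org/tx/nsdoc">'
        '<uses-transformation href="http://example.org/tx/record"/></doc>' % GRDDL),
    "http://example.org/data/1": (
        '<r:record xmlns:r="http://example.org/ns/record">'
        "<r:patient>anon-17</r:patient></r:record>"),
}


# Core layer: apply transformations and merge the resulting graphs.
def apply_all(root, uri, tx_uris):
    graph = set()
    for tx in tx_uris:
        if tx in TRANSFORMS:
            graph |= TRANSFORMS[tx](root, uri)
    return graph


# Source-type layer: one small function per way of finding transformations.
def explicit_transformations(root):
    return root.get("{%s}transformation" % GRDDL, "").split()


def namespace_of(root):
    return root.tag[1:].split("}", 1)[0] if root.tag.startswith("{") else None


# Top layer: classify the source and recurse into its namespace document.
def glean(uri, seen=None):
    seen = set() if seen is None else seen
    if uri in seen or uri not in WEB:
        return set()  # cycle guard, or nothing to fetch
    seen.add(uri)
    root = ET.fromstring(WEB[uri])
    tx_uris = explicit_transformations(root)
    ns = namespace_of(root)
    if ns is not None:
        ns_graph = glean(ns, seen)  # the recursive step
        tx_uris += sorted(o for s, p, o in ns_graph if s == ns and p == NS_TX)
    return apply_all(root, uri, tx_uris)


result = glean("http://example.org/data/1")
```

Gleaning the data document first gleans its namespace document, whose own result names the transformation to apply. The `seen` set is one simple way to keep that recursion from looping.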
  • 24. The Where
      GRDDL services online:
        ◦ http://triplr.org/ (Stuff in, triples out)
        ◦ http://www.w3.org/2007/08/grddl/ (W3C GRDDL Service)
      Primary GRDDL implementations:
        ◦ Redland
        ◦ GRDDL.py
        ◦ Virtuoso
        ◦ GRDDL Reader for Jena
      RDFa is the most common GRDDL source content format in the wild
  • 25. Hidden Value Proposition
      Supports separation of concerns:
        ◦ XML for messaging, data collection, structural validation
        ◦ RDF for expressive assertions, inference, etc.
      A way to invest in data richness and accessibility
  • 26. GRDDL Use Cases
      Embedding scheduling assertions on personal pages
      Using GRDDL for extracting RDF from XML medical record documents
        ◦ Cleveland Clinic use case (clinical research)
      Aggregating web-based product reviews
      Embedding web service descriptions
      Adding semantic assertions to XML schemas
      Embedding semantic assertions in wikis