Semantic Technologies & Triplestores
              for BI
   1st European Business Intelligence Summer School
                      eBISS 2011

              Marin Dimitrov (Ontotext)

                      Jul 2011
eBISS 2011




Semantic Technologies & Triplestores for BI (eBISS 2011)   Jul 2011   #2
Contents

• Introduction to Semantic Technologies
• Semantic Databases – advantages, features and
  benchmarks
• Semantic Technologies and Triplestores for BI




INTRODUCTION TO SEMANTIC
TECHNOLOGIES



The need for a smarter Web

• "The Semantic Web is an extension of the current
  web in which information is given well-defined
  meaning, better enabling computers and people to
  work in cooperation." (Tim Berners-Lee, 2001)




The need for a smarter Web (2)

•   “PricewaterhouseCoopers believes a Web of data
    will develop that fully augments the document Web
    of today. You’ll be able to find and take pieces of
    data sets from different places, aggregate them
    without warehousing, and analyze them in a more
    straightforward, powerful way than you can now.”
    (PWC, May 2009)




The Semantic Web vision (W3C)

• Extend principles of the Web from documents to
  data
• Data should be accessed using the general Web
  architecture (e.g., URIs, protocols, …)
• Data should be related to one another just as
  documents are already
• Creation of a common framework that allows:
  – Data to be shared and reused across applications
  – Data to be processed automatically
  – New relationships between pieces of data to be inferred

The Semantic Web stack




                                                           (c) Benjamin Nowack

The Semantic Web timeline
[Figure: timeline, 1999 to 2011, of Semantic Web standards and initiatives: RDF, DAML+OIL, OWL, SPARQL, RIF, RDFa, SAWSDL, LOD, SKOS, HCLS, RDB2RDF, GLD, PIL, OWL 2, SPARQL 1.1, RDF 2]


Ontologies as data models on the Semantic Web

• An ontology is a formal specification that provides
  sharable and reusable knowledge representation
   – Examples – taxonomies, thesauri, topic maps, formal
     ontologies
• An ontology specification includes
   – Description of the concepts in some domain and their
     properties
   – Description of the possible relationships between concepts
     and the constraints on how the relationships can be used
   – Sometimes, the individuals (members of concepts)


Resource Description Framework (RDF)

• A simple data model for
   – Formally describing the semantics of information
   – representing meta-data (data about data)
• A set of representation syntaxes
   – RDF/XML (standard), N-Triples, N3
• Building blocks
   – Resources (with unique identifiers)
   – Literals
   – Named relations between pairs of resources (or a resource
     and a literal)

RDF (2)

• Everything is a triple
   – Subject (resource), Predicate (relation), Object (resource
     or literal)
• The RDF graph is a collection of triples
   subject                  predicate        object

   École Centrale Paris     locatedIn        Paris
   Paris                    hasPopulation    2193031




RDF graph example (3)
[Figure: the triples below drawn as a graph, with dbpedia:École_centrale_Paris and dbpedia:Paris as nodes]

Subject                                             Predicate        Object

http://dbpedia.org/resource/Paris                   hasName          “Paris”
http://dbpedia.org/resource/Paris                   hasPopulation    2193031
http://dbpedia.org/resource/École_centrale_Paris    locatedIn        http://dbpedia.org/resource/Paris
http://dbpedia.org/resource/École_centrale_Paris    hasName          “École Centrale Paris”
http://dbpedia.org/resource/École_centrale_Paris    establishedIn    1829
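The triple table above maps directly onto a set of 3-tuples. The following is a minimal sketch in Python, not tied to any particular RDF library; the helper name match and the DBPEDIA constant are illustrative assumptions, and literals are simplified to plain Python values.

```python
# Namespace prefix taken from the URIs on the slide.
DBPEDIA = "http://dbpedia.org/resource/"

# Each triple is a (subject, predicate, object) tuple; the graph is a set of them.
graph = {
    (DBPEDIA + "Paris", "hasName", "Paris"),
    (DBPEDIA + "Paris", "hasPopulation", 2193031),
    (DBPEDIA + "École_centrale_Paris", "locatedIn", DBPEDIA + "Paris"),
    (DBPEDIA + "École_centrale_Paris", "hasName", "École Centrale Paris"),
    (DBPEDIA + "École_centrale_Paris", "establishedIn", 1829),
}


def match(graph, s=None, p=None, o=None):
    """Return the triples that agree with every position that is not None."""
    return sorted(
        (t for t in graph
         if (s is None or t[0] == s)
         and (p is None or t[1] == p)
         and (o is None or t[2] == o)),
        key=str,
    )


# Everything that is located in Paris.
located_in_paris = match(graph, p="locatedIn", o=DBPEDIA + "Paris")
# Everything known about Paris itself.
about_paris = match(graph, s=DBPEDIA + "Paris")
```

Because the graph is a plain set, adding a new kind of relation needs no schema change, which is the "schema agility" point made on the next slide.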


RDF advantages

• Global identifiers of all resources (URIs)
   – Reduces ambiguity
   – Makes incremental data integration easier
• Graph data model
   – Suitable for sparse, unstructured and semi-structured data
• Inference of implicit facts
• Schema agility
   – Lowers the cost of schema evolution




RDF Schema (RDFS)

• RDFS provides means for:
   – Defining Classes and Properties
   – Defining hierarchies (of classes and properties)
   – Domain/range of a property
• Entailment rules (axioms)
   – Infer new triples from existing ones




RDFS entailment rules




RDFS entailment rules (2)

• Class/Property hierarchies
   – R5, R7, R9, R11
:human rdfs:subClassOf :mammal .
:man rdfs:subClassOf :human .
:man rdfs:subClassOf :mammal .                  # inferred
:John a :man .
:John a :human .                                # inferred
:John a :mammal .                               # inferred
:hasSpouse rdfs:subPropertyOf :hasRelative .
:John :hasSpouse :Merry .
:John :hasRelative :Merry .                     # inferred

• Inferring types (domain/range restrictions)
   – R2, R3
:hasSpouse rdfs:domain :human ;
           rdfs:range :human .
:Adam :hasSpouse :Eve .
:Adam a :human .                                # inferred
:Eve a :human .                                 # inferred
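The hierarchy and domain/range rules above can be sketched as a naive fixpoint computation. This is an illustrative Python sketch only: it covers rules 2, 3, 5, 7, 9 and 11, stores triples as plain tuples, and the function name rdfs_closure is an assumption rather than part of any library.

```python
SUBCLASS = "rdfs:subClassOf"
SUBPROP = "rdfs:subPropertyOf"
DOMAIN = "rdfs:domain"
RANGE = "rdfs:range"
TYPE = "rdf:type"  # written "a" in the slide examples


def rdfs_closure(triples):
    """Apply rules 2, 3, 5, 7, 9 and 11 until no new triple appears."""
    closure = set(triples)
    while True:
        new = set()
        for (s, p, o) in closure:
            for (s2, p2, o2) in closure:
                # R11 / R5: subClassOf and subPropertyOf are transitive.
                if p == p2 and p in (SUBCLASS, SUBPROP) and o == s2:
                    new.add((s, p, o2))
                # R9: an instance of a subclass is an instance of the superclass.
                if p == TYPE and p2 == SUBCLASS and o == s2:
                    new.add((s, TYPE, o2))
                # R7: a statement with a subproperty also holds for the superproperty.
                if p2 == SUBPROP and p == s2:
                    new.add((s, o2, o))
                # R2: the subject of a property is typed with the property's domain.
                if p2 == DOMAIN and p == s2:
                    new.add((s, TYPE, o2))
                # R3: the object of a property is typed with the property's range
                # (this sketch assumes the object is a resource, not a literal).
                if p2 == RANGE and p == s2:
                    new.add((o, TYPE, o2))
        if new <= closure:
            return closure
        closure |= new


explicit = {
    (":human", SUBCLASS, ":mammal"),
    (":man", SUBCLASS, ":human"),
    (":John", TYPE, ":man"),
    (":hasSpouse", SUBPROP, ":hasRelative"),
    (":hasSpouse", DOMAIN, ":human"),
    (":hasSpouse", RANGE, ":human"),
    (":John", ":hasSpouse", ":Merry"),
}
inferred = rdfs_closure(explicit) - explicit
```

The nested loop is quadratic per pass and only meant to make the rules readable; a real triplestore would use indexes rather than scanning all pairs.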

Web Ontology Language (OWL)

• More expressive than RDFS
   – Identity equivalence/difference
      • sameAs, differentFrom, equivalentClass/Property

• Complex class expressions
   – Class intersection, union, complement, disjointness
• More expressive property definitions
   – Object/Datatype properties
   – Cardinality restrictions
   – Transitive, functional, symmetric, inverse properties



OWL (2)
• Identity equivalence
   db1:Paris :hasPopulation 2193031 .
   db1:Paris = db2:Paris .
   db2:Paris :hasPopulation 2193031 .              # inferred

• Transitive properties
   :locatedIn a owl:TransitiveProperty .
   :ECP :locatedIn :Paris .
   :Paris :locatedIn :France .
   :ECP :locatedIn :France .                       # inferred

• Symmetric properties
   :hasSpouse a owl:SymmetricProperty .
   :John :hasSpouse :Merry .
   :Merry :hasSpouse :John .                       # inferred

• Inverse properties
   :hasParent owl:inverseOf :hasChild .
   :John :hasChild :Jane .
   :Jane :hasParent :John .                        # inferred

• Functional properties
   :hasSpouse a owl:FunctionalProperty .
   :Merry :hasSpouse :John .
   :Merry :hasSpouse :JohnSmith .
   :JohnSmith = :John .                            # inferred
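Three of the property characteristics above (transitive, symmetric, inverse) can be sketched as simple rules over a set of triples. This is a hedged illustration in Python, not an OWL reasoner; the function name owl_property_closure is an assumption, and identity and functional-property inference are deliberately left out.

```python
TYPE = "rdf:type"  # written "a" in the slide examples


def owl_property_closure(triples):
    """Apply transitive, symmetric and inverse-property rules to a fixpoint."""
    closure = set(triples)
    while True:
        transitive = {s for (s, p, o) in closure
                      if p == TYPE and o == "owl:TransitiveProperty"}
        symmetric = {s for (s, p, o) in closure
                     if p == TYPE and o == "owl:SymmetricProperty"}
        inverse_pairs = {(s, o) for (s, p, o) in closure if p == "owl:inverseOf"}
        new = set()
        for (s, p, o) in closure:
            if p in symmetric:
                new.add((o, p, s))
            if p in transitive:
                for (s2, p2, o2) in closure:
                    if p2 == p and s2 == o:
                        new.add((s, p, o2))
            for (a, b) in inverse_pairs:
                if p == a:
                    new.add((o, b, s))
                if p == b:
                    new.add((o, a, s))
        if new <= closure:
            return closure
        closure |= new


explicit = {
    (":locatedIn", TYPE, "owl:TransitiveProperty"),
    (":ECP", ":locatedIn", ":Paris"),
    (":Paris", ":locatedIn", ":France"),
    (":hasSpouse", TYPE, "owl:SymmetricProperty"),
    (":John", ":hasSpouse", ":Merry"),
    (":hasParent", "owl:inverseOf", ":hasChild"),
    (":John", ":hasChild", ":Jane"),
}
result = owl_property_closure(explicit)
```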


OWL sublanguages

• OWL Lite
  –   low expressivity / low formal complexity
  –   Logical decidability & completeness
  –   All RDFS features
  –   sameAs/differentFrom, equivalent class/property
  –   Inverse / symmetric / transitive / functional properties
  –   cardinality restriction (only 0 or 1)
  –   class intersection




OWL sublanguages (2)

• OWL DL
  –   high expressivity / efficient DL reasoning
  –   Logical decidability & completeness
  –   All OWL Lite features
  –   Class disjointness
  –   Complex class expressions
  –   Class union & complement
• OWL Full
  – max expressivity / no efficient reasoning
  – No guarantees for completeness & decidability


OWL 2 profiles

• Goals
  – sublanguages that trade expressiveness for efficiency of
    reasoning
  – Cover specific important application areas
  – Easier to understand by non-experts
• OWL 2 EL
  – Best for large ontologies / small instance data (TBox
    reasoning)
  – Computationally optimal
     • PTime reasoning complexity



OWL 2 profiles (2)

• OWL 2 QL
  – Quite limited expressive power, but very efficient for
    query answering with large instance data
  – Can exploit query rewriting techniques
     • Data storage & query evaluation can be delegated to a RDBMS

• OWL 2 RL
  – Balance between scalable reasoning and expressive power
  – Suitable for rule-based reasoning




OWL 2 profiles (3)




                                                              (c) Axel Polleres



SPARQL Protocol and RDF Query Language (SPARQL)

• SQL-like query language for RDF data
• Simple protocol for querying remote databases over
  HTTP
• Query types
   –   select – query data by complex graph patterns
   –   ask – whether a query returns results (result is true/false)
   –   describe – returns all triples about a particular resource
   –   construct – create new triples based on query results
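As a rough illustration of the HTTP protocol mentioned above: a query can be sent with GET by passing the query text in a URL parameter named query. The sketch below only builds the URL and sends nothing; the endpoint address is a hypothetical placeholder and the helper name sparql_get_url is an assumption.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def sparql_get_url(endpoint, query):
    """Build the URL for a SPARQL query sent via HTTP GET."""
    return endpoint + "?" + urlencode({"query": query})


ENDPOINT = "http://example.org/sparql"  # hypothetical endpoint, not a real service
ask_query = "ASK { ?s ?p ?o }"

url = sparql_get_url(ENDPOINT, ask_query)
# Round trip: decoding the URL gives back the original query text.
decoded = parse_qs(urlsplit(url).query)["query"][0]
```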




Graph patterns

    • Whitespace separated list of Subject, Predicate,
      Object
      – ?x dbp-ont:city dbpedia:Paris
      – dbpedia:École_centrale_Paris dbp-ont:city ?y
    • Group Graph Pattern
      – A group of 1+ graph patterns
      – FILTERs can constrain the whole group
{
     ?uni a dbpedia:University ;
          dbp-ont:city dbpedia:Paris ;
          dbp-ont:numberOfStudents ?students .
     FILTER (?students > 5000)
}
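To show how a group graph pattern with a FILTER is evaluated, here is a minimal Python sketch of triple-pattern matching over an in-memory list. It is not how a real SPARQL engine is implemented; the function names unify and evaluate, the resources :uniA, :uniB and :uniC, and their student numbers are all made up for illustration.

```python
def unify(pattern, triple, binding):
    """Extend `binding` so that `pattern` equals `triple`, or return None."""
    result = dict(binding)
    for term, value in zip(pattern, triple):
        if isinstance(term, str) and term.startswith("?"):
            # A variable: bind it, or check it against its existing value.
            if result.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return result


def evaluate(patterns, graph, binding=None):
    """Return every variable binding that satisfies all triple patterns."""
    binding = {} if binding is None else binding
    if not patterns:
        return [binding]
    solutions = []
    for triple in graph:
        extended = unify(patterns[0], triple, binding)
        if extended is not None:
            solutions.extend(evaluate(patterns[1:], graph, extended))
    return solutions


# Invented sample data; the numbers are not real statistics.
graph = [
    (":uniA", "a", "dbpedia:University"),
    (":uniA", "dbp-ont:city", "dbpedia:Paris"),
    (":uniA", "dbp-ont:numberOfStudents", 12000),
    (":uniB", "a", "dbpedia:University"),
    (":uniB", "dbp-ont:city", "dbpedia:Paris"),
    (":uniB", "dbp-ont:numberOfStudents", 3000),
    (":uniC", "a", "dbpedia:University"),
    (":uniC", "dbp-ont:city", "dbpedia:Lyon"),
    (":uniC", "dbp-ont:numberOfStudents", 20000),
]

patterns = [
    ("?uni", "a", "dbpedia:University"),
    ("?uni", "dbp-ont:city", "dbpedia:Paris"),
    ("?uni", "dbp-ont:numberOfStudents", "?students"),
]

# FILTER (?students > 5000) constrains the whole group of patterns.
results = [b for b in evaluate(patterns, graph) if b["?students"] > 5000]
```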
Graph Patterns (2)

 • Optional Graph Pattern
   – Optional parts of a pattern
   – pattern OPTIONAL {pattern}

SELECT ?uni ?students
WHERE {
  ?uni a dbpedia:University ;
       dbp-ont:city dbpedia:Paris .
  OPTIONAL {
     ?uni dbp-ont:numberOfStudents ?students
  }
}
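The effect of OPTIONAL resembles a left outer join: a row is kept even when the optional part has no match, and the variable stays unbound. The Python sketch below illustrates this under simplifying assumptions; the data is invented, None stands in for an unbound variable, and select_with_optional is a hypothetical helper name.

```python
# Invented sample data: :uniB has no numberOfStudents triple.
graph = {
    (":uniA", "a", "dbpedia:University"),
    (":uniA", "dbp-ont:city", "dbpedia:Paris"),
    (":uniA", "dbp-ont:numberOfStudents", 12000),
    (":uniB", "a", "dbpedia:University"),
    (":uniB", "dbp-ont:city", "dbpedia:Paris"),
    (":uniC", "a", "dbpedia:University"),
    (":uniC", "dbp-ont:city", "dbpedia:Lyon"),
}


def select_with_optional(graph):
    """Universities in Paris, with the student count when one is available."""
    rows = []
    for (s, p, o) in graph:
        # Required part of the pattern.
        if (p == "dbp-ont:city" and o == "dbpedia:Paris"
                and (s, "a", "dbpedia:University") in graph):
            students = [o2 for (s2, p2, o2) in graph
                        if s2 == s and p2 == "dbp-ont:numberOfStudents"]
            # OPTIONAL part: keep the row even if nothing matched.
            for value in students or [None]:
                rows.append((s, value))
    return sorted(rows, key=lambda row: row[0])


rows = select_with_optional(graph)
```

Without OPTIONAL, the row for :uniB would be dropped entirely, as in an inner join.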


Graph Patterns (3)

 • Alternative Graph Pattern
    – Combine results of several alternative patterns
    – {pattern} UNION {pattern}
SELECT ?uni
WHERE {
  ?uni a dbpedia:University .
  {
     { ?uni dbp-ont:city dbpedia:Paris }
     UNION
     { ?uni dbp-ont:city dbpedia:Lyon }
  }
}

Anatomy of a SPARQL query

• List of namespace prefixes
   – PREFIX xyz: <URI>
• Query result clause (variables)
   – ?x, $y
• Datasets
• Graph patterns + filters
   – Simple / group / alternative / optional
• Modifiers
   – ORDER BY, DISTINCT, OFFSET/LIMIT


Linked Data

• “To make the Semantic Web a reality, it is necessary to have a
  large volume of data available on the Web in a standard,
  reachable and manageable format. In addition the
  relationships among data also need to be made available. This
  collection of interrelated data on the Web can also be referred
  to as Linked Data. Linked Data lies at the heart of the
  Semantic Web: large scale integration of, and reasoning on,
  data on the Web.” (W3C)
• Linked Data is a set of principles that allows publishing,
  querying and consumption of RDF data, distributed across
  different servers
   • similar to the way HTML is currently published & consumed


Linked Data design principles

1.       Unambiguous identifiers for objects (resources)
     –     Use URIs as names for things
2.       Use the structure of the web
     –     Use HTTP URIs so that people can look up the names
3.       Make it easy to discover information about an
         object (resource)
     –     When someone looks up a URI, provide useful
           information
4.       Link the object (resource) to related objects
     –     Include links to other URIs
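Principles 2 and 3 are commonly realised with HTTP content negotiation: the client looks up the URI and asks for an RDF serialization in the Accept header. The sketch below only prepares such a request and does not send it; whether a given server answers with Turtle is an assumption, and linked_data_request is a hypothetical helper name.

```python
from urllib.request import Request


def linked_data_request(uri, media_type="text/turtle"):
    """Prepare (but do not send) an HTTP GET that asks for RDF about `uri`."""
    return Request(uri, headers={"Accept": media_type}, method="GET")


request = linked_data_request("http://dbpedia.org/resource/Paris")
# urllib.request.urlopen(request) would perform the actual lookup.
```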

Linked Data evolution – Oct 2007




                                                             (c) R. Cyganiak & A. Jentzsch




Linked Data evolution – Sep 2008




                                                             (c) R. Cyganiak & A. Jentzsch




Linked Data evolution – Jul 2010




                                                             (c) R. Cyganiak & A. Jentzsch


Linked Data evolution – Sep 2010




                                                           (c) R. Cyganiak & A. Jentzsch

SEMANTIC DATABASES
(TRIPLESTORES)



Triplestores

• RDF databases
   – Store data according to the RDF data model
   – Provide inference of implicit triples (either at data loading
     time, or at query time)
   – SPARQL as a query language
• Many similarities to traditional DBMS approaches
   – … and many differences too




Triplestores vs. traditional DBMS

                                       Triplestore   OLTP   OLAP   NoSQL   Graph

Update performance                     +/-           +      +/-    ++      +
Complex queries                        +             +      ++     -       +
Inference                              +             -      +/-    -       -
Sparse data                            +             -      +/-    +       +
Semi-structured / unstructured data    +             -      -      +/-     +
Dynamic schema                         +             -      -      +/-     +/-




Triplestore advantages

• Global identifiers of resources (entities)
   – Lowers the cost of data integration
• Inference of implicit facts
• Graph data model
   – Suitable for sparse, semi-structured and unstructured data
• Agile schema
   – New relations between entities may be easily added
• Exploratory queries against unknown schema
   – Query and data vocabulary may differ
• Compliance with standards (RDF, SPARQL)

Design & Implementation

• Storage engine
   – Native
   – on top of an RDBMS
   – on top of a NoSQL engine
• Triple density
   – The in-memory “footprint” per triple may differ by a
     factor of 10 between different triplestores
   – Impact on TCO
• Compression
   – Improves I/O on multi-core systems

Design & Implementation (2)

• Reasoning strategy
  – Forward-chaining – at data loading time, start from the
    explicit facts and apply the inference rules until the
    complete closure is inferred
  – Backward-chaining – at runtime, start with a query and
    decompose recursively into smaller requests that can be
    matched to explicit facts
  – Hybrid strategy – partial materialization at data loading
    time + partial query decomposition at runtime
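In contrast to materializing the closure at load time, backward chaining answers a question by decomposing it at query time. The Python sketch below illustrates this for class membership only; the function name is_instance_of is an assumption, and the facts reuse the class hierarchy from the RDFS slides.

```python
def is_instance_of(individual, cls, facts, seen=None):
    """Backward chaining: try to prove `individual a cls` from explicit facts."""
    seen = set() if seen is None else seen
    # Base case: the goal matches an explicit fact.
    if (individual, "rdf:type", cls) in facts:
        return True
    # Guard against cycles in the class hierarchy.
    if cls in seen:
        return False
    seen.add(cls)
    # Decompose the goal: it holds if the individual belongs to any subclass.
    return any(
        is_instance_of(individual, sub, facts, seen)
        for (sub, p, sup) in facts
        if p == "rdfs:subClassOf" and sup == cls
    )


facts = {
    (":human", "rdfs:subClassOf", ":mammal"),
    (":man", "rdfs:subClassOf", ":human"),
    (":John", "rdf:type", ":man"),
}

john_is_mammal = is_instance_of(":John", ":mammal", facts)
john_is_bird = is_instance_of(":John", ":bird", facts)
```

No inferred triple is ever stored: the set of facts stays at three entries, and the cost is paid on each query instead.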




Pros and cons of forward-chaining based materialization

• Relatively slow addition of new facts
   – inferred closure is extended after each transaction
• Deletion of facts is slow
   – facts that are no longer true are removed from the
     inferred closure
• The maintenance of the inferred closure requires
  more resources
• Querying and retrieval are fast
   – no reasoning is required at query time
   – RDBMS-like query evaluation & optimisation techniques
     are applicable

Design & Implementation (3)

• Invalidation strategy
   – Truth maintenance is not trivial
   – huge overhead for keeping meta-data about inference
     dependencies
   – Trivial approach: just re-compute the complete inferred
     closure after a deletion
   – Advanced approach: detect which parts of the inferred
     closure are affected and need to be invalidated as well
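The trivial approach can be sketched as follows: keep the explicit triples separately, extend the closure incrementally on insert, and rebuild it from scratch on delete. This Python sketch handles a single transitive property only; the class name MaterializedStore and its methods are hypothetical.

```python
def transitive_closure(triples, prop):
    """Add every triple implied by the transitivity of `prop`."""
    closure = set(triples)
    changed = True
    while changed:
        changed = False
        for (a, p, b) in list(closure):
            for (c, q, d) in list(closure):
                if p == q == prop and b == c and (a, prop, d) not in closure:
                    closure.add((a, prop, d))
                    changed = True
    return closure


class MaterializedStore:
    """Forward-chaining store that tracks explicit and inferred triples."""

    def __init__(self, prop):
        self.prop = prop
        self.explicit = set()
        self.closure = set()

    def add(self, triple):
        self.explicit.add(triple)
        # Insertion is monotonic, so the existing closure can be extended.
        self.closure = transitive_closure(self.closure | {triple}, self.prop)

    def delete(self, triple):
        self.explicit.discard(triple)
        # Trivial invalidation: discard the closure and recompute everything.
        self.closure = transitive_closure(self.explicit, self.prop)


store = MaterializedStore(":locatedIn")
store.add((":ECP", ":locatedIn", ":Paris"))
store.add((":Paris", ":locatedIn", ":France"))
inferred_before = (":ECP", ":locatedIn", ":France") in store.closure

store.delete((":Paris", ":locatedIn", ":France"))
inferred_after = (":ECP", ":locatedIn", ":France") in store.closure
```

The advanced approach would instead record which explicit triples each inferred triple depends on, which is the meta-data overhead the slide refers to.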




Design & Implementation (4)

• owl:sameAs optimization
  – It is a transitive, reflexive and symmetric relationship
  – owl:sameAs induced inference can “inflate” the number of
    statements and deteriorate inference/query performance
  – Specific optimizations allow a compact representation
    of equivalent resources
  – Query results can be expanded through backward-chaining
    at query time
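One common way to get such a compact representation is a union-find (disjoint-set) structure: each group of equivalent resources gets one representative, statements are stored once against it, and queries are expanded through it. The Python sketch below is an assumption about how this could look, not a description of any particular triplestore; SameAsIndex and query are hypothetical names.

```python
class SameAsIndex:
    """Union-find over resources connected by owl:sameAs."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        """Return the representative of the equivalence class of x."""
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def same_as(self, a, b):
        """Record `a owl:sameAs b` by merging the two classes."""
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            self.parent[root_b] = root_a

    def equivalents(self, x):
        """All resources known to be the same as x (including x)."""
        root = self.find(x)
        return sorted(n for n in list(self.parent) if self.find(n) == root)


index = SameAsIndex()
index.same_as("db1:Paris", "db2:Paris")

# The statement is stored once, against the representative only.
stored = {(index.find("db1:Paris"), ":hasPopulation", 2193031)}


def query(subject, predicate):
    """Expand at query time: look the subject up via its representative."""
    representative = index.find(subject)
    return [o for (s, p, o) in stored if s == representative and p == predicate]


population = query("db2:Paris", ":hasPopulation")
```

This sketch assumes equivalences are registered before statements are stored; merging classes afterwards would require re-keying stored statements to the new representative.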




Popular triplestores

• 4store
  – http://4store.org
  – Open source, distributed cluster (up to 32 nodes), data
    fully partitioned, no inference (external reasoner,
    backward chaining)
• AllegroGraph
  – http://www.franz.com
  – ACID transactions, RDF and limited OWL reasoning, full-
    text indexing, compression, replication cluster, backward
    chaining



Popular triplestores (2)

• Bigdata
  – http://www.systap.com
  – Open source, data partitioning, hybrid materialization, RDF
    and limited OWL reasoning
• Dydra
  – http://dydra.com
  – SaaS, SPARQL endpoint + REST API, no reasoning
• Jena TDB
  – http://www.openjena.org/TDB
  – Open source, RDF and limited OWL reasoning

Popular triplestores (3)

• Oracle
  – RDF and limited OWL reasoning, data partitioning &
    compression (RAC), owl:sameAs optimization, security &
    versioning; geo-spatial extensions
• OWLIM
  – http://www.ontotext.com
  – Forward-chaining, RDF / OWL 2 RL / OWL 2 QL and limited
    OWL Lite / OWL DL reasoning; replication cluster;
    owl:sameAs optimization; full-text indexing; geo-spatial
    extensions; scalable RDF Rank



Popular triplestores (4)

• Sesame
  – http://www.openrdf.org
  – Open source; pluggable storage & inference layer
• StarDog
  – http://stardog.com
  – backward-chaining; OWL DL and all OWL 2 profiles; full-
    text indexing; compression
• Virtuoso
  – http://virtuoso.openlinksw.com
  – Universal server (RDF, XML, RDBMS); backward chaining;
    geo-spatial extensions; full-text indexing; compression

Benchmarking

• Tasks
  – Data loading
  – Query evaluation
  – Data modification
• Performance factors
  –   Forward-chaining vs. backward-chaining
  –   Data model complexity
  –   Query complexity
  –   Result set size
  –   Number of concurrent clients


Popular benchmarks for triplestores

• LUBM
  – Storing & query performance benchmark
  – 14 predefined queries + data generator tool
• BSBM
  – SPARQL benchmark
  – 3 use cases (12/17/8 distinct queries)
• SP2Bench
  – SPARQL benchmark
  – 12 queries for most common SPARQL constructs & access
    patterns

SEMANTIC TECHNOLOGIES &
TRIPLESTORES FOR BI



Data integration & querying (HCLS)




distributed querying at present




                                                                        distributed querying with RDF and SPARQL


                                                                                              (c) HCLS @ W3C



Data integration & querying (HCLS)




                                                              (c) HCLS @ W3C




Data integration cost (PwC)




                                                           (c) PricewaterhouseCoopers

Semantic Technologies & Triplestores for BI

• Speed up data integration
   – RDF based ETL is more agile
• Lower the cost of data integration
   – Initial cost of using ontologies is higher
   – But the cost of ad-hoc ETL will be higher in the long term
• Align & integrate legacy data silos
   – Querying & consuming data from disparate sources is
     easier with SPARQL & RDF
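A small Python sketch of why shared identifiers make integration cheaper: when two silos use the same URI for the same entity, merging them is a set union, and a query can then follow links across the former silo boundary. The two silos, the customer URI, the ex:locatedIn predicate and the helper name are all invented for illustration.

```python
PARIS = "http://dbpedia.org/resource/Paris"
CUSTOMER = "http://example.org/customer/42"  # hypothetical identifier

# Two hypothetical data silos that happen to share the URI for Paris.
crm_silo = {(CUSTOMER, "ex:locatedIn", PARIS)}
geo_silo = {(PARIS, "hasPopulation", 2193031)}

# Integration step: no schema mapping, just a union of the two triple sets.
merged = crm_silo | geo_silo


def population_of_customer_city(graph, customer):
    """Follow customer -> city -> population across the merged graph."""
    cities = [o for (s, p, o) in graph
              if s == customer and p == "ex:locatedIn"]
    return [o2 for city in cities
            for (s2, p2, o2) in graph
            if s2 == city and p2 == "hasPopulation"]


answer_merged = population_of_customer_city(merged, CUSTOMER)
answer_single_silo = population_of_customer_city(crm_silo, CUSTOMER)
```

In practice the silos rarely agree on identifiers up front, which is where the initial ontology and alignment cost mentioned above comes from.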




Semantic Technologies & Triplestores for BI (2)

• Infer implicit & hidden knowledge
   – Custom, user-defined rules as well
• Efficiently manage unstructured & semi-structured
  data together
   – graph data model
• Improve the quality of query results
   – Inference of implicit facts
   – SPARQL query vocabulary may differ from data vocabulary
   – Exploratory queries



Q&A




      Questions?
                         @ontotext




8th TUC Meeting – Yinglong Xia (Huawei), Big Graph Analytics Engine
 

Semantic Technologies and Triplestores for Business Intelligence

  • 1. Semantic Technologies & Triplestores for BI
      1st European Business Intelligence Summer School (eBISS 2011)
      Marin Dimitrov (Ontotext), Jul 2011
  • 2. eBISS 2011
  • 3. Contents
      • Introduction to Semantic Technologies
      • Semantic Databases – advantages, features and benchmarks
      • Semantic Technologies and Triplestores for BI
  • 4. INTRODUCTION TO SEMANTIC TECHNOLOGIES
  • 5. The need for a smarter Web
      • “The Semantic Web is an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation.” (Tim Berners-Lee, 2001)
  • 6. The need for a smarter Web (2)
      • “PricewaterhouseCoopers believes a Web of data will develop that fully augments the document Web of today. You’ll be able to find and take pieces of data sets from different places, aggregate them without warehousing, and analyze them in a more straightforward, powerful way than you can now.” (PWC, May 2009)
  • 7. The Semantic Web vision (W3C)
      • Extend principles of the Web from documents to data
      • Data should be accessed using the general Web architecture (e.g., URIs, protocols, …)
      • Data should be related to one another just as documents are already
      • Creation of a common framework that allows:
          – Data to be shared and reused across applications
          – Data to be processed automatically
          – New relationships between pieces of data to be inferred
  • 8. The Semantic Web stack
      [Figure: the Semantic Web stack, (c) Benjamin Nowack]
  • 9. The Semantic Web timeline
      [Figure: timeline from 1999 to 2011 of Semantic Web standards and initiatives: RDF, RDF 2, DAML+OIL, OWL, OWL 2, SPARQL, SPARQL 1.1, RIF, RDFa, SAWSDL, LOD, SKOS, HCLS, RDB2RDF, GLD, PIL]
  • 10. Ontologies as data models on the Semantic Web
      • An ontology is a formal specification that provides a sharable and reusable knowledge representation
          – Examples: taxonomies, thesauri, topic maps, formal ontologies
      • An ontology specification includes
          – Description of the concepts in some domain and their properties
          – Description of the possible relationships between concepts and the constraints on how the relationships can be used
          – Sometimes, the individuals (members of concepts)
  • 11. Resource Description Framework (RDF)
      • A simple data model for
          – Formally describing the semantics of information
          – Representing metadata (data about data)
      • A set of representation syntaxes
          – RDF/XML (standard), N-Triples, N3
      • Building blocks
          – Resources (with unique identifiers)
          – Literals
          – Named relations between pairs of resources (or a resource and a literal)
  • 12. RDF (2)
      • Everything is a triple
          – Subject (resource), Predicate (relation), Object (resource or literal)
      • The RDF graph is a collection of triples

      Subject | Predicate | Object
      École Centrale | locatedIn | Paris
      Paris | hasPopulation | 2193031
  • 13. RDF graph example (3)

      Subject | Predicate | Object
      http://dbpedia.org/resource/Paris | hasName | “Paris”
      http://dbpedia.org/resource/Paris | hasPopulation | 2193031
      http://dbpedia.org/resource/École_centrale_Paris | locatedIn | http://dbpedia.org/resource/Paris
      http://dbpedia.org/resource/École_centrale_Paris | hasName | “École Centrale Paris”
      http://dbpedia.org/resource/École_centrale_Paris | establishedIn | 1829
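The triples on this slide can be held in an ordinary set and queried by pattern. The following is a minimal sketch in plain Python, not a real triplestore API: the `graph` variable, the `match` helper and the use of `None` as a wildcard are assumptions made for illustration, and the unprefixed predicate names are taken from the slide as-is.

```python
# Illustrative only: an RDF graph modelled as a set of (subject, predicate, object) tuples.
DBP = "http://dbpedia.org/resource/"

graph = {
    (DBP + "Paris", "hasName", "Paris"),
    (DBP + "Paris", "hasPopulation", 2193031),
    (DBP + "École_centrale_Paris", "locatedIn", DBP + "Paris"),
    (DBP + "École_centrale_Paris", "hasName", "École Centrale Paris"),
    (DBP + "École_centrale_Paris", "establishedIn", 1829),
}


def match(g, s=None, p=None, o=None):
    """Return the triples of g that fit the pattern; None acts as a wildcard."""
    return [
        t for t in g
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# Everything known about Paris, then every resource located in Paris.
about_paris = match(graph, s=DBP + "Paris")
located_in_paris = [s for (s, _, _) in match(graph, p="locatedIn", o=DBP + "Paris")]
```

Because the subject and the object of `locatedIn` are both global identifiers, following an edge is just a second `match` call on the value returned by the first.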
  • 14. RDF advantages
      • Global identifiers of all resources (URIs)
          – Reduces ambiguity
          – Makes incremental data integration easier
      • Graph data model
          – Suitable for sparse, unstructured and semi-structured data
      • Inference of implicit facts
      • Schema agility
          – Lowers the cost of schema evolution
  • 15. RDF Schema (RDFS)
      • RDFS provides means for:
          – Defining Classes and Properties
          – Defining hierarchies (of classes and properties)
          – Domain/range of a property
      • Entailment rules (axioms)
          – Infer new triples from existing ones
  • 16. RDFS entailment rules
  • 17. RDFS entailment rules (2)
    • Class/Property hierarchies – R5, R7, R9, R11
        :human rdfs:subClassOf :mammal .
        :man rdfs:subClassOf :human .
        :John a :man .
        ⇒ :John a :human .
        ⇒ :man rdfs:subClassOf :mammal .
        ⇒ :John a :mammal .
        :hasSpouse rdfs:subPropertyOf :hasRelative .
        :John :hasSpouse :Merry .
        ⇒ :John :hasRelative :Merry .
    • Inferring types (domain/range restrictions) – R2, R3
        :hasSpouse rdfs:domain :human ; rdfs:range :human .
        :Adam :hasSpouse :Eve .
        ⇒ :Adam a :human .
        ⇒ :Eve a :human .
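The six rules named on this slide can be applied mechanically until no new triple appears. Below is a rough sketch of that fixpoint loop over tuples, assuming rule numbers R2, R3, R5, R7, R9, R11 map to the W3C rules rdfs2 through rdfs11; `rdfs_closure` is a hypothetical helper, and a real engine would use indexes rather than a nested scan.

```python
SUBCLASS = "rdfs:subClassOf"
SUBPROP = "rdfs:subPropertyOf"
DOMAIN = "rdfs:domain"
RANGE = "rdfs:range"
TYPE = "rdf:type"  # written as "a" in Turtle


def rdfs_closure(triples):
    """Naive forward chaining: apply the rules until a fixpoint is reached."""
    closure = set(triples)
    while True:
        new = set()
        for s, p, o in closure:
            for s2, p2, o2 in closure:
                # rdfs11: subClassOf is transitive
                if p == SUBCLASS and p2 == SUBCLASS and o == s2:
                    new.add((s, SUBCLASS, o2))
                # rdfs9: instances of a class are instances of its superclasses
                if p == TYPE and p2 == SUBCLASS and o == s2:
                    new.add((s, TYPE, o2))
                # rdfs5: subPropertyOf is transitive
                if p == SUBPROP and p2 == SUBPROP and o == s2:
                    new.add((s, SUBPROP, o2))
                # rdfs7: a statement also holds for every super-property
                if p2 == SUBPROP and p == s2:
                    new.add((s, o2, o))
                # rdfs2: the subject of a property is typed by its domain
                if p2 == DOMAIN and p == s2:
                    new.add((s, TYPE, o2))
                # rdfs3: the object of a property is typed by its range
                if p2 == RANGE and p == s2:
                    new.add((o, TYPE, o2))
        if new <= closure:
            return closure
        closure |= new


facts = {
    (":human", SUBCLASS, ":mammal"),
    (":man", SUBCLASS, ":human"),
    (":John", TYPE, ":man"),
    (":hasSpouse", SUBPROP, ":hasRelative"),
    (":John", ":hasSpouse", ":Merry"),
    (":hasSpouse", DOMAIN, ":human"),
    (":hasSpouse", RANGE, ":human"),
    (":Adam", ":hasSpouse", ":Eve"),
}
inferred = rdfs_closure(facts) - facts
```

The `inferred` set then holds only the implicit triples, such as `:John` being a `:mammal`.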
  • 18. Web Ontology Language (OWL) • More expressive than RDFS – Identity equivalence/difference • sameAs, differentFrom, equivalentClass/Property • Complex class expressions – Class intersection, union, complement, disjointness • More expressive property definitions – Object/Datatype properties – Cardinality restrictions – Transitive, functional, symmetric, inverse properties
  • 19. OWL (2)
    • Identity equivalence (= denotes owl:sameAs)
        db1:Paris :hasPopulation 2193031 .
        db1:Paris = db2:Paris .
        ⇒ db2:Paris :hasPopulation 2193031 .
    • Transitive properties
        :locatedIn a owl:TransitiveProperty .
        :ECP :locatedIn :Paris .
        :Paris :locatedIn :France .
        ⇒ :ECP :locatedIn :France .
    • Symmetric properties
        :hasSpouse a owl:SymmetricProperty .
        :John :hasSpouse :Merry .
        ⇒ :Merry :hasSpouse :John .
    • Inverse properties
        :hasParent owl:inverseOf :hasChild .
        :John :hasChild :Jane .
        ⇒ :Jane :hasParent :John .
    • Functional properties
        :hasSpouse a owl:FunctionalProperty .
        :Merry :hasSpouse :John .
        :Merry :hasSpouse :JohnSmith .
        ⇒ :JohnSmith = :John .
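Three of these property characteristics (transitive, symmetric, inverse) can be sketched as simple rules over tuples. This is an illustrative approximation only: `owl_property_closure` is a hypothetical helper, and identity and functional-property reasoning are left out.

```python
TYPE = "rdf:type"
TRANSITIVE = "owl:TransitiveProperty"
SYMMETRIC = "owl:SymmetricProperty"
INVERSE_OF = "owl:inverseOf"


def owl_property_closure(triples):
    """Apply transitive, symmetric and inverse property rules to a fixpoint."""
    closure = set(triples)
    while True:
        transitive = {s for s, p, o in closure if p == TYPE and o == TRANSITIVE}
        symmetric = {s for s, p, o in closure if p == TYPE and o == SYMMETRIC}
        inverses = {(s, o) for s, p, o in closure if p == INVERSE_OF}
        new = set()
        for s, p, o in closure:
            if p in symmetric:
                new.add((o, p, s))
            for a, b in inverses:
                # a owl:inverseOf b: every (x b y) implies (y a x), and vice versa
                if p == b:
                    new.add((o, a, s))
                if p == a:
                    new.add((o, b, s))
            if p in transitive:
                for s2, p2, o2 in closure:
                    if p2 == p and s2 == o:
                        new.add((s, p, o2))
        if new <= closure:
            return closure
        closure |= new


facts = {
    (":locatedIn", TYPE, TRANSITIVE),
    (":ECP", ":locatedIn", ":Paris"),
    (":Paris", ":locatedIn", ":France"),
    (":hasSpouse", TYPE, SYMMETRIC),
    (":John", ":hasSpouse", ":Merry"),
    (":hasParent", INVERSE_OF, ":hasChild"),
    (":John", ":hasChild", ":Jane"),
}
closure = owl_property_closure(facts)
```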
  • 20. OWL sublanguages • OWL Lite – low expressivity / low formal complexity – Logical decidability & completeness – All RDFS features – sameAs/differentFrom, equivalent class/property – Inverse / symmetric / transitive / functional properties – cardinality restriction (only 0 or 1) – class intersection
  • 21. OWL sublanguages (2) • OWL DL – high expressivity / efficient DL reasoning – Logical decidability & completeness – All OWL Lite features – Class disjointness – Complex class expressions – Class union & complement • OWL Full – max expressivity / no efficient reasoning – No guarantees for completeness & decidability
  • 22. OWL 2 profiles • Goals – sublanguages that trade expressiveness for efficiency of reasoning – Cover specific important application areas – Easier to understand by non-experts • OWL 2 EL – Best for large ontologies / small instance data (TBox reasoning) – Computationally optimal • PTime reasoning complexity
  • 23. OWL 2 profiles (2) • OWL 2 QL – Quite limited expressive power, but very efficient for query answering with large instance data – Can exploit query rewriting techniques • Data storage & query evaluation can be delegated to an RDBMS • OWL 2 RL – Balance between scalable reasoning and expressive power – Suitable for rule-based reasoning
  • 24. OWL 2 profiles (3) (c) Axel Polleres
  • 25. SPARQL Protocol and RDF Query Language (SPARQL) • SQL-like query language for RDF data • Simple protocol for querying remote databases over HTTP • Query types – SELECT: query data by complex graph patterns – ASK: whether a query returns results (result is true/false) – DESCRIBE: returns all triples about a particular resource – CONSTRUCT: create new triples based on query results
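As a rough illustration of the HTTP side, the sketch below builds (but does not send) a SPARQL protocol GET request using only the Python standard library. The endpoint URL is a placeholder assumption and `build_request` is a hypothetical helper; the `query` parameter and the JSON results media type come from the W3C SPARQL protocol and results specifications.

```python
from urllib.parse import urlencode, urlsplit, parse_qs
from urllib.request import Request

# Placeholder endpoint (assumption); substitute a real SPARQL endpoint URL.
ENDPOINT = "http://example.org/sparql"
QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"


def build_request(endpoint, query):
    """Build a GET request that carries the query in the 'query' parameter."""
    url = endpoint + "?" + urlencode({"query": query})
    # Ask for SELECT/ASK results in the SPARQL JSON results format.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_request(ENDPOINT, QUERY)
# Sending it would be: urllib.request.urlopen(request); omitted here because
# the endpoint above does not exist.
```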
  • 26. Graph patterns
    • Whitespace-separated list of Subject, Predicate, Object
      – ?x dbp-ont:city dbpedia:Paris
      – dbpedia:École_centrale_Paris dbp-ont:city ?y
    • Group Graph Pattern
      – A group of one or more graph patterns
      – FILTERs can constrain the whole group
        {
          ?uni a dbpedia:University ;
               dbp-ont:city dbpedia:Paris ;
               dbp-ont:numberOfStudents ?students .
          FILTER (?students > 5000)
        }
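To make the evaluation of a group graph pattern concrete, here is a small sketch that joins triple patterns over shared variables and then applies the filter. It is a simplification of SPARQL semantics, `match_bgp` is a hypothetical helper, and the universities and student counts are invented sample data.

```python
def match_bgp(graph, patterns, binding=None):
    """Yield variable bindings that satisfy every triple pattern.

    Terms starting with '?' are variables; all other terms must match exactly.
    """
    binding = binding or {}
    if not patterns:
        yield dict(binding)
        return
    first, rest = patterns[0], patterns[1:]
    for triple in graph:
        candidate = dict(binding)
        ok = True
        for term, value in zip(first, triple):
            if isinstance(term, str) and term.startswith("?"):
                # Bind the variable, or check it against an earlier binding.
                if candidate.setdefault(term, value) != value:
                    ok = False
                    break
            elif term != value:
                ok = False
                break
        if ok:
            yield from match_bgp(graph, rest, candidate)


# Invented sample data (not real DBpedia figures).
graph = [
    ("ex:UniA", "a", "dbpedia:University"),
    ("ex:UniA", "dbp-ont:city", "dbpedia:Paris"),
    ("ex:UniA", "dbp-ont:numberOfStudents", 12000),
    ("ex:UniB", "a", "dbpedia:University"),
    ("ex:UniB", "dbp-ont:city", "dbpedia:Paris"),
    ("ex:UniB", "dbp-ont:numberOfStudents", 3000),
    ("ex:UniC", "a", "dbpedia:University"),
    ("ex:UniC", "dbp-ont:city", "dbpedia:Lyon"),
    ("ex:UniC", "dbp-ont:numberOfStudents", 9000),
]
patterns = [
    ("?uni", "a", "dbpedia:University"),
    ("?uni", "dbp-ont:city", "dbpedia:Paris"),
    ("?uni", "dbp-ont:numberOfStudents", "?students"),
]
# FILTER (?students > 5000) applied to the whole group.
results = [b for b in match_bgp(graph, patterns) if b["?students"] > 5000]
```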
  • 27. Graph Patterns (2)
    • Optional Graph Pattern
      – Optional parts of a pattern
      – pattern OPTIONAL {pattern}
        SELECT ?uni ?students
        WHERE {
          ?uni a dbpedia:University ;
               dbp-ont:city dbpedia:Paris .
          OPTIONAL { ?uni dbp-ont:numberOfStudents ?students }
        }
  • 28. Graph Patterns (3)
    • Alternative Graph Pattern
      – Combine results of several alternative patterns
      – {pattern} UNION {pattern}
        SELECT ?uni
        WHERE {
          ?uni a dbpedia:University .
          { { ?uni dbp-ont:city dbpedia:Paris }
            UNION
            { ?uni dbp-ont:city dbpedia:Lyon } }
        }
  • 29. Anatomy of a SPARQL query • List of namespace prefixes – PREFIX xyz: <URI> • Query result clause (variables) – ?x, $y • Datasets • Graph patterns + filters – Simple / group / alternative / optional • Modifiers – ORDER BY, DISTINCT, OFFSET/LIMIT
  • 30. Linked Data • “To make the Semantic Web a reality, it is necessary to have a large volume of data available on the Web in a standard, reachable and manageable format. In addition the relationships among data also need to be made available. This collection of interrelated data on the Web can also be referred to as Linked Data. Linked Data lies at the heart of the Semantic Web: large scale integration of, and reasoning on, data on the Web.” (W3C) • Linked Data is a set of principles that allows publishing, querying and consumption of RDF data, distributed across different servers • similar to the way HTML is currently published & consumed
  • 31. Linked Data design principles 1. Unambiguous identifiers for objects (resources) – Use URIs as names for things 2. Use the structure of the web – Use HTTP URIs so that people can look up the names 3. Make it easy to discover information about an object (resource) – When someone looks up a URI, provide useful information 4. Link the object (resource) to related objects – Include links to other URIs
  • 32. Linked Data evolution – Oct 2007 (c) R. Cyganiak & A. Jentzsch
  • 33. Linked Data evolution – Sep 2008 (c) R. Cyganiak & A. Jentzsch
  • 34. Linked Data evolution – Jul 2010 (c) R. Cyganiak & A. Jentzsch
  • 35. Linked Data evolution – Sep 2010 (c) R. Cyganiak & A. Jentzsch
  • 36. SEMANTIC DATABASES (TRIPLESTORES)
  • 37. Triplestores • RDF databases – Store data according to the RDF data model – Provide inference of implicit triples (either at data loading time, or at query time) – SPARQL as a query language • Many similarities to traditional DBMS approaches – … and many differences too
  • 38. Triplestores vs. traditional DBMS
                                          | Triplestore | OLTP | OLAP | NoSQL | Graph
    Update performance                    | +/-         | +    | +/-  | ++    | +
    Complex queries                       | +           | +    | ++   | -     | +
    Inference                             | +           | -    | +/-  | -     | -
    Sparse data                           | +           | -    | +/-  | +     | +
    Semi-structured / unstructured data   | +           | -    | -    | +/-   | +
    Dynamic schema                        | +           | -    | -    | +/-   | +/-
  • 39. Triplestore advantages • Global identifiers of resources (entities) – Lowers the cost of data integration • Inference of implicit facts • Graph data model – Suitable for sparse, semi-structured and unstructured data • Agile schema – New relations between entities may be easily added • Exploratory queries against unknown schema – Query and data vocabulary may differ • Compliance with standards (RDF, SPARQL)
  • 40. Design & Implementation • Storage engine – Native – On top of an RDBMS – On top of a NoSQL engine • Triple density – The in-memory “footprint” per triple may differ by a factor of 10 between triplestores – Impact on TCO • Compression – Improves I/O on multi-core systems
  • 41. Design & Implementation (2) • Reasoning strategy – Forward-chaining – at data loading time, start from the explicit facts and apply the inference rules until the complete closure is inferred – Backward-chaining – at runtime, start with a query and decompose recursively into smaller requests that can be matched to explicit facts – Hybrid strategy – partial materialization at data loading time + partial query decomposition at runtime
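Backward-chaining can be sketched for a single rule, class membership through rdfs:subClassOf: the query is decomposed recursively into sub-queries until one matches an explicit fact, and nothing is materialized. `is_instance_of` is a hypothetical helper, not the API of any triplestore.

```python
SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"


def is_instance_of(facts, individual, cls, _seen=None):
    """Answer 'individual a cls' at query time without storing inferred triples."""
    seen = _seen if _seen is not None else set()
    if cls in seen:  # guard against cycles in the class hierarchy
        return False
    seen.add(cls)
    # Base case: the request matches an explicit fact.
    if (individual, TYPE, cls) in facts:
        return True
    # Decompose: the individual is a cls if it is an instance of a direct subclass.
    return any(
        is_instance_of(facts, individual, sub, seen)
        for sub, p, sup in facts
        if p == SUBCLASS and sup == cls
    )


facts = {
    (":human", SUBCLASS, ":mammal"),
    (":man", SUBCLASS, ":human"),
    (":John", TYPE, ":man"),
}
john_is_mammal = is_instance_of(facts, ":John", ":mammal")
```

The trade-off against forward-chaining is visible here: the stored data stays at three triples, but every query pays for the recursive search.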
  • 42. Pros and cons of forward-chaining based materialization • Relatively slow addition of new facts – inferred closure is extended after each transaction • Deletion of facts is slow – facts that are no longer true are removed from the inferred closure • The maintenance of the inferred closure requires more resources • Querying and retrieval are fast – no reasoning is required at query time – RDBMS-like query evaluation & optimisation techniques are applicable
  • 43. Design & Implementation (3) • Invalidation strategy – Truth maintenance is not trivial – huge overhead for keeping meta-data about inference dependencies – Trivial approach: just re-compute the complete inferred closure after a deletion – Advanced approach: detect which parts of the inferred closure are affected and need to be invalidated as well
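The trivial invalidation approach can be sketched as a store that keeps explicit facts separate from the materialized closure and rebuilds the closure from scratch on every deletion. `MaterializedStore` and `transitive_closure` are hypothetical names, and a single transitive property stands in for a full rule set.

```python
LOCATED_IN = ":locatedIn"


def transitive_closure(triples):
    """Materialize the closure of one transitive property (illustrative rule set)."""
    closure = set(triples)
    while True:
        new = {
            (s, LOCATED_IN, o2)
            for s, p, o in closure if p == LOCATED_IN
            for s2, p2, o2 in closure if p2 == LOCATED_IN and s2 == o
        }
        if new <= closure:
            return closure
        closure |= new


class MaterializedStore:
    """Keeps explicit facts apart from the inferred closure."""

    def __init__(self):
        self.explicit = set()
        self.closure = set()

    def add(self, triple):
        # Additions are monotonic, so the existing closure can be extended.
        self.explicit.add(triple)
        self.closure = transitive_closure(self.closure | {triple})

    def delete(self, triple):
        # Trivial invalidation: recompute everything from the explicit facts.
        self.explicit.discard(triple)
        self.closure = transitive_closure(self.explicit)


store = MaterializedStore()
store.add((":ECP", LOCATED_IN, ":Paris"))
store.add((":Paris", LOCATED_IN, ":France"))
inferred_before = (":ECP", LOCATED_IN, ":France") in store.closure
store.delete((":Paris", LOCATED_IN, ":France"))
inferred_after = (":ECP", LOCATED_IN, ":France") in store.closure
```

The advanced approach on the slide would replace the full recomputation in `delete` with a targeted removal of only the inferred triples that depended on the deleted fact.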
  • 44. Design & Implementation (4) • owl:sameAs optimization – It is a transitive, reflexive and symmetric relationship – owl:sameAs induced inference can “inflate” the number of statements and deteriorate inference/query performance – Specific optimizations allow a compact representation of equivalent resources to be used – Query results can be expanded through backward-chaining at query time
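One way to picture a compact representation of equivalent resources is a union-find structure: each equivalence class gets one canonical representative, statements are stored once against it, and aliases are expanded only at query time. This is a sketch of the general idea under that assumption, not the mechanism of any specific triplestore; `SameAsIndex` and `population_of` are hypothetical names.

```python
class SameAsIndex:
    """Union-find over resources linked by owl:sameAs."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        """Return the canonical representative of x's equivalence class."""
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        """Record 'a owl:sameAs b' by merging the two classes."""
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[rb] = ra

    def members(self, x):
        """Expand a resource to all of its known aliases (query-time step)."""
        root = self.find(x)
        return {n for n in list(self.parent) if self.find(n) == root}


index = SameAsIndex()
index.union("db1:Paris", "db2:Paris")

# The statement is stored once, against the canonical representative.
triples = {(index.find("db1:Paris"), ":hasPopulation", 2193031)}


def population_of(resource):
    """Look up the population through the canonical representative."""
    root = index.find(resource)
    return next(o for s, p, o in triples if s == root and p == ":hasPopulation")


aliases = index.members("db2:Paris")
```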
  • 45. Popular triplestores • 4store – http://4store.org – Open source, distributed cluster (up to 32 nodes), data fully partitioned, no inference (external reasoner, backward chaining) • AllegroGraph – http://www.franz.com – ACID transactions, RDF and limited OWL reasoning, full-text indexing, compression, replication cluster, backward chaining
  • 46. Popular triplestores (2) • Bigdata – http://www.systap.com – Open source, data partitioning, hybrid materialization, RDF and limited OWL reasoning • Dydra – http://dydra.com – SaaS, SPARQL endpoint + REST API, no reasoning • Jena TDB – http://www.openjena.org/TDB – Open source, RDF and limited OWL reasoning
  • 47. Popular triplestores (3) • Oracle – RDF and limited OWL reasoning, data partitioning & compression (RAC), owl:sameAs optimization, security & versioning; geo-spatial extensions • OWLIM – http://www.ontotext.com – Forward-chaining, RDF / OWL 2 RL / OWL 2 QL and limited OWL Lite / OWL DL reasoning; replication cluster; owl:sameAs optimization; full-text indexing; geo-spatial extensions; scalable RDF Rank
  • 48. Popular triplestores (4) • Sesame – http://www.openrdf.org – Open source; pluggable storage & inference layer • StarDog – http://stardog.com – backward-chaining; OWL DL and all OWL 2 profiles; full-text indexing; compression • Virtuoso – http://virtuoso.openlinksw.com – Universal server (RDF, XML, RDBMS); backward chaining; geo-spatial extensions; full-text indexing; compression
  • 49. Benchmarking • Tasks – Data loading – Query evaluation – Data modification • Performance factors – Forward-chaining vs. backward-chaining – Data model complexity – Query complexity – Result set size – Number of concurrent clients
  • 50. Popular benchmarks for triplestores • LUBM – Storing & query performance benchmark – 14 predefined queries + data generator tool • BSBM – SPARQL benchmark – 3 use cases (12/17/8 distinct queries) • SP2Bench – SPARQL benchmark – 12 queries for most common SPARQL constructs & access patterns
  • 51. SEMANTIC TECHNOLOGIES & TRIPLESTORES FOR BI
  • 52. Data integration & querying (HCLS) • Figure: distributed querying at present vs. distributed querying with RDF and SPARQL (c) HCLS @ W3C
  • 53. Data integration & querying (HCLS) (c) HCLS @ W3C
  • 54. Data integration cost (PwC) (c) PricewaterhouseCoopers
  • 55. Semantic Technologies & Triplestores for BI • Speed up data integration – RDF-based ETL is more agile • Lower the cost of data integration – Initial cost of using ontologies is higher – But the cost of ad-hoc ETL will be higher in the long term • Align & integrate legacy data silos – Querying & consuming data from disparate sources is easier with SPARQL & RDF
  • 56. Semantic Technologies & Triplestores for BI (2) • Infer implicit & hidden knowledge – Custom, user-defined rules as well • Efficiently manage unstructured & semi-structured data together – graph data model • Improve the quality of query results – Inference of implicit facts – SPARQL query vocabulary may differ from data vocabulary – Exploratory queries
  • 57. Q&A Questions? @ontotext