Data integration at our group:
     ingredients and some
            prospects

                    Credible workshop
             Sophia-Antipolis, October 15th 2012


                                   Oscar Corcho
                                  ocorcho@fi.upm.es
               Facultad de Informática, Universidad Politécnica de Madrid
          Campus de Montegancedo s/n. 28660 Boadilla del Monte, Madrid, Spain



 With contributions from: José Mora (OEG-UPM), Boris Villazón-Terrazas (OEG-UPM, now at
iSOCO), Jean Paul Calbimonte (OEG-UPM), Freddy Priyatna (OEG-UPM), Carlos Buil-Aranda
                               (OEG-UPM, now at PUC Chile)
Our data integration needs, problems (and challenges)




Need to access heterogeneous relational
data sources (mainly in the area of Geography)
 • Some of the databases are available
   in different DBMSs
 • And some of the data sources are
   available as spreadsheets
 • Furthermore, many of these datasets
   are already published as Linked Data

And data may be available from data
streams (e.g., sensors)

Need to submit SPARQL queries to
distributed SPARQL endpoints
Ingredients

[Figure: layered architecture showing the numbered ingredients of this talk]
 • Thin applications (mashups)
 • Middleware: semantic data integration and querying
 • Registries
 • Data sources: legacy data sources, sensor networks, Linked Open Data, spreadsheets
 • Numbered ingredients:
    1. RDB2RDF
    2. Optimisations
    3. Sensor-based query rewriting
    4. Federated Query Processing
    5. Reasoning
From SemsorGrid4Env architecture (http://www.semsorgrid4env.eu/)
Disclaimer




   When I talk about ontology-based querying,
I will normally be talking about SPARQL querying
1. RDB2RDF

In other words, how to make relational data available as
RDF (and connected to ontologies)




RDB2RDF. Motivation
• A majority of dynamic Web content is backed by relational databases (RDB),
  and so are many enterprise systems.




[Figure: transformation description and transformation engine]
RDB2RDF. Query rewriting for OBDA with mappings


[Figure: query Q is rewritten into Q’, using the mappings]

There may be some mappings to translate between ontology and DB.
The rewriting should consider those mappings.
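
As a rough illustration of the idea on this slide, the sketch below rewrites a single triple pattern into SQL by looking up a mapping. The `Mapping` class, the `MAPPINGS` table and the prefixed predicate names are hypothetical and only echo the BSBM-style schema used later in the talk. It is a toy under those assumptions, not how R2O or Morph are implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mapping:
    """Hypothetical mapping of one ontology property to one table."""
    table: str
    subject_column: str
    object_column: str


# Assumed mappings, for illustration only.
MAPPINGS = {
    "foaf:name": Mapping("PERSON", "PERSONID", "NAME"),
    "rev:text": Mapping("REVIEW", "REVIEWID", "TEXT"),
}


def rewrite_triple_pattern(subject_var, predicate, object_var, mappings=MAPPINGS):
    """Rewrite the pattern `?subject_var predicate ?object_var` into SQL.

    The query Q (a triple pattern over the ontology) becomes Q' (SQL over
    the database) by consulting the mapping for the predicate.
    """
    m = mappings[predicate]
    return (
        f"SELECT {m.subject_column} AS {subject_var}, "
        f"{m.object_column} AS {object_var} "
        f"FROM {m.table} WHERE {m.object_column} IS NOT NULL"
    )


if __name__ == "__main__":
    print(rewrite_triple_pattern("reviewer", "foaf:name", "reviewerName"))
```

A real rewriter also has to join several such sub-queries and handle constants, URIs and OPTIONAL patterns; this only shows where the mapping enters the process.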

RDB2RDF. Existing approaches

1.   To build a new ontology from a
     database schema and content
     (direct mappings)

2.   To map the ontology created in
     approach (1) to a legacy ontology

3.   To map an existing DB to a legacy
     ontology

[Figure: the three approaches, numbered 1 to 3; legend: new ontology, existing ontology]
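
Approach 1 (direct mapping) can be sketched in a few lines: each row becomes a subject IRI built from the table name and primary key, typed with a class named after the table, with one property per column. This is a simplified sketch in the spirit of the W3C Direct Mapping; the function name, base IRI and sample row are assumptions, and it ignores foreign keys, IRI percent-encoding, datatypes and tables without a primary key.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def direct_map_row(base, table, primary_key, row):
    """Turn one relational row (a dict) into a list of RDF-like triples."""
    # Subject IRI: <base><table>/<pk column>=<pk value>
    subject = f"{base}{table}/{primary_key}={row[primary_key]}"
    # The table name gives the class of the row.
    triples = [(subject, RDF_TYPE, f"{base}{table}")]
    for column, value in row.items():
        if value is None:
            continue  # NULL values produce no triple
        # Each column gives a property <base><table>#<column>.
        triples.append((subject, f"{base}{table}#{column}", value))
    return triples


if __name__ == "__main__":
    # Hypothetical row, for illustration only.
    row = {"PERSONID": 7, "NAME": "Ann", "EMAIL": None}
    for triple in direct_map_row("http://example.org/db/", "PERSON", "PERSONID", row):
        print(triple)
```

The ontology produced this way mirrors the database schema, which is why approaches 2 and 3 add a mapping step towards a legacy ontology.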
OEG’s background knowledge in RDB2RDF

• R2O and ODEMapster
       • GaV wrapper generation (no mediators)
           • Syntactic sugar for the generation of SQL queries.
       • Simple use of this language and processor in the domains of
         fund finding, cultural information, and fisheries.
       • NeOn Toolkit plugin for common mappings




Barrasa J, Corcho O, Gómez-Pérez A. (2004)
R2O, an extensible and semantically based
database-to-ontology mapping language. In:
Proceedings of the Second Workshop on
Semantic Web and Databases, SWDB 2004.

R2O (Relational-to-Ontology) Language


For concepts...
 • A view maps exactly one concept in the ontology.
 • A subset of the columns in the view map a concept in the ontology.
 • A subset (selection) of the records of a database view map a concept in the ontology.
 • A subset of the records of a database view map a concept in the ontology, but the selection cannot be made using SQL.
 • One or more concepts can be extracted from a single data field (not in 1NF).

For attributes...
 • A column in a database view maps directly to an attribute or a relation.
 • A column in a database view maps to an attribute or a relation after some transformation.
 • A set of columns in a database view map to an attribute or a relation.
The W3C RDB2RDF Working Group

• Created in 2007
• W3C Recommendations in
  September 2012
   •   R2RML: RDB to RDF Mapping Language -
       http://www.w3.org/TR/r2rml/
   •   Direct Mapping -
       http://www.w3.org/TR/rdb-direct-mapping/
   •   R2RML and Direct Mapping Test Cases -
       http://www.w3.org/2001/sw/rdb2rdf/test-cases/
   •   RDB2RDF Implementation Report -
       http://www.w3.org/2001/sw/rdb2rdf/implementation-report/
R2RML example




Existing implementations

•   OEG implementations
    •   http://code.google.com/p/oeg-obdi/
    •   https://github.com/jpcik/morph
    •   https://github.com/boricles/morph




RDB2RDF Implementation Report. Boris Villazón-Terrazas, Michael Hausenblas.
http://www.w3.org/2001/sw/rdb2rdf/implementation-report/
Ongoing work

• Provide a list of common patterns in R2RML
  transformations, so that they can be reused
  (increasing productivity)
   • Sequeda J, Priyatna F, Villazón-Terrazas B. Relational
     Database to RDF Mapping Patterns. In: Proceedings of the
     3rd Workshop on Ontology Patterns (WOP2012).
   • Villazón-Terrazas B, Priyatna F. Building Ontologies by
     using Re-engineering Patterns and R2RML Mappings. In:
     Proceedings of the 3rd Workshop on Ontology Patterns
     (WOP2012).
   • http://mappingpedia.linkeddata.es/
• Improve our support at Morph for all test cases
• Adapt existing GUIs for the generation of mappings
  (such as the one in the NeOn Toolkit).
2. R2RML query
rewriting optimisations

In other words, how to optimise this query rewriting
so that the rewritten queries are not evaluated
inefficiently
R2RML is now a W3C Recommendation

• That’s very good to ensure wide uptake, but…

• Implementations still suffer from poor
  efficiency
   • UltraWrap has shown that a similar performance can be
     obtained with direct mappings on high-end databases
     (Oracle, SQL Server)
   • What happens with low-end databases (MySQL)?
Several works on SPARQL to SQL translation

• Barrasa J, Corcho O, Gómez-Pérez A. (2004) R2O, an
  extensible and semantically based database-to-ontology
  mapping language. In: Proceedings of the Second Workshop on
  Semantic Web and Databases, SWDB 2004.
• R. Cyganiak. A relational algebra for SPARQL. Digital Media
  Systems Laboratory, HP Laboratories Bristol. HPL-2005-170,
  2005.
• B. Elliott, E. Cheng, C. Thomas-Ogbuji, and Z.M. Ozsoyoglu. A
  complete translation from SPARQL into efficient SQL. In Proceedings
  of the 2009 International Database Engineering & Applications
  Symposium, pages 31-42. ACM, 2009.
• A. Chebotko, S. Lu, and F. Fotouhi. Semantics preserving
  SPARQL-to-SQL translation. Data & Knowledge Engineering,
  68(10):973-1000, 2009.
Chebotko’s query rewriting




Our proposal




An example. BSBM08

NATIVE
SELECT r.title, r.text, r.reviewDate, p.personID, p.name, r.rating1, r.rating2, r.rating3, r.rating4
FROM review r, person p
WHERE r.productID=55547 AND r.personID=p.personID AND r.language='en'
ORDER BY r.reviewDate desc

CHEBOTKO
SELECT var_rating2 AS rating2, var_reviewerName AS reviewerName, var_title AS title, var_rating1
AS rating1, var_reviewDate AS reviewDate, var_reviewer AS reviewer, var_rating3 AS rating3,
var_rating4 AS rating4, var_text AS text
FROM (SELECT *
FROM (SELECT uri_rating41477446315 AS uri_rating41477446315, var_rating2 AS var_rating2,
var_reviewer AS var_reviewer, uri_reviewDate750573656 AS uri_reviewDate750573656, var_rating4
AS var_rating4, var_rating1 AS var_rating1, var_text AS var_text, uri_title1963229325 AS
uri_title1963229325, var_rating3 AS var_rating3, uri_reviewer2088452952 AS
uri_reviewer2088452952, uri_rating21477446253 AS uri_rating21477446253, uri_text1457367120 AS
uri_text1457367120, uri_rating31477446284 AS uri_rating31477446284, uri_rating11477446222 AS
uri_rating11477446222, uri_reviewFor1499735727 AS uri_reviewFor1499735727, var_reviewDate AS
var_reviewDate, var_title AS var_title, uri_language269987354 AS uri_language269987354,
uri_Product555472014519903 AS uri_Product555472014519903, v_7634.var_review AS var_review,
var_reviewerName AS var_reviewerName, uri_name1396749066 AS uri_name1396749066, var_lang
AS var_lang
FROM (SELECT uri_reviewer2088452952 AS uri_reviewer2088452952, v_6537.var_review AS
var_review, uri_rating11477446222 AS uri_rating11477446222, uri_rating31477446284 AS
uri_rating31477446284, uri_Product555472014519903 AS uri_Product555472014519903,
uri_reviewFor1499735727 AS uri_reviewFor1499735727, var_rating2 AS var_rating2,
An example. BSBM08
OUR APPROACH
SELECT var_rating2 AS rating2, var_reviewDate AS reviewDate, var_rating4 AS rating4, var_rating1
AS rating1, var_reviewer AS reviewer, var_rating3 AS rating3, var_reviewerName AS reviewerName,
var_text AS text, var_title AS title
FROM (SELECT *
FROM (SELECT v_2660.var_reviewer AS var_reviewer, var_reviewDate AS var_reviewDate,
var_review AS var_review, uri_rating31477446284 AS uri_rating31477446284, uri_rating21477446253
AS uri_rating21477446253, uri_title1963229325 AS uri_title1963229325, var_rating3 AS var_rating3,
uri_reviewDate750573656 AS uri_reviewDate750573656, uri_reviewFor1499735727 AS
uri_reviewFor1499735727, uri_language269987354 AS uri_language269987354,
uri_name1396749066 AS uri_name1396749066, var_rating1 AS var_rating1, var_reviewerName AS
var_reviewerName, var_lang AS var_lang, uri_Product555472014519903 AS
uri_Product555472014519903, var_rating2 AS var_rating2, uri_rating41477446315 AS
uri_rating41477446315, var_title AS var_title, var_rating4 AS var_rating4, var_text AS var_text,
uri_rating11477446222 AS uri_rating11477446222, uri_text1457367120 AS uri_text1457367120,
uri_reviewer2088452952 AS uri_reviewer2088452952
FROM (SELECT v_8722.PERSONID AS var_reviewer, 'http://xmlns.com/foaf/0.1/name' AS
uri_name1396749066, v_8722.NAME AS var_reviewerName
FROM PERSON v_8722
WHERE (v_8722.NAME IS NOT NULL) ) v_2660
INNER JOIN (SELECT v_3353.REVIEWDATE AS var_reviewDate,
'http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/vocabulary/rating1' AS uri_rating11477446222,
v_3353.REVIEWID AS var_review, v_3353.TEXT AS var_text,
'http://purl.org/stuff/rev#reviewer' AS uri_reviewer2088452952,
v_3353.RATING1 AS var_rating1,
'http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/vocabulary/rating2' AS uri_rating21477446253,
v_3353.TITLE AS var_title,
'http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/vocabulary/language' AS uri_language269987354,
'http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/vocabulary/reviewDate' AS uri_reviewDate750573656,
'http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/vocabulary/rating3' AS uri_rating31477446284,
'http://www4.wiwiss.fu-
Analysis with BSBM




[Charts: BSBM results on SQL Server and on MySQL]
Ongoing work

• Writing the paper describing our optimisations

• Proposing a comprehensive benchmarking platform
  to test R2RML-compliant query rewriting systems
   • Extending our current work on the R2RML implementation
     test cases
3. Ontology-based
sensor query rewriting

In other words, what happens if our data sources are
not static, but data streams? Can we still use similar
techniques?
An example: SmartCities




[Figure: environmental sensors and parking sensors (SmartSantander Project)]
Data from the Web
Flood risk alert: South East England
Emergency planner: "I have to make sense out of all this data"
[Figure: wave data, forecasts, environmental defenses]

 • Heterogeneity
 • Continuous querying
 • Streaming data
Ingredients for Linked Sensor Data

Core ontological model

Additional domain ontologies

Guidelines for generation of identifiers

Sensor Web programming interfaces

Query processing engines



                                        http://www.flickr.com/photos/santos/2252824606/
Overview of the SSN ontology
[Figure: overview of the SSN ontology, grouped into modules]
Module and class labels: Deployment, DeploymentRelatedProcess, System, OperatingRestriction, SurvivalRange, OperatingRange, Process, Input, Output, Device, PlatformSite, Platform, Data, Skeleton, Sensor, Sensing, SensingDevice, SensorInput, SensorOutput, ObservationValue, Property, Observation, FeatureOfInterest, MeasuringCapability, MeasurementCapability, ConstraintBlock, Condition
Property restrictions: deploymentProcessPart only; hasSubsystem only, some; hasSurvivalRange only; hasOperatingRange only; hasDeployment only; deployedSystem only; deployedOnPlatform only; inDeployment only; onPlatform only; attachedSystem only; hasInput only; hasOutput only, some; isProducedBy some; implements some; hasValue some; sensingMethodUsed only; detects only; observes only; isProxyFor only; includesEvent some; isPropertyOf some; hasProperty only, some; observedProperty only; observationResult only; observedBy only; featureOfInterest only; hasMeasurementCapability only; forProperty only; inCondition only

             Compton M, Barnaghi P, Bermúdez L, García-Castro R, Corcho O, Cox S, Graybeal J, Hauswirth M, Henson C, Herzog A,
             Huang V, Janowicz K, Kelsey WD, Le Phuoc D, Lefort L, Leggieri M, Neuhaus H, Nikolov A, Page K, Passant A, Sheth A,
             Taylor K. The SSN Ontology of the W3C Semantic Sensor Network Incubator Group. Journal of Web Semantics. In press
SSN Ontology with other Ontologies




García-Castro R, Corcho O, Hill C. A Core Ontology Model for Semantic Sensor Web Infrastructures.
International Journal of Semantic Web and Information Systems 8(1):22-42
Queries to Sensor Data
SNEEql (Data Stream Mgmt System)
   RSTREAM SELECT id, speed, direction FROM wind [NOW];

Esper QL (Complex Event Processors)
   SELECT wind_speed FROM wind_sensor.win:time(10 min)

GSN RESTful service (Sensor Data Middleware)
   http://montblanc.slf.ch:22001/multidata?vs[0]=wind_sensor&field[0]=wind_speed&from=15/09/2011+05:00:00&to=15/09/2011+15:00:00

Pachube RESTful service (Sensor Data Middleware)
   http://api.pachube.com/v2/feeds/14321/datastreams/4?start=2011-09-02T14:01:46Z&end=2011-09-02T17:01:46Z

Querying through ontologies?
SPARQL-Stream
SELECT ?windspeed ?tidespeed
FROM NAMED STREAM <http://swiss-experiment.ch/data#WannengratSensors.srdf>
[NOW-10 MINUTES TO NOW-0 MINUTES]
WHERE {
 ?WaveObs a ssn:Observation;
            ssn:observationResult ?windspeed;
            ssn:observedProperty sweetSpeed:WindSpeed.
 ?TideObs a ssn:Observation;
         ssn:observationResult ?tidespeed;
         ssn:observedProperty sweetSpeed:TideSpeed.
FILTER (?tidespeed<?windspeed)}


  Query processing closer to data

Use ontologies as conceptual model

  Query virtual stream graphs
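
The time window in the query above, `[NOW-10 MINUTES TO NOW-0 MINUTES]`, restricts the stream to recent observations. A minimal sketch of that selection over in-memory tuples follows; the function name, the `timed` field and the inclusive bounds are assumptions made for illustration, and a real engine evaluates windows continuously rather than over a list.

```python
from datetime import datetime, timedelta


def select_window(observations, now, from_minutes_ago, to_minutes_ago=0):
    """Keep observations whose timestamp lies in [now - from, now - to]."""
    start = now - timedelta(minutes=from_minutes_ago)
    end = now - timedelta(minutes=to_minutes_ago)
    # Bounds are treated as inclusive here (an assumption of this sketch).
    return [o for o in observations if start <= o["timed"] <= end]


if __name__ == "__main__":
    now = datetime(2012, 10, 15, 12, 0)
    # Hypothetical readings, for illustration only.
    readings = [
        {"timed": datetime(2012, 10, 15, 11, 45), "windspeed": 3.1},
        {"timed": datetime(2012, 10, 15, 11, 55), "windspeed": 4.2},
        {"timed": datetime(2012, 10, 15, 12, 0), "windspeed": 4.0},
    ]
    for reading in select_window(readings, now, from_minutes_ago=10):
        print(reading)
```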


SPARQL-Stream
SELECT ?name ( AVG(?temperature) AS ?avgTemperature )
       ( AVG(?humidity) AS ?avgHumidity )
FROM NAMED STREAM <http://www.cwi.nl/SRBench/observations> [NOW - 1 HOURS SLIDE 1 HOURS]
FROM <http://www.cwi.nl/SRBench/sensors>
FROM <http://www.cwi.nl/SRBench/geonames>
WHERE {
  ?sensor om-owl:generatedObservation ?temperatureObservation ;
          om-owl:generatedObservation ?humidityObservation ;
          om-owl:hasLocatedNearRel [ om-owl:hasLocation ?nearbyLocation ] .
  ?temperatureObservation om-owl:observedProperty weather:_AirTemperature ;
                          om-owl:result [ om-owl:floatValue ?temperature ] .
  ?humidityObservation om-owl:observedProperty weather:_RelativeHumidity ;
                       om-owl:result [ om-owl:floatValue ?humidity ] .
  { SELECT ?name
    WHERE {
      ?nearbyLocation gn:featureClass ?featureClass ;
                      gn:name | gn:officialName ?name ;
                      gn:population ?population .
      FILTER ( ?population > 15000 && REGEX(?featureClass, "P" , "i") )
    }
  }
  UNION
  { SELECT ?name
    WHERE {
      ?nearbyLocation gn:parentFeature+ ?parentFeature .
      ?parentFeature gn:featureClass ?parentClass ;
                     gn:name | gn:officialName ?name ;
                     gn:population ?parentPopulation .
      FILTER ( ?parentPopulation > 15000 && REGEX(?parentClass, "P" , "i") )
    }
  }
} GROUP BY ?name

Features shown: aggregates, static & streaming data, windows, filters, functions
Disclaimer: some features NYI
Querying the Observations
SPARQLStream query (from the client):
   SELECT ?waveheight
   FROM STREAM <www.ssg4env.eu/SensorReadings.srdf>
   [NOW -10 MINUTES TO NOW STEP 1 MINUTE]
   WHERE {
    ?WaveObs a sea:WaveHeightObservation;
             sea:hasValue ?waveheight; }

R2RML mappings (used by the query rewriting):
   :Wan4WindSpeed a rr:TriplesMapClass;
     rr:tableName "wan7";
     rr:subjectMap [ rr:template "http://swissex.ch/ns#WindSpeed/Wan7/{timed}";
                     rr:class ssn:ObservationValue;
                     rr:graph ssg:swissexsnow.srdf ];
     rr:predicateObjectMap [ rr:predicateMap [ rr:predicate ssn:hasQuantityValue ];
                             rr:objectMap[ rr:column "sp_wind" ] ];

GSN API request:
   http://montblanc.slf.ch:22001/multidata?vs[0]=wan7&field[0]=sp_wind

[Figure: Client -> SPARQLStream -> Query Rewriting (R2RML Mappings) -> GSN API -> Query Processing (query processing engines) -> Sensor Network; results come back as [tuples] -> Data translation -> [triples]]
Rewriting to different technologies
SPARQLStream query:
  SELECT ?windspeed
  FROM NAMED STREAM <http://swiss-experiment.ch/data#WannengratSensors.srdf>
  [NOW-10 MINUTE TO NOW-0 MINUTE]
  WHERE {
  ?WaveObs a ssn:Observation;
  ssn:observationResult ?windspeed;
  ssn:observedProperty sweetSpeed:WindSpeed.
  }

Query Rewriting -> Algebra representation, which is then translated to:

Esper (CEP):
  SELECT wind_speed_scalar_av, timed FROM wan7.win:time(10 min)

SNEE (DSMS):
  SELECT wan7.wind_speed_scalar_av AS windspeed, wan7.timed AS windts FROM wan7[FROM NOW-10 MINUTES TO NOW]

GSN (Middleware):
  http://montblanc.slf.ch:22001/multidata?vs[0]=wan7&field[0]=wind_speed_scalar_av&from=15/05/2011+05:00:00&to=15/05/2011+15:00:00

Pachube (Middleware):
  http://api.pachube.com/v2/feeds/14321/datastreams/4?start=2011-09-02T14:01:46Z&end=2011-09-02T17:01:46Z

Calbimonte JP, Corcho O, Jeung H, Aberer K. Enabling Query Technologies for the Semantic Sensor Web.
International Journal of Semantic Web and Information Systems 8(1):43-63
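
The step from one intermediate representation to several target languages can be sketched as follows. The `WindowedProjection` class is a hypothetical stand-in for the algebra representation, which the slide does not detail; the two generators only reproduce the Esper and SNEE strings shown on this slide and are not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WindowedProjection:
    """Toy intermediate form: project one field of a stream over a time window."""
    stream: str
    field: str
    alias: str
    time_alias: str
    window_minutes: int


def to_esper(q):
    """Generate an Esper-style statement with a time window."""
    return (
        f"SELECT {q.field}, timed "
        f"FROM {q.stream}.win:time({q.window_minutes} min)"
    )


def to_snee(q):
    """Generate a SNEEql-style statement with a FROM ... TO window."""
    return (
        f"SELECT {q.stream}.{q.field} AS {q.alias}, "
        f"{q.stream}.timed AS {q.time_alias} "
        f"FROM {q.stream}[FROM NOW-{q.window_minutes} MINUTES TO NOW]"
    )


if __name__ == "__main__":
    query = WindowedProjection("wan7", "wind_speed_scalar_av", "windspeed", "windts", 10)
    print(to_esper(query))
    print(to_snee(query))
```

Keeping the rewriting output independent of the target is what lets the same SPARQLStream query run on a CEP engine, a DSMS or a REST middleware.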

Ongoing work

• Benchmarking of ontology-based streaming data
  engines
   • Zhang Y, Pham MD, Corcho O, Calbimonte JP. SRBench: A
     Streaming RDF/SPARQL Benchmark. Proceedings of the
     11th International Semantic Web Conference (ISWC2012)
• Improve optimisations when joining static and
  streaming data
• Automatic characterisation of sensor data streams
   • Useful in citizen science approaches (e.g., AirQualityEgg)
   • Calbimonte JP, Yan Z, Jeung H, Corcho O, Aberer K.
     Deriving Semantic Sensor Metadata from Raw
     Measurements. ISWC2012 5th International Workshop on
     Semantic Sensor Networks (SSN2012). CEUR
     Workshop Proceedings, Vol-904, http://ceur-ws.org/Vol-904/
4. Federated query
        processing

In other words, how can we access data from federated
data sources?
Example

• We query the life science domain
   1. Using the Pubmed references obtained from the GeneID
      gene dataset, retrieve information about genes and their
      references in the Pubmed dataset.
   2. From Pubmed we access the information in the National
      Library of Medicine's controlled vocabulary thesaurus,
      stored at the MeSH endpoint, so we have more complete
      information about such genes.
   3. Finally, we also access the HHPID endpoint, which is the
      knowledge base for the HIV-1 protein.




Introduction



• Question:
   • How can we access such an amount of RDF data in an
     integrated manner?


• Current approaches
   • Replicate data in local stores, access it using existing RDF
     databases.
   • Execute individual queries and manually join data.
   • Use existing distributed query systems (starting to appear).




Problem

• Existing tools for distributed SPARQL query
  processing differ in the way they handle distribution
   • The SPARQL Working Group published the Federated Query
     document as a Last Call Working Draft
      • It homogenises the access to distributed RDF data
        repositories
      • SERVICE <http://dbpedia.org/sparql> {...}


• Problems in semantics: SERVICE ?X is not well defined

• Current access to SPARQL endpoints is not optimal
   • Work on SPARQL distributed query optimization is beginning
State of the Art
• ANAPSID, RDF::Query, OpenAnzo, ARQ, Rasqal
  RDF Query Library
• ANAPSID provides SPARQL optimization based on
  adaptive query processing operators
• RDF::Query provides basic pattern reordering
• Implement the federation using query predicates
   • List of SPARQL endpoints needed
   • Helps user to direct queries to remote datasets
• FedX, SPLENDID, SemWIQ, NetworkedGraphs
• All provide basic optimisations: pattern
  grouping (FedX), cost-based
  optimizations (SemWIQ, SPLENDID and
  recently FedX, NetworkedGraphs)
• SPARQL 1.1 is mostly syntactic sugar
Assumptions & Restrictions
• Assumptions
   1. Users know how to create a
      query to the endpoints
   2. No statistics of any kind are
      available for the query
      processing system.
   3. Data are distributed


• Restrictions
   1. We only consider the
      Federation Extension of
      SPARQL 1.1
   2. We are not aware of the
      capabilities or implementation
      of the remote SPARQL server
   3. No registry of endpoints
SERVICE Semantics
Example:
   SELECT ?name ?email
   WHERE {
     ?y :name ?name .
     ?y :email ?email
   }

   SELECT ?name ?email
   WHERE {
     SERVICE <http://example1.org/sparql> {?y :name ?name} .
     SERVICE <http://example2.org/sparql> {?y :email ?email}
   }
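
One way to read the two-endpoint query above: each SERVICE pattern is evaluated at its endpoint and returns a set of solution mappings, and the two sets are then joined on their shared variable (?y). The sketch below shows only that join of compatible mappings, as in the standard SPARQL algebra. The endpoint results are hard-coded, hypothetical data and no network call is made.

```python
def compatible(mu1, mu2):
    """Two solution mappings are compatible if they agree on shared variables."""
    return all(mu2[var] == value for var, value in mu1.items() if var in mu2)


def join(omega1, omega2):
    """Join two sets of solution mappings by merging every compatible pair."""
    return [
        {**mu1, **mu2}
        for mu1 in omega1
        for mu2 in omega2
        if compatible(mu1, mu2)
    ]


if __name__ == "__main__":
    # Hypothetical results of {?y :name ?name} at the first endpoint.
    names = [{"y": "ex:a", "name": "Ann"}, {"y": "ex:b", "name": "Bob"}]
    # Hypothetical results of {?y :email ?email} at the second endpoint.
    emails = [{"y": "ex:a", "email": "ann@example.org"}]
    for solution in join(names, emails):
        print(solution["name"], solution["email"])
```

When the endpoint is a variable, as in SERVICE ?X, there is no fixed place to evaluate the inner pattern, which is where the semantics needs extra care.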
• We extend [PAG09] with the semantics for SERVICE:




SERVICE Semantics

Example:
SELECT ?name
WHERE {
  SERVICE ?X {?y :name ?name}
}




SPARQL Optimisation - OPTIONAL

• We assume that we have no statistics of endpoints
   • This means that we cannot use cost-based optimisations
   • We will only focus on static optimisations


• Besides the usual static optimisations (e.g. pushing
  down filters), SPARQL queries can be optimised if
  they contain OPTIONAL operators
   • The OPTIONAL operator is responsible for PSPACE-
     completeness in SPARQL [PAG09]


• OPTIONAL is a key operator in SPARQL




Well-designed patterns

• Well-designed SPARQL patterns [PAG09]
  • Class of SPARQL patterns which adds a restriction




Well-designed Patterns


• We extended the notion of well-designed patterns for
  the SPARQL 1.1 Federation Extension
   • The previous rules also hold for SERVICE




Implementation: SPARQL-DQP

• SPARQL-DQP is implemented on top of OGSA-DAI and OGSA-
  DQP
   • OGSA-DAI is a Web service-based framework for accessing
     distributed data resources
   • OGSA-DQP adds distributed query processing infrastructure
• We reuse some OGSA-DQP operators
• We added RDF and SPARQL endpoint data access
   • RDB2RDF data resource
   • RDF data resource
   • SPARQL endpoint resources
• Good behaviour for large
  datasets

  Buil C, Arenas M, Corcho O. Semantics and
  optimization of the SPARQL 1.1 federation
  extension. Proceedings of the 8th Extended
  Semantic Web Conference (ESWC2011).
  Springer-Verlag LNCS 6644, pages 1-15
                                         47
Ongoing Work

• An extensive benchmark has been produced
   • Montoya G, Vidal ME, Corcho O, Ruckhaus E, Buil-Aranda
     C. Benchmarking Federated SPARQL Query Engines: Are
     Existing Testbeds Enough? In: Proceedings of the 11th
     International Semantic Web Conference (ISWC2012)


• Focusing now on Adaptive Query Processing
   • Query Processing should be adapted to the user's specific
     needs and specific network requirements




                                                                 48
5. Entailment in query rewriting

In other words, how can we take ontologies into account
in the query rewriting process, so as to provide simple
entailment?
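
As a rough illustration of the idea, the sketch below rewrites a conjunctive query into a union of conjunctive queries (UCQ) using only subclass axioms. The class hierarchy in `SUBCLASS_OF` and the atom encoding are hypothetical, and the systems discussed in this part of the talk handle far more expressive ontology languages.

```python
from itertools import product

# Hypothetical ontology: subclass -> direct superclass.
SUBCLASS_OF = {"Professor": "Teacher", "Lecturer": "Teacher", "Teacher": "Person"}

def subsumed_classes(cls):
    """cls followed by all its direct and indirect subclasses."""
    result = [cls]
    for c in result:                      # the list grows while we iterate
        for sub, sup in SUBCLASS_OF.items():
            if sup == c and sub not in result:
                result.append(sub)
    return result

def rewrite(query):
    """query: list of atoms (predicate, args); unary atoms are class atoms.
    Returns a UCQ as a list of conjunctive queries."""
    alternatives = []
    for pred, args in query:
        if len(args) == 1:
            alternatives.append([(c, args) for c in subsumed_classes(pred)])
        else:
            alternatives.append([(pred, args)])
    return [list(cq) for cq in product(*alternatives)]

# Teacher(?x) AND teaches(?x, ?c)
ucq = rewrite([("Teacher", ("?x",)), ("teaches", ("?x", "?c"))])
```

Each conjunctive query in `ucq` can then be evaluated without any reasoning, and the union of the answers accounts for the subclass axioms.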



                            49
Main approaches in the state of the art

Expressiveness                   Author                System            Output

ELHIO¬                           Pérez-Urbina et al.   REQUIEM           [R] Datalog, UCQ
Sticky-join [linear] datalog±    Gottlob et al.        Nyaya             UCQ
DL-LiteR, DL-LiteF               Calvanese et al.      QuOnto            UCQ
DL-LiteR                         Chortaras et al.      Rapid             UCQ
DL-LiteR [+EBox]                 Rosati et al.         Presto & Prexto   NR-Datalog & UCQ


                                          50
Optimizations in the rewriting


• The rewriting can be optimized in
  several ways
  •   Ontology preprocessing
  •   Subsumption checks
  •   Prioritize inferences
  •   Constrain the searches
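
To make one of these concrete, a subsumption check can be sketched as classical conjunctive-query containment: a query is redundant in a UCQ if another query maps into it by a homomorphism that keeps the answer variables fixed. This is a generic textbook technique, shown under the assumption that it is the kind of check meant here; the atom encoding is hypothetical.

```python
def homomorphism(src, dst, answer_vars):
    """True if src's variables can be mapped so that every src atom occurs
    in dst, with answer variables mapped to themselves."""
    def compatible(h, a, d):
        if not a.startswith("?"):         # constants must match exactly
            return a == d
        return h.setdefault(a, d) == d

    def extend(i, h):
        if i == len(src):
            return True
        pred, args = src[i]
        for dpred, dargs in dst:
            if dpred != pred or len(dargs) != len(args):
                continue
            h2 = dict(h)                  # work on a copy for backtracking
            if all(compatible(h2, a, d) for a, d in zip(args, dargs)):
                if extend(i + 1, h2):
                    return True
        return False

    return extend(0, {v: v for v in answer_vars})

def prune(ucq, answer_vars):
    """Drop every conjunctive query that is subsumed by another one."""
    kept = []
    for cq in ucq:
        if any(homomorphism(k, cq, answer_vars) for k in kept):
            continue                      # cq is contained in a kept query
        kept = [k for k in kept if not homomorphism(cq, k, answer_vars)]
        kept.append(cq)
    return kept

q1 = [("teaches", ("?x", "?c"))]
q2 = [("teaches", ("?x", "?c")), ("Professor", ("?x",))]
q3 = [("teaches", ("?x", "?y")), ("teaches", ("?z", "?y"))]
pruned = prune([q2, q1, q3], ["?x"])
```

Here `q2` is more specific than `q1` and `q3` is equivalent to it, so only `q1` survives.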




                       51
Our proposal




José Mora   52
Conclusion and Future Work

• We have proposed some small incremental
  improvements over the current state of the art in
  entailment-aware query rewriting
   • Need to integrate it with the rest of our work
   • This will happen during Fall 2012




                                 53
Final conclusions and future work




          54
Ingredients

[Figure: architecture diagram with layers for thin applications (mashups),
middleware (semantic data integration and querying), registries, and data
sources (legacy, sensor networks, Linked Open Data, spreadsheets).
Numbered ingredients: 1 RDB2RDF, 2 query rewriting optimisations,
3 sensor-based query rewriting, 4 federated query processing, 5 reasoning.]
                                   55

Data Integration at the Ontology Engineering Group

  • 1. Data integration at our group: ingredients and some prospects Credible workshop Sophia-Antipolis, October 15th 2012 Oscar Corcho ocorcho@fi.upm.es Facultad de Informática, Universidad Politécnica de Madrid Campus de Montegancedo s/n. 28660 Boadilla del Monte, Madrid, Spain With contributions from: José Mora (OEG-UPM), Boris Villazón-Terrazas (OEG-UPM, now at iSOCO), Jean Paul Calbimonte (OEG-UPM), Freddy Priyatna (OEG-UPM), Carlos Buil-Aranda (OEG-UPM, now at PUC Chile)
  • 2. Our data integration needs, problems (and challenges) And data may be available from data streams (e.g., sensors) Need to submit SPARQL queries into distributed SPARQL endpoints Need to access heterogeneous relational data sources (mainly in the area of Geography) • Some of the databases are available in different DBMSs • And some of the data sources are available as spreadsheets • Furthermore, many of these datasets are already published as Linked Data 2
  • 3. Ingredients 100 80 thin applications (mas hups ) middleware 60 5 Reasoning Este 40 s emantic data integration and querying Oeste 20 1 RDB2RDF Norte 0 1er 3er 2 legacy Sensor-based 3 query rewriting Optimisations data s ources trim. regis tries s ens or networks trim. Federated Query 4 Processing Linked Open Data Spreadsheets From SemsorGrid4Env architecture (http://www.semsorgrid4env.eu/) 3
  • 4. Disclaimer When I talk about ontology-based querying, I will be normally talking about SPARQL querying 4
  • 5. 1. RDB2RDF In other words, how to make relational data available as RDF (and connected to ontologies) 5
  • 6. RDB2RDF. Motivation • A majority of dynamic Web content is backed by relational databases (RDB), and so are many enterprise systems. transformation transformation engine description 6
  • 7. RDB2RDF. Query rewriting for OBDA with mappings Q Rewriting Mappings Q’ There may be some mappings to translate between ontology and DB. The rewriting should consider those mappings. 7
  • 8. RDB2RDF. Existing approaches 1 2 1. To build a new ontology from a database schema and content (direct mappings) 2. To map the ontology created in approach (1) to a legacy ontology 3. To map an existing DB to a legacy ontology 3 new ontology existing ontology
  • 9. OEG’s background knowledge in RDB2RDF • R2O and ODEMapster • GaV wrapper generation (no mediators) • Syntactic sugar for the generation of SQL queries. • Simple use of this language and processor in the domains of fund finding, cultural information, and fisheries. • NeOn Toolkit plugin for common mappings Barrasa J, Corcho O, Gómez-Pérez A. (2004) R2O, an extensible and semantically based database-to-ontology mapping language. In: Proceedings of the Second Workshop on Semantic Web and Databases, SWDB 2004. 9
  • 10. R2O (Relational-to-Ontology) Language For concepts... One or more concepts can be extracted from a A view maps exactly single data field (not one concept in the in 1NF). ontology. For attributes... A subset of the A column in a columns in the view database view maps map a concept in the directly an attribute ontology. or a relation. A subset (selection) of the records of a A column in a database view map a database view maps concept in the an attribute or a ontology. relation after some transformation. A subset of the records of a database view map a concept in the onto. but the A set of columns in a selection cannot be database view map made using SQL. an attribute or a relation.
  • 11. The W3C RDB2RDF Working Group • Created in 2007 • W3C Recommendations in September 2012 • R2RML: RDB to RDF Mapping Language - http://www.w3.org/TR/r2rml/ • Direct Mapping - http://www.w3.org/TR/rdb- direct-mapping/ • R2RML and Direct Mapping Test Cases - http://www.w3.org/2001/sw/rdb 2rdf/test-cases/ • RDB2RDF Implementation Report - http://www.w3.org/2001/sw/rdb 2rdf/implementation-report/ 11
  • 13. Existing implementations • OEG implementations • http://code.google.com/p/oeg-obdi/ • https://github.com/jpcik/morph • https://github.com/boricles/morph RDB2RDF Implementation Report. Boris Villazón-Terrazas, Michael Hausenblas. http://www.w3.org/2001/sw/rdb2rdf/implementation-report/ 13
  • 14. Ongoing work • Provide a list of common patterns in R2RML transformations, so that they can be reused (increasing productivity) • Sequeda J, Priyatna F, Villazón-Terrazas B. Relational Database to RDF Mapping Patterns. In: Proceedings of the 3rd Workshop on Ontology Patterns (WOP2012). • Villazón-Terrazas B, Priyatna F. Building Ontologies by using Re-engineering Patterns and R2RML Mappings. In: Proceedings of the 3rd Workshop on Ontology Patterns (WOP2012).Priyatna • http://mappingpedia.linkeddata.es/ • Improve our support at Morph for all test cases • Adapt existing GUIs for the generation of mappings (such as NeOn Toolkit’s one). 14
  • 15. 2. R2RML query rewriting optimisations In other words, how to make this query rewriting optimised, so that we don’t suffer from a bad efficiency in our results 15
  • 16. R2RML is now a W3C Recommendation • That’s very good to ensure wide uptake, but… • Implementations still suffer from their lack of efficiency • UltraWrap has shown that a similar performance can be obtained with direct mappings on high-end databases (Oracle, SQL Server) • What happens with low-end databases (mySQL)? 16
  • 17. Several works on SPARQL to SQL translation • Barrasa J, Corcho O, Gómez-Pérez A. (2004) R2O, an extensible and semantically based database-to-ontology mapping language. In: Proceedings of the Second Workshop on Semantic Web and Databases, SWDB 2004. • R. Cyganiak. A relational algebra for sparql. Digital Media Systems Laboratory. HP Laboratories Bristol. HPL-2005-170, 2005. • B. Elliott, E. Cheng, C. Thomas-Ogbuji, and Z.M. Ozsoyoglu. A complete translation from sparql into ecient sql. In Proceedings of the 2009 International Database Engineering & Applications Symposium, pages 31-42. ACM, 2009. • A. Chebotko, S. Lu, and F. Fotouhi. Semantics preserving sparql-to-sql translation. Data & Knowledge Engineering, 68(10):973-1000, 2009. 17
  • 20. An example. BSBM08 NATIVE SELECT r.title, r.text, r.reviewDate, p.personID, p.name, r.rating1, r.rating2, r.rating3, r.rating4 FROM review r, person p WHERE r.productID=55547 AND r.personID=p.personID AND r.language='en' ORDER BY r.reviewDate desc CHEBOTKO SELECT var_rating2 AS rating2, var_reviewerName AS reviewerName, var_title AS title, var_rating1 AS rating1, var_reviewDate AS reviewDate, var_reviewer AS reviewer, var_rating3 AS rating3, var_rating4 AS rating4, var_text AS text FROM (SELECT * FROM (SELECT uri_rating41477446315 AS uri_rating41477446315, var_rating2 AS var_rating2, var_reviewer AS var_reviewer, uri_reviewDate750573656 AS uri_reviewDate750573656, var_rating4 AS var_rating4, var_rating1 AS var_rating1, var_text AS var_text, uri_title1963229325 AS uri_title1963229325, var_rating3 AS var_rating3, uri_reviewer2088452952 AS uri_reviewer2088452952, uri_rating21477446253 AS uri_rating21477446253, uri_text1457367120 AS uri_text1457367120, uri_rating31477446284 AS uri_rating31477446284, uri_rating11477446222 AS uri_rating11477446222, uri_reviewFor1499735727 AS uri_reviewFor1499735727, var_reviewDate AS var_reviewDate, var_title AS var_title, uri_language269987354 AS uri_language269987354, uri_Product555472014519903 AS uri_Product555472014519903, v_7634.var_review AS var_review, var_reviewerName AS var_reviewerName, uri_name1396749066 AS uri_name1396749066, var_lang AS var_lang FROM (SELECT uri_reviewer2088452952 AS uri_reviewer2088452952, v_6537.var_review AS var_review, uri_rating11477446222 AS uri_rating11477446222, uri_rating31477446284 AS uri_rating31477446284, uri_Product555472014519903 AS uri_Product555472014519903, uri_reviewFor1499735727 AS uri_reviewFor1499735727, var_rating2 AS var_rating2, 20
  • 21. An example. BSBM08
      OUR APPROACH (query truncated on the slide):
        SELECT var_rating2 AS rating2, var_reviewDate AS reviewDate, var_rating4 AS rating4, var_rating1 AS rating1, var_reviewer AS reviewer, var_rating3 AS rating3, var_reviewerName AS reviewerName, var_text AS text, var_title AS title FROM (SELECT * FROM (SELECT v_2660.var_reviewer AS var_reviewer, var_reviewDate AS var_reviewDate, var_review AS var_review, uri_rating31477446284 AS uri_rating31477446284, uri_rating21477446253 AS uri_rating21477446253, uri_title1963229325 AS uri_title1963229325, var_rating3 AS var_rating3, uri_reviewDate750573656 AS uri_reviewDate750573656, uri_reviewFor1499735727 AS uri_reviewFor1499735727, uri_language269987354 AS uri_language269987354, uri_name1396749066 AS uri_name1396749066, var_rating1 AS var_rating1, var_reviewerName AS var_reviewerName, var_lang AS var_lang, uri_Product555472014519903 AS uri_Product555472014519903, var_rating2 AS var_rating2, uri_rating41477446315 AS uri_rating41477446315, var_title AS var_title, var_rating4 AS var_rating4, var_text AS var_text, uri_rating11477446222 AS uri_rating11477446222, uri_text1457367120 AS uri_text1457367120, uri_reviewer2088452952 AS uri_reviewer2088452952 FROM (SELECT v_8722.PERSONID AS var_reviewer, 'http://xmlns.com/foaf/0.1/name' AS uri_name1396749066, v_8722.NAME AS var_reviewerName FROM PERSON v_8722 WHERE (v_8722.NAME IS NOT NULL) ) v_2660 INNER JOIN (SELECT v_3353.REVIEWDATE AS var_reviewDate, 'http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/vocabulary/rating1' AS uri_rating11477446222, v_3353.REVIEWID AS var_review, v_3353.TEXT AS var_text, 'http://purl.org/stuff/rev#reviewer' AS uri_reviewer2088452952, v_3353.RATING1 AS var_rating1, 'http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/vocabulary/rating2' AS uri_rating21477446253, v_3353.TITLE AS var_title, 'http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/vocabulary/language' AS uri_language269987354, 'http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/vocabulary/reviewDate' AS uri_reviewDate750573656, 'http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/vocabulary/rating3' AS uri_rating31477446284, 'http://www4.wiwiss.fu-...
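The contrast between the native query and a translated one can be tried out on a toy database. The following is a minimal sketch, not the translation algorithm from the slides: the table contents, the reduced column set and the hand-written "rewritten" query are all assumptions made for illustration, and SQLite stands in for the DBMSs used in the talk. It only shows that a flat join and a nested-subquery formulation in the style of the generated SQL can return the same rows.

```python
import sqlite3

# Toy data (hypothetical); only a few of the BSBM columns are kept.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (personID INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE review (reviewID INTEGER PRIMARY KEY, productID INTEGER,
                     personID INTEGER, title TEXT, language TEXT, reviewDate TEXT);
INSERT INTO person VALUES (1, 'Ana'), (2, 'Luis');
INSERT INTO review VALUES
  (10, 55547, 1, 'Great', 'en', '2008-01-02'),
  (11, 55547, 2, 'Fine',  'en', '2008-01-03'),
  (12, 55547, 1, 'Bueno', 'es', '2008-01-04'),
  (13, 1,     2, 'Other', 'en', '2008-01-05');
""")

# Flat join, in the style of the NATIVE query.
NATIVE = """
SELECT r.title, p.name
FROM review r, person p
WHERE r.productID = 55547 AND r.personID = p.personID AND r.language = 'en'
ORDER BY r.reviewDate DESC
"""

# Hand-written nested form, in the style of the generated queries:
# one subquery per mapped table, joined on the shared variable.
REWRITTEN = """
SELECT var_title AS title, var_reviewerName AS reviewerName
FROM (SELECT v1.personID AS var_reviewer, v1.name AS var_reviewerName
      FROM person v1 WHERE v1.name IS NOT NULL) p
INNER JOIN
     (SELECT v2.personID AS var_reviewer2, v2.title AS var_title,
             v2.reviewDate AS var_reviewDate
      FROM review v2
      WHERE v2.productID = 55547 AND v2.language = 'en') r
ON p.var_reviewer = r.var_reviewer2
ORDER BY var_reviewDate DESC
"""

native_rows = conn.execute(NATIVE).fetchall()
rewritten_rows = conn.execute(REWRITTEN).fetchall()
```

Both queries return the same two English reviews of product 55547; the difference that matters for the benchmark is how well each DBMS optimises the nested form.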
  • 22. Analysis with BSBM [charts: results on SQL Server and MySQL]
  • 23. Ongoing work
      • Writing the paper describing our optimisations
      • Proposing a comprehensive benchmarking platform to test R2RML-compliant query rewriting systems
      • Extending our current work on the R2RML implementation test cases
  • 24. 3. Ontology-based sensor query rewriting. In other words: what happens if our data sources are not static but data streams? Can we still use similar techniques?
  • 25. An example: SmartCities (SmartSantander Project) [figure: environmental sensors, parking sensors]
  • 26. Data from the Web
      • Flood risk alert: South East England
      • Emergency planner: "I have to make sense out of all this data"
      • Data pictured: wave data, environmental, forecasts, defenses
      • Heterogeneity
      • Continuous querying
      • Streaming data
  • 27. Ingredients for Linked Sensor Data
      • Core ontological model
      • Additional domain ontologies
      • Guidelines for generation of identifiers
      • Sensor Web programming interfaces
      • Query processing engines
      Image: http://www.flickr.com/photos/santos/2252824606/
  • 28. Overview of the SSN ontology [figure: the modules of the SSN ontology (Deployment, System, OperatingRestriction, Process, Device, PlatformSite, Data, Skeleton, MeasuringCapability, ConstraintBlock) with their classes and properties]
      Compton M, Barnaghi P, Bermúdez L, García-Castro R, Corcho O, Cox S, Graybeal J, Hauswirth M, Henson C, Herzog A, Huang V, Janowicz K, Kelsey WD, Le Phuoc D, Lefort L, Leggieri M, Neuhaus H, Nikolov A, Page K, Passant A, Sheth A, Taylor K. The SSN Ontology of the W3C Semantic Sensor Network Incubator Group. Journal of Web Semantics. In press
  • 29. SSN Ontology with other Ontologies García-Castro R, Corcho O, Hill C. A Core Ontology Model for Semantic Sensor Web Infrastructures. International Journal of Semantic Web and Information Systems 8(1):22-42
  • 30. Queries to Sensor Data
      • SNEEql (Data Stream Management System): RSTREAM SELECT id, speed, direction FROM wind [NOW];
      • Esper QL (Complex Event Processor): SELECT wind_speed FROM wind_sensor.win:time(10 min)
      • GSN RESTful service (Sensor Data Middleware): http://montblanc.slf.ch:22001/multidata?vs[0]=wind_sensor&field[0]=wind_speed&from=15/09/2011+05:00:00&to=15/09/2011+15:00:00
      • Pachube RESTful service (Sensor Data Middleware): http://api.pachube.com/v2/feeds/14321/datastreams/4?start=2011-09-02T14:01:46Z&end=2011-09-02T17:01:46Z
      • Querying through ontologies?
  • 31. SPARQL-Stream
      SELECT ?windspeed ?tidespeed
      FROM NAMED STREAM <http://swiss-experiment.ch/data#WannengratSensors.srdf> [NOW-10 MINUTES TO NOW-0 MINUTES]
      WHERE {
        ?WaveObs a ssn:Observation;
                 ssn:observationResult ?windspeed;
                 ssn:observedProperty sweetSpeed:WindSpeed.
        ?TideObs a ssn:Observation;
                 ssn:observationResult ?tidespeed;
                 ssn:observedProperty sweetSpeed:TideSpeed.
        FILTER (?tidespeed<?windspeed)
      }
      • Query processing closer to data
      • Use ontologies as conceptual model
      • Query virtual stream graphs
  • 32. SPARQL-Stream
      SELECT ?name ( AVG(?temperature) AS ?avgTemperature ) ( AVG(?humidity) AS ?avgHumidity )
      FROM NAMED STREAM <http://www.cwi.nl/SRBench/observations> [NOW - 1 HOURS SLIDE 1 HOURS]
      FROM <http://www.cwi.nl/SRBench/sensors>
      FROM <http://www.cwi.nl/SRBench/geonames>
      WHERE {
        ?sensor om-owl:generatedObservation ?temperatureObservation;
                om-owl:generatedObservation ?humidityObservation;
                om-owl:hasLocatedNearRel [ om-owl:hasLocation ?nearbyLocation ] .
        ?temperatureObservation om-owl:observedProperty weather:_AirTemperature ;
                om-owl:result [ om-owl:floatValue ?temperature ] .
        ?humidityObservation om-owl:observedProperty weather:_RelativeHumidity ;
                om-owl:result [ om-owl:floatValue ?humidity ] .
        { SELECT ?name WHERE {
            ?nearbyLocation gn:featureClass ?featureClass ;
                gn:name | gn:officialName ?name ;
                gn:population ?population .
            FILTER ( ?population > 15000 && REGEX(?featureClass, "P" , "i") ) } }
        UNION
        { SELECT ?name WHERE {
            ?nearbyLocation gn:parentFeature+ ?parentFeature .
            ?parentFeature gn:featureClass ?parentClass ;
                gn:name | gn:officialName ?name ;
                gn:population ?parentPopulation .
            FILTER ( ?parentPopulation > 15000 && REGEX(?parentClass, "P" , "i") ) } }
      } GROUP BY ?name
      Features highlighted on the slide: Aggregates • Static & Streaming • Windows • Filters, Functions
      Disclaimer: some features NYI
  • 33. Querying the Observations
      SPARQLStream query:
        SELECT ?waveheight
        FROM STREAM <www.ssg4env.eu/SensorReadings.srdf> [NOW -10 MINUTES TO NOW STEP 1 MINUTE]
        WHERE { ?WaveObs a sea:WaveHeightObservation; sea:hasValue ?waveheight; }
      GSN API request:
        http://montblanc.slf.ch:22001/multidata?vs[0]=wan7&field[0]=sp_wind
      R2RML mapping:
        :Wan4WindSpeed a rr:TriplesMapClass;
          rr:tableName "wan7";
          rr:subjectMap [ rr:template "http://swissex.ch/ns#WindSpeed/Wan7/{timed}";
                          rr:class ssn:ObservationValue;
                          rr:graph ssg:swissexsnow.srdf ];
          rr:predicateObjectMap [ rr:predicateMap [ rr:predicate ssn:hasQuantityValue ];
                                  rr:objectMap[ rr:column "sp_wind" ] ];
      [figure: Client • SPARQLStream • Query Rewriting • Query Processing • Data translation ([tuples] to [triples]) • R2RML Mappings • GSN API • Sensor Network • Query processing engines]
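The data translation step on this slide, from [tuples] to [triples], can be sketched as follows. This is an illustration under stated assumptions, not the actual implementation: the dictionary layout of `MAPPING` and the function `tuples_to_triples` are hypothetical, only the values (table name, subject template, class, predicate, column) are taken from the mapping on the slide, and the sample tuple is invented.

```python
from typing import Dict, Iterable, List, Tuple

Triple = Tuple[str, str, str]

# Values copied from the R2RML mapping on the slide; the dict layout is an assumption.
MAPPING: Dict[str, str] = {
    "table": "wan7",
    "subject_template": "http://swissex.ch/ns#WindSpeed/Wan7/{timed}",
    "subject_class": "ssn:ObservationValue",
    "predicate": "ssn:hasQuantityValue",
    "object_column": "sp_wind",
}


def tuples_to_triples(rows: Iterable[Dict[str, object]],
                      mapping: Dict[str, str]) -> List[Triple]:
    """Turn each stream tuple into a typing triple plus one value triple."""
    triples: List[Triple] = []
    for row in rows:
        # Fill the subject template with the column values of the tuple.
        subject = mapping["subject_template"].format(**row)
        triples.append((subject, "rdf:type", mapping["subject_class"]))
        triples.append((subject, mapping["predicate"],
                        str(row[mapping["object_column"]])))
    return triples


# Hypothetical tuple, as it might come back from the sensor middleware.
rows = [{"timed": "2011-05-15T05:00:00", "sp_wind": 3.4}]
triples = tuples_to_triples(rows, MAPPING)
```

In the architecture shown, this translation runs on the results returned by the underlying engine, so the client only ever sees triples that match the ontology-level query.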
  • 34. Rewriting to different technologies
      SPARQLStream query:
        SELECT ?windspeed
        FROM NAMED STREAM <http://swiss-experiment.ch/data#WannengratSensors.srdf> [NOW-10 MINUTE TO NOW-0 MINUTE]
        WHERE { ?WaveObs a ssn:Observation; ssn:observationResult ?windspeed; ssn:observedProperty sweetSpeed:WindSpeed. }
      Query rewriting, through an algebra representation, into:
      • Esper (CEP): SELECT wind_speed_scalar_av, timed FROM wan7.win:time(10 min)
      • SNEE (DSMS): SELECT wan7.wind_speed_scalar_av AS windspeed, wan7.timed AS windts FROM wan7[FROM NOW-10 MINUTES TO NOW]
      • GSN (Middleware): http://montblanc.slf.ch:22001/multidata?vs[0]=wan7&field[0]=wind_speed_scalar_av&from=15/05/2011+05:00:00&to=15/05/2011+15:00:00
      • Pachube (Middleware): http://api.pachube.com/v2/feeds/14321/datastreams/4?start=2011-09-02T14:01:46Z&end=2011-09-02T17:01:46Z
      Calbimonte JP, Corcho O, Jeung H, Aberer K. Enabling Query Technologies for the Semantic Sensor Web. International Journal of Semantic Web and Information Systems 8(1):43-63
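The idea of producing several target expressions from one source-independent description can be sketched with plain string templating. This is only an illustration: the `WindowQuery` class and the three functions are hypothetical, the real system works on an algebra representation and on mappings rather than on templates, and the GSN function assumes a single field because the slide shows no multi-field request. The target strings reproduce the Esper, SNEE and GSN examples given on the slide.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Tuple


@dataclass
class WindowQuery:
    """Hypothetical, simplified stand-in for the algebra representation."""
    stream: str                      # stream or virtual sensor name, e.g. "wan7"
    fields: List[Tuple[str, str]]    # (column, alias) pairs
    window_minutes: int              # size of the time window


def to_esper(q: WindowQuery) -> str:
    cols = ", ".join(col for col, _ in q.fields)
    return f"SELECT {cols} FROM {q.stream}.win:time({q.window_minutes} min)"


def to_snee(q: WindowQuery) -> str:
    cols = ", ".join(f"{q.stream}.{col} AS {alias}" for col, alias in q.fields)
    return f"SELECT {cols} FROM {q.stream}[FROM NOW-{q.window_minutes} MINUTES TO NOW]"


def to_gsn_url(q: WindowQuery, base: str, start: datetime, end: datetime) -> str:
    # Assumption: only the first field is requested.
    fmt = "%d/%m/%Y+%H:%M:%S"
    col = q.fields[0][0]
    return (f"{base}/multidata?vs[0]={q.stream}&field[0]={col}"
            f"&from={start.strftime(fmt)}&to={end.strftime(fmt)}")


query = WindowQuery("wan7", [("wind_speed_scalar_av", "windspeed"), ("timed", "windts")], 10)
esper = to_esper(query)
snee = to_snee(query)
gsn = to_gsn_url(query, "http://montblanc.slf.ch:22001",
                 datetime(2011, 5, 15, 5, 0, 0), datetime(2011, 5, 15, 15, 0, 0))
```

A REST middleware such as GSN has no notion of a sliding window, so the relative window has to be turned into absolute `from` and `to` timestamps at request time; here the caller supplies them.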
  • 35. Ongoing work
      • Benchmarking of ontology-based streaming data engines
          • Zhang Y, Pham MD, Corcho O, Calbimonte JP. SRBench: A Streaming RDF/SPARQL Benchmark. Proceedings of the 11th International Semantic Web Conference (ISWC2012)
      • Improve optimisations when joining static and streaming data
      • Automatic characterisation of sensor data streams
          • Useful in citizen science approaches (e.g., AirQualityEgg)
          • Calbimonte JP, Yan Z, Jeung H, Corcho O, Aberer K. Deriving Semantic Sensor Metadata from Raw Measurements. ISWC2012 5th International Workshop on Semantic Sensor Networks (SSN2012). CEUR Workshop Proceedings, Vol-904, http://ceur-ws.org/Vol-904/
  • 36. 4. Federated query processing. In other words: how can we access data from federated data sources?
  • 37. Example
      • We query the life science domain:
          1. Using the Pubmed references obtained from the GeneID gene dataset, retrieve information about genes and their references in the Pubmed dataset.
          2. From Pubmed, access the information in the National Library of Medicine's controlled vocabulary thesaurus, stored at the MeSH endpoint, to obtain more complete information about those genes.
          3. Finally, access the HHPID endpoint, which is the knowledge base for the HIV-1 protein.
  • 38. Introduction
      • Question:
          • How can we access such an amount of RDF data in an integrated manner?
      • Current approaches
          • Replicate data in local stores and access it using existing RDF databases.
          • Execute individual queries and join the data manually.
          • Use existing distributed query systems (starting to appear).
  • 39. Problem
      • Existing tools for distributed SPARQL query processing differ in the way they handle distribution
      • The SPARQL Working Group has published the Federated Query document as a Last Call Working Draft
      • It homogenises access to distributed RDF data repositories
      • SERVICE <http://dbpedia.org/sparql> {...}
      • Problems in semantics: SERVICE ?X is not well defined
      • Current access to SPARQL endpoints is not optimal
      • Work on distributed SPARQL query optimisation is only beginning
  • 40. State of the Art
      • ANAPSID, RDF::Query, OpenAnzo, ARQ, Rasqal RDF Query Library
      • ANAPSID provides SPARQL optimization based on adaptive query processing operators
      • RDF::Query provides basic pattern reordering
      • Implement the federation using query predicates
      • List of SPARQL endpoints needed
      • Helps user to direct queries to remote datasets
      • FedX, SPLENDID, SemWIQ, NetworkedGraphs
      • All provide basic optimisations: pattern grouping (FedX), cost-based optimisations (SemWIQ, SPLENDID and recently FedX, NetworkedGraphs)
      • SPARQL 1.1 is mostly syntactic sugar
  • 41. Assumptions & Restrictions
      • Assumptions
          1. Users know how to create a query to the endpoints
          2. No statistics of any kind are available for the query processing system
          3. Data are distributed
      • Restrictions
          1. We only consider the Federation Extension of SPARQL 1.1
          2. We are not aware of the capabilities or implementation of the remote SPARQL server
          3. No registry of endpoints
  • 42. SERVICE Semantics
      Example, without federation:
        SELECT ?name ?email
        WHERE { ?y :name ?name . ?y :email ?email }
      Example, with SERVICE:
        SELECT ?name ?email
        WHERE {
          SERVICE <http://example1.org/sparql> {?y :name ?name} .
          SERVICE <http://example2.org/sparql> {?y :email ?email}
        }
      • We extend [PAG09] with the semantics for SERVICE [definition shown as an image on the slide]
  • 43. SERVICE Semantics Example: SELECT ?name WHERE { SERVICE ?X {?y :name ?name} }
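The two SERVICE examples can be made concrete with a toy evaluator over in-memory graphs. This is a sketch under several assumptions and not the formal semantics of the cited paper: the functions `match`, `join` and `service`, the endpoint contents and the idea of a known dictionary of endpoints are all hypothetical, a pattern is a single triple pattern, and `SERVICE ?X` is read as "try every known endpoint and bind ?X to the endpoint that produced the answer".

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]
Mapping = Dict[str, str]


def match(pattern: Triple, graph: List[Triple]) -> List[Mapping]:
    """Evaluate one triple pattern; terms starting with '?' are variables."""
    results: List[Mapping] = []
    for triple in graph:
        mu: Mapping = {}
        ok = True
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                if mu.get(term, value) != value:
                    ok = False
                    break
                mu[term] = value
            elif term != value:
                ok = False
                break
        if ok:
            results.append(mu)
    return results


def join(left: List[Mapping], right: List[Mapping]) -> List[Mapping]:
    """Merge every pair of mappings that agree on their shared variables."""
    return [{**m1, **m2} for m1 in left for m2 in right
            if all(m1[v] == m2[v] for v in m1.keys() & m2.keys())]


def service(target: str, pattern: Triple,
            endpoints: Dict[str, List[Triple]]) -> List[Mapping]:
    if target.startswith("?"):
        # Variable endpoint: evaluate everywhere and record where answers came from.
        return [{**mu, target: iri}
                for iri, graph in endpoints.items()
                for mu in match(pattern, graph) if target not in mu]
    # Constant endpoint: an unknown IRI raises KeyError in this sketch.
    return match(pattern, endpoints[target])


# Hypothetical endpoint contents.
endpoints = {
    "http://example1.org/sparql": [("p1", ":name", "Ana"), ("p2", ":name", "Luis")],
    "http://example2.org/sparql": [("p1", ":email", "ana@example.org")],
}
names = service("http://example1.org/sparql", ("?y", ":name", "?name"), endpoints)
emails = service("http://example2.org/sparql", ("?y", ":email", "?email"), endpoints)
answers = join(names, emails)
by_variable = service("?X", ("?y", ":name", "?name"), endpoints)
```

The variable case shows why `SERVICE ?X` is problematic in practice: without a registry of endpoints there is no obvious finite set to iterate over, which this sketch sidesteps by assuming one.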
  • 44. SPARQL Optimisation - OPTIONAL
      • We assume that we have no statistics of endpoints
      • This means that we cannot use cost-based optimisations
      • We will only focus on static optimisations
      • Besides the usual static optimisations (e.g. pushing down filters), SPARQL queries can be optimised if they contain OPTIONAL operators
      • The OPTIONAL operator is responsible for PSPACE-completeness in SPARQL [PAG09]
      • OPTIONAL is a key operator in SPARQL
  • 45. Well-designed patterns
      • Well-designed SPARQL patterns [PAG09]
      • A class of SPARQL patterns that satisfies an additional restriction on the variables used in OPTIONAL parts
  • 46. Well-designed Patterns • We extended the notion of well-designed patterns for the SPARQL 1.1 Federation Extension • The previous rules also hold for SERVICE
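The restriction behind well-designed patterns can be checked mechanically. The sketch below assumes the usual definition from [PAG09] for UNION-free patterns: for every subpattern (P1 OPT P2), any variable of P2 that also occurs outside that subpattern must occur in P1. The classes `Triple`, `And` and `Opt` and the function names are hypothetical, and FILTER and SERVICE are left out to keep the example short.

```python
from dataclasses import dataclass
from typing import FrozenSet, Set, Union


@dataclass(frozen=True)
class Triple:
    s: str
    p: str
    o: str


@dataclass(frozen=True)
class And:
    left: "Pattern"
    right: "Pattern"


@dataclass(frozen=True)
class Opt:
    left: "Pattern"
    right: "Pattern"


Pattern = Union[Triple, And, Opt]


def variables(p: Pattern) -> Set[str]:
    """All variables (terms starting with '?') occurring in a pattern."""
    if isinstance(p, Triple):
        return {t for t in (p.s, p.p, p.o) if t.startswith("?")}
    return variables(p.left) | variables(p.right)


def well_designed(p: Pattern, outside: FrozenSet[str] = frozenset()) -> bool:
    """`outside` holds the variables occurring in the whole pattern outside `p`."""
    if isinstance(p, Triple):
        return True
    if isinstance(p, Opt):
        # Variables of the optional part that leak outside without being in the left part.
        if (variables(p.right) & outside) - variables(p.left):
            return False
    return (well_designed(p.left, outside | variables(p.right))
            and well_designed(p.right, outside | variables(p.left)))


good = Opt(Triple("?y", ":name", "?name"), Triple("?y", ":email", "?email"))
bad = And(Triple("?x", ":name", "?n"),
          Opt(Triple("?y", ":name", "?m"), Triple("?x", ":email", "?e")))
is_good = well_designed(good)
is_bad = well_designed(bad)
```

In `bad`, the variable `?x` appears in the optional part and outside the OPT subpattern but not in its left-hand side, which is exactly the situation the restriction rules out.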
  • 47. Implementation: SPARQL-DQP
      • SPARQL-DQP is implemented on top of OGSA-DAI and OGSA-DQP
      • OGSA-DAI is a Web service-based framework for accessing distributed data resources
      • OGSA-DQP adds distributed query processing infrastructure
      • We reuse some OGSA-DQP operators
      • We added RDF and SPARQL endpoint data access
          • RDB2RDF data resource
          • RDF data resource
          • SPARQL endpoint resources
      • Good behaviour for large datasets
      Buil C, Arenas M, Corcho O. Semantics and optimization of the SPARQL 1.1 federation extension. Proceedings of the 8th Extended Semantic Web Conference (ESWC2011). Springer-Verlag LNCS 6644, pages 1-15
  • 48. Ongoing Work
      • An extensive benchmark has been produced
          • Montoya G, Vidal ME, Corcho O, Ruckhaus E, Buil-Aranda C. Benchmarking Federated SPARQL Query Engines: Are Existing Testbeds Enough? In: Proceedings of the 11th International Semantic Web Conference (ISWC2012)
      • Focusing now on Adaptive Query Processing
          • Query Processing should be adapted to the user's specific needs and specific network requirements
  • 49. 5. Entailment in query rewriting. In other words: how can we take into account the existence of ontologies in the query rewriting process, so as to provide simple entailment?
  • 50. Main approaches in the state of the art (system, author, expressiveness, output)
      • REQUIEM (Pérez-Urbina et al.). Expressiveness: ELHIO¬. Output: [R] Datalog, UCQ
      • Nyaya (Gottlob et al.). Expressiveness: Sticky-join [linear] datalog±. Output: UCQ
      • QuOnto (Calvanese et al.). Expressiveness: DL-LiteR, DL-LiteF. Output: UCQ
      • Rapid (Chortaras et al.). Expressiveness: DL-LiteR. Output: UCQ
      • Presto & Prexto (Rosati et al.). Expressiveness: DL-LiteR [+EBox]. Output: NR-Datalog & UCQ
  • 51. Optimizations in the rewriting • The rewriting can be optimized in several ways • Ontology preprocessing • Subsumption checks • Prioritize inferences • Constrain the searches
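Two of these optimisations, ontology preprocessing and subsumption checks, can be illustrated on a deliberately tiny fragment. The sketch assumes an ontology made only of subclass axioms and queries of the form Q(x) :- C1(x), ..., Cn(x); the function names and the example ontology are hypothetical, and none of this reflects how the systems listed in the state of the art handle richer logics. Preprocessing computes the subclass closure once; rewriting expands the query into a union of conjunctive queries (UCQ); the subsumption check drops any conjunctive query whose atoms strictly include those of another one, since its answers are already covered.

```python
from itertools import product
from typing import Dict, FrozenSet, Iterable, Set, Tuple


def subclass_closure(axioms: Set[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Preprocessing: map each class to all classes that entail it (itself included)."""
    classes = {c for pair in axioms for c in pair}
    below = {c: {c} for c in classes}
    changed = True
    while changed:
        changed = False
        for sub, sup in axioms:
            for c in classes:
                if sup in below[c] and not below[sub] <= below[c]:
                    below[c] |= below[sub]
                    changed = True
    return below


def rewrite(query: Iterable[str], below: Dict[str, Set[str]]) -> Set[FrozenSet[str]]:
    """Expand Q(x) :- C1(x), ..., Cn(x) into a UCQ; each CQ is a set of class names."""
    options = [sorted(below.get(c, {c})) for c in query]
    return {frozenset(choice) for choice in product(*options)}


def prune(ucq: Set[FrozenSet[str]]) -> Set[FrozenSet[str]]:
    """Subsumption check: drop a CQ if another CQ uses a strict subset of its atoms."""
    return {q for q in ucq if not any(other < q for other in ucq)}


# Hypothetical ontology: PhDStudent is a Student, Student is a Person.
axioms = {("Student", "Person"), ("PhDStudent", "Student")}
below = subclass_closure(axioms)
ucq = rewrite(["Person", "Student"], below)
minimal = prune(ucq)
```

For the query Person(x), Student(x) the expansion yields five conjunctive queries, and pruning leaves only Student(x) and PhDStudent(x), which is the kind of reduction that keeps the rewritten query manageable before it is sent to the data sources.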
  • 53. Conclusion and Future Work • We have proposed some small incremental improvements over the current state of the art in entailment-aware query rewriting • Need to integrate it with the rest of our work • This will happen during Fall 2012
  • 54. Final conclusions and future work
  • 55. Ingredients [figure: architecture layers (thin applications (mashups); middleware; semantic data integration and querying; legacy data sources; registries; sensor networks; Linked Open Data; spreadsheets) annotated with the numbered ingredients: 1 RDB2RDF; 2 query rewriting optimisations; 3 sensor-based query rewriting; 4 Federated Query Processing; 5 Reasoning]