Stream Reasoning
               Where We Got So Far
                      Oxford - 2011.1.18
             http://streamreasoning.org
                        Emanuele Della Valle
                           DEI - Politecnico di Milano
                            emanuele.dellavalle@polimi.it
                            http://emanueledellavalle.org
                                   Joint work with:
Davide Francesco Barbieri, Daniele Braga, Stefano Ceri, and Michael Grossniklaus
                               • For more information visit http://wiki.larkc.eu/UrbanComputing
Agenda
  •  Motivation
  •  Running Example
  •  Background
  •  Concept
  •  Achievements
  •  Retrospective and Conclusions




   Oxford, 2011-1-18   Emanuele Della Valle - visit http://streamreasoning.org   2
Motivation
It's a Streaming World! [IEEE-IS2009]
   •  Sensor networks, …


   •  traffic engineering, …


   •  social networking, …


   •  financial markets, …


   •  generate streams!

Running Example
Real-Time Streams on the Web
   •  Streams are appearing more and more often on the
      Web in sites that distribute and present information in
      real-time streams.
   •  Check out http://activitystrea.ms/ for a standard API
   •  E.g.




Running Example
Examples of Questions Users are Asking
   •  Which topics have my close friends discussed in the
      last hour?
   •  Which book is my friend likely to read next?
   •  What impact have I been creating with my tweets in
      the last day?
   •  …
   •  <query> … <time dimension> ?




Motivation
Problem Statement
   •  Making sense
       –  in real time
       –  of gigantic and inevitably noisy data streams
       –  in order to support the decision process of
          extremely large numbers of concurrent users




Background
What are data streams anyway?
   •  Formally:
      –  Data streams are unbounded sequences of time-
         varying data elements




          time

   •  Less formally:
      –  an (almost) continuous flow of information
      –  with the recent information being more relevant as it
         describes the current state of a dynamic system


Background
Continuous Semantics
   •  Processing data streams in the space of
      one-time semantics is difficult
      because of the very nature of the underlying data
   •  Innovative* assumption: continuous semantics!
         –  streams can be consumed on the fly rather than being
            stored forever and
         –  queries are registered and continuously produce
            answers




   *   This innovation arose in the DB community in the 90s

Background
Stream Processing
   •  Continuous queries registered over streams that
      are observed through windows
   [Figure: an input stream, observed through a window, feeds a registered
    continuous query that produces a stream of answers]


Background
Data Stream Management Systems (DSMS)
   •  Research Prototypes
      –    Amazon/Cougar (Cornell) – sensors
      –    Aurora (Brown/MIT) – sensor monitoring, dataflow
      –    Gigascope: AT&T Labs – Network Monitoring
      –    Hancock (AT&T) – Telecom streams
      –    Niagara (OGI/Wisconsin) – Internet DBs & XML
      –    OpenCQ (Georgia) – triggers, view maintenance
      –    Stream (Stanford) – general-purpose DSMS
      –    Stream Mill (UCLA) - power & extensibility
      –    Tapestry (Xerox) – publish/subscribe filtering
      –    Telegraph (Berkeley) – adaptive engine for sensors
      –    Tribeca (Bellcore) – network monitoring
   •  High-tech startups
      –  Streambase, Coral8, Apama, Truviso
   •  Major DBMS vendors are all adding stream extensions as well
      –  Oracle http://www.oracle.com/technology/products/dataint/htdocs/streams_fo.html
      –  DB2 http://www.eweek.com/c/a/Database/IBM-DB2-Turns-25-and-Prepares-for-New-Life/

Background
Can the Semantic Web process data streams?
   •  The Semantic Web, the Web of Data is doing fine
      –  RDF, RDF Schema, SPARQL, OWL, RIF
      –  well understood theory,
      –  rapid increase in scalability
   •  BUT it pretends that the world is static,
      or at best that it has a low change rate,
      both in change volume and change frequency
      –  ontology versioning
      –  belief revision
      –  time stamps on named graphs
   •  It sticks to the traditional one-time semantics


Concept
Stream Reasoning [IEEE-IS2010]
   •  Idea origination
      –  Can continuous semantics be ported to reasoning?
      –  This is an unexplored yet high impact research area!

   •  Stream Reasoning
      –  Logical reasoning in real time on gigantic and
         inevitably noisy data streams in order to support
         the decision process of extremely large numbers
         of concurrent users.
           -- S. Ceri, E. Della Valle, F. van Harmelen and H. Stuckenschmidt, 2010


   •  Note: making sense of streams necessarily requires
      processing them against rich background knowledge

Concept
Research Challenges
   •  Relation with data-stream systems
      –  Just as RDF relates to data-base systems?
   •  Query languages for semantic streams
      –  Just as SPARQL for RDF but with continuous semantics?
   •  Reasoning on Streams
      –    Formal representations for stream reasoning
      –    Notions of soundness and completeness
      –    Efficiency
      –    Scalability
   •  Dealing with incomplete & noisy data
      –  Even more so than on the current Web of Data
   •  Distributed and parallel processing
      –  Streams are parallel in nature


Achievements
Explored Continuous Semantics for SeWeb
   •  We investigated
      –  Architecture of a Stream Reasoner
      –  RDF streams
            •  the natural extension of the RDF data model to the new
               continuous scenario and
      –  Continuous SPARQL (or simply C-SPARQL)
            •  the extension of SPARQL for querying RDF streams.
      –  Efficient incremental updates of deductive
         closures
            •  specifically considering the nature of data streams
      –  Effective inductive stream reasoning (joint work
         with Siemens - Munich)
            •  See paper in IEEE IS special issue on Social Media
               Analytics

Achievements
Architecture [IEEE-IS2010]




   [Figure: architecture diagram. Components shown: Window, Selector (DSMS),
    Abstracter (DSMS), Deductive Reasoner, an Abstracter with a Long-Term
    Matrix and an Inductive Reasoner, an Abstracter with a Hype Matrix and
    an Inductive Reasoner, and Social Media Analytics.
    Legend: data stream, RDF stream, RDF graph, C = C-SPARQL query,
    P = SPARQL with Probability]

   •  Based on the LarKC conceptual framework
                                                                                 http://www.larkc.eu




Achievements
RDF Stream [WWW2009,EDBT2010,IJSC2010]
   •  RDF Stream Data Type
      –  Ordered sequence of pairs, where each pair is made
         of an RDF triple and its timestamp t
            (< triple >, t)
   •  E.g.,
      (<:Giulia :likes :Twilight >,                    2010-02-12T13:34:41)
      (<:John   :likes :TheLordOfTheRings >,           2010-02-12T13:36:28)
      (<:Alice :dislikes :Twilight >,                  2010-02-12T13:36:28)
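
A minimal sketch of the data type above, in Python. The class and function names are hypothetical, and reading "ordered" as "timestamps never decrease" is an assumption suggested by the example, where two pairs share the same timestamp.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Iterator, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


@dataclass(frozen=True)
class TimestampedTriple:
    """One element of an RDF stream: an RDF triple and its timestamp t."""
    triple: Triple
    timestamp: datetime


def rdf_stream(items: Iterable[Tuple[Triple, str]]) -> Iterator[TimestampedTriple]:
    """Yield (triple, t) pairs, checking that timestamps never decrease."""
    last = None
    for triple, iso in items:
        t = datetime.fromisoformat(iso)
        if last is not None and t < last:
            raise ValueError("timestamps must be non-decreasing")
        last = t
        yield TimestampedTriple(triple, t)


# The three pairs from the slide.
EXAMPLE = [
    ((":Giulia", ":likes", ":Twilight"), "2010-02-12T13:34:41"),
    ((":John", ":likes", ":TheLordOfTheRings"), "2010-02-12T13:36:28"),
    ((":Alice", ":dislikes", ":Twilight"), "2010-02-12T13:36:28"),
]

stream = list(rdf_stream(EXAMPLE))
```

A generator is used because a real stream is unbounded; only this finite example is collected into a list.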




Achievements
C-SPARQL [WWW2009,EDBT2010,IJSC2010]
   •  We specified the C-SPARQL syntax
      –  Incrementally, from existing specifications
            •  Including windows, grouping, aggregates, timestamping
   •  We gave the formal semantics of C-SPARQL
      –  Query registration, handling overloads
      –  Order of evaluation, pattern matching over time, …
   •  We investigated efficiency of evaluation
      –  Defining a suitable algebra
      –  Applying optimizations
      –  Efficient materialization of inferred data from streams



Achievements
An Example of C-SPARQL Query
   Who are the opinion makers? i.e., the users who are likely to influence
    the behavior of other users who follow them

   REGISTER STREAM OpinionMakers COMPUTED EVERY 5m AS
   CONSTRUCT { ?opinionMaker sd:about ?resource }
   FROM STREAM <http://streamingsocialdata.org/interactions>
     [RANGE 30m STEP 5m]
   WHERE {
           ?opinionMaker ?opinion ?resource .
           ?follower sioc:follows ?opinionMaker.
           ?follower ?opinion ?resource.
           FILTER ( cs:timestamp(?follower) >
                    cs:timestamp(?opinionMaker)
                    && ?opinion != sd:accesses )
   }
   HAVING ( COUNT(DISTINCT ?follower) > 3 )
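
To make the window and the aggregate concrete, here is a rough Python sketch of what the query computes. It is not a C-SPARQL engine. The events, the `sd:likes` predicate, the half-open window `(now - range, now]`, and the grouping by (opinion maker, resource) are all assumptions made for illustration, and `sioc:follows` triples are treated only as the social graph, not as opinions.

```python
from collections import defaultdict
from typing import Dict, Iterator, List, Set, Tuple

# (subject, predicate, object, timestamp in seconds); all data here is made up.
Event = Tuple[str, str, str, int]


def window(events: List[Event], now: int, range_s: int) -> List[Event]:
    """Events whose timestamp falls in (now - range_s, now]."""
    return [e for e in events if now - range_s < e[3] <= now]


def opinion_makers(events: List[Event], threshold: int = 3) -> Set[Tuple[str, str]]:
    """(maker, resource) pairs imitated later by more than `threshold` followers."""
    follows = {(s, o) for s, p, o, _ in events if p == "sioc:follows"}
    actions = [e for e in events if e[1] not in ("sioc:follows", "sd:accesses")]
    imitators: Dict[Tuple[str, str], Set[str]] = defaultdict(set)
    for maker, opinion, resource, t_maker in actions:
        for follower, opinion2, resource2, t_follower in actions:
            if (opinion2 == opinion and resource2 == resource
                    and (follower, maker) in follows
                    and t_follower > t_maker):
                imitators[(maker, resource)].add(follower)
    return {key for key, who in imitators.items() if len(who) > threshold}


def run(events: List[Event], end: int,
        range_s: int = 30 * 60, step_s: int = 5 * 60) -> Iterator[Tuple[int, Set]]:
    """Re-evaluate every STEP seconds over the last RANGE seconds."""
    for now in range(step_s, end + 1, step_s):
        yield now, opinion_makers(window(events, now, range_s))


EVENTS: List[Event] = [
    ("f1", "sioc:follows", "alice", 10),
    ("f2", "sioc:follows", "alice", 10),
    ("f3", "sioc:follows", "alice", 10),
    ("f4", "sioc:follows", "alice", 10),
    ("alice", "sd:likes", ":Twilight", 60),
    ("f1", "sd:likes", ":Twilight", 120),
    ("f2", "sd:likes", ":Twilight", 180),
    ("f3", "sd:likes", ":Twilight", 240),
    ("f4", "sd:likes", ":Twilight", 300),
]

results = dict(run(EVENTS, end=35 * 60))
```

With this data, `alice` is reported for `:Twilight` while her action and the follow edges are inside the 30-minute window, and the answer becomes empty once they slide out.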

Achievements
An Example of C-SPARQL Query
   Who are the opinion makers? i.e., the users who are likely to influence
    the behavior of other users who follow them

   REGISTER STREAM OpinionMakers COMPUTED EVERY 5m AS           # query registration (for continuous execution);
                                                                # RDF stream added as new output format
   CONSTRUCT { ?opinionMaker sd:about ?resource }
   FROM STREAM <http://streamingsocialdata.org/interactions>    # FROM STREAM clause
     [RANGE 30m STEP 5m]                                        # window
   WHERE {
           ?opinionMaker ?opinion ?resource .
           ?follower sioc:follows ?opinionMaker.
           ?follower ?opinion ?resource.
           FILTER ( cs:timestamp(?follower) >                   # built-in to access timestamps
                    cs:timestamp(?opinionMaker)
                    && ?opinion != sd:accesses )
   }
   HAVING ( COUNT(DISTINCT ?follower) > 3 )                     # aggregates as in SPARQL 1.1

Achievements
Efficiency of Evaluation 1/3 [IEEE-IS2010]
   •  Evaluation of Window-based Selection




Achievements
Efficiency of Evaluation 2/3 [EDBT2010]
   •  Several transformations can be applied to the algebraic
      representation of C-SPARQL queries
   •  Some recall well-known results from classical
      relational optimization
      –  push of FILTERs and projections
   •  Others are more specific to the domain of streams
      –  push of aggregates
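
As a toy illustration of pushing a FILTER below a join, the Python sketch below compares the two evaluation orders. It says nothing about how the C-SPARQL engine implements the rewrite. The rows, the static genre table, and the function names are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

WindowRow = Tuple[str, str]        # (user, resource) taken from the window
Answer = Tuple[str, str, str]      # (user, resource, genre)


def join_then_filter(rows: List[WindowRow], static: Dict[str, str],
                     keep: Callable[[str], bool]) -> Tuple[List[Answer], int]:
    """Join every window row with static data, then apply the filter."""
    joined = [(u, r, static[r]) for u, r in rows if r in static]
    return [a for a in joined if keep(a[1])], len(joined)


def filter_then_join(rows: List[WindowRow], static: Dict[str, str],
                     keep: Callable[[str], bool]) -> Tuple[List[Answer], int]:
    """Apply the filter first, then join only the surviving rows."""
    kept = [(u, r) for u, r in rows if keep(r)]
    joined = [(u, r, static[r]) for u, r in kept if r in static]
    return joined, len(joined)


ROWS = [(":Giulia", ":Twilight"), (":John", ":TheLordOfTheRings"),
        (":Alice", ":Twilight")]
GENRE = {":Twilight": "romance", ":TheLordOfTheRings": "fantasy"}  # made up


def only_lotr(resource: str) -> bool:
    return resource == ":TheLordOfTheRings"


late, late_joined = join_then_filter(ROWS, GENRE, only_lotr)
early, early_joined = filter_then_join(ROWS, GENRE, only_lotr)
```

Both orders return the same answers, but the pushed-down version joins one row instead of three.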




Achievements
Efficiency of Evaluation 3/3 [EDBT2010]
   •  Push of filters and projections
   [Chart: evaluation time (ms, 0 to 125) against window size (10 to 100000)
    for four settings: None, Static Only, Streaming Only, Both]

Achievements
Example of C-SPARQL and Reasoning 1/2
   What impact have I been creating with my tweets in the last hour?
   Is it positive or negative? Let’s count them …
   REGISTER QUERY CountPositiveAndNegativeReactions AS
   PREFIX : <http://ex.org/twitterImpactMining#>
   SELECT ?t count(?pos) count(?neg)
   FROM STREAM <http://ex.org/discussions.trdf>
            [RANGE 30m STEP 30s]
   WHERE {
    ?t a :MonitoredTweet .

    { ?pos :discuss ?t ;
      :ProduceReaction [ a :PositiveReaction ] .
    } UNION {
      ?neg :discuss ?t ;
      :ProduceReaction [ a :NegativeReaction ] .
    }
   } GROUP BY ?t

   Background knowledge:
    :discuss a owl:TransitiveProperty .
    :reply rdfs:subPropertyOf :discuss .
    :retweet rdfs:subPropertyOf :discuss .
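
The role of the background axioms (`:discuss` is transitive; `:reply` and `:retweet` are sub-properties of `:discuss`) can be sketched in Python with a naive fixpoint. This is only an illustration, not the reasoner used in the talk. The three window triples and the `reaction` labels are made up, and the `:ProduceReaction` pattern is simplified to a dictionary lookup.

```python
from typing import Dict, Set, Tuple

Triple = Tuple[str, str, str]

SUB_PROPERTY_OF: Dict[str, str] = {":reply": ":discuss", ":retweet": ":discuss"}
TRANSITIVE: Set[str] = {":discuss"}


def materialize(triples: Set[Triple]) -> Set[Triple]:
    """Naive fixpoint: apply both rules until no new triple appears."""
    closure = set(triples)
    while True:
        new = set()
        for s, p, o in closure:
            if p in SUB_PROPERTY_OF:          # rdfs:subPropertyOf
                new.add((s, SUB_PROPERTY_OF[p], o))
            if p in TRANSITIVE:               # owl:TransitiveProperty
                for s2, p2, o2 in closure:
                    if p2 == p and s2 == o:
                        new.add((s, p, o2))
        if new <= closure:
            return closure
        closure |= new


# Hypothetical window content: a chain of reactions to the monitored tweet t1.
WINDOW = {
    ("t1-1", ":retweet", "t1"),
    ("t1-2", ":reply", "t1-1"),
    ("t1-3", ":retweet", "t1-2"),
}
reaction = {"t1-1": "positive", "t1-2": "negative", "t1-3": "positive"}  # made up

closure = materialize(WINDOW)
discussing_t1 = {s for s, p, o in closure if p == ":discuss" and o == "t1"}
positive = sum(1 for s in discussing_t1 if reaction[s] == "positive")
negative = sum(1 for s in discussing_t1 if reaction[s] == "negative")
```

Without the inferred `:discuss` triples only `t1-1` would be linked to `t1`; with them all three tweets count towards its impact.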

Achievements
Example of C-SPARQL and Reasoning 2/2


   [Figure: tweets t1, t1-1, t1-2 and t1-3 linked by retweet and reply edges,
    with the inferred discuss edges drawn on top.
    Legend: Monitored, Positive, Negative]




Achievements
State-of-the-Art Approach [Ceri1994,Volz2005]
   1.  Overestimation of deletion: Overestimates deletions
        by computing all direct consequences of a deletion.
   2.  Rederivation: Prunes those estimated deletions for
        which alternative derivations (via some other facts
        in the program) exist.
   3.  Insertion: Adds the new derivations that are
        consequences of insertions to extensional
        predicates.
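
The three steps can be sketched in Python for a tiny fixed rule set. This follows the outline above and is not the algorithm as published in [Ceri1994] or [Volz2005]. The two rules (sub-property and transitivity over `:discuss`), the function names, and the example facts are assumptions made for illustration.

```python
from typing import Iterable, Set, Tuple

Triple = Tuple[str, str, str]
SUB_PROPERTY_OF = {":reply": ":discuss", ":retweet": ":discuss"}
TRANSITIVE = {":discuss"}


def consequences(fact: Triple, facts: Set[Triple]) -> Set[Triple]:
    """One-step consequences that use `fact` as a premise, joined with `facts`."""
    s, p, o = fact
    out = set()
    if p in SUB_PROPERTY_OF:
        out.add((s, SUB_PROPERTY_OF[p], o))
    if p in TRANSITIVE:
        for s2, p2, o2 in facts:
            if p2 == p and s2 == o:
                out.add((s, p, o2))
            if p2 == p and o2 == s:
                out.add((s2, p, o))
    return out


def derivable(fact: Triple, facts: Set[Triple]) -> bool:
    """True if `fact` follows in one step from premises that are all in `facts`."""
    s, p, o = fact
    if any((s, sub, o) in facts for sub, sup in SUB_PROPERTY_OF.items() if sup == p):
        return True
    if p in TRANSITIVE:
        return any((x, p, o) in facts for s2, p2, x in facts if s2 == s and p2 == p)
    return False


def propagate(facts: Set[Triple], seeds: Iterable[Triple]) -> None:
    """Add `seeds` and everything they entail to `facts` (in place)."""
    agenda = list(seeds)
    while agenda:
        fact = agenda.pop()
        if fact not in facts:
            facts.add(fact)
            agenda.extend(consequences(fact, facts))


def maintain(materialized: Set[Triple], explicit: Set[Triple],
             deletions: Set[Triple], insertions: Set[Triple]) -> Set[Triple]:
    # 1. Overestimation: everything reachable from a deleted fact.
    over, agenda = set(), list(deletions)
    while agenda:
        fact = agenda.pop()
        if fact in materialized and fact not in over:
            over.add(fact)
            agenda.extend(consequences(fact, materialized))
    remaining = materialized - over
    # 2. Rederivation: put back facts that still have another derivation.
    kept_explicit = explicit - deletions
    propagate(remaining, {f for f in over
                          if f in kept_explicit or derivable(f, remaining)})
    # 3. Insertion: add the new facts and their consequences.
    propagate(remaining, insertions)
    return remaining


explicit = {("t2", ":retweet", "t1"), ("t3", ":reply", "t2"), ("t3", ":retweet", "t1")}
materialized: Set[Triple] = set()
propagate(materialized, explicit)
updated = maintain(materialized, explicit,
                   deletions={("t2", ":retweet", "t1")},
                   insertions={("t4", ":retweet", "t3")})
```

In the example, `t3 :discuss t1` is first overestimated as deleted and then rederived, because `t3 :retweet t1` still supports it.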




Achievements
Our Approach [ESWC2010] 1/2
   •  Assumption
      –  Insertions and deletions are triples respectively
         entering and exiting the window
      –  The window size is known
   •  Therefore
      –  The time when each triple will expire is known and
         determined by the window size
            •  E.g. if the window is 10s long a triple entering at time t will
               exit at time t+10s
      –  Note: all knowledge can be annotated with an
         expiration time
            •  i.e., background knowledge is annotated with +∞


Achievements
Our Approach [ESWC2010] 2/2
   •  The algorithm
      1.  deletes all triples (asserted or inferred) that have just
          expired
      2.  computes the entailments derived by the inserts,
      3.  annotates each entailed triple with an expiration time,
          and
      4.  eliminates from the current state all copies of derived
          triples except the one with the highest timestamp.
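
A minimal Python sketch of these four steps, under stated assumptions: the rule set is limited to the sub-property and transitivity axioms of the tweet example, an entailed triple expires when its earliest-expiring premise does, and "highest timestamp" in step 4 is read as "latest expiration time". The function names are hypothetical and this is not the ESWC 2010 implementation.

```python
import math
from typing import Dict, Iterable, Iterator, Tuple

Triple = Tuple[str, str, str]
SUB_PROPERTY_OF = {":reply": ":discuss", ":retweet": ":discuss"}
TRANSITIVE = {":discuss"}


def consequences(triple: Triple, exp: float,
                 state: Dict[Triple, float]) -> Iterator[Tuple[Triple, float]]:
    """Yield (entailed triple, expiration) pairs that use `triple` as a premise."""
    s, p, o = triple
    if p in SUB_PROPERTY_OF:
        yield (s, SUB_PROPERTY_OF[p], o), exp
    if p in TRANSITIVE:
        for (s2, p2, o2), exp2 in list(state.items()):
            if p2 != p:
                continue
            if s2 == o:
                yield (s, p, o2), min(exp, exp2)
            if o2 == s:
                yield (s2, p, o), min(exp, exp2)


def slide(state: Dict[Triple, float], inserts: Iterable[Triple],
          now: float, window: float) -> None:
    """Update `state` (triple -> expiration time) in place when the window slides."""
    # Step 1: delete all triples, asserted or inferred, that have just expired.
    for triple in [t for t, exp in state.items() if exp <= now]:
        del state[triple]
    # Steps 2-4: derive entailments of the inserts, annotate them with an
    # expiration time, and keep only the longest-lived copy of each triple.
    agenda = [(t, now + window) for t in inserts]
    while agenda:
        triple, exp = agenda.pop()
        if state.get(triple, -math.inf) >= exp:
            continue  # an equal or longer-lived copy is already present
        state[triple] = exp
        agenda.extend(consequences(triple, exp, state))


# Background knowledge never expires (expiration time +inf).
state: Dict[Triple, float] = {("t1", "rdf:type", ":MonitoredTweet"): math.inf}
slide(state, [("t2", ":retweet", "t1")], now=0, window=10)
slide(state, [("t3", ":reply", "t2")], now=4, window=10)
exp_before = state[("t3", ":discuss", "t1")]  # limited by the older premise
slide(state, [], now=10, window=10)
```

No rederivation step is needed: `t3 :discuss t1` is dropped at time 10 simply because its recorded expiration time has been reached.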


   •  Learn more
      –  http://www.slideshare.net/emanueledellavalle/incremental-reasoning-on-streams-andrich-background-knowledge


Achievements
Comparative Evaluation 1/2 [ESWC2010]
   •  Hypothesis
               –  Background knowledge does not change and is fully materialized
               –  Changes only take place in the window
   •  An experiment comparing the time required to compute a new
      materialization using
               –  Re-computing from scratch (i.e., 1250 ms in our setting)
               –  State of the art incremental approach [Volz, 2005]
               –  Our approach
   •  Results at increasing % of the materialization changed when
      the window slides
   [Chart: time (ms, log scale from 10 to 10000) against the % of the
    materialization changed when the window slides (0% to 20%),
    for incremental-volz and incremental-stream]



Achievements
Comparative Evaluation 2/2
   •  Comparison of the average time needed to answer a
      C-SPARQL query using
      –  a forward reasoner,
      –  the naive approach of re-computing the materialization
      –  our approach
   Average time (ms)    forward reasoning    naive approach    incremental-stream
   query                5.82                 1.61              1.61
   materialization      0                    15.91             0.28


Retrospective and Conclusions
Wrap Up
   •  RDF Streams
       –  Notion defined
   •  C-SPARQL
       –  Syntax and semantics defined as a SPARQL extension
       –  Engine designed
       –  Engine implemented based on the decision to keep stream
          management and query evaluation separated
   •  Experiments with C-SPARQL under simple RDF entailment
      regimes
       –  window-based selection in C-SPARQL outperforms the standard
          FILTER-based selection
       –  having formally defined the C-SPARQL semantics, algebraic
          optimizations are possible
   •  Experiment with C-SPARQL under OWL-RL entailment
      regimes
       –  efficient incremental updates of deductive closures investigated
       –  our approach outperforms the state of the art when updates come as a
          stream

Retrospective and Conclusions
Achievements vs. Research Challenges
   •  Relation with data-stream systems
       –  Notion of RDF stream :-|
   •  Query languages for semantic streams
       –  C-SPARQL :-D
   •  Reasoning on Streams
       –  Formal representations for stream reasoning
             •  :-P
       –  Notions of soundness and completeness
             •  :-P
       –  Efficient incremental updates of deductive closures
             •  ESWC 2010 paper :-) ... but much more work is needed!
       –  How to combine streams and background knowledge
             •  ESWC 2010 paper :-| ... but a lot needs to be studied ...
   •  Dealing with incomplete & noisy data
       –  :-P
   •  Distributed and parallel processing
       –  :-P

References
  •  Vision
      [IEEE-IS2009] Emanuele Della Valle, Stefano Ceri, Frank van Harmelen, Dieter Fensel
         It's a Streaming World! Reasoning upon Rapidly Changing Information. IEEE Intelligent
         Systems 24(6): 83-89 (2009)
  •  Continuous SPARQL (C-SPARQL)
      [EDBT2010] Davide Francesco Barbieri, Daniele Braga, Stefano Ceri and Michael
          Grossniklaus. An Execution Environment for C-SPARQL Queries. EDBT 2010
      [WWW2009] Davide Francesco Barbieri, Daniele Braga, Stefano Ceri, Emanuele Della Valle,
          Michael Grossniklaus: C-SPARQL: SPARQL for continuous querying. WWW 2009:
          1061-1062
      [IJSC2010] Davide Francesco Barbieri, Daniele Braga, Stefano Ceri, Emanuele Della Valle,
          Michael Grossniklaus: C-SPARQL: a Continuous Query Language for RDF Data Streams.
          Int. J. Semantic Computing 4(1): 3-25 (2010)
      [IEEE-IS2010] Davide Barbieri, Daniele Braga, Stefano Ceri, Emanuele Della Valle, Yi Huang,
          Volker Tresp, Achim Rettinger, Hendrik Wermser, "Deductive and Inductive Stream
          Reasoning for Semantic Social Media Analytics," IEEE Intelligent Systems, 30 Aug. 2010.
  •  Stream Reasoning
      [ESWC2010] Davide Francesco Barbieri, Daniele Braga, Stefano Ceri, Emanuele Della Valle,
         Michael Grossniklaus. Incremental Reasoning on Streams and Rich Background
         Knowledge. In: 7th Extended Semantic Web Conference (ESWC 2010)
  •  Background work
      [Ceri1994] Stefano Ceri, Jennifer Widom: Deriving Incremental Production Rules for Deductive
         Data. Inf. Syst. 19(6): 467-490 (1994)
      [Volz2005] Raphael Volz, Steffen Staab, Boris Motik: Incrementally Maintaining
         Materializations of Ontologies Stored in Logic Databases. J. Data Semantics 2: 1-34 (2005)


Thank You! Questions?




                                  Much More to Come!
                                    Keep an eye on
                            http://www.streamreasoning.org




        Oxford, 2011-1-18          For more information visit http://www.larkc.eu/   33

Contenu connexe

Tendances

Tendances (12)

OrdRing 2013 keynote - On the need for a W3C community group on RDF Stream Pr...
OrdRing 2013 keynote - On the need for a W3C community group on RDF Stream Pr...OrdRing 2013 keynote - On the need for a W3C community group on RDF Stream Pr...
OrdRing 2013 keynote - On the need for a W3C community group on RDF Stream Pr...
 
Wi2015 - Clustering of Linked Open Data - the LODeX tool
Wi2015 - Clustering of Linked Open Data - the LODeX toolWi2015 - Clustering of Linked Open Data - the LODeX tool
Wi2015 - Clustering of Linked Open Data - the LODeX tool
 
Linked Open Data Visualization
Linked Open Data VisualizationLinked Open Data Visualization
Linked Open Data Visualization
 
Incremental Reasoning on Streams and Rich Background Knowledge
Incremental Reasoning on Streams andRich Background Knowledge Incremental Reasoning on Streams andRich Background Knowledge
Incremental Reasoning on Streams and Rich Background Knowledge
 
What_do_Knowledge_Graph_Embeddings_Learn.pdf
What_do_Knowledge_Graph_Embeddings_Learn.pdfWhat_do_Knowledge_Graph_Embeddings_Learn.pdf
What_do_Knowledge_Graph_Embeddings_Learn.pdf
 
TripleWave: Spreading RDF Streams on the Web
TripleWave: Spreading RDF Streams on the WebTripleWave: Spreading RDF Streams on the Web
TripleWave: Spreading RDF Streams on the Web
 
LOD2 Plenary Meeting 2011: University of Economics, Prague – Partner Introduc...
LOD2 Plenary Meeting 2011: University of Economics, Prague – Partner Introduc...LOD2 Plenary Meeting 2011: University of Economics, Prague – Partner Introduc...
LOD2 Plenary Meeting 2011: University of Economics, Prague – Partner Introduc...
 
LDQL: A Query Language for the Web of Linked Data
LDQL: A Query Language for the Web of Linked DataLDQL: A Query Language for the Web of Linked Data
LDQL: A Query Language for the Web of Linked Data
 
RDF Stream Processing: Let's React
RDF Stream Processing: Let's ReactRDF Stream Processing: Let's React
RDF Stream Processing: Let's React
 
Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 3 (...
Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 3 (...Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 3 (...
Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 3 (...
 
Overview of the SPARQL-Generate language and latest developments
Overview of the SPARQL-Generate language and latest developmentsOverview of the SPARQL-Generate language and latest developments
Overview of the SPARQL-Generate language and latest developments
 
RDF Stream Processing Tutorial: RSP implementations
RDF Stream Processing Tutorial: RSP implementationsRDF Stream Processing Tutorial: RSP implementations
RDF Stream Processing Tutorial: RSP implementations
 

En vedette (8)

International Renewables services
International Renewables servicesInternational Renewables services
International Renewables services
 
Valencia millor equip del mon
Valencia millor equip del monValencia millor equip del mon
Valencia millor equip del mon
 
IC2008 Connettivi Booleani
IC2008 Connettivi BooleaniIC2008 Connettivi Booleani
IC2008 Connettivi Booleani
 
Parent K-5 Math Audit Response #1
Parent K-5 Math Audit Response #1Parent K-5 Math Audit Response #1
Parent K-5 Math Audit Response #1
 
Speciale arte24 09-2012 p
Speciale arte24 09-2012 pSpeciale arte24 09-2012 p
Speciale arte24 09-2012 p
 
Crisis
CrisisCrisis
Crisis
 
IC2009 Information R-Evolution
IC2009 Information R-EvolutionIC2009 Information R-Evolution
IC2009 Information R-Evolution
 
IC2009 Enunciato e mondo del discorso
IC2009 Enunciato e mondo del discorsoIC2009 Enunciato e mondo del discorso
IC2009 Enunciato e mondo del discorso
 

Similaire à Stream Reasoning - where we got so far 2011.1.18 Oxford Key Note

Enabling ontology based streaming data access final
Enabling ontology based streaming data access finalEnabling ontology based streaming data access final
Enabling ontology based streaming data access final
Jean-Paul Calbimonte
 
ALIAOnline Practical Linked (Open) Data for Libraries, Archives & Museums
ALIAOnline Practical Linked (Open) Data for Libraries, Archives & MuseumsALIAOnline Practical Linked (Open) Data for Libraries, Archives & Museums
ALIAOnline Practical Linked (Open) Data for Libraries, Archives & Museums
Jon Voss
 
Engaging Information Professionals in the Process of Authoritative Interlinki...
Engaging Information Professionals in the Process of Authoritative Interlinki...Engaging Information Professionals in the Process of Authoritative Interlinki...
Engaging Information Professionals in the Process of Authoritative Interlinki...
Lucy McKenna
 

Similaire à Stream Reasoning - where we got so far 2011.1.18 Oxford Key Note (20)

Challenges, Approaches, and Solutions in Stream Reasoning
Challenges, Approaches, and Solutions in Stream ReasoningChallenges, Approaches, and Solutions in Stream Reasoning
Challenges, Approaches, and Solutions in Stream Reasoning
 
On the need for a W3C community group on RDF Stream Processing
On the need for a W3C community group on RDF Stream ProcessingOn the need for a W3C community group on RDF Stream Processing
On the need for a W3C community group on RDF Stream Processing
 
Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'int...
Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'int...Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'int...
Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'int...
 
Reflections on Almost Two Decades of Research into Stream Processing
Reflections on Almost Two Decades of Research into Stream ProcessingReflections on Almost Two Decades of Research into Stream Processing
Reflections on Almost Two Decades of Research into Stream Processing
 
RSP4J: An API for RDF Stream Processing
RSP4J: An API for RDF Stream ProcessingRSP4J: An API for RDF Stream Processing
RSP4J: An API for RDF Stream Processing
 
20110728 datalift-rpi-troy
20110728 datalift-rpi-troy20110728 datalift-rpi-troy
20110728 datalift-rpi-troy
 
Enabling ontology based streaming data access final
Enabling ontology based streaming data access finalEnabling ontology based streaming data access final
Enabling ontology based streaming data access final
 
From ontology to wiki
From ontology to wikiFrom ontology to wiki
From ontology to wiki
 
November 19, 2014 NISO Virtual Conference: Can't We All Work Together?: Inter...
November 19, 2014 NISO Virtual Conference: Can't We All Work Together?: Inter...November 19, 2014 NISO Virtual Conference: Can't We All Work Together?: Inter...
November 19, 2014 NISO Virtual Conference: Can't We All Work Together?: Inter...
 
Metadata is back!
Metadata is back!Metadata is back!
Metadata is back!
 
Update From OCLC Research May 2008
Update From OCLC Research May 2008Update From OCLC Research May 2008
Update From OCLC Research May 2008
 
Towards efficient processing of RDF data streams
Towards efficient processing of RDF data streamsTowards efficient processing of RDF data streams
Towards efficient processing of RDF data streams
 
Towards efficient processing of RDF data streams
Towards efficient processing of RDF data streamsTowards efficient processing of RDF data streams
Towards efficient processing of RDF data streams
 
Usage of Linked Data: Introduction and Application Scenarios
Usage of Linked Data: Introduction and Application ScenariosUsage of Linked Data: Introduction and Application Scenarios
Usage of Linked Data: Introduction and Application Scenarios
 
ALIAOnline Practical Linked (Open) Data for Libraries, Archives & Museums
ALIAOnline Practical Linked (Open) Data for Libraries, Archives & MuseumsALIAOnline Practical Linked (Open) Data for Libraries, Archives & Museums
ALIAOnline Practical Linked (Open) Data for Libraries, Archives & Museums
 
Engaging Information Professionals in the Process of Authoritative Interlinki...
Engaging Information Professionals in the Process of Authoritative Interlinki...Engaging Information Professionals in the Process of Authoritative Interlinki...
Engaging Information Professionals in the Process of Authoritative Interlinki...
 
Datalift: A Catalyser for the Web of Data - Francois Scharffe
Datalift: A Catalyser for the Web of Data - Francois ScharffeDatalift: A Catalyser for the Web of Data - Francois Scharffe
Datalift: A Catalyser for the Web of Data - Francois Scharffe
 
British Library Seminar: Shared Canvas (September 2011)
British Library Seminar: Shared Canvas (September 2011)British Library Seminar: Shared Canvas (September 2011)
British Library Seminar: Shared Canvas (September 2011)
 
Stream Reasoning : Where We Got So Far
Stream Reasoning: Where We Got So FarStream Reasoning: Where We Got So Far
Stream Reasoning : Where We Got So Far
 
Linked Open Data for Cultural Heritage
Linked Open Data for Cultural HeritageLinked Open Data for Cultural Heritage
Linked Open Data for Cultural Heritage
 

Stream Reasoning - where we got so far 2011.1.18 Oxford Key Note

  • 1. Stream Reasoning: Where We Got So Far. Oxford, 2011.1.18. http://streamreasoning.org Emanuele Della Valle, DEI - Politecnico di Milano, emanuele.dellavalle@polimi.it, http://emanueledellavalle.org Joint work with: Davide Francesco Barbieri, Daniele Braga, Stefano Ceri, and Michael Grossniklaus •  For more information visit http://wiki.larkc.eu/UrbanComputing
  • 2. Agenda •  Motivation •  Running Example •  Background •  Concept •  Achievements •  Retrospective and Conclusions Oxford, 2011-1-18 Emanuele Della Valle - visit http://streamreasoning.org 2
  • 3. Motivation It's a Streaming World! [IEEE-IS2009] •  Sensor networks, … •  traffic engineering, … •  social networking, … •  financial markets, … •  generate streams!
  • 4. Running Example Real-Time Streams on the Web •  Streams are appearing more and more often on the Web, in sites that distribute and present information in real time. •  Check out http://activitystrea.ms/ for a standard API
  • 5. Running Example Examples of Questions Users are Asking •  Which topics have my close friends discussed in the last hour? •  Which book is my friend likely to read next? •  What impact have I been creating with my tweets in the last day? •  … •  <query> … <time dimension> ?
  • 6. Motivation Problem Statement •  Making sense –  in real time –  of gigantic and inevitably noisy data streams –  in order to support the decision process of extremely large numbers of concurrent users
  • 7. Background What are data streams anyway? •  Formally: –  Data streams are unbounded sequences of time-varying data elements •  Less formally: –  an (almost) continuous flow of information –  with the recent information being more relevant as it describes the current state of a dynamic system
  • 8. Background Continuous Semantics •  Processing data streams in the space of one-time semantics is difficult because of the very nature of the underlying data •  Innovative* assumption: continuous semantics! –  streams can be consumed on the fly rather than being stored forever and –  queries are registered and continuously produce answers * This innovation arose in the DB community in the 90s
  • 9. Background Stream Processing •  Continuous queries are registered over streams that are observed through windows [Figure: an input stream observed through a window feeds a registered continuous query, which produces a stream of answers]
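The window mechanism on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the engine described in the talk: the function name `sliding_windows`, the `(timestamp, payload)` element shape, and the choice of a half-open window `[T - range, T)` evaluated every `step_s` seconds of stream time are all assumptions made for the example. The `range_s` and `step_s` parameters mirror the `RANGE` and `STEP` keywords that appear in the C-SPARQL queries later in the transcript.

```python
from collections import deque
from typing import Deque, Iterable, Iterator, List, Optional, Tuple

Element = Tuple[float, str]  # (timestamp in seconds, payload); hypothetical shape


def sliding_windows(
    stream: Iterable[Element], range_s: float, step_s: float
) -> Iterator[Tuple[float, List[str]]]:
    """Yield (evaluation_time, window_contents) every step_s seconds of stream time.

    The window evaluated at time T holds the elements with T - range_s <= ts < T.
    The stream is assumed to arrive in timestamp order.
    """
    window: Deque[Element] = deque()
    next_eval: Optional[float] = None
    for ts, payload in stream:
        if next_eval is None:
            next_eval = ts + step_s  # first evaluation one step after the first element
        # Emit every evaluation that is due before this element is admitted.
        while ts >= next_eval:
            # Evict elements that fell out of the window.
            while window and window[0][0] < next_eval - range_s:
                window.popleft()
            yield next_eval, [p for _, p in window]
            next_eval += step_s
        window.append((ts, payload))


if __name__ == "__main__":
    demo = [(0, "a"), (1, "b"), (2, "c"), (3, "d"), (4, "e"), (5, "f")]
    for when, content in sliding_windows(demo, range_s=3, step_s=2):
        print(when, content)
```

Only the elements inside the current window are kept in memory, which is the point of consuming the stream on the fly instead of storing it.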
  • 10. Background Data Stream Management Systems (DSMS) •  Research prototypes –  Amazon/Cougar (Cornell): sensors –  Aurora (Brown/MIT): sensor monitoring, dataflow –  Gigascope (AT&T Labs): network monitoring –  Hancock (AT&T): telecom streams –  Niagara (OGI/Wisconsin): Internet DBs & XML –  OpenCQ (Georgia): triggers, view maintenance –  Stream (Stanford): general-purpose DSMS –  Stream Mill (UCLA): power & extensibility –  Tapestry (Xerox): publish/subscribe filtering –  Telegraph (Berkeley): adaptive engine for sensors –  Tribeca (Bellcore): network monitoring •  High-tech startups –  Streambase, Coral8, Apama, Truviso •  Major DBMS vendors are all adding stream extensions as well –  Oracle http://www.oracle.com/technology/products/dataint/htdocs/streams_fo.html –  DB2 http://www.eweek.com/c/a/Database/IBM-DB2-Turns-25-and-Prepares-for-New-Life/
  • 11. Background Can the Semantic Web process data streams? •  The Semantic Web, the Web of Data, is doing fine –  RDF, RDF Schema, SPARQL, OWL, RIF –  well understood theory, –  rapid increase in scalability •  BUT it pretends that the world is static or, at best, has a low change rate both in change-volume and change-frequency –  ontology versioning –  belief revision –  time stamps on named graphs •  It sticks to the traditional one-time semantics
  • 12. Concept Stream Reasoning [IEEE-IS2010] •  Idea origination –  Can continuous semantics be ported to reasoning? –  This is an unexplored yet high impact research area! •  Stream Reasoning –  Logical reasoning in real time on gigantic and inevitably noisy data streams in order to support the decision process of extremely large numbers of concurrent users. -- S. Ceri, E. Della Valle, F. van Harmelen and H. Stuckenschmidt, 2010 •  Note: making sense of streams necessarily requires processing them against rich background knowledge
  • 13. Concept Research Challenges •  Relation with data-stream systems –  Just as RDF relates to database systems? •  Query languages for semantic streams –  Just as SPARQL for RDF but with continuous semantics? •  Reasoning on Streams –  Formal representations for stream reasoning –  Notions of soundness and completeness –  Efficiency –  Scalability •  Dealing with incomplete & noisy data –  Even more so than on the current Web of Data •  Distributed and parallel processing –  Streams are parallel in nature
  • 14. Achievements Explored Continuous Semantics for SeWeb •  We investigated –  Architecture of a Stream Reasoner –  RDF streams •  the natural extension of the RDF data model to the new continuous scenario and –  Continuous SPARQL (or simply C-SPARQL) •  the extension of SPARQL for querying RDF streams. –  Efficient incremental updates of deductive closures •  specifically considering the nature of data streams –  Effective inductive stream reasoning (joint work with Siemens - Munich) •  See paper in IEEE IS special issue on Social Media Analytics
  • 15. Achievements Architecture [IEEE-IS2010] [Figure: architecture for Social Media Analytics. Components: Selector, Window, DSMS, Abstracter, Deductive Reasoner, Inductive Reasoner, Long-Term Matrix, Hype Matrix. Legend: data stream, RDF stream, RDF graph, C = C-SPARQL query, P = SPARQL with Probability] •  Based on the LarKC conceptual framework http://www.larkc.eu
  • 16. Achievements RDF Stream [WWW2009,EDBT2010,IJSC2010] •  RDF Stream Data Type –  Ordered sequence of pairs, where each pair is made of an RDF triple and its timestamp t: (<triple>, t) •  E.g.,
      (<:Giulia :likes :Twilight>, 2010-02-12T13:34:41)
      (<:John :likes :TheLordOfTheRings>, 2010-02-12T13:36:28)
      (<:Alice :dislikes :Twilight>, 2010-02-12T13:36:28)
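As a rough illustration of the data type on this slide, an RDF stream can be modelled as a list of (triple, timestamp) pairs ordered by timestamp. The class and function names below are hypothetical, and the ordering check assumes non-decreasing timestamps, since the slide's own example contains two triples with the same timestamp.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object) as plain strings


@dataclass(frozen=True)
class TimestampedTriple:
    """One element of an RDF stream: an RDF triple paired with its timestamp."""
    triple: Triple
    timestamp: datetime


def is_ordered(stream: List[TimestampedTriple]) -> bool:
    """Check that timestamps never decrease (equal timestamps are allowed)."""
    return all(a.timestamp <= b.timestamp for a, b in zip(stream, stream[1:]))


# The three elements shown on the slide.
example_stream = [
    TimestampedTriple((":Giulia", ":likes", ":Twilight"),
                      datetime(2010, 2, 12, 13, 34, 41)),
    TimestampedTriple((":John", ":likes", ":TheLordOfTheRings"),
                      datetime(2010, 2, 12, 13, 36, 28)),
    TimestampedTriple((":Alice", ":dislikes", ":Twilight"),
                      datetime(2010, 2, 12, 13, 36, 28)),
]

if __name__ == "__main__":
    print(is_ordered(example_stream))
```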
  • 17. Achievements C-SPARQL [WWW2009,EDBT2010,IJSC2010] •  We specified the C-SPARQL syntax –  Incrementally, from existing specifications •  Including windows, grouping, aggregates, timestamping •  We gave the formal semantics of C-SPARQL –  Query registration, handling overloads –  Order of evaluation, pattern matching over time, … •  We investigated efficiency of evaluation –  Defining a suitable algebra –  Applying optimizations –  Efficient materialization of inferred data from streams
  • 18. Achievements An Example of C-SPARQL Query Who are the opinion makers? i.e., the users who are likely to influence the behavior of other users who follow them
      REGISTER STREAM OpinionMakers COMPUTED EVERY 5m AS
      CONSTRUCT { ?opinionMaker sd:about ?resource }
      FROM STREAM <http://streamingsocialdata.org/interactions> [RANGE 30m STEP 5m]
      WHERE {
        ?opinionMaker ?opinion ?resource .
        ?follower sioc:follows ?opinionMaker .
        ?follower ?opinion ?resource .
        FILTER ( cs:timestamp(?follower) > cs:timestamp(?opinionMaker) && ?opinion != sd:accesses )
      }
      HAVING ( COUNT(DISTINCT ?follower) > 3 )
  • 19. Achievements An Example of C-SPARQL Query (annotated) Who are the opinion makers? The query is the same as on slide 18; the slide annotates its C-SPARQL features: •  Query registration (for continuous execution) •  RDF stream added as new output format •  FROM STREAM clause •  WINDOW •  Built-in to access timestamps •  Aggregates as in SPARQL 1.1
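To make the intent of the opinion-makers query concrete, here is a small Python sketch of what one evaluation over a single window might compute. It is not how a C-SPARQL engine evaluates the query. Several things are assumptions made for illustration: the function name `opinion_makers`, passing the `sioc:follows` relation as a separate set instead of reading it from the stream, using plain numbers as timestamps, and grouping the follower count by (opinion maker, resource), which the query text does not spell out.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Event = Tuple[str, str, str, float]  # (subject, predicate, object, timestamp)


def opinion_makers(
    window: List[Event],
    follows: Set[Tuple[str, str]],  # (follower, followed) pairs; hypothetical input
    min_followers: int = 3,
) -> Set[Tuple[str, str]]:
    """Return (opinion_maker, resource) pairs imitated by more than min_followers followers.

    A follower imitates a maker when they express the same opinion on the same
    resource at a strictly later time, mirroring the FILTER on cs:timestamp.
    """
    imitators: Dict[Tuple[str, str], Set[str]] = defaultdict(set)
    for maker, opinion, resource, t_maker in window:
        if opinion == "sd:accesses":  # excluded by the FILTER in the query
            continue
        for follower, opinion2, resource2, t_follower in window:
            if (
                opinion2 == opinion
                and resource2 == resource
                and (follower, maker) in follows
                and t_follower > t_maker
            ):
                imitators[(maker, resource)].add(follower)
    # HAVING ( COUNT(DISTINCT ?follower) > 3 )
    return {key for key, fs in imitators.items() if len(fs) > min_followers}


if __name__ == "__main__":
    window = [("alice", "sd:likes", "Twilight", 0.0)] + [
        (f"f{i}", "sd:likes", "Twilight", float(i)) for i in range(1, 5)
    ]
    follows = {(f"f{i}", "alice") for i in range(1, 5)}
    print(opinion_makers(window, follows))
```

The nested loop is quadratic in the window size, which is acceptable for a sketch but is exactly the kind of cost the algebraic optimizations discussed on the following slides aim to reduce.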
  • 20. Achievements Efficiency of Evaluation 1/3 [IEEE-IS2010] •  Evaluation of Window-based Selection
  • 21. Achievements Efficiency of Evaluation 2/3 [EDBT2010] •  Several transformations can be applied to the algebraic representation of C-SPARQL •  some recall well-known results from classical relational optimization –  push of FILTERs and projections •  some are more specific to the domain of streams –  push of aggregates
  • 22. Achievements Efficiency of Evaluation 3/3 [EDBT2010] •  Push of filters and projections [Figure: evaluation time (ms, 0 to 125) versus window size (10 to 100000) for four settings: None, Static Only, Streaming Only, Both]
  • 23. Achievements Example of C-SPARQL and Reasoning 1/2 What impact have I been creating with my tweets in the last hour? Is it positive or negative? Let's count them …
      REGISTER QUERY CountPositiveAndNegativeReactions AS
      PREFIX : <http://ex.org/twitterImpactMining#>
      SELECT ?t count(?pos) count(?neg)
      FROM STREAM <http://ex.org/discussions.trdf> [RANGE 30m STEP 30s]
      WHERE {
        ?t a :MonitoredTweet .
        { ?pos :discuss ?t ; :ProduceReaction [ a :PositiveReaction ] . }
        UNION
        { ?neg :discuss ?t ; :ProduceReaction [ a :NegativeReaction ] . }
      }
      GROUP BY ?t
      Background knowledge:
        :discuss a owl:TransitiveProperty .
        :reply rdfs:subPropertyOf :discuss .
        :retweet rdfs:subPropertyOf :discuss .
  • 24. Achievements Example of C-SPARQL and Reasoning 2/2 [Figure: the monitored tweet t1 and the tweets t1-1, t1-2, t1-3, linked by retweet and reply edges together with the inferred discuss edges. Legend: Monitored, Positive, Negative]
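The inference used in this example (`:reply` and `:retweet` are sub-properties of `:discuss`, and `:discuss` is transitive) can be sketched as a small fixpoint computation. This is only an illustration of the two entailment rules, not the reasoner used in the talk. The function name `discuss_closure` is hypothetical, and the chain of tweets in the demo is an assumed reading of the figure, whose exact edges are not recoverable from the transcript.

```python
from typing import Iterable, Set, Tuple

Triple = Tuple[str, str, str]

SUB_PROPERTIES_OF_DISCUSS = {":reply", ":retweet"}  # from the background knowledge


def discuss_closure(triples: Iterable[Triple]) -> Set[Tuple[str, str]]:
    """Return all (x, y) pairs such that 'x :discuss y' is entailed."""
    # rdfs:subPropertyOf: every :reply or :retweet edge is also a :discuss edge.
    discuss = {
        (s, o)
        for s, p, o in triples
        if p == ":discuss" or p in SUB_PROPERTIES_OF_DISCUSS
    }
    # owl:TransitiveProperty: add (a, d) whenever (a, b) and (b, d) hold,
    # repeating until no new pair appears.
    changed = True
    while changed:
        changed = False
        for a, b in list(discuss):
            for c, d in list(discuss):
                if b == c and (a, d) not in discuss:
                    discuss.add((a, d))
                    changed = True
    return discuss


if __name__ == "__main__":
    # Assumed chain: t1-1 retweets t1, t1-2 replies to t1-1, t1-3 retweets t1-2.
    edges = [
        ("t1-1", ":retweet", "t1"),
        ("t1-2", ":reply", "t1-1"),
        ("t1-3", ":retweet", "t1-2"),
    ]
    for pair in sorted(discuss_closure(edges)):
        print(pair)
```

With these rules, every tweet in the chain is found to discuss the monitored tweet t1, which is what lets the query count reactions that are several hops away.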
  • 25. Achievements State-of-the-Art Approach [Ceri1994,Volz2005] 1.  Overestimation of deletion: Overestimates deletions by computing all direct consequences of a deletion. 2.  Rederivation: Prunes those estimated deletions for which alternative derivations (via some other facts in the program) exist. 3.  Insertion: Adds the new derivations that are consequences of insertions to extensional predicates.
  • 26. Achievements Our Approach [ESWC2010] 1/2 •  Assumption –  Insertions and deletions are triples respectively entering and exiting the window –  The window size is known •  Therefore –  The time when each triple will expire is known and determined by the window size •  E.g., if the window is 10s long, a triple entering at time t will exit at time t+10s –  Note: all knowledge can be annotated with an expiration time •  i.e., background knowledge is annotated with +∞
  • 27. Achievements Our Approach [ESWC2010] 2/2 •  The algorithm 1.  deletes all triples (asserted or inferred) that have just expired, 2.  computes the entailments derived by the inserts, 3.  annotates each entailed triple with an expiration time, and 4.  eliminates from the current state all copies of derived triples except the one with the highest timestamp. •  Learn more –  http://www.slideshare.net/emanueledellavalle/incremental-reasoning-on-streams-andrich-background-knowledge
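The four steps above can be sketched as follows. This is a simplified illustration under stated assumptions, not the implementation from the ESWC 2010 paper: the names `slide` and `transitive_discuss` are hypothetical, the only rule is transitivity of `:discuss`, and an entailed triple is assumed to expire when the earliest of its premises expires (the slide does not say how the expiration time is chosen). Keeping the state as a dictionary from triple to expiration time covers step 4, since only the highest expiration time per triple is retained.

```python
import math
from typing import Callable, Dict, Iterable, Optional, Tuple

Triple = Tuple[str, str, str]
State = Dict[Triple, float]  # triple -> expiration time (math.inf for background knowledge)
Rule = Callable[[Triple, Triple], Optional[Triple]]


def transitive_discuss(a: Triple, b: Triple) -> Optional[Triple]:
    """Hypothetical rule: (x :discuss y) and (y :discuss z) entail (x :discuss z)."""
    if a[1] == b[1] == ":discuss" and a[2] == b[0]:
        return (a[0], ":discuss", b[2])
    return None


def slide(
    state: State,
    now: float,
    inserts: Iterable[Triple],
    window_s: float,
    rule: Rule = transitive_discuss,
) -> State:
    """Advance the materialization to time `now` and return the new state."""
    # Step 1: drop every triple (asserted or inferred) that has expired.
    new_state = {t: exp for t, exp in state.items() if exp > now}
    # Triples entering the window expire one window length from now.
    agenda = []
    for t in inserts:
        exp = now + window_s
        if new_state.get(t, -math.inf) < exp:
            new_state[t] = exp
            agenda.append(t)
    # Steps 2-4: derive entailments from the inserts, annotate each with an
    # expiration time, and keep only the highest expiration time per triple.
    while agenda:
        fresh = agenda.pop()
        for other in list(new_state):
            for a, b in ((fresh, other), (other, fresh)):
                derived = rule(a, b)
                if derived is None:
                    continue
                exp = min(new_state[a], new_state[b])  # assumption: earliest premise
                if new_state.get(derived, -math.inf) < exp:
                    new_state[derived] = exp
                    agenda.append(derived)
    return new_state


if __name__ == "__main__":
    state: State = {("t1-1", ":discuss", "t1"): math.inf}  # background knowledge
    state = slide(state, now=0, inserts=[("t1-2", ":discuss", "t1-1")], window_s=10)
    state = slide(state, now=5, inserts=[("t1-3", ":discuss", "t1-2")], window_s=10)
    state = slide(state, now=10, inserts=[], window_s=10)
    print(state)
```

Because expiration times are known in advance, deletion reduces to a filter on the state, with no overestimation and rederivation phases as in the state-of-the-art approach on slide 25.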
  • 28. Achievements Comparative Evaluation 1/2 [ESWC2010] •  Hypothesis –  Background knowledge does not change and is fully materialized –  Changes only take place in the window •  An experiment comparing the time required to compute a new materialization using –  Re-computing from scratch (i.e., 1250 ms in our setting) –  State-of-the-art incremental approach [Volz2005] –  Our approach •  Results at increasing % of the materialization changed when the window slides [Figure: time (ms, log scale from 10 to 10000) to compute the new materialization versus % of the materialization changed when the window slides (0% to 20%), for incremental-volz and incremental-stream]
  • 29. Achievements Comparative Evaluation 2/2 •  Comparison of the average time needed to answer a C-SPARQL query using –  a forward reasoner, –  the naive approach of re-computing the materialization, –  our approach
      approach             query (ms)   materialization (ms)
      forward reasoning    5.82         0
      naive approach       1.61         15.91
      incremental-stream   1.61         0.28
  • 30. Retrospective and Conclusions Wrap Up •  RDF Streams –  Notion defined •  C-SPARQL –  Syntax and semantics defined as a SPARQL extension –  Engine designed –  Engine implemented based on the decision to keep stream management and query evaluation separated •  Experiments with C-SPARQL under simple RDF entailment regimes –  window-based selection of C-SPARQL outperforms the standard FILTER-based selection –  having formally defined the C-SPARQL semantics, algebraic optimizations are possible •  Experiments with C-SPARQL under OWL-RL entailment regimes –  efficient incremental updates of deductive closures investigated –  our approach outperforms the state of the art when updates come as a stream
  • 31. Retrospective and Conclusions Achievements vs. Research Challenges •  Relation with data-stream systems –  Notion of RDF stream :-| •  Query languages for semantic streams –  C-SPARQL :-D •  Reasoning on Streams –  Formal representations for stream reasoning •  :-P –  Notions of soundness and completeness •  :-P –  Efficient incremental updates of deductive closures •  ESWC 2010 paper :-) ... but much more work is needed! –  How to combine streams and background knowledge •  ESWC 2010 paper :-| ... but a lot needs to be studied ... •  Dealing with incomplete & noisy data –  :-P •  Distributed and parallel processing –  :-P
  • 32. References
      •  Vision
         [IEEE-IS2009] Emanuele Della Valle, Stefano Ceri, Frank van Harmelen, Dieter Fensel: It's a Streaming World! Reasoning upon Rapidly Changing Information. IEEE Intelligent Systems 24(6): 83-89 (2009)
      •  Continuous SPARQL (C-SPARQL)
         [EDBT2010] Davide Francesco Barbieri, Daniele Braga, Stefano Ceri, Michael Grossniklaus: An Execution Environment for C-SPARQL Queries. EDBT 2010
         [WWW2009] Davide Francesco Barbieri, Daniele Braga, Stefano Ceri, Emanuele Della Valle, Michael Grossniklaus: C-SPARQL: SPARQL for continuous querying. WWW 2009: 1061-1062
         [IJSC2010] Davide Francesco Barbieri, Daniele Braga, Stefano Ceri, Emanuele Della Valle, Michael Grossniklaus: C-SPARQL: a Continuous Query Language for RDF Data Streams. Int. J. Semantic Computing 4(1): 3-25 (2010)
         [IEEE-IS2010] Davide Barbieri, Daniele Braga, Stefano Ceri, Emanuele Della Valle, Yi Huang, Volker Tresp, Achim Rettinger, Hendrik Wermser: Deductive and Inductive Stream Reasoning for Semantic Social Media Analytics. IEEE Intelligent Systems, 30 Aug. 2010
      •  Stream Reasoning
         [ESWC2010] Davide Francesco Barbieri, Daniele Braga, Stefano Ceri, Emanuele Della Valle, Michael Grossniklaus: Incremental Reasoning on Streams and Rich Background Knowledge. In: 7th Extended Semantic Web Conference (ESWC 2010)
      •  Background work
         [Ceri1994] Stefano Ceri, Jennifer Widom: Deriving Incremental Production Rules for Deductive Data. Inf. Syst. 19(6): 467-490 (1994)
         [Volz2005] Raphael Volz, Steffen Staab, Boris Motik: Incrementally Maintaining Materializations of Ontologies Stored in Logic Databases. J. Data Semantics 2: 1-34 (2005)
  • 33. Thank You! Questions? Much More to Come! Keep an eye on http://www.streamreasoning.org For more information visit http://www.larkc.eu/