Digital Enterprise Research Institute                                            www.deri.ie




                                    Semantic Search on
                                Heterogeneous Wiki Systems

                                           Fabrizio Orlandi, Alexandre Passant
                                                      DERI – Galway




     Wikimania 2010
     Gdansk – 10th July 2010
© Copyright 2009 Digital Enterprise Research Institute. All rights reserved.
Interlinking wikis


        Wikis share a large body of common knowledge, but it is spread across
        many different wiki platforms:

            TWiki, DokuWiki, MoinMoin

        Widely used even in the workplace:

            Atlassian Confluence, Trac Wiki, XWiki

     All with different structures, platform dependent, all disconnected...

Many isolated communities of users and their data




           Wikis are also disconnected from other
                    social media websites

   * Source: Pidgin Technologies, www.pidgintech.com
Interlinking wikis


 We propose a new approach based on Linked Data principles to solve such
 issues and to enable semantic search across heterogeneous wiki systems




Wiki Models



      Several semantic models have been implemented and used within
      specific semantic wiki platforms

     e.g.:



                                                                  Semantic MediaWiki

       as well as efforts to create generic ontology models:
       •WikiOnt ontology (DERI)
       •WIF (Wiki Interchange Format) ontology
         (Völkel, Oren - 1st Workshop on Semantic Wikis - 2006)


      But they are all specific to wikis and not open to other social
                                  websites

SIOC
                          Semantically-Interlinked Online Communities



  • A project developed by DERI to semantically describe the content
    and structure of community sites

  • It aims to create new connections between online discussion posts
    and items, forums, blogs... and wikis.

  • In particular the SIOC ontology is not specific to wikis and is widely
    used on the Web

  • Adopted in more than 50 applications, including Drupal 7 and
    Yahoo! SearchMonkey, and deployed on over 400 sites


  http://sioc-project.org


Extending the SIOC ontology




      We decided to extend the SIOC ontology to make it compliant with wikis
      and make wikis interoperable and linkable to other social objects.


      Advantages:
      • Integration with all the existing semantic data
      • Ability to run the same queries to find items on:
             – wikis, forums, blogs, social networking sites, etc.



      First we considered the typical and relevant features of wikis in terms of
      structure and social interactions.



Relevant wiki features

   • Multi-authoring: multiple users edit the same content collaboratively.




   • Categories: hierarchical organization of articles.
   A solution: SKOS vocabulary (W3C recommendation to model hierarchical structures between various
   categories) and the sioct:Category class


   • Social Tagging: non-organized but dynamic organization process.
   The properties sioc:topic (using URIs) and dc:subject (using keywords) can be used to represent tags
   related to a particular wiki page.

   [Diagram: the page http://wiki.../The_Clash points to the topic URI http://wiki.../Punk_rock
   via sioc:topic and to the keyword "Punk rock" via dc:subject; a tag:hasTag edge is also shown.]
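
The two tagging properties above can be illustrated with a minimal Python sketch. It is not the authors' exporter code: the `wiki.example.org` URIs are hypothetical stand-ins for the elided host on the slide, the helper names are invented for illustration, and binding `dc:` to the Dublin Core elements namespace is an assumption.

```python
# Namespace URIs. Binding dc: to the Dublin Core "elements" namespace
# is an assumption; the slide does not show its prefix declarations.
SIOC = "http://rdfs.org/sioc/ns#"
DC = "http://purl.org/dc/elements/1.1/"


def tag_triples(page_uri, topic_uris=(), keywords=()):
    """Describe a page's tags: sioc:topic for URIs, dc:subject for keywords."""
    triples = [(page_uri, SIOC + "topic", ("uri", t)) for t in topic_uris]
    triples += [(page_uri, DC + "subject", ("literal", k)) for k in keywords]
    return triples


def to_ntriples(triples):
    """Serialize the triples as N-Triples, one statement per line."""
    lines = []
    for subject, predicate, (kind, value) in triples:
        if kind == "uri":
            obj = f"<{value}>"
        else:
            escaped = value.replace("\\", "\\\\").replace('"', '\\"')
            obj = f'"{escaped}"'
        lines.append(f"<{subject}> <{predicate}> {obj} .")
    return "\n".join(lines)


# Hypothetical page and tag URIs (example.org replaces the elided host).
page = "http://wiki.example.org/The_Clash"
triples = tag_triples(
    page,
    topic_uris=["http://wiki.example.org/Punk_rock"],
    keywords=["Punk rock"],
)
print(to_ntriples(triples))
```

A URI-valued tag can be followed to further data about the topic, whereas a keyword is only a string; that is the practical difference between the two properties.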




Relevant wiki features

   • Discussions: pages where people can discuss the subject of an article.
   We added a new sioc:has_discussion property, with domain sioc:Item and open range.

   • Backlinks (or "what links here"): internal wiki links pointing to the same wiki article.
   We use the existing sioc:links_to property.

   • Page versioning: each page has an associated page history.
       – We use the sioc:next_version, sioc:previous_version and sioc:latest_version properties.
       – We added two transitive (OWL) properties: sioc:earlier_version and sioc:later_version.
       – We defined sioc:next_version and sioc:previous_version as subproperties of
         sioc:later_version and sioc:earlier_version, respectively.
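
The effect of the transitive version properties can be sketched in a few lines of Python. This is a simplified stand-in for what an OWL reasoner would infer, not the authors' implementation: the revision identifiers and the `later_versions` helper are hypothetical. Because each next-version link is also a later-version link, and later-version is transitive, every revision reachable along the chain counts as a later version.

```python
def later_versions(next_version, start):
    """Return every revision reachable from `start` via next-version links.

    `next_version` maps a revision to its direct successor. Following the
    chain mimics the transitive closure a reasoner would compute.
    """
    found = []
    current = next_version.get(start)
    # Stop at the end of the chain, or if malformed data loops back.
    while current is not None and current != start and current not in found:
        found.append(current)
        current = next_version.get(current)
    return found


# Hypothetical page history: rev1 -> rev2 -> rev3 -> rev4
next_version = {"rev1": "rev2", "rev2": "rev3", "rev3": "rev4"}

print(later_versions(next_version, "rev1"))  # ['rev2', 'rev3', 'rev4']
print(later_versions(next_version, "rev4"))  # []
```

Only the direct links need to be stored; queries such as "all revisions after this one" are then answered by inference rather than by extra data.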




SIOC-MediaWiki Exporter


           An exporter for a popular wiki platform that exposes its data in RDF using our
                                      proposed model.

      A web service, written in PHP, that exports a MediaWiki article as RDF. It is publicly
      available at:
                       http://ws.sioc-project.org/mediawiki/




Browsing the generated data


         RDF data extracted from a wiki page can be browsed with tools such as
                                   the Tabulator.

               To offer a better browsing experience and ease the crawling of
               SIOC exports of MediaWiki instances, the web service
               automatically produces rdfs:seeAlso links between wiki pages,
               following Linked Data practices.

           If the article comes from the English Wikipedia, a link to the
           corresponding DBpedia resource is added automatically (with foaf:primaryTopic).

          An RDF crawler can follow the seeAlso links found in every
          document and keep crawling, so an entire wiki site can be crawled
          starting from a single URI.
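
The crawling idea can be sketched as a breadth-first traversal in Python. This is a minimal illustration under stated assumptions, not the crawler used by the authors: `fetch_see_also` is a hypothetical callback that returns the rdfs:seeAlso targets of a document, and an in-memory dictionary with made-up `wiki.example.org` URIs replaces real HTTP requests and RDF parsing.

```python
from collections import deque


def crawl(start_uri, fetch_see_also, limit=1000):
    """Breadth-first crawl over seeAlso links; returns URIs in visit order.

    `fetch_see_also(uri)` is a caller-supplied function (hypothetical here)
    returning the rdfs:seeAlso targets found in the document at `uri`.
    """
    visited = []
    seen = {start_uri}
    queue = deque([start_uri])
    while queue and len(visited) < limit:
        uri = queue.popleft()
        visited.append(uri)
        for link in fetch_see_also(uri):
            if link not in seen:  # never enqueue the same document twice
                seen.add(link)
                queue.append(link)
    return visited


# Hypothetical link structure standing in for real RDF exports.
SEE_ALSO = {
    "http://wiki.example.org/A": ["http://wiki.example.org/B", "http://wiki.example.org/C"],
    "http://wiki.example.org/B": ["http://wiki.example.org/A", "http://wiki.example.org/D"],
    "http://wiki.example.org/C": [],
    "http://wiki.example.org/D": ["http://wiki.example.org/C"],
}

pages = crawl("http://wiki.example.org/A", lambda uri: SEE_ALSO.get(uri, []))
print(pages)
```

In a real crawler the callback would download each RDF document and extract its seeAlso links; the `seen` set is what keeps mutually linked pages from being fetched repeatedly.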
The DokuSIOC plugin

    • A plugin for DokuWiki that exports RDF data using popular lightweight ontologies
      (originally developed by M. Haschke, a SIOC contributor).

    • We modified and extended this plugin to comply with our proposed
      model and to export all the required wiki features.

    • It takes the metadata stored in the wiki system about pages,
      users, links, etc. and serves it as raw RDF/XML
      (instead of the usual HTML page).

    • Developed in PHP and easy to install on any DokuWiki system.

    • It uses the SIOC PHP API.




Collecting Data


            To evaluate our proposal, we exported and crawled 5 different
                         MediaWiki and DokuWiki instances,
                       collecting more than 1 GB of RDF data,
                           3,000 wiki articles and 700 users.
                 The data was loaded into a triple store (Sesame + OWLIM).
              On top of that, cross-site queries can be run
                            by combining FOAF and SIOC,
        e.g.:

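                     PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
                     PREFIX foaf: <http://xmlns.com/foaf/0.1/>
                     PREFIX sioc: <http://rdfs.org/sioc/ns#>

                     # All content created through any account held by one person
                     SELECT DISTINCT ?content
                     WHERE {
                        <http://example.org/js#me> foaf:account ?account .
                        ?account rdf:type sioc:UserAccount .
                        ?content sioc:has_creator ?account .
                     }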
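
To show why this query works across sites, the sketch below evaluates the same three triple patterns over a small in-memory set of triples in Python. It is an illustration only, not how the triple store evaluates SPARQL: the two wiki hosts, the account URIs and the `content_created_by` helper are all hypothetical, and prefixed names are kept as plain strings for brevity.

```python
def content_created_by(triples, person):
    """Mimic the query: person -> accounts -> content created by them."""
    # ?person foaf:account ?account
    accounts = {o for (s, p, o) in triples if s == person and p == "foaf:account"}
    # ?account rdf:type sioc:UserAccount
    user_accounts = {
        a for a in accounts if (a, "rdf:type", "sioc:UserAccount") in triples
    }
    # ?content sioc:has_creator ?account  (the set gives DISTINCT)
    return sorted(
        {s for (s, p, o) in triples if p == "sioc:has_creator" and o in user_accounts}
    )


# Hypothetical data: one person with accounts on two different wikis.
triples = {
    ("http://example.org/js#me", "foaf:account", "http://wiki-a.example.org/user/js"),
    ("http://example.org/js#me", "foaf:account", "http://wiki-b.example.org/user/jsmith"),
    ("http://wiki-a.example.org/user/js", "rdf:type", "sioc:UserAccount"),
    ("http://wiki-b.example.org/user/jsmith", "rdf:type", "sioc:UserAccount"),
    ("http://wiki-a.example.org/Page1", "sioc:has_creator", "http://wiki-a.example.org/user/js"),
    ("http://wiki-b.example.org/start", "sioc:has_creator", "http://wiki-b.example.org/user/jsmith"),
    ("http://wiki-b.example.org/other", "sioc:has_creator", "http://wiki-b.example.org/user/someone"),
}

print(content_created_by(triples, "http://example.org/js#me"))
```

The FOAF profile is the join point: it ties together accounts that the individual wikis know nothing about, so one query returns pages from both sites.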



Building the application

   • The data acquisition module is a PHP script that:
       – queries the triple store
       – collects and parses the results
       – translates the data into the format (JSON) expected by the visualization layer

   • The visualization layer is built with the Exhibit framework from the
     MIT SIMILE Project:
       – it is a set of JavaScript files, configured directly in the HTML code of
         the page to display
       – it provides faceted browsing
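
The translation step of the data acquisition module can be sketched as follows. The original module is a PHP script whose code is not shown in the slides, so this Python version is only an assumption-laden illustration: it takes results in the standard SPARQL JSON results layout, and emits an `items` list with `label` and `type` fields, which is the general shape of Exhibit's JSON data files. The `WikiContent` type, the `uri` field usage, the label heuristic and the sample URIs are all hypothetical.

```python
import json


def sparql_json_to_exhibit(sparql_results, var="content", item_type="WikiContent"):
    """Convert SPARQL JSON results into an Exhibit-style {"items": [...]} dict.

    `var` is the query variable to read; `item_type` is a made-up type name.
    """
    items = []
    for binding in sparql_results["results"]["bindings"]:
        uri = binding[var]["value"]
        # Derive a readable label from the last path segment of the URI.
        label = uri.rstrip("/").rsplit("/", 1)[-1].replace("_", " ")
        items.append({"label": label, "type": item_type, "uri": uri})
    return {"items": items}


# Hypothetical query results in the SPARQL JSON results format.
sample = {
    "head": {"vars": ["content"]},
    "results": {
        "bindings": [
            {"content": {"type": "uri", "value": "http://wiki.example.org/The_Clash"}},
            {"content": {"type": "uri", "value": "http://wiki.example.org/Punk_rock"}},
        ]
    },
}

exhibit_data = sparql_json_to_exhibit(sample)
print(json.dumps(exhibit_data, indent=2))
```

Keeping this step as a pure function from query results to a JSON structure makes it easy to add further fields (author, wiki, date) that the faceted browser can then filter on.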



Conclusions

     • We presented how the SIOC ontology and lightweight semantics can be
       used and extended to represent the structure of wikis,

     • and how to interlink wikis with other online communities.

     • We demonstrated the overall benefit of applying Semantic Web technologies
       to wikis:
            – enabling end users to access the generated information in a
              simple and transparent way,
            – offering capabilities that cannot be obtained with traditional
              Web 2.0 tools.








                                        Thank you!

                                        Any questions?




                                            21 of 21

Contenu connexe

Tendances

Web2.0 Applications
Web2.0 ApplicationsWeb2.0 Applications
Web2.0 Applicationsdomenico79
 
Breaking Down Walls in Enterprise with Social Semantics
Breaking Down Walls in Enterprise with Social SemanticsBreaking Down Walls in Enterprise with Social Semantics
Breaking Down Walls in Enterprise with Social SemanticsJohn Breslin
 
A Survey of the Landscape and State-of-Art in Semantic Wiki
A Survey of the Landscape and State-of-Art in Semantic WikiA Survey of the Landscape and State-of-Art in Semantic Wiki
A Survey of the Landscape and State-of-Art in Semantic WikiMax Völkel
 
Law Libraries And Emerging Technologies
Law Libraries And Emerging TechnologiesLaw Libraries And Emerging Technologies
Law Libraries And Emerging TechnologiesSaskia Mehlhorn
 
Data Accessibility and Me: Introducing SIOC, FOAF and the Linked Data Web
Data Accessibility and Me: Introducing SIOC, FOAF and the Linked Data WebData Accessibility and Me: Introducing SIOC, FOAF and the Linked Data Web
Data Accessibility and Me: Introducing SIOC, FOAF and the Linked Data WebJohn Breslin
 
Linking In-Game Events and Entities to Social Data on the Web
Linking In-Game Events and Entities to Social Data on the WebLinking In-Game Events and Entities to Social Data on the Web
Linking In-Game Events and Entities to Social Data on the WebJohn Breslin
 
Web 2.0 in Libraries
Web 2.0 in LibrariesWeb 2.0 in Libraries
Web 2.0 in LibrariesAnupama Saini
 
Connotea and CiteUlike | Milk Group | 3 EHAIL 2010 | Carlos Lopes_ppt
Connotea and CiteUlike | Milk Group | 3 EHAIL 2010 | Carlos Lopes_pptConnotea and CiteUlike | Milk Group | 3 EHAIL 2010 | Carlos Lopes_ppt
Connotea and CiteUlike | Milk Group | 3 EHAIL 2010 | Carlos Lopes_pptCarlos Lopes
 
SIOC: Semantic Web for Social Media Sites
SIOC: Semantic Web for Social Media SitesSIOC: Semantic Web for Social Media Sites
SIOC: Semantic Web for Social Media SitesUldis Bojars
 
Aswc2009 Smw Tutorial Part 1 Intro And Examples
Aswc2009 Smw Tutorial Part 1 Intro And ExamplesAswc2009 Smw Tutorial Part 1 Intro And Examples
Aswc2009 Smw Tutorial Part 1 Intro And ExamplesJesse Wang
 
Web 2.0 lib_2.0_1
Web 2.0 lib_2.0_1Web 2.0 lib_2.0_1
Web 2.0 lib_2.0_1smtcd
 
Wikis[1]
Wikis[1]Wikis[1]
Wikis[1]experts
 

Tendances (15)

Web 2.0 and the LMS
Web 2.0 and the LMSWeb 2.0 and the LMS
Web 2.0 and the LMS
 
Web2.0 Applications
Web2.0 ApplicationsWeb2.0 Applications
Web2.0 Applications
 
Breaking Down Walls in Enterprise with Social Semantics
Breaking Down Walls in Enterprise with Social SemanticsBreaking Down Walls in Enterprise with Social Semantics
Breaking Down Walls in Enterprise with Social Semantics
 
A Survey of the Landscape and State-of-Art in Semantic Wiki
A Survey of the Landscape and State-of-Art in Semantic WikiA Survey of the Landscape and State-of-Art in Semantic Wiki
A Survey of the Landscape and State-of-Art in Semantic Wiki
 
Law Libraries And Emerging Technologies
Law Libraries And Emerging TechnologiesLaw Libraries And Emerging Technologies
Law Libraries And Emerging Technologies
 
Data Accessibility and Me: Introducing SIOC, FOAF and the Linked Data Web
Data Accessibility and Me: Introducing SIOC, FOAF and the Linked Data WebData Accessibility and Me: Introducing SIOC, FOAF and the Linked Data Web
Data Accessibility and Me: Introducing SIOC, FOAF and the Linked Data Web
 
Linking In-Game Events and Entities to Social Data on the Web
Linking In-Game Events and Entities to Social Data on the WebLinking In-Game Events and Entities to Social Data on the Web
Linking In-Game Events and Entities to Social Data on the Web
 
Web 2.0 in Libraries
Web 2.0 in LibrariesWeb 2.0 in Libraries
Web 2.0 in Libraries
 
Web 2.0
Web 2.0Web 2.0
Web 2.0
 
SOCIAL TECNHOLOGIES
SOCIAL TECNHOLOGIESSOCIAL TECNHOLOGIES
SOCIAL TECNHOLOGIES
 
Connotea and CiteUlike | Milk Group | 3 EHAIL 2010 | Carlos Lopes_ppt
Connotea and CiteUlike | Milk Group | 3 EHAIL 2010 | Carlos Lopes_pptConnotea and CiteUlike | Milk Group | 3 EHAIL 2010 | Carlos Lopes_ppt
Connotea and CiteUlike | Milk Group | 3 EHAIL 2010 | Carlos Lopes_ppt
 
SIOC: Semantic Web for Social Media Sites
SIOC: Semantic Web for Social Media SitesSIOC: Semantic Web for Social Media Sites
SIOC: Semantic Web for Social Media Sites
 
Aswc2009 Smw Tutorial Part 1 Intro And Examples
Aswc2009 Smw Tutorial Part 1 Intro And ExamplesAswc2009 Smw Tutorial Part 1 Intro And Examples
Aswc2009 Smw Tutorial Part 1 Intro And Examples
 
Web 2.0 lib_2.0_1
Web 2.0 lib_2.0_1Web 2.0 lib_2.0_1
Web 2.0 lib_2.0_1
 
Wikis[1]
Wikis[1]Wikis[1]
Wikis[1]
 

En vedette

Практическое применение HTML5 в Я.Почте
Практическое применение HTML5 в Я.ПочтеПрактическое применение HTML5 в Я.Почте
Практическое применение HTML5 в Я.ПочтеAlexey Androsov
 
Semantic Search on Heterogeneous Wiki Systems - wikisym2010
Semantic Search on Heterogeneous Wiki Systems - wikisym2010Semantic Search on Heterogeneous Wiki Systems - wikisym2010
Semantic Search on Heterogeneous Wiki Systems - wikisym2010Fabrizio Orlandi
 
Multi-Source Provenance-Aware User Interest Profiling on the Social Semantic Web
Multi-Source Provenance-Aware User Interest Profiling on the Social Semantic WebMulti-Source Provenance-Aware User Interest Profiling on the Social Semantic Web
Multi-Source Provenance-Aware User Interest Profiling on the Social Semantic WebFabrizio Orlandi
 
Profiling User Interests on the Social Semantic Web
Profiling User Interests on the Social Semantic WebProfiling User Interests on the Social Semantic Web
Profiling User Interests on the Social Semantic WebFabrizio Orlandi
 
32 Ways a Digital Marketing Consultant Can Help Grow Your Business
32 Ways a Digital Marketing Consultant Can Help Grow Your Business32 Ways a Digital Marketing Consultant Can Help Grow Your Business
32 Ways a Digital Marketing Consultant Can Help Grow Your BusinessBarry Feldman
 

En vedette (7)

Практическое применение HTML5 в Я.Почте
Практическое применение HTML5 в Я.ПочтеПрактическое применение HTML5 в Я.Почте
Практическое применение HTML5 в Я.Почте
 
Vipin Sahni
Vipin Sahni Vipin Sahni
Vipin Sahni
 
Semantic Search on Heterogeneous Wiki Systems - wikisym2010
Semantic Search on Heterogeneous Wiki Systems - wikisym2010Semantic Search on Heterogeneous Wiki Systems - wikisym2010
Semantic Search on Heterogeneous Wiki Systems - wikisym2010
 
Multi-Source Provenance-Aware User Interest Profiling on the Social Semantic Web
Multi-Source Provenance-Aware User Interest Profiling on the Social Semantic WebMulti-Source Provenance-Aware User Interest Profiling on the Social Semantic Web
Multi-Source Provenance-Aware User Interest Profiling on the Social Semantic Web
 
Profiling User Interests on the Social Semantic Web
Profiling User Interests on the Social Semantic WebProfiling User Interests on the Social Semantic Web
Profiling User Interests on the Social Semantic Web
 
Ai Presentacion
Ai PresentacionAi Presentacion
Ai Presentacion
 
32 Ways a Digital Marketing Consultant Can Help Grow Your Business
32 Ways a Digital Marketing Consultant Can Help Grow Your Business32 Ways a Digital Marketing Consultant Can Help Grow Your Business
32 Ways a Digital Marketing Consultant Can Help Grow Your Business
 

Similaire à Semantic search on heterogeneous wiki systems - Wikimania 2010

Enabling cross-wikis integration by extending the SIOC ontology
Enabling cross-wikis integration by extending the SIOC ontologyEnabling cross-wikis integration by extending the SIOC ontology
Enabling cross-wikis integration by extending the SIOC ontologyFabrizio Orlandi
 
Semantic Wiki: Social Semantic Web In Action:
Semantic Wiki: Social Semantic Web In Action: Semantic Wiki: Social Semantic Web In Action:
Semantic Wiki: Social Semantic Web In Action: Jesse Wang
 
Semantic Wiki: Social Semantic Web in Use
Semantic Wiki: Social Semantic Web in UseSemantic Wiki: Social Semantic Web in Use
Semantic Wiki: Social Semantic Web in UseJesse Wang
 
Msra talk smw+apps
Msra talk smw+appsMsra talk smw+apps
Msra talk smw+appsJesse Wang
 
Jist tutorial semantic wikis and applications
Jist tutorial   semantic wikis and applicationsJist tutorial   semantic wikis and applications
Jist tutorial semantic wikis and applicationsJesse Wang
 
Pre-SMWCon Spring 2012 meetup (short)
Pre-SMWCon Spring 2012 meetup (short)Pre-SMWCon Spring 2012 meetup (short)
Pre-SMWCon Spring 2012 meetup (short)Jesse Wang
 
Semantic Wikis - Social Semantic Web in Action
Semantic Wikis - Social Semantic Web in ActionSemantic Wikis - Social Semantic Web in Action
Semantic Wikis - Social Semantic Web in ActionJesse Wang
 
Semantic Tagging for the XWiki Platform with Zemanta and DBpedia
Semantic Tagging for the XWiki Platform with Zemanta and DBpediaSemantic Tagging for the XWiki Platform with Zemanta and DBpedia
Semantic Tagging for the XWiki Platform with Zemanta and DBpediaElena-Oana Tabaranu
 
Interlinking Online Communities and Enriching Social Software with the Semant...
Interlinking Online Communities and Enriching Social Software with the Semant...Interlinking Online Communities and Enriching Social Software with the Semant...
Interlinking Online Communities and Enriching Social Software with the Semant...John Breslin
 
7 things you should know about wikis
7 things you should know about wikis7 things you should know about wikis
7 things you should know about wikisAykut Özmen
 
How To Use Wikis For Business
How To Use Wikis For BusinessHow To Use Wikis For Business
How To Use Wikis For Businessarnoldn
 
Exploring Article Networks on Wikipedia with NodeXL
Exploring Article Networks on Wikipedia with NodeXLExploring Article Networks on Wikipedia with NodeXL
Exploring Article Networks on Wikipedia with NodeXLShalin Hai-Jew
 
Chapter6 McHaney 2nd edition
Chapter6 McHaney 2nd editionChapter6 McHaney 2nd edition
Chapter6 McHaney 2nd editionRoger McHaney
 
Enhancing the Web Experience
Enhancing the Web ExperienceEnhancing the Web Experience
Enhancing the Web ExperienceJohn Breslin
 

Similaire à Semantic search on heterogeneous wiki systems - Wikimania 2010 (20)

Enabling cross-wikis integration by extending the SIOC ontology
Enabling cross-wikis integration by extending the SIOC ontologyEnabling cross-wikis integration by extending the SIOC ontology
Enabling cross-wikis integration by extending the SIOC ontology
 
Semantic Wiki: Social Semantic Web In Action:
Semantic Wiki: Social Semantic Web In Action: Semantic Wiki: Social Semantic Web In Action:
Semantic Wiki: Social Semantic Web In Action:
 
Semantic Wiki: Social Semantic Web in Use
Semantic Wiki: Social Semantic Web in UseSemantic Wiki: Social Semantic Web in Use
Semantic Wiki: Social Semantic Web in Use
 
Msra talk smw+apps
Msra talk smw+appsMsra talk smw+apps
Msra talk smw+apps
 
Jist tutorial semantic wikis and applications
Jist tutorial   semantic wikis and applicationsJist tutorial   semantic wikis and applications
Jist tutorial semantic wikis and applications
 
Pre-SMWCon Spring 2012 meetup (short)
Pre-SMWCon Spring 2012 meetup (short)Pre-SMWCon Spring 2012 meetup (short)
Pre-SMWCon Spring 2012 meetup (short)
 
Are you wiki?
Are you wiki?Are you wiki?
Are you wiki?
 
Semantic Wikis - Social Semantic Web in Action
Semantic Wikis - Social Semantic Web in ActionSemantic Wikis - Social Semantic Web in Action
Semantic Wikis - Social Semantic Web in Action
 
Wiki on Library Perspective
Wiki on Library PerspectiveWiki on Library Perspective
Wiki on Library Perspective
 
The Social Web
The Social WebThe Social Web
The Social Web
 
The SIOC Project
The SIOC ProjectThe SIOC Project
The SIOC Project
 
Chapter6 McHaney
Chapter6 McHaneyChapter6 McHaney
Chapter6 McHaney
 
Semantic Tagging for the XWiki Platform with Zemanta and DBpedia
Semantic Tagging for the XWiki Platform with Zemanta and DBpediaSemantic Tagging for the XWiki Platform with Zemanta and DBpedia
Semantic Tagging for the XWiki Platform with Zemanta and DBpedia
 
Interlinking Online Communities and Enriching Social Software with the Semant...
Interlinking Online Communities and Enriching Social Software with the Semant...Interlinking Online Communities and Enriching Social Software with the Semant...
Interlinking Online Communities and Enriching Social Software with the Semant...
 
7 things you should know about wikis
7 things you should know about wikis7 things you should know about wikis
7 things you should know about wikis
 
Wikis biblio
Wikis biblioWikis biblio
Wikis biblio
 
How To Use Wikis For Business
How To Use Wikis For BusinessHow To Use Wikis For Business
How To Use Wikis For Business
 
Exploring Article Networks on Wikipedia with NodeXL
Exploring Article Networks on Wikipedia with NodeXLExploring Article Networks on Wikipedia with NodeXL
Exploring Article Networks on Wikipedia with NodeXL
 
Chapter6 McHaney 2nd edition
Chapter6 McHaney 2nd editionChapter6 McHaney 2nd edition
Chapter6 McHaney 2nd edition
 
Enhancing the Web Experience
Enhancing the Web ExperienceEnhancing the Web Experience
Enhancing the Web Experience
 

Plus de Fabrizio Orlandi

Beyond 2022 project presentation 2021
Beyond 2022 project presentation 2021Beyond 2022 project presentation 2021
Beyond 2022 project presentation 2021Fabrizio Orlandi
 
Benchmarking RDF Metadata Representations: Reification, Singleton Property an...
Benchmarking RDF Metadata Representations: Reification, Singleton Property an...Benchmarking RDF Metadata Representations: Reification, Singleton Property an...
Benchmarking RDF Metadata Representations: Reification, Singleton Property an...Fabrizio Orlandi
 
Modelling context and statement-level metadata in knowledge graphs
Modelling context and statement-level metadata in knowledge graphsModelling context and statement-level metadata in knowledge graphs
Modelling context and statement-level metadata in knowledge graphsFabrizio Orlandi
 
iRap - Interest based RDF update propagation
iRap - Interest based RDF update propagationiRap - Interest based RDF update propagation
iRap - Interest based RDF update propagationFabrizio Orlandi
 
Web Intelligence 2013 - Characterizing concepts of interest leveraging Linked...
Web Intelligence 2013 - Characterizing concepts of interest leveraging Linked...Web Intelligence 2013 - Characterizing concepts of interest leveraging Linked...
Web Intelligence 2013 - Characterizing concepts of interest leveraging Linked...Fabrizio Orlandi
 
Semantic user profiling and Personalised filtering of the Twitter stream
Semantic user profiling and Personalised filtering of the Twitter streamSemantic user profiling and Personalised filtering of the Twitter stream
Semantic user profiling and Personalised filtering of the Twitter streamFabrizio Orlandi
 
Semantic Representation of Provenance in Wikipedia
Semantic Representation of Provenance in WikipediaSemantic Representation of Provenance in Wikipedia
Semantic Representation of Provenance in WikipediaFabrizio Orlandi
 
Semantic Search on Heterogeneous Wiki Systems - poster
Semantic Search on Heterogeneous Wiki Systems - posterSemantic Search on Heterogeneous Wiki Systems - poster
Semantic Search on Heterogeneous Wiki Systems - posterFabrizio Orlandi
 

Plus de Fabrizio Orlandi (8)

Beyond 2022 project presentation 2021
Beyond 2022 project presentation 2021Beyond 2022 project presentation 2021
Beyond 2022 project presentation 2021
 
Benchmarking RDF Metadata Representations: Reification, Singleton Property an...
Benchmarking RDF Metadata Representations: Reification, Singleton Property an...Benchmarking RDF Metadata Representations: Reification, Singleton Property an...
Benchmarking RDF Metadata Representations: Reification, Singleton Property an...
 
Modelling context and statement-level metadata in knowledge graphs
Modelling context and statement-level metadata in knowledge graphsModelling context and statement-level metadata in knowledge graphs
Modelling context and statement-level metadata in knowledge graphs
 
iRap - Interest based RDF update propagation
iRap - Interest based RDF update propagationiRap - Interest based RDF update propagation
iRap - Interest based RDF update propagation
 
Web Intelligence 2013 - Characterizing concepts of interest leveraging Linked...
Web Intelligence 2013 - Characterizing concepts of interest leveraging Linked...Web Intelligence 2013 - Characterizing concepts of interest leveraging Linked...
Web Intelligence 2013 - Characterizing concepts of interest leveraging Linked...
 
Semantic user profiling and Personalised filtering of the Twitter stream
Semantic user profiling and Personalised filtering of the Twitter streamSemantic user profiling and Personalised filtering of the Twitter stream
Semantic user profiling and Personalised filtering of the Twitter stream
 
Semantic Representation of Provenance in Wikipedia
Semantic Representation of Provenance in WikipediaSemantic Representation of Provenance in Wikipedia
Semantic Representation of Provenance in Wikipedia
 
Semantic Search on Heterogeneous Wiki Systems - poster
Semantic Search on Heterogeneous Wiki Systems - posterSemantic Search on Heterogeneous Wiki Systems - poster
Semantic Search on Heterogeneous Wiki Systems - poster
 

Semantic search on heterogeneous wiki systems - Wikimania 2010

  • 1. Digital Enterprise Research Institute www.deri.ie Semantic Search on Heterogeneous Wiki Systems Fabrizio Orlandi, Alexandre Passant DERI – Galway Wikimania 2010 Gdansk – 10th July 2010 © Copyright 2009 Digital Enterprise Research Institute. All rights reserved.
  • 2. Interlinking wikis Digital Enterprise Research Institute www.deri.ie All wikis share a wide common knowledge, within many different wiki platforms: TWiki DokuWiki MoinMoin Widely used even in the workplace... Atlassian Trac XWiki Confluence Wiki All with different structures, platform dependent, all disconnected... 2 of 21
  • 3. Many isolated communities of users and their data Digital Enterprise Research Institute www.deri.ie Wikis are also disconnected with other social media websites * Source: Pidgin Technologies, www.pidgintech.com
  • 4. Interlinking wikis Digital Enterprise Research Institute www.deri.ie We propose a new approach based on Linked Data principles to solve such issues and to enable semantic search across heterogeneous wiki systems 4 of 21
  • 5. Wiki Models Digital Enterprise Research Institute www.deri.ie Several semantic models have been implemented and used within specific semantic wiki platforms e.g.: Semantic MediaWiki as well as efforts to create generic ontology models: •WikiOnt ontology (DERI) •WIF (Wiki Interchange Format) ontology (Völkel, Oren - 1st Workshop on Semantic Wikis - 2006) But they are all specific to wikis and not open to other social websites 5 of 21
  • 6. SIOC Semantically-Interlinked Online Communities Digital Enterprise Research Institute www.deri.ie • A project developed by DERI to semantically describe the content and structure of community sites • It aims to create new connections between online discussion posts and items, forums, blogs... and wikis. • In particular the SIOC ontology is not specific to wikis and is widely used on the Web • Adopted in a framework of more than 50 applications, deployed on over 400 sites including Drupal 7 and Yahoo! SearchMonkey http://sioc-project.org 6 of 21
  • 7. Extending the SIOC ontology Digital Enterprise Research Institute www.deri.ie We decided to extend the SIOC ontology to make it compliant with wikis and make wikis interoperable and linkable to other social objects. Advantages: • Integration with all the existing semantic data • Ability to run the same queries to find items on: – wikis, forums, blogs, social neworking sites, etc. First we considered the typical and relevant features of wikis in terms of structure and social interactions. 7 of 21
  • 8. Relevant wiki features: Multi-authoring: multiple users edit the same content collaboratively. Categories: hierarchical organization of articles; a solution is the SKOS vocabulary (a W3C Recommendation for modelling hierarchical structures between categories) together with the sioct:Category class. Social tagging: a non-organized but dynamic organization process; the properties sioc:topic (using URIs) and dc:subject (using keywords) can be used to represent the tags related to a particular wiki page. In the slide's example, http://wiki.../The_Clash is linked to http://wiki.../Punk_rock via sioc:topic, and the labels dc:subject and tag:hasTag point to the keyword "Punk rock".
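The tagging model on slide 8 can be illustrated with a minimal Python sketch that emits N-Triples lines for one wiki page. The `tag_triples` helper, the `example.org` URIs and the binding of the `dc:` prefix to the Dublin Core Elements 1.1 namespace are assumptions made for illustration; they are not taken from the exporters described in the talk.

```python
SIOC = "http://rdfs.org/sioc/ns#"
# Assumption: the slides' dc: prefix refers to Dublin Core Elements 1.1.
DC = "http://purl.org/dc/elements/1.1/"


def tag_triples(page_uri, topic_uris, keywords):
    """Return N-Triples lines describing the tags of one wiki page.

    URI-based tags become sioc:topic statements; plain keyword tags
    become dc:subject statements with a literal object.
    """
    lines = []
    for topic in topic_uris:
        lines.append(f"<{page_uri}> <{SIOC}topic> <{topic}> .")
    for keyword in keywords:
        # Escape backslashes and double quotes inside the literal.
        escaped = keyword.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'<{page_uri}> <{DC}subject> "{escaped}" .')
    return lines


if __name__ == "__main__":
    # Hypothetical page and tag URIs, loosely modelled on the slide's example.
    for line in tag_triples(
        "http://example.org/wiki/The_Clash",
        ["http://example.org/wiki/Punk_rock"],
        ["Punk rock"],
    ):
        print(line)
```

Keeping both forms side by side reflects the slide's distinction: a URI tag can be followed and shared across sites, whereas a keyword tag is only a string.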
  • 9. Relevant wiki features: Discussions: pages where people can discuss the subject of the article; we added a new sioc:has_discussion property, with domain sioc:Item and open range. Backlinks (or "what links here"): internal wiki links pointing to the same wiki article; we use the existing sioc:links_to property. Page versioning: each page has an associated page history. We use the sioc:next_version, sioc:previous_version and sioc:latest_version properties; we added two transitive (OWL) properties, sioc:earlier_version and sioc:later_version; and we defined sioc:next_version and sioc:previous_version as subproperties of sioc:later_version and sioc:earlier_version respectively.
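To make the versioning properties on slide 9 concrete, the following Python sketch derives the later versions of a revision from a chain of next-version links. This is roughly what a reasoner could infer once sioc:next_version is a subproperty of the transitive sioc:later_version. The revision identifiers and the `later_versions` helper are hypothetical.

```python
# Hypothetical revision chain: each entry maps a revision to its
# immediate successor, mirroring sioc:next_version links.
NEXT_VERSION = {
    "rev1": "rev2",
    "rev2": "rev3",
    "rev3": "rev4",
}


def later_versions(revision, next_version=NEXT_VERSION):
    """Return every revision reachable by following next-version links.

    Because next_version is treated as a subproperty of a transitive
    later_version property, all successors count as later versions,
    not only the immediate one.
    """
    result = []
    seen = {revision}
    current = next_version.get(revision)
    # Walk the chain; the 'seen' set guards against accidental cycles.
    while current is not None and current not in seen:
        result.append(current)
        seen.add(current)
        current = next_version.get(current)
    return result


if __name__ == "__main__":
    print(later_versions("rev2"))
```

The same walk over previous-version links would yield the earlier versions, so a single stored link per revision is enough to answer both kinds of question.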
  • 10. SIOC-MediaWiki Exporter: An exporter that exposes data from a popular wiki platform in RDF, using our proposed model. It is a web service, written in PHP, that exports a MediaWiki article in RDF and is publicly available at: http://ws.sioc-project.org/mediawiki/
  • 12. Browsing the generated data: RDF data extracted from a wiki page can be browsed with tools such as the Tabulator. To offer a better browsing experience and to ease the crawling of SIOC exports of MediaWiki instances, the web service automatically produces rdfs:seeAlso links between wiki pages, following Linked Data practices. If the article comes from the English Wikipedia, a link to the corresponding DBpedia resource is added automatically (with foaf:primaryTopic). An RDF crawler can follow the seeAlso links found in every document and continue to crawl, so an entire wiki site can be crawled starting from a single URI.
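The crawling idea on slide 12 can be sketched as a breadth-first traversal over rdfs:seeAlso links. In the Python sketch below, the `SEE_ALSO` dictionary is a hypothetical in-memory stand-in for the documents a real crawler would fetch and parse; the URIs and the `crawl` function are assumptions for illustration only.

```python
from collections import deque

# Hypothetical stand-in for fetched RDF documents: each key is a document
# URI, each value lists the rdfs:seeAlso targets found in that document.
SEE_ALSO = {
    "http://example.org/wiki/A": [
        "http://example.org/wiki/B",
        "http://example.org/wiki/C",
    ],
    "http://example.org/wiki/B": [
        "http://example.org/wiki/A",
        "http://example.org/wiki/D",
    ],
    "http://example.org/wiki/C": [],
    "http://example.org/wiki/D": ["http://example.org/wiki/B"],
}


def crawl(start_uri, see_also=SEE_ALSO):
    """Visit every document reachable from start_uri via seeAlso links.

    Returns the URIs in breadth-first visiting order; each URI is
    visited once, even when several documents link to it.
    """
    visited = []
    seen = {start_uri}
    queue = deque([start_uri])
    while queue:
        uri = queue.popleft()
        visited.append(uri)
        for target in see_also.get(uri, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return visited


if __name__ == "__main__":
    for uri in crawl("http://example.org/wiki/A"):
        print(uri)
```

Tracking the set of already-seen URIs matters here because wiki pages commonly link back to each other, and a crawler without it would loop indefinitely.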
  • 14. The DokuSIOC plugin: A plugin for DokuWiki that exports RDF data using popular lightweight ontologies (originally developed by M. Haschke, a SIOC contributor). We modified and extended this plugin to make it compliant with our proposed model and to export all the required wiki features. It takes information from the metadata stored in the wiki system about pages, users, links, etc. and provides it as raw RDF/XML serialized data (instead of the usual HTML page). It is developed in PHP, is easy to install in any DokuWiki system, and uses the SIOC PHP API.
  • 15. The DokuSIOC plugin
  • 16. Collecting Data: To evaluate our proposal, we exported and crawled 5 different MediaWiki and DokuWiki instances, collecting more than 1 GB of RDF data, 3000 wiki articles and 700 users. The data was loaded into a triple store (Sesame + OWLIM). On top of that, it is possible to run cross-site queries by combining FOAF and SIOC, e.g.: PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> PREFIX foaf: <http://xmlns.com/foaf/0.1/> PREFIX sioc: <http://rdfs.org/sioc/ns#> SELECT DISTINCT ?content WHERE { <http://example.org/js#me> foaf:account ?account . ?account rdf:type sioc:UserAccount . ?content sioc:has_creator ?account . }
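The logic of the cross-site query on slide 16 can be traced with a small Python sketch that matches the same three triple patterns against an in-memory set of triples. The data, the site URIs and the `content_created_by` helper are hypothetical, and a real deployment would send the SPARQL query to the triple store instead.

```python
ME = "http://example.org/js#me"

# Hypothetical triples gathered from two different wiki sites.
TRIPLES = {
    (ME, "foaf:account", "http://wiki-a.example.org/user/js"),
    (ME, "foaf:account", "http://wiki-b.example.org/user/jsmith"),
    ("http://wiki-a.example.org/user/js", "rdf:type", "sioc:UserAccount"),
    ("http://wiki-b.example.org/user/jsmith", "rdf:type", "sioc:UserAccount"),
    ("http://wiki-a.example.org/page/1", "sioc:has_creator",
     "http://wiki-a.example.org/user/js"),
    ("http://wiki-b.example.org/page/7", "sioc:has_creator",
     "http://wiki-b.example.org/user/jsmith"),
    ("http://wiki-b.example.org/page/8", "sioc:has_creator",
     "http://wiki-b.example.org/user/other"),
}


def content_created_by(person, triples=TRIPLES):
    """Return the sorted content URIs created by any account of person."""
    # Pattern 1: <person> foaf:account ?account
    accounts = {o for s, p, o in triples if s == person and p == "foaf:account"}
    # Pattern 2: ?account rdf:type sioc:UserAccount
    accounts = {a for a in accounts
                if (a, "rdf:type", "sioc:UserAccount") in triples}
    # Pattern 3: ?content sioc:has_creator ?account
    return sorted({s for s, p, o in triples
                   if p == "sioc:has_creator" and o in accounts})


if __name__ == "__main__":
    for uri in content_created_by(ME):
        print(uri)
```

The FOAF link from a person to several accounts is what makes the query cross-site: content from both wikis is returned even though the account names differ.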
  • 18. Building the application: The data acquisition module is a PHP script that queries the triple store, collects and parses the results, and translates the data into the format (JSON) expected by the visualization layer. The visualization layer has been built with the Exhibit framework from the MIT SIMILE Project: a set of JavaScript files, configured directly in the HTML code of the page to display, that provides faceted browsing capabilities.
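The translation step on slide 18 could look roughly like the following. The talk's module is a PHP script; this Python sketch is only an illustration and not the authors' code. It assumes the triple store returns results in the standard SPARQL JSON results layout and that the visualization layer accepts an object with an `items` list whose entries carry `label` and `type` fields. The `to_exhibit` function, the `WikiPage` type name and the `url` field are hypothetical.

```python
import json


def to_exhibit(sparql_results, variable="content", item_type="WikiPage"):
    """Convert SPARQL JSON results into an Exhibit-style JSON string.

    Each binding of the chosen variable becomes one item; the label is
    derived from the last path segment of the URI.
    """
    items = []
    for binding in sparql_results["results"]["bindings"]:
        uri = binding[variable]["value"]
        label = uri.rstrip("/").rsplit("/", 1)[-1].replace("_", " ")
        items.append({"label": label, "type": item_type, "url": uri})
    return json.dumps({"items": items}, indent=2)


if __name__ == "__main__":
    # Hypothetical response in the SPARQL JSON results layout.
    sample = {
        "head": {"vars": ["content"]},
        "results": {
            "bindings": [
                {"content": {"type": "uri",
                             "value": "http://example.org/wiki/The_Clash"}},
                {"content": {"type": "uri",
                             "value": "http://example.org/wiki/Punk_rock"}},
            ]
        },
    }
    print(to_exhibit(sample))
```

Keeping this conversion in a separate step means the visualization layer never needs to know about SPARQL or RDF, which matches the split between data acquisition and visualization described on the slide.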
  • 20. Conclusions: We presented how the SIOC ontology and lightweight semantics can be used and extended to represent the structure of wikis, and how to interlink wikis with other online communities. We demonstrated an overall benefit of applying Semantic Web technologies to wikis: end users can access the generated information in a simple and transparent way, and the approach shows potential that cannot be achieved with traditional Web 2.0 tools.
  • 21. Thank you! Any questions?