2. Interlinking wikis
Digital Enterprise Research Institute www.deri.ie
Wikis hold a large body of shared knowledge, spread across many different
wiki platforms:
TWiki, DokuWiki, MoinMoin, Trac, XWiki, Atlassian Confluence Wiki
Widely used even in the workplace...
All with different structures, platform dependent, all disconnected...
3. Many isolated communities of users and their data
Wikis are also disconnected from other
social media websites
* Source: Pidgin Technologies, www.pidgintech.com
4. Interlinking wikis
We propose a new approach based on Linked Data principles to solve such
issues and to enable semantic search across heterogeneous wiki systems
5. Wiki Models
Several semantic models have been implemented and used within
specific semantic wiki platforms
e.g.:
Semantic MediaWiki
as well as efforts to create generic ontology models:
•WikiOnt ontology (DERI)
•WIF (Wiki Interchange Format) ontology
(Völkel, Oren - 1st Workshop on Semantic Wikis - 2006)
But they are all specific to wikis and not open to other social
websites
6. SIOC
Semantically-Interlinked Online Communities
• A project developed by DERI to semantically describe the content
and structure of community sites
• It aims to create new connections between online discussion posts
and items, forums, blogs... and wikis.
• In particular the SIOC ontology is not specific to wikis and is widely
used on the Web
• Adopted by more than 50 applications, deployed on
over 400 sites,
including Drupal 7 and Yahoo! SearchMonkey
http://sioc-project.org
7. Extending the SIOC ontology
We decided to extend the SIOC ontology to make it compliant with wikis
and make wikis interoperable and linkable to other social objects.
Advantages:
• Integration with all the existing semantic data
• Ability to run the same queries to find items on:
– wikis, forums, blogs, social networking sites, etc.
First we considered the typical and relevant features of wikis in terms of
structure and social interactions.
8. Relevant wiki features
• Multi-authoring: multiple users edit the same content collaboratively.
• Categories: hierarchical organization of articles.
A solution: SKOS vocabulary (W3C recommendation to model hierarchical structures between various
categories) and the sioct:Category class
• Social Tagging: non-organized but dynamic organization process.
The properties sioc:topic (using URIs) and dc:subject (using keywords) can be used to represent tags
related to a particular wiki page.
[Figure: the page http://wiki.../The_Clash, with sioc:topic pointing to http://wiki.../Punk_rock and dc:subject / tag:hasTag pointing to the keyword "Punk rock"]
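As a minimal sketch of the two tagging styles, the Python snippet below emits N-Triples for one wiki page: sioc:topic points to a URI, while dc:subject carries a plain keyword. The helper name tag_triples and the example.org URIs are hypothetical, and it is an assumption that the dc prefix refers to the Dublin Core Elements 1.1 namespace.

```python
SIOC = "http://rdfs.org/sioc/ns#"
DC = "http://purl.org/dc/elements/1.1/"  # assumed meaning of the dc: prefix


def tag_triples(page_uri, topic_uris, keywords):
    """Return N-Triples lines describing the tags of one wiki page."""
    lines = []
    for topic in topic_uris:
        # sioc:topic links the page to a resource identified by a URI
        lines.append(f"<{page_uri}> <{SIOC}topic> <{topic}> .")
    for word in keywords:
        # dc:subject carries the tag as a plain literal keyword
        escaped = word.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'<{page_uri}> <{DC}subject> "{escaped}" .')
    return lines
```

Calling tag_triples("http://example.org/wiki/The_Clash", ["http://example.org/wiki/Punk_rock"], ["Punk rock"]) yields one sioc:topic triple and one dc:subject triple for the same page.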
9. Relevant wiki features
• Discussions: pages where people can discuss the subject of the article.
We added a new sioc:has_discussion property, with domain sioc:Item and open range.
• Backlinks: (or “what links here”) wiki internal links pointing to the same wiki article.
We use the already existing sioc:links_to property.
• Page versioning: each page has an associated page history.
We use the sioc:next_version, sioc:previous_version and sioc:latest_version properties.
We added two transitive (OWL) properties: sioc:earlier_version and sioc:later_version,
and defined sioc:next_version and sioc:previous_version as subproperties of sioc:later_version and sioc:earlier_version, respectively.
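To illustrate what the transitive properties buy, the following Python sketch derives sioc:later_version pairs from direct sioc:next_version links, which is roughly what an OWL reasoner would infer from the subproperty and transitivity axioms. The function name later_versions and the dictionary representation are assumptions made for this example, not part of the SIOC tooling.

```python
def later_versions(next_version):
    """Derive (version, later_version) pairs from direct next_version links.

    next_version maps each page version URI to its direct successor.
    Since next_version is a subproperty of the transitive later_version,
    every version reachable by following successors is a later version.
    """
    inferred = set()
    for start in next_version:
        seen = set()  # guards against accidental cycles in the input
        current = next_version.get(start)
        while current is not None and current not in seen:
            inferred.add((start, current))
            seen.add(current)
            current = next_version.get(current)
    return inferred
```

For a history v1 → v2 → v3, the sketch returns the pairs (v1, v2), (v2, v3) and the inferred (v1, v3); sioc:earlier_version would be the same set with each pair reversed.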
10. SIOC-MediaWiki Exporter
An exporter from a popular wiki platform to expose data in RDF using our
proposed model.
A web service, written in PHP, that exports a MediaWiki article as RDF. It is publicly
available at:
http://ws.sioc-project.org/mediawiki/
12. Browsing the generated data
RDF data extracted from a wiki page is browsable with tools such as
the Tabulator.
To offer a better browsing experience and ease the crawling of SIOC
exports of MediaWiki instances, the web service automatically
produces rdfs:seeAlso links between wiki pages, following Linked
Data practices.
If the article comes from the English Wikipedia, a link to the
corresponding DBpedia resource is added automatically (with
foaf:primaryTopic).
An RDF crawler can follow the seeAlso links found in each document
and keep crawling, so an entire wiki site can be crawled starting
from a single URI.
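The crawling idea can be sketched in Python as a breadth-first traversal over rdfs:seeAlso links. This is an illustrative sketch, not the crawler used by the authors: the function names are hypothetical, fetch is a caller-supplied function that returns RDF/XML text for a URI, and only links written as rdfs:seeAlso elements with an rdf:resource attribute are recognised.

```python
import xml.etree.ElementTree as ET
from collections import deque

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS_NS = "http://www.w3.org/2000/01/rdf-schema#"


def see_also_links(rdf_xml):
    """Extract the targets of rdfs:seeAlso from an RDF/XML document."""
    root = ET.fromstring(rdf_xml)
    links = []
    for element in root.iter(f"{{{RDFS_NS}}}seeAlso"):
        target = element.get(f"{{{RDF_NS}}}resource")
        if target:
            links.append(target)
    return links


def crawl(start_uri, fetch, limit=1000):
    """Follow rdfs:seeAlso links breadth-first, starting from one URI.

    fetch maps a URI to RDF/XML text (for example a thin wrapper
    around urllib.request.urlopen); limit caps the number of documents.
    """
    visited = []
    seen = {start_uri}
    queue = deque([start_uri])
    while queue and len(visited) < limit:
        uri = queue.popleft()
        visited.append(uri)
        for link in see_also_links(fetch(uri)):
            if link not in seen:  # never fetch the same document twice
                seen.add(link)
                queue.append(link)
    return visited
```

Passing fetch as a parameter keeps the traversal independent of the network layer, so the same loop can be exercised against in-memory documents before pointing it at a live wiki export.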
14. The DokuSIOC plugin
A plugin for DokuWiki that exports RDF data using popular lightweight ontologies
(originally developed by M. Haschke, a SIOC contributor).
We modified and extended this plugin to make it compliant with our proposed
model and to export all the needed wiki features.
It takes information from the metadata stored in the wiki system about pages,
users, links, etc. and provides it as raw RDF/XML serialized data
(instead of the usual HTML page).
Developed in PHP and easy to install on any DokuWiki system.
It uses the SIOC PHP API.
16. Collecting Data
To evaluate our proposal, we exported and crawled 5 different
MediaWiki and DokuWiki instances,
collecting more than 1 GB of RDF data,
3000 wiki articles and 700 users.
The data was loaded into a triple store (Sesame + OWLIM).
On top of that, it is possible to run cross-site queries
by combining FOAF and SIOC,
e.g.:
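PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX sioc: <http://rdfs.org/sioc/ns#>
SELECT DISTINCT ?content
WHERE {
<http://example.org/js#me> foaf:account ?account .
?account rdf:type sioc:UserAccount .
?content sioc:has_creator ?account .
}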
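To make explicit what this query computes, here is a small Python sketch that evaluates the same three triple patterns over an in-memory set of (subject, predicate, object) tuples. It is only an illustration of the join, not a replacement for the triple store; the function name content_created_by is hypothetical.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SIOC = "http://rdfs.org/sioc/ns#"


def content_created_by(person, triples):
    """Return URIs of content created through any sioc:UserAccount of person."""
    triples = set(triples)
    # person foaf:account ?account . ?account rdf:type sioc:UserAccount .
    accounts = {
        o for s, p, o in triples
        if s == person
        and p == FOAF + "account"
        and (o, RDF + "type", SIOC + "UserAccount") in triples
    }
    # ?content sioc:has_creator ?account . (the set gives DISTINCT)
    return sorted({
        s for s, p, o in triples
        if p == SIOC + "has_creator" and o in accounts
    })
```

Because the accounts may live on different sites (a MediaWiki instance, a DokuWiki instance, a blog), the result gathers one person's contributions across all of them, which is the cross-site behaviour the query relies on.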
18. Building the application
The data acquisition module is a PHP script that:
• queries the triple store
• collects and parses the results
• translates the data into the format (JSON) expected by the visualization
layer
The visualization layer has been built with the Exhibit framework from the
MIT SIMILE Project.
Exhibit is a set of JavaScript files configured directly in the HTML code of
the page to display.
It provides faceted browsing capabilities.
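The translation step of the data acquisition module can be sketched as follows. The original module is a PHP script whose code is not shown here, so this Python version is only an assumption-laden illustration: it takes a response in the standard SPARQL JSON results format and turns each binding into an Exhibit item with a label and a type. The function name sparql_to_exhibit and its parameters label_var and item_type are hypothetical.

```python
import json


def sparql_to_exhibit(sparql_json, label_var="content", item_type="Item"):
    """Convert SPARQL JSON results into an Exhibit-style {"items": [...]} document."""
    results = json.loads(sparql_json)
    items = []
    for binding in results["results"]["bindings"]:
        if label_var not in binding:
            continue  # skip rows where the label variable is unbound
        item = {"label": binding[label_var]["value"], "type": item_type}
        # copy every other bound variable as a plain property of the item
        for var, cell in binding.items():
            if var != label_var:
                item[var] = cell["value"]
        items.append(item)
    return json.dumps({"items": items}, indent=2)
```

The extra properties copied onto each item are what Exhibit can then expose as facets in the browsing interface.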
20. Conclusions
We presented how the SIOC ontology and lightweight semantics can be
used and extended to represent the structure of wikis,
and how to interlink wikis with other online communities.
We demonstrated an overall benefit of applying Semantic Web technologies
to wikis:
– enabling end users to access the generated information in a
simple and transparent way,
– showing capabilities that cannot be obtained with traditional
Web 2.0 tools.