Drupal and Apache Stanbol
LINKED DATA BASED SEMANTIC ANNOTATION
Gabriel Dragomir
Sunday, August 18, 2013
The Semantic Web
Tim Berners-Lee:
“The first step is putting data on the Web in a form
that machines can naturally understand, or
converting it to that form. This creates what I call a
Semantic Web – a Web of data that can be
processed directly or indirectly by machines.”
What’s the hype?
Most organizations need to organize, analyze, and relate
huge amounts of unstructured, scattered textual data
Examples:
keyword extraction from content: annotate abstracts
text categorization: organize big volumes of text based
on a thesaurus
media monitoring of tags: occurrences of a specific
keyword on social media channels
Linked data
http://lod-cloud.net/
Linked data
Project started in 2007
Aimed at building the Web of Data by:
identifying open access data sets
converting them into RDF vocabularies
publishing them as open access data sets
Linked data ecosystem
Linked Open Vocabularies (LOV): http://lov.okfn.org/dataset/lov/
Provides a conceptual map of the vocabularies
Various providers: libraries, governmental actors, NGOs
Linked data ecosystem
Where to find other data sets?
http://www.w3.org/2001/sw/wiki/SKOS/Datasets
Swoogle: http://swoogle.umbc.edu/
PoolParty: http://vocabulary.semantic-web.at
Linked data at work!
Semantic annotation
Creates specific metadata that enable new ways to
retrieve and aggregate information
Annotations are done based on a conceptual scheme,
an ontology (e.g. FOAF, Dublin Core)
For more on ontologies see: http://www.w3.org/wiki/Good_Ontologies
The annotations build semantic relationships: e.g.
rdf:type, owl:sameAs
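As a minimal sketch of what such relationships look like as data, the snippet below writes two annotations as RDF triples in N-Triples syntax using plain Python. The local entity URI under example.org and the choice of Tim Berners-Lee as the subject are assumptions made for illustration; they do not come from the talk.

```python
# Standard namespace URIs for the rdf:, owl: and foaf: prefixes.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
OWL = "http://www.w3.org/2002/07/owl#"
FOAF = "http://xmlns.com/foaf/0.1/"

# Hypothetical local entity; the example.org URI is an assumption.
local_person = "http://example.org/person/tim-berners-lee"

triples = [
    # rdf:type states what kind of thing the entity is.
    (local_person, RDF + "type", FOAF + "Person"),
    # owl:sameAs links the local entity to the same thing in another data set.
    (local_person, OWL + "sameAs", "http://dbpedia.org/resource/Tim_Berners-Lee"),
]


def to_ntriples(triples):
    """Serialize (subject, predicate, object) URI triples as N-Triples lines."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


print(to_ntriples(triples))
```

Each line of the output is one statement: a subject, a predicate, and an object, all identified by URIs.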
Semantic annotation
Most common uses:
Named Entity Linking: limited to recognizing entities of
type person, organization, or place (e.g. OpenCalais)
Entityhub Linking: annotation based on vocabularies
with no limitations of entity types. Requires more
natural language processing prior to annotation.
Apache Stanbol on the fly
Here comes Apache Stanbol
A new approach:
modular semantic analysis of documents
processing components can be built for virtually any
language
flexible workflows via semantic annotation chains
any vocabulary (Linked Data, custom) can be used
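The idea of a chain of independent processing components can be illustrated with a toy pipeline in plain Python. This is not Stanbol's API (Stanbol engines are Java components); every function name, the fixed language value, and the tiny vocabulary below are assumptions chosen only to show how each step adds to a shared document before the next one runs.

```python
import re


def detect_language(doc):
    # Placeholder (assumption): a real engine would use a trained model.
    doc["language"] = "en"
    return doc


def tokenize(doc):
    # Split the text into word tokens.
    doc["tokens"] = re.findall(r"\w+", doc["text"])
    return doc


def link_entities(doc, vocabulary):
    # Match single tokens against vocabulary labels, ignoring case.
    doc["annotations"] = [
        {"label": tok, "entity": vocabulary[tok.lower()]}
        for tok in doc["tokens"]
        if tok.lower() in vocabulary
    ]
    return doc


def run_chain(text, engines):
    """Pass one document through each engine in order."""
    doc = {"text": text}
    for engine in engines:
        doc = engine(doc)
    return doc


# Hypothetical vocabulary; the example.org entry is invented for illustration.
vocabulary = {
    "drupal": "http://dbpedia.org/resource/Drupal",
    "stanbol": "http://example.org/vocab/stanbol",
}

chain = [detect_language, tokenize, lambda d: link_entities(d, vocabulary)]
result = run_chain("Drupal talks to Stanbol", chain)
print(result["annotations"])
```

Swapping the vocabulary or reordering the list of engines changes the workflow without touching the engines themselves, which is the point of the modular design.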
From IKS to Apache Stanbol
IKS - Interactive Knowledge Stack for small to medium
CMS providers - EU funded consortium
An open source software stack written in Java
Goal: extract and process semantic data from
documents
Incubated at the Apache Software Foundation; a top-level Apache project since September 2012
http://stanbol.apache.org
Service oriented architecture
Stanbol is designed to offer service oriented integration
RESTful web services API returning RDF or JSON/JSON-LD
Each component exposes an endpoint independently
OSGi (Open Services Gateway initiative) compliant via
Apache Felix and Apache Sling
Remote component management
Implementation
OSGi layer: Apache Felix and Apache Sling
Build environment: Apache Maven
RDF framework: Apache Clerezza
Triple store, reasoning engine: Apache Jena
Indexing and semantic search: Apache Solr
Content analysis/metadata extraction: Apache Tika
Natural language processing: Apache OpenNLP
Architecture
Components
Semantic layer:
Enhancer, EntityHub, ContentHub
Enhancement engines: internal, 3rd party
User interfaces
Knowledge integration (rule sets, reasoners)
Storage integration
Content enhancement
Examples:
retrieve additional metadata for a piece of content
identify the language of a text
extract entities (persons, places, organizations)
create annotations to external sources
use 3rd party services for named entity recognition
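An enhancement result is a graph of annotations, and a client usually keeps only the entity links it trusts. The sketch below filters entity annotations by confidence. The sample response is hand-written and simplified: the `fise:` property names follow the Stanbol enhancement vocabulary as far as I know, but the exact JSON-LD layout a real server returns may differ, so treat the keys, the `urn:` identifiers, and the confidence values as assumptions.

```python
import json

# Hand-written, simplified sample (assumption); not captured from a real server.
sample_response = """
{
  "@graph": [
    {"@id": "urn:enhancement-1", "@type": ["fise:TextAnnotation"],
     "fise:selected-text": "Paris", "fise:confidence": 0.93},
    {"@id": "urn:enhancement-2", "@type": ["fise:EntityAnnotation"],
     "fise:entity-label": "Paris",
     "fise:entity-reference": "http://dbpedia.org/resource/Paris",
     "fise:confidence": 0.87},
    {"@id": "urn:enhancement-3", "@type": ["fise:EntityAnnotation"],
     "fise:entity-label": "Paris, Texas",
     "fise:entity-reference": "http://dbpedia.org/resource/Paris,_Texas",
     "fise:confidence": 0.21}
  ]
}
"""


def entity_links(response_text, min_confidence=0.5):
    """Return (label, entity URI) pairs for entity annotations above a threshold."""
    graph = json.loads(response_text).get("@graph", [])
    return [
        (node["fise:entity-label"], node["fise:entity-reference"])
        for node in graph
        if "fise:EntityAnnotation" in node.get("@type", [])
        and node.get("fise:confidence", 0.0) >= min_confidence
    ]


print(entity_links(sample_response))
```

Lowering `min_confidence` returns more candidates, which suits a tag suggestion UI; raising it suits unattended indexing.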
Drupal meets Stanbol
Several modules implement RDF support, allowing data
to be sent to Stanbol for semantic annotation
Taxonomy system allows for complex annotation
Fieldable taxonomy terms allow for storage of complex
semantic data
User scenarios
Semantic indexing via Stanbol (SOLR yard)
Content enrichment with semantically related
information (documents, factual data, images etc.)
Tag as you type: dynamic annotation of text in editors
How it works
POST request sends content via REST API
content is processed by an enhancement chain
Returns JSON-LD, RDF/XML, RDF/JSON etc
JSON-LD - JavaScript Object Notation for Linked Data
a human readable and simple linked data transport
format
for best results an enhancement chain should do
language detection, tokenization, and part-of-speech (POS) tagging prior to
performing semantic annotation
http://drupalaton.jelastic.dogado.eu/stanbol/enhancer
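The request side of this flow can be sketched with Python's standard library. The snippet only builds the POST request and does not send it. The base URL `http://localhost:8080`, the `/enhancer/chain/<name>` path for a named chain, and `application/json` as the media type that yields JSON-LD reflect my understanding of a default Stanbol setup and should be treated as assumptions; the helper name `build_enhancer_request` is invented for this example.

```python
import urllib.request


def build_enhancer_request(text, base_url, chain=None, accept="application/json"):
    """Build (but do not send) a POST request carrying plain text to the enhancer."""
    url = base_url.rstrip("/") + "/enhancer"
    if chain:
        # Assumed path layout for addressing a specific enhancement chain.
        url += "/chain/" + chain
    return urllib.request.Request(
        url,
        data=text.encode("utf-8"),
        headers={
            "Content-Type": "text/plain; charset=utf-8",
            # The Accept header selects the serialization of the returned annotations.
            "Accept": accept,
        },
        method="POST",
    )


req = build_enhancer_request(
    "Tim Berners-Lee invented the Web.",
    "http://localhost:8080",  # assumed local instance, not the demo server
)
print(req.get_method(), req.full_url)

# Sending requires a running Stanbol instance:
# with urllib.request.urlopen(req) as resp:
#     body = resp.read().decode("utf-8")
```

Passing `accept="application/rdf+xml"` instead would ask for RDF/XML, matching the list of return formats above.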
Drupal integration
Source: blog.iks-project.eu
Drupal distribution: IKS CE
IKS CE distribution - Wolfgang Ziegler (fago),
Stéphane Corlosquet (scor)
Components:
Search API Stanbol
VIE.js - semantic annotation UI
https://drupal.org/project/iksce
http://drupal.org/project/vie
http://drupal.org/project/search_api_stanbol
Search API Stanbol
enables the indexing of Drupal entities such as nodes,
users, taxonomy terms, files, etc. in Stanbol EntityHub.
data sent as RDF
data can be mashed up with data from other sources
(Managed Sites, Remote Sites)
VIE.js
“Vienna IKS Editables”
JavaScript library for implementing decoupled Content
Management Systems and semantic interaction in web
applications.
Monolithic vs Decoupled Content Management Systems
source: Henri Bergius - http://bergie.iki.fi
Demo setup
we store Drupal entities in a SOLR index
annotations are to be made based on:
DBPedia - bundled with Apache Stanbol
a custom vocabulary of terms related to semantic
web - Social Semantic Web Thesaurus
SemWeb is imported as a SOLR index into Apache
Stanbol
Custom vocabularies
Social Semantic Web Thesaurus
1959 concepts related to semantic web
Author: Andreas Blumauer
http://vocabulary.semantic-web.at/semweb.html
http://vocabulary.semantic-web.at/semweb/8.visual
Demo
index Drupal entities in Apache Stanbol
retrieve annotated entities via REST API
annotate entities using dbpedia and semweb indexes
edit Drupal entities and annotate on the fly
retrieve linked data tag recommendations
Questions?
Contact me
gabriel.dragomir@webikon.com
twitter: gabidrg
Thank you!
http://mures2013.drupalcamp.ro
AI Fame Rush Review – Virtual Influencer Creation In Just MinutesMd Hossain Ali
 

Dernier (20)

IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
 
Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1
 
UiPath Platform: The Backend Engine Powering Your Automation - Session 1
UiPath Platform: The Backend Engine Powering Your Automation - Session 1UiPath Platform: The Backend Engine Powering Your Automation - Session 1
UiPath Platform: The Backend Engine Powering Your Automation - Session 1
 
Meet the new FSP 3000 M-Flex800™
Meet the new FSP 3000 M-Flex800™Meet the new FSP 3000 M-Flex800™
Meet the new FSP 3000 M-Flex800™
 
20230104 - machine vision
20230104 - machine vision20230104 - machine vision
20230104 - machine vision
 
Introduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxIntroduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptx
 
Valere | Digital Solutions & AI Transformation Portfolio | 2024
Valere | Digital Solutions & AI Transformation Portfolio | 2024Valere | Digital Solutions & AI Transformation Portfolio | 2024
Valere | Digital Solutions & AI Transformation Portfolio | 2024
 
UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6
 
99.99% of Your Traces Are (Probably) Trash (SRECon NA 2024).pdf
99.99% of Your Traces  Are (Probably) Trash (SRECon NA 2024).pdf99.99% of Your Traces  Are (Probably) Trash (SRECon NA 2024).pdf
99.99% of Your Traces Are (Probably) Trash (SRECon NA 2024).pdf
 
UiPath Clipboard AI: "A TIME Magazine Best Invention of 2023 Unveiled"
UiPath Clipboard AI: "A TIME Magazine Best Invention of 2023 Unveiled"UiPath Clipboard AI: "A TIME Magazine Best Invention of 2023 Unveiled"
UiPath Clipboard AI: "A TIME Magazine Best Invention of 2023 Unveiled"
 
Comparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and IstioComparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and Istio
 
Nanopower In Semiconductor Industry.pdf
Nanopower  In Semiconductor Industry.pdfNanopower  In Semiconductor Industry.pdf
Nanopower In Semiconductor Industry.pdf
 
UiPath Studio Web workshop series - Day 5
UiPath Studio Web workshop series - Day 5UiPath Studio Web workshop series - Day 5
UiPath Studio Web workshop series - Day 5
 
Salesforce Miami User Group Event - 1st Quarter 2024
Salesforce Miami User Group Event - 1st Quarter 2024Salesforce Miami User Group Event - 1st Quarter 2024
Salesforce Miami User Group Event - 1st Quarter 2024
 
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
 
IEEE Computer Society’s Strategic Activities and Products including SWEBOK Guide
IEEE Computer Society’s Strategic Activities and Products including SWEBOK GuideIEEE Computer Society’s Strategic Activities and Products including SWEBOK Guide
IEEE Computer Society’s Strategic Activities and Products including SWEBOK Guide
 
Igniting Next Level Productivity with AI-Infused Data Integration Workflows
Igniting Next Level Productivity with AI-Infused Data Integration WorkflowsIgniting Next Level Productivity with AI-Infused Data Integration Workflows
Igniting Next Level Productivity with AI-Infused Data Integration Workflows
 
Empowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership BlueprintEmpowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership Blueprint
 
UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8
 
AI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just MinutesAI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just Minutes
 

Linked data based semantic annotation using Drupal and Apache Stanbol

  • 1. Drupal and Apache Stanbol: Linked Data Based Semantic Annotation. Gabriel Dragomir. Sunday, August 18, 13
  • 2. The Semantic Web. Tim Berners-Lee: “The first step is putting data on the Web in a form that machines can naturally understand, or converting it to that form. This creates what I call a Semantic Web – a Web of data that can be processed directly or indirectly by machines.”
  • 3. What’s the hype? Most organizations need to organize, analyze, and relate huge amounts of unstructured, scattered textual data. Examples: keyword extraction from content (annotating abstracts); text categorization (organizing large volumes of text based on a thesaurus); media monitoring of tags (occurrences of a specific keyword on social media channels).
  • 5. Linked data. Project started in 2007. Aimed at building the Web of Data by: identifying open access data sets; converting them into RDF vocabularies; publishing them as open access data sets.
  • 6. Linked data ecosystem. Linked Open Vocabularies (LOV): http://lov.okfn.org/dataset/lov/ Provides a conceptual map of the vocabularies. Various providers: libraries, governmental actors, NGOs.
  • 7. Linked data ecosystem. Where to find other data sets? SKOS data sets: http://www.w3.org/2001/sw/wiki/SKOS/Datasets Swoogle: http://swoogle.umbc.edu/ PoolParty: http://vocabulary.semantic-web.at
  • 8. Linked data at work!
  • 9. Semantic annotation. Creates specific metadata that enable new ways to retrieve and aggregate information. Annotations are based on a conceptual scheme, an ontology (e.g. FOAF, Dublin Core). For more on ontologies see: http://www.w3.org/wiki/Good_Ontologies The annotations build semantic relationships, e.g. rdf:type, owl:sameAs.
  • 10. Semantic annotation. Most common uses: Named Entity Linking: limited to recognizing entities of type person, organization, place (e.g. OpenCalais). Entityhub Linking: annotation based on vocabularies, with no limitation on entity types; requires more natural language processing prior to annotation.
  • 11. Apache Stanbol on the fly. Here comes Apache Stanbol. A new approach: modular semantic analysis of documents; processing components can be built for virtually any language; flexible workflows via semantic annotation chains; any vocabulary (Linked Data, custom) can be used.
  • 12. From IKS to Apache Stanbol. IKS (Interactive Knowledge Stack for small to medium CMS providers): an EU-funded consortium. An open source software stack written in Java. Goal: extract and process semantic data from documents. Project undergoing incubation at the Apache Foundation: http://stanbol.apache.org
  • 13. Service oriented architecture. Stanbol is designed to offer service oriented integration: RESTful web services API returning RDF or JSON/JSON-LD; each component exposes an endpoint independently; OSGi (Open Services Gateway initiative) compliant via Apache Felix and Apache Sling; remote component management.
  • 14. Implementation. OSGi layer: Apache Felix and Apache Sling. Build environment: Apache Maven. RDF framework: Apache Clerezza. Triple store, reasoning engine: Apache Jena. Indexing and semantic search: Apache Solr. Content analysis/metadata extraction: Apache Tika. Natural language processing: Apache OpenNLP.
  • 16. Components. Semantic layer: Enhancer, EntityHub, ContentHub. Enhancement engines: internal, 3rd party. User interfaces. Knowledge integration (rule sets, reasoners). Storage integration.
  • 17. Content enhancement. Examples: retrieve additional metadata for a piece of content; identify the language of a text; extract entities (persons, places, organizations); create annotations to external sources; use 3rd party services for named entity recognition.
  • 18. Drupal meets Stanbol. Several modules implement RDF support allowing data transport to Stanbol semantic annotations. Taxonomy system allows for complex annotation. Fieldable taxonomy terms allow for storage of complex semantic data.
  • 19. User scenarios. Semantic indexing via Stanbol (Solr yard). Content enrichment with semantically related information (documents, factual data, images, etc.). Tag as you type: dynamic annotation of text in editors.
  • 20. How it works. A POST request sends content via the REST API. The content is processed by an enhancement chain. Returns JSON-LD, RDF/XML, RDF/JSON, etc. JSON-LD (JavaScript Object Notation for Linked Data): a human-readable and simple linked data transport format. For best results an enhancement chain should do language detection, tokenization, and POS tagging prior to performing semantic annotation. http://drupalaton.jelastic.dogado.eu/stanbol/enhancer
  • 22. Drupal distribution: IKS CE. IKS CE distribution: Wolfgang Ziegler (fago), Stéphane Corlosquet (scor). Components: Search API Stanbol; VIE.js (semantic annotation UI). https://drupal.org/project/iksce http://drupal.org/project/vie http://drupal.org/project/search_api_stanbol
  • 23. Search API Stanbol. Enables the indexing of Drupal entities such as nodes, users, taxonomy terms, files, etc. in the Stanbol EntityHub. Data is sent as RDF. Data can be mashed up with data from other sources (Managed Sites, Remote Sites).
  • 24. VIE.js. “Vienna IKS Editables”: a JavaScript library for implementing decoupled Content Management Systems and semantic interaction in web applications.
  • 25. Monolithic vs Decoupled Content Management Systems. Source: Henri Bergius, http://bergie.iki.fi
  • 26. Demo setup. We store Drupal entities in a Solr index. Annotations are to be made based on: DBpedia (bundled with Apache Stanbol); a custom vocabulary of terms related to the semantic web (Social Semantic Web Thesaurus). SemWeb is imported as a Solr index into Apache Stanbol.
  • 27. Custom vocabularies. Social Semantic Web Thesaurus: 1959 concepts related to the semantic web. Author: Andreas Blumauer. http://vocabulary.semantic-web.at/semweb.html http://vocabulary.semantic-web.at/semweb/8.visual
  • 28. Demo. Index Drupal entities in Apache Stanbol; retrieve annotated entities via REST API; annotate entities using the dbpedia and semweb indexes; edit Drupal entities and annotate on the fly; retrieve linked data tag recommendations.
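
The request/response cycle described in slide 20 can be sketched in Python. This is a minimal sketch and not the presenter's code: the host `http://localhost:8080`, the chain name `dbpedia-semweb`, and the helper names are assumptions made for illustration. The sample response is hand-written in expanded JSON-LD form; a real Stanbol response is typically compacted and would likely need to be expanded with a JSON-LD processor first. The `fise:` property names are taken from the Stanbol enhancement structure.

```python
import urllib.request

# Assumption: a local Stanbol instance; replace with your own host.
STANBOL_ENHANCER = "http://localhost:8080/enhancer"
# Namespace of the enhancement structure vocabulary.
FISE = "http://fise.iks-project.eu/ontology/"


def build_enhancer_request(text, chain=None, base_url=STANBOL_ENHANCER):
    """Build (but do not send) a POST request carrying plain text.

    If `chain` is given, the request targets that enhancement chain;
    otherwise it targets the default enhancer endpoint.
    """
    url = f"{base_url}/chain/{chain}" if chain else base_url
    return urllib.request.Request(
        url,
        data=text.encode("utf-8"),
        headers={
            "Content-Type": "text/plain; charset=utf-8",
            # Assumption: this media type selects a JSON-LD serialization.
            "Accept": "application/json",
        },
        method="POST",
    )


def _first(node, prop):
    """Return the first value of a fise: property in an expanded node."""
    values = node.get(FISE + prop, [])
    if not values:
        return None
    value = values[0]
    return value.get("@id", value.get("@value"))


def entity_annotations(nodes, min_confidence=0.5):
    """Collect (label, entity URI, confidence) for linked entities.

    `nodes` is a list of expanded JSON-LD node objects. Nodes without
    an entity reference (e.g. plain text annotations) are skipped.
    """
    results = []
    for node in nodes:
        ref = _first(node, "entity-reference")
        if ref is None:
            continue
        confidence = float(_first(node, "confidence") or 0.0)
        if confidence >= min_confidence:
            results.append((_first(node, "entity-label"), ref, confidence))
    return sorted(results, key=lambda r: r[2], reverse=True)


# Hand-written sample in expanded JSON-LD form (illustrative data only).
SAMPLE = [
    {
        "@id": "urn:enhancement:1",
        FISE + "entity-reference": [{"@id": "http://dbpedia.org/resource/Drupal"}],
        FISE + "entity-label": [{"@value": "Drupal"}],
        FISE + "confidence": [{"@value": 0.92}],
    },
    {
        "@id": "urn:enhancement:2",
        FISE + "entity-reference": [{"@id": "http://dbpedia.org/resource/Paris"}],
        FISE + "entity-label": [{"@value": "Paris"}],
        FISE + "confidence": [{"@value": 0.31}],
    },
    {
        # A text annotation: it marks a mention but links to no entity.
        "@id": "urn:enhancement:3",
        FISE + "selected-text": [{"@value": "Drupal"}],
    },
]
```

To actually send the request, pass the result of `build_enhancer_request(...)` to `urllib.request.urlopen` and parse the body with `json.loads`. The confidence threshold is a design choice for tag recommendation: lowering it yields more suggestions at the cost of more false matches.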