SEMANTIC WEB
WITH JAHIA
February 2014

www.sigma.fr
SUMMARY

• WHY ?
• Background
• Web 2.0 is not enough
• WHAT ?
• Definitions
• It’s real
• HOW ?
• JAHIA fits
• Integration

WHY ?

• Background
• Web 2.0 is not enough

Background : who are we ?
Thomas Delerm (SIGMA) and Adrien Di Mascio (Logilab) explain why the semantic
web matters in modern web applications for making the best use of your data.
They give the recipes that make Jahia an appropriate CMS for the semantic
and linked data web, a.k.a. "web 3.0"



Adrien DI MASCIO - Semantic Web Director
Company : Logilab



Thomas DELERM - Web Architect
Company : SIGMA
Worked in cell and IPTV content startups

How the web evolved


Web « 1 » was about documents
and links



Web « 2.0 » is about social interaction and
users

https://web.archive.org/web/19991116151216/http://www4.yahoo.com/

WHY ?

• Background

• Web 2.0 is not enough

Failures of Web 2.0


All the databases and APIs are in “silos” → searches are limited

Results are documents, not objects

Are my results up to date and reliable ?

Example : Renault : too many combinations when you want to buy a car : more than 10^20 [1]

[1] http://www.semweb.pro/talk/2474
Failures of Web 2.0


Web 2.0 is far from perfect :

User tags
– Different spellings
– Different meanings for the
same spelling (Hollande)
– No relationships between
tags

You cannot (in one request)
answer complex queries like “List
on my website 10 products
whose producer is Samsung and
whose price is under $50”
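
Once results are structured objects rather than documents, the query quoted above reduces to a plain filter. A minimal Python sketch, assuming a hypothetical in-memory product list (the `PRODUCTS` records, field names and prices are made up for illustration):

```python
# Hypothetical product records; names, producers and prices are made up.
PRODUCTS = [
    {"name": "Phone case", "producer": "Samsung", "price_usd": 19.0},
    {"name": "Galaxy phone", "producer": "Samsung", "price_usd": 499.0},
    {"name": "USB charger", "producer": "Samsung", "price_usd": 25.0},
    {"name": "Headphones", "producer": "Sony", "price_usd": 45.0},
]


def cheap_products(products, producer, max_price, limit=10):
    """Return up to `limit` products by `producer` priced under `max_price`."""
    matches = [
        p for p in products
        if p["producer"] == producer and p["price_usd"] < max_price
    ]
    return matches[:limit]


# "10 products whose producer is Samsung and whose price is under $50"
result = cheap_products(PRODUCTS, "Samsung", 50)
print([p["name"] for p in result])
```

The filter only works because producer and price are typed fields; free-text tags give the machine nothing comparable to filter on.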

We have a solution


There is always a technical evolution
– From PC to Web : WWW and links

– From Web to Web 2.0 : AJAX (dynamic web sites)

– From Web 2.0 to Web 3.0 : Semantic properties and Linked
data

So let’s learn what the semantic web is !
WHAT ?

• Definitions
• It’s real

Semantic Web – (Anti)definitions

Today, the Semantic Web is not:
- Magic
- Natural Language Processing
- Automatic image processing
- A new protocol

It is a worldwide network of data built upon a set of interoperable standards that
use URLs to identify data and link them together.
No Natural Language Processing
A human reads:

<h1>Semantic Web</h1>
<p>Semantic Web is a worldwide network of data invented by
<a href="http://w3.org/People/Berners-Lee">Tim Berners-Lee</a> in 1994.</p>

A machine reads:

<h1>????????????</h1>
<p>???????????????????????????????????????????????????????
<a href="http://w3.org/People/Berners-Lee">???????????????</a> ????????</p>
If only ...
… The machine could read:



SemanticWeb is_a network



SemanticWeb was_created_by TimBernersLee



SemanticWeb was_created_in 1994
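
The three statements above are subject, predicate, object triples. A minimal Python sketch of how a program could store and query them, assuming plain tuples and a hypothetical `match` helper (neither comes from the talk):

```python
# Each statement is a (subject, predicate, object) triple, as on the slide.
TRIPLES = [
    ("SemanticWeb", "is_a", "network"),
    ("SemanticWeb", "was_created_by", "TimBernersLee"),
    ("SemanticWeb", "was_created_in", "1994"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples matching the pattern; None acts as a wildcard."""
    return [
        (s, p, o) for (s, p, o) in triples
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    ]


# "Who created the Semantic Web?"
creators = [o for (_, _, o) in match(TRIPLES, "SemanticWeb", "was_created_by")]
print(creators)
```

Pattern matching with wildcards is the basic idea behind graph query languages; a real triple store adds URLs as identifiers, indexing and joins across patterns.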

Annotate your document
Use RDFa or microdata (shown here), for instance with the schema.org vocabulary

<p itemscope itemtype="Concept">
<span itemprop="name">Semantic Web</span> is a
<span itemprop="description">worldwide network of data</span>
invented by
<a itemprop="creator" href="http://w3.org/People/Berners-Lee">
Tim Berners-Lee</a>
in <span itemprop="creation_date">1994</span>.</p>
Publish another representation
Publish RDF and use HTTP content negotiation

@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .

<http://mysite.com/SemanticWeb>
    a skos:Concept ;
    skos:closeMatch <http://data.bnf.fr/ark:/12148/cb119328992> ;
    dc:creator <http://w3.org/People/Berners-Lee/> ;
    dc:date "1994" .

More familiar with JSON ? Take a look at JSON-LD
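
As a rough illustration of both ideas, the Python sketch below holds the same resource as a JSON-LD document and picks a representation from the HTTP `Accept` header. The `pick_media_type` and `to_jsonld` helpers and the `SUPPORTED` list are assumptions made for this example, and wildcards such as `*/*` are deliberately ignored to keep it short:

```python
import json

# The resource from the Turtle example, expressed as a JSON-LD document.
RESOURCE = {
    "@context": {
        "skos": "http://www.w3.org/2004/02/skos/core#",
        "dc": "http://purl.org/dc/elements/1.1/",
    },
    "@id": "http://mysite.com/SemanticWeb",
    "@type": "skos:Concept",
    "skos:closeMatch": {"@id": "http://data.bnf.fr/ark:/12148/cb119328992"},
    "dc:creator": {"@id": "http://w3.org/People/Berners-Lee/"},
    "dc:date": "1994",
}

# Representations this hypothetical server can produce.
SUPPORTED = ["text/html", "text/turtle", "application/ld+json"]


def pick_media_type(accept_header, supported=SUPPORTED):
    """Return the supported media type with the highest q-value, or None.

    Simplified on purpose: wildcards such as */* are not handled.
    """
    best, best_q = None, 0.0
    for part in accept_header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type, q = fields[0], 1.0
        for param in fields[1:]:
            if param.startswith("q="):
                q = float(param[2:])
        if media_type in supported and q > best_q:
            best, best_q = media_type, q
    return best


def to_jsonld(resource=RESOURCE):
    """Serialize the resource description as a JSON-LD string."""
    return json.dumps(resource, indent=2)


if pick_media_type("text/turtle;q=0.5, application/ld+json") == "application/ld+json":
    print(to_jsonld())
```

The same URL then serves HTML to browsers and RDF to machines, which is the point of content negotiation.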

Vocabularies, ontologies



An ontology is a structured set of terms and concepts.



Each term and concept is also identified by a URL

There are quite a few standard ontologies for various domains
(social interactions, libraries, music, events, etc.)
Make it happen now !



RDF is nice



Some database engines store RDF graphs
- You can query them with the SPARQL language



Standardized by W3C



You don't necessarily need to change your technology stack



If your data is structured, publishing RDF is easy
- Choosing an ontology or a vocabulary can be hard
- Making your relational database answer SPARQL queries is hard
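
To show why publishing RDF from structured data is the easy part, here is a minimal Python sketch that turns rows into N-Triples lines. The `ROWS` data, the `http://example.org/` namespace and the choice of Dublin Core properties are assumptions for illustration; picking the right vocabulary is the hard part mentioned above:

```python
# Hypothetical rows, as they might come out of a relational "book" table.
ROWS = [
    {"id": 1, "title": "Weaving the Web", "author_id": 7},
    {"id": 2, "title": 'A "quoted" title', "author_id": 7},
]

BASE = "http://example.org/"  # placeholder namespace, not a real dataset
DC = "http://purl.org/dc/elements/1.1/"


def escape_literal(text):
    """Escape backslashes, double quotes and newlines for N-Triples."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
    )


def row_to_ntriples(row):
    """Turn one row into N-Triples lines: one triple per mapped column."""
    subject = "<" + BASE + "book/" + str(row["id"]) + ">"
    author = "<" + BASE + "author/" + str(row["author_id"]) + ">"
    title = '"' + escape_literal(row["title"]) + '"'
    return [
        subject + " <" + DC + "title> " + title + " .",
        subject + " <" + DC + "creator> " + author + " .",
    ]


for row in ROWS:
    for line in row_to_ntriples(row):
        print(line)
```

Each column becomes a predicate and each primary key becomes a URL, so foreign keys turn into links between resources.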

WHAT ?

• Definitions

• It’s real

It's all about data
Publishing structured data:

- Helps search engines
  – Better indexing
  – Better page rank
- Eases external data integration
  – Importing a CSV file requires a preliminary agreement on its structure
  – Maintaining data is expensive : reuse published data (dbpedia, freebase, geonames)
Examples
GoodRelations annotations

Schema.org annotations

HOW ?

• Jahia fits
• Integration

Client case : Bpi


One goal : use state-of-the-art Semantic Web technologies, since the client is a library
(Bibliothèque publique d’information)



3 main needs:
– Input data easily for contents and within contents
– Store data in a safe, RDF-friendly manner
– Output data
• On every page for SEO (RDFa)
• In searches
• In exports (RDF)



Good news : Jahia fits !

The choice of Jahia


Input :
- Jahia lets you define clear content definitions (CND files) with
inheritance.
- Jahia is content-centric

Enrich within contents : CKEditor

On contents : contribute or edit (GWT) modes
The choice of Jahia : storage and output
Storage : you need a framework that can abstract different sources of data :
enter JCR
– Unique repository for all content
– External data sources are abstracted : LDAP, files, other DBs…
Output:
– Graph structure + XML format → fit for metadata
– JSP views can be easily tailored for special export formats
HOW ?

• Jahia fits

• Integration

Input : CKEditor and categories


Make sure text data is stored as plain HTML
- Properties file to map between schema.org properties and HTML code
- In-content schema.org properties → we created a CKEditor plugin

Triple categorization of contents
– Categories (closed list)
– Tags (open)
– Authorities (closed, linked with BnF)

Next steps
– Need for a triple store ?
– Categorization through automatic spider browsing ?
Content structure


Directories per category



The semantic mapping is transparent :
no additional field to fill in



Properties files map each field to its
semantic exports (Dublin Core, FOAF...)

Challenges met
– Where to store the metadata of a file →
extend jnt:file
– How to create a sub-content while
creating its parent → edit the Spring GWT
XML
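
The properties-file mapping mentioned on this slide can be pictured with a short Python sketch. It is not the project's Jahia code: the file content, the field names and the `DC.title`-style meta names (a common way to embed Dublin Core in HTML headers) are assumptions for illustration:

```python
from html import escape

# Hypothetical properties-file content: content field = exported property.
MAPPING_TEXT = """
# field = semantic property
title = DC.title
author = DC.creator
created = DC.date
"""


def parse_properties(text):
    """Parse simple `key = value` lines, skipping blanks and # comments."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        mapping[key.strip()] = value.strip()
    return mapping


def header_meta(content, mapping):
    """Render one <meta> line per mapped field that the content defines."""
    return [
        '<meta name="' + escape(mapping[field]) + '" content="' + escape(str(value)) + '">'
        for field, value in content.items()
        if field in mapping
    ]


mapping = parse_properties(MAPPING_TEXT)
page = {"title": "Semantic Web", "author": "Bpi", "body": "not exported"}
print("\n".join(header_meta(page, mapping)))
```

Because the mapping lives in a file, contributors fill in the usual fields and the semantic export follows without any extra input.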

Vocabularies used
Page

Schema.org

OpenGraph

Dublin Core

FOAF

Lists
Details on short and
long contents

No
Yes

No
Yes

No
Yes

No
Partial

Details : events, IT
resource [file]

Yes

No

Yes

No

Auteurs
Place

No
 

No
 

Yes
 

Yes
 

In HTML

Everywhere

Header

Header

Everywhere

Format in HTML

RDFa

Meta

Meta

RDFa

In RDF

Yes

Yes, one line per 
meta
 
Automatic 
(mapping)

Yes, native

Contributed
By

Yes, one line per 
meta
 
 
Automatic +  Automatic 
Manual Bpi
(mapping)

 
Automatic 
(mapping)
www.sigma.fr
Output


We chose RDFa because it is more widely used than microdata for now

Debate : should enrichment be done manually ? Automatically ? Through a
mixed approach ?

The mapping between fields and dc:xxx properties will be used to improve search results

“ARK” URIs are used to exchange objects between repositories (internal,
Jahia, external like BnF)
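
For readers unfamiliar with RDFa, the Python sketch below renders a content item with the RDFa Lite attributes `vocab`, `typeof` and `property`. The `rdfa_span` and `rdfa_block` helpers and the sample event are hypothetical; in the project itself this markup would come from the Jahia views:

```python
from html import escape


def rdfa_span(prop, text):
    """Wrap `text` in a span carrying an RDFa `property` attribute."""
    return '<span property="' + escape(prop) + '">' + escape(text) + "</span>"


def rdfa_block(type_name, fields, vocab="http://schema.org/"):
    """Render a div whose children expose `fields` as RDFa properties."""
    inner = " ".join(rdfa_span(prop, text) for prop, text in fields.items())
    return (
        '<div vocab="' + escape(vocab) + '" typeof="' + escape(type_name) + '">'
        + inner
        + "</div>"
    )


# Hypothetical event; the values are illustrative only.
print(rdfa_block("Event", {"name": "Semantic Web talk", "startDate": "2014-02-01"}))
```

The visible text stays the same for human readers; the attributes add what a crawler needs to extract typed properties.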

Future




Free your data !
Put them together
Share them between applications and
externally



Forces you to organize your IT
differently

Future : Facebook


Facebook is gradually promoting
posts that contain Open Graph data [1]

« Facebook testing more uses for
Open Graph » [2]

[1] http://newsroom.fb.com/News/787/News-Feed-FYI-What-Happens-When-You-See-More-Updates-from-Friends (January 21, 2014)
[2] http://allfacebook.com/add-to-my-movies-link_b128387
Future : Web 3.0

Conclusion


“If you’re not paying for it, you are the product” [1]



Semantic Web is going to be imposed by internet giants because they need it
to know you better



Take the first step to enrich your data, don’t miss the train !

Jahia 7 catches it :
– External data provider
– Quality, extensible editor

[1] http://blogs.law.harvard.edu/futureoftheinternet/2012/03/21/meme-patrol-when-something-online-is-free-youre-not-the-customer-youre-the-product/

Questions & Answers



Webography:
New W3C Blog on Semantic Web & linked data : http://www.w3.org/blog/data/
http://fr.slideshare.net/AntidotNet/time2-market-lyon-13nov2013-slideshare#
http://fr.slideshare.net/terraces/technologies-du-web-smantique-pour-lentreprise-20
http://fr.slideshare.net/AntidotNet/web-smantique-web-de-donnes-web-30-linked-dataquelques-repres-pour-sy-retrouver



Editor's notes

  1. 19 July 2013, Google : Knowledge Graph expansion. More than a quarter of all searches started showing some kind of Knowledge Graph result after this date. 20 August 2013 : Google Hummingbird focuses on conversational and semantic search to try to deliver correct answers to broad-meaning questions
  2. We chose not to output semantics on list pages on purpose