From Big Linked Data to Linked Big Data:
DBpedia as a framework for
data integration
Giuseppe Futia1, Antonio Vetrò1, Giuseppe Rizzo2
1- Nexa Center for Internet and Society, DAUIN, Politecnico di Torino
2- Istituto Superiore Mario Boella (ISMB)
7th DBpedia Community Meeting in Leipzig
15 September 2016
PhD candidate on semantics at
Nexa Center for Internet & Society,
DAUIN, Politecnico di Torino
Experiences with LOD and DBpedia
• TellMeFirst, a tool for classifying and enriching
textual documents built on DBpedia Spotlight
(http://tellmefirst.polito.it)
• Contratti Pubblici, a tool for processing, exploring,
and visualizing Italian Public Procurements
(http://public-contracts.nexacenter.org/)
How TellMeFirst works
TellMeFirst
Results obtained with a
description of the
Eyes Wide Shut movie
Contratti Pubblici (Synapta + Nexa)
Different data sources to
build a search engine on
Italian Public Contracts
[Diagram labels: Anti-corruption National Authority; Agency for Digital Italy]
Contratti Pubblici (Synapta + Nexa)
Linked Data repository of
Public Contracts, linked to
DBpedia and SPC
DBpedia in our projects
• TellMeFirst:
–Training set used for the semantic classification task
–Several interlinks used for the enrichment task
• Contratti Pubblici:
–Data enrichment to enable advanced SPARQL queries
–Data quality improvement (e.g., consistent labels)
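
The "consistent labels" point can be illustrated with a minimal sketch. It assumes a hand-built alias table that maps the name variants found in raw procurement records to one DBpedia resource and one canonical label. The alias table, the field names, and the choice of URIs are hypothetical and are not taken from the Contratti Pubblici code base.

```python
import re

# Hypothetical canonical labels, keyed by DBpedia-style resource URIs.
CANONICAL = {
    "http://dbpedia.org/resource/Polytechnic_University_of_Turin": "Politecnico di Torino",
    "http://dbpedia.org/resource/Turin": "Comune di Torino",
}

# Hypothetical alias table: normalized surface form -> resource URI.
ALIASES = {
    "politecnico di torino": "http://dbpedia.org/resource/Polytechnic_University_of_Turin",
    "polito": "http://dbpedia.org/resource/Polytechnic_University_of_Turin",
    "comune di torino": "http://dbpedia.org/resource/Turin",
    "city of turin": "http://dbpedia.org/resource/Turin",
}


def normalize(name):
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    cleaned = re.sub(r"[^\w\s]", " ", name.lower())
    return " ".join(cleaned.split())


def enrich(record):
    """Return a copy of the record with a linked URI and a consistent label."""
    uri = ALIASES.get(normalize(record["authority"]))
    # Unknown names keep their original spelling and get no link.
    label = CANONICAL[uri] if uri else record["authority"]
    return {**record, "authority_uri": uri, "authority_label": label}


rows = [
    {"authority": "POLITECNICO  DI TORINO."},
    {"authority": "Polito"},
    {"authority": "Unknown Srl"},
]
enriched = [enrich(row) for row in rows]
```

Once every variant resolves to the same URI, a query can group or join on `authority_uri` instead of on free-text names.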
From Big Linked Data to Linked Big Data
• Big Linked Data
–Already implemented, as shown by the exponential growth
of Linked Data in recent years
• Linked Big Data
–RDF data model for Big Data Variety
–Meta information to enable powerful analytics
–Simplify Big Data access, integration, and interlinking
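
As a rough illustration of RDF as a common model for Variety, the sketch below converts a CSV source and a JSON source with different field names into the same set of triples and serializes them as N-Triples lines. The `http://example.org/` namespace, the field names, and the sample data are assumptions made for this example.

```python
import csv
import io
import json

EX = "http://example.org/"  # hypothetical namespace

CSV_DATA = "id,amount,supplier\nC1,1000,Acme\n"
JSON_DATA = '[{"contractId": "C2", "value": 2500, "vendor": "Globex"}]'


def triples_from_csv(text):
    """Yield (subject, predicate, literal) tuples from CSV rows."""
    for row in csv.DictReader(io.StringIO(text)):
        subject = f"{EX}contract/{row['id']}"
        yield (subject, f"{EX}amount", row["amount"])
        yield (subject, f"{EX}supplier", row["supplier"])


def triples_from_json(text):
    """Yield the same triple shape from a JSON source with other field names."""
    for item in json.loads(text):
        subject = f"{EX}contract/{item['contractId']}"
        yield (subject, f"{EX}amount", str(item["value"]))
        yield (subject, f"{EX}supplier", item["vendor"])


def escape_literal(value):
    """Escape backslashes, quotes, and newlines for an N-Triples literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def to_ntriples(triples):
    """Serialize triples with plain string literals, one statement per line."""
    return "\n".join(
        f'<{s}> <{p}> "{escape_literal(o)}" .' for s, p, o in sorted(triples)
    )


# Both sources end up in one graph that can be processed uniformly.
graph = set(triples_from_csv(CSV_DATA)) | set(triples_from_json(JSON_DATA))
ntriples = to_ntriples(graph)
```

The two inputs differ in format and vocabulary, but after conversion they share subjects and predicates, which is what makes later interlinking and analytics simpler.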
Big Data notion of Variety
• Variety of data and representation formats
• Variety of conceptualizations and data models
• Variety related to temporal and spatial dependencies
• Variety as a “generalization of the semantic
heterogeneity as studied in the field of Linked Data”
(Pascal Hitzler & Krzysztof Janowicz)
PhD research questions (i)
• RQ1: How can the technological foundations of Linked
Data and Big Data be further improved and
combined to create an open software architecture for a
multi-thematic, multi-perspective, and multi-medial
knowledge graph from heterogeneous sources?
PhD research questions (ii)
• RQ2: What are the features of a research method to
meet and evaluate security, scalability, performance,
openness, and interoperability of the software architecture
mentioned earlier? And how can we measure the quality
of the knowledge graph produced with this software
architecture?
Key ideas for my PhD
• Get concepts and ontologies from the DBpedia
knowledge base to support semantic alignment during
the integration stage
• Use frameworks for data integration of structured
information with Big Data technologies:
RDF Mapping Language (RML) + Hadoop or Spark
• Exploit Machine Learning techniques (e.g., Deep Learning)
to extend datasets with unstructured data
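
The idea behind a declarative mapping can be sketched in a few lines. The dictionary below is a simplified stand-in for an RML mapping, not RML syntax, and the `example.org` class and predicate URIs are hypothetical. In the approach outlined above, those terms would instead be aligned with concepts from the DBpedia ontology.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Simplified, hypothetical mapping: a subject template, a class,
# and a column-to-predicate table.
MAPPING = {
    "subject_template": "http://example.org/contract/{id}",
    "rdf_class": "http://example.org/ontology/Contract",
    "columns": {
        "amount": "http://example.org/ontology/amount",
        "supplier": "http://example.org/ontology/supplier",
    },
}


def apply_mapping(mapping, record):
    """Turn one structured record into a list of triples."""
    subject = mapping["subject_template"].format(**record)
    triples = [(subject, RDF_TYPE, mapping["rdf_class"])]
    for column, predicate in mapping["columns"].items():
        value = record.get(column)
        # Skip missing or empty cells instead of emitting empty literals.
        if value not in (None, ""):
            triples.append((subject, predicate, str(value)))
    return triples


records = [
    {"id": "C1", "amount": 1000, "supplier": "Acme"},
    {"id": "C2", "amount": 2500, "supplier": ""},
]
triples = [t for record in records for t in apply_mapping(MAPPING, record)]
```

Because `apply_mapping` depends only on the mapping and a single record, it could in principle be applied in parallel over partitions of a large dataset, for example through a flatMap-style operation in Spark.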
DBpedia as knowledge base for:
• Entity linking and annotations in documents
• Assertion of additional categories for data
• Improvement of multilingual information
• Estimation of data quality of integrated information
according to different features (e.g., provenance)
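
Entity linking against a knowledge base can be illustrated with a toy dictionary-based linker that uses greedy longest-match lookup. This is a deliberately naive technique chosen for brevity. It is not how DBpedia Spotlight or TellMeFirst work, and the small gazetteer is a hand-written assumption rather than data pulled from DBpedia.

```python
import re

# Hand-written gazetteer: lowercase surface form -> DBpedia resource URI.
GAZETTEER = {
    "eyes wide shut": "http://dbpedia.org/resource/Eyes_Wide_Shut",
    "stanley kubrick": "http://dbpedia.org/resource/Stanley_Kubrick",
    "tom cruise": "http://dbpedia.org/resource/Tom_Cruise",
}


def link_entities(text, gazetteer, max_words=4):
    """Return (surface form, start offset, URI) for each gazetteer match."""
    tokens = [(m.group().lower(), m.start(), m.end())
              for m in re.finditer(r"\w+", text)]
    annotations = []
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate phrase first, then shorter ones.
        for n in range(min(max_words, len(tokens) - i), 0, -1):
            phrase = " ".join(tok for tok, _, _ in tokens[i:i + n])
            if phrase in gazetteer:
                start, end = tokens[i][1], tokens[i + n - 1][2]
                annotations.append((text[start:end], start, gazetteer[phrase]))
                i += n
                matched = True
                break
        if not matched:
            i += 1
    return annotations


annotations = link_entities(
    "Eyes Wide Shut is a film by Stanley Kubrick.", GAZETTEER
)
```

A production linker would also need disambiguation and confidence scoring, since the same surface form can refer to several resources.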
Challenges
• Greater accuracy (integrating different datasets)
• Immediacy (near-real time data, from new data sources)
• Flexibility (not constrained by database structure)
• Better analytics (the ability to change the rules)
• Data quality (reliability and effectiveness of data)
Suggestions and/or comments?
Mail
giuseppe.futia@polito.it
Repository GitHub
https://github.com/giuseppefutia/
