From Big Linked Data to Linked Big Data:
DBpedia as a framework for
data integration
Giuseppe Futia1, Antonio Vetrò1, Giuseppe Rizzo2
1- Nexa Center for Internet and Society, DAUIN, Politecnico di Torino
2- Istituto Superiore Mario Boella (ISMB)
7th DBpedia Community Meeting in Leipzig
15 September 2016
PhD candidate on semantics at
Nexa Center for Internet & Society,
DAUIN, Politecnico di Torino
Experiences with LOD and DBpedia
• TellMeFirst, a tool for classifying and enriching
textual documents built on DBpedia Spotlight
(http://tellmefirst.polito.it)
• Contratti Pubblici, a tool for processing, exploring,
and visualizing Italian Public Procurements
(http://public-contracts.nexacenter.org/)
How TellMeFirst works
TellMeFirst
Results obtained with a
description of the
Eyes Wide Shut movie
Anti-corruption National Authority
Contratti Pubblici
(Synapta + Nexa)
Different data sources to
build a search engine on
Italian Public Contracts
Agency for Digital Italy
Linked Data repository of
Public Contracts, linked to
DBpedia and SPC
Contratti Pubblici
(Synapta + Nexa)
Contratti Pubblici
DBpedia in our projects
• TellMeFirst:
–Training set used for the semantic classification task
–Several interlinks used for the enrichment task
• Contratti Pubblici:
–Data enrichment to enable advanced SPARQL queries
–Data quality improvement (e.g., consistent labels)
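
The enrichment idea above can be illustrated with a small sketch. It is not the Contratti Pubblici implementation: the `ex:` URIs, the company, and the DBpedia link are hypothetical, and a tiny in-memory set of triples stands in for a real SPARQL endpoint. It only shows why an `owl:sameAs` link to DBpedia lets a query reach properties that the local data does not hold.

```python
# Tiny in-memory triple store; all URIs and values are illustrative assumptions.
SAME_AS = "owl:sameAs"

# Local procurement data, with one interlink to a (hypothetical) DBpedia resource.
local_triples = {
    ("ex:contract/1", "ex:contractor", "ex:company/acme"),
    ("ex:company/acme", "rdfs:label", "ACME S.p.A."),
    ("ex:company/acme", SAME_AS, "dbr:Acme_Example"),
}

# Facts that would come from DBpedia once the link exists.
dbpedia_triples = {
    ("dbr:Acme_Example", "dbo:locationCity", "dbr:Turin"),
}


def match(triples, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def contractor_cities(contract):
    """Follow contractor -> owl:sameAs -> dbo:locationCity across both graphs."""
    graph = local_triples | dbpedia_triples
    cities = []
    for _, _, company in match(graph, s=contract, p="ex:contractor"):
        for _, _, linked in match(graph, s=company, p=SAME_AS):
            for _, _, city in match(graph, s=linked, p="dbo:locationCity"):
                cities.append(city)
    return cities


print(contractor_cities("ex:contract/1"))
```

Without the `owl:sameAs` triple the same traversal returns nothing, which is the sense in which interlinking enables more advanced queries.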
From Big Linked Data to Linked Big Data
• Big Linked Data
–Already a reality, as shown by the exponential growth
of Linked Data in recent years
• Linked Big Data
–RDF data model for Big Data Variety
–Meta information to enable powerful analytics
–Simplify Big Data access, integration, and interlinking
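
A minimal sketch of how a triple-based model can absorb variety in source formats: two inputs, one CSV and one JSON, are turned into the same subject/predicate/object shape and merged with a plain set union. The `http://example.org/` namespace, the field names, and the sample records are assumptions made for illustration only.

```python
import csv
import io
import json

BASE = "http://example.org/"  # hypothetical namespace


def triples_from_csv(text):
    """Each CSV row becomes one subject; each non-id column becomes a predicate."""
    out = set()
    for row in csv.DictReader(io.StringIO(text)):
        subject = BASE + "contract/" + row["id"]
        for column, value in row.items():
            if column != "id":
                out.add((subject, BASE + column, value))
    return out


def triples_from_json(text):
    """Each JSON object becomes one subject; each non-id key becomes a predicate."""
    out = set()
    for record in json.loads(text):
        subject = BASE + "contract/" + str(record["id"])
        for key, value in record.items():
            if key != "id":
                out.add((subject, BASE + key, str(value)))
    return out


csv_source = "id,amount\n1,1000\n2,2500\n"
json_source = '[{"id": 1, "contractor": "ACME"}, {"id": 3, "amount": 400}]'

# Both sources end up in one graph; contract 1 now has facts from both.
graph = triples_from_csv(csv_source) | triples_from_json(json_source)
for triple in sorted(graph):
    print(triple)
```

Because both sources use the same subject URI for contract 1, its amount and contractor sit side by side after the union, with no schema agreed in advance.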
Big Data notion of Variety
• Variety of data and representation formats
• Variety of conceptualizations and data models
• Variety related to temporal and spatial dependencies
• Variety as a “generalization of the semantic
heterogeneity as studied in the field of Linked Data”
(Pascal Hitzler & Krzysztof Janowicz)
PhD research questions (i)
• RQ1: How can the technological foundations of Linked
Data and Big Data be further improved and
combined to create an open software architecture for a
multi-thematic, multi-perspective, and multi-medial
knowledge graph from heterogeneous sources?
PhD research questions (ii)
• RQ2: What are the features of a research method to
meet and evaluate security, scalability, performance,
openness, and interoperability of the software architecture
mentioned earlier? And how can we measure the quality
of the knowledge graph produced with this software
architecture?
Key ideas for my PhD
• Get concepts and ontologies from the DBpedia
knowledge base to support semantic alignment during
the integration stage
• Use frameworks for data integration of structured
information with Big Data technologies:
RDF Mapping Language (RML) + Hadoop or Spark
• Exploit Machine Learning techniques (e.g., Deep Learning)
to enrich datasets with unstructured data
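
The second idea, declarative mappings applied at scale, can be sketched roughly as follows. This is not RML syntax and it does not use Hadoop or Spark: the `MAPPING` dictionary is a hypothetical stand-in for a mapping document, and Python's built-in `map` stands in for a distributed map step. The point is only that the mapping is data and the per-record function has no shared state, so it could be applied to partitions independently.

```python
from itertools import chain

# Hypothetical mapping: a subject template plus predicate -> source-field pairs.
MAPPING = {
    "subject": "http://example.org/contract/{id}",
    "predicates": {
        "http://example.org/amount": "amount",
        "http://example.org/contractor": "contractor",
    },
}


def record_to_ntriples(record, mapping=MAPPING):
    """Turn one source record into N-Triples lines with plain string literals."""
    subject = mapping["subject"].format(**record)
    lines = []
    for predicate, field in mapping["predicates"].items():
        value = record.get(field)
        if value is None:
            continue  # missing fields simply produce no triple
        # Escape backslashes and double quotes inside the literal.
        literal = str(value).replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'<{subject}> <{predicate}> "{literal}" .')
    return lines


records = [
    {"id": 1, "amount": 1000, "contractor": "ACME"},
    {"id": 2, "amount": 2500},
]

# map() here plays the role a distributed map/flatMap would play on a cluster.
ntriples = list(chain.from_iterable(map(record_to_ntriples, records)))
print("\n".join(ntriples))
```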
DBpedia as knowledge base for:
• Entity linking and annotations in documents
• Assertion of additional categories for data
• Improvement of multilingual information
• Estimation of data quality of integrated information
according to different features (e.g., provenance)
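
As a rough illustration of the first item, entity linking, the sketch below annotates a text with DBpedia resource URIs by looking up known labels. It is a deliberately naive dictionary matcher, not how DBpedia Spotlight or TellMeFirst work: the `LABEL_INDEX` is a hand-written assumption, and there is no disambiguation or ranking.

```python
import re

# Hypothetical label index; in practice it would be derived from DBpedia labels.
LABEL_INDEX = {
    "stanley kubrick": "http://dbpedia.org/resource/Stanley_Kubrick",
    "eyes wide shut": "http://dbpedia.org/resource/Eyes_Wide_Shut",
    "tom cruise": "http://dbpedia.org/resource/Tom_Cruise",
}


def link_entities(text, index=LABEL_INDEX):
    """Return (surface form, URI, character offset) for each known label in text."""
    annotations = []
    for label, uri in index.items():
        pattern = r"\b" + re.escape(label) + r"\b"
        for m in re.finditer(pattern, text, flags=re.IGNORECASE):
            annotations.append((m.group(0), uri, m.start()))
    # Order annotations by their position in the text.
    return sorted(annotations, key=lambda a: a[2])


text = "Eyes Wide Shut is a 1999 film directed by Stanley Kubrick."
annotations = link_entities(text)
for surface, uri, offset in annotations:
    print(offset, surface, uri)
```

Once a mention is tied to a URI, the other items on the list (extra categories, multilingual labels, provenance-based quality checks) become lookups against that resource.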
Challenges
• Greater accuracy (integrating different datasets)
• Immediacy (near-real time data, from new data sources)
• Flexibility (not constrained by database structure)
• Better analytics (the ability to change the rules)
• Data quality (reliability and effectiveness of data)
Suggestions and/or comments?
Mail
giuseppe.futia@polito.it
Repository GitHub
https://github.com/giuseppefutia/
