Data coherence between OSM and Wikipedia
Cristian Consonni
Fondazione Bruno Kessler
State of the Map 2013 - Birmingham
September 2013
Cristian Consonni Data coherence between OSM and WIkipedia 1 / 16
Outline
1 Introduction
2 The Problem
3 Proposing a Solution
Wikipedia-OSM comparator
Nuts4Nuts
4 Conclusions
5 Questions
Collecting Information About the Real World
Wikipedia and OpenStreetMap are:
collaborative
volunteer-driven
free (as in freedom and as in beer)
Both projects collect information about the real world.
Different Processes and Communities
Wikipedia
anonymous users can edit
entries consist of text (or media)
only encyclopedic subjects
content can be protected from editing in case of problems
OpenStreetMap
only registered users can edit
entries consist of data
everything can be described
content is always editable
Inconsistencies in the data
Data in Wikipedia can be inconsistent with data from OpenStreetMap.
We should compare the data and reconcile the differences.
On Wikipedia, the metro station “Colosseum” is placed inside the Colosseum itself.
On OpenStreetMap, the metro station is correctly placed outside the monument.
OpenStreetMap maps on Wikipedia are provided by the WIWOSM tool by User:Master and User:Kolossos; check it out at:
http://wiki.openstreetmap.org/wiki/WIWOSM
Proposal of the Solution
Two steps towards a solution:
1 Compare the data
Identify links between Wikipedia pages and OSM entities
Extract all the available geographical information
Define metrics to decide whether the data are “close” or not
2 Reconcile the differences
Provide the communities with the results of the previous analysis
Create tools to facilitate the reconciliation
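One possible "closeness" metric for the comparison step is the great-circle distance between the coordinates stored in the two projects. The sketch below is only an illustration: the slides do not say which metric the tools use, and the function names and the 100 m threshold are assumptions chosen for the example.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def is_close(wiki_coord, osm_coord, threshold_m=100.0):
    """True if the two (lat, lon) pairs are within threshold_m metres.

    threshold_m is a hypothetical cut-off, not a value taken from the talk.
    """
    return haversine_m(*wiki_coord, *osm_coord) <= threshold_m
```

A pair that fails `is_close` would be a candidate to report back to the communities. A sensible threshold probably depends on the kind of feature: a church and a whole municipality tolerate very different offsets.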
Comparing the data
Wikipedia-OpenStreetMap comparator
Proof-of-concept: comparing data about churches in Italy:
Wikipedia-OpenStreetMap comparator
source code: https://github.com/CristianCantoro/WOcomparator
Easy case:
pre-defined category of items (a selection on a set of features in OSM, articles with a given template in Wikipedia)
only OSM entities with a wikipedia tag pointing to the Italian edition (it:) were selected
⇒ linking is straightforward.
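In this easy case, the link can be read directly from the OSM `wikipedia` tag, whose value usually has the form `lang:Article title`. The following is a minimal sketch of that idea, not the code of the comparator: the function names are hypothetical, and the element layout (a dict with `id` and `tags`) is an assumption modelled on typical OSM JSON exports.

```python
def parse_wikipedia_tag(value, default_lang="it"):
    """Split an OSM `wikipedia` tag value such as 'it:Duomo di Milano'.

    Returns (language, title). Values without a language prefix are
    assumed to belong to default_lang (an assumption of this sketch).
    """
    value = value.strip()
    lang, sep, title = value.partition(":")
    # Heuristic: treat a short alphabetic prefix as a language code.
    if sep and lang.isalpha() and 2 <= len(lang) <= 3:
        return lang.lower(), title.strip()
    return default_lang, value


def linked_titles(osm_elements, lang="it"):
    """Map OSM element id -> Wikipedia title for elements tagged in `lang`."""
    links = {}
    for element in osm_elements:
        tag = element.get("tags", {}).get("wikipedia")
        if not tag:
            continue  # no explicit link: this is the "hard case"
        tag_lang, title = parse_wikipedia_tag(tag, default_lang=lang)
        if tag_lang == lang:
            links[element["id"]] = title
    return links
```

Elements without the tag are skipped here, which is exactly the part that needs a different approach.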
Comparing the data
Wikipedia-OpenStreetMap comparator
http://it.wikipedia.org/wiki/Utente:CristianCantoro/Georeferenziazione
Comparing the data
Nuts4Nuts
For the hard case (trying to link every possible entity), there is another tool:
Nuts4Nuts
source code: https://github.com/SpazioDati/Nuts4Nuts
http://nuts4nutsrecon.spaziodati.eu/reconcile?queries={%22q0%22:%20{%22query%22:%20%22Palazzo%20Vecchio%22}}
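The example URL above passes a JSON object in the `queries` parameter, keyed by `q0`. The sketch below only builds such a URL and does not send any request, since the service may no longer be online. Batching several names as `q0`, `q1`, ... is an assumption based on the shape of the example, and `build_reconcile_url` is a hypothetical helper name.

```python
import json
from urllib.parse import urlencode

# Endpoint copied from the slide; its current availability is not verified.
RECONCILE_ENDPOINT = "http://nuts4nutsrecon.spaziodati.eu/reconcile"


def build_reconcile_url(names):
    """Build a reconcile URL with one query entry per name (q0, q1, ...)."""
    queries = {f"q{i}": {"query": name} for i, name in enumerate(names)}
    # urlencode percent-encodes the JSON so it is safe inside a query string.
    return RECONCILE_ENDPOINT + "?" + urlencode({"queries": json.dumps(queries)})


if __name__ == "__main__":
    print(build_reconcile_url(["Palazzo Vecchio"]))
```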
Known limitations:
limited to Italy
relies on external services
Dandelion
Nuts4Nuts is built using the infrastructure provided by Dandelion (http://dandelion.eu), a data market by SpazioDati srl.
Future Work
Nuts4Nuts is a step towards finding geographical information for Wikipedia articles that have no explicit coordinates.
Future work:
study new approaches to link entities between Wikipedia and OpenStreetMap
an application to fix inconsistencies or fill in missing data
Conclusions
Wikipedia and OSM collect information about the real world
Comparing data between the two projects can highlight inconsistencies
We should fix them
Questions & Contacts
Questions?
mail: consonni@fbk.eu
twitter: @CristianCantoro
github: https://github.com/CristianCantoro
Thank you
Thank you!
This work was supported by:
A project by:
SpazioDati srl
Edizioni Curcu & Genovese
with funds from the European Regional Development Fund.
More information: http://trentino.dandelion.eu
Copyright notice
This presentation is released under the CC BY-SA 3.0 licence.
Further info:
http://creativecommons.org/licenses/by-sa/3.0/
Logos and trademarks belong to their respective owners.