OpenCage FOSSGIS 2015
http://worldwideberlin.com/
Overview
I. place name disambiguation (homonyms)
– with & without spellcheck
II. Nominatim
III. other (open data) geocoders
– 2015 trends
– opportunities to share data, config, tests
IV. shared ranking/scoring data
OpenCage Geocoder
Which Münster do you mean?
Nominatim geocoder
Mühlheim vs Mülheim
“eifelturm”
“eiffel turm”
“eiffeltower” => no result
“eifel tower”
=> fair ground, Varna Bulgaria (fixed last week)
“eiffel tower”
=> one in Paris
=> replicas around the world
=> restaurants around the world
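The misspelled queries above ("eifelturm", "eiffeltower", "eifel tower") suggest what a spellcheck layer would need to do. A minimal sketch, assuming a tiny hypothetical list of known names and using Python's standard `difflib` for fuzzy matching; this is not how Nominatim or OpenCage work, only an illustration of the idea:

```python
import difflib

# Hypothetical mini-gazetteer; a real geocoder would index millions of names.
KNOWN_NAMES = ["eiffel tower", "eiffelturm", "tour eiffel", "eifel"]


def normalize(query):
    """Lower-case the query and collapse repeated whitespace."""
    return " ".join(query.lower().split())


def suggest(query, names=KNOWN_NAMES, cutoff=0.8):
    """Return up to three known names that closely resemble the query, best first."""
    return difflib.get_close_matches(normalize(query), names, n=3, cutoff=cutoff)
```

Calling `suggest("eiffeltower")` would offer "eiffel tower" as a candidate instead of returning no result. The cutoff of 0.8 is an arbitrary assumption; set it too low and "eifel" (the region) starts matching tower queries.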
http://www.openstreetmap.org/#map=17/39.80885/116.28163
Nominatim
● OSM data, minutely updates
● + UK postal codes, TIGER
● 1TB PostGIS
● import in C, setup scripts in PHP, Postgres stored procedures, PHP frontend, Python & PHP test suite
● autocomplete if you add Photon geocoder
● no spellcheck
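For readers who want to try Nominatim against the homonym examples, here is a minimal sketch that only builds a search URL for the public instance. The helper name and its defaults are assumptions; `q`, `format`, `limit`, `countrycodes` and `accept-language` are parameters of the Nominatim search endpoint.

```python
from urllib.parse import urlencode

NOMINATIM_SEARCH = "https://nominatim.openstreetmap.org/search"


def build_search_url(query, country_codes=None, language=None, limit=5):
    """Build a Nominatim search URL (hypothetical helper; performs no request)."""
    params = {"q": query, "format": "json", "limit": limit}
    if country_codes:
        # Restrict results to the given ISO 3166-1 alpha-2 country codes.
        params["countrycodes"] = ",".join(country_codes)
    if language:
        # Preferred language for the names in the response.
        params["accept-language"] = language
    return f"{NOMINATIM_SEARCH}?{urlencode(params)}"
```

Restricting `countrycodes` to `de` is one way to narrow "Which Münster do you mean?", though it still leaves several candidates inside Germany.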
regression/blackbox tests
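A blackbox test treats the geocoder as a function from query to coordinates and checks only the output. The sketch below is an assumption about what such a test could look like, not the actual Nominatim test suite: the test cases, tolerances and `stub_geocoder` are hypothetical, and any real geocoder client could be passed in instead.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometres."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


# Hypothetical expectations: (query, expected lat, expected lon, tolerance in km).
CASES = [
    ("eiffel tower", 48.8584, 2.2945, 1.0),
    ("Münster, Germany", 51.96, 7.63, 10.0),
]


def run_cases(geocode, cases=CASES):
    """Run every case against `geocode` and return the queries that failed."""
    failures = []
    for query, lat, lon, tolerance_km in cases:
        result = geocode(query)  # expected to return (lat, lon) or None
        if result is None or haversine_km(lat, lon, *result) > tolerance_km:
            failures.append(query)
    return failures


def stub_geocoder(query):
    """Stand-in geocoder that only knows one place."""
    return {"eiffel tower": (48.8583, 2.2944)}.get(query)
```

Because the cases are plain data, the same list could be run against several geocoders to compare them.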
other geocoders
Closed source Open source, high resources Open source, low resources
Google Maps Mapzen “Pelias” OpenStreetMap “Nominatim”
Bing/Yahoo Mapbox “Carmen” OpenCage (multiple)
Mapquest Mapquest open (Nominatim) geonames
ESRI/ArcGIS Online Foursquare “Quattroshapes” geocod.io (Tiger data)
Baidu Scout Photon (Nominatim)
Yandex Cloudmade geo.io (Nominatim)
TomTom DSTK (Tiger, geonames)
Amazon (Android only) SmartyStreets
Telenav ...
Nokia/Ovi/Here
Apple (iOS only)
...
trends
● SSD
● Add commercial sources
● Full builds, downloadable index
● Highly parallel (map/reduce, nodejs), cloud scaling, noSQL
● Community building, guidelines
● Test suites
typical features to improve
● horizontal scaling
● autocomplete
● spellcheck
● improve text parsing (App 3, 111-113b)
● crossings (Main & 2nd N, New Orleans)
● “4km north of $cityname on the N6”
● tests for non-latin alphabets
● postal code boundaries
● localsearch/POIs
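Crossings are a good example of why text parsing is hard: the query names two streets and no house number. A rough sketch, assuming the streets are joined by "&" or "and" and the locality follows the first comma; the function name and output shape are hypothetical, and real queries are far messier:

```python
import re

# Streets in a crossing are assumed to be separated by "&" or the word "and".
_CROSSING_SEPARATOR = re.compile(r"\s+(?:&|and)\s+", re.IGNORECASE)


def parse_crossing(query):
    """Split 'Street A & Street B, Locality' into its parts, or return None."""
    street_part, _, locality = query.partition(",")
    streets = [s.strip() for s in _CROSSING_SEPARATOR.split(street_part) if s.strip()]
    if len(streets) != 2:
        return None  # not recognisable as a crossing
    return {"streets": streets, "locality": locality.strip() or None}
```

For the slide's example, `parse_crossing("Main & 2nd N, New Orleans")` separates "Main" and "2nd N" from the locality "New Orleans"; deciding that "N" is a direction suffix would be a further step.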
what should be shared
● a.k.a. don't reinvent everything
● standard test suite to compare geocoders
● hierarchy data
● address parsing
● address formatting
● language configuration
● data parsing, e.g. OSM tags
openaddresses.io
● 110m addresses
● 10GB of text files
1174 SMITH CREEK WAY, BRASSFIELD, WAKE FOREST, NC 27587
732 STEWARTS ROAD, LANEXA, VA 23124
address formatting
https://github.com/lokku/address-formatting/
– configuration
– test cases for 33 countries
– reference implementation in Perl
{ country_code: 'dk', village: 'Ærøskøbing', county: 'Ærø Municipality',
  house_number: '17A', neighbourhood: 'Paradiset', postcode: '5970',
  road: 'Baggårde', state: 'Region of Southern Denmark' }
Baggårde 17A, 5970 Ærøskøbing, Denmark
Adama Asnyka 1, 59-700 Bolesławiec, Poland
CAI, Cerrito 1250, Retiro, C1010AAZ Buenos Aires, Argentina
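The idea of turning address components into a country-specific line can be sketched with simple per-country templates. The templates, the country-name table and the function below are illustrative assumptions only; the lokku/address-formatting project has its own configuration and a Perl reference implementation.

```python
# Hypothetical per-country templates; each entry becomes one comma-separated part.
TEMPLATES = {
    "dk": ["{road} {house_number}", "{postcode} {village}", "{country}"],
    "default": ["{house_number} {road}", "{postcode} {city}", "{country}"],
}

# Hypothetical lookup from country code to display name.
COUNTRY_NAMES = {"dk": "Denmark"}


class _BlankMissing(dict):
    """Dictionary that yields an empty string for unknown components."""

    def __missing__(self, key):
        return ""


def format_address(components):
    """Format address components using the template for their country."""
    code = components.get("country_code", "").lower()
    template = TEMPLATES.get(code, TEMPLATES["default"])
    values = _BlankMissing(components)
    values.setdefault("country", COUNTRY_NAMES.get(code, ""))
    parts = []
    for pattern in template:
        # Fill the pattern, then collapse whitespace left by missing components.
        text = " ".join(pattern.format_map(values).split())
        if text:
            parts.append(text)
    return ", ".join(parts)
```

With the Danish components shown above, this sketch yields the same line as the slide, "Baggårde 17A, 5970 Ærøskøbing, Denmark". Components the template does not mention (county, neighbourhood, state) are simply dropped.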
wikipedia data
core geocoding logic
1. tokenize
2. filter
  • fixed bounding box, browser window, country
  • OSM tags/POI search
  • min-max admin
3. search
4. rank
  • country bias
  • language bias (client, explicit)
  • location boost (client, explicit, history)
  • maybe: spellcheck
  • maybe: retry/failover/remove phrases
  • importance boost
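The four steps above can be sketched as a toy pipeline over an in-memory list. Everything here is a simplifying assumption (the `Place` record, the sample data, the importance values and the size of the country boost); a real geocoder does each step against a large index.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Place:
    name: str
    country: str       # ISO country code
    kind: str
    importance: float  # hypothetical score between 0 and 1


# Hypothetical sample data.
PLACES = [
    Place("Münster", "de", "city", 0.6),
    Place("Münster", "at", "village", 0.2),
    Place("Munster", "fr", "town", 0.4),
]


def tokenize(text):
    """Step 1: lower-case and split on whitespace and commas."""
    return text.lower().replace(",", " ").split()


def filter_places(places, country=None):
    """Step 2: hard filter, here only by country."""
    return [p for p in places if country is None or p.country == country]


def search(tokens, places):
    """Step 3: keep places whose name tokens all occur in the query."""
    wanted = set(tokens)
    return [p for p in places if set(tokenize(p.name)) <= wanted]


def rank(candidates, bias_country=None, boost=0.5):
    """Step 4: sort by importance plus an optional country bias."""
    def score(place):
        return place.importance + (boost if place.country == bias_country else 0.0)
    return sorted(candidates, key=score, reverse=True)


def geocode(query, country=None, bias_country=None):
    """Run tokenize, filter, search and rank in sequence."""
    candidates = search(tokenize(query), filter_places(PLACES, country))
    return rank(candidates, bias_country)
```

The difference between filter and rank shows up in the results: `country="at"` removes the German city entirely, while `bias_country="at"` only moves the Austrian village ahead of it.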
http://blog.mayflower.de/755-Schnelle-Volltextsuche-mit-Solr.html
map to hierarchy (ranks)
http://wiki.openstreetmap.org/wiki/Nominatim/Development_overview
names, names, names
name is one of many factors
ranking examples:
● Altona
  – type: suburb vs train station vs towns in US/Canada
● Germany
  – admin_level=2 (country) vs island
● Mt Everest
  – importance: viewpoint vs peak vs island
● Oktoberfest
  – actually an alt_name of Theresienwiese
● Königsberg
  – 10x a peak, 1x old_name of Kaliningrad
● Hitlerberg
  – old_name:1934-1945 of Heigelkopf
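The Königsberg case shows why the matched name tag cannot decide the ranking alone. A minimal sketch, assuming a score that multiplies a place's importance by a weight for the tag that matched; the weights, importance values and candidate records are invented for illustration and are not Nominatim's actual formula:

```python
# Hypothetical weights: a match on the primary name counts more than on old names.
NAME_TAG_WEIGHT = {"name": 1.0, "alt_name": 0.8, "old_name": 0.5}

# Hypothetical candidates for the query "Königsberg".
KOENIGSBERG = [
    {"label": "Königsberg (peak)", "matched_tag": "name", "importance": 0.1},
    {"label": "Kaliningrad", "matched_tag": "old_name", "importance": 0.7},
]


def score(candidate):
    """Combine importance with the weight of the name tag that matched."""
    weight = NAME_TAG_WEIGHT.get(candidate["matched_tag"], 0.3)
    return candidate["importance"] * weight


def best(candidates):
    """Return the highest-scoring candidate."""
    return max(candidates, key=score)
```

Under these assumed numbers Kaliningrad still wins despite matching only on `old_name`, because its importance outweighs the penalty. A different balance of weights would put one of the peaks first.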
status on wikipedia_articles.bin
● version 1: wikipedia pageview logs
  – https://en.wikipedia.org/wiki/Wikipedia:Notability
● version 2 (current): parsing wikipedia articles and counting links
  – last updated 2013
  – 80m wikipedia entries + 15m redirects
  – 0.6m places in OSM have wikipedia tag set (2013: 0.4m)
● version 3 (TBD): parsing wikipedia geo exports
  – http://de.wikipedia.org/wiki/Wikipedia:WikiProjekt_Georeferenzierung/Hauptseite/Wikipedia-World/en
  – 3.4m entries, more languages, regular dumps, new documentation
● version 4 (?)
  – used wikidata exports
  – used by multiple geocoders
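One way to turn the link counts of version 2 into a ranking signal is to squash them onto a 0..1 scale. The logarithmic normalisation below is an assumption chosen for illustration, not a description of how wikipedia_articles.bin is actually built:

```python
import math


def importance(link_count, max_link_count):
    """Map an article's inbound link count to a 0..1 importance (assumed formula)."""
    if max_link_count <= 0 or link_count <= 0:
        return 0.0
    # Logarithms keep a handful of very heavily linked articles from dominating.
    return math.log(1 + link_count) / math.log(1 + max_link_count)
```

On this scale an article with a tenth of the maximum link count still scores well above 0.1, so moderately known places are not pushed to the bottom.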
what can mappers do?
● add wikipedia tags
● fix administrative levels
● don't add wrong names (typos)
● file bugs (github)
http://nominatim.openstreetmap.org/
… and if all else fails: rename the city
Questions ?
mtm@opencagedata.com

Geocoding Overview