GeoNames
    “Under the Hood: How GeoNames Aggregates Many Sources into One Data Set”

    GeoNames is ... an aggregator of free geo data
    I am ... Marc Wick, self-employed software engineer, Switzerland
GeoNames Feature Density Map




GeoNames, Marc Wick   Web 2.0 Expo - 8. Nov 2007 Berlin   2
GeoNames - Gazetteer
    Pragmatic, useful, ease of use
    Over 6.5 million features
    CC-BY licence
    9 feature classes




Screen shot Berlin




Origins and Goal
    Proprietary application
    Team up
    Contribute modifications to a central database
    Applications switch from proprietary aggregation to GeoNames




Challenge
    A lot of data is available
    Many providers
    Languages
    Scripts




GeoNames Ambassadors
    GeoNames contact
    Speak local language
    Know local situation




Data Sources
    National Mapping Agencies
    Statistical Offices
    Postal codes
    National Geospatial-Intelligence Agency (NGA)
    Applications using GeoNames
      −   Data files
      −   Manual modifications


US vs Europe
    US data is freely available
    European data is not available
    Rest of the World?
    Consequences




Future of geodata availability
    We believe basic geodata will be free in most countries

    Why:
      −   Economy
      −   Traffic policy and road safety (road signs)




Free Availability is only a First Step




Who aggregates data
    GeoNames
    Supranational mapping agencies
    Supranational organisations


    INSPIRE




Problems and Solutions I
    Shape / GML → FWTools / GDAL/OGR
    Datum reprojection → PostGIS / EPSG / native tools / custom implementation




Problems and Solutions II
    FeatureCodes not 1:1 → pattern matching
    Non-ASCII → transliteration
    Country codes
    Admin1 codes
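
A minimal sketch of one way to reduce non-ASCII names to ASCII, using only the Python standard library. This is accent folding, a simplification and not the transliteration GeoNames actually performs; the function name `ascii_fold` is hypothetical. Names in non-Latin scripts (Cyrillic, Arabic, Thai, ...) need a real transliteration table and come out empty here.

```python
import unicodedata


def ascii_fold(name: str) -> str:
    """Strip diacritics and drop anything that has no ASCII equivalent."""
    # NFKD splits "ü" into "u" plus a combining diaeresis.
    decomposed = unicodedata.normalize("NFKD", name)
    # Drop the combining marks, keeping the base letters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Anything still outside ASCII (e.g. Cyrillic letters) is discarded.
    return stripped.encode("ascii", "ignore").decode("ascii")


if __name__ == "__main__":
    for place in ("Zürich", "São Paulo", "Genève", "Берлин"):
        print(repr(place), "->", repr(ascii_fold(place)))
```

The empty result for the Cyrillic name shows why accent folding alone is not enough for a multilingual gazetteer.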




Place name matching
    Geocoding
    Distance
    Feature type and feature code
    Reverse geocoding, compare name similarity
      −   Levenshtein distance
      −   letter pair similarity
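
The two name-similarity measures named above can be sketched as follows. This is an illustrative implementation, not the GeoNames code: the function names are hypothetical, and letter pair similarity is assumed here to mean the Dice coefficient over adjacent character pairs.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character inserts, deletes and substitutions."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def letter_pairs(text: str) -> list[str]:
    """Adjacent letter pairs of each whitespace-separated word, upper-cased."""
    pairs = []
    for word in text.upper().split():
        pairs.extend(word[i:i + 2] for i in range(len(word) - 1))
    return pairs


def letter_pair_similarity(a: str, b: str) -> float:
    """Dice coefficient: 2 * shared pairs / (pairs in a + pairs in b)."""
    pairs_a, pairs_b = letter_pairs(a), letter_pairs(b)
    total = len(pairs_a) + len(pairs_b)
    if total == 0:
        return 0.0
    shared = 0
    for pair in pairs_a:
        if pair in pairs_b:
            pairs_b.remove(pair)  # each pair may be matched only once
            shared += 1
    return 2.0 * shared / total


if __name__ == "__main__":
    print(levenshtein("Berlin", "Berlino"))
    print(letter_pair_similarity("Berlin", "Berlino"))
```

Levenshtein distance is an absolute count, so it penalises long names more than short ones; the pair-based score is normalised to the range 0 to 1, which makes it easier to combine with distance and feature-code checks.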




Wikipedia GeoTemplates
    Proliferation of GeoFormats
    No consensus, Anarchy
    Examples
      −   <geo>48 46 36 N 121 48 51 W</geo>
      −   {{coor d|48.7767|N|121.8142|W|}}
      −   Berlin : |lat_deg = 52|lat_min = 31
      −   ... (Any template you could possibly think of is used somewhere)
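
To illustrate why the variety of formats is a problem, here is a hedged sketch that normalises the first two example notations to decimal degrees. The regular expressions are assumptions fitted to the two examples on this slide only; real Wikipedia markup has many more variants, and the infobox style (`lat_deg`, `lat_min`) is left out because only a fragment of it is shown.

```python
import re

# <geo>48 46 36 N 121 48 51 W</geo>: degrees, minutes, seconds, hemisphere
GEO_TAG = re.compile(
    r"<geo>\s*(\d+)\s+(\d+)\s+(\d+(?:\.\d+)?)\s+([NS])\s+"
    r"(\d+)\s+(\d+)\s+(\d+(?:\.\d+)?)\s+([EW])\s*</geo>"
)
# {{coor d|48.7767|N|121.8142|W|}}: decimal degrees, hemisphere
COOR_D = re.compile(
    r"\{\{coor d\|(\d+(?:\.\d+)?)\|([NS])\|(\d+(?:\.\d+)?)\|([EW])\|[^}]*\}\}"
)


def to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds to signed decimal degrees."""
    value = float(degrees) + float(minutes) / 60 + float(seconds) / 3600
    return -value if hemisphere in ("S", "W") else value


def parse_coordinates(wikitext):
    """Return (lat, lon) in decimal degrees, or None if no known format matches."""
    match = GEO_TAG.search(wikitext)
    if match:
        return (to_decimal(*match.group(1, 2, 3, 4)),
                to_decimal(*match.group(5, 6, 7, 8)))
    match = COOR_D.search(wikitext)
    if match:
        return (to_decimal(match.group(1), 0, 0, match.group(2)),
                to_decimal(match.group(3), 0, 0, match.group(4)))
    return None


if __name__ == "__main__":
    print(parse_coordinates("<geo>48 46 36 N 121 48 51 W</geo>"))
    print(parse_coordinates("{{coor d|48.7767|N|121.8142|W|}}"))
```

Each additional template needs its own pattern, which is the maintenance cost the slide is pointing at.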



Alternate Names

  ...
  Italian : Berlino
  English : Berlin
  Arabic  : برلين
  Korean  : 베를린
  Thai    : เบอร์ลิน
  Russian : Берлин
  Chinese : 柏林
  Marathi : बर्लिन
  ... (ca. 100 names)
Postal codes
    Geocode – postal code numeric distance
    Accuracy, completeness


    ScribbleMaps by Robert Kosara




Data Dump
    Flat CSV files
    Simple format
    Ease of use
    Full daily dump
    Daily modifications
    RDF
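
A minimal sketch of reading such a flat file with the Python standard library. It assumes the tab-separated layout and column order described in the current GeoNames dump readme, which may differ from the 2007 format; the column names, the function `read_features` and the sample row are illustrative, and the sample values are placeholders, not real dump data.

```python
import csv
import io

# Assumed column order (from the current GeoNames dump readme).
COLUMNS = [
    "geonameid", "name", "asciiname", "alternatenames", "latitude",
    "longitude", "feature_class", "feature_code", "country_code", "cc2",
    "admin1_code", "admin2_code", "admin3_code", "admin4_code",
    "population", "elevation", "dem", "timezone", "modification_date",
]


def read_features(stream):
    """Yield one dict per line of a tab-separated dump."""
    # QUOTE_NONE: quote characters in names are data, not field delimiters.
    reader = csv.reader(stream, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        record = dict(zip(COLUMNS, row))
        record["latitude"] = float(record["latitude"])
        record["longitude"] = float(record["longitude"])
        # Alternate names are stored as one comma-separated field.
        record["alternatenames"] = [
            n for n in record["alternatenames"].split(",") if n
        ]
        yield record


# Placeholder row for illustration; unknown fields are left empty.
_fields = ["1", "Berlin", "Berlin", "Berlino,Берлин",
           "52.52", "13.41", "P", "PPLC", "DE"] + [""] * 10
SAMPLE = "\t".join(_fields) + "\n"

if __name__ == "__main__":
    for feature in read_features(io.StringIO(SAMPLE)):
        print(feature["name"], feature["latitude"], feature["longitude"])
```

For a real file, open it with `encoding="utf-8"` and `newline=""` and pass the file object to `read_features` in place of the `StringIO` wrapper.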



Web Services
    Search
      −   Ranking
              TF-IDF
              Relevancy
      −   I18n
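
As a rough illustration of TF-IDF ranking, the sketch below scores a few toy documents against a query. It uses the textbook formula (term frequency times a smoothed inverse document frequency), which is an assumption for illustration; a full-text engine's actual scoring has additional factors, and the slides do not describe how relevancy is combined with it. The function name and sample documents are hypothetical.

```python
import math
from collections import Counter


def tf_idf_scores(query: str, documents: list[str]) -> list[float]:
    """Return one TF-IDF score per document for the given query."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            # Document frequency: how many documents contain the term.
            df = sum(1 for other in tokenized if term in other)
            if df == 0 or not tokens:
                continue
            tf = counts[term] / len(tokens)
            idf = math.log(n_docs / df) + 1.0  # rarer terms weigh more
            score += tf * idf
        scores.append(score)
    return scores


if __name__ == "__main__":
    docs = [
        "Berlin Germany",
        "Berlin New Hampshire United States",
        "Paris France",
    ]
    for doc, score in zip(docs, tf_idf_scores("berlin germany", docs)):
        print(f"{score:.3f}  {doc}")
```

Text scoring alone cannot tell a capital from a small town of the same name, which is presumably why the slide lists relevancy as a separate ranking input.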




Hierarchy Web Services
    Hierarchy
    Child
    Neighbour
    Sibling
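
The hierarchy, child and sibling lookups can be pictured with a toy parent table. This is only a sketch of the concepts: the table, the place names in it and the function names are hypothetical and do not reflect the GeoNames data model or web service API. Neighbours are omitted because they need adjacency data that a parent table does not hold.

```python
# Toy parent table: each place points to its parent, the root points to None.
PARENT = {
    "Europe": None,
    "Germany": "Europe",
    "France": "Europe",
    "Berlin": "Germany",
    "Bavaria": "Germany",
}


def hierarchy(name: str) -> list[str]:
    """Path from the root down to the given place."""
    path = []
    current = name
    while current is not None:
        path.append(current)
        current = PARENT[current]
    return list(reversed(path))


def children(name: str) -> list[str]:
    """Direct children of the given place, sorted by name."""
    return sorted(child for child, parent in PARENT.items() if parent == name)


def siblings(name: str) -> list[str]:
    """Places sharing the same parent, excluding the place itself."""
    parent = PARENT[name]
    if parent is None:
        return []
    return [child for child in children(parent) if child != name]


if __name__ == "__main__":
    print(hierarchy("Berlin"))
    print(children("Europe"))
    print(siblings("Berlin"))
```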




[Architecture diagram]
    Apache (mod_rewrite)
    Tomcat (Java)
      −   ROME (RSS)
      −   jdom.org (XML)
      −   JSON
      −   JMS (ActiveMQ)
    Lucene (full-text index, TF-IDF)
    JDBC
    Database: Postgres (PostGIS)
    SRTM3, Gtopo30
Libraries
    Java
    Drupal
    Ruby
    PHP
    Perl
    Python
    Lisp

Synchronization
    Daily dump
    Daily modification
    JMS

    RDF dump, periodically




Linked Data




Applications using GeoNames
    Thousands of applications
    Search
    Site navigation
    Geocoding




Thank you for your attention.





Under the Hood: How GeoNames Aggregates Over 35 Sources into One Data Set

  • 1. GeoNames: "Under the Hood: How GeoNames Aggregates Many Sources into One Data Set". GeoNames is an aggregator of free geo data. I am Marc Wick, self-employed software engineer, Switzerland.
  • 2. GeoNames Feature Density Map (GeoNames, Marc Wick, Web 2.0 Expo, 8 Nov 2007, Berlin)
  • 3. GeoNames - Gazetteer: pragmatic, useful, ease of use; over 6.5 million features; CC-BY licence; 9 feature classes
  • 4. Screen shot Berlin
  • 5. Origins and Goal: proprietary application; team up together; contribute modifications to a central database; applications switch to GeoNames from proprietary aggregation
  • 6. Challenge: a lot of data IS available; many providers; languages; scripts
  • 7. GeoNames Ambassadors: GeoNames contact; speak local language; know local situation
  • 8. Data Sources: national mapping agencies; statistical offices; postal codes; National Geospatial-Intelligence Agency (NGA); applications using GeoNames (data files, manual modifications)
  • 9. US vs Europe: US data is freely available; European data is not available; rest of the world?; consequences
  • 10. (no text on slide)
  • 11. Future of geodata availability: we believe basic geodata will be free in most countries. Why: economy; traffic policy and road safety (road signs)
  • 12. (no text on slide)
  • 13. Free Availability is only a First Step
  • 14. Who aggregates data: GeoNames; supranational mapping agencies; supranational organisations; INSPIRE
  • 15. Problems and Solutions I: Shape / GML (FWTools, GDAL/OGR); datum reprojection (PostGIS, EPSG, native tools, custom implementation)
  • 16. Problems and Solutions II: FeatureCodes not 1:1 (pattern matching); non-ASCII (transliteration); country codes; Admin1 codes
  • 17. Place name matching: geocoding; distance, feature type and feature code; reverse geocoding, compare name similarity (Levenshtein distance, letter pair similarity)
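The two name-similarity measures named on slide 17 can be sketched as follows. This is a minimal illustration, not the GeoNames implementation: the function names are hypothetical, and "letter pair similarity" is assumed here to mean the Dice coefficient over adjacent letter pairs.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits turning a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def letter_pairs(text: str) -> list:
    """Adjacent letter pairs of every whitespace-separated word, lowercased."""
    pairs = []
    for word in text.lower().split():
        pairs.extend(word[i:i + 2] for i in range(len(word) - 1))
    return pairs


def letter_pair_similarity(a: str, b: str) -> float:
    """Dice coefficient over letter pairs (assumed definition), in [0, 1]."""
    pairs_a, pairs_b = letter_pairs(a), letter_pairs(b)
    total = len(pairs_a) + len(pairs_b)
    if total == 0:
        return 1.0 if a.lower() == b.lower() else 0.0
    remaining = list(pairs_b)
    shared = 0
    for pair in pairs_a:
        if pair in remaining:
            shared += 1
            remaining.remove(pair)  # each pair in b may match only once
    return 2.0 * shared / total


edit_distance = levenshtein("Berlin", "Berlino")
pair_score = letter_pair_similarity("Berlin", "Berlino")
```

Edit distance is sensitive to the length of the names, while the pair score is normalised, so a matcher might combine either one with the distance and feature-code checks listed on the slide.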
  • 18. (no text on slide)
  • 19. Wikipedia GeoTemplates: proliferation of geo formats; no consensus, anarchy. Examples: <geo>48 46 36 N 121 48 51 W</geo>; {{coor d|48.7767|N|121.8142|W|}}; Berlin: |lat_deg = 52|lat_min = 31; ... (any template you could possibly think of is used somewhere)
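To illustrate why the variety of templates on slide 19 is a problem, here is a hedged sketch of a parser for just the first two example formats. The regular expressions and function names are assumptions made for this illustration; they cover only the exact shapes quoted on the slide, and every further template would need its own pattern.

```python
import re

# <geo>48 46 36 N 121 48 51 W</geo>  (degrees, minutes, seconds)
GEO_TAG = re.compile(
    r"<geo>\s*(\d+)\s+(\d+)\s+(\d+(?:\.\d+)?)\s+([NS])\s+"
    r"(\d+)\s+(\d+)\s+(\d+(?:\.\d+)?)\s+([EW])\s*</geo>"
)
# {{coor d|48.7767|N|121.8142|W|}}  (decimal degrees)
COOR_D = re.compile(r"\{\{coor d\|([\d.]+)\|([NS])\|([\d.]+)\|([EW])\|[^}]*\}\}")


def to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds to signed decimal degrees."""
    value = float(degrees) + float(minutes) / 60 + float(seconds) / 3600
    return -value if hemisphere in ("S", "W") else value


def parse_coordinates(wikitext):
    """Return (latitude, longitude) or None if no known pattern matches."""
    match = GEO_TAG.search(wikitext)
    if match:
        return (to_decimal(*match.group(1, 2, 3, 4)),
                to_decimal(*match.group(5, 6, 7, 8)))
    match = COOR_D.search(wikitext)
    if match:
        return (to_decimal(match.group(1), 0, 0, match.group(2)),
                to_decimal(match.group(3), 0, 0, match.group(4)))
    return None


from_tag = parse_coordinates("<geo>48 46 36 N 121 48 51 W</geo>")
from_template = parse_coordinates("{{coor d|48.7767|N|121.8142|W|}}")
```

Both slide examples describe the same point, so the two results should agree to about three decimal places. The third example (`lat_deg`, `lat_min`) is deliberately left out because the slide shows only part of it.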
  • 20. Alternate Names: ...; Italian: Berlino; English: Berlin; Arabic: برلين; Korean: 베를린; Thai: เบอร์ลิน; Russian: Берлин; Chinese: 柏林; Marathi: बर्लिन; ... (ca. 100 names)
  • 21. Postal codes: geocode, postal code numeric distance; accuracy, completeness; ScribbleMaps by Robert Kosara
  • 22. (no text on slide)
  • 23. (no text on slide)
  • 24. Data Dump: flat CSV files; simple format; ease of use; full daily dump; daily modifications; RDF
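A flat-file dump like the one on slide 24 can be read with a few lines of standard-library code. The sketch below is an assumption-laden illustration: it assumes tab-separated rows and a shortened column list (`COLUMNS`) modelled on the public GeoNames dump, and the sample row is invented for the example rather than copied from the dump.

```python
import csv
import io

# Assumed leading columns of a dump row; the real file has more fields.
COLUMNS = ["geonameid", "name", "asciiname", "alternatenames",
           "latitude", "longitude", "feature_class", "feature_code",
           "country_code"]


def read_features(handle):
    """Yield one dict per tab-separated row, with typed coordinates."""
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        record = dict(zip(COLUMNS, row))
        record["latitude"] = float(record["latitude"])
        record["longitude"] = float(record["longitude"])
        # Alternate names are assumed to be comma-separated in one field.
        record["alternatenames"] = [
            name for name in record["alternatenames"].split(",") if name
        ]
        yield record


# Illustrative row only (hypothetical id and rounded coordinates).
SAMPLE = "1\tBerlin\tBerlin\tBerlino,Берлин\t52.52\t13.41\tP\tPPLC\tDE\n"
features = list(read_features(io.StringIO(SAMPLE)))
```

For a real file, `io.StringIO(SAMPLE)` would be replaced by `open(path, encoding="utf-8", newline="")`, and streaming row by row keeps memory use flat for a dump with millions of features.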
  • 25. Web Services: search (ranking: TF-IDF, relevancy; i18n)
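The TF-IDF ranking mentioned on slide 25 can be illustrated with a small self-contained scorer. This is a textbook-style sketch under stated assumptions (whitespace tokenisation, `log(N/df) + 1` as the inverse document frequency); it is not Lucene's scoring formula and not the GeoNames ranking, and `tf_idf_scores` is a hypothetical name.

```python
import math
from collections import Counter


def tf_idf_scores(query, documents):
    """Score each document against the query with a plain TF-IDF sum."""
    tokenized = [doc.lower().split() for doc in documents]
    total_docs = len(tokenized)
    terms = query.lower().split()
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        score = 0.0
        for term in terms:
            # Document frequency: in how many documents the term occurs.
            doc_freq = sum(1 for other in tokenized if term in other)
            if doc_freq == 0 or not tokens:
                continue
            term_freq = counts[term] / len(tokens)
            inverse_doc_freq = math.log(total_docs / doc_freq) + 1.0
            score += term_freq * inverse_doc_freq
        scores.append(score)
    return scores


places = ["Berlin Germany",
          "Berlin New Hampshire United States",
          "Bern Switzerland"]
scores = tf_idf_scores("berlin germany", places)
```

With this query the first entry should rank highest, because it matches both terms and the rarer term "germany" carries more weight than "berlin", which occurs in two of the three entries.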
  • 26. (no text on slide)
  • 27. Hierarchy Web Services: hierarchy; child; neighbour; sibling
  • 28. Apache mod_rewrite; ROME (RSS); jdom.org (XML); JSON; Tomcat (Java); JMS ActiveMQ; Lucene; SRTM3; GTOPO30; JDBC; full-text index, TF-IDF; database: Postgres (PostGIS)
  • 29. Libraries: Java; Drupal; Ruby; PHP; Perl; Python; Lisp
  • 30. Synchronization: daily dump; daily modifications; JMS; RDF dump, periodically
  • 31. Linked Data
  • 32. Applications using GeoNames: thousands of applications; search; site navigation; geocoding
  • 33. (no text on slide)
  • 34. (no text on slide)
  • 35. (no text on slide)
  • 36. Thank you for your attention.