2. (CC BY) 2013 Nokia 2
GIS is a lot of things.
The Open Geospatial Consortium (OGC) defines many standards:
• http://www.opengeospatial.org/standards/sfs
The one we are talking about is:
OpenGIS Implementation Specification for Geographic information - Simple feature access - Part 2: SQL option
WHAT is GIS?
3. (CC BY) 2013 Nokia 3
Is the world flat, or a sphere?
Flat: GEOMETRY types. Sphere: GEOGRAPHY types.
4. (CC BY) 2013 Nokia 4
It's neither!
But what about mountains and skyscrapers?
5. (CC BY) 2013 Nokia 5
Projections?
[Figure: points A, B and C]
Distance(A, B) = 0.0001 deg = 11 m
Distance(B, C) = 0.0001 deg = 8.5 m
(in Manhattan)
[Figure: points A, B and C]
All the lines above are straight.
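The two distances above differ because a degree of longitude shrinks with the cosine of the latitude. A minimal sketch of that arithmetic, assuming a spherical Earth with a mean radius of 6,371 km and a latitude of 40.72 (roughly Manhattan); the function name is illustrative only:

```python
import math

EARTH_RADIUS_M = 6_371_000  # assumption: mean Earth radius, spherical model

def degrees_to_meters(delta_deg, lat_deg):
    """Return (north-south, east-west) length in meters of delta_deg at lat_deg."""
    meters_per_deg = math.pi * EARTH_RADIUS_M / 180.0  # one degree of latitude
    north_south = delta_deg * meters_per_deg
    # Meridians converge toward the poles, so a degree of longitude is shorter.
    east_west = north_south * math.cos(math.radians(lat_deg))
    return north_south, east_west

ns, ew = degrees_to_meters(0.0001, 40.72)
print(f"{ns:.1f} m north-south, {ew:.1f} m east-west")
```

Under these assumptions the result is about 11 m north-south and about 8.4 m east-west, close to the rounded figures on the slide.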
6. (CC BY) 2013 Nokia 6
POINT(0 0)
LINESTRING(0 0,1 1,1 2)
POLYGON((0 0,4 0,4 4,0 4,0 0),(1 1, 2 1, 2 2, 1 2,1 1))
...
INSERT INTO geotable ( the_geom, the_name )
VALUES ( ST_GeomFromText('POINT(-126.4 45.32)', 312), 'A Place');
db=# SELECT road_id, ST_AsText(road_geom) AS geom, road_name FROM roads;
 road_id | geom                                    | road_name
---------+-----------------------------------------+------------
       1 | LINESTRING(191232 243118,191108 243242) | Jeff Rd
       2 | LINESTRING(189141 244158,189265 244817) | Geordie Rd
       3 | LINESTRING(192783 228138,192612 229814) | Paul St
       4 | LINESTRING(189412 252431,189631 259122) | Graeme Ave
       5 | LINESTRING(190131 224148,190871 228134) | Phil Tce
       6 | LINESTRING(198231 263418,198213 268322) | Dave Cres
       7 | LINESTRING(218421 284121,224123 241231) | Chris Way
(7 rows)
SELECT the_geom
FROM geom_table
WHERE ST_Distance(the_geom, ST_GeomFromText('POINT(100000 200000)')) < 100
  AND type = 'road';
See also: http://blog.mariadb.org/screencast-mariadb-gis-demo/
Example SQL
7. (CC BY) 2013 Nokia 7
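                           | PostgreSQL          | MySQL & MariaDB | MongoDB | Solr | SQLite
---------------------------+---------------------+-----------------+---------+------+------------
Standard feature           | PostGIS + Extension | +               | +       | +    | Spatialite
Type: Point                | +                   | +               | +       | +    | +
Type: Geometry (x,y)       | +                   | +               | *       | -    | +
Type: Geography (lat, lon) | +                   | -               | *       | -    | -
Type: 3D (ish)             | +                   | -               | -       | -    | -
SRID projections           | +                   | -               | *       | -    | +
Query by radius            | +                   | ~               | +       | +    | ~
Precise decimal math       | -                   | MariaDB         | -       | -    | -
Query by bounding box      | +                   | +               | *       | -    | +
Notes:
• PostgreSQL: Most functions don't support Geography
• MySQL & MariaDB: MyISAM only
• MongoDB: WGS84 only
• Solr: Limited function set.
• SQLite: Indexes have to be explicitly JOINed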
Products that implement GIS
* Since MongoDB 2.4. This evaluation was done on v 2.0.
~ No, but you can query with bounding box (uses index) AND sort that result set by radius.
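The "~" workaround (bounding box first, then order by distance) can be sketched outside the database as follows. This is only an illustration of the idea in plain Python, assuming a spherical Earth and (lon, lat) tuples; the function names are hypothetical, and the final radius cut-off is an addition that the footnote itself does not mention:

```python
import math

EARTH_RADIUS_M = 6_371_000  # assumption: mean Earth radius, spherical model

def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two (lon, lat) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def within_radius(points, lon, lat, radius_m):
    """Bounding-box prefilter, then exact distance filter, nearest first."""
    dlat = math.degrees(radius_m / EARTH_RADIUS_M)
    dlon = dlat / math.cos(math.radians(lat))  # not valid close to the poles
    # Step 1: cheap box test (the part a database can answer from a spatial index).
    in_box = [p for p in points
              if abs(p[0] - lon) <= dlon and abs(p[1] - lat) <= dlat]
    # Step 2: exact distance on the small candidate set only.
    scored = [(haversine_m(lon, lat, p[0], p[1]), p) for p in in_box]
    return [p for d, p in sorted(scored) if d <= radius_m]

points = [(-74.001417, 40.719811), (-74.0020, 40.7200), (-73.9000, 40.8000)]
print(within_radius(points, -74.0015, 40.7198, 100))
```

The box test discards the far-away point without computing a distance for it, which is why the database version can still use its index.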
8. (CC BY) 2013 Nokia 8
Spatial use cases
Geocoding (text search): Canal Street, New York, USA -> -74.001417, 40.719811
Reverse Geocoding (GIS): -74.001417, 40.719811 -> Canal Street, New York, USA
Points-of-Interest
We are here
9. (CC BY) 2013 Nokia 9
• Scan HERE.com with script:
  40.48, -75.23 to 42.42, -73.38
  New York City + 4 neighbor states + Atlantic Ocean
• 0.0001 deg steps = 11 m vertically, 8.5 m horizontally
• 358M points, 9.6M unique locations
• 7 days
Creating my data set
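The point count can be checked from the bounding box and the step size. A small sketch of that arithmetic, using `Decimal` to avoid floating-point rounding in the division; the helper name is illustrative and this is not the scanning script itself:

```python
from decimal import Decimal

# Bounding box and step size as given on the slide (degrees).
LAT_MIN, LAT_MAX = Decimal("40.48"), Decimal("42.42")
LON_MIN, LON_MAX = Decimal("-75.23"), Decimal("-73.38")
STEP = Decimal("0.0001")

def grid_steps(lo, hi, step):
    """Number of step-sized intervals between lo and hi."""
    return int((hi - lo) / step)

lat_steps = grid_steps(LAT_MIN, LAT_MAX, STEP)  # rows of the grid
lon_steps = grid_steps(LON_MIN, LON_MAX, STEP)  # columns of the grid
total = lat_steps * lon_steps
print(lat_steps, lon_steps, total)
```

The product comes to roughly 359 million grid cells, consistent with the 358M points quoted above.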
10. (CC BY) 2013 Nokia 10
SELECT * FROM Location
JOIN Point ON Location.id=Point.LocationId
WHERE Location.id=1;
id | Label                                              | Country | State | County
 1 | E Sawmill Rd, Haycock Twp, PA 18951, United States | USA     | PA    | Bucks

PostalCode | City    | District | Street       | House Number | Location Type
18951      | Haycock |          | E Sawmill Rd |              | street

Location : Point = 1:n
11. (CC BY) 2013 Nokia 11
• GIS functions used: ST_Envelope(), ST_Union()
• Limitations in Geography type
• 12 days. Bottlenecked by CPU
Creating areas out of points
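As a rough illustration of what an envelope is, the sketch below computes the axis-aligned bounding box of each location's points in plain Python and prints it as WKT. It is not the procedure used for this data set, which relied on ST_Envelope() and ST_Union() inside the database; the function names and the sample rows are made up for the example:

```python
from collections import defaultdict

def envelope_wkt(points):
    """Axis-aligned bounding box of (lon, lat) points as a WKT polygon."""
    lons = [p[0] for p in points]
    lats = [p[1] for p in points]
    x0, x1, y0, y1 = min(lons), max(lons), min(lats), max(lats)
    return f"POLYGON(({x0} {y0},{x1} {y0},{x1} {y1},{x0} {y1},{x0} {y0}))"

def areas_by_location(rows):
    """Group (location_id, lon, lat) rows and compute one box per location."""
    groups = defaultdict(list)
    for location_id, lon, lat in rows:
        groups[location_id].append((lon, lat))
    return {loc: envelope_wkt(pts) for loc, pts in groups.items()}

# Hypothetical sample rows: (location_id, lon, lat)
rows = [(1, 0, 0), (1, 2, 1), (2, 5, 5), (2, 6, 7)]
print(areas_by_location(rows))
```

A bounding box over-covers any location that is not rectangular, so it is only a coarse stand-in for a real area polygon.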
16. (CC BY) 2013 Nokia 16
SQL with polygons
SELECT *
FROM "GeomArea"
JOIN "Location" ON "GeomArea"."id" = "Location"."id"
WHERE ST_Within(ST_GeomFromEWKT('SRID=4326;POINT(<lon> <lat>)'), "p")
SQL with points
-- <lon>+1, <lat>-1 etc. stand for numbers computed by the client;
-- WKT text cannot contain arithmetic.
SELECT *
FROM Point
JOIN Location ON Point.LocationId = Location.id
WHERE ST_Within(p, ST_GeomFromText('POLYGON((<lon>+1 <lat>+1, <lon>+1 <lat>-1,
<lon>-1 <lat>-1, <lon>-1 <lat>+1,
<lon>+1 <lat>+1))'))
ORDER BY ST_Distance(ST_GeomFromText('POINT(<lon> <lat>)'), p)
MongoDB with points
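# "$near" needs a geospatial index on "p"; coordinates are a list in (lon, lat) order
point = db.point.find( { "p": { "$near" : [ lon, lat ] } } ).limit(1)
location_id = point[0]["LocationId"]
location = db.location.find_one( {"_id": location_id} )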
Reverse geocoding HowTo
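In the "SQL with points" query, the corner coordinates have to be computed by the client before the WKT string is built. One possible way to do that is sketched below; the helper names, the `half_size` parameter and the `%s` placeholder style (as used by several Python DB-API drivers) are assumptions, and the sketch only prepares the query parameters without connecting to a database:

```python
def box_wkt(lon, lat, half_size=1.0):
    """WKT polygon for a square of +/- half_size degrees around (lon, lat)."""
    east, west = lon + half_size, lon - half_size
    north, south = lat + half_size, lat - half_size
    # Same corner order as on the slide: NE, SE, SW, NW, back to NE.
    return (f"POLYGON(({east} {north}, {east} {south}, "
            f"{west} {south}, {west} {north}, {east} {north}))")

def point_wkt(lon, lat):
    """WKT for a single point; note the (lon, lat) order."""
    return f"POINT({lon} {lat})"

# The query from the slide, with the WKT strings passed as bound parameters.
QUERY = """
SELECT *
FROM Point
JOIN Location ON Point.LocationId = Location.id
WHERE ST_Within(p, ST_GeomFromText(%s))
ORDER BY ST_Distance(ST_GeomFromText(%s), p)
"""

params = (box_wkt(-74.0, 40.5, 0.5), point_wkt(-74.0, 40.5))
print(params)
```

Passing the WKT as parameters rather than pasting numbers into the SQL text keeps the query string constant across lookups.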
17. (CC BY) 2013 Nokia 17
CentOS 6
8 CPUs, 32GB RAM, all tests with data set in RAM
PostGIS 9.1
MySQL 5.6.9 RC
MariaDB 5.5.29
MongoDB 2.0.7
Versions
18. (CC BY) 2013 Nokia 18
                         | My data (GB) | World (GB)
-------------------------+--------------+-----------
PostGIS polygons         | 34           | 165 240
PostGIS points           | 70           | 340 200
MySQL & MariaDB polygons | 3.9          | 18 954
MySQL & MariaDB points   | 18           | 87 480
MongoDB                  | 71           | 345 060
Data size (note that my data set is not packed for optimal size)
Size for World is extrapolated with a multiplier of 4860
This is based on 30% of the Earth's surface being land
Polygons could be smoothed to reduce data set size by a factor of 20-100
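The World column is simply the measured size times the multiplier. A short sketch that reproduces it from the figures above; the variable names are illustrative, and the TB conversion (1 TB = 1000 GB) is an added convenience:

```python
MULTIPLIER = 4860  # extrapolation factor stated on the slide

# Measured sizes in GB, copied from the table.
measured_gb = {
    "PostGIS polygons": 34,
    "PostGIS points": 70,
    "MySQL & MariaDB polygons": 3.9,
    "MySQL & MariaDB points": 18,
    "MongoDB": 71,
}

world_gb = {name: round(size * MULTIPLIER) for name, size in measured_gb.items()}
for name, size in world_gb.items():
    print(f"{name}: {size:,} GB (about {size / 1000:.0f} TB)")
```

Read this way, the World figures are in the range of tens to hundreds of terabytes.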
23. (CC BY) 2013 Nokia 23
• Nice linear scalability, stable response times
• Most advanced, but "bolted on" user experience
• Wasteful in CPU and data size
• Decent on disk bound workload
• Polygon based performance a small disappointment
• Wishlist:
• No more features needed.
• Ease of use and performance please.
• Future: Real 3D
PostGIS Summary
24. (CC BY) 2013 Nokia 24
MongoDB
• Simple: Radius from point (Foursquare)
• Combinations possible: type=restaurant within 1 km
• Single thread performance OK, but didn't scale
• Could be an issue with the benchmark framework
• Main gotcha: don't use a Python dictionary for (lon, lat)
• 2.4 brings lots of enhancements, not covered here.
MongoDB Summary
25. (CC BY) 2013 Nokia 25
• 5x better than anything else
• For Within()
• Contention on sorting by Distance()
• Delivered on the vision of polygon based model
• Different implementations, same performance
• MySQL slightly faster, but within +/- 10%
• MariaDB has precise math operations
• Wishlist:
• Projections (SRID)
• InnoDB support
• Distance() using RTree index
MySQL & MariaDB Summary