Twitter has launched a Geotagging API – we really wanted to enable users to not only talk about “What’s happening?” but also “What’s happening right here?” For a while now, we’ve been watching as users have been trying to geo-tag their tweets through a variety of methods, all of which involve a link to a map service embedded in their Tweet. This talk will delve into how Twitter handles their geocontent including tool suggestions.
As a platform, we’ve tried to make this easier for our users by making location omnipresent throughout the platform and an inherent (but optional) part of a tweet. We’re making the platform about not just time, but also place.
3. Background
Wherehoo (2000)
‣ “The Stuff Around You”
‣ “Wherehoo Server: An interactive location service for software agents and intelligent systems” - J. Youll, R. Krikorian
‣ In your /etc/services file
BusRadio (2004)
‣ Designed mobile computers to play media while also transmitting telemetry
‣ Looked and sounded like a radio - but really a Linux computer
OneHop (2007)
‣ Bluetooth proximity-based social networking
4. Table of Contents
Background
‣ Why are we interested in this?
Twitter’s Geo APIs
‣ How do we allow people to talk about place?
Problem statement
‣ What are we trying to have our system do?
Infrastructure
‣ How is Twitter solving this problem?
11. Original attempts
Adding it to the tweet
‣ Use myloc.me, et al. to add text to the tweet
‣ Localizes mobile phone and puts location “in band”
‣ Takes from 140 characters
Setting profile level locations
‣ Set the user/location of a Twitter user
‣ There is an API for that!
‣ Not on a per-tweet basis and not designed for high frequency updates
16. Geotagging API
Adding it to the tweet
‣ Per-tweet basis
‣ Out of band / pure meta-data
‣ Does not take from the 140 characters
Native Twitter support
‣ Simple way to update status with location data
‣ Ability to remove geotags from your tweets en masse
‣ Using GeoRSS and GeoJSON as the encoding format
‣ Across all Twitter APIs (REST, Search, and Streaming)
19. Search
search (with geocode)
curl "http://search.twitter.com/search.atom?geocode=40.757929%2C-73.985506%2C25km&source=foursquare"
geocode parameter takes “latitude,longitude,radius” where radius has
units of mi or km
...
<title>On the way to ace now, so whenever you can make it I'll be there. (@ Port Imperial Ferry in Weehawken) http://4sq.com/2rq0vO</title>
...
<twitter:geo>
<georss:point>40.7759 -74.0129</georss:point>
</twitter:geo>
...
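The request and response above can be explored without network access. The sketch below, using only the Python standard library, rebuilds the same query string and splits a georss:point value into latitude and longitude. The helper names build_geocode_query and parse_georss_point are hypothetical and not part of any Twitter library.

```python
from urllib.parse import urlencode

SEARCH_URL = "http://search.twitter.com/search.atom"  # endpoint shown on the slide


def build_geocode_query(lat, lon, radius, unit="km", **extra):
    """Return a search URL whose geocode parameter is "latitude,longitude,radius"."""
    if unit not in ("km", "mi"):
        raise ValueError("radius unit must be 'km' or 'mi'")
    params = {"geocode": f"{lat},{lon},{radius}{unit}"}
    params.update(extra)  # e.g. source="foursquare"
    return SEARCH_URL + "?" + urlencode(params)


def parse_georss_point(text):
    """Split a georss:point body ("lat lon", space separated) into two floats."""
    lat, lon = text.split()
    return float(lat), float(lon)


url = build_geocode_query(40.757929, -73.985506, 25, source="foursquare")
point = parse_georss_point("40.7759 -74.0129")
```

Note that urlencode percent-encodes the commas as %2C, matching the curl example.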
28. Trends API
Global trends
‣ Currently on front page of Twitter.com and on search.twitter.com
‣ Analysis of “hot conversations”
‣ Does not take from the 140 characters
Location specific trends
‣ Tweets being localized through a variety of means into trends
‣ Locations exposed over the API as WOEIDs
‣ Can ask for available trends sorted by distance from your location
‣ Querying for a parent of a location will return all locations under it
29. Available locations
trends/available
curl "http://api.twitter.com/1/trends/available.xml"
Optionally takes lat and long parameters; the available trend locations are then returned sorted by distance from you.
<locations type="array">
<location>
<woeid>2487956</woeid>
<name>San Francisco</name>
<placeTypeName code="7">Town</placeTypeName>
<country type="Country" code="US">United States</country>
<url>http://where.yahooapis.com/v1/place/2487956</url>
</location>
...
</locations>
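A response shaped like the one above can be read with the Python standard library. This is a minimal sketch: the SAMPLE string is copied from the slide (with the elided entries dropped), the function names parse_locations and trends_url are hypothetical, and the URL pattern is the trends/woeid.xml form from this deck.

```python
import xml.etree.ElementTree as ET

# Sample copied from the slide; a real response would list many locations.
SAMPLE = """<locations type="array">
  <location>
    <woeid>2487956</woeid>
    <name>San Francisco</name>
    <placeTypeName code="7">Town</placeTypeName>
    <country type="Country" code="US">United States</country>
    <url>http://where.yahooapis.com/v1/place/2487956</url>
  </location>
</locations>"""


def parse_locations(xml_text):
    """Return one dict per <location> with its WOEID, name and country code."""
    root = ET.fromstring(xml_text)
    places = []
    for loc in root.findall("location"):
        places.append({
            "woeid": int(loc.findtext("woeid")),
            "name": loc.findtext("name"),
            "country_code": loc.find("country").get("code"),
        })
    return places


def trends_url(woeid):
    """Build the per-location trends URL for a WOEID."""
    return f"http://api.twitter.com/1/trends/{woeid}.xml"


locations = parse_locations(SAMPLE)
```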
30. Available locations
trends/woeid.xml (trends/twid.xml coming soon)
curl "http://api.twitter.com/1/trends/2487956.xml"
Look up the trends at the given WOEID
<matching_trends type="array">
<trends as_of="2009-12-15T20:19:09Z">
...
<trend url="http://search.twitter.com/search?q=Golden+Globe+nominations" query="Golden+Globe+nominations">Golden Globe nominations</trend>
<trend url="http://search.twitter.com/search?q=%23somethingaintright" query="%23somethingaintright">#somethingaintright</trend>
...
</trends>
</matching_trends>
32. Geo-place API
Support for “names”
‣ Not just coordinates
‣ More contextually relevant
‣ Positive privacy benefits
Increased complexity
‣ Need to be able to look up a list of places
‣ Requires a “reverse geocoder”
‣ Tagging is human-driven and cannot be fully automatic
38. What do we need to build?
‣ Database of places
‣ Given a real-world location, find the programmatic places that it maps to
‣ Spatial search
‣ Method to store places with content
‣ Per user basis
‣ Per tweet basis
40. As background... MySQL + GIS
‣ Ability to index points and do a spatial query
‣ For example, get points within a bounding rectangle
‣ SELECT
MBRContains(GeomFromText(
'POLYGON((0 0,0 3,3 3,3 0,0 0))' ), coord)
FROM geometry
‣ Hard to cache the spatial query
‣ Possibly requires a DB hit on every query
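To make the bounding-rectangle idea concrete, here is a small Python sketch of the test that a query like the one above performs: compute the minimum bounding rectangle (MBR) of a polygon, then check whether a point falls inside it. The function names are hypothetical, and the inclusive boundary handling is an assumption that may not match MySQL exactly.

```python
def mbr(points):
    """Minimum bounding rectangle of (x, y) points as (min_x, min_y, max_x, max_y)."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))


def mbr_contains(rect, point):
    """True if the point lies inside the rectangle (edges count as inside here)."""
    min_x, min_y, max_x, max_y = rect
    x, y = point
    return min_x <= x <= max_x and min_y <= y <= max_y


# Same ring as POLYGON((0 0,0 3,3 3,3 0,0 0)) in the query above.
polygon = [(0, 0), (0, 3), (3, 3), (3, 0), (0, 0)]
rect = mbr(polygon)
```

Because every (rectangle, point) pair is a different question, results like these are hard to cache, which is the concern raised above.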
41. Options
Grid / Quad-tree
‣ Create a grid (possibly nested) of the entire Earth
Geohash
‣ Arbitrarily precise and hierarchical spatial data reference
Space filling curves
‣ Mapping 2D space into 1D while preserving locality
R-Tree
‣ Spatial access data structure
46. Geohash
‣ 37°18′N 121°54′W = 9q9k4
‣ Hierarchical spatial data structure
‣ Precision encoded
‣ Distance captured
‣ Nearby places (usually) share the same prefix
‣ The longer the string match, the closer the places are
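The encoding can be sketched in a few lines of Python. This follows the commonly described geohash scheme (alternately halve the longitude and latitude ranges, emit one bit per halving, pack five bits per base-32 character); it is an illustration, not the code Twitter runs, and geohash_encode is a hypothetical name.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet (no a, i, l, o)


def geohash_encode(lat, lon, precision=5):
    """Encode a latitude/longitude pair as a geohash of `precision` characters."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits = 0
    bit_count = 0
    use_lon = True  # bits alternate, starting with longitude
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # five bits make one base-32 character
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)


# 37°18′N 121°54′W in decimal degrees
san_jose = geohash_encode(37.3, -121.9, precision=5)
```

Each extra character only refines the previous cell, so a shorter hash of the same point is always a prefix of a longer one. That is where the "precision encoded" and shared-prefix properties come from.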
48. Geohash
‣ Possible to do range query in database
‣ Matching based on prefix will return all the points that fit in that
“grid”
‣ Able to store 2D data in a 1D space
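A prefix match is just a range scan over sorted keys, which is why an ordinary one-dimensional index can serve it. The Python sketch below uses a sorted list and bisect as a stand-in for a database index; prefix_range and the sample hashes are hypothetical.

```python
from bisect import bisect_left


def prefix_range(sorted_hashes, prefix):
    """Return every hash in the sorted list that starts with `prefix`."""
    lo = bisect_left(sorted_hashes, prefix)
    # "\uffff" sorts after any geohash character, so this marks the end of the range.
    hi = bisect_left(sorted_hashes, prefix + "\uffff")
    return sorted_hashes[lo:hi]


# Stand-in for an indexed column of geohashes.
hashes = sorted(["9q9k4", "9q9k5", "9q9hv", "9q8yy", "dr5ru"])
nearby = prefix_range(hashes, "9q9k")
```

Here nearby holds the two hashes that fall in the "9q9k" grid cell; the others are skipped without being compared one by one.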
51. Space filling curve
‣ Generalization of geohash
‣ 2D to 1D mapping
‣ Nearness is captured
‣ Recursively can fill up space
depending on resolution desired
‣ Fractal-like pattern can be used to
take up as much room as possible
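One simple space-filling curve is the Z-order (Morton) curve, which interleaves the bits of the two coordinates; geohash uses the same interleaving idea. The sketch below is an illustration of that one curve, with the hypothetical name interleave_bits, and assumes non-negative integer grid coordinates.

```python
def interleave_bits(x, y, bits=16):
    """Map integer grid cell (x, y) to its position on a Z-order curve."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)      # x bits go to even positions
        code |= ((y >> i) & 1) << (2 * i + 1)  # y bits go to odd positions
    return code


# The four cells of the first 2x2 block land on four consecutive positions.
block = sorted(interleave_bits(x, y) for x in (0, 1) for y in (0, 1))
```

Nearness is captured only approximately: cells in the same block stay close on the curve, but neighbours across a block boundary can end up far apart.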
54. R-Tree
‣ Height-balanced tree data
structure for spatial data
‣ Uses hierarchically nested
bounding boxes
‣ Nearby elements are placed in
the same node
57. How do you store precision?
‣ “Precision” is a hard thing to encode
‣ Accuracy can be encoded with an error radius
‣ Twitter opts for tracking the number of decimals passed
‣ 140.0 != 140.00
‣ DecimalTrackingFloat
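The slide names DecimalTrackingFloat but does not show it, so the following Python class is a guess at the idea rather than Twitter's implementation: keep the numeric value together with the number of decimals the client actually sent, so that 140.0 and 140.00 stay distinguishable. Exponent notation such as "1e-5" is not handled.

```python
class DecimalTrackingFloat:
    """A float that remembers how many decimal places it was written with."""

    def __init__(self, text):
        self.value = float(text)
        # Everything after the decimal point counts as stated precision.
        _, _, fraction = text.strip().partition(".")
        self.decimals = len(fraction)

    def __eq__(self, other):
        if not isinstance(other, DecimalTrackingFloat):
            return NotImplemented
        return (self.value, self.decimals) == (other.value, other.decimals)

    def __hash__(self):
        return hash((self.value, self.decimals))

    def __repr__(self):
        # Print with the same number of decimals that came in.
        return f"{self.value:.{self.decimals}f}"


a = DecimalTrackingFloat("140.0")
b = DecimalTrackingFloat("140.00")
```

With this definition a.value == b.value is true while a == b is false, which mirrors the 140.0 != 140.00 point above.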
60. Twitter Infrastructure
‣ Ruby on Rails-ish frontend
‣ Scala-based services backend
‣ MySQL and soon to be Cassandra as the store
‣ RPC to back-end or put items into queues
63. Simplified architecture
‣ R-Tree for spatial lookup
‣ Data provider for front-end lookups
‣ Store place object with envelope of place in R-Tree
‣ Mapping from ID to place object
64. Java Topology Suite (JTS)
‣ http://www.vividsolutions.com/jts/jtshome.htm
‣ Open source
‣ Good for representing and manipulating “geometries”
‣ Has support for fundamental geometric operations
‣ contains
‣ envelope
‣ Has an R-Tree implementation
65. (Figure: point-in-polygon test)
‣ point Inside in polygon? true
‣ point Outside in polygon? false
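A point-in-polygon check like the one JTS offers through contains can be sketched in Python with the classic ray-casting test: count how many polygon edges a horizontal ray from the point crosses, and treat an odd count as inside. This is not JTS code; point_in_polygon is a hypothetical name, and points exactly on an edge are not handled specially.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test; `polygon` is a list of (x, y) vertices in order."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge meets that line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside  # one more crossing to the right
    return inside


square = [(0, 0), (0, 3), (3, 3), (3, 0)]
inside_result = point_in_polygon((1, 1), square)
outside_result = point_in_polygon((4, 1), square)
```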
66. (Figure: lookups against two overlapping regions)
at (0.0, 0.0)
-- region 1
at (1.0, 1.0)
-- region 1
-- region 2
at (2.0, 2.0)
-- region 1
-- region 2
at (3.0, 3.0)
-- region 2
at (4.0, 4.0)
-- empty
67. Java Topology Suite (JTS)
‣ Serializers and deserializers
‣ Well-known text (WKT)
‣ Well-known binary (WKB)
‣ No GeoRSS or GeoJSON support
68. Interface / RPC
‣ RockDove is a backend service
‣ Data provider for front-end lookups
‣ Front end communicates with it over some form of RPC (Thrift, Avro, etc.)
‣ Data could be cached on frontend to prevent lookups
‣ Simple RPC interface
‣ get(id)
‣ containedWithin(lat, long)
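The two calls can be sketched as an in-memory Python class. Everything here is an assumption made for illustration: the Place and PlaceStore names, the envelope layout, and the two sample regions. A linear scan over envelopes stands in for the R-Tree, and a real service would follow the envelope hit with an exact geometry check.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Place:
    id: str
    name: str
    envelope: tuple  # (min_lat, min_long, max_lat, max_long), hypothetical layout


class PlaceStore:
    """In-memory stand-in for the backend's get / containedWithin interface."""

    def __init__(self, places):
        self._by_id = {p.id: p for p in places}  # mapping from ID to place object

    def get(self, place_id):
        """Return the place with this ID, or None if it is unknown."""
        return self._by_id.get(place_id)

    def contained_within(self, lat, long):
        """Return every place whose envelope contains the point."""
        hits = []
        for place in self._by_id.values():  # linear scan instead of an R-Tree
            min_lat, min_long, max_lat, max_long = place.envelope
            if min_lat <= lat <= max_lat and min_long <= long <= max_long:
                hits.append(place)
        return hits


# Two overlapping sample regions (made up for this sketch).
store = PlaceStore([
    Place("r1", "region 1", (0.0, 0.0, 2.0, 2.0)),
    Place("r2", "region 2", (1.0, 1.0, 3.0, 3.0)),
])
names_at_1_1 = [p.name for p in store.contained_within(1.0, 1.0)]
```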
70. Interface / RPC
‣ Watch those RPC queues!
‣ Fail fast and potentially throw “over capacity” messages
‣ get(id) throws OverCapacity
‣ containedWithin(lat, long) throws OverCapacity
‣ Distinguish between write path and read path
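Failing fast can be sketched in Python with a bounded queue: when the queue of pending requests is full, the caller gets an OverCapacity error immediately instead of waiting. OverCapacity is the name used above; BoundedRpcQueue, submit and take are hypothetical, and the limit of two pending requests is arbitrary.

```python
import queue


class OverCapacity(Exception):
    """Raised when the service refuses new work instead of queueing it."""


class BoundedRpcQueue:
    def __init__(self, max_pending):
        self._pending = queue.Queue(maxsize=max_pending)

    def submit(self, request):
        """Enqueue a request, or fail immediately if the queue is full."""
        try:
            self._pending.put_nowait(request)
        except queue.Full:
            raise OverCapacity("too many pending requests") from None

    def take(self):
        """Remove and return the oldest pending request."""
        return self._pending.get_nowait()


rpc_queue = BoundedRpcQueue(max_pending=2)
rpc_queue.submit(("get", "r1"))
rpc_queue.submit(("containedWithin", 37.3, -121.9))
try:
    rpc_queue.submit(("get", "r2"))
    rejected = False
except OverCapacity:
    rejected = True
```

Keeping separate queues for reads and writes is one way to act on the last bullet, so that a burst on one path does not reject traffic on the other.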
71. GeoRuby
‣ http://georuby.rubyforge.org/
‣ Open source
‣ OpenGIS Simple Features Interface Standard
‣ Only good for representing geometric entities
‣ GeoRuby::SimpleFeatures::Geometry::from_ewkb
‣ No GeoJSON serializers
74. Location in Browser
‣ Geolocation API Specification for JavaScript
navigator.geolocation.getCurrentPosition
‣ Does a callback with a position object
‣ position.coords has
‣ latitude and longitude
‣ accuracy
‣ other fields (altitude, altitudeAccuracy, heading, speed)
‣ Support in Firefox 3.5, Chromium, Opera, and others with Google Gears