This document discusses how Twitter uses location data to target advertising. It explains that Twitter has over 284 million monthly active users, with 80% being mobile. It then discusses how Twitter generates user content by encouraging users to share private location data. It also discusses how Twitter monetizes this location data by using it to target ads geographically and based on user attributes and behaviors. Finally, it discusses the technical infrastructure needed to process location data at scale, including issues around data quality, normalization, and geocoding/reverse geocoding.
3 Easy Ways to Reach Financial Freedom: How Twitter Uses Geo to Win Advertising
1. 3 Easy Ways To Reach
Financial Freedom:
How Twitter Uses Geo to Win Advertising
Sen Xu
SIGSpatial 2016
MELT Workshop
Mobile Entity Localization, Tracking and Analysis
4. • Twitter has more than 284 million monthly active users. (October 2014)
• 500 million Tweets are sent per day, or 1 billion every ~2 days. (August 2013)
• More than 300 billion Tweets have been sent since company founding in 2006. (October 2013)
• TPS record: one-second peak of 143,199 Tweets per second, in Japan (August 2013)
• 80% of our active users are mobile users. (October 2014)
• 40% of our active users simply consume content on Twitter.
• Twitter supports 35 different languages. (March 2013)
• 77% of Twitter accounts are outside the U.S. (October 2013)
Twitter
5. • Content Generation
– How to create features that make users want to share private information with us?
• How to get users to turn on location services?
• How to collect user birthday?
– Import third party data: data plumbing
– Data Correctness/Legal Issue/Disputed Territory
• Monetization
– Features for the other side: advertisers
– Targeting: Geo, Age, Interest, Behavior (Follow/Following)
• Service/Technology (AKA How to make your service faster)
– QA your data source
– Tech infra (Geohash-based Reverse-geocoding)
6. How to create features so attractive that users are
willing to share data
Content Generation
20. –A place has id, names, attributes, parents, and geography
•place_id: unique u64 id
•name: one place may have multiple names, but only one preferred name
•Attributes (annotations): open-ended key-value store for custom attributes. For POI, address,
phone, URL, twitter, existence, etc.
•parents: upper administrative level, e.g., in the US, a City’s closest parent is its State (Admin1). Parents can also be
determined by geometry containment, e.g., a POI can have a Neighborhood as parent if the Neighborhood
contains it.
•geography: point (for POI), polygon/multi-polygon (for all other place types). line geometry
Place:
Glossary
POI: Point of Interest. Uses a point (lat,lon) as a simplified representation of a place. Common POIs are
restaurants, landmarks, parks, and dentist offices*
*although POIs can all be interesting/useful under certain occasions, some
will be more interesting than others for geotagging purposes.
Pitney Bowes: Polygonal data vendor (188 countries)
Factual: POI data vendor (49 countries)
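The place model described on this slide can be sketched as a small record type. This is only an illustration under stated assumptions: the field names beyond `place_id`, the split into `preferred_name` and `alternate_names`, and the representation of polygons as lists of (lat, lon) rings are hypothetical and not taken from the actual Geoduck schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]  # (lat, lon)


@dataclass
class Place:
    """Hypothetical sketch of a place record; not the real schema."""
    place_id: int                      # unique id, must fit in an unsigned 64-bit integer
    preferred_name: str                # exactly one preferred name per place
    alternate_names: List[str] = field(default_factory=list)
    # Open-ended key-value annotations (address, phone, URL, ...).
    attributes: Dict[str, str] = field(default_factory=dict)
    # Ids of parent places (administrative level or geometric containment).
    parent_ids: List[int] = field(default_factory=list)
    # Geography: a point for a POI, polygon rings for other place types.
    point: Optional[Point] = None
    polygons: List[List[Point]] = field(default_factory=list)

    def __post_init__(self) -> None:
        if not 0 <= self.place_id < 2 ** 64:
            raise ValueError("place_id must fit in an unsigned 64-bit integer")

    def all_names(self) -> List[str]:
        """Preferred name first, then any alternate names."""
        return [self.preferred_name] + self.alternate_names


# Usage with made-up values.
cafe = Place(
    place_id=42,
    preferred_name="Example Cafe",
    alternate_names=["The Example"],
    attributes={"phone": "555-0100"},
    parent_ids=[7],
    point=(40.75, -73.99),
)
```

Keeping the preferred name in its own field makes the "only one preferred name" rule structural rather than something to validate later.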
21. What kind of data do we need for a fully-fledged Geo Service?
Service | Required Data Set | Rockdove | Geoduck
Geocoding (text to lat/lon), Reverse-Geocoding (lat/lon to text) | Popular geopolitical names and geometry (e.g., Neighborhood, City, State, Country) | Unresolved merge of 13 different data sources of various data quality | Pitney Bowes
Geocoding, Reverse-Geocoding | Polygonal data for specific marketing needs | Unlicensed simplified geometries | Nielsen
Geocoding, Reverse-Geocoding | Useful, high-quality POI | UGC… | Factual
IP reverse-lookup | IP blocks to lat/lon or Place (confidence) | NetAcuity | NetAcuity with user modeling
22.
User-generated places (e.g., Mom’s basement)
Overlaps within the same PlaceType (data bug!)
Historically… Rockdove allows
23.
• Geometries within each PlaceType do not overlap one another
• Keep Reverse-Geocoding (RGC) Trie sane
• Maintain Rockdove ID
• Historically geo-tagged Tweets will display
correctly (deleted)
• Reuse Rockdove ID and update with geometry
• Historically geotagged “New York City” tweets
will be related to the same PlaceID, with updated
geometry and attributes
Requirements for Geoduck
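The non-overlap requirement is what keeps a geohash-keyed reverse-geocoding index simple: each cell can resolve to at most one place per PlaceType. The sketch below illustrates the idea with a standard geohash encoder and a prefix table standing in for the trie. The `ReverseGeocoder` class, its method names, and the choice to approximate a place by a set of geohash cells are assumptions for illustration, not the production design.

```python
from typing import Dict, Optional

BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard geohash alphabet


def geohash_encode(lat: float, lon: float, precision: int = 6) -> str:
    """Standard geohash: interleave longitude and latitude bisection bits."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits, bit_count, use_lon = 0, 0, True
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits, lon_lo = (bits << 1) | 1, mid
            else:
                bits, lon_hi = bits << 1, mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits, lat_lo = (bits << 1) | 1, mid
            else:
                bits, lat_hi = bits << 1, mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base-32 character
            chars.append(BASE32[bits])
            bits, bit_count = 0, 0
    return "".join(chars)


class ReverseGeocoder:
    """Hypothetical prefix index for ONE PlaceType: geohash cell -> place id."""

    def __init__(self, precision: int = 6) -> None:
        self.precision = precision
        self.cells: Dict[str, int] = {}

    def add_cell(self, prefix: str, place_id: int) -> None:
        # Two cells overlap exactly when one prefix extends the other.
        # Reject overlaps between different places to keep the index sane.
        for known, owner in self.cells.items():
            overlaps = known.startswith(prefix) or prefix.startswith(known)
            if overlaps and owner != place_id:
                raise ValueError(f"cell {prefix} overlaps place {owner}")
        self.cells[prefix] = place_id

    def lookup(self, lat: float, lon: float) -> Optional[int]:
        # Walk from the most specific cell to the coarsest one.
        code = geohash_encode(lat, lon, self.precision)
        for length in range(len(code), 0, -1):
            place_id = self.cells.get(code[:length])
            if place_id is not None:
                return place_id
        return None


rgc = ReverseGeocoder()
rgc.add_cell("ezs4", 101)           # made-up place id covering cell "ezs4"
found = rgc.lookup(42.6, -5.6)      # this point encodes to "ezs42..."
```

A real index would cover each polygon with cells of mixed sizes and still test the point against the exact geometry near borders; the prefix walk only narrows the candidates.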
26.
• Duplicate places come from different vendors with slightly different names and
geometries
• Simple solution: for each incoming place, find potential candidates by name match
(Levenshtein distance), then validate them using geometry
Conflation Challenges
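The simple solution above can be sketched in two stages: a name filter using Levenshtein distance, then a geometry check. This is a minimal sketch under assumptions: places are reduced to a single (lat, lon) point, "validate using geometry" is approximated by a haversine distance threshold, and the function names and the thresholds `max_edits` and `max_meters` are hypothetical.

```python
import math
from typing import List, Optional, Tuple

Point = Tuple[float, float]            # (lat, lon) in degrees
PlaceRow = Tuple[int, str, Point]      # (place_id, name, location)


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance, one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def haversine_m(p: Point, q: Point) -> float:
    """Great-circle distance in meters on a spherical Earth."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371000.0 * math.asin(math.sqrt(a))


def normalize(name: str) -> str:
    """Lowercase and collapse whitespace before comparing names."""
    return " ".join(name.lower().split())


def find_match(name: str, point: Point, existing: List[PlaceRow],
               max_edits: int = 2, max_meters: float = 100.0) -> Optional[int]:
    """Return the id of an existing place that looks like a duplicate, if any."""
    # Stage 1: candidates whose names are within a few edits.
    candidates = [row for row in existing
                  if levenshtein(normalize(name), normalize(row[1])) <= max_edits]
    # Stage 2: keep only candidates that are also geographically close.
    for place_id, _, location in candidates:
        if haversine_m(point, location) <= max_meters:
            return place_id
    return None


# Made-up rows: same name in two cities, so only geometry can disambiguate.
existing = [
    (1, "Joe's Coffee", (40.7500, -73.9900)),
    (2, "Joes Coffee", (34.0500, -118.2500)),
]
match = find_match("Joes  coffee", (40.7501, -73.9901), existing)
```

Scanning every existing place is fine for a sketch; at vendor scale the candidate search would need a spatial or name index to avoid comparing all pairs.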
39. • What happens when users don’t share GPS?
– IP: NetAcuity, MaxMind, Neustar
– DIY?
• Blacklist
• Whitelist
• Requires polygons
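An IP fallback of the kind listed above amounts to mapping IP blocks to a place with a confidence score. The sketch below shows one way such a lookup could work using Python's standard `ipaddress` module. The `IpLocator` class, the confidence threshold, and the sample blocks (taken from the reserved documentation ranges) are hypothetical; this is not how NetAcuity or MaxMind expose their data, and it does not model the blacklist or whitelist steps.

```python
import ipaddress
from typing import List, Optional, Tuple


class IpLocator:
    """Hypothetical IP-block table: CIDR block -> (place_id, confidence)."""

    def __init__(self) -> None:
        self.blocks: List[Tuple[ipaddress._BaseNetwork, int, float]] = []

    def add_block(self, cidr: str, place_id: int, confidence: float) -> None:
        self.blocks.append((ipaddress.ip_network(cidr), place_id, confidence))

    def locate(self, ip: str,
               min_confidence: float = 0.5) -> Optional[Tuple[int, float]]:
        addr = ipaddress.ip_address(ip)
        matches = [block for block in self.blocks if addr in block[0]]
        if not matches:
            return None
        # Prefer the most specific block (longest prefix) covering the address.
        _, place_id, confidence = max(matches, key=lambda b: b[0].prefixlen)
        # Refuse to answer rather than return a low-confidence place.
        if confidence < min_confidence:
            return None
        return place_id, confidence


locator = IpLocator()
locator.add_block("192.0.2.0/24", 10, 0.9)     # made-up place ids and scores
locator.add_block("192.0.2.128/25", 20, 0.4)
hit = locator.locate("192.0.2.5")
```

Returning nothing for a low-confidence block is a deliberate choice in this sketch: for geo-targeted ads a wrong location is arguably worse than an unknown one.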
40. Mapping Uber’s Future: Uber Maps is Hiring
*https://newsroom.uber.com/mapping-ubers-future/
“
Over the past decade mapping innovation has
disrupted industries and changed daily life in ways I
couldn’t have imagined when I started. That
progress will only accelerate in the coming years
especially with technologies like self-driving cars. I
remain excited by the prospect of how maps can
put the world at our fingertips, improve everyday
life, impact billions of people and enable
innovations we can’t even imagine today.
”
--Brian McClendon, VP of Engineering, Uber
In the Geo stack, the data pipeline is the part that covers vendor delivery, Rockdove data migration, database schema design, the conflation pipeline (import, normalize, conflate), various data quality reports, coverage maps, and data export. It is the process that precedes the service, and the data it provides determines what the service can serve. The primary goal of the data pipeline is to deliver sane, high-quality data downstream.