ISCRAM 2013: A step towards real-time analysis of major disaster events based on tweets
1. A step towards real-time analysis of major disaster events based on tweets
M.Sc. André Dittrich, May 13, 2013
Institute of Photogrammetry and Remote Sensing, Karlsruhe Institute of Technology (KIT)
KIT – University of the State of Baden-Württemberg and National Research Center of the Helmholtz Association
www.kit.edu
8. Resources & Data
Resources
Twitter
Streaming API → access to 1% of the Firehose
Real-time → up to 60 tweets/sec
MongoDB
Document-oriented database technology
(Binary) JSON (JavaScript Object Notation) → tweet ≡ JSON!
Spatial indexing
10. Resources & Data
Resources
Data
Tweet
timestamp → WHEN?
GNSS coordinates → WHERE?
message text → WHAT?
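The three tweet fields above can be read straight from the JSON that the Streaming API delivers. The following is a minimal sketch, assuming the Twitter API v1.1 tweet layout (`created_at`, `coordinates` as a GeoJSON point, `text`); the function name and the sample tweet are hypothetical and not taken from the slides.

```python
import json
from datetime import datetime

# Timestamp layout of the `created_at` field in Twitter API v1.1 tweet objects.
CREATED_AT_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def extract_when_where_what(raw_line):
    """Return (timestamp, (lon, lat) or None, text) for one JSON-encoded tweet."""
    tweet = json.loads(raw_line)
    when = datetime.strptime(tweet["created_at"], CREATED_AT_FORMAT)
    # `coordinates` is a GeoJSON point with [longitude, latitude];
    # it is null for tweets that carry no exact position.
    point = tweet.get("coordinates")
    where = tuple(point["coordinates"]) if point else None
    what = tweet["text"]
    return when, where, what


# Hypothetical sample tweet, reduced to the three fields of interest.
sample = json.dumps({
    "created_at": "Mon Oct 29 18:05:12 +0000 2012",
    "coordinates": {"type": "Point", "coordinates": [-74.0, 40.7]},
    "text": "Subway stations closing ahead of the storm",
})
when, where, what = extract_when_where_what(sample)
print(when.isoformat(), where, what)
```

Because the parsed tweet is already a plain dictionary, it could be passed to a document store without a schema mapping, which is the point of "tweet ≡ JSON".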
14. Resources & Data
Resources
Data
Tweet
Geographical bounding box
             Longitude [°]  Latitude [°]
Lower left   -86            0
Upper right  -67            53
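A bounding box like this one can be applied as a simple coordinate test. The sketch below uses the corner values from the slide; the class and method names are hypothetical. The `as_locations_param` helper assumes the `locations` parameter format of Twitter's v1.1 streaming filter (longitude,latitude pairs with the southwest corner first); the slides do not say how the box was passed to the API.

```python
from typing import NamedTuple


class BoundingBox(NamedTuple):
    min_lon: float  # lower left longitude [deg]
    min_lat: float  # lower left latitude [deg]
    max_lon: float  # upper right longitude [deg]
    max_lat: float  # upper right latitude [deg]

    def contains(self, lon, lat):
        """True if the point lies inside the box (edges included)."""
        return (self.min_lon <= lon <= self.max_lon
                and self.min_lat <= lat <= self.max_lat)

    def as_locations_param(self):
        """Format as 'west,south,east,north' (assumed streaming filter format)."""
        return ",".join(f"{value:g}" for value in self)


# Corner values from the slide.
STUDY_AREA = BoundingBox(min_lon=-86.0, min_lat=0.0, max_lon=-67.0, max_lat=53.0)

print(STUDY_AREA.contains(-74.0, 40.7))   # a point near New York City
print(STUDY_AREA.contains(-118.2, 34.1))  # a point near Los Angeles
print(STUDY_AREA.as_locations_param())
```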
15. Resources & Data
Resources
Data
Tweet
Geographical bounding box
Reference data
Several 24-hour records
1 pm to 1 pm EST
approx. 6 GB each
Keyword-filtered tweets related to winter storm Sandy
→ October 29, 2012 – October 31, 2012
→ December 29, 2012 – December 30, 2012
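To compare days, each tweet has to be assigned to its 24-hour record and to an hour within that record. The following is a minimal sketch, assuming a fixed UTC-5 offset for EST and whole-hour bins; the function names and the bin width are assumptions, not details from the slides.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

EST = timezone(timedelta(hours=-5))  # assumption: fixed UTC-5, no daylight saving
RECORD_START_HOUR = 13               # records run from 1 pm to 1 pm EST


def record_position(timestamp):
    """Return (record start, whole hours elapsed since that start)."""
    local = timestamp.astimezone(EST)
    start = local.replace(hour=RECORD_START_HOUR, minute=0, second=0, microsecond=0)
    if local < start:
        # Before 1 pm the tweet still belongs to the record begun the day before.
        start -= timedelta(days=1)
    hours_elapsed = (local - start) // timedelta(hours=1)
    return start, hours_elapsed


def hourly_counts(timestamps):
    """Count tweets per (record start, hour of record) bin."""
    return Counter(record_position(ts) for ts in timestamps)


# Hypothetical timestamps around a record boundary.
tweets = [
    datetime(2012, 10, 29, 17, 59, 0, tzinfo=timezone.utc),  # 12:59 pm EST
    datetime(2012, 10, 29, 18, 5, 12, tzinfo=timezone.utc),  # 1:05 pm EST
    datetime(2012, 10, 29, 18, 40, 0, tzinfo=timezone.utc),  # 1:40 pm EST
]
for (start, hour), count in sorted(hourly_counts(tweets).items()):
    print(start.isoformat(), hour, count)
```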
17. Data Analysis
Figure: number of tweets over time of day [h]
19. Data Analysis
        a0      a1     b1       a2      b2      a3     b3      a4     b4     w     RMSE
Type 1  3463.6  928.1  -2062.3  998.1   -704.2  353.2  -205.5  30.2   -78.2  0.27  151.1
Type 2  3602.9  781.1  -1887.9  1251.1  239.5   -3.7   0.7     178.2  92.5   0.26  92.8
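The coefficient names a0, a1, b1 up to a4, b4 and w suggest a fourth-order Fourier series, f(x) = a0 + Σ [a_k cos(k w x) + b_k sin(k w x)] for k = 1..4. That form is an assumption: the slide does not state the model equation, what Type 1 and Type 2 denote, or where the time axis starts. Under that assumption, and taking x in hours, the reference curve can be evaluated as sketched below; the function names are hypothetical.

```python
import math

# Coefficients of the two fitted curves, taken from the table on the slide.
TYPE_1 = {
    "a0": 3463.6,
    "a": [928.1, 998.1, 353.2, 30.2],
    "b": [-2062.3, -704.2, -205.5, -78.2],
    "w": 0.27,
    "rmse": 151.1,
}
TYPE_2 = {
    "a0": 3602.9,
    "a": [781.1, 1251.1, -3.7, 178.2],
    "b": [-1887.9, 239.5, 0.7, 92.5],
    "w": 0.26,
    "rmse": 92.8,
}


def fourier_model(x, params):
    """Evaluate the assumed fourth-order Fourier series at time x [h]."""
    value = params["a0"]
    for k, (a_k, b_k) in enumerate(zip(params["a"], params["b"]), start=1):
        angle = k * params["w"] * x
        value += a_k * math.cos(angle) + b_k * math.sin(angle)
    return value


def residual_in_rmse_units(observed, x, params):
    """Difference between an observed count and the model, scaled by the fit RMSE."""
    return (observed - fourier_model(x, params)) / params["rmse"]


for hour in (0, 6, 12, 18):
    print(hour, round(fourier_model(hour, TYPE_1), 1), round(fourier_model(hour, TYPE_2), 1))
```

With w between 0.26 and 0.27, the period 2π/w is roughly 23 to 24 hours, which is consistent with a daily cycle. Scaling a residual by the RMSE is one plausible way to judge how unusual an observed count is; the slides do not specify the criterion actually used.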
20. Event Data – New Year’s Eve
Figure: number of tweets over time of day [h]
21. Keyword Data
Keywords: shelter | winterstorm | weather AND sandy | subway AND flood | sandy AND victims | snowfall | power outages
Winter storm “Sandy”: 10/29/2012 1 pm to 10/30/2012 1 pm
Reference day: 12/29/2012 1 pm to 12/30/2012 1 pm
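The keyword expression on this slide reads as an OR over terms, some of which are AND pairs. A minimal sketch of such a filter follows; case-insensitive whole-word matching and the function name are assumptions, since the slides do not describe how matching was done.

```python
import re

# Each inner tuple is an AND group; the groups are OR-ed together.
KEYWORD_GROUPS = [
    ("shelter",),
    ("winterstorm",),
    ("weather", "sandy"),
    ("subway", "flood"),
    ("sandy", "victims"),
    ("snowfall",),
    ("power outages",),
]


def matches_keywords(text, groups=KEYWORD_GROUPS):
    """True if every term of at least one group occurs in the text."""
    # Lowercase, strip punctuation, and pad with spaces so that terms
    # (including the two-word phrase) only match on word boundaries.
    words = re.findall(r"\w+", text.lower())
    padded = " " + " ".join(words) + " "
    return any(all(f" {term} " in padded for term in group) for group in groups)


print(matches_keywords("Power outages across Manhattan!"))
print(matches_keywords("The subway might flood tonight"))
print(matches_keywords("Sandy beach day"))
```

Whole-word matching means that "flooded" does not satisfy "flood"; a stemming step would change that, at the cost of more false positives.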
23. Outlook
Collection of further data
→ more reliable reference model and robust statistics
Grid-based approach
→ faster and more accurate localization
Test with different NLP APIs
→ robust event classification
Exploitation of further resources
→ e.g. Ushahidi
25. Most Stable Time Interval
26. Social Event
Figure: number of tweets over time of day [h]
27. Social Event – Super Bowl 2013
Figure: number of tweets over time of day [h]
31. Social Event – Super Bowl 2013
Figure: number of tweets over time of day [h]
Named entity       References [%]
Beyoncé            27.51
Destiny’s Child    10.26
show               4.08
halftime           3.06