Talk given to the Smart City course students at CEPT University. Oct 19, 2014.
* Overview of Physical (IoT/Sensor), Cyber (OpenGov), and Social (Citizen Sensing) data
* Relevance to City Departments
* Three smart city applications (from India, Europe and US)
More on the course: http://indianexpress.com/article/india/india-others/cept-launches-first-ever-course-on-smart-cities/
2. Kno.e.sis in 2013 = ~100 researchers (15 faculty, ~50 PhD students)
Amit Sheth’s PhD students: Ashutosh Jadhav, Hemant Purohit, Vinh Nguyen, Lu Chen, Pramod Anantharam, Sujan Perera, Alan Smith, Maryam Panahiazar, Sarasi Lalithsena, Cory Henson, Kalpa Gunaratna, Delroy Cameron, Sanjaya Wijeratne, Wenbo Wang, Pavan Kapanipathi, Shreyansh Bhatt
Acknowledgements: Kno.e.sis team; Funds: NSF, NIH, AFRL, Industry…
3. • Among the top universities in the world in World Wide Web (cf. 10-yr impact, Microsoft Academic Search: among top 10 in June 2014)
• Among the largest academic groups in the US in Semantic Web + Social/Sensor Webs, Mobile/Cloud/Cognitive Computing, Big Data, IoT, Health/Clinical & Biomedicine Applications
• Exceptional student success: internships and jobs at top salary (IBM Watson/Research, MSR, Amazon, Cisco, Oracle, Yahoo!, Samsung, research universities, NLM, startups)
• 100 researchers including 15 world-class faculty (>3K citations/faculty avg) and ~45 PhD students, practically all funded
• Extensive research for largely multidisciplinary projects; world-class resources; industry sponsorships/collaborations (Google, IBM, …)
5. • Social Media Big Data – Twitris, eDrugTrends
• Sensor/IoT Big Data – CityPulse, kHealth
• Healthcare Big Data – kHealth, EMR, Prediction
• Biomedical Big Data – Biomarker from NextGen Sequencing and Proteomics, SCOONER
• Big and Smart Data Certificate
Kno.e.sis private cloud: 864 CPU cores, 18TB RAM, 17TB SSD, 435TB disk
6. Smart Cities and Back to the future
Thanks to Dr. Payam Barnaghi for sharing the slide
7. Future cities: a view from 1998
Source: LA Times, http://documents.latimes.com/la-2013/
Thanks to Dr. Payam Barnaghi for sharing the slide
8. Image courtesy: Avatar wiki
Thanks to Dr. Payam Barnaghi for sharing the slide
9. Thanks to Dr. Payam Barnaghi for sharing the slide
10. Imperatives
Why?
Improved economic, social, and human development in an era of increased urbanization
What?
All aspects of the economy: Agriculture + Manufacturing + Services + Knowledge
How?
• Next
11. Enablers of Economic Development
Civilizations on river banks
Economic development on trade routes
Economic development now increasingly relies on digital infrastructure
Image credit: http://www.rcet.org/twd/students/socialstudies/ss_extensions_1intro.html
Image credit: http://www.shutterstock.com/pic-157118819/stock-vector-conceptual-tag-cloud-containing-words-related-to-smart-city-digital-city-infrastructure-ict.html
12. General Economic Trends
Over 340 million people lived in Indian cities in 2008, and that number is expected to grow to 590 million by 2030, leading to rapid urbanization¹
We are increasingly moving from Agriculture → Industry → Services
The next growth should be toward the Knowledge Economy
¹ http://www.mckinsey.com/insights/urbanization/urban_awakening_in_india
13. One aspect of characterizing a City: All its functions
Image credit: http://www.ibm.com/smarterplanet/us/en/smarter_cities/overview/index.html
15. Five Key Elements of Smart City*
Utility Services
Transportation Services
Social Infrastructure
Safety & Health Services
Recycling Services
* By Indian Urban Development Ministry
16. Unprecedented Digital Data Growth
• Everything is becoming data-driven
• Many types of data: Physical, Cyber, and Social
• Effective collection and use of this Big Data has to be a core part of designing Smart Cities
http://www.tribalcafe.co.uk/big-data-infographic/
17. Understanding the wealth of data
• Increased citizen participation (Social)
• Increased monitoring using sensors (Physical)
• Increased Digital Government (eGov) data (Cyber)
Let’s not develop future applications with constraints of the past
India ranks 8th in civic engagement!
http://www.informationweek.com/government/leadership/digital-civic-engagement-us-lags/d/d-id/1113938
18. What do we need for developing Smart City Applications?
Physical
Cyber
Social*
Developers need to consider observations from Physical-Cyber-Social systems in building Smart City applications
Amit Sheth, Pramod Anantharam, Cory Henson, 'Physical-Cyber-Social Computing: An Early 21st Century Approach,' IEEE Intelligent Systems, vol. 28, no. 1, pp. 78-82, Jan.-Feb. 2013. http://doi.ieeecomputersociety.org/10.1109/MIS.2013.20
http://wiki.knoesis.org/index.php/PCS
*http://www.ichangemycity.com/
19. Physical: Sensors monitoring the physical world
- Programmable devices
- Off-the-shelf gadgets/tools
Thanks to Dr. Payam Barnaghi for sharing the slide
20. Cyber: Observations pushed to the cyber world
Thanks to Dr. Payam Barnaghi for sharing the slide
21. Social: People interacting with the physical world
[Figure labels: ECG sensor; Motion sensor; World Wide Web; "Road block, A3"]
Thanks to Dr. Payam Barnaghi for sharing the slide
22. Scope of this talk
• Smart City application in Indian Context
• Smart City Use Cases in Developed World
– Smart City application in European Context
– Smart City application in US context
23. • Smart City application in Indian Context
• Smart City Use Cases in Developed World
– Smart City application in European Context
– Smart City application in US context
24. Dynamic Schedule Update of Public Transport Vehicles in a City Lacking Traffic Instrumentation*
Pramod Anantharam
Joint work with Biplav Srivastava and Raj Gupta, IBM IRL
Aug 31, 2012
*Work done as part of internship at IBM Research
25. Motivation
By 2001, over 285 million Indians lived in cities, more than in all North American cities combined (Office of the Registrar General of India 2001)¹
Texas Transportation Institute (TTI) congestion report in U.S.
Modes of transportation in Indian cities
¹ The Crisis of Public Transport in India
² IBM Smarter Traffic
26. Motivation: Why SMS for Events?
• Prevalence
– In India, 11 cities provide notifications to citizens using SMS
– SMS-based alerts are common for business transactions
– Low-cost phones constitute 95% of all phones (~930 million mobile connections in India²)
• Social media (Facebook, Twitter) and SMS
– Commuters prefer dynamic updates such as SMS versus any other form of traffic updates¹
¹ Caulfield et al. Factors Which Influence the Preferences for real-time Public Transport Information, Association of European Transport and contributors 2007
² http://en.wikipedia.org/wiki/Communications_in_India
27. Problem
• Input
– Traffic-related text alerts, domain knowledge, public transport routes, and historical data
• Output
– Events in desired form
– Impact of events on public transport routes (e.g. probability of delay given location + event)
• Challenges
– No instrumentation (sensors), leading to sparse and imprecise information; event extraction from free text
28. Solution Components
As events are reported to MDU (Multi-modal Dynamic Update):
• Traffic event detection from SMS alerts – event <Type, Time (Reported, Published), Location (From, To, On), Description>
• Reasoning over traffic events for delay assessment
– Find stops in the region affected by event (Qualitative)
– Estimate delay at stops (Quantitative)
• Consider time of day and history of such events
• Have an attenuation function based on event types
– Propagate delay estimates to neighboring stops
• Account for time, schedule and direction of travel
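The event tuple and the attenuation step above can be sketched as follows. This is a minimal illustration, not the MDU implementation: the `TrafficEvent` field names mirror the tuple on this slide, while the exponential form of the attenuation, the per-type decay rates in `DECAY_PER_HOP`, and the 10-minute base delay are assumptions made up for the example.

```python
from dataclasses import dataclass
from datetime import datetime
import math


@dataclass
class TrafficEvent:
    """Event tuple from the slide: <Type, Time, Location (From, To, On), Description>."""
    event_type: str
    reported: datetime
    start_loc: str
    end_loc: str
    on_loc: str
    description: str


# Hypothetical decay rates per event type (larger = impact fades faster).
DECAY_PER_HOP = {"BreakDown": 0.5, "Accident": 0.3}


def attenuated_delay(event: TrafficEvent, base_delay_min: float, hops: int) -> float:
    """Estimated delay in minutes at a stop `hops` stops away from the event."""
    rate = DECAY_PER_HOP.get(event.event_type, 1.0)
    return base_delay_min * math.exp(-rate * hops)


event = TrafficEvent(
    "BreakDown", datetime(2012, 6, 21, 10, 15),
    "Sanjay Point", "Vasant Vihar", "Signal Enclave", "break down of an HTV",
)
# Delay estimates at the affected stop and at the next two stops.
delays = [attenuated_delay(event, 10.0, hops) for hops in range(3)]
```

The estimate is largest at the affected stop and shrinks with each hop; a real attenuation function would also account for time of day, schedule, and direction of travel, as the bullets note.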
29. eventtype = BreakDown
eventdescription = “Traffic movement is slow from Sanjay
point towards Vasant Vihar due to break down of an HTV in
front of Signal Enclave.msg@10.15am,210612.”
eventstartloc = Sanjay Point
eventendloc = Vasant Vihar
eventonloc = Signal Enclave
eventtime = June 21, 2012, 10:15am
[Figure: map of bus stops near the event. Stop labels: c.p.w.d.cly. vasant vihar; paschim marg vasant vihar; vasant vihar(t); vasant vihar depot.; vasant vihar model school; Signal Enclave; Vasant Vihar]
Illustration from New Delhi (India)
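The extraction shown on this slide can be approximated with a regular expression. The sketch below is only an assumption about how such a message could be parsed: the pattern `ALERT_RE` is written for this one message shape, and the keyword table `EVENT_TYPES` is a hypothetical stand-in for the domain knowledge used to categorize events.

```python
import re
from datetime import datetime

# Pattern written for this one message shape; real alerts vary widely.
ALERT_RE = re.compile(
    r"from (?P<start>.+?) towards (?P<end>.+?) due to (?P<event>.+?) "
    r"in front of (?P<on>.+?)\.msg@(?P<hh>\d{1,2})\.(?P<mi>\d{2})(?P<ampm>am|pm),"
    r"(?P<dd>\d{2})(?P<mo>\d{2})(?P<yy>\d{2})",
    re.IGNORECASE,
)

# Hypothetical keyword-to-type table standing in for domain knowledge.
EVENT_TYPES = {"break down": "BreakDown", "accident": "Accident"}


def parse_alert(text):
    """Return a dict of event fields, or None if the text does not match."""
    m = ALERT_RE.search(text)
    if m is None:
        return None
    # Convert the 12-hour clock to 24-hour (12am -> 0, 12pm -> 12).
    hour = int(m["hh"]) % 12 + (12 if m["ampm"].lower() == "pm" else 0)
    when = datetime(2000 + int(m["yy"]), int(m["mo"]), int(m["dd"]), hour, int(m["mi"]))
    description = m["event"]
    event_type = next(
        (t for key, t in EVENT_TYPES.items() if key in description.lower()),
        "Unknown",
    )
    return {
        "event_type": event_type,
        "description": description,
        "start_loc": m["start"],
        "end_loc": m["end"],
        "on_loc": m["on"],
        "time": when,
    }


msg = ("Traffic movement is slow from Sanjay point towards Vasant Vihar due to "
       "break down of an HTV in front of Signal Enclave.msg@10.15am,210612.")
ev = parse_alert(msg)
```

For the message on this slide, the parsed fields match the listed values: the start, end, and on locations, the `BreakDown` type, and June 21, 2012, 10:15am.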
30. Evaluation: Event Extraction
• Run for ~50 messages in Delhi
• Accurate extraction of location (from, to) and type
Sample
31. Bayesian Model: Impact of Events on Delay
The probability of having a delay at a stop, Si, given events observed at the stop, is given by [equation not preserved in this transcript]
32. Impact (Delay) Propagation Across Stops
Vehicle moves from S1 towards S4
Actual steps are by the loopy belief propagation algorithm
Koller, D., Friedman, N.: Probabilistic Graphical Models: Principles and Techniques. MIT Press (2009)
Assumption: delay at a node is directly influenced by delay at the next node.
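Under the stated assumption (delay at a stop depends only on delay at the next stop), the stops form a chain, and on a chain the marginals can be computed exactly with one backward pass. The sketch below shows only that simplified case, not loopy belief propagation itself; the function name and all probabilities are hypothetical values chosen for illustration.

```python
def propagate_delay(p_last, p_given_next_delayed, p_given_next_clear, n_stops):
    """Return [P(delay at S1), ..., P(delay at Sn)] for a chain of stops.

    p_last: P(delay) at the last stop, where the event was observed.
    p_given_next_delayed: P(delay at Si | delay at Si+1).
    p_given_next_clear: P(delay at Si | no delay at Si+1).
    """
    probs = [p_last]
    for _ in range(n_stops - 1):
        p_next = probs[-1]
        # Marginalize over the two states of the next stop.
        p_here = p_given_next_delayed * p_next + p_given_next_clear * (1.0 - p_next)
        probs.append(p_here)
    probs.reverse()  # computed from Sn back to S1
    return probs


# Event observed at S4; the vehicle travels S1 -> S2 -> S3 -> S4.
chain = propagate_delay(
    p_last=0.8, p_given_next_delayed=0.7, p_given_next_clear=0.1, n_stops=4
)
```

With these made-up numbers the delay probability decreases with distance from S4, which is the qualitative behaviour the propagation step is meant to capture.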
34. 1-Slide Summary: Multi-Mode Commuting Recommender in Delhi and Bangalore
Highlights
• Published data of multiple authorities used; repeatable process
• Multiple modes searched
• Preference over modes, time, hops, and number of choices supported; more extensions, like fare, possible
• Integration of results with map as future work; already done as part of other projects, viz. SCRIBE-STAT
IRL – Transit in March 2012 (First Version)
35. IRL – Transit in Aug 2012
Key Points
• SMS message from city
• Event and location identified
• Impact assessed
• Impact used in search
36. Matching Stop Names to OSM Location
• 3931 multi-modal stops in Delhi
• Matching algorithm involves chunking of stop names, distance metrics, and voting to select the best match.
• Matches categorized as confident (>50% of techniques resulted in the match), possible (=50%), and uncertain (<50%).
• 1496 stops (confident matches) mapped to OSM locations.
OSM names
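The voting scheme described on this slide can be sketched as below. The slide does not say which distance metrics were used, so the four techniques here (two `difflib` ratios, token overlap on chunks, and character trigram overlap) are assumptions chosen for illustration; only the confident/possible/uncertain thresholds come from the slide.

```python
import re
from collections import Counter
from difflib import SequenceMatcher


def chunks(name):
    """Lowercase a name and split it into alphanumeric chunks."""
    return re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()


def seq_ratio(a, b):
    return SequenceMatcher(None, " ".join(chunks(a)), " ".join(chunks(b))).ratio()


def token_jaccard(a, b):
    ta, tb = set(chunks(a)), set(chunks(b))
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0


def compact_ratio(a, b):
    # Ignore spacing differences such as "vasantvihar" vs "vasant vihar".
    return SequenceMatcher(None, "".join(chunks(a)), "".join(chunks(b))).ratio()


def trigram_jaccard(a, b):
    def grams(s):
        s = "".join(chunks(s))
        return {s[i:i + 3] for i in range(len(s) - 2)}
    ga, gb = grams(a), grams(b)
    return len(ga & gb) / len(ga | gb) if ga | gb else 0.0


TECHNIQUES = [seq_ratio, token_jaccard, compact_ratio, trigram_jaccard]


def match_stop(stop, osm_names):
    """Each technique votes for its best candidate; label by share of votes."""
    votes = Counter()
    for score in TECHNIQUES:
        best = max(osm_names, key=lambda cand: score(stop, cand))
        if score(stop, best) > 0:
            votes[best] += 1
    if not votes:
        return None, "uncertain"
    winner, n = votes.most_common(1)[0]
    share = n / len(TECHNIQUES)
    label = "confident" if share > 0.5 else "possible" if share == 0.5 else "uncertain"
    return winner, label


result = match_stop(
    "vasantvihar depot",
    ["Vasant Vihar Model School", "Vasant Vihar Depot", "Signal Enclave"],
)
```

Using an even number of techniques makes the "possible (=50%)" category reachable; with an odd number it could never occur.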
37. Event Information
Text message with metadata: “Traffic movement is slow from Sanjay point towards VasantVihar due to break down of an HTV in front of Signal Enclave.msg@10.15am,210612.”
Flowchart steps: Fetch text message; Is event present? (Yes/No); Extract Event Information (using domain knowledge of categorization of events); Extract Locations StartLoc, EndLoc, and OnLoc; Is location present? (Yes/No); Extract GIS location using Open Street Maps; Assess impact
Other diagram labels: textual observation; Observations parameterize; Background knowledge; GIS Location
Extracted event:
StartLoc = Sanjay Point
EndLoc = VasantVihar
OnLoc = Signal Enclave
Event = break down of an HTV
Event Type = BreakDown
Lat-lon = 28.5561025,77.1645187
IRL-Transit routes: two listings of the same stops (labeled "unordered" and "ordered" on the slide)
STOPID STOPNAME
321 c.p.w.d. cly. vasantvihar
369 vasantvihar (t)
814 vasantvihar model school
956 vasantviharcpwdcly.
957 paschimmargvasantvihar
1274 vasantvihar depot
STOPID STOPNAME
957 paschimmargvasantvihar
814 vasantvihar model school
321 c.p.w.d. cly. vasantvihar
956 vasantviharcpwdcly.
1274 vasantvihar depot
369 vasantvihar (t)
38. Evaluation: Reasoning over traffic events
• Traffic alerts collected for 10 cities in India for two years.
• Prior probability of events computed using these alerts.
• Probability of having a delay given an event type at ten locations in Delhi is summarized:
39. Number of SMS messages for bus stops in Delhi for 2 years (Aug 2010 – Aug 2012)
• 344 stops with updates
• 3931 total stops
40. • Smart City application in India Context
• Smart City Use Cases in Developed World
– Smart City application in European Context
– Smart City application in US context
41. CityPulse Consortium
Partners:
Industrial: SIE, ERIC
SME: AI
Higher Education: UNIS, NUIG, UASO, WSU
City: BR, AA
Duration: 36 months
49. Public parking space availability prediction
• Finding a parking space in a city can be challenging
• Predict the probability of finding parking given input variables such as scheduled events, time of day, and location
• Reduced emissions and frustration for citizens
http://www.ict-citypulse.eu/scenarios/scenarios
50. • Smart City application in India Context
• Smart City Use Cases in Developed World
– Smart City application in European Context
– Smart City application in US context
51. Extracting City Events from Social Streams
Toward a Citizen Centered Smart City
Pramod Anantharam¹
¹ Kno.e.sis – Ohio Center of Excellence in Knowledge-enabled Computing, Wright State University, Dayton, Ohio, USA
http://www.ict-citypulse.eu/page/
Mentor/Supervisor: Dr. Payam Barnaghi
52. Pulse of a City (CityPulse)
Public Safety; Urban planning; Gov. & agency admin.; Energy & water; Environmental; Transportation; Social Programs; Healthcare; Education
Image credit: http://www.ibm.com/smarterplanet/us/en/smarter_cities/overview/index.html
53. Research Questions
• Are people talking about city infrastructure on Twitter?
• Can we extract city infrastructure related events from Twitter?
• How can we leverage event and location knowledge bases for event extraction?
• How well can we extract city events?
55. Some Challenges in Extracting Events from Tweets
• No well-accepted definition of ‘events related to a city’
• Tweets are short (140 characters), and their informal nature makes them hard to analyze
– Entity, location, time, and type of the event
• Multiple reports of the same event and sparse reports of some events (biased sample)
– Numbers don’t necessarily indicate intensity
• Validation of the solution is hard due to the open-domain nature of the problem
56. Related Work on Event Extraction
[Figure: related work arranged by domain (Open Domain vs. Closed Domain) and text type (Formal Text vs. Informal Text)]
Open Domain: [Kumaran and Allan 2004] [Roitman et al. 2012] [Ritter et al. 2012] [Wang et al. 2012]
Closed Domain: [Lampos and Cristianini 2012] [Becker et al. 2011]
57. City Event Extraction Solution Architecture
Input: Tweets from a city
Stages: City Event Annotation; City Event Extraction
Components: POS Tagging; Hybrid NER + Event term extraction; Geohashing; Temporal Estimation; Impact Assessment; Event Aggregation
Knowledge sources: OSM Locations; SCRIBE ontology; 511.org hierarchy (City Infrastructure)
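The geohashing and event aggregation components can be illustrated with a small sketch. The `geohash` function below is the standard public geohash encoding; how the actual system aggregates events is not described on the slide, so grouping reports by (cell, event type) and the sample coordinates are assumptions for illustration.

```python
from collections import Counter

BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash(lat, lon, precision=6):
    """Encode a latitude/longitude pair as a standard geohash string."""
    lat_lo, lat_hi, lon_lo, lon_hi = -90.0, 90.0, -180.0, 180.0
    chars, bits, value, use_lon = [], 0, 0, True
    while len(chars) < precision:
        if use_lon:  # bits alternate, starting with longitude
            mid = (lon_lo + lon_hi) / 2
            bit = lon >= mid
            lon_lo, lon_hi = (mid, lon_hi) if bit else (lon_lo, mid)
        else:
            mid = (lat_lo + lat_hi) / 2
            bit = lat >= mid
            lat_lo, lat_hi = (mid, lat_hi) if bit else (lat_lo, mid)
        value = value * 2 + int(bit)
        use_lon = not use_lon
        bits += 1
        if bits == 5:  # every 5 bits become one base-32 character
            chars.append(BASE32[value])
            bits, value = 0, 0
    return "".join(chars)


def aggregate(reports, precision=6):
    """Count reports per (geohash cell, event type)."""
    return Counter(
        (geohash(lat, lon, precision), etype) for lat, lon, etype in reports
    )


# Hypothetical geotagged event mentions: (lat, lon, event type).
reports = [
    (37.7749, -122.4194, "accident"),
    (37.7750, -122.4195, "accident"),
    (37.8044, -122.2712, "concert"),
]
counts = aggregate(reports)
```

Nearby reports of the same type fall into the same cell and are counted together, which is one simple way to merge multiple tweets about the same event.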
58. Evaluation
• City Event Annotation
– Automated creation of training data
– Annotation task (our CRF model vs. baseline CRF model)
• City Event Extraction
– Use aggregation algorithm for event extraction
– Extracted events AND ground truth
• Dataset (Aug – Nov 2013): ~8 GB of data on disk
– Over 8 million tweets
– Over 162 million sensor data points
– 311 active events and 170 scheduled events
59. Ground Truth Data (only incident reports) -- City Event Extraction
We have around 162 million data records from sensors monitoring over 3,700 links in the San Francisco Bay Area
A data record: <link_id, link_speed, link_volume, link_travel_time, time_stamp>
GREEN – Active Events
YELLOW – Scheduled Events
311 active events and 170 scheduled events
61. Traffic Analytics using Probabilistic Graphical Models Enhanced with Knowledge Bases
Pramod Anantharam, T. K. Prasad, Amit Sheth
Ohio Center of Excellence in Knowledge-enabled Computing (Kno.e.sis)
Wright State University, Dayton, Ohio
2nd International Workshop on Analytics for Cyber-Physical Systems (ACS-2013)
62. Slow moving traffic
[Figure labels: Link Description; Scheduled Event; Schedule Information; source: 511.org]
63. Uncertainty in the Real World
• Observation: Slow-moving traffic
• Multiple causes (uncertain about the cause):
– Scheduled events: music events, fairs, theatre events, concerts, road work, repairs, etc.
– Active events: accidents, disabled vehicles, breakdown of roads/bridges, fire, bad weather, etc.
– Peak hour: e.g. 7 am – 9 am OR 4 pm – 6 pm
• Each of these events may have a varying impact on traffic
64. Why Probabilistic Graphical Models?
“As far as the laws of mathematics refer to reality, they are not certain, as far as they are certain, they do not refer to reality”
-- Albert Einstein, 1921.
“Graphical models are a marriage between probability theory and graph theory. They provide a natural tool for dealing with two problems that occur throughout applied mathematics and engineering -- uncertainty and complexity …”
-- Michael Jordan, UC Berkeley, 1998.
65. Graphical Models – Bayesian Network Example
A graphical model has structure (nodes and edges) and parameters; CPD – continuous variables, CPT – discrete variables
Random variables (nodes): Cold, IcyRoad, PoorVisibility, SlowTraffic
Edge between random variables, which is indicative of conditional dependence
Conditional Probability Tables (CPT):

Cold
T   0.33
F   0.67

IcyRoad   Cold=T   Cold=F
T         0.75     0.05
F         0.25     0.95

PoorVisibility   Cold=T   Cold=F
T                0.85     0.40
F                0.15     0.60

SlowTraffic   IcyRoad=T   IcyRoad=F   PoorVisibility=T   PoorVisibility=F
T             0.85        0.4         0.9                0.2
F             0.15        0.6         0.1                0.8
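As a worked illustration of how the tables in this example are used, the sketch below computes two marginals and one posterior by enumeration, using only the Cold, IcyRoad, and PoorVisibility tables from the slide. The variable and function names are chosen for this sketch; the SlowTraffic table is left out because its column layout is ambiguous in the transcript.

```python
# Numbers copied from the slide's tables.
P_COLD = {True: 0.33, False: 0.67}
P_ICY_GIVEN_COLD = {True: 0.75, False: 0.05}   # P(IcyRoad=T | Cold)
P_VIS_GIVEN_COLD = {True: 0.85, False: 0.40}   # P(PoorVisibility=T | Cold)


def marginal_true(p_child_given_cold):
    """P(child=T), summing over both states of the parent Cold."""
    return sum(P_COLD[c] * p_child_given_cold[c] for c in (True, False))


def posterior_cold(p_child_given_cold):
    """P(Cold=T | child=T) by Bayes' rule."""
    return P_COLD[True] * p_child_given_cold[True] / marginal_true(p_child_given_cold)


p_icy = marginal_true(P_ICY_GIVEN_COLD)
p_vis = marginal_true(P_VIS_GIVEN_COLD)
p_cold_given_icy = posterior_cold(P_ICY_GIVEN_COLD)
```

Observing an icy road raises the probability of cold weather well above its prior of 0.33, which is the kind of evidence-driven update a Bayesian network supports.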
66. How do we get nodes and edges?
Domain Experts
Declarative domain knowledge
ColdWeather
PoorVisibility
SlowTraffic
Variables and
relationships
IcyRoad
Causal
knowledge
Linked Open Data
Domain Observations
ColdWeather (YES/NO)   IcyRoad (ON/OFF)   PoorVisibility (YES/NO)   SlowTraffic (YES/NO)
1                      0                  1                         0
1                      1                  1                         1
1                      1                  1                         0
1                      0                  1                         1
Domain Knowledge
Structure and parameters
WinterSeason
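Given a fixed structure, the parameters can be estimated from domain observations by counting. The sketch below applies that idea to the four observation rows on this slide; the function name, the choice of IcyRoad as the only parent of SlowTraffic, and the optional smoothing parameter are assumptions for illustration.

```python
from collections import defaultdict

COLUMNS = ("ColdWeather", "IcyRoad", "PoorVisibility", "SlowTraffic")
# Observation rows copied from the slide.
ROWS = [
    (1, 0, 1, 0),
    (1, 1, 1, 1),
    (1, 1, 1, 0),
    (1, 0, 1, 1),
]


def estimate_cpt(child, parents, alpha=0.0):
    """Estimate P(child=1 | parents) by counting, with add-alpha smoothing."""
    child_idx = COLUMNS.index(child)
    parent_idx = [COLUMNS.index(p) for p in parents]
    ones, totals = defaultdict(float), defaultdict(float)
    for row in ROWS:
        key = tuple(row[i] for i in parent_idx)
        totals[key] += 1
        ones[key] += row[child_idx]
    return {k: (ones[k] + alpha) / (totals[k] + 2 * alpha) for k in totals}


# One entry per observed parent configuration.
cpt = estimate_cpt("SlowTraffic", ["IcyRoad"])
```

With so few rows the estimates are unreliable, which is part of the motivation for bringing in declarative domain knowledge alongside observations.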
67. Domain Knowledge
• Declarative knowledge about various domains is increasingly being published on the web¹,².
• Declarative knowledge describes concepts and relationships in a domain (structure).
• Linked Open Data may be used to derive prior probabilities of events (parameters).
• In this work, we focus only on the use of declarative knowledge for structure, using ConceptNet 5.
¹ http://conceptnet5.media.mit.edu/
² http://linkeddata.org/
68. ConceptNet 5
http://conceptnet5.media.mit.edu/web/c/en/traffic_jam
[Figure: ConceptNet 5 subgraph around "traffic jam". Concepts: go to baseball game, go to concert, traffic accident, car crash, accident, road ice, bad weather, traffic jam, slow traffic, occur twice each day. Relations: Causes, CapableOf, HasSubevent, RelatedTo. Concepts are mapped via is_a to the classes ScheduledEvent, ActiveEvent, BadWeather, TimeOfDay, and Delay.]
69. Key Idea
• Probabilistic Graphical Models (PGM) use statistical approaches to uncover correlations.
• Declarative knowledge curated by humans provides richer relationships, including causal knowledge.
• Goal: Utilize declarative knowledge with PGM structure learning algorithms to build richer (quality and coverage) models.
70. Complementing graphical model structure extraction
Traffic data from sensors deployed on road network in San Francisco Bay Area
Knowledge from ConceptNet5:
• traffic jam CapableOf occur twice each day
• traffic jam CapableOf slow traffic
• bad weather CapableOf slow traffic
• go to baseball game Causes traffic jam
Steps:
• Add missing random variables
• Add missing links
• Add link direction
[Figure labels: Scheduled Event; Link Description; baseball game; traffic jam; slow traffic; time of day; bad weather]
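The three steps on this slide (add missing variables, add missing links, add link direction) can be sketched with the four triples it lists. This is an illustration under assumptions, not the method from the paper: the concept-to-variable mapping, the starting skeleton, and the rule that only `Causes` fixes a direction are all hypothetical, and no ConceptNet API call is made.

```python
# Triples copied from the slide.
TRIPLES = [
    ("traffic jam", "CapableOf", "occur twice each day"),
    ("traffic jam", "CapableOf", "slow traffic"),
    ("bad weather", "CapableOf", "slow traffic"),
    ("go to baseball game", "Causes", "traffic jam"),
]

# Hypothetical mapping from ConceptNet concepts to model variables.
CONCEPT_TO_VAR = {
    "traffic jam": "TrafficJam",
    "slow traffic": "SlowTraffic",
    "bad weather": "BadWeather",
    "go to baseball game": "BaseballGame",
    "occur twice each day": "TimeOfDay",
}

# Assumption: only causal relations are trusted to orient an edge.
DIRECTED_RELATIONS = {"Causes"}


def complement(variables, undirected_edges):
    """Extend a learned skeleton with variables, links, and directions."""
    variables = set(variables)
    skeleton = {frozenset(edge) for edge in undirected_edges}
    directed = set()
    for subj, rel, obj in TRIPLES:
        a, b = CONCEPT_TO_VAR[subj], CONCEPT_TO_VAR[obj]
        variables |= {a, b}                      # add missing random variables
        if rel in DIRECTED_RELATIONS:
            skeleton.discard(frozenset((a, b)))  # now represented as directed
            directed.add((a, b))                 # add link direction
        else:
            skeleton.add(frozenset((a, b)))      # add missing link
    return variables, skeleton, directed


# Hypothetical structure learned from sensor data alone.
variables, skeleton, directed = complement(
    {"BaseballGame", "TrafficJam", "SlowTraffic"},
    [("BaseballGame", "TrafficJam"), ("TrafficJam", "SlowTraffic")],
)
```

Starting from three variables and two undirected links, the knowledge adds BadWeather and TimeOfDay, links them into the structure, and orients the BaseballGame to TrafficJam edge.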
71. Smart Cities: Opportunities
• empower citizens
• provide more business opportunities for companies (and SMEs) and private sector services
• create better governance of our cities and better public services
• provide smarter monitoring and control
• improve energy efficiency, create greener environments…
• create better healthcare, elderly-care…
Thanks to Dr. Payam Barnaghi for sharing the slide
72. Smart Cities: Challenges
• Adherence to open data standards by all the city authorities
• Sufficient guidance and support for city authorities in managing their data
• Reliability and quality of citizen reporting of city events
• Privacy and security issues in event reporting