PeGOV 2014 – 2nd Workshop on Personalization in eGovernment Services and Applications 
11 July 2014, Aalborg, Denmark 
TweetAlert: 
Semantic Analytics in Social Networks 
for Citizen Opinion Mining 
in the City of the Future 
Julio Villena-Román1,2, 
Adrián Luna-Cobos1,3, José Carlos González-Cristóbal3,1 
1 DAEDALUS - Data, Decisions and Language, S.A. 
2 Universidad Carlos III de Madrid 
3 Universidad Politécnica de Madrid 
jvillena@daedalus.es, aluna@daedalus.es, josecarlos.gonzalez@upm.es
Agenda
• Framework
• Citizen Sensor
• System
• Business cases
• Future work
Framework
• Ciudad 2020 aims to achieve significant improvements in energy efficiency, the Internet of the Future, the Internet of Things, human behaviour, environmental sustainability, and mobility and transport, in order to design the City of the Future: sustainable, efficient, smart.
  ◦ Spanish R&D project, INNPRONTA Programme, Center for Industrial Technological Development (CDTI), Ministry of Economy and Competitiveness
  ◦ 2011-2014
  ◦ €16.3 M budget
  ◦ 5 multinational corporations, 4 SMEs, 8 PRIs
• Daedalus focuses on the automatic extraction of meaning from all types of multimedia content, using NLP technologies and data/text analytics to help our customers solve any challenge in these areas.
Citizen Sensor
[Diagram: facets of the citizen as a sensor: leisure and free time; surveys; mobility; professional activities; opinions in social media; relationship with public administration; collaborative sensing; relationship with other people]
Citizen 2020 = another city sensor
Citizen Sensor
• An innovative way to capture highly descriptive, high-level, heterogeneous information that brings high added value, especially when aggregated
• More complex and richer information than other sensors
  ◦ “smells awful”, “there is a fire”, “I’m going to the sales”…
• Individual actions may reveal citizen trends
  ◦ validating a bus ticket → route density
• Opinions/sentiments of citizens about the city
  ◦ follow social networks to assess the impact of new policies
• Collaborative sensing
  ◦ using smartphones to gather data (pollution, energy consumption) at low cost, opening new possibilities
Our approach
What: Build a system able to capture, store and analyze user messages
Where: On Twitter
For whom: City administrators
What for: To help them quickly and easily understand trends in citizen behaviour and learn citizens' opinions about city services, events, etc.
Why: To enable them to better understand citizens' needs and generate hypotheses about urban behaviour models, in order to improve municipal management policies and bring them closer to the citizens' actual reality
How: Using NLP technologies
When: In real time
Architecture
Information Repository
• Stores the high volume of data and provides advanced search functionality to exploit the information
• Based on Elasticsearch
  ◦ open source, distributed, real-time search and analytics engine
  ◦ complex search capabilities
  ◦ scalable high-performance solution
http://www.elasticsearch.org
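To make the "complex search capabilities" point concrete, here is a minimal sketch that builds an Elasticsearch query body as a plain Python dictionary. The field names (`text`, `tag_list`, `type`, `value`) follow the example record shown later in the slides. The index mapping is an assumption (in particular, `tag_list` mapped as a `nested` field with non-analyzed `type` and `value`), and so is the helper name `build_tag_query`. The `bool`/`filter` form is the one used by recent Elasticsearch releases; the 2014-era version used in the project may have needed a different syntax.

```python
import json


def build_tag_query(free_text, tag_type, tag_value, size=10):
    """Build a query body: full-text match plus an exact tag filter.

    Assumes `tag_list` is mapped as a nested field (hypothetical mapping).
    """
    return {
        "size": size,
        "query": {
            "bool": {
                # Full-text relevance match on the tweet text.
                "must": [{"match": {"text": free_text}}],
                # Exact, non-scoring restriction on one annotation tag.
                "filter": [
                    {
                        "nested": {
                            "path": "tag_list",
                            "query": {
                                "bool": {
                                    "must": [
                                        {"term": {"tag_list.type": tag_type}},
                                        {"term": {"tag_list.value": tag_value}},
                                    ]
                                }
                            },
                        }
                    }
                ],
            }
        },
    }


# Tweets mentioning "atasco" (traffic jam) that were tagged with negative sentiment.
body = build_tag_query("atasco", "sentiment", "N")
print(json.dumps(body, indent=2, ensure_ascii=False))
```

The resulting dictionary could be sent as the body of a search request; only the construction of the query is shown here, so the sketch runs without a cluster.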
Gatherer
• Set of concurrent processes that query the Twitter APIs to collect tweets
  ◦ Search or Streaming API
  ◦ Filter by a list of user identifiers, a list of keywords to track (terms, hashtags) and/or a set of geographical bounding boxes
  ◦ Returns tweet text, author, location, embedded media
https://dev.twitter.com/docs/api/1.1
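The three filter types listed for the Gatherer correspond to the `follow`, `track` and `locations` parameters of the v1.1 streaming filter endpoint. The sketch below only assembles those parameters; it does not open a connection. The helper name `build_filter_params` and the sample coordinates are illustrative assumptions, not part of the original system.

```python
def build_filter_params(user_ids=(), keywords=(), bounding_boxes=()):
    """Return request parameters for a streaming filter.

    Each bounding box is (west_lon, south_lat, east_lon, north_lat),
    i.e. the south-west corner first, longitude before latitude.
    """
    params = {}
    if user_ids:
        params["follow"] = ",".join(str(uid) for uid in user_ids)
    if keywords:
        params["track"] = ",".join(keywords)
    if bounding_boxes:
        coords = []
        for west, south, east, north in bounding_boxes:
            if south > north:
                raise ValueError("box must list the southern latitude first")
            coords.extend([west, south, east, north])
        params["locations"] = ",".join(f"{c:g}" for c in coords)
    if not params:
        raise ValueError("at least one filter is required")
    return params


# Illustrative box roughly around a city; the coordinates are made up for the example.
city_box = (-3.89, 40.31, -3.52, 40.64)
params = build_filter_params(keywords=["#madrid", "atasco"], bounding_boxes=[city_box])
print(params)
```

Keeping parameter construction separate from the network code makes it easy to run several gatherer processes, each with its own filter set.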
Inquirer
• Set of concurrent processes that annotate tweets using our Textalytics Core APIs (http://textalytics.com)
• Topic Extraction API
  ◦ Entities
  ◦ Concepts
  ◦ Hashtags
• Text Classification API
  ◦ Thematic area of the message (transport, economy, daily life…)
  ◦ Citizen Sensor model
     - Alert situations (road accidents, fires, street violence…)
     - Specific location of the user (building, means of transport...)
     - Events to which the text refers (cultural events, sports...)
• Sentiment Analysis API
  ◦ Sentiment polarity: P+, P, NEU, N, N+, NONE
  ◦ Irony and subjectivity
• User Demographics API
  ◦ User demographics: gender, age, type of tweet author
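The Inquirer's fan-out can be sketched as a pool of workers, each running every annotator on a tweet and merging the results into one `tag_list`. This is a sketch under assumptions: the two annotator functions are hypothetical local stand-ins for the remote APIs, and a thread pool is used here in place of the separate processes mentioned on the slide.

```python
from concurrent.futures import ThreadPoolExecutor


# Hypothetical stand-ins for the remote annotation services.
def extract_topics(text):
    return [{"type": "hashtag", "value": w} for w in text.split() if w.startswith("#")]


def classify_sentiment(text):
    return [{"type": "sentiment", "value": "NONE"}]


ANNOTATORS = [extract_topics, classify_sentiment]


def annotate(tweet):
    """Run every annotator on one tweet and merge the resulting tags."""
    tags = []
    for annotator in ANNOTATORS:
        tags.extend(annotator(tweet["text"]))
    return {**tweet, "tag_list": tags}


def annotate_all(tweets, workers=4):
    # Threads suit this workload when annotators are I/O-bound API calls.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(annotate, tweets))  # map preserves input order


tweets = [{"text": "obras en la calle #madrid"}, {"text": "sin etiquetas"}]
for record in annotate_all(tweets):
    print(record)
```

Each annotated record has the same shape as the example record on the "Example" slide, so it can be stored in the repository as-is.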
Entities, concepts, hashtags 
Advanced NLP is used to obtain part-of-speech (POS) tags, the syntactic tree and a semantic analysis of the text, which are then used to identify different types of significant elements
Text classification
State-of-the-art hybrid text classification model that combines statistical classification with rule-based filtering
[Figure: classification models shown: Social Media, Citizen Sensor]
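The two-stage idea (statistical scoring followed by rule-based filtering) can be illustrated with a toy example. Everything specific here is an assumption made for illustration: the category names, the hand-set term weights standing in for a trained model, and the required/forbidden term rules. The slides do not describe the actual model or rule language.

```python
# Hypothetical per-category term weights, standing in for a trained statistical model.
WEIGHTS = {
    "traffic": {"atasco": 2.0, "tráfico": 1.5, "vía": 0.5},
    "weather": {"viento": 2.0, "lluvia": 2.0, "rama": 0.3},
}

# Hypothetical rules: a category is kept only if no forbidden term occurs
# and, when required terms are listed, at least one of them occurs.
RULES = {
    "traffic": {"required": {"atasco", "tráfico"}, "forbidden": {"aéreo"}},
    "weather": {"required": set(), "forbidden": set()},
}


def tokenize(text):
    return [t.strip(".,;:!?¡¿").lower() for t in text.split()]


def statistical_scores(tokens):
    """Stage 1: score every category with a linear bag-of-words model."""
    return {cat: sum(w.get(tok, 0.0) for tok in tokens) for cat, w in WEIGHTS.items()}


def passes_rules(category, tokens):
    """Stage 2: rule-based filter applied to a candidate category."""
    rule = RULES[category]
    present = set(tokens)
    if present & rule["forbidden"]:
        return False
    return not rule["required"] or bool(present & rule["required"])


def classify(text, threshold=1.0):
    tokens = tokenize(text)
    scores = statistical_scores(tokens)
    kept = [(c, s) for c, s in scores.items() if s >= threshold and passes_rules(c, tokens)]
    return sorted(kept, key=lambda cs: cs[1], reverse=True)


print(classify("El viento ha roto una rama y hay un atascazo en la gran vía"))
print(classify("tráfico aéreo intenso"))
```

In the second call the statistical stage proposes "traffic", but the rule stage rejects it because of the forbidden term. This shows how rules can prune false positives that a purely statistical model lets through.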
Topics
Alerts
Locations, events
Sentiment analysis 
State-of-the-art lexicon-based sentiment analysis model that uses POS tags and the syntactic tree to detect negation and control the scope of modifiers, combined with subjectivity classification and irony detection
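A much-simplified sketch of lexicon-based polarity with negation handling follows. It is not the authors' model: the lexicon entries, the intensifier weights, the score thresholds for each label, and the fixed three-token negation window (used here in place of syntactic scope from a parse tree) are all assumptions chosen for illustration.

```python
# Hypothetical miniature lexicon; a real one would hold thousands of entries.
LEXICON = {"good": 1.0, "great": 2.0, "bad": -1.0, "awful": -2.0}
NEGATORS = {"not", "no", "never"}
INTENSIFIERS = {"very": 1.5, "really": 1.5}
NEGATION_WINDOW = 3  # tokens affected after a negator (stand-in for syntactic scope)


def polarity_score(text):
    """Return (score, hits): summed polarity and number of lexicon matches."""
    score, hits = 0.0, 0
    negate_left = 0  # how many upcoming tokens are still inside the negation scope
    boost = 1.0      # multiplier carried over from a preceding intensifier
    for raw in text.lower().split():
        token = raw.strip(".,;:!?")
        if token in NEGATORS:
            negate_left = NEGATION_WINDOW
            continue
        if token in INTENSIFIERS:
            boost = INTENSIFIERS[token]
            continue
        if token in LEXICON:
            value = LEXICON[token] * boost
            score += -value if negate_left > 0 else value
            hits += 1
        boost = 1.0
        negate_left = max(negate_left - 1, 0)
    return score, hits


def polarity_label(text):
    """Map the score onto the P+, P, NEU, N, N+, NONE scale (thresholds assumed)."""
    score, hits = polarity_score(text)
    if hits == 0:
        return "NONE"
    if score >= 2.0:
        return "P+"
    if score > 0:
        return "P"
    if score == 0:
        return "NEU"
    if score > -2.0:
        return "N"
    return "N+"


for sample in ["The smell is awful", "not very good", "great service", "the bus arrived"]:
    print(sample, "->", polarity_label(sample))
```

The `NONE` label is reserved for texts with no lexicon match at all, which keeps "no opinion expressed" distinct from a neutral balance of positive and negative terms.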
User Demographics 
Text classification based on an n-gram model to infer user type, gender and
age from the user's login, name and profile description
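One possible reading of "n-gram model" is a character n-gram classifier over profile fields. The sketch below uses character trigrams with add-one smoothing to separate two user types. The training texts, the label names and the choice of a naive Bayes style scorer are assumptions made for illustration; the slides do not specify the features or the learning algorithm.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Character n-grams with start/end markers, lower-cased."""
    padded = f"^{text.lower()}$"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


# Hypothetical toy training data: profile descriptions labelled by user type.
TRAINING = {
    "PERSON": ["mother of two, coffee lover", "student and football fan"],
    "ORGANIZATION": ["official account of the city council",
                     "official news and customer service"],
}


def train(examples):
    profiles = {label: Counter(g for t in texts for g in char_ngrams(t))
                for label, texts in examples.items()}
    vocab = set().union(*profiles.values())
    return profiles, vocab


def classify(text, profiles, vocab):
    best_label, best_logp = None, -math.inf
    for label, counts in profiles.items():
        total = sum(counts.values())
        # Add-one smoothing so unseen n-grams do not zero out the score.
        logp = sum(math.log((counts[g] + 1) / (total + len(vocab) + 1))
                   for g in char_ngrams(text))
        if logp > best_logp:
            best_label, best_logp = label, logp
    return best_label


profiles, vocab = train(TRAINING)
print(classify("official account of the transport company", profiles, vocab))
```

The same scheme would apply to the login and display name by concatenating them or training one profile per field; which combination works best would have to be determined on real labelled data.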
Example 
{ 
"text":"el viento ha roto una rama y hay un atascazo increible en toda la gran vía...", 
"tag_list":[ 
{"type":"sensor", "value":"011002 Ubicación - Exteriores - Vías públicas"}, 
{"type":"sensor", "value":"070700 Alertas meteorológicas - Viento"}, 
{"type":"sensor", "value":"080100 Incidencia - Congestión de tráfico"}, 
{"type":"topic", "value":"06 medio ambiente, meteorología y energía"}, 
{"type":"entity", "value":"Gran Vía"}, 
{"type":"concept", "value":"viento"}, 
{"type":"sentiment", "value":"N"}, 
{"type":"subjectivity", "value":"OBJ"}, 
{"type":"irony", "value":"NONIRONIC"}, 
{"type":"user_type", "value":"PERSON"}, 
{"type":"user_gender", "value":"FEMALE"}, 
{"type":"user_age", "value":"25-35"} 
] 
}
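The tweet in this record reads, in English, "the wind has broken a branch and there is an incredible traffic jam all along Gran Vía...". As a minimal sketch of how a consumer might use such a record, the code below parses it, groups the tags by type, and splits each sensor value into its numeric code and label. The helper names are illustrative, and splitting at the first space is an assumption based only on the three sensor values shown.

```python
import json
from collections import defaultdict

RECORD = """
{
  "text": "el viento ha roto una rama y hay un atascazo increible en toda la gran vía...",
  "tag_list": [
    {"type": "sensor", "value": "011002 Ubicación - Exteriores - Vías públicas"},
    {"type": "sensor", "value": "070700 Alertas meteorológicas - Viento"},
    {"type": "sensor", "value": "080100 Incidencia - Congestión de tráfico"},
    {"type": "topic", "value": "06 medio ambiente, meteorología y energía"},
    {"type": "entity", "value": "Gran Vía"},
    {"type": "concept", "value": "viento"},
    {"type": "sentiment", "value": "N"},
    {"type": "subjectivity", "value": "OBJ"},
    {"type": "irony", "value": "NONIRONIC"},
    {"type": "user_type", "value": "PERSON"},
    {"type": "user_gender", "value": "FEMALE"},
    {"type": "user_age", "value": "25-35"}
  ]
}
"""


def group_tags(record):
    """Collect tag values under their tag type, preserving order."""
    grouped = defaultdict(list)
    for tag in record["tag_list"]:
        grouped[tag["type"]].append(tag["value"])
    return dict(grouped)


def split_sensor_value(value):
    """Split '080100 Incidencia - ...' into ('080100', 'Incidencia - ...')."""
    code, _, label = value.partition(" ")
    return code, label


record = json.loads(RECORD)
tags = group_tags(record)
print(tags["sentiment"], tags["entity"])
for value in tags["sensor"]:
    print(split_sensor_value(value))
```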
Geolocation
Visualization 
http://www.highcharts.com 
http://openlayers.org 
http://d3js.org
Ongoing business cases
• City console for a local administration to analyze in real time the behaviour and topics of interest of citizens, with two components:
  ◦ a private console, internal to the city services, for analytics
  ◦ a public dashboard to engage citizens with their city, displaying attractive, summarized, non-confidential information at selected public locations (town hall, libraries, museums) or on an LED video wall in a busy downtown square
• Social alert detection system
  ◦ for 112 emergency services, providing early detection of security-related issues
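One simple way to turn alert-tagged tweets into early warnings is a sliding-window counter per alert category. The sketch below is an illustration only: the class name, the window length, the threshold and the assumption that timestamps arrive in non-decreasing order are all hypothetical, and the slides do not say how the real system decides when to raise an alert.

```python
from collections import deque


class AlertDetector:
    """Flag a category when enough alert-tagged tweets arrive within a time window."""

    def __init__(self, window_seconds=300, threshold=3):
        self.window_seconds = window_seconds
        self.threshold = threshold
        self._times = {}  # alert category -> deque of recent timestamps

    def observe(self, category, timestamp):
        """Record one alert-tagged tweet; return True if the category is over threshold.

        Timestamps are seconds and are assumed to be non-decreasing per category.
        """
        times = self._times.setdefault(category, deque())
        times.append(timestamp)
        # Drop observations that have fallen out of the window.
        while times and timestamp - times[0] > self.window_seconds:
            times.popleft()
        return len(times) >= self.threshold


detector = AlertDetector(window_seconds=60, threshold=3)
for ts in (0, 30, 50, 200):
    print(ts, detector.observe("fire", ts))
```

A fixed threshold is the crudest possible criterion; comparing the current count against a historical baseline for the same category and area would reduce false alarms in busy periods.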
For the short/mid-term future
• Geolocated clustering of trending topics
• Analysis at neighbourhood level
[Figure: map labels: health, traffic jam, air pollution, jellyfish, pollen]
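As a rough illustration of geolocated topic clustering, the sketch below bins geotagged tweets into a fixed latitude/longitude grid and reports the most frequent topic in each cell. Grid binning is a deliberately simple stand-in for whatever clustering method the authors have in mind; the cell size, the field names `lat`, `lon` and `topic`, and the sample points are assumptions.

```python
import math
from collections import Counter, defaultdict


def cell_of(lat, lon, cell_deg=0.01):
    """Map a coordinate to an integer grid cell of size cell_deg degrees."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def trending_by_cell(tweets, cell_deg=0.01, min_count=2):
    """Return {cell: (top_topic, count)} for cells whose top topic has min_count or more tweets."""
    counts = defaultdict(Counter)
    for t in tweets:
        counts[cell_of(t["lat"], t["lon"], cell_deg)][t["topic"]] += 1
    result = {}
    for cell, counter in counts.items():
        topic, n = counter.most_common(1)[0]
        if n >= min_count:
            result[cell] = (topic, n)
    return result


# Made-up sample points for the example.
sample = [
    {"lat": 40.4205, "lon": -3.7055, "topic": "traffic jam"},
    {"lat": 40.4208, "lon": -3.7052, "topic": "traffic jam"},
    {"lat": 40.4206, "lon": -3.7058, "topic": "pollen"},
    {"lat": 40.4755, "lon": -3.6875, "topic": "air pollution"},
]
print(trending_by_cell(sample))
```

A coarser `cell_deg` approximates neighbourhood-level analysis, while a density-based clustering algorithm would avoid the arbitrary cell boundaries of a grid.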
For the short/mid-term future
• Analysis of the city's pace of life
For the short/mid-term future
• Mobility analysis
  ◦ How, when and why people move through the city
  ◦ Route identification (home → work → free time → home)
  ◦ Route changes (due to weather)
For the short/mid-term future
• City reputation and brand personality
• Automated satisfaction surveys
This work has been supported by several Spanish R&D projects: Ciudad2020: Hacia un nuevo modelo de ciudad inteligente 
sostenible (INNPRONTA IPT-20111006), MA2VICMR: Improving the access, analysis and visibility of the multilingual and 
multimedia information in web for the Region of Madrid (S2009/TIC-1542) and MULTIMEDICA: Multilingual Information 
Extraction in Health domain and application to scientific and informative documents (TIN2010-20644-C03-01). The authors
would like to thank all partners for their knowledge and support.