REVEAL Project: Co-funded by the EU FP7 Programme Nr.: 610928 www.revealproject.eu © 2015 REVEAL consortium
Stuart E. Middleton
University of Southampton IT Innovation Centre
sem@it-innovation.soton.ac.uk
www.it-innovation.soton.ac.uk
REVEAL Project - Trust and Credibility Analysis

RDSM 2015 Invited Talk Overview
- REVEAL Project
- Modality Analysis of Social Media Streams
- Geosemantics and Spatio-Temporal Grounding of Rumours
- Knowledge-based Approach to Trust and Credibility Modelling
- Future Work

REVEAL Project - Overview
- Objectives
  - Enable users to reveal hidden ‘modalities’ such as the reputation, influence or credibility of information
- Approach - Modality Extraction and Analysis
  - Real-time modality extraction
  - On-demand analytics capabilities
  - Event-driven architecture using RabbitMQ for communication
  - Processing based on a scalable Storm cluster (real-time) and standalone HTTP services (on-demand)
- Journalism Use Case
  - Newsgathering - find newsworthy content and evidence to help verify that content
- Enterprise Use Case
  - Forums - identify and help newcomers; track product feedback, sentiment and emerging trends
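The event-driven style described above can be illustrated with a very small publish/subscribe sketch. This is only an illustration under stated assumptions: Python's standard `queue.Queue` stands in for the message broker (the slides name RabbitMQ, whose API is not shown here), and the names `subscribe`, `publish`, `dispatch_all`, `count_words` and the event type `item.created` are hypothetical.

```python
import queue

# Stand-in for a broker queue; this is NOT the RabbitMQ API.
events = queue.Queue()
handlers = {}  # event type -> list of handler functions


def subscribe(event_type, handler):
    """Register a handler to be called for every event of this type."""
    handlers.setdefault(event_type, []).append(handler)


def publish(event_type, payload):
    """Place an event on the queue; producers never call consumers directly."""
    events.put((event_type, payload))


def dispatch_all():
    """Drain the queue and pass each event to its subscribed handlers."""
    results = []
    while not events.empty():
        event_type, payload = events.get()
        for handler in handlers.get(event_type, []):
            results.append(handler(payload))
    return results


def count_words(item):
    """Hypothetical real-time extraction step: a trivial per-item measurement."""
    return {"id": item["id"], "words": len(item["text"].split())}


subscribe("item.created", count_words)
publish("item.created", {"id": 1, "text": "Flooding reported at the exchange"})
publish("item.deleted", {"id": 2})  # no subscriber, so it is ignored
results = dispatch_all()
```

Because producers and consumers only share the queue, further extraction steps could be added by subscribing more handlers without changing the publishing side.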

Modality Analysis of Social Media Streams - Modality Extraction
- Social Media Streams
  - Twitter, YouTube, Instagram, Facebook, Foursquare
  - Search (historical) and Stream (real-time)
- Social Network Analysis
  - Community Detection, Community Graph Extraction, Community Classification (e.g. topics), Role Analysis (e.g. popular participant), Influence Models ...
- Content Analysis
  - Image Feature Extraction (e.g. sky, city), Image Similarity Clustering, Multimedia Indexing, Image Manipulation Detection, Topic Models, Original Content Detection, Text Stylometry ...
- Geospatial Analysis
  - Geoparsing, Geosemantic Classification, Image-based Geolocation, Geospatial Topic Model ...
- Semantic Analysis
  - Directed Linked-Data Crawler, Semantic Context (e.g. context summaries based on linked data) ...

Real-Time Annotation of Social Media
[Figure: a timeline of content creation times (11:30 to 11:50) with JSON items from Twitter (@stuart_e_middle, tweet), Facebook (BBCNews, post), YouTube (CNN, video) and Instagram (bbcnews, image). Each item is labelled with Author, Timestamp, URIs, Hashtags, ... and with the annotations Geoparsed Location, Influence Score, Topic Model and Text Stylometry.]
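A rough sketch of what one annotated item from this figure might look like as a data structure is given below. The field names follow the labels in the figure, but the class `AnnotatedItem`, the function `annotate`, the example text, timestamp and URL are all hypothetical, and the analysis outputs are left as empty placeholders rather than computed.

```python
import re
from dataclasses import dataclass, field, asdict


@dataclass
class AnnotatedItem:
    source: str       # e.g. "twitter", "facebook", "youtube", "instagram"
    author: str
    timestamp: str    # ISO 8601 string, kept as text for simplicity
    text: str
    uris: list = field(default_factory=list)
    hashtags: list = field(default_factory=list)
    annotations: dict = field(default_factory=dict)


def annotate(source, author, timestamp, text):
    """Build an item, filling in the fields that can be read from the text."""
    item = AnnotatedItem(source, author, timestamp, text)
    item.uris = re.findall(r"https?://\S+", text)
    item.hashtags = re.findall(r"#\w+", text)
    # Placeholders for the outputs of the analysis components named in the figure.
    item.annotations = {
        "geoparsed_location": None,
        "influence_score": None,
        "topic_model": None,
        "text_stylometry": None,
    }
    return item


# Made-up example content, for illustration only.
item = annotate(
    "twitter",
    "@stuart_e_middle",
    "2015-01-20T11:30:00Z",
    "Smoke near the terminal #Donetsk http://example.org/v1",
)
record = asdict(item)  # plain dict, ready to be serialised as JSON
```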

Incremental Aggregation of Annotated Social Media
[Figure: annotated JSON items (Facebook BBCNews post, YouTube CNN video, Instagram bbcnews image, Twitter @stuart_e_middle tweet, Facebook profile @gadgetshow, Instagram gadgetshow image) are aggregated and analysed.]
- Cross Check
  - Timestamps
  - Locations
  - Authors
  - ...
- Trust Analysis
  - Trusted sources
  - Reputations
  - Correlation to known facts
  - ...
- Outputs: Trustworthy Evidence, Credible Evidence
- User Feedback (appears twice in the diagram)
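One way to picture the cross check of timestamps, locations and authors is the sketch below. It is an assumption about how such a check could be written, not the project's method: the function `cross_check`, the 10 minute window, the item fields and the example data are all hypothetical.

```python
from datetime import datetime, timedelta


def cross_check(items, window=timedelta(minutes=10)):
    """For each item, list the ids of items by a different author that
    mention the same location within the given time window."""
    support = {}
    for a in items:
        support[a["id"]] = [
            b["id"]
            for b in items
            if b["id"] != a["id"]
            and b["author"] != a["author"]
            and b["location"] == a["location"]
            and abs(b["time"] - a["time"]) <= window
        ]
    return support


# Made-up items, for illustration only.
items = [
    {"id": 1, "author": "author_a", "location": "Donetsk Airport", "time": datetime(2015, 1, 20, 11, 30)},
    {"id": 2, "author": "author_b", "location": "Donetsk Airport", "time": datetime(2015, 1, 20, 11, 35)},
    {"id": 3, "author": "author_c", "location": "London", "time": datetime(2015, 1, 20, 11, 40)},
    {"id": 4, "author": "author_a", "location": "Donetsk Airport", "time": datetime(2015, 1, 20, 11, 45)},
]
support = cross_check(items)
```

Items with independent supporting reports could then be offered to the user as candidates for credible evidence, while unsupported items (item 3 here) would need other checks.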

Geosemantics and Spatio-Temporal Grounding of Rumours - Geosemantic Classification of Evidence
- What is ‘geosemantics’?
  - Study of the context of spatial data - in our case, contextual text relating to mentioned locations
- Our Approach - Geosemantic feature classification
  - Geoparsing [Planet OpenStreetMap database] » LOC (high precision, native language)
  - Text + POS + LOC + training set » classifier » context of how a location is talked about
  - Classes » timeliness (past | future | present), situatedness (insitu | remote), confirmation (confirm | deny)
    - Eyewitness reports » insitu
    - Breaking content » present
    - Denial of rumours » deny
- State of the art - Geosemantic text analysis
  - Text + POS + training set » classifier » event type
  - Location text » NLP Grammar » direction & distance
    - e.g. trouble spotted 5 miles north of London
  - Location text » sentiment analysis » good / bad opinion of text
  - Resilience of approaches across event types and languages is an issue
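The "direction & distance" grammar listed under the state of the art can be illustrated with a small pattern matcher applied to the slide's own example sentence. This is a minimal sketch using a single regular expression, not a full NLP grammar and not the REVEAL classifier; the names `OFFSET_PATTERN` and `parse_offset` are hypothetical.

```python
import re

# Matches phrases such as "5 miles north of London".
OFFSET_PATTERN = re.compile(
    r"(?P<distance>\d+(?:\.\d+)?)\s*"
    r"(?P<unit>miles?|km|kilometres?|kilometers?)\s+"
    r"(?P<direction>north|south|east|west)\s+of\s+"
    r"(?P<place>[A-Z][\w-]*(?:\s+[A-Z][\w-]*)*)"
)


def parse_offset(text):
    """Return distance, unit, direction and place name, or None if absent."""
    match = OFFSET_PATTERN.search(text)
    if match is None:
        return None
    return {
        "distance": float(match.group("distance")),
        "unit": match.group("unit"),
        "direction": match.group("direction"),
        "place": match.group("place"),
    }


parsed = parse_offset("trouble spotted 5 miles north of London")
missing = parse_offset("trouble spotted somewhere in town")
```

A pattern like this only handles English phrasing and the four cardinal directions, which is one concrete form of the resilience issue across languages noted above.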

Geosemantics and Spatio-Temporal Grounding of Rumours
[Figure: processing architecture diagram. Recoverable labels:]
- Data sources: OpenStreetMap Planet Database; Twitter Search & Streaming API; YouTube Search API; Instagram Search API
- Inputs: Keywords; Focus Area(s)
- Components: Geospatial Pre-processing; Social Media Crawler; Geoparse (shown five times); Geoclassify; Situation Assessment; Visualization
- Data labels: locations & geometry; JSON tweets; JSON tweets + Locations; JSON tweets + Locations + Class labels; SQL content + loc + class
- Communication: RabbitMQ; HTTP, SQL, SPARQL
- Key: Storm Topology; Service
CITATION geosemantics
Middleton, S.E., Krivcovs, V.
"Geoparsing and Geosemantic Analysis of Social Media for Spatio-temporal Grounding of Rumours during Breaking News"
submitted to TOIS, 2015
CITATION geoparsing
Middleton, S.E., Middleton, L., Modafferi, S.
"Real-time Crisis Mapping of Natural Disasters using Social Media"
IEEE Intelligent Systems, vol. 29, no. 2, pp. 9-17, Mar.-Apr. 2014
CITATION tech blog
Middleton, S.E.
"From Twitter-based Crisis Mapping to Large-scale Real-Time Situation Assessment with Trust and Credibility Analysis", 2014
http://revealproject.eu/
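The data labels in the diagram (JSON tweets, then + Locations, then + Class labels) suggest a staged enrichment, which can be sketched as below. Everything here is a simplified stand-in: a two-entry dictionary replaces the OpenStreetMap-derived location data, keyword cues replace the trained classifier, and the names `GAZETTEER`, `DENY_CUES`, `CONFIRM_CUES`, `geoparse`, `geoclassify` and `run_pipeline`, as well as the example tweets, are hypothetical.

```python
# Toy gazetteer; the real system derives locations from OpenStreetMap data.
GAZETTEER = {
    "new york stock exchange": "building",
    "battery park": "park",
}

# Hypothetical keyword cues standing in for a trained classifier.
DENY_CUES = ("not true", "false", "fake", "hoax")
CONFIRM_CUES = ("confirmed", "confirms")


def geoparse(tweet):
    """Stage 1: add the known locations mentioned in the text."""
    text = tweet["text"].lower()
    locations = [
        {"name": name, "type": kind}
        for name, kind in GAZETTEER.items()
        if name in text
    ]
    return {**tweet, "locations": locations}


def geoclassify(tweet):
    """Stage 2: add class labels based on simple keyword cues."""
    text = tweet["text"].lower()
    labels = []
    if any(cue in text for cue in DENY_CUES):
        labels.append("deny")
    elif any(cue in text for cue in CONFIRM_CUES):
        labels.append("confirm")
    return {**tweet, "class_labels": labels}


def run_pipeline(tweets):
    """Run both stages and keep only tweets that mention a known location."""
    enriched = [geoclassify(geoparse(t)) for t in tweets]
    return [t for t in enriched if t["locations"]]


# Made-up tweets, for illustration only.
tweets = [
    {"text": "Reports that the New York Stock Exchange is flooded are false"},
    {"text": "Confirmed: water rising in Battery Park"},
    {"text": "Stay safe everyone"},
]
kept = run_pipeline(tweets)
```

Keyword cues are brittle (for instance, "unconfirmed" contains "confirmed"), which is one reason a trained classifier over text, part-of-speech and location features is used in the approach described earlier.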

Case Study - NYSE flooding in 2012 (false rumour)
- Geosemantic filtering of evidence
  - 7000+ tweets in a 5 minute analysis window
  - 114 ground truth tweets - WeatherChannel & CNN
  - Geosemantic filter reduced content volume by 95%
  - 100% ground truth recall for CONFIRM class
  - 85% ground truth recall for DENY class
[Figure labels: New York Stock Exchange (building), Battery Park (park)]
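To make these figures concrete, the arithmetic can be sketched as follows. Recall is the standard ratio of ground truth items retrieved to ground truth items in total. The slide gives "7000+" tweets, so 7000 is used as a lower bound, and the counts passed to `recall` are hypothetical because the slide does not give per-class ground truth counts.

```python
def remaining_after_filter(total, reduction):
    """Number of items left after a filter removes the given fraction."""
    return round(total * (1 - reduction))


def recall(found, relevant):
    """Fraction of the relevant (ground truth) items that were retrieved."""
    return found / relevant


# 95% reduction applied to the lower bound of 7000 tweets.
left_to_review = remaining_after_filter(7000, 0.95)

# Hypothetical counts chosen only to show what 85% recall means.
example_recall = recall(17, 20)
```

Under these assumptions a journalist would review roughly 350 tweets from the 5 minute window instead of more than 7000.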

Case Study - Donetsk Airport 2015 (conflicting claims)
- Spatio-temporal mapping of evidence
  - 300,000 tweets, YouTube & Instagram posts over 24 hours of the Ukraine crisis on 20th Jan 2015
  - 4 ground truth YouTube videos used by LifeNews reports on 20th Jan 2015
  - Focus: Donetsk Airport cluster
  - Ground truth URIs ranked 10, 14 & 28 out of 30
[Figure labels: Веселе (suburb), Donetsk Airport (building), Київський проспект (road)]

Knowledge-based Approach to Trust and Credibility Modelling - Definitions
- What are ‘credibility’ and ‘trust’?
  - Trust and credibility are not well defined - below is our interpretation
  - Credibility - consistency with other content (e.g. similar reports) and contextual information (e.g. local geography)
  - Trust - subjective assessment of the likelihood of content being false
  - A credible news report might still be false!

Knowledge-based Approach to Trust and Credibility Modelling - Approach
- Our Approach - Knowledge-based Trust Modelling
  - Journalists already have personal sets of trusted sources they have come to rely upon
  - Knowledge-based approach
    - Journalists assert a-priori known facts (e.g. trusted sources, known locations)
    - Evidence from streams is asserted incrementally into a triple store (i.e. uSeekM + OWLIM)
    - Simple inference » classify evidence » interactive exploration of evidence with the journalist
    - OWL classes & individuals, owl:Restriction, owl:intersectionOf, SPARQL, GeoSPARQL ...
  - Not a black box - end users explore the evidence and we help them make a verification decision
  - Scalable approach able to represent the different viewpoints of journalists
- State of the Art - Trust and Credibility Modelling
  - Unsupervised learning (e.g. Bayesian Network, Dempster-Shafer) » trust prediction without explanation
  - Supervised reputation models » trust prediction with explanation
  - Heuristics & activity metrics » trust prediction with explanation
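The knowledge-based steps (assert a-priori facts, assert evidence incrementally, then classify by simple inference) can be sketched with a toy in-memory triple set. This is not the project's triple store or its OWL model: plain Python sets replace the store, a hand-written rule imitates the effect of combining two class restrictions with an intersection, and every identifier with an `ex:`, `source:`, `place:` or `evidence:` prefix is hypothetical.

```python
# Toy triple store: a set of (subject, predicate, object) tuples.
triples = set()


def add(subject, predicate, obj):
    """Assert one fact."""
    triples.add((subject, predicate, obj))


def objects(subject, predicate):
    """All objects asserted for a subject and predicate."""
    return {o for (s, p, o) in triples if s == subject and p == predicate}


def has_type(node, cls):
    return cls in objects(node, "rdf:type")


# A-priori facts asserted by a journalist (hypothetical).
add("source:agency_a", "rdf:type", "ex:TrustedSource")
add("place:donetsk_airport", "rdf:type", "ex:KnownLocation")

# Evidence asserted incrementally as items arrive (hypothetical).
add("evidence:1", "ex:author", "source:agency_a")
add("evidence:1", "ex:mentions", "place:donetsk_airport")
add("evidence:2", "ex:author", "source:unknown_b")
add("evidence:2", "ex:mentions", "place:donetsk_airport")


def classify(evidence):
    """Simple inference: derive classes from the author and mentioned places."""
    by_trusted = any(has_type(a, "ex:TrustedSource") for a in objects(evidence, "ex:author"))
    at_known = any(has_type(l, "ex:KnownLocation") for l in objects(evidence, "ex:mentions"))
    classes = set()
    if by_trusted:
        classes.add("ex:EvidenceFromTrustedSource")
    if at_known:
        classes.add("ex:EvidenceAtKnownLocation")
    if by_trusted and at_known:
        # Plays the role of an intersection of the two classes above.
        classes.add("ex:TrustedLocatedEvidence")
    return classes


classes_1 = classify("evidence:1")
classes_2 = classify("evidence:2")
```

Because the a-priori facts are ordinary assertions, a different journalist could assert a different set of trusted sources and obtain a different classification of the same evidence, and each derived class can be traced back to the facts that produced it.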

Knowledge-based Approach to Trust and Credibility Modelling - Early Results - Work in Progress
[Three slides of figures; no text content is recoverable]

Future Work - Roadmap for REVEAL Trust and Credibility Analysis
- REVEAL project runs until Sept 2016
  - Ethnographic studies with journalists
    - Crawl content in parallel to journalists searching User Generated Content (UGC)
    - Record ground truth by observing journalists verifying real news and explaining their decisions
    - Empirical analysis - compare automated decisions with ground truth
  - MediaEval 2015 verification challenge
    - Competition verifying images using a common Twitter dataset (10 different news events)
  - User trials 2015 - 2016

Any questions?
Stuart E. Middleton
University of Southampton IT Innovation Centre
email: sem@it-innovation.soton.ac.uk
web: www.it-innovation.soton.ac.uk
twitter: @stuart_e_middle, @IT_Innov, @RevealEU
Many thanks for your attention!
