"MediaFinder: Collect, Enrich and Visualize Media Memes Shared by the Crowd", talk given at the 2nd Real Time Analysis and Mining of Social Streams Workshop (RAMSS) colocated with WWW 2013, Rio de Janeiro, Brazil
7. Social Media: some definitions
Media Item: a photo or a video that is shared on
a social network
Micropost: a text status message that can
optionally accompany a media item
Social Network: an online service that focuses
on building and reflecting social relationships
among people sharing interests or activities
Media Sharing Platforms: emphasis on sharing media,
but with blurred boundaries with social networks since users
are encouraged to react to media content
(like, comment, favorite, etc.)
Real-Time Analysis and Mining of Social Streams (RAMSS) - Rio de Janeiro - 14/05/2013 - 7
8. Social networks and media items
First-order support:
Posting requires the inclusion of a media item
Example: Flickr, YouTube
Second-order support:
Possibility to post media items but also text-only messages
Example: Facebook
Third-order support:
No direct support for media items; relies on third-party applications
to host them
Example: Twitter before the introduction of native photo support
9. Media Server
Composition of media item extractors (12 social networks)
Relies on search APIs plus a fixed 30 s timeout window to provide results
Falls back on screen scraping when necessary (Twitter ecosystem)
Implemented as a Node.js server
Serialize results in a common schema (JSON)
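The fan-out described above can be sketched as follows. This is a minimal Python sketch, not the actual Node.js implementation: the two extractor functions, the field names of the common schema (`mediaUrl`, `permalink`, `text`, `interactions`) and the example.org URLs are assumptions made for illustration only.

```python
import json
import time
from concurrent.futures import ThreadPoolExecutor, wait


def fast_extractor(term):
    """Hypothetical extractor that answers immediately."""
    return [{
        "mediaUrl": "http://example.org/photo/1.jpg",  # assumed field names
        "permalink": "http://example.org/post/1",
        "text": "Photo about " + term,
        "interactions": {"likes": 3, "comments": 1},
    }]


def slow_extractor(term):
    """Hypothetical extractor that misses a short timeout window."""
    time.sleep(0.5)
    return [{
        "mediaUrl": "http://example.org/video/2.mp4",
        "permalink": "http://example.org/post/2",
        "text": "Video about " + term,
        "interactions": {"likes": 0, "comments": 0},
    }]


def search_all(term, extractors, timeout_s=30.0):
    """Query every extractor in parallel; keep what arrives within the window."""
    pool = ThreadPoolExecutor(max_workers=len(extractors))
    futures = {pool.submit(fn, term): name for name, fn in extractors.items()}
    done, _late = wait(futures, timeout=timeout_s)
    results = {}
    for future in done:
        if future.exception() is None:  # skip extractors that raised an error
            results[futures[future]] = future.result()
    pool.shutdown(wait=False)  # do not block on extractors that are still running
    return results


extractors = {"fast": fast_extractor, "slow": slow_extractor}
# A 0.1 s window keeps the demo short; the slides state a fixed 30 s window.
results = search_all("elezioni2013", extractors, timeout_s=0.1)
print(json.dumps(results, indent=2))
```

Results from extractors that miss the window are simply dropped, which trades completeness for a bounded response time.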
10. [Figure with callouts: deep link; permalink; clean text for NLP processing; aggregate view of ALL social interactions; 12 social networks]
12. Media Finder (zooming on media items)
13. Media Finder (timeline view)
14. Named Entities are Pivotal
Standalone software
GATE
Stanford CoreNLP
Temis
Web APIs
http://nerd.eurecom.fr/
15. What is NERD?
Ontology [1], REST API [2], UI [3]
[1] http://nerd.eurecom.fr/ontology
[2] http://nerd.eurecom.fr/api/application.wadl
[3] http://nerd.eurecom.fr
The NERD ontology has been
integrated into the NIF project,
developed in the context of the
EU FP7 project LOD2: Creating
Knowledge out of Interlinked Data
16. NERD REST API
GET, POST, PUT, DELETE
/document
/user
/annotation/{extractor}
/extraction
/evaluation
...
JSON/RDF*
"entities": [{
  "entity": "Tim Berners-Lee",
  "type": "Person",
  "uri": "http://dbpedia.org/resource/Tim_berners_lee",
  "nerdType": "http://nerd.eurecom.fr/ontology#Person",
  "startChar": 30,
  "endChar": 45,
  "confidence": 1,
  "relevance": 0.5
}]
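A client might consume such a response as sketched below. The entity record is the one shown on the slide; the sample sentence is an assumption, written so that the slide's offsets line up, and the reading of `startChar` as inclusive and `endChar` as exclusive is likewise an assumption, not something the slide states.

```python
import json

# Assumed sample text: "Tim Berners-Lee" starts at character 30.
text = "The World Wide Web creator is Tim Berners-Lee."

response = json.loads("""
{"entities": [{
  "entity": "Tim Berners-Lee",
  "type": "Person",
  "uri": "http://dbpedia.org/resource/Tim_berners_lee",
  "nerdType": "http://nerd.eurecom.fr/ontology#Person",
  "startChar": 30,
  "endChar": 45,
  "confidence": 1,
  "relevance": 0.5
}]}
""")


def surface_form(text, entity):
    """Slice the mention out of the text (assumes an exclusive endChar)."""
    return text[entity["startChar"]:entity["endChar"]]


def nerd_class(entity):
    """Local name of the NERD ontology URI, i.e. the part after '#'."""
    return entity["nerdType"].rsplit("#", 1)[-1]


for entity in response["entities"]:
    print(surface_form(text, entity), nerd_class(entity), entity["uri"])
```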
Rizzo G., Troncy R. (2012), NERD: A Framework for Unifying Named Entity Recognition and Disambiguation Web Extraction
Tools. In: European chapter of the Association for Computational Linguistics (EACL'12), Avignon, France.
17. Media Finder Architecture
Media items harvesting using the Media Server
http://eventmedia.eurecom.fr/media-server/search/{combined}/{term}
https://github.com/vuknje/media-server (@tomayac fork)
Near-duplicate image removal
DCT signature on images and video frames,
Hamming distance between image pairs
Clustering and disambiguation
Named Entity Extraction using NERD
Topic Generation using LDA
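The near-duplicate step listed above (DCT signature plus Hamming distance) can be illustrated with a small sketch. The slides do not give the signature layout, so everything specific here is an assumption: an 8x8 grayscale input, an unnormalised DCT-II, a 15-bit signature taken from the low-frequency 4x4 corner (DC term dropped) and thresholded at the median. The synthetic test image is also made up.

```python
import math


def dct_1d(values):
    """Unnormalised DCT-II of a 1-D sequence."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * (2 * x + 1) * u / (2 * n))
            for x, v in enumerate(values))
        for u in range(n)
    ]


def dct_2d(pixels):
    """Separable 2-D DCT: transform the rows, then the columns."""
    rows = [dct_1d(row) for row in pixels]
    cols = [dct_1d(col) for col in zip(*rows)]  # indexed [u][v]
    return [list(row) for row in zip(*cols)]    # back to [v][u]


def dct_signature(pixels, keep=4):
    """Bit signature: 1 where a low-frequency coefficient exceeds the median."""
    coeffs = dct_2d(pixels)
    low = [coeffs[v][u] for v in range(keep) for u in range(keep)][1:]  # drop DC
    median = sorted(low)[len(low) // 2]
    signature = 0
    for c in low:
        signature = (signature << 1) | (1 if c > median else 0)
    return signature


def hamming_distance(a, b):
    """Number of bit positions in which two signatures differ."""
    return bin(a ^ b).count("1")


# Synthetic 8x8 "image" plus two variants (all hypothetical data).
image = [[(3 * x * x + 7 * y + 5 * x * y) % 128 for x in range(8)]
         for y in range(8)]
rescaled = [[2 * p for p in row] for row in image]    # same picture, doubled intensity
negative = [[127 - p for p in row] for row in image]  # inverted picture

sig = dct_signature(image)
print(hamming_distance(sig, dct_signature(rescaled)))
print(hamming_distance(sig, dct_signature(negative)))
```

Two items would then be treated as near-duplicates when the distance falls below a chosen threshold; that threshold is a tuning parameter not given in the slides.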
18. Media Finder (named entities clustering)
19. Media Finder (zooming in a cluster)
20. Media Finder
Live Topic Generation from Event Streams
Meet us at WWW 2013 Demo Session
http://www.youtube.com/watch?v=8iRiwz7cDYY
21. Tracking an event: Italian Election
Repeated queries over a period of time
We have tracked and analyzed media posts tagged as
elezioni2013 from 2013-02-26 to 2013-03-03
Cron job: every 30 minutes over the 6 days
Slice the data into 24-hour slots
Research questions:
Can we re-create the news headlines?
Storyboarding:
http://mediafinder.eurecom.fr/story/elezioni2013
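The polling and slicing scheme above can be made concrete with a little arithmetic. This is a sketch under assumptions: timestamps are treated as naive datetimes in a single time zone, slots start at midnight on 2013-02-26, and the `created` field name and the sample microposts are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta

START = datetime(2013, 2, 26)  # first tracked day (from the slides)
END = datetime(2013, 3, 4)     # exclusive bound: the day after 2013-03-03
SLOT = timedelta(hours=24)
POLL_INTERVAL = timedelta(minutes=30)


def slot_index(timestamp):
    """0-based index of the 24-hour slot containing the timestamp."""
    return (timestamp - START) // SLOT


def slice_into_slots(microposts):
    """Group microposts by 24-hour slot."""
    slots = defaultdict(list)
    for post in microposts:
        slots[slot_index(post["created"])].append(post)
    return dict(slots)


# Number of cron runs over the whole tracking period.
runs = (END - START) // POLL_INTERVAL

# Hypothetical microposts, for illustration only.
posts = [
    {"id": "a", "created": datetime(2013, 2, 26, 9, 30)},
    {"id": "b", "created": datetime(2013, 2, 26, 23, 59)},
    {"id": "c", "created": datetime(2013, 3, 3, 12, 0)},
]
slots = slice_into_slots(posts)
print(runs, sorted(slots))
```

With these assumptions, six days of polling every 30 minutes amounts to 6 x 48 = 288 queries, and the data falls into six daily slots indexed 0 to 5.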
22. Tracking an event: Italian Election
Dataset:
~16501 microposts containing (duplicate) media items
~21087 Named Entities extracted
Clustering
NER and LDA
Generate Bag of Entities (BOE) disambiguated with a
DBpedia URI
Examples:
Monti, Bersani, Italia, Berlusconi, Grillo, Stelle
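A Bag of Entities can be pictured as a multiset of DBpedia URIs per micropost. The sketch below is illustrative only: the annotations are hypothetical, the DBpedia resource names are assumed rather than taken from the dataset, and Jaccard overlap is a stand-in similarity, since the slides do not say how the bags are compared during clustering.

```python
from collections import Counter

DBPEDIA = "http://dbpedia.org/resource/"

# Hypothetical annotations: micropost id -> disambiguated entity URIs.
annotations = {
    "post-1": [DBPEDIA + "Mario_Monti", DBPEDIA + "Pier_Luigi_Bersani"],
    "post-2": [DBPEDIA + "Pier_Luigi_Bersani", DBPEDIA + "Silvio_Berlusconi"],
    "post-3": [DBPEDIA + "Beppe_Grillo"],
}


def bag_of_entities(uris):
    """Multiset of entity URIs found in one micropost."""
    return Counter(uris)


def jaccard(bag_a, bag_b):
    """Overlap of the distinct entities in two bags (0.0 when both are empty)."""
    a, b = set(bag_a), set(bag_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


bags = {pid: bag_of_entities(uris) for pid, uris in annotations.items()}
corpus_bag = sum(bags.values(), Counter())  # entity frequencies over all posts

print(corpus_bag.most_common(1))
print(jaccard(bags["post-1"], bags["post-2"]))
print(jaccard(bags["post-1"], bags["post-3"]))
```

Keying the bags on URIs rather than surface strings means that different spellings of the same person count as one entity once they have been disambiguated.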
23. Tracking an event: Italian Election
Tracking and Analyzing The 2013 Italian Election
To appear at ESWC 2013 Demo Session
http://www.youtube.com/watch?v=jIMdnwMoWnk
24. Take Home Message
Media Server / Media Finder:
Aggregating fresh social media items
Making sense of media collections for video hyper-linking
NERD platform for extracting key information
Vision: adoption of semantic multimedia
technologies will foster a European market for
media fragment re-purposing and re-selling
Sneak preview:
Interact with a Kinect and discover enriched hypervideo
http://www.youtube.com/watch?v=4mSC685AG7k
25. Credits
Vuk Milicic … interaction designer
Giuseppe Rizzo … NERD guru
José Luis Redondo Garcia … triplification and
clustering
Thomas Steiner … Media Server original code