MediaFinder: Collect, Enrich
and Visualize Media Memes
Shared by the Crowd
Raphaël Troncy
raphael.troncy@eurecom.fr / @rtroncy
Conferences and natural disaster
14/05/2013 Real-Time Analysis and Mining of Social Streams (RAMSS) - Rio de Janeiro - 2
Social Media: some definitions
- Media Item: a photo or a video that is shared on a social network
- Micropost: a text status message that can optionally accompany a media item
- Social Network: an online service that focuses on building and reflecting social relationships among people sharing interests or activities
- Media Sharing Platform: emphasis on sharing media, but with blurred boundaries with social networks since users are encouraged to react to media content (like, comment, favorite, etc.)
Social networks and media items
- First-order support:
  - Posting requires the inclusion of a media item
  - Examples: Flickr, YouTube
- Second-order support:
  - Media items can be posted, but so can text-only messages
  - Example: Facebook
- Third-order support:
  - No direct support for media items; relies on third-party applications to host them
  - Example: Twitter before the introduction of native photo support
Media Server
- Composition of media item extractors (12 social networks)
- Relies on search APIs plus a fixed 30 s timeout window to provide results
- Falls back on screen scraping when necessary (Twitter ecosystem)
- Implemented as a NodeJS server
- Serializes results in a common schema (JSON)
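The composition of extractors under a fixed timeout window can be sketched as follows. This is only an illustration in Python, not the Media Server's NodeJS code: `fake_extractor`, `search_all` and the field names `network`, `mediaUrl` and `micropost` are hypothetical, and the extractors simulate latency instead of calling real search APIs.

```python
import asyncio
import json


async def fake_extractor(network, delay, raw_items):
    """Stand-in for one network-specific extractor (hypothetical)."""
    await asyncio.sleep(delay)  # simulates the latency of a search API call
    # Normalize every hit to the same illustrative schema.
    return [
        {"network": network, "mediaUrl": url, "micropost": text}
        for url, text in raw_items
    ]


async def search_all(extractors, timeout):
    """Run all extractors concurrently and keep whatever finished in time."""
    tasks = {name: asyncio.create_task(coro) for name, coro in extractors.items()}
    done, pending = await asyncio.wait(list(tasks.values()), timeout=timeout)
    for task in pending:
        task.cancel()  # extractors that miss the window contribute nothing
    if pending:
        await asyncio.gather(*pending, return_exceptions=True)
    results = {}
    for name, task in tasks.items():
        ok = task in done and task.exception() is None
        results[name] = task.result() if ok else []
    # One JSON document, keyed by network, in a common schema.
    return json.dumps(results, sort_keys=True)
```

With the settings from the slide, `timeout` would be 30 (seconds) and `extractors` would hold one entry per supported social network.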
[Figure: annotated screenshot. Callouts: deep link; permalink; clean text for NLP processing; aggregate view of ALL social interactions; 12 social networks]
Media Finder (www2013)
Media Finder (zooming on media items)
Media Finder (timeline view)
Named Entities are Pivotal
- Standalone software
  - GATE
  - Stanford CoreNLP
  - Temis
- Web APIs
  - http://nerd.eurecom.fr/
What is NERD?
- Ontology [1]
- REST API [2]
- UI [3]
[1] http://nerd.eurecom.fr/ontology
[2] http://nerd.eurecom.fr/api/application.wadl
[3] http://nerd.eurecom.fr
The NERD ontology has been integrated in the NIF project, an EU FP7 effort in the context of LOD2: Creating Knowledge out of Interlinked Data
NERD REST API
- Methods: GET, POST, PUT, DELETE
- Resources: /document, /user, /annotation/{extractor}, /extraction, /evaluation, ...
- Output: JSON/RDF*

    "entities": [{
        "entity": "Tim Berners-Lee",
        "type": "Person",
        "uri": "http://dbpedia.org/resource/Tim_berners_lee",
        "nerdType": "http://nerd.eurecom.fr/ontology#Person",
        "startChar": 30,
        "endChar": 45,
        "confidence": 1,
        "relevance": 0.5
    }]
Rizzo G., Troncy R. (2012), NERD: A Framework for Unifying Named Entity Recognition and Disambiguation Web Extraction
Tools. In: European chapter of the Association for Computational Linguistics (EACL'12), Avignon, France.
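A client consuming this output could look roughly like the sketch below. It assumes the `"entities"` array shown on the slide is wrapped in a top-level JSON object; the `Entity` class, `parse_entities` and the `min_confidence` parameter are hypothetical helpers, not part of the NERD API.

```python
import json
from dataclasses import dataclass


@dataclass
class Entity:
    label: str
    nerd_type: str
    uri: str
    start: int
    end: int
    confidence: float


def parse_entities(payload, min_confidence=0.0):
    """Turn a NERD-style JSON payload into Entity records."""
    data = json.loads(payload)
    entities = []
    for item in data.get("entities", []):
        if item.get("confidence", 0.0) < min_confidence:
            continue  # drop low-confidence annotations
        entities.append(Entity(
            label=item["entity"],
            # Keep only the local name of the NERD ontology class.
            nerd_type=item["nerdType"].rsplit("#", 1)[-1],
            uri=item["uri"],
            start=item["startChar"],
            end=item["endChar"],
            confidence=item["confidence"],
        ))
    return entities


# The slide's sample, wrapped in an object (an assumption).
SAMPLE = """{"entities": [{
    "entity": "Tim Berners-Lee",
    "type": "Person",
    "uri": "http://dbpedia.org/resource/Tim_berners_lee",
    "nerdType": "http://nerd.eurecom.fr/ontology#Person",
    "startChar": 30,
    "endChar": 45,
    "confidence": 1,
    "relevance": 0.5
}]}"""
```

In the sample, `endChar - startChar` equals the length of the entity label, so the offsets can be used to highlight the mention in the clean micropost text.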
Media Finder Architecture
- Media item harvesting using the Media Server
  - http://eventmedia.eurecom.fr/media-server/search/{combined}/{term}
  - https://github.com/vuknje/media-server (@tomayac fork)
- Image near-de-duplication
  - DCT signature on images and video frames, Hamming distance between image pairs
- Clustering and disambiguation
  - Named Entity Extraction using NERD
  - Topic Generation using LDA
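The near-de-duplication step can be illustrated with a perceptual-hash style sketch. The slides do not specify the exact signature, so this assumes a common variant: take the low-frequency 2-D DCT coefficients of a small grayscale thumbnail, threshold them at their median, and compare the resulting bit strings by Hamming distance. The function names, the 4x4 coefficient block and the `max_distance` threshold are illustrative choices.

```python
import math


def dct_2d(block):
    """Naive 2-D DCT-II of a square matrix given as a list of rows."""
    n = len(block)
    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            total = 0.0
            for x in range(n):
                for y in range(n):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            out[u][v] = total
    return out


def dct_signature(gray, keep=4):
    """Bit signature from the keep x keep low-frequency coefficients."""
    coeffs = dct_2d(gray)
    # Skip the DC term so that global brightness does not affect the bits.
    low = [coeffs[u][v] for u in range(keep) for v in range(keep)][1:]
    median = sorted(low)[len(low) // 2]
    return [1 if c > median else 0 for c in low]


def hamming(sig_a, sig_b):
    """Number of positions at which two signatures differ."""
    return sum(a != b for a, b in zip(sig_a, sig_b))


def near_duplicates(gray_a, gray_b, max_distance=2):
    """Treat two thumbnails as near-duplicates if their signatures are close."""
    return hamming(dct_signature(gray_a), dct_signature(gray_b)) <= max_distance
```

For video, the same signature would be computed on selected frames; in practice the images are first downscaled to a small fixed size (for example 8x8 or 32x32) so that the signatures are comparable.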
Media Finder (named entities clustering)
Media Finder (zooming in a cluster)
Media Finder
- Live Topic Generation from Event Streams
  - Meet us at WWW 2013 Demo Session
  - http://www.youtube.com/watch?v=8iRiwz7cDYY
Tracking an event: Italian Election
- Repeated queries over a period of time
  - We tracked and analyzed media posts tagged as elezioni2013 from 2013-02-26 to 2013-03-03
  - Cron job: every 30 minutes over the 6 days
  - Data sliced into 24-hour slots
- Research question:
  - Can we re-create the news headlines?
- Storyboarding:
  - http://mediafinder.eurecom.fr/story/elezioni2013
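Slicing the harvested posts into 24-hour slots could be done along these lines. This is a minimal sketch: `slice_into_slots` and the `(timestamp, payload)` item shape are assumptions made for illustration, and the start of the first slot is taken to be midnight on the first tracking day.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def slice_into_slots(items, start, slot=timedelta(hours=24)):
    """Group (timestamp, payload) pairs by the index of their time slot."""
    slots = defaultdict(list)
    for timestamp, payload in items:
        if timestamp < start:
            continue  # ignore anything published before tracking began
        index = (timestamp - start) // slot  # 0 for the first 24 hours, etc.
        slots[index].append(payload)
    return dict(slots)


# Tracking window from the slide; midnight start is an assumption.
TRACKING_START = datetime(2013, 2, 26)
```

Each slot can then be clustered on its own, which is what makes a per-day storyboard possible.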
Tracking an event: Italian Election
- Dataset:
  - ~16501 microposts containing (duplicate) media items
  - ~21087 Named Entities extracted
- Clustering
  - NER and LDA
  - Generate a Bag of Entities (BOE), each disambiguated with a DBpedia URI
- Examples:
  - Monti, Bersani, Italia, Berlusconi, Grillo, Stelle
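One simple way to picture the Bag of Entities is sketched below. It is not the clustering used in Media Finder (which combines NER and LDA); it only counts how many microposts mention each DBpedia URI and assigns every micropost to its most frequent entity. The function names and the input shape (post id mapped to a list of URIs) are hypothetical.

```python
from collections import Counter, defaultdict


def bag_of_entities(posts):
    """Count, for each entity URI, how many microposts mention it."""
    counts = Counter()
    for uris in posts.values():
        counts.update(set(uris))  # count each entity once per micropost
    return counts


def cluster_by_top_entity(posts):
    """Assign each micropost to its globally most frequent entity."""
    counts = bag_of_entities(posts)
    clusters = defaultdict(list)
    for post_id, uris in posts.items():
        if not uris:
            continue  # microposts without entities stay unclustered
        # Ties are broken by URI so that the result is deterministic.
        top = max(set(uris), key=lambda uri: (counts[uri], uri))
        clusters[top].append(post_id)
    return dict(clusters)
```

Because entities are identified by URI rather than by surface form, different spellings of the same name end up in the same cluster once they are disambiguated.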
Tracking an event: Italian Election
- Tracking and Analyzing The 2013 Italian Election
  - To appear at ESWC 2013 Demo Session
  - http://www.youtube.com/watch?v=jIMdnwMoWnk
Take Home Message
- Media Server / Media Finder:
  - Aggregating fresh social media items
  - Making sense of media collections for video hyper-linking
- NERD platform for extracting key information
- Vision: adoption of semantic multimedia technologies will foster a European market for media fragment re-purposing and re-selling
- Sneak preview:
  - Interact with a Kinect and discover enriched hypervideo
  - http://www.youtube.com/watch?v=4mSC685AG7k
Credits
- Vuk Milicic … interaction designer
- Giuseppe Rizzo … NERD guru
- José Luis Redondo Garcia … triplification and clustering
- Thomas Steiner … Media Server original code
http://www.slideshare.net/troncy
14/05/2013 Real-Time Analysis and Mining of Social Streams (RAMSS) - Rio de Janeiro - 26

Contenu connexe

Similaire à MediaFinder: Collect, Enrich and Visualize Media Memes Shared by the Crowd

Weather events identification in social media streams: tools to detect their ...
Weather events identification in social media streams: tools to detect their ...Weather events identification in social media streams: tools to detect their ...
Weather events identification in social media streams: tools to detect their ...Alfonso Crisci
 
Fire incident data visualization
Fire incident data visualizationFire incident data visualization
Fire incident data visualizationLaurie Reynolds
 
Use of Open Data in Hong Kong (LegCo 2014)
Use of Open Data in Hong Kong (LegCo 2014)Use of Open Data in Hong Kong (LegCo 2014)
Use of Open Data in Hong Kong (LegCo 2014)Sammy Fung
 
SocialSensor Project: Sensing User Generated Input for Improved Media Discove...
SocialSensor Project: Sensing User Generated Input for Improved Media Discove...SocialSensor Project: Sensing User Generated Input for Improved Media Discove...
SocialSensor Project: Sensing User Generated Input for Improved Media Discove...Yiannis Kompatsiaris
 
Fairhair.ai – alan turing institute june '17 (public)
Fairhair.ai – alan turing institute june '17 (public)Fairhair.ai – alan turing institute june '17 (public)
Fairhair.ai – alan turing institute june '17 (public)Giorgio Orsi
 
Use of Open Data in Hong Kong
Use of Open Data in Hong KongUse of Open Data in Hong Kong
Use of Open Data in Hong KongSammy Fung
 
Sti community – an approach for a semantically enhanced company platform 20...
Sti community – an approach for a semantically enhanced company platform   20...Sti community – an approach for a semantically enhanced company platform   20...
Sti community – an approach for a semantically enhanced company platform 20...STIinnsbruck
 
GNU R in Clinical Research and Evidence-Based Medicine
GNU R in Clinical Research and Evidence-Based MedicineGNU R in Clinical Research and Evidence-Based Medicine
GNU R in Clinical Research and Evidence-Based MedicineAdrian Olszewski
 
Approaches of Data Analysis: Networks generated through Social Media
Approaches of Data Analysis: Networks generated through Social MediaApproaches of Data Analysis: Networks generated through Social Media
Approaches of Data Analysis: Networks generated through Social MediaJanna Joceli Omena
 
Live Social Semantics @ ISWC2009
Live Social Semantics @ ISWC2009Live Social Semantics @ ISWC2009
Live Social Semantics @ ISWC2009Martin Szomszor
 
Fusing text and image for event
Fusing text and image for eventFusing text and image for event
Fusing text and image for eventijma
 
WG2 soa&plan 201401
WG2 soa&plan 201401WG2 soa&plan 201401
WG2 soa&plan 201401tu1204
 
Semantic Web & Information Brokering: Opportunities, Commercialization and Ch...
Semantic Web & Information Brokering: Opportunities, Commercialization and Ch...Semantic Web & Information Brokering: Opportunities, Commercialization and Ch...
Semantic Web & Information Brokering: Opportunities, Commercialization and Ch...Amit Sheth
 
How Open Data can help entrepreneurs - ITFest 2014 E2
How Open Data can help entrepreneurs - ITFest 2014 E2How Open Data can help entrepreneurs - ITFest 2014 E2
How Open Data can help entrepreneurs - ITFest 2014 E2Sammy Fung
 
Press Kit -LiMoSINe Project
Press Kit -LiMoSINe ProjectPress Kit -LiMoSINe Project
Press Kit -LiMoSINe ProjectLiMoSINe Project
 
Weaving the Web of People and Things for Intelligent Cities
Weaving the Web of People and Things for Intelligent CitiesWeaving the Web of People and Things for Intelligent Cities
Weaving the Web of People and Things for Intelligent CitiesAlessandro Bozzon
 

Similaire à MediaFinder: Collect, Enrich and Visualize Media Memes Shared by the Crowd (20)

Weather events identification in social media streams: tools to detect their ...
Weather events identification in social media streams: tools to detect their ...Weather events identification in social media streams: tools to detect their ...
Weather events identification in social media streams: tools to detect their ...
 
Fire incident data visualization
Fire incident data visualizationFire incident data visualization
Fire incident data visualization
 
Use of Open Data in Hong Kong (LegCo 2014)
Use of Open Data in Hong Kong (LegCo 2014)Use of Open Data in Hong Kong (LegCo 2014)
Use of Open Data in Hong Kong (LegCo 2014)
 
SocialSensor Project: Sensing User Generated Input for Improved Media Discove...
SocialSensor Project: Sensing User Generated Input for Improved Media Discove...SocialSensor Project: Sensing User Generated Input for Improved Media Discove...
SocialSensor Project: Sensing User Generated Input for Improved Media Discove...
 
Fairhair.ai – alan turing institute june '17 (public)
Fairhair.ai – alan turing institute june '17 (public)Fairhair.ai – alan turing institute june '17 (public)
Fairhair.ai – alan turing institute june '17 (public)
 
Use of Open Data in Hong Kong
Use of Open Data in Hong KongUse of Open Data in Hong Kong
Use of Open Data in Hong Kong
 
Sti community – an approach for a semantically enhanced company platform 20...
Sti community – an approach for a semantically enhanced company platform   20...Sti community – an approach for a semantically enhanced company platform   20...
Sti community – an approach for a semantically enhanced company platform 20...
 
GNU R in Clinical Research and Evidence-Based Medicine
GNU R in Clinical Research and Evidence-Based MedicineGNU R in Clinical Research and Evidence-Based Medicine
GNU R in Clinical Research and Evidence-Based Medicine
 
Approaches of Data Analysis: Networks generated through Social Media
Approaches of Data Analysis: Networks generated through Social MediaApproaches of Data Analysis: Networks generated through Social Media
Approaches of Data Analysis: Networks generated through Social Media
 
Citymatter: UX / UI Design
Citymatter: UX / UI DesignCitymatter: UX / UI Design
Citymatter: UX / UI Design
 
Diata 2012 ARCOMEM
Diata 2012 ARCOMEMDiata 2012 ARCOMEM
Diata 2012 ARCOMEM
 
Live Social Semantics @ ISWC2009
Live Social Semantics @ ISWC2009Live Social Semantics @ ISWC2009
Live Social Semantics @ ISWC2009
 
Fusing text and image for event
Fusing text and image for eventFusing text and image for event
Fusing text and image for event
 
WG2 soa&plan 201401
WG2 soa&plan 201401WG2 soa&plan 201401
WG2 soa&plan 201401
 
Semantic Web & Information Brokering: Opportunities, Commercialization and Ch...
Semantic Web & Information Brokering: Opportunities, Commercialization and Ch...Semantic Web & Information Brokering: Opportunities, Commercialization and Ch...
Semantic Web & Information Brokering: Opportunities, Commercialization and Ch...
 
How Open Data can help entrepreneurs - ITFest 2014 E2
How Open Data can help entrepreneurs - ITFest 2014 E2How Open Data can help entrepreneurs - ITFest 2014 E2
How Open Data can help entrepreneurs - ITFest 2014 E2
 
Press Kit -LiMoSINe Project
Press Kit -LiMoSINe ProjectPress Kit -LiMoSINe Project
Press Kit -LiMoSINe Project
 
Semantic multimedia remixing
Semantic multimedia remixingSemantic multimedia remixing
Semantic multimedia remixing
 
SMSociety16_paper_209-FINAL VERSION
SMSociety16_paper_209-FINAL VERSIONSMSociety16_paper_209-FINAL VERSION
SMSociety16_paper_209-FINAL VERSION
 
Weaving the Web of People and Things for Intelligent Cities
Weaving the Web of People and Things for Intelligent CitiesWeaving the Web of People and Things for Intelligent Cities
Weaving the Web of People and Things for Intelligent Cities
 

Plus de Raphael Troncy

K CAP 2019 Opening Ceremony
K CAP 2019 Opening CeremonyK CAP 2019 Opening Ceremony
K CAP 2019 Opening CeremonyRaphael Troncy
 
Semantic Technologies for Connected Vehicles in a Web of Things Environment
Semantic Technologies for Connected Vehicles in a Web of Things EnvironmentSemantic Technologies for Connected Vehicles in a Web of Things Environment
Semantic Technologies for Connected Vehicles in a Web of Things EnvironmentRaphael Troncy
 
HyperTED: exploring video lectures at the fragment levels for enhancing learning
HyperTED: exploring video lectures at the fragment levels for enhancing learningHyperTED: exploring video lectures at the fragment levels for enhancing learning
HyperTED: exploring video lectures at the fragment levels for enhancing learningRaphael Troncy
 
Location Embeddings for Next Trip Recommendation
Location Embeddings for Next Trip RecommendationLocation Embeddings for Next Trip Recommendation
Location Embeddings for Next Trip RecommendationRaphael Troncy
 
A replication study of the top performing systems in SemEval twitter sentimen...
A replication study of the top performing systems in SemEval twitter sentimen...A replication study of the top performing systems in SemEval twitter sentimen...
A replication study of the top performing systems in SemEval twitter sentimen...Raphael Troncy
 
Modeling Geometry and Reference Systems on the Web of Data - LGD 2014
Modeling Geometry and Reference Systems on the Web of Data - LGD 2014Modeling Geometry and Reference Systems on the Web of Data - LGD 2014
Modeling Geometry and Reference Systems on the Web of Data - LGD 2014Raphael Troncy
 
NERD: an open source platform for extracting and disambiguating named entitie...
NERD: an open source platform for extracting and disambiguating named entitie...NERD: an open source platform for extracting and disambiguating named entitie...
NERD: an open source platform for extracting and disambiguating named entitie...Raphael Troncy
 
Deep-linking into Media Assets at the Fragment Level SMAM 2013
Deep-linking into Media Assets at the Fragment Level SMAM 2013Deep-linking into Media Assets at the Fragment Level SMAM 2013
Deep-linking into Media Assets at the Fragment Level SMAM 2013Raphael Troncy
 
Describing Media Assets: Media Fragment Specification and Description
Describing Media Assets: Media Fragment Specification and DescriptionDescribing Media Assets: Media Fragment Specification and Description
Describing Media Assets: Media Fragment Specification and DescriptionRaphael Troncy
 
Semantics at the multimedia fragment level SSSW 2013
Semantics at the multimedia fragment level SSSW 2013Semantics at the multimedia fragment level SSSW 2013
Semantics at the multimedia fragment level SSSW 2013Raphael Troncy
 
Semantic structuring and linking of event-centric data in the social web
Semantic structuring and linking of event-centric data in the social webSemantic structuring and linking of event-centric data in the social web
Semantic structuring and linking of event-centric data in the social webRaphael Troncy
 
Semantics at the multimedia fragment level or how enabling the remixing of on...
Semantics at the multimedia fragment level or how enabling the remixing of on...Semantics at the multimedia fragment level or how enabling the remixing of on...
Semantics at the multimedia fragment level or how enabling the remixing of on...Raphael Troncy
 
MediaEval 2011 SED Opening
MediaEval 2011 SED OpeningMediaEval 2011 SED Opening
MediaEval 2011 SED OpeningRaphael Troncy
 
ShareIt: Mining SocialMedia Activities for Detecting Events
ShareIt: Mining SocialMedia Activities for Detecting EventsShareIt: Mining SocialMedia Activities for Detecting Events
ShareIt: Mining SocialMedia Activities for Detecting EventsRaphael Troncy
 
Finding media illustrating events
Finding media illustrating eventsFinding media illustrating events
Finding media illustrating eventsRaphael Troncy
 
Experiencing Events through User-Generated Media
Experiencing Events through User-Generated MediaExperiencing Events through User-Generated Media
Experiencing Events through User-Generated MediaRaphael Troncy
 
Linking Events with Media
Linking Events with MediaLinking Events with Media
Linking Events with MediaRaphael Troncy
 
Multimedia Semantics - SSMS 2010
Multimedia Semantics - SSMS 2010Multimedia Semantics - SSMS 2010
Multimedia Semantics - SSMS 2010Raphael Troncy
 
LODE: Une Ontologie pour representer des evenements dans le Web de Donnees
LODE: Une Ontologie pour representer des evenements dans le Web de DonneesLODE: Une Ontologie pour representer des evenements dans le Web de Donnees
LODE: Une Ontologie pour representer des evenements dans le Web de DonneesRaphael Troncy
 
Provenance for Multimedia
Provenance for MultimediaProvenance for Multimedia
Provenance for MultimediaRaphael Troncy
 

Plus de Raphael Troncy (20)

K CAP 2019 Opening Ceremony
K CAP 2019 Opening CeremonyK CAP 2019 Opening Ceremony
K CAP 2019 Opening Ceremony
 
Semantic Technologies for Connected Vehicles in a Web of Things Environment
Semantic Technologies for Connected Vehicles in a Web of Things EnvironmentSemantic Technologies for Connected Vehicles in a Web of Things Environment
Semantic Technologies for Connected Vehicles in a Web of Things Environment
 
HyperTED: exploring video lectures at the fragment levels for enhancing learning
HyperTED: exploring video lectures at the fragment levels for enhancing learningHyperTED: exploring video lectures at the fragment levels for enhancing learning
HyperTED: exploring video lectures at the fragment levels for enhancing learning
 
Location Embeddings for Next Trip Recommendation
Location Embeddings for Next Trip RecommendationLocation Embeddings for Next Trip Recommendation
Location Embeddings for Next Trip Recommendation
 
A replication study of the top performing systems in SemEval twitter sentimen...
A replication study of the top performing systems in SemEval twitter sentimen...A replication study of the top performing systems in SemEval twitter sentimen...
A replication study of the top performing systems in SemEval twitter sentimen...
 
Modeling Geometry and Reference Systems on the Web of Data - LGD 2014
Modeling Geometry and Reference Systems on the Web of Data - LGD 2014Modeling Geometry and Reference Systems on the Web of Data - LGD 2014
Modeling Geometry and Reference Systems on the Web of Data - LGD 2014
 
NERD: an open source platform for extracting and disambiguating named entitie...
NERD: an open source platform for extracting and disambiguating named entitie...NERD: an open source platform for extracting and disambiguating named entitie...
NERD: an open source platform for extracting and disambiguating named entitie...
 
Deep-linking into Media Assets at the Fragment Level SMAM 2013
Deep-linking into Media Assets at the Fragment Level SMAM 2013Deep-linking into Media Assets at the Fragment Level SMAM 2013
Deep-linking into Media Assets at the Fragment Level SMAM 2013
 
Describing Media Assets: Media Fragment Specification and Description
Describing Media Assets: Media Fragment Specification and DescriptionDescribing Media Assets: Media Fragment Specification and Description
Describing Media Assets: Media Fragment Specification and Description
 
Semantics at the multimedia fragment level SSSW 2013
Semantics at the multimedia fragment level SSSW 2013Semantics at the multimedia fragment level SSSW 2013
Semantics at the multimedia fragment level SSSW 2013
 
Semantic structuring and linking of event-centric data in the social web
Semantic structuring and linking of event-centric data in the social webSemantic structuring and linking of event-centric data in the social web
Semantic structuring and linking of event-centric data in the social web
 
Semantics at the multimedia fragment level or how enabling the remixing of on...
Semantics at the multimedia fragment level or how enabling the remixing of on...Semantics at the multimedia fragment level or how enabling the remixing of on...
Semantics at the multimedia fragment level or how enabling the remixing of on...
 
MediaEval 2011 SED Opening
MediaEval 2011 SED OpeningMediaEval 2011 SED Opening
MediaEval 2011 SED Opening
 
ShareIt: Mining SocialMedia Activities for Detecting Events
ShareIt: Mining SocialMedia Activities for Detecting EventsShareIt: Mining SocialMedia Activities for Detecting Events
ShareIt: Mining SocialMedia Activities for Detecting Events
 
Finding media illustrating events
Finding media illustrating eventsFinding media illustrating events
Finding media illustrating events
 
Experiencing Events through User-Generated Media
Experiencing Events through User-Generated MediaExperiencing Events through User-Generated Media
Experiencing Events through User-Generated Media
 
Linking Events with Media
Linking Events with MediaLinking Events with Media
Linking Events with Media
 
Multimedia Semantics - SSMS 2010
Multimedia Semantics - SSMS 2010Multimedia Semantics - SSMS 2010
Multimedia Semantics - SSMS 2010
 
LODE: Une Ontologie pour representer des evenements dans le Web de Donnees
LODE: Une Ontologie pour representer des evenements dans le Web de DonneesLODE: Une Ontologie pour representer des evenements dans le Web de Donnees
LODE: Une Ontologie pour representer des evenements dans le Web de Donnees
 
Provenance for Multimedia
Provenance for MultimediaProvenance for Multimedia
Provenance for Multimedia
 

Dernier

"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfSeasiaInfotech2
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024The Digital Insurer
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embeddingZilliz
 

Dernier (20)

"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 

MediaFinder: Collect, Enrich and Visualize Media Memes Shared by the Crowd

  • 1. MediaFinder: Collect, Enrich and Visualize Media Memes Shared by the Crowd Raphaël Troncy raphael.troncy@eurecom.fr / @rtroncy
  • 2. Conferences and natural disaster 14/05/2013 Real-Time Analysis and Mining of Social Streams (RAMSS) - Rio de Janeiro - 2
  • 7. Social Media: some definitions
      ◦ Media Item: a photo or a video that is shared on a social network
      ◦ Micropost: a text status message that can optionally accompany a media item
      ◦ Social Network: an online service that focuses on building and reflecting social relationships among people sharing interests or activities
      ◦ Media Sharing Platforms: emphasis on sharing media, but with blurred boundaries with social networks, since users are encouraged to react to media content (like, comment, favorite, etc.)
  • 8. Social networks and media items
      ◦ First-order support:
          - Posting requires the inclusion of a media item
          - Examples: Flickr, YouTube
      ◦ Second-order support:
          - Media items can be posted, but so can text-only messages
          - Example: Facebook
      ◦ Third-order support:
          - No direct support for media items; relies on third-party applications to host them
          - Example: Twitter before the introduction of native photo support
  • 9. Media Server
      ◦ Composition of media item extractors (12 SNs)
      ◦ Relies on search APIs plus a fixed 30 s timeout window to provide results
      ◦ Falls back on screen scraping when necessary (Twitter ecosystem)
      ◦ Implemented as a NodeJS server
      ◦ Serializes results in a common schema (JSON)
  • 10. Labels on the slide: deep link; permalink; clean text for NLP processing; aggregate view of ALL social interactions; 12 social networks
  • 11. Media Finder (www2013)
  • 12. Media Finder (zooming on media items)
  • 13. Media Finder (timeline view)
  • 14. Named Entities are Pivotal
      ◦ Standalone software: GATE, Stanford CoreNLP, Temis
      ◦ Web APIs: http://nerd.eurecom.fr/
  • 15. What is NERD?
      ◦ A REST API (2), an ontology (1) and a UI (3)
          1. http://nerd.eurecom.fr/ontology
          2. http://nerd.eurecom.fr/api/application.wadl
          3. http://nerd.eurecom.fr
      ◦ The NERD ontology has been integrated in the NIF project, an EU FP7 project in the context of LOD2: Creating Knowledge out of Interlinked Data
  • 16. NERD REST API
      ◦ Methods: GET, POST, PUT, DELETE
      ◦ Resources: /document, /user, /annotation/{extractor}, /extraction, /evaluation, ...
      ◦ JSON/RDF*
      ◦ Example:
          "entities": [{
            "entity": "Tim Berners-Lee",
            "type": "Person",
            "uri": "http://dbpedia.org/resource/Tim_berners_lee",
            "nerdType": "http://nerd.eurecom.fr/ontology#Person",
            "startChar": 30,
            "endChar": 45,
            "confidence": 1,
            "relevance": 0.5
          }]
      ◦ Rizzo G., Troncy R. (2012), NERD: A Framework for Unifying Named Entity Recognition and Disambiguation Web Extraction Tools. In: European chapter of the Association for Computational Linguistics (EACL'12), Avignon, France.
  • 17. Media Finder Architecture
      ◦ Media items harvesting using the Media Server
          - http://eventmedia.eurecom.fr/media-server/search/{combined}/{term}
          - https://github.com/vuknje/media-server (@tomayac fork)
      ◦ Image near de-duplication
          - DCT signature on image and video frame, Hamming distance between image pairs
      ◦ Clustering and disambiguation
          - Named Entity Extraction using NERD
          - Topic Generation using LDA
  • 18. Media Finder (named entities clustering)
  • 19. Media Finder (zooming in a cluster)
  • 20. Media Finder
      ◦ Live Topic Generation from Event Streams
          - Meet us at the WWW 2013 Demo Session
          - http://www.youtube.com/watch?v=8iRiwz7cDYY
  • 21. Tracking an event: Italian Election
      ◦ Repeated queries over a period of time
          - We tracked and analyzed media posts tagged as elezioni2013 from 2013-02-26 to 2013-03-03
          - Cron job: every 30 minutes over the 6 days
          - Data sliced into 24-hour slots
      ◦ Research question: can we re-create the news headlines?
      ◦ Storyboarding: http://mediafinder.eurecom.fr/story/elezioni2013
  • 22. Tracking an event: Italian Election
      ◦ Dataset:
          - ~16501 microposts containing (duplicate) media items
          - ~21087 Named Entities extracted
      ◦ Clustering:
          - NER and LDA
          - Generate a Bag of Entities (BOE), each entity disambiguated with a DBpedia URI
      ◦ Examples: Monti, Bersani, Italia, Berlusconi, Grillo, Stelle
  • 23. Tracking an event: Italian Election
      ◦ Tracking and Analyzing The 2013 Italian Election
          - To appear at the ESWC 2013 Demo Session
          - http://www.youtube.com/watch?v=jIMdnwMoWnk
  • 24. Take Home Message
      ◦ Media Server / Media Finder:
          - Aggregating fresh social media items
          - Making sense of media collection for video hyper-linking
      ◦ NERD platform for extracting key information
      ◦ Vision: adoption of semantic multimedia technologies will foster a European market for media fragment re-purposing and re-selling
      ◦ Sneak preview: interact with a Kinect and discover enriched hypervideo http://www.youtube.com/watch?v=4mSC685AG7k
  • 25. Credits
      ◦ Vuk Milicic: interaction designer
      ◦ Giuseppe Rizzo: NERD guru
      ◦ José Luis Redondo Garcia: triplification and clustering
      ◦ Thomas Steiner: Media Server original code
  • 26. http://www.slideshare.net/troncy
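
Slide 9 describes the Media Server as a set of per-network extractors queried through search APIs within a fixed 30 s window. The actual server is a NodeJS application whose code is not shown in the slides, so the following is only a Python sketch of that fan-out pattern. The `search_all` function, the extractor interface and the item fields (`network`, `media_url`, `micropost`) are all assumptions made for illustration.

```python
import asyncio


async def search_all(term, extractors, timeout_s=30.0):
    """Query all extractors concurrently and return what finished in time.

    `extractors` maps a network name to an async function that takes the
    search term and returns a list of media items (hypothetical interface).
    """
    tasks = {name: asyncio.create_task(fn(term)) for name, fn in extractors.items()}
    if not tasks:
        return {}
    done, pending = await asyncio.wait(list(tasks.values()), timeout=timeout_s)
    # Extractors that missed the window are cancelled instead of awaited further.
    for task in pending:
        task.cancel()
    await asyncio.gather(*pending, return_exceptions=True)

    results = {}
    for name, task in tasks.items():
        failed = task not in done or task.cancelled() or task.exception() is not None
        results[name] = [] if failed else task.result()
    return results


# Stand-in extractors (hypothetical): a real one would call a network's search API.
async def fast_extractor(term):
    await asyncio.sleep(0.01)
    return [{"network": "fast", "media_url": f"http://example.org/{term}.jpg", "micropost": term}]


async def slow_extractor(term):
    await asyncio.sleep(5)
    return [{"network": "slow", "media_url": f"http://example.org/{term}.png", "micropost": term}]


if __name__ == "__main__":
    found = asyncio.run(
        search_all("elezioni2013", {"fast": fast_extractor, "slow": slow_extractor}, timeout_s=0.2)
    )
    print(found)
```

A network that does not answer within the window contributes an empty list, so a slow or failing extractor never blocks the others.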
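
The entity record on slide 16 carries character offsets (`startChar`, `endChar`) alongside the DBpedia URI and the NERD type. As a small sketch of how a client might use them, the code below parses that record and slices the mention out of the source text. The sample sentence is made up so that the name starts at offset 30, the fragment is wrapped in braces to make it valid JSON, and treating `endChar` as exclusive is an assumption (it matches the 15-character name, but the slides do not state it).

```python
import json

# Response fragment from slide 16, wrapped in braces to form a JSON object.
RESPONSE = """
{"entities": [{
  "entity": "Tim Berners-Lee",
  "type": "Person",
  "uri": "http://dbpedia.org/resource/Tim_berners_lee",
  "nerdType": "http://nerd.eurecom.fr/ontology#Person",
  "startChar": 30,
  "endChar": 45,
  "confidence": 1,
  "relevance": 0.5
}]}
"""

# Made-up input text; the name starts at character offset 30.
TEXT = "The Web was first proposed by Tim Berners-Lee in 1989."


def mentions(text, response_json):
    """Yield (surface form, NERD type name, URI) for each entity in the response."""
    for entity in json.loads(response_json)["entities"]:
        # Assumption: startChar is inclusive and endChar is exclusive.
        surface = text[entity["startChar"]:entity["endChar"]]
        nerd_type = entity["nerdType"].rsplit("#", 1)[-1]
        yield surface, nerd_type, entity["uri"]


if __name__ == "__main__":
    for surface, nerd_type, uri in mentions(TEXT, RESPONSE):
        print(surface, nerd_type, uri)
```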
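
Slide 17 names the near de-duplication technique (a DCT signature per image or video frame, compared by Hamming distance) without giving details. The sketch below shows one common way to build such a signature, in the style of perceptual hashing: take the low-frequency block of an unnormalised 2-D DCT and threshold it at its median. The 8x8 block size, the median threshold and the `max_distance` cut-off are assumptions, not values from the slides, and the input is assumed to be a small square grayscale thumbnail given as a list of rows.

```python
import math
import random


def dct_2d(block):
    """Naive unnormalised 2-D DCT-II of a square matrix (fine for small thumbnails)."""
    n = len(block)
    cos = [[math.cos(math.pi * (2 * x + 1) * u / (2 * n)) for x in range(n)] for u in range(n)]
    # Transform each row, then each column of the intermediate result.
    rows = [[sum(block[y][x] * cos[u][x] for x in range(n)) for u in range(n)] for y in range(n)]
    return [[sum(rows[y][u] * cos[v][y] for y in range(n)) for u in range(n)] for v in range(n)]


def dct_signature(gray, keep=8):
    """Bit signature: low-frequency DCT coefficients thresholded at their median."""
    coeffs = dct_2d(gray)
    low = [coeffs[v][u] for v in range(keep) for u in range(keep)]
    ac = low[1:]  # skip the DC term, which only reflects average brightness
    median = sorted(ac)[len(ac) // 2]
    bits = 0
    for c in low:
        bits = (bits << 1) | (1 if c > median else 0)
    return bits


def hamming(a, b):
    """Number of bit positions in which two signatures differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(sig_a, sig_b, max_distance=8):
    """Assumed cut-off: signatures within `max_distance` bits count as near-duplicates."""
    return hamming(sig_a, sig_b) <= max_distance


if __name__ == "__main__":
    rng = random.Random(0)
    image = [[rng.randrange(256) for _ in range(32)] for _ in range(32)]
    brighter = [[p + 10 for p in row] for row in image]  # same picture, shifted brightness
    print(is_near_duplicate(dct_signature(image), dct_signature(brighter)))
```

A uniform brightness shift only changes the DC coefficient, which is why the signature ignores it when choosing the threshold.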
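
Slides 21 and 22 describe slicing the collected posts into 24-hour slots and building a Bag of Entities from DBpedia-disambiguated named entities. A minimal sketch of both steps follows. The post structure (`created`, `entity_uris`) and the sample posts are invented for illustration and are not taken from the elezioni2013 dataset.

```python
from collections import Counter, defaultdict
from datetime import datetime, timedelta


def slice_into_slots(posts, start, slot=timedelta(hours=24)):
    """Group posts by the index of the slot their timestamp falls into."""
    slots = defaultdict(list)
    for post in posts:
        index = (post["created"] - start) // slot  # timedelta // timedelta gives an int
        slots[index].append(post)
    return dict(slots)


def bag_of_entities(posts):
    """Count how often each disambiguated entity URI occurs in a group of posts."""
    return Counter(uri for post in posts for uri in post["entity_uris"])


# Invented sample posts (hypothetical structure and values).
MONTI = "http://dbpedia.org/resource/Mario_Monti"
GRILLO = "http://dbpedia.org/resource/Beppe_Grillo"
SAMPLE_POSTS = [
    {"created": datetime(2013, 2, 26, 10, 0), "entity_uris": [MONTI, GRILLO]},
    {"created": datetime(2013, 2, 26, 23, 0), "entity_uris": [GRILLO]},
    {"created": datetime(2013, 2, 27, 1, 0), "entity_uris": [MONTI]},
]

if __name__ == "__main__":
    by_day = slice_into_slots(SAMPLE_POSTS, start=datetime(2013, 2, 26))
    for index in sorted(by_day):
        print(index, bag_of_entities(by_day[index]).most_common())
```

Ranking the entities of each slot by count gives a rough per-day summary, which is one way to approach the "re-create the news headlines" question raised on slide 21.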