Live Topic Generation
from Event Streams
Vuk Milicic, José Luis Redondo Garcia,
Giuseppe Rizzo, Raphaël Troncy, Thomas Steiner
raphael.troncy@eurecom.fr / @rtroncy
Media Finder (www2013)
15/05/2013 22nd World Wide Web Conference (WWW) - Rio de Janeiro - 2
Media Finder (zooming on media items)
Media Finder (timeline view)
Media Server
• Composition of media item extractors (12 social networks)
• Relies on search APIs plus a fixed 30 s timeout window to provide results
• Falls back on screen scraping when necessary (Twitter ecosystem)
• Implemented as a Node.js server
• Serializes results in a common schema (JSON)
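The fan-out with a fixed timeout window can be sketched as follows. This is a minimal illustration in Python, not the Node.js Media Server itself: the extractor functions, the field names (`source`, `mediaUrl`, `cleanText`) and the example URLs are all hypothetical stand-ins, and the demo uses a 0.2 s window instead of 30 s so that it finishes quickly.

```python
import asyncio
import json

# Hypothetical extractors: each stands in for one social network's search API.
async def search_fast(term):
    await asyncio.sleep(0.01)
    return [{"url": f"https://example.org/fast/{term}/1", "text": f"photo about {term}"}]

async def search_slow(term):
    await asyncio.sleep(5)  # slower than the timeout used in the demo below
    return [{"url": f"https://example.org/slow/{term}/1", "text": "never returned"}]

async def search_all(term, extractors, timeout_s=30.0):
    """Query every extractor concurrently; keep whatever finished before the timeout."""
    tasks = {asyncio.create_task(fn(term)): name for name, fn in extractors.items()}
    done, pending = await asyncio.wait(list(tasks), timeout=timeout_s)
    for task in pending:
        task.cancel()  # give up on networks that did not answer in time
    results = {}
    for task in done:
        if task.exception() is None:
            name = tasks[task]
            # Normalise every item to one (assumed) common schema.
            results[name] = [
                {"source": name, "mediaUrl": item["url"], "cleanText": item["text"]}
                for item in task.result()
            ]
    return results

def run_demo():
    extractors = {"fast": search_fast, "slow": search_slow}
    return asyncio.run(search_all("www2013", extractors, timeout_s=0.2))

if __name__ == "__main__":
    print(json.dumps(run_demo(), indent=2))
```

Only the extractor that answers inside the window contributes results; the slow one is cancelled rather than awaited.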
Deep link
Permalink
Clean text for NLP processing
Aggregate view of ALL social interactions
12 Social Networks
Media Finder Architecture
• Media item harvesting using the Media Server
  http://eventmedia.eurecom.fr/media-server/search/{combined}/{term}
  https://github.com/vuknje/media-server (@tomayac fork)
• Image near-deduplication
  DCT signature on image and video frames, Hamming distance between image pairs
• Clustering and disambiguation
  Named entity extraction using NERD
  Topic generation using LDA
  Density-based clustering using OPTICS
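The near-deduplication step (DCT signature plus Hamming distance) can be sketched as below. The slides do not give the signature layout, so the choices here are assumptions in the style of perceptual hashing: an 8x8 grayscale thumbnail (downscaling assumed done elsewhere), the 4x4 low-frequency block without the DC term, a median threshold, and a hypothetical cut-off of 3 differing bits.

```python
import math

def dct_2d(block):
    """Naive 2-D DCT-II of a square matrix (fine for small thumbnails)."""
    n = len(block)
    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            total = 0.0
            for x in range(n):
                for y in range(n):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            out[u][v] = total
    return out

def dct_signature(gray, keep=4):
    """Bit signature: low-frequency coefficients thresholded at their median."""
    coeffs = dct_2d(gray)
    # Drop the DC term (index 0) so overall brightness does not affect the bits.
    low = [coeffs[u][v] for u in range(keep) for v in range(keep)][1:]
    median = sorted(low)[len(low) // 2]
    return [1 if c > median else 0 for c in low]

def hamming(a, b):
    """Number of positions at which two bit signatures differ."""
    return sum(x != y for x, y in zip(a, b))

def is_near_duplicate(sig_a, sig_b, max_distance=3):
    # max_distance is an assumed threshold, to be tuned on real data.
    return hamming(sig_a, sig_b) <= max_distance

def demo_images():
    """Synthetic 8x8 thumbnails: an original, a brighter copy, an inverted copy."""
    original = [[(37 * x + 91 * y + 13 * x * y) % 256 for y in range(8)] for x in range(8)]
    brighter = [[p + 10 for p in row] for row in original]
    inverted = [[255 - p for p in row] for row in original]
    return original, brighter, inverted

if __name__ == "__main__":
    original, brighter, inverted = demo_images()
    s0, s1, s2 = (dct_signature(img) for img in (original, brighter, inverted))
    print("brighter copy:", hamming(s0, s1), "inverted copy:", hamming(s0, s2))
```

A uniform brightness shift only changes the DC coefficient, so the brighter copy keeps the same signature, while the inverted image flips the sign of every other coefficient and lands far away.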
Named Entities are Pivotal
http://nerd.eurecom.fr/
REST API Ontology
Dashboard UI
Media Finder (named entities clustering)
Media Finder (zooming in a cluster)
Summary
• Pick an event identified by a hashtag
• Use the Media Server to get media items aggregated over multiple social networks
• Use NERD to get entities aggregated over multiple extractors
• Cluster and identify meaningful topics (i.e. entities), each with a meaningful label and often disambiguated with a DBpedia URI that gives access to more encyclopedic knowledge
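The last step can be illustrated with a deliberately simplified sketch. It does not use LDA or OPTICS: it simply groups media items that share an entity and labels each group with the entity label and, when available, its DBpedia URI. The item structure (`id`, `entities`, `label`, `uri`) and the sample data are hypothetical, and entity extraction is assumed to have happened upstream.

```python
from collections import defaultdict

def cluster_by_entity(items, min_size=2):
    """Group media items that mention the same entity; drop tiny clusters."""
    clusters = defaultdict(list)
    labels = {}
    for item in items:
        for entity in item["entities"]:
            # Prefer the disambiguation URI as the cluster key, else the lowercased label.
            key = entity.get("uri") or entity["label"].lower()
            labels.setdefault(key, entity["label"])  # first label seen names the cluster
            if item["id"] not in clusters[key]:
                clusters[key].append(item["id"])
    return [
        {"label": labels[key], "uri": key if key.startswith("http") else None, "items": ids}
        for key, ids in sorted(clusters.items(), key=lambda kv: -len(kv[1]))
        if len(ids) >= min_size
    ]

def sample_items():
    """Hypothetical media items with already-extracted entities."""
    rio = "http://dbpedia.org/resource/Rio_de_Janeiro"
    return [
        {"id": "m1", "entities": [{"label": "Rio de Janeiro", "uri": rio}]},
        {"id": "m2", "entities": [{"label": "Rio", "uri": rio}, {"label": "demo session"}]},
        {"id": "m3", "entities": [{"label": "Demo Session"}]},
        {"id": "m4", "entities": [{"label": "keynote"}]},
    ]

if __name__ == "__main__":
    for cluster in cluster_by_entity(sample_items()):
        print(cluster["label"], cluster["uri"], cluster["items"])
```

Keying on the URI merges different surface forms ("Rio", "Rio de Janeiro") into one topic, which is the benefit of disambiguation that the summary points to.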
Live Topic Generation from Event Streams
• Meet us at the WWW 2013 Demo Session, Booth 14
http://www.youtube.com/watch?v=8iRiwz7cDYY
http://www.slideshare.net/troncy
