SlideShare a Scribd company logo
1 of 18
Download to read offline
Interactive Visualization of a News Clips Network:
A Journalistic Research and Knowledge Discovery Tool


                                     ´
                    Jos´ Devezas and Alvaro Figueira
                       e

              CRACS/INESC TEC, Faculdade de Ciˆncias, Universidade do Porto
                                               e
                Rua do Campo Alegre, 1021/1055, 4169-007 Porto, Portugal

                          jld@dcc.fc.up.pt, arf@dcc.fc.up.pt



   4th International Conference on Knowledge Discovery and Information Retrieval


                                October 5, 2012
Introduction   News Clips Network    Mapping the Relationships   Multidimensional Network   Journalistic Research   Conclusions   Future Work




Introduction




               Breadcrumbs is a social network based on the relations established by collections
               of text fragments taken from online news.

               It can be used to collect and store fragments of text, called news clips, from
               online sources, in a Personal Digital Library.

               Text fragments are then semantically organized, based on several latent features
               found in the text, tags and comments assigned by the users.




   e                                ´
Jos´ Devezas (jld@dcc.fc.up.pt) and Alvaro Figueira (arf@dcc.fc.up.pt)                                                     CRACS/INESC TEC
Breadcrumbs (http://breadcrumbs.up.pt)
Introduction   News Clips Network    Mapping the Relationships   Multidimensional Network   Journalistic Research   Conclusions   Future Work




Introduction




               We present two interactive visualization tools based on a multidimensional
               network of news clips from the Breadcrumbs system:

                 1. A multiresolution visualization created with on gvmap (Gansner et al., 2010), a
                    GraphViz tool to generate static illustrations of graphs as maps.

                 2. A web-based force-directed visualization developed using the data-driven approach of
                    d3.js (Bostock et al., 2011).




   e                                ´
Jos´ Devezas (jld@dcc.fc.up.pt) and Alvaro Figueira (arf@dcc.fc.up.pt)                                                     CRACS/INESC TEC
Breadcrumbs (http://breadcrumbs.up.pt)
Introduction   News Clips Network       Mapping the Relationships      Multidimensional Network      Journalistic Research   Conclusions   Future Work




News Clips Network

               We used a small network of news clips with:
                       94 nodes representing news clips
                       166 edges representing entity coreferences1 in three dimensions:
                               17 people       (who dimension)
                               106 places      (where dimension)
                               43 dates        (when dimension)

               The following attributes were available for each node:
                       Clip ID
                       Creation Date
                       Source URL
                       Text Fragment
                       Community Membership ID

               The following attributes were available for each edge:
                       Number of Co-occurrences
                       Dimension Description (who, where or when)
                       DBpedia entity class (e.g. dbpedia-owl:Scientist)
                       DBpedia entity URI

            1
              Entity references were discovered independently from the idiom and resolved to their
        corresponding URI. This way connections are also established between news clips written
        in different languages (IAENG IJCS, Devezas, 2012).
   e                                ´
Jos´ Devezas (jld@dcc.fc.up.pt) and Alvaro Figueira (arf@dcc.fc.up.pt)                                                              CRACS/INESC TEC
Breadcrumbs (http://breadcrumbs.up.pt)
Introduction   News Clips Network    Mapping the Relationships   Multidimensional Network   Journalistic Research   Conclusions   Future Work




Mapping the Relationships




   e                                ´
Jos´ Devezas (jld@dcc.fc.up.pt) and Alvaro Figueira (arf@dcc.fc.up.pt)                                                     CRACS/INESC TEC
Breadcrumbs (http://breadcrumbs.up.pt)
Introduction   News Clips Network    Mapping the Relationships   Multidimensional Network   Journalistic Research   Conclusions   Future Work




Mapping the Relationships




               Each zoom level was created from a single precomputed GraphViz file for the
               maximum resolution (illustrated in the previous figure).

                      The original input file described the network and defined news clips text as node labels.

                      Node positioning and spacing was done according to this initial file using sfdp and
                      gvmap.

                      The resulting GraphViz file was then used to create the highest resolution zoom level
                      PNG and as base for the lower levels.




   e                                ´
Jos´ Devezas (jld@dcc.fc.up.pt) and Alvaro Figueira (arf@dcc.fc.up.pt)                                                     CRACS/INESC TEC
Breadcrumbs (http://breadcrumbs.up.pt)
Introduction   News Clips Network    Mapping the Relationships   Multidimensional Network   Journalistic Research   Conclusions   Future Work




Mapping the Relationships




               For lower levels, clip text was replaced with a double label (illustrated next).

                      To keep the initially computed map structure across zoom levels, only the node label
                      was replaced in the original GraphViz file.

                      For the n most central nodes, according to PageRank, where n depends on the level, we
                      retrieved the word with highest TF-IDF for the concatenated text of the node’s
                      community, as well as for the node’s text.

                      This resulted in a double label in the format “community label: node label”, where
                      community label illustrates the global topic, while node label illustrates the local topic
                      of the node.




   e                                ´
Jos´ Devezas (jld@dcc.fc.up.pt) and Alvaro Figueira (arf@dcc.fc.up.pt)                                                     CRACS/INESC TEC
Breadcrumbs (http://breadcrumbs.up.pt)
Introduction   News Clips Network    Mapping the Relationships   Multidimensional Network   Journalistic Research   Conclusions   Future Work




Mapping the Relationships




   e                                ´
Jos´ Devezas (jld@dcc.fc.up.pt) and Alvaro Figueira (arf@dcc.fc.up.pt)                                                     CRACS/INESC TEC
Breadcrumbs (http://breadcrumbs.up.pt)
Introduction   News Clips Network    Mapping the Relationships   Multidimensional Network   Journalistic Research   Conclusions   Future Work

Node Pinning


Multidimensional Network: Node Pinning




   e                                ´
Jos´ Devezas (jld@dcc.fc.up.pt) and Alvaro Figueira (arf@dcc.fc.up.pt)                                                     CRACS/INESC TEC
Breadcrumbs (http://breadcrumbs.up.pt)
Introduction     News Clips Network   Mapping the Relationships   Multidimensional Network   Journalistic Research   Conclusions   Future Work

Node Filtering


Multidimensional Network: Node Filtering




   e                                ´
Jos´ Devezas (jld@dcc.fc.up.pt) and Alvaro Figueira (arf@dcc.fc.up.pt)                                                      CRACS/INESC TEC
Breadcrumbs (http://breadcrumbs.up.pt)
Introduction    News Clips Network   Mapping the Relationships   Multidimensional Network   Journalistic Research   Conclusions   Future Work

Dimension Filtering


Multidimensional Network: Dimension Filtering




   e                                ´
Jos´ Devezas (jld@dcc.fc.up.pt) and Alvaro Figueira (arf@dcc.fc.up.pt)                                                     CRACS/INESC TEC
Breadcrumbs (http://breadcrumbs.up.pt)
Introduction    News Clips Network   Mapping the Relationships   Multidimensional Network   Journalistic Research   Conclusions   Future Work

Toggle Singleton Nodes Visibility


Multidimensional Network: Toggle Singleton Nodes Visibility




   e                                ´
Jos´ Devezas (jld@dcc.fc.up.pt) and Alvaro Figueira (arf@dcc.fc.up.pt)                                                     CRACS/INESC TEC
Breadcrumbs (http://breadcrumbs.up.pt)
Introduction   News Clips Network    Mapping the Relationships   Multidimensional Network   Journalistic Research   Conclusions   Future Work




Journalistic Research



               The goal: to provide visual tools that support the journalistic research process.
               We provide two complementary solutions:


           1. A network map for global topic exploration.

                      Maps are familiar to people and provide a much more intuitive way to explore networks
                      with identified community structure.

                      The semantic aggregation of news clips provided by the colored communities should
                      help identify the different groups of related topics, as well as the boundaries where
                      topics transition to new subjects.

                      Each “country” in the map contains a group of “semantically close” news clips.




   e                                ´
Jos´ Devezas (jld@dcc.fc.up.pt) and Alvaro Figueira (arf@dcc.fc.up.pt)                                                     CRACS/INESC TEC
Breadcrumbs (http://breadcrumbs.up.pt)
Introduction   News Clips Network    Mapping the Relationships   Multidimensional Network   Journalistic Research   Conclusions   Future Work




Journalistic Research




           2. A web-based multidimensional network.

                      Journalists can explore the connections between news clips based on three of the
                      Five-Ws: who, where and when.

                      This enables, for instance, the discovery of important people that serve as connectors
                      between different topics (e.g. Nicolas Sarkozy is a bridge between Europe and
                      international affairs, being connected to the United Nations, Libya and Barack Obama
                      — illustrated next).

                      The journalist can use this to discover and question connections (e.g. Why is Nicolas
                      Sarkozy being referenced alongside such distinct entities? What role does this entity
                      play in the whole?).




   e                                ´
Jos´ Devezas (jld@dcc.fc.up.pt) and Alvaro Figueira (arf@dcc.fc.up.pt)                                                     CRACS/INESC TEC
Breadcrumbs (http://breadcrumbs.up.pt)
Introduction   News Clips Network    Mapping the Relationships   Multidimensional Network   Journalistic Research   Conclusions   Future Work




Journalistic Research




   e                                ´
Jos´ Devezas (jld@dcc.fc.up.pt) and Alvaro Figueira (arf@dcc.fc.up.pt)                                                     CRACS/INESC TEC
Breadcrumbs (http://breadcrumbs.up.pt)
Introduction   News Clips Network    Mapping the Relationships   Multidimensional Network   Journalistic Research   Conclusions   Future Work




Conclusions




               The multiresolution map visualization was effective in producing a clear
               illustration of the network’s nodes and clusters.

               However it didn’t provide by itself a very rich interaction to the user apart from a
               semantic zooming behavior and quick access to news clips metadata.

               GraphViz only provides the means to generate a single static image for the
               network map, thus making it hard to create interactive and dynamic
               visualizations.




   e                                ´
Jos´ Devezas (jld@dcc.fc.up.pt) and Alvaro Figueira (arf@dcc.fc.up.pt)                                                     CRACS/INESC TEC
Breadcrumbs (http://breadcrumbs.up.pt)
Introduction   News Clips Network    Mapping the Relationships   Multidimensional Network   Journalistic Research   Conclusions   Future Work




Conclusions




               The visualization of the multidimensional network of news clips, developed using
               a data-driven approach, enabled the user to organize and filter nodes, as well as
               to visually toggle any of the available edge dimensions.

               This allowed the users to interactively explore several aspects of the data that
               would otherwise be difficult to interpret, resulting in a tool that can be helpful in
               journalist research.




   e                                ´
Jos´ Devezas (jld@dcc.fc.up.pt) and Alvaro Figueira (arf@dcc.fc.up.pt)                                                     CRACS/INESC TEC
Breadcrumbs (http://breadcrumbs.up.pt)
Introduction   News Clips Network    Mapping the Relationships   Multidimensional Network   Journalistic Research   Conclusions   Future Work




Future Work




               Improve on the existing network map visualization, specially in regards to the
               method of community and news clip topic discovery, when computing the pair of
               node labels.
               Evaluate the developed visualization tools based on human input, assessing user
               experience and usability, with a focus on the journalistic community.




   e                                ´
Jos´ Devezas (jld@dcc.fc.up.pt) and Alvaro Figueira (arf@dcc.fc.up.pt)                                                     CRACS/INESC TEC
Breadcrumbs (http://breadcrumbs.up.pt)

More Related Content

Similar to Interactive Visualization of a News Clips Network: A Journalistic Research and Knowledge Discovery Tool

A Semantic Multimedia Web (Part 3)
A Semantic Multimedia Web (Part 3)A Semantic Multimedia Web (Part 3)
A Semantic Multimedia Web (Part 3)Raphael Troncy
 
Metaverse for Dataverse
Metaverse for DataverseMetaverse for Dataverse
Metaverse for Dataversevty
 
Building COVID-19 Museum as Open Science Project
Building COVID-19 Museum as Open Science ProjectBuilding COVID-19 Museum as Open Science Project
Building COVID-19 Museum as Open Science Projectvty
 
Web technologies for accessible cartography
Web technologies for accessible cartographyWeb technologies for accessible cartography
Web technologies for accessible cartographyliddy
 
CU9255 internetworking Multimedia
CU9255 internetworking MultimediaCU9255 internetworking Multimedia
CU9255 internetworking MultimediaAdri Jovin
 
Traversing Networks of Complexity
Traversing Networks of ComplexityTraversing Networks of Complexity
Traversing Networks of Complexityfwiencek
 
2D and 3D Visualizations In Wikidev2.0 M. Fokaefs, D. Serrano, B. Tansey and ...
2D and 3D Visualizations In Wikidev2.0 M. Fokaefs, D. Serrano, B. Tansey and ...2D and 3D Visualizations In Wikidev2.0 M. Fokaefs, D. Serrano, B. Tansey and ...
2D and 3D Visualizations In Wikidev2.0 M. Fokaefs, D. Serrano, B. Tansey and ...ICSM 2010
 
No more BITS - Blind Insignificant Technologies ands Systems by Roger Roberts...
No more BITS - Blind Insignificant Technologies ands Systems by Roger Roberts...No more BITS - Blind Insignificant Technologies ands Systems by Roger Roberts...
No more BITS - Blind Insignificant Technologies ands Systems by Roger Roberts...ACTUONDA
 
Integration of Accessible Documents into Digital Libraries of Tomorrow
Integration of Accessible Documents into Digital Libraries of TomorrowIntegration of Accessible Documents into Digital Libraries of Tomorrow
Integration of Accessible Documents into Digital Libraries of TomorrowAlexander Haffner
 
History and overview of csnet
History and overview of csnetHistory and overview of csnet
History and overview of csnetsugeladi
 
stanford_graph-learning_workshop.pdf
stanford_graph-learning_workshop.pdfstanford_graph-learning_workshop.pdf
stanford_graph-learning_workshop.pdfAdeIndriawan1
 
Big Data to SMART Data : Process Scenario
Big Data to SMART Data : Process ScenarioBig Data to SMART Data : Process Scenario
Big Data to SMART Data : Process ScenarioCHAKER ALLAOUI
 
Chapter1-HTML.docx
Chapter1-HTML.docxChapter1-HTML.docx
Chapter1-HTML.docxJanessaCruz
 
Collaboration for Digital Preservation
Collaboration for Digital PreservationCollaboration for Digital Preservation
Collaboration for Digital Preservationnicomachus
 
From data portal to knowledge portal: Leveraging semantic technologies to sup...
From data portal to knowledge portal: Leveraging semantic technologies to sup...From data portal to knowledge portal: Leveraging semantic technologies to sup...
From data portal to knowledge portal: Leveraging semantic technologies to sup...Xiaogang (Marshall) Ma
 
Chi2011 Case Study: Interactive, Dynamic Sparklines
Chi2011 Case Study: Interactive, Dynamic SparklinesChi2011 Case Study: Interactive, Dynamic Sparklines
Chi2011 Case Study: Interactive, Dynamic SparklinesLeo Frishberg
 
Developing a Digital Repository for International Symposium on Information Ma...
Developing a Digital Repository for International Symposium on Information Ma...Developing a Digital Repository for International Symposium on Information Ma...
Developing a Digital Repository for International Symposium on Information Ma...Müge Akbulut
 
Distributed mixed reality for diving and
Distributed mixed reality for diving andDistributed mixed reality for diving and
Distributed mixed reality for diving andijcsa
 

Similar to Interactive Visualization of a News Clips Network: A Journalistic Research and Knowledge Discovery Tool (20)

A Semantic Multimedia Web (Part 3)
A Semantic Multimedia Web (Part 3)A Semantic Multimedia Web (Part 3)
A Semantic Multimedia Web (Part 3)
 
Metaverse for Dataverse
Metaverse for DataverseMetaverse for Dataverse
Metaverse for Dataverse
 
Building COVID-19 Museum as Open Science Project
Building COVID-19 Museum as Open Science ProjectBuilding COVID-19 Museum as Open Science Project
Building COVID-19 Museum as Open Science Project
 
Web technologies for accessible cartography
Web technologies for accessible cartographyWeb technologies for accessible cartography
Web technologies for accessible cartography
 
CU9255 internetworking Multimedia
CU9255 internetworking MultimediaCU9255 internetworking Multimedia
CU9255 internetworking Multimedia
 
Traversing Networks of Complexity
Traversing Networks of ComplexityTraversing Networks of Complexity
Traversing Networks of Complexity
 
World Wide Web
World Wide WebWorld Wide Web
World Wide Web
 
2D and 3D Visualizations In Wikidev2.0 M. Fokaefs, D. Serrano, B. Tansey and ...
2D and 3D Visualizations In Wikidev2.0 M. Fokaefs, D. Serrano, B. Tansey and ...2D and 3D Visualizations In Wikidev2.0 M. Fokaefs, D. Serrano, B. Tansey and ...
2D and 3D Visualizations In Wikidev2.0 M. Fokaefs, D. Serrano, B. Tansey and ...
 
No more BITS - Blind Insignificant Technologies ands Systems by Roger Roberts...
No more BITS - Blind Insignificant Technologies ands Systems by Roger Roberts...No more BITS - Blind Insignificant Technologies ands Systems by Roger Roberts...
No more BITS - Blind Insignificant Technologies ands Systems by Roger Roberts...
 
7- Grid Computing.Pdf
7- Grid Computing.Pdf7- Grid Computing.Pdf
7- Grid Computing.Pdf
 
Integration of Accessible Documents into Digital Libraries of Tomorrow
Integration of Accessible Documents into Digital Libraries of TomorrowIntegration of Accessible Documents into Digital Libraries of Tomorrow
Integration of Accessible Documents into Digital Libraries of Tomorrow
 
History and overview of csnet
History and overview of csnetHistory and overview of csnet
History and overview of csnet
 
stanford_graph-learning_workshop.pdf
stanford_graph-learning_workshop.pdfstanford_graph-learning_workshop.pdf
stanford_graph-learning_workshop.pdf
 
Big Data to SMART Data : Process Scenario
Big Data to SMART Data : Process ScenarioBig Data to SMART Data : Process Scenario
Big Data to SMART Data : Process Scenario
 
Chapter1-HTML.docx
Chapter1-HTML.docxChapter1-HTML.docx
Chapter1-HTML.docx
 
Collaboration for Digital Preservation
Collaboration for Digital PreservationCollaboration for Digital Preservation
Collaboration for Digital Preservation
 
From data portal to knowledge portal: Leveraging semantic technologies to sup...
From data portal to knowledge portal: Leveraging semantic technologies to sup...From data portal to knowledge portal: Leveraging semantic technologies to sup...
From data portal to knowledge portal: Leveraging semantic technologies to sup...
 
Chi2011 Case Study: Interactive, Dynamic Sparklines
Chi2011 Case Study: Interactive, Dynamic SparklinesChi2011 Case Study: Interactive, Dynamic Sparklines
Chi2011 Case Study: Interactive, Dynamic Sparklines
 
Developing a Digital Repository for International Symposium on Information Ma...
Developing a Digital Repository for International Symposium on Information Ma...Developing a Digital Repository for International Symposium on Information Ma...
Developing a Digital Repository for International Symposium on Information Ma...
 
Distributed mixed reality for diving and
Distributed mixed reality for diving andDistributed mixed reality for diving and
Distributed mixed reality for diving and
 

Recently uploaded

MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Victor Rentea
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...apidays
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024The Digital Insurer
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businesspanagenda
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native ApplicationsWSO2
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Zilliz
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...apidays
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024The Digital Insurer
 

Recently uploaded (20)

MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 

Interactive Visualization of a News Clips Network: A Journalistic Research and Knowledge Discovery Tool

  • 1. Interactive Visualization of a News Clips Network: A Journalistic Research and Knowledge Discovery Tool ´ Jos´ Devezas and Alvaro Figueira e CRACS/INESC TEC, Faculdade de Ciˆncias, Universidade do Porto e Rua do Campo Alegre, 1021/1055, 4169-007 Porto, Portugal jld@dcc.fc.up.pt, arf@dcc.fc.up.pt 4th International Conference on Knowledge Discovery and Information Retrieval October 5, 2012
  • 2. Introduction News Clips Network Mapping the Relationships Multidimensional Network Journalistic Research Conclusions Future Work Introduction Breadcrumbs is a social network based on the relations established by collections of text fragments taken from online news. It can be used to collect and store fragments of text, called news clips, from online sources, in a Personal Digital Library. Text fragments are then semantically organized, based on several latent features found in the text, tags and comments assigned by the users. e ´ Jos´ Devezas (jld@dcc.fc.up.pt) and Alvaro Figueira (arf@dcc.fc.up.pt) CRACS/INESC TEC Breadcrumbs (http://breadcrumbs.up.pt)
  • 3. Introduction News Clips Network Mapping the Relationships Multidimensional Network Journalistic Research Conclusions Future Work Introduction We present two interactive visualization tools based on a multidimensional network of news clips from the Breadcrumbs system: 1. A multiresolution visualization created with on gvmap (Gansner et al., 2010), a GraphViz tool to generate static illustrations of graphs as maps. 2. A web-based force-directed visualization developed using the data-driven approach of d3.js (Bostock et al., 2011). e ´ Jos´ Devezas (jld@dcc.fc.up.pt) and Alvaro Figueira (arf@dcc.fc.up.pt) CRACS/INESC TEC Breadcrumbs (http://breadcrumbs.up.pt)
  • 4. Introduction News Clips Network Mapping the Relationships Multidimensional Network Journalistic Research Conclusions Future Work News Clips Network We used a small network of news clips with: 94 nodes representing news clips 166 edges representing entity coreferences1 in three dimensions: 17 people (who dimension) 106 places (where dimension) 43 dates (when dimension) The following attributes were available for each node: Clip ID Creation Date Source URL Text Fragment Community Membership ID The following attributes were available for each edge: Number of Co-occurrences Dimension Description (who, where or when) DBpedia entity class (e.g. dbpedia-owl:Scientist) DBpedia entity URI 1 Entity references were discovered independently from the idiom and resolved to their corresponding URI. This way connections are also established between news clips written in different languages (IAENG IJCS, Devezas, 2012). e ´ Jos´ Devezas (jld@dcc.fc.up.pt) and Alvaro Figueira (arf@dcc.fc.up.pt) CRACS/INESC TEC Breadcrumbs (http://breadcrumbs.up.pt)
  • 5. Introduction News Clips Network Mapping the Relationships Multidimensional Network Journalistic Research Conclusions Future Work Mapping the Relationships e ´ Jos´ Devezas (jld@dcc.fc.up.pt) and Alvaro Figueira (arf@dcc.fc.up.pt) CRACS/INESC TEC Breadcrumbs (http://breadcrumbs.up.pt)
  • 6. Introduction News Clips Network Mapping the Relationships Multidimensional Network Journalistic Research Conclusions Future Work Mapping the Relationships Each zoom level was created from a single precomputed GraphViz file for the maximum resolution (illustrated in the previous figure). The original input file described the network and defined news clips text as node labels. Node positioning and spacing was done according to this initial file using sfdp and gvmap. The resulting GraphViz file was then used to create the highest resolution zoom level PNG and as base for the lower levels. e ´ Jos´ Devezas (jld@dcc.fc.up.pt) and Alvaro Figueira (arf@dcc.fc.up.pt) CRACS/INESC TEC Breadcrumbs (http://breadcrumbs.up.pt)
  • 7. Introduction News Clips Network Mapping the Relationships Multidimensional Network Journalistic Research Conclusions Future Work Mapping the Relationships For lower levels, clip text was replaced with a double label (illustrated next). To keep the initially computed map structure across zoom levels, only the node label was replaced in the original GraphViz file. For the n most central nodes, according to PageRank, where n depends on the level, we retrieved the word with highest TF-IDF for the concatenated text of the node’s community, as well as for the node’s text. This resulted in a double label in the format “community label: node label”, where community label illustrates the global topic, while node label illustrates the local topic of the node. e ´ Jos´ Devezas (jld@dcc.fc.up.pt) and Alvaro Figueira (arf@dcc.fc.up.pt) CRACS/INESC TEC Breadcrumbs (http://breadcrumbs.up.pt)
  • 8. Introduction News Clips Network Mapping the Relationships Multidimensional Network Journalistic Research Conclusions Future Work Mapping the Relationships e ´ Jos´ Devezas (jld@dcc.fc.up.pt) and Alvaro Figueira (arf@dcc.fc.up.pt) CRACS/INESC TEC Breadcrumbs (http://breadcrumbs.up.pt)
  • 9. Introduction News Clips Network Mapping the Relationships Multidimensional Network Journalistic Research Conclusions Future Work Node Pinning Multidimensional Network: Node Pinning e ´ Jos´ Devezas (jld@dcc.fc.up.pt) and Alvaro Figueira (arf@dcc.fc.up.pt) CRACS/INESC TEC Breadcrumbs (http://breadcrumbs.up.pt)
  • 10. Introduction News Clips Network Mapping the Relationships Multidimensional Network Journalistic Research Conclusions Future Work Node Filtering Multidimensional Network: Node Filtering e ´ Jos´ Devezas (jld@dcc.fc.up.pt) and Alvaro Figueira (arf@dcc.fc.up.pt) CRACS/INESC TEC Breadcrumbs (http://breadcrumbs.up.pt)
  • 11. Introduction News Clips Network Mapping the Relationships Multidimensional Network Journalistic Research Conclusions Future Work Dimension Filtering Multidimensional Network: Dimension Filtering e ´ Jos´ Devezas (jld@dcc.fc.up.pt) and Alvaro Figueira (arf@dcc.fc.up.pt) CRACS/INESC TEC Breadcrumbs (http://breadcrumbs.up.pt)
  • 12. Introduction News Clips Network Mapping the Relationships Multidimensional Network Journalistic Research Conclusions Future Work Toggle Singleton Nodes Visibility Multidimensional Network: Toggle Singleton Nodes Visibility e ´ Jos´ Devezas (jld@dcc.fc.up.pt) and Alvaro Figueira (arf@dcc.fc.up.pt) CRACS/INESC TEC Breadcrumbs (http://breadcrumbs.up.pt)
  • 13. Introduction News Clips Network Mapping the Relationships Multidimensional Network Journalistic Research Conclusions Future Work Journalistic Research The goal: to provide visual tools that support the journalistic research process. We provide two complementary solutions: 1. A network map for global topic exploration. Maps are familiar to people and provide a much more intuitive way to explore networks with identified community structure. The semantic aggregation of news clips provided by the colored communities should help identify the different groups of related topics, as well as the boundaries where topics transition to new subjects. Each “country” in the map contains a group of “semantically close” news clips. e ´ Jos´ Devezas (jld@dcc.fc.up.pt) and Alvaro Figueira (arf@dcc.fc.up.pt) CRACS/INESC TEC Breadcrumbs (http://breadcrumbs.up.pt)
  • 14. Introduction News Clips Network Mapping the Relationships Multidimensional Network Journalistic Research Conclusions Future Work Journalistic Research 2. A web-based multidimensional network. Journalists can explore the connections between news clips based on three of the Five-Ws: who, where and when. This enables, for instance, the discovery of important people that serve as connectors between different topics (e.g. Nicolas Sarkozy is a bridge between Europe and international affairs, being connected to the United Nations, Libya and Barack Obama — illustrated next). The journalist can use this to discover and question connections (e.g. Why is Nicolas Sarkozy being referenced alongside such distinct entities? What role does this entity play in the whole?). e ´ Jos´ Devezas (jld@dcc.fc.up.pt) and Alvaro Figueira (arf@dcc.fc.up.pt) CRACS/INESC TEC Breadcrumbs (http://breadcrumbs.up.pt)
  • 15. Introduction News Clips Network Mapping the Relationships Multidimensional Network Journalistic Research Conclusions Future Work Journalistic Research e ´ Jos´ Devezas (jld@dcc.fc.up.pt) and Alvaro Figueira (arf@dcc.fc.up.pt) CRACS/INESC TEC Breadcrumbs (http://breadcrumbs.up.pt)
  • 16. Introduction News Clips Network Mapping the Relationships Multidimensional Network Journalistic Research Conclusions Future Work Conclusions The multiresolution map visualization was effective in producing a clear illustration of the network’s nodes and clusters. However it didn’t provide by itself a very rich interaction to the user apart from a semantic zooming behavior and quick access to news clips metadata. GraphViz only provides the means to generate a single static image for the network map, thus making it hard to create interactive and dynamic visualizations. e ´ Jos´ Devezas (jld@dcc.fc.up.pt) and Alvaro Figueira (arf@dcc.fc.up.pt) CRACS/INESC TEC Breadcrumbs (http://breadcrumbs.up.pt)
  • 17. Introduction News Clips Network Mapping the Relationships Multidimensional Network Journalistic Research Conclusions Future Work Conclusions The visualization of the multidimensional network of news clips, developed using a data-driven approach, enabled the user to organize and filter nodes, as well as to visually toggle any of the available edge dimensions. This allowed the users to interactively explore several aspects of the data that would otherwise be difficult to interpret, resulting in a tool that can be helpful in journalist research. e ´ Jos´ Devezas (jld@dcc.fc.up.pt) and Alvaro Figueira (arf@dcc.fc.up.pt) CRACS/INESC TEC Breadcrumbs (http://breadcrumbs.up.pt)
  • 18. Introduction News Clips Network Mapping the Relationships Multidimensional Network Journalistic Research Conclusions Future Work Future Work Improve on the existing network map visualization, specially in regards to the method of community and news clip topic discovery, when computing the pair of node labels. Evaluate the developed visualization tools based on human input, assessing user experience and usability, with a focus on the journalistic community. e ´ Jos´ Devezas (jld@dcc.fc.up.pt) and Alvaro Figueira (arf@dcc.fc.up.pt) CRACS/INESC TEC Breadcrumbs (http://breadcrumbs.up.pt)