Visualization of evolutionary cascades of messages using force-directed graphs
Artjom Kurapov
Supervisor: Helena Kruus

Master’s thesis defense, 9 May 2011
Agenda

   Background
   Practical work
       Pling.ee, open-source Gephi
       Web-tool demo and Twitter
Background

   Types of networks
   Properties / areas of application
   Research interest
Topics crossroads
Goals

   Visualize social networks (preferably in Estonia)
   Compare friends and messages topology
   Try to mine data visually using cascades

   [Figure: example graph with four nodes labelled A, B, C and D]
Pling
Pling – Qualitative measure
                                 Friends   Messages
Average clustering coefficient   0.135     0.043
Average degree                   4.313     2.202
GCC diameter                     20        38
Average GCC diameter             5.38      13.009
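
The metrics in the table above can be illustrated on a toy graph. The following is a minimal sketch, assuming an undirected graph stored as a dictionary of neighbour sets; the function names and the toy data are hypothetical and are not taken from the thesis or from Gephi. It computes the average degree, the average clustering coefficient, and the diameter of the giant connected component (GCC) by breadth-first search.

```python
from collections import deque


def average_degree(adj):
    """Mean number of neighbours per node."""
    return sum(len(nbrs) for nbrs in adj.values()) / len(adj)


def local_clustering(adj, node):
    """Fraction of a node's neighbour pairs that are themselves connected."""
    nbrs = list(adj[node])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if nbrs[j] in adj[nbrs[i]]
    )
    return 2.0 * links / (k * (k - 1))


def average_clustering(adj):
    """Mean of the local clustering coefficients over all nodes."""
    return sum(local_clustering(adj, n) for n in adj) / len(adj)


def giant_component(adj):
    """Node set of the largest connected component."""
    seen, best = set(), set()
    for start in adj:
        if start in seen:
            continue
        comp, queue = {start}, deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in comp:
                    comp.add(v)
                    queue.append(v)
        seen |= comp
        if len(comp) > len(best):
            best = comp
    return best


def diameter(adj, nodes):
    """Longest shortest path between any two of the given connected nodes."""
    longest = 0
    for source in nodes:
        dist, queue = {source: 0}, deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        longest = max(longest, max(dist.values()))
    return longest


# Hypothetical toy network: a triangle A-B-C, a pendant node D, an isolated node E.
toy = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
    "E": set(),
}

gcc = giant_component(toy)
print("average degree:", average_degree(toy))
print("average clustering:", average_clustering(toy))
print("GCC diameter:", diameter(toy, gcc))
```

The all-pairs breadth-first search is quadratic in the number of nodes, so it only suits small graphs like this one.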
Topic and interface matters

   Out of 18.6 million messages, no clearly visible cascade

Possibly because
 89% private

 86% sent using phone
Javascript tool

   Up to 1000 nodes
   Can add nodes on the fly
   Navigation and filtering
   Properties calculation
   Recursive algorithm
Twitter

   Friendship and message network mined
   218 users / 12643 messages, 6.89% retweets
   [Chart: logarithmic vertical axis from 1 to 100000, horizontal axis from 0 to 8]
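
The retweet share and the cascade depth reported on this slide can be computed from retweet links. The following is a minimal sketch, assuming each message records the id of the message it retweets (or None for an original post) and that the links contain no cycles; the data layout, function names and toy data are hypothetical and are not the format used in the thesis.

```python
from collections import Counter


def cascade_depths(parent_of):
    """Map each message id to its number of retweet hops from the original post."""
    depths = {}
    for msg in parent_of:
        # Walk up the retweet chain until a root or an already-solved message.
        chain = []
        cur = msg
        while cur is not None and cur not in depths:
            chain.append(cur)
            cur = parent_of.get(cur)
        base = -1 if cur is None else depths[cur]
        # Assign depths on the way back down the chain.
        for m in reversed(chain):
            base += 1
            depths[m] = base
    return depths


def retweet_share(parent_of):
    """Fraction of messages that are retweets of another message."""
    retweets = sum(1 for parent in parent_of.values() if parent is not None)
    return retweets / len(parent_of)


# Hypothetical toy dataset: m3 retweets m2, which retweets m1; m4 retweets m1.
toy_messages = {
    "m1": None,
    "m2": "m1",
    "m3": "m2",
    "m4": "m1",
    "m5": None,
}

depths = cascade_depths(toy_messages)
histogram = Counter(depths.values())
print("retweet share:", retweet_share(toy_messages))
print("messages per depth:", dict(sorted(histogram.items())))
```

A histogram of this kind, drawn with a logarithmic count axis, is one way to produce a depth chart like the one on the slide.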
Thank you
Questions?

Editor's notes

  1. So, first a little introduction to the field, then some research I’ve done on a large dataset, then a browser tool I built myself: a small demo, its features and the issues I faced. And finally, results from a small Twitter dataset.
  2. Networks are everywhere. Most of us here study technological and information networks. But there are also biochemical, ecological and, most interestingly, social networks, which influence our daily life. These include sexual connections, friendship networks, citations or any kind of social behavior associated with them. In fact, if you are strict about it, citation is not really social behavior, since it’s directed and doesn’t imply talking to a real person. So it’s more like a network of document dependencies. So it is important how you define connections and objects. Networks have different properties, some of which I list in the paper. And of course some of them are relevant only in one field: bipartite graphs are only needed if you want to visualize them, or cliques if you want to do clique analysis. There are also different research interests: drawing, how networks evolve, how they break apart, where traffic goes through, or how we can solve all kinds of graph puzzles, such as graph search, coloring or the travelling salesman problem.
  3. So to visualize such a network and its processes, one needs to look at the surrounding fields: sociology with its laws of diffusion and preferential attachment, network properties, drawing algorithms and their complexity, and of course work that has been done before, both theoretical and practical, such as existing software.
  4. As a thesis goal, I suggest mining data through frequency analysis of messages and making a network topology map. That means that we want a graph representation of a network, we want both friendship and message datasets, and then we want to see how they correlate and lead to higher forms of messages: cascades. My hypothesis is that cascades are parts of social thought. Thus evolutionary cascades are linked cascades across multiple topics.
  5. So I have studied the Estonian social network pling.ee, which belongs to Elisa Eesti AS. The friendship network on the left has 75 thousand users, and the message network on the right has 12 thousand. As you can see, they are different, and assortative mixing is present: the red nodes are Russian users and the blue nodes are Estonian users. This was read from the messages and the symbols they used.
  6. So the numbers differ as well. As you can see, since it was a small portion of messages, the network is rather young and has a bigger diameter. At the same time the average degree is smaller, which is natural, since people don’t talk to all of their friends. And the clustering coefficient is also smaller, which partially depends on that degree tendency.
  7. The bad news for me was that I was not able to find a single cascade. Possibly because only around 14% of messages were sent from the browser and there was no explicit resharing function in the interface. But compare it to Twitter: people there invented RT themselves. Most likely it’s the topic of discussion that didn’t stimulate sharing, since 89% of talks were private and almost all users are teenagers discussing their love life.
  8. So to study cascades and make visualizations, I’ve tried building my own tool, which is written in JavaScript and can draw small datasets along with their analysis. It’s browser-based and supports navigation. I’ve also done two dataset extractions from Twitter.
  9. From 12 thousand messages, around 7% can be considered direct cascades. But there may be more, since I didn’t take into account normal posts in directed form, which can also lead to smaller forms of cascades. On the graph you can see how the depth of a retweet depends on its number in the dataset. (demo here)
  10. I don’t talk about evolutionary networks, because I study static snapshots here, but in general a network does evolve from disconnected components into a GCC. But it depends on the network. For example, buyers in electronic shops, even though they may suggest products, don’t always lead to new customers with a connection. So customers are not connected to anyone. On the other hand, there may be certain clusters if there is some sort of affiliate network campaign.
      P: polynomial complexity, T(n) = O(n^k).
      NP: nondeterministic polynomial complexity. A nondeterministic automaton can have multiple decision paths from a single state.
      “NP-complete” problems have no known polynomial-time algorithm.
      “NP-hard” problems are at least as hard as NP-complete ones.
      2. Yes, in social networks the GCC diameter is largest in the first stages of network evolution and decreases over time. I’m not so sure about other network types. Social networks do get denser: since each new node can connect to 0, 1 or all nodes, alpha lies between 1 and 2. In the lowest case they grow linearly, with exponent equal to 1, like a tree. In the other case they can grow quadratically, with exponent equal to 2, where each new node basically connects to all other nodes. So the more people join in, the more friends can know the other end of the graph. Thus a smaller diameter. If you think of technological networks, I don’t think laying a wire from Japan to Brazil is so easy.
      3. Markov centrality is one of the ways to find the most influential nodes in the network. Although it’s very complex to compute, my work also lists other centrality measures. And I think that ...
      4. Cascade analysis and data mining is still hand work.
      5. I used the Fruchterman-Reingold and Yifan Hu algorithms for local forces and for adaptive cooling. I’ve added my own version of recursive force summing and presented it in the work.
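
The force-directed layout named in the last note can be sketched in a few lines. The following is a minimal illustration of the classic Fruchterman-Reingold scheme with a simple linear cooling schedule. It is not the author's recursive force summing and not Yifan Hu's adaptive cooling, and every name, parameter and default value here is an assumption made for illustration.

```python
import math
import random


def fruchterman_reingold(nodes, edges, width=1.0, height=1.0,
                         iterations=50, seed=0):
    """Return {node: [x, y]} positions inside a width-by-height frame."""
    rng = random.Random(seed)
    pos = {v: [rng.uniform(0, width), rng.uniform(0, height)] for v in nodes}
    k = math.sqrt(width * height / len(nodes))  # ideal distance between nodes
    temperature = width / 10.0                  # maximum step per iteration
    cooling = temperature / (iterations + 1)    # linear cooling (assumed)

    for _ in range(iterations):
        disp = {v: [0.0, 0.0] for v in nodes}

        # Repulsive force k^2 / d between every pair of nodes.
        for i, u in enumerate(nodes):
            for v in nodes[i + 1:]:
                dx = pos[u][0] - pos[v][0]
                dy = pos[u][1] - pos[v][1]
                dist = math.hypot(dx, dy) or 1e-9
                force = k * k / dist
                fx, fy = dx / dist * force, dy / dist * force
                disp[u][0] += fx
                disp[u][1] += fy
                disp[v][0] -= fx
                disp[v][1] -= fy

        # Attractive force d^2 / k along every edge.
        for u, v in edges:
            dx = pos[u][0] - pos[v][0]
            dy = pos[u][1] - pos[v][1]
            dist = math.hypot(dx, dy) or 1e-9
            force = dist * dist / k
            fx, fy = dx / dist * force, dy / dist * force
            disp[u][0] -= fx
            disp[u][1] -= fy
            disp[v][0] += fx
            disp[v][1] += fy

        # Move each node, capped by the temperature and clamped to the frame.
        for v in nodes:
            dx, dy = disp[v]
            length = math.hypot(dx, dy) or 1e-9
            step = min(length, temperature)
            pos[v][0] = min(width, max(0.0, pos[v][0] + dx / length * step))
            pos[v][1] = min(height, max(0.0, pos[v][1] + dy / length * step))

        temperature -= cooling

    return pos


# Hypothetical four-node example.
layout = fruchterman_reingold(["A", "B", "C", "D"],
                              [("A", "B"), ("A", "C"), ("C", "D")])
for name, (x, y) in layout.items():
    print(name, round(x, 3), round(y, 3))
```

The pairwise repulsion loop is quadratic in the number of nodes, which fits the small graphs (up to about 1000 nodes) that the browser tool targets.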