Title: What Is the Difference between a Social and a Hyperlink Network? -- How the Type of Network Can Be Determined from the Network Structure Alone

Networks represent a type of dataset that is ubiquitous in many disciplines and areas. Examples are social networks (ties between people), communication networks, trophic networks ("who eats whom"), the World Wide Web, computer networks, lexical networks (connections between words), transport networks, metabolic networks (e.g., interactions between proteins), neural networks, animal networks, citation networks, affiliation networks (of people in groups), software dependency networks, and many more. In this talk, we present ongoing work on answering the question "Can the type of network be detected from the network structure alone?" For instance, given a completely unlabeled network dataset consisting only of nodes and edges, can we detect whether the data represents a social network or a hyperlink network? We present machine learning and statistical approaches to answering questions of this type. The presented results make use of data in the KONECT project, one of the largest repositories of network datasets, curated at the University of Namur.

naXys – Namur Centre for Complex Networks – Univ. of Namur
What Is the Difference between a
Social and a Hyperlink Network?
How the Type of Network Can Be Determined from the
Network Structure Alone
Jérôme KUNEGIS
University of Oxford, Department of Statistics, 2017-09-12

“Network Category” 2J. Kunegis
Network Analysis

Networks Are Everywhere
• Cliché: “Everything is a Network”
• It's a cliché because it's true:
– Social network, road network, lexical network, metabolic network, trophic network, affiliation network, citation network, hyperlink network, etc.

Network Categories
From https://github.com/kunegis/konect-handbook

Collections of Network Datasets
• SNAP
– by Jure Leskovec, Stanford Univ. (~2009)
– Several hundred networks; not systematic
– Available for download
– Some statistics available
• KONECT
– by Jérôme Kunegis, Univ. of Namur (~2011)
– 1000+ networks, but only 200+ unipartite
– Most networks available for download
– Many statistics available
• ICON
– by Aaron Clauset, Univ. of Colorado (~2016)
– 4000+ datasets
– Not available for download (“index”)

Datasets in KONECT
In this work: 165 non-bipartite networks (out of 194 non-bipartite networks in KONECT)

Network Statistics
• A statistic is a real number that characterizes a network
• Examples:
– Average degree (d)
– Number of triangles (t)
– Diameter (δ)
– Clustering coefficient (c)
– Gini coefficient of degree distribution (G)
– Degree assortativity (ρ)
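
To make a few of these statistics concrete, here is a minimal sketch in plain Python. It assumes an undirected simple graph given as an edge list, takes the clustering coefficient to be the global variant c = 3t / s (s being the number of wedges), and assumes a connected graph for the diameter. The function names and the toy graph are illustrative only and are not taken from the KONECT code.

```python
from collections import deque
from itertools import combinations


def adjacency(edges):
    """Undirected adjacency map; self-loops and duplicate edges are ignored."""
    adj = {}
    for u, v in edges:
        if u != v:
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)
    return adj


def average_degree(adj):
    """d = 2m / n."""
    return sum(len(nb) for nb in adj.values()) / len(adj)


def wedge_count(adj):
    """Number of paths of length two, counted at their centre node."""
    return sum(len(nb) * (len(nb) - 1) // 2 for nb in adj.values())


def triangle_count(adj):
    """t: every triangle closes one wedge at each of its three corners."""
    closed = sum(
        1
        for nb in adj.values()
        for u, v in combinations(nb, 2)
        if v in adj[u]
    )
    return closed // 3


def clustering_coefficient(adj):
    """Global clustering coefficient c = 3t / s (assumed variant)."""
    s = wedge_count(adj)
    return 3 * triangle_count(adj) / s if s else 0.0


def diameter(adj):
    """Longest shortest-path length, by BFS from every node (connected graph assumed)."""
    longest = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        longest = max(longest, max(dist.values()))
    return longest


def gini(values):
    """Gini coefficient of non-negative values (0 means perfectly equal)."""
    xs = sorted(values)
    n, total = len(xs), sum(xs)
    if total == 0:
        return 0.0
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


if __name__ == "__main__":
    # Hypothetical toy graph: a triangle a-b-c with a pendant node d attached to c.
    toy = adjacency([("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")])
    degrees = [len(nb) for nb in toy.values()]
    print(average_degree(toy), triangle_count(toy), diameter(toy))
    print(clustering_coefficient(toy), gini(degrees))
```

Note that t grows with the size of the network while c does not, which is why c is the more comparable of the two across datasets.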

More Statistics
– Number of wedges (s)
– Number of squares (q)
– Number of claws (z)
– Number of crosses (x)
– Maximum degree (d_max)
– Relative maximum degree (d_MR = d_max / d)
– Number of degree-1 nodes (d_1)
– 50-percentile effective diameter (δ_0.5)
– Relative edge distribution entropy (H_er)
– Bipartivity (b_A = 1 – λ_min[A] / λ_max[A])
– Normalized two-star count (s_d = s / (n d (d – 1) / 2))
– Eigenvalues of certain matrices (a = λ_2[L], |λ_max[A]|, …)
– etc.
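
Several of the statistics in this list depend only on the degree sequence. The following minimal sketch assumes that a wedge, a claw and a cross are stars with two, three and four leaves respectively, so that each can be counted with a binomial coefficient per node. It also assumes the graph has at least one edge. The function name `degree_statistics` and the dictionary keys are illustrative. Squares, bipartivity and the eigenvalue-based statistics need more than the degree sequence and are left out.

```python
from math import comb


def degree_statistics(degrees):
    """Degree-based statistics of an undirected simple graph, from its degree list."""
    n = len(degrees)
    d = sum(degrees) / n                   # average degree
    s = sum(comb(k, 2) for k in degrees)   # wedges: stars with two leaves
    z = sum(comb(k, 3) for k in degrees)   # claws: stars with three leaves (assumed)
    x = sum(comb(k, 4) for k in degrees)   # crosses: stars with four leaves (assumed)
    d_max = max(degrees)
    norm = n * d * (d - 1) / 2             # denominator of s_d as written on the slide
    return {
        "s": s,
        "z": z,
        "x": x,
        "d_max": d_max,
        "d_MR": d_max / d,
        "d_1": sum(1 for k in degrees if k == 1),
        "s_d": s / norm if norm else float("nan"),
    }


if __name__ == "__main__":
    # Hypothetical degree sequence of a triangle with one pendant node.
    print(degree_statistics([2, 2, 3, 1]))
```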

Distribution of Clustering Coefficient (c)
Communication
Interaction
Hyperlink
Online social

Distribution of Gini Coefficient (G)
Online social
Infrastructure
Interaction
Human social

Distribution of Diameter (δ)
Infrastructure
Hyperlink
Citation

Degree Assortativity (ρ)

Statistical Testing
Kolmogorov–Smirnov test on each pair of categories; a cell is non-white when the statistic differs significantly between the two categories (p < 0.10). The base colour is given in HSL: hue denotes the network statistic; saturation and lightness are constant. The shown colour is interpolated between the base colour and white for 0 ≤ p ≤ 0.10.
Statistics (fixed position):
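
The test and the colouring described on this slide can be sketched roughly as follows. This is a plain-Python illustration, not the code behind the figure: the p-value uses the standard asymptotic Kolmogorov series with a common small-sample correction, which is only approximate for small categories, and the colour function assumes that p = 0 maps to the base colour and p ≥ 0.10 to white. All names and sample values are hypothetical.

```python
import math
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic D: the largest gap between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    gap = 0.0
    for x in a + b:  # the empirical CDFs only change at observed values
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


def ks_p_value(d, n_a, n_b, terms=100):
    """Approximate two-sided p-value from the asymptotic Kolmogorov distribution."""
    n_eff = n_a * n_b / (n_a + n_b)
    lam = (math.sqrt(n_eff) + 0.12 + 0.11 / math.sqrt(n_eff)) * d
    if lam < 0.3:
        return 1.0  # the series converges slowly here; the true value is very close to 1
    series = sum(
        (-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
        for k in range(1, terms + 1)
    )
    return min(1.0, max(0.0, 2.0 * series))


def cell_colour(p, base_rgb, threshold=0.10):
    """Blend linearly from the base colour (p = 0) to white (p >= threshold)."""
    weight = min(max(p / threshold, 0.0), 1.0)
    return tuple(c + (1.0 - c) * weight for c in base_rgb)


if __name__ == "__main__":
    # Hypothetical clustering coefficients for two categories of networks.
    category_a = [0.02, 0.05, 0.04, 0.08, 0.03]
    category_b = [0.21, 0.35, 0.18, 0.40, 0.27]
    d = ks_statistic(category_a, category_b)
    p = ks_p_value(d, len(category_a), len(category_b))
    print(d, p, cell_colour(p, (0.8, 0.2, 0.2)))
```

With many category pairs and many statistics tested at once, some cells will fall below p = 0.10 by chance, so individual coloured cells are best read as indications rather than as confirmed differences.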

Statistics Are Not Uncorrelated

Principal Component Analysis of Statistics

PCA of Network Datasets
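
As a rough illustration of PCA over a table with one row per network and one column per statistic, here is a minimal sketch. It assumes the columns are standardized first (so the covariance matrix is the correlation matrix), that no column is constant, and it extracts only the first principal component by power iteration rather than a full eigendecomposition. The function names and the example table are hypothetical.

```python
import math


def standardize(rows):
    """Centre each column and scale it to unit sample variance."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(col) / n for col in cols]
    stds = [
        math.sqrt(sum((v - m) ** 2 for v in col) / (n - 1))
        for col, m in zip(cols, means)
    ]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def covariance(rows):
    """Sample covariance matrix of already centred rows."""
    n, k = len(rows), len(rows[0])
    return [
        [sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(k)]
        for i in range(k)
    ]


def mat_vec(matrix, vec):
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]


def first_component(matrix, iterations=500):
    """Dominant eigenvector and eigenvalue of a symmetric matrix by power iteration."""
    # Uneven start vector, to make it unlikely to be orthogonal to the dominant eigenvector.
    vec = [1.0 / (i + 1) for i in range(len(matrix))]
    for _ in range(iterations):
        nxt = mat_vec(matrix, vec)
        norm = math.sqrt(sum(v * v for v in nxt))
        if norm == 0.0:
            raise ValueError("matrix maps the current vector to zero")
        vec = [v / norm for v in nxt]
    eigenvalue = sum(a * b for a, b in zip(vec, mat_vec(matrix, vec)))
    return vec, eigenvalue


if __name__ == "__main__":
    # Hypothetical table: six networks (rows) by three statistics (columns).
    table = [[1, 3, 2], [2, 5, -1], [3, 7, 0], [4, 9, 1], [5, 11, -2], [6, 13, 0]]
    loadings, eigenvalue = first_component(covariance(standardize(table)))
    print(loadings, eigenvalue / len(table[0]))  # loadings and share of variance
```

Because the columns are standardized, the eigenvalue divided by the number of statistics gives the share of total variance captured by the first component.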

Feature Engineering
• Find size-independent formulations of statistics
– E.g., c instead of t
• Avoid highly correlated statistics
– E.g., keep only one of G and P
• Find statistics that are easy to compute
– E.g., algebraic connectivity (a) needs O(n²) runtime
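
The second point, avoiding highly correlated statistics, can be sketched as a simple greedy filter. This is one possible approach under stated assumptions, not necessarily the procedure used in the talk: it uses the Pearson correlation across datasets, a hypothetical threshold of 0.9, and keeps statistics in the order they are listed. The values attached to the labels G, P and c below are made up for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long value lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def drop_correlated(features, threshold=0.9):
    """Keep a statistic only if |r| with every statistic kept so far is <= threshold."""
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


if __name__ == "__main__":
    # Hypothetical values of three statistics over four networks.
    features = {
        "G": [0.2, 0.4, 0.6, 0.8],
        "P": [0.42, 0.35, 0.28, 0.20],
        "c": [0.3, 0.05, 0.4, 0.1],
    }
    print(drop_correlated(features))
```

The result depends on the listing order, so a statistic that is cheap to compute should be listed before an expensive one it correlates with.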

Thank You
• What We Want:
– More datasets, in particular more diverse categories!
– More statistics: both ideas and code
• Contribute
– konect.math.fundp.ac.be (temporary URL!)
– Ask me about Stu: our build tool for doing all of this
https://github.com/kunegis/konect-toolbox
https://github.com/kunegis/konect-analysis
https://github.com/kunegis/konect-extr
https://github.com/kunegis/konect-handbook
https://github.com/kunegis/konect-www
https://github.com/kunegis/stu
For more news about KONECT: follow @KONECTproject
Jérôme Kunegis <jerome.kunegis@unamur.be>
