Introduction Methodology Experiments Conclusions
Multimodal graph-based analysis over the DBLP
repository: critical discoveries and hypotheses
Gabriel Perri Gimenes, Hugo Gualdron, Jose F Rodrigues Jr 1
Mario Gazziro 2
1University of Sao Paulo 2Fed. University of Santo Andre
Av Trab Sao-carlense, 400 Av dos Estados, 500
Sao Carlos, SP, Brazil - 13566-590 Santo Andre, SP, Brazil - 09210-580
{ggimenes,gualdron,junio}@icmc.usp.br mario.gazziro@ufabc.edu.br
This work has financial support from Fapesp (2013/10026-7)
http://www.icmc.usp.br/pessoas/junio/Site/index.htm
The 30th ACM/SIGAPP Symposium On Applied Computing, 2015 1/21
Summary
1 Introduction
2 Methodology
3 Experiments
4 Conclusions
Introduction
High demand for information about the behavior of scientists: authors, editors, funding agencies and society
Combining analytical techniques - multimodal approach
Problem
Finding non-evident facts about DBLP is a non-trivial task
Single-technique approaches - limited analytical potential
Systematic process - can be applied to similar data from other domains
Hypothesis
The use of multiple analytical techniques, through a well-defined
process, is capable of revealing important aspects of the scientific
community in computer science
Materials
Cardinality of the entities extracted from the DBLP XML data
Entity          Number
Authors         1,060,221
Articles        1,801,576
Events             14,654
Publications        4,262
Data migration
Semi-structured format ⇒ Relational model
Migration requires purpose-built software
Definition of the entity-relationship model [figure: entity-relationship diagram]
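The migration step can be illustrated with a minimal sketch. It assumes the usual DBLP record layout (`article` and `inproceedings` elements with a `key` attribute and `author`, `title`, `year` children). The two-table schema and the `migrate` function are hypothetical stand-ins, since the deck's actual entity-relationship model and migration software are not shown here.

```python
import io
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical schema for illustration only; not the authors' model.
SCHEMA = """
CREATE TABLE article (id INTEGER PRIMARY KEY, dblp_key TEXT, title TEXT, year INTEGER);
CREATE TABLE authorship (article_id INTEGER, author TEXT);
"""

def migrate(xml_source, conn):
    """Stream publication records from DBLP-style XML into relational tables."""
    conn.executescript(SCHEMA)
    for _event, elem in ET.iterparse(xml_source, events=("end",)):
        if elem.tag not in ("article", "inproceedings"):
            continue
        year = elem.findtext("year")
        cur = conn.execute(
            "INSERT INTO article (dblp_key, title, year) VALUES (?, ?, ?)",
            (elem.get("key"), elem.findtext("title"), int(year) if year else None),
        )
        conn.executemany(
            "INSERT INTO authorship (article_id, author) VALUES (?, ?)",
            [(cur.lastrowid, a.text) for a in elem.findall("author")],
        )
        elem.clear()  # release the finished record so memory stays bounded
    conn.commit()

# Tiny synthetic sample, not real DBLP data.
SAMPLE = b"""<dblp>
<article key="journals/x/1"><author>Ana</author><author>Bo</author>
<title>T1</title><year>2001</year></article>
<inproceedings key="conf/y/2"><author>Ana</author>
<title>T2</title><year>2003</year></inproceedings>
</dblp>"""

conn = sqlite3.connect(":memory:")
migrate(io.BytesIO(SAMPLE), conn)
n_articles = conn.execute("SELECT COUNT(*) FROM article").fetchone()[0]
n_authorships = conn.execute("SELECT COUNT(*) FROM authorship").fetchone()[0]
```

Streaming with `iterparse` is chosen because the full dataset holds millions of records. The sketch does not handle the character entities that a real DBLP file may declare through its DTD.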
Extracted relationships
Relationship    Description
Co-authorship   Authors that published an article together.
Co-edition      Authors that appear as editors in the same event or journal.
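As a rough sketch of how the co-authorship relationship could be materialized, the code below links every pair of authors that share an article. The function name `coauthorship_graph` and the adjacency-set representation are assumptions made for illustration. The same pairing would apply to co-edition if editor lists per event or journal were used instead.

```python
from itertools import combinations

def coauthorship_graph(articles):
    """Map each author to the set of authors sharing at least one article."""
    graph = {}
    for authors in articles:
        unique = sorted(set(authors))
        for author in unique:
            graph.setdefault(author, set())  # single-author papers still add a node
        for a, b in combinations(unique, 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph

# Synthetic author lists, one list per article.
articles = [["Ana", "Bo", "Cy"], ["Ana", "Dee"], ["Eli"]]
graph = coauthorship_graph(articles)
```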
Multimodal Analysis - WCC
Weakly-connected components distribution - Co-authorship
13% small components with up to 30 nodes
Giant component with 87% of the authors
44,000 co-authorship sub-networks - occasional researchers, industry white papers
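A minimal sketch of the component analysis follows, assuming the co-authorship network is stored as an undirected adjacency map, in which case weakly-connected components are simply connected components. The toy graph and the name `component_sizes` are illustrative and do not come from the deck.

```python
from collections import Counter, deque

def component_sizes(graph):
    """Return connected-component sizes, largest first, via breadth-first search."""
    seen = set()
    sizes = []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            node = queue.popleft()
            size += 1
            for neighbor in graph[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        sizes.append(size)
    return sorted(sizes, reverse=True)

# Toy undirected graph: one 3-node component, one pair, one isolated author.
graph = {
    "a": {"b"}, "b": {"a", "c"}, "c": {"b"},
    "d": {"e"}, "e": {"d"},
    "f": set(),
}
sizes = component_sizes(graph)
giant_share = sizes[0] / sum(sizes)   # fraction of nodes in the largest component
size_distribution = Counter(sizes)    # component size -> number of components
```

On the real network, `giant_share` would correspond to the 87% figure reported on the slide and `size_distribution` to the plotted distribution.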
Multimodal Analysis - ACC
Node degree × average clustering coefficient - Co-authorship
High coefficient values are found in nodes with degree < 10
Coefficient value decreases as the node degree increases - ACC ∝ degree^{-1.06}
Authors tend to collaborate with the co-authors of their co-authors - triangles
Young authors vs. older authors
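The degree versus average clustering coefficient curve can be sketched as follows, using the standard local clustering coefficient (the fraction of a node's neighbor pairs that are themselves connected). The function names and the toy graph are assumptions for illustration only.

```python
from collections import defaultdict
from itertools import combinations

def local_clustering(graph, node):
    """Fraction of neighbor pairs of `node` that are connected to each other."""
    neighbors = graph[node]
    k = len(neighbors)
    if k < 2:
        return 0.0
    links = sum(1 for u, v in combinations(neighbors, 2) if v in graph[u])
    return 2.0 * links / (k * (k - 1))

def clustering_by_degree(graph):
    """Average the local clustering coefficient over all nodes of equal degree."""
    buckets = defaultdict(list)
    for node in graph:
        buckets[len(graph[node])].append(local_clustering(graph, node))
    return {k: sum(vals) / len(vals) for k, vals in sorted(buckets.items())}

# Toy graph: triangle a-b-c, with c also linked to d and e.
graph = {
    "a": {"b", "c"}, "b": {"a", "c"},
    "c": {"a", "b", "d", "e"},
    "d": {"c"}, "e": {"c"},
}
acc = clustering_by_degree(graph)
```

Even in this toy graph the degree-4 hub has a lower coefficient than the degree-2 nodes, which is the direction of the trend the slide reports.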
Multimodal Analysis - Densification
Degree distribution - Co-authorship
As new authors appear, new edges also appear - e(t) ∝ n(t)^{1.47} - densification
Rapid growth in edges vs. publication of elaborate articles
Master's and Ph.D. programs as regular courses
Funding agencies - numbers
More authors per paper
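One common way to estimate a densification exponent is an ordinary least-squares fit of log e(t) against log n(t). The sketch below assumes that approach, since the deck does not state how the 1.47 exponent was obtained. The node and edge counts are synthetic, generated from the reported exponent purely to exercise the fit.

```python
import math

def fit_power_law_exponent(nodes, edges):
    """Slope of the least-squares line through (log n, log e)."""
    xs = [math.log(n) for n in nodes]
    ys = [math.log(e) for e in edges]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Synthetic yearly snapshots (not DBLP data): e = n ** 1.47 exactly.
nodes = [100, 1000, 10000, 100000]
edges = [n ** 1.47 for n in nodes]
alpha = fit_power_law_exponent(nodes, edges)
```

An exponent above 1 means edges grow faster than nodes, so the average degree rises over time.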
Multimodal Analysis - Diameter
Effective diameter evolution - Co-edition
Peaked near 1995 - beginning of a shrink period
Before that - new editors/publication vehicles vs. after that - same editors/same vehicles
Densification period: more new edges than new nodes - editorial committees rotate among the same members
Editor: experience and expertise - limitations for new researchers
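The effective diameter is often defined as the smallest number of hops within which at least 90% of connected node pairs can reach each other. The sketch below assumes that definition, since the deck does not spell one out, and uses exact all-pairs breadth-first search, which is only practical on small graphs. The function names are illustrative.

```python
from collections import Counter, deque

def hop_counts(graph):
    """Count ordered reachable pairs by shortest-path length (BFS from each node)."""
    counts = Counter()
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbor in graph[node]:
                if neighbor not in dist:
                    dist[neighbor] = dist[node] + 1
                    counts[dist[neighbor]] += 1
                    queue.append(neighbor)
    return counts

def effective_diameter(graph, quantile=0.9):
    """Smallest hop count that covers `quantile` of all reachable pairs."""
    counts = hop_counts(graph)
    total = sum(counts.values())
    if total == 0:
        return 0
    reached = 0
    for hops in sorted(counts):
        reached += counts[hops]
        if reached >= quantile * total:
            return hops

# Toy path graph with six nodes: full diameter 5.
names = ["a", "b", "c", "d", "e", "f"]
graph = {n: set() for n in names}
for u, v in zip(names, names[1:]):
    graph[u].add(v)
    graph[v].add(u)
eff = effective_diameter(graph)
```

Computing this value on one network snapshot per year would give the evolution curve discussed on the slide. For a network the size of DBLP, sampling or an approximate neighborhood function would likely be needed in place of exact BFS.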
Multimodal Analysis - Predictability
Predictability analysis - Co-authoring
Can we predict new interactions in the DBLP network?
Extraction of topological features → supervised learning
Figure: Results - Interval G[1995, 2005], G[2006, 2007]
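The deck does not list which topological features were extracted, so the sketch below uses four that are widely used in link prediction (common neighbors, Jaccard coefficient, Adamic-Adar, preferential attachment) as an assumption. In a setup like the one suggested by the figure caption, features would be computed on the earlier interval graph and labeled by whether the pair co-authors in the later interval. The classifier itself is omitted.

```python
import math

def pair_features(graph, u, v):
    """Topological features for a candidate edge (u, v) in an undirected graph."""
    nu, nv = graph[u], graph[v]
    common = nu & nv
    union = nu | nv
    return {
        "common_neighbors": len(common),
        "jaccard": len(common) / len(union) if union else 0.0,
        # A shared neighbor has degree >= 2, so log() is positive; guard kept for safety.
        "adamic_adar": sum(
            1.0 / math.log(len(graph[w])) for w in common if len(graph[w]) > 1
        ),
        "preferential_attachment": len(nu) * len(nv),
    }

# Toy training-interval graph; a and b are not yet co-authors.
graph = {
    "a": {"c", "d"},
    "b": {"c", "d", "e"},
    "c": {"a", "b"},
    "d": {"a", "b", "e"},
    "e": {"b", "d"},
}
features = pair_features(graph, "a", "b")
```

Each non-adjacent pair yields one feature vector, and any standard supervised learner could then be trained on those vectors.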
Multimodal Analysis - Counting and algebraic analysis
Counting - Bipartite author-article network with timestamps
Accomplishment: number of years with at least one
publication
Silence: number of consecutive years with no publications
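The two counts can be sketched from an author's list of publication years. The slide's definition of silence is brief, so this sketch assumes it means the longest run of consecutive years without a publication between the first and last active year. Other readings, such as years elapsed since the last publication, are equally possible.

```python
def accomplishment(years):
    """Number of distinct years with at least one publication."""
    return len(set(years))

def silence(years):
    """Longest run of consecutive publication-free years between active years.

    Assumption: the deck may intend a different reading of 'silence'.
    """
    active = sorted(set(years))
    gaps = [later - earlier - 1 for earlier, later in zip(active, active[1:])]
    return max(gaps, default=0)

# Synthetic publication years for one author (duplicates = several papers in a year).
years = [2001, 2001, 2002, 2006, 2007]
acc = accomplishment(years)
sil = silence(years)
```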
Proposed metric
$\mathrm{Importance} = \frac{1}{\sqrt{\mathrm{silence} + 1}} \cdot \log(\mathrm{Accomplishment})$
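A small worked sketch of the proposed metric follows, reading it as log(Accomplishment) divided by the square root of (silence + 1). The logarithm base is not given in the deck, so the natural logarithm is assumed here.

```python
import math

def importance(accomplishment, silence):
    """Importance = log(accomplishment) / sqrt(silence + 1); natural log assumed."""
    return math.log(accomplishment) / math.sqrt(silence + 1)

# Same number of active years, different silence: longer silence lowers the score.
steady = importance(accomplishment=4, silence=0)
interrupted = importance(accomplishment=4, silence=3)
single_year = importance(accomplishment=1, silence=0)
```

The logarithm dampens the effect of long careers, while the square-root term penalizes gaps. An author active in only one year scores zero under this reading.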
Conclusions
Well-defined analytical process - combination of multiple
techniques
Non-trivial extraction of information from DBLP
Multi-perspective interpretations about the past and future of
the academic community in computer science
Application in the decision-making process of funding agencies and academic personnel
Thanks!
Questions?