Aristotle University of Thessaloniki
School of Computer Science - Master Studies - Spring Semester
Web Data Mining course - Instructor: Vakali Athena
Kouroupetroglou Praxitelis Nikolaos, Master Student
Linked data and Graph properties
Overview
● A Graph Analysis of the Linked Data Cloud (2009)
Open Linked Data Graph Analysis (1)
● Open Linked Data Cloud 2009 analysis
● Graph Construction
○ G = (V, E) model, directed graph
● General Statistics
○ #edges: 274, #vertices: 86
○ diameter: 10, avg path length: 3.916
● Degrees - Dataset references
○ In-degree - incoming references
■ Top Nodes: DBpedia (14), DBLP (13), ACM (10), GeneID (10), Geonames (10)
○ Out-degree - outgoing references
■ Top Nodes: DBpedia (17), DBLP (14), ACM (10), CiteSeer (9), EPrints (9)
● Degree Distribution
○ on a log-log plot, a power-law distribution fits with exponent a = 1.496
● Connected Components
○ Without SCC (strongly connected components), 31 WCC (weakly connected components)
○ Top WCC sizes: DBpedia, DBLP
● Degree Correlations
○ In-degree - out-degree correlation, statistical significance
○ Degree assortativity: data sets tend to connect to data sets with differing degrees.
● PageRank Data Centrality
○ Top central datasets: DBLP Berlin, DBLP Hannover, DBpedia, KEGG, UniProt, GeneID
● Communities
○ Communities based on dataset content
○ Datasets with similar content exist in the same structural area of the graph.
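The statistics on this slide (in-degree and out-degree, weakly connected components, diameter, average path length) can be computed on any directed graph G = (V, E). The sketch below is a minimal illustration in plain Python, not the procedure used in [1]. The toy edge list and all function names are assumptions made for the example, and the data is not the real 86-vertex cloud. Averaging path lengths over reachable ordered pairs only is also an assumption; other conventions exist.

```python
from collections import defaultdict, deque

# Hypothetical toy edge list (source dataset -> target dataset).
# Illustrative only; these are NOT the links of the 2009 Linked Data cloud.
EDGES = [("A", "B"), ("B", "C"), ("C", "A"), ("C", "D"), ("E", "F")]


def build_graph(edges):
    """Return (vertices, successors, predecessors) for a directed edge list."""
    succ, pred = defaultdict(set), defaultdict(set)
    vertices = set()
    for u, v in edges:
        succ[u].add(v)
        pred[v].add(u)
        vertices.update((u, v))
    return vertices, succ, pred


def degrees(vertices, succ, pred):
    """Map each vertex to its (in-degree, out-degree) pair."""
    return {v: (len(pred[v]), len(succ[v])) for v in vertices}


def weakly_connected_components(vertices, succ, pred):
    """Group vertices that are connected once edge direction is ignored."""
    seen, components = set(), []
    for start in sorted(vertices):
        if start in seen:
            continue
        seen.add(start)
        component, queue = set(), deque([start])
        while queue:
            node = queue.popleft()
            component.add(node)
            # Follow edges in both directions (weak connectivity).
            for nxt in succ[node] | pred[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        components.append(component)
    return components


def path_stats(vertices, succ):
    """Return (diameter, average length) over finite directed shortest paths."""
    lengths = []
    for source in vertices:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search along edge direction
            node = queue.popleft()
            for nxt in succ[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        lengths.extend(d for d in dist.values() if d > 0)
    return max(lengths), sum(lengths) / len(lengths)


vertices, succ, pred = build_graph(EDGES)
print(degrees(vertices, succ, pred)["C"])                      # (1, 2)
print(len(weakly_connected_components(vertices, succ, pred)))  # 2
print(path_stats(vertices, succ))                              # (3, 1.6)
```

On the toy graph, vertex C has one incoming and two outgoing links, the vertices fall into two weakly connected components, and the longest shortest path (A to D) has length 3.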
Open Linked Data Graph Analysis (2)
Figure from Source [1]
● Visualization:
○ Vertex: Data Sets
○ Edge: Dataset Links
○ Vertex color: denotes the structural community
● Conclusions:
○ Open Linked Data built on RDF technologies supports data reuse and distribution, leading toward a Web of Data
○ Graphs are becoming a flexible representational data structure, in contrast to RDBMS tables
References
1. Rodriguez, M. A. (2009). A graph analysis of the linked data cloud. arXiv preprint arXiv:0903.0194.
Linked data and Graph properties
Thank you for listening
Kouroupetroglou Praxitelis Nikolaos
