Metrics14 - ASIS&T SIGMET Workshop, Seattle, 5th November, 2014 
Exploring data quality and retrieval 
strategies for Mendeley reader counts 
Zohreh Zahedi1, Stefanie Haustein2 & Timothy D. Bowman2 
z.zahedi.2@cwts.leidenuniv.nl stefanie.haustein@umontreal.ca tim.bowman@gmail.com 
@zohrehzahedi @stefhaustein @timothydbowman 
1Leiden University, The Netherlands 
2Université de Montréal, Canada
• online reference management tool 
• usage statistics, available via open API
• 2.8 million users, 275,860 groups, 
535 million user documents (02/2014) 
• 68 million unique publications (08/2012; 
281 million user documents) 
Mendeley statistics based on monthly user counts from 10/2010 to 02/2014 on the Mendeley website accessed through the Internet Archive
Research Objectives 
• metadata quality and its effect on retrieval
Research Objectives 
• fluctuation in Mendeley coverage and readership 
counts over time and through different retrieval 
strategies (Bar-Ilan, 2014) 
• altmetric studies and tools use different retrieval 
strategies 
• DOI API search 
• title search (e.g., Webometric Analyst) 
→ lack of systematic study to determine the effect of 
retrieval strategy
Research Objectives 
• analyze metadata quality of Mendeley entries 
systematically 
• test completeness and accuracy of relevant 
metadata fields 
• identify and quantify error types 
• analyze differences between retrieval strategies 
→ determine best retrieval strategy for collecting 
Mendeley reader counts
Research Questions 
• How accurate is the metadata on Mendeley for a 
random sample of publications? 
• To what extent do results differ between: 
• manual title search in online catalog 
• API search via DOI 
• What are the most frequent error types in the 
bibliographic data on Mendeley? 
• What retrieval strategy provides the most 
accurate and complete results for the sampled 
publications?
Data set and Method 
• random sample of 2012 WoS publications: 
384 of 1,873,759 documents 
• manual title search via Mendeley online catalog 
n=384 
• simultaneous DOI search via Mendeley API 
n=264 (-31%; 120 documents without DOI) 
• comparison of all relevant metadata 
• Author 
• DOI 
• ISSN 
• Issue 
• Pages 
• Source 
• Title 
• Volume 
• Year
Found by manual title search
Found by API DOI search
Results: overview 
• API DOI search (n=264): found 91.3% of searched documents, 
2 false positives 
• manual title search (n=384): found 47.4% of searched documents
Results: overview 
                                    documents         reader counts
                                    N      %          N       %        +
identical reader counts             103    36.4       975     41.1     0
  identical                         102    36.0       975     41.1     0
  identical, both 0                 1      0.4        0       0        0
API higher                          111    39.2       752     31.7     718
  API higher                        10     3.5        204     8.6      170
  API higher, manual not found      80     28.3       548     23.1     548
  API 0, manual not found           21     7.4        0       0        0
manual higher                       69     24.4       644     27.2     563
  manual higher                     21     7.4        379     16.0     298
  manual higher, API not found      40     14.1       242     10.2     242
  manual higher, API 0              6      2.1        23      1.0      23
  manual 0, API not found           2      0.7        0       0        0
all documents                       283    100.0      2,371   100.0    1,281
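The row labels of the overview table can be read as a classification rule applied to each document's pair of reader counts. The following is a minimal sketch of that rule, not the authors' procedure: the function name `classify` and the convention that `None` means "not found by this strategy" are assumptions made for illustration.

```python
from collections import Counter
from typing import Optional


def classify(api: Optional[int], manual: Optional[int]) -> str:
    """Map one document's reader counts to a row label of the overview table.

    None means the strategy did not find the document (assumed convention).
    """
    if api is None and manual is None:
        raise ValueError("document was found by neither strategy")
    if api is None:
        return "manual higher, API not found" if manual > 0 else "manual 0, API not found"
    if manual is None:
        return "API higher, manual not found" if api > 0 else "API 0, manual not found"
    if api == manual:
        return "identical, both 0" if api == 0 else "identical"
    if api > manual:
        return "API higher"
    return "manual higher, API 0" if api == 0 else "manual higher"


# Hypothetical (API count, manual count) pairs, one per table row
pairs = [(12, 12), (0, 0), (34, 30), (5, None), (0, None),
         (3, 9), (None, 4), (0, 2), (None, 0)]
row_counts = Counter(classify(api, manual) for api, manual in pairs)
```

Summing `row_counts` over the three groups of rows would give the "identical", "API higher" and "manual higher" totals shown in the table.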
Results: incorrect metadata 
            Title search (n=182)      DOI search (n=241)
Field       correct    incorrect      correct    incorrect
Author      93%        7%             94%        6%
DOI         92%        4%             100%*      0%*
ISSN        87%        13%            32%        68%
Issue       90%        6%             83%        10%
Pages       80%        14%            83%        10%
Source      73%        27%            76%        24%
Title       85%        15%            82%        18%
Volume      94%        6%             91%        7%
Year        99%        1%             99%        1%
*the API DOI search retrieved two false positives which are not included in this analysis
Results: error types 
Title search DOI search
Conclusions 
• errors in fields commonly used for matching 
(title search / DOI search): 
• Title: 15/18% 
• First author: 7/6% 
• Year: 1/1% 
• source (27/24%), ISSN (13/68%), volume (6/7%), 
issue (6/10%), page number (14/10%) should not 
be used for matching 
• special characters produce most errors; removing 
them would resolve a large share of errors: 
• Title: 81/84% 
• First author: 67/73%
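One way to remove special characters before comparing titles or author names is sketched below. This is an illustrative assumption rather than the normalization used in the study: the helper names `normalize` and `same_record` are hypothetical, and the rule (strip accents, lowercase, replace everything except ASCII letters and digits with a space) is only one of several reasonable choices.

```python
import re
import unicodedata


def normalize(text: str) -> str:
    """Reduce a title or name to lowercase ASCII letters, digits and single spaces."""
    # Split accented letters into base letter + combining mark (é -> e + accent)
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop everything outside ASCII; note this also discards non-Latin scripts
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    # Replace each run of punctuation or whitespace with a single space
    return re.sub(r"[^a-z0-9]+", " ", ascii_only.lower()).strip()


def same_record(a: str, b: str) -> bool:
    """Treat two strings as matching if they are equal after normalization."""
    return normalize(a) == normalize(b)


wos_title = "Identifying Twitter audiences: Who is tweeting about scientific papers?"
mendeley_title = "Identifying twitter audiences - who is tweeting about scientific papers"
titles_match = same_record(wos_title, mendeley_title)
authors_match = same_record("Larivière", "Lariviere")
```

Both comparisons succeed here although the raw strings differ in case, punctuation and diacritics, which are the kinds of differences the slide attributes most errors to.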
Conclusions 
• results of retrieval strategies: 
• manual title: 182 (64%) documents & 1,653 readers 
• API DOI: 241 (85%) & 1,808 
• combined: 283 & 2,371 (max) / 2,486 (sum) 
• DOI search found 101 (36%) additional documents, 
but: 
• could not be applied to 120 (31%) documents w/out DOI 
• did not retrieve 42 (15%) documents found by title 
search 
• led to 2 (1%) false positives 
→ combination of DOI and title search w/out special 
characters
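The "combined (max)" figure above can be understood as taking, for each document, the higher of the two reader counts. A minimal sketch under that reading follows; the function `combine`, the document identifiers and the counts are all hypothetical and are not taken from the study's data.

```python
from typing import Dict


def combine(title_hits: Dict[str, int], doi_hits: Dict[str, int]) -> Dict[str, int]:
    """Merge reader counts from both strategies, keeping the higher count per document."""
    combined = dict(title_hits)
    for doc_id, readers in doi_hits.items():
        combined[doc_id] = max(readers, combined.get(doc_id, 0))
    return combined


# Hypothetical reader counts keyed by an arbitrary document identifier
title_hits = {"doc-a": 10, "doc-b": 4}
doi_hits = {"doc-b": 7, "doc-c": 0}

combined = combine(title_hits, doi_hits)
total_readers = sum(combined.values())
# Documents that only the DOI search found
doi_only = sorted(set(doi_hits) - set(title_hits))
```

In this toy input the merged set has three documents and 17 readers, and `doi_only` lists the one document that the title search missed.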
Thank you for your attention!
