Using Bibliometrics To Make Sense Of Research Proposals
Christina K. Pikas, BS, MLS, PhD
Christina.Pikas@jhuapl.edu
ABSTRACT
Grantmaking organizations use a number of different methods to
make sense of, evaluate, and select proposals for further funding.
A technical team reviews each proposal to determine if it is
responsive to the call, if it proposes novel work that is likely to be
successful, if the team is likely to be able to accomplish what they
have proposed, and if the resource needs are reasonable. The
funder may want to know what approaches or schools of thought
are represented in the proposals. One way to show this is to
extract the citations from each proposal's bibliography and perform
a bibliometric analysis. This poster describes approaches to
extracting the bibliography and suggests analyses.
BACKGROUND
Funders may issue calls or solicitations that describe a general
problem to be solved or phenomenon to be studied without
providing guidance or requirements for how performers will
address it.
Submitters describe their proposed approach and provide
evidence that they are likely to be successful in narrative text with
citations to relevant literature. Individual proposals may cite
anywhere from five to 200 sources. Proposals are often delivered
in PDF format.
We wanted to know if there were commonalities of approaches
that could be ascertained from mapping the citations.
OBJECTIVES
• Develop a method to reliably extract citations from text
• Compile a bibliography of all articles cited
• Identify approaches
• Group similar proposers
METHODS
Extracting Text
• Used Adobe Acrobat Pro* to save as Word
• Manually pasted bibliography sections to text editor
Extracting Citations
• Used Excel to reduce each citation to an AuthorYear identifier
• Used UCInet to build network
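The reduction to AuthorYear identifiers was done in Excel; the following is only a minimal Python sketch of the same idea, assuming each citation begins with the first author's surname (optionally after a "[1]" or "1." label) and contains a four-digit year. The function names `author_year` and `edge_list` and both regular expressions are illustrative assumptions, not part of the poster's method.

```python
import re
from typing import Dict, List, Optional, Tuple

# Assumed pattern: optional "[12]" or "12." label, then a capitalized surname.
SURNAME = re.compile(r"^\s*(?:\[\d+\]|\d+\.)?\s*([A-Z][A-Za-z'\-]+)")
# Assumed pattern: first standalone year between 1900 and 2099.
# Four-digit page or volume numbers in that range would be misread as years.
YEAR = re.compile(r"\b(?:19|20)\d{2}\b")


def author_year(citation: str) -> Optional[str]:
    """Return an identifier such as 'Doe2010', or None if parsing fails."""
    surname = SURNAME.search(citation)
    year = YEAR.search(citation)
    if surname is None or year is None:
        return None
    return surname.group(1) + year.group(0)


def edge_list(bibliographies: Dict[str, List[str]]) -> List[Tuple[str, str]]:
    """Build sorted (proposer, AuthorYear) pairs, skipping unparsed entries
    and collapsing duplicates within a proposal."""
    edges = set()
    for proposer, citations in bibliographies.items():
        for citation in citations:
            key = author_year(citation)
            if key is not None:
                edges.add((proposer, key))
    return sorted(edges)
```

Entries that start with initials rather than a surname return None under these assumptions and would still need manual cleanup. An AuthorYear key also cannot tell apart two works by the same first author in the same year.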
Network Analysis
• R (igraph) for community detection
• NetDraw for visualization and analysis
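As a rough illustration of how a two-mode (proposer by citation) edge list can be turned into a proposer by proposer network, the sketch below weights each pair of proposers by the number of citations they share (bibliographic coupling). The poster does not state how its proposer network was derived, so this weighting and the name `project_proposers` are assumptions.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, Iterable, Tuple


def project_proposers(
    edges: Iterable[Tuple[str, str]]
) -> Dict[Tuple[str, str], int]:
    """Map each proposer pair (a, b), a < b, to their count of shared citations."""
    # Index the two-mode edges by citation: AuthorYear -> proposers citing it.
    citers = defaultdict(set)
    for proposer, citation in edges:
        citers[citation].add(proposer)

    # Every citation adds one unit of weight to each pair of proposers citing it.
    weights = defaultdict(int)
    for proposers in citers.values():
        for a, b in combinations(sorted(proposers), 2):
            weights[(a, b)] += 1
    return dict(weights)
```

Proposers that share no citations with anyone do not appear in the result, which is one way the outlier proposals would show up.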
RESULTS
Inspection of the graph provided insights about the different general
approaches:
• Analysis revealed several highly central clusters related to key
bodies of work for the research problem
• Some proposals were outliers because they tapped into novel
areas of the literature
• No useful communities were evident using the various
community detection techniques
• The schools of thought identified by subject matter experts
were cited by most proposals, in some cases in order to contrast
them with the proposer's chosen approach
CONCLUSIONS
• Although not completely successful, the approach is promising
and provided valuable insights
• Using a database API to retrieve a parsed citation may be more
effective than parsing from text.
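The poster does not name a database. As one possible illustration, the sketch below assumes the public Crossref REST API, whose `/works` endpoint accepts a free-text citation in the `query.bibliographic` parameter and returns records with `author` and `issued` fields. The helper names are hypothetical, and the HTTP request itself is left out.

```python
from typing import Optional
from urllib.parse import urlencode

# Assumed service; the poster does not say which database API was meant.
CROSSREF_WORKS = "https://api.crossref.org/works"


def lookup_url(citation: str, rows: int = 1) -> str:
    """Build a Crossref query URL for one free-text citation string."""
    query = urlencode({"query.bibliographic": citation, "rows": rows})
    return CROSSREF_WORKS + "?" + query


def author_year_from_item(item: dict) -> Optional[str]:
    """Derive an AuthorYear key from one work record of a Crossref response."""
    authors = item.get("author") or []
    date_parts = (item.get("issued") or {}).get("date-parts") or [[None]]
    family = authors[0].get("family") if authors else None
    year = date_parts[0][0] if date_parts[0] else None
    if not family or not year:
        return None
    return "{}{}".format(family, year)
```

The best-matching record is not guaranteed to be the cited work, so matches would still need spot checks before the keys are used in the network.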
Workflow: OCR → Extract Text → Identify Citations → Graph → Analyze
* Note: Product names are provided for reference.
No endorsement is implied.
UNSUCCESSFUL METHODS
Extracting Text
• Programmatically – too much variation in content
Parsing Citations
• ParsCit
• FreeCite
• ParaCite / ParaTools
• AnyStyle.io
Citation x Proposer Network
Sized by betweenness centrality. Citations are blue boxes.
Proposer x Proposer Network
Sized by degree centrality.
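The two sizing measures named in the captions can be sketched in plain Python as follows. This is not the NetDraw or UCInet computation. It assumes an undirected, unweighted graph stored as a symmetric adjacency dictionary in which every node appears as a key, and it leaves betweenness unnormalized (Brandes' algorithm).

```python
from collections import deque
from typing import Dict, Hashable, Set

Graph = Dict[Hashable, Set[Hashable]]  # node -> set of neighbors (symmetric)


def degree_centrality(graph: Graph) -> Dict[Hashable, float]:
    """Fraction of the other nodes to which each node is directly tied."""
    n = len(graph)
    if n < 2:
        return {v: 0.0 for v in graph}
    return {v: len(nbrs) / (n - 1) for v, nbrs in graph.items()}


def betweenness_centrality(graph: Graph) -> Dict[Hashable, float]:
    """Unnormalized shortest-path betweenness via Brandes' algorithm."""
    score = {v: 0.0 for v in graph}
    for s in graph:
        order = []                          # nodes in order of BFS discovery
        preds = {v: [] for v in graph}      # predecessors on shortest paths
        sigma = {v: 0 for v in graph}       # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if w not in dist:           # first time w is reached
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:  # v lies on a shortest path to w
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies from the farthest nodes back toward s.
        delta = {v: 0.0 for v in graph}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: b / 2 for v, b in score.items()}
```

For the Citation x Proposer network, proposers and citations would both be nodes of the same graph. For the Proposer x Proposer network, only proposers are nodes.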