How to Execute A Research Paper

How To Execute
The Research Paper
Anita de Waard
Disruptive Technologies Director
Elsevier Labs
Pittsburgh, April 2012

Outline
• Ten people who are changing scholarly publishing:
– New forms
– Workflow/data integration
– New models of business/attribution
• So what does this mean?
• Some projects to help us move towards these new
models:
– Claim-evidence networks
– Workflow/data integration and executable papers
– Creating a community of practice

Theme 1: New forms of publication
• Main issue: the format of the scientific paper comes
from a time when our communication was paper-centric
• Solution: Rethink the unit and form of the scholarly
publication from the ground (i.e., the experiment) up
• Three projects doing that:

Steve Pettifer, U Manchester
• Utopia: ‘Everything you always wanted to do
with a PDF….’: interactive, sharable
• Working on integration with DOMEO to
add/share annotations
• Final goal: don’t ‘reconstruct the cow from a
hamburger’: include workflows and models

Gully Burns, USC ISI
• KEfED: model of research as an activity
• Map out dependent/independent variables
within an experiment and model them
• Start: appendix to paper; later: precede paper,
graft paper on top of model.

Tim Clark, Harvard/MGH
[Figure: RDF graph for the claim <http://tinyurl.com/4h2am3a>, which has rdf:type swande:Claim and dct:title “Intramembranous Aβ behaves as chaperones of other membrane proteins”. The edges pav:contributedBy and swanrel:referencesAsSupportiveEvidence connect it to <http://example.info/person/1> and <http://example.info/citation/1>; further nodes are labelled G1, G5 and G6.]
• Annotation ontology allows you to trace claims
• DOMEO offers an interface for both automated
entity markup and manual markup of
claim/evidence networks
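
As a rough illustration of tracing a claim to its evidence, the labels from the figure can be written as subject-predicate-object triples and queried. This is a minimal sketch in plain Python, not the Annotation Ontology or DOMEO API. The pairing of each predicate with its object is an assumption read off the figure labels, and `index_by_subject` and `supporting_evidence` are hypothetical helper names.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]

CLAIM = "http://tinyurl.com/4h2am3a"

# Assumed pairing of predicates and objects, read off the slide's figure labels.
TRIPLES: List[Triple] = [
    (CLAIM, "rdf:type", "swande:Claim"),
    (CLAIM, "dct:title",
     "Intramembranous Aβ behaves as chaperones of other membrane proteins"),
    (CLAIM, "pav:contributedBy", "http://example.info/person/1"),
    (CLAIM, "swanrel:referencesAsSupportiveEvidence",
     "http://example.info/citation/1"),
]


def index_by_subject(triples: List[Triple]) -> Dict[str, Dict[str, List[str]]]:
    """Group triples as subject -> predicate -> list of objects."""
    index: Dict[str, Dict[str, List[str]]] = {}
    for subject, predicate, obj in triples:
        index.setdefault(subject, {}).setdefault(predicate, []).append(obj)
    return index


def supporting_evidence(index: Dict[str, Dict[str, List[str]]],
                        claim: str) -> List[str]:
    """Return the citations a claim references as supportive evidence."""
    return index.get(claim, {}).get(
        "swanrel:referencesAsSupportiveEvidence", [])


if __name__ == "__main__":
    index = index_by_subject(TRIPLES)
    print(supporting_evidence(index, CLAIM))
```

A real claim-evidence network would hold many claims and citations in a triple store; the nested dictionary here only shows the lookup pattern.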

Theme 2: data and workflow integration
• Issues:
– Format of the research paper hard to integrate within a
scientific/clinical workflow
– Hard to reproduce/deduce: what methods were used and
what data was created for a piece of research, making
reproduction or even review difficult
• Some solutions for sharing workflows and data:

Dave De Roure, Oxford e-Research Centre
[Figure: a research object linking Workflow 16 and Workflow 13 to Results, Logs, Metadata, Paper and Slides through the relations ‘feeds into’, ‘produces’, ‘included in’ and ‘published in’; further labels: ‘Common pathways’ and ‘QTL’.]
• Research objects: consist of all
academic output, including:
- Papers
- Workflows
- Data
- Talks, lectures
- Blogs
• Move towards executable work:
- Execute periodically to validate
- Run automatically when data updates, by self or others!
- Notify researchers of new results
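
The ‘executable work’ bullets above can be sketched as a small object that re-runs its workflow only when the input data changes and notifies subscribers of new results. This is a minimal sketch under stated assumptions: `ExecutableResearchObject`, `refresh` and the toy workflow are hypothetical names, not part of any research-object tooling mentioned in the talk.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class ExecutableResearchObject:
    """Hypothetical research object that bundles a workflow with its state."""
    workflow: Callable[[bytes], str]
    subscribers: List[Callable[[str], None]] = field(default_factory=list)
    last_digest: Optional[str] = None
    last_result: Optional[str] = None

    def refresh(self, data: bytes) -> bool:
        """Re-run the workflow if the data changed; return True if it ran."""
        digest = hashlib.sha256(data).hexdigest()
        if digest == self.last_digest:
            return False  # data unchanged, nothing to re-execute
        result = self.workflow(data)
        self.last_digest = digest
        if result != self.last_result:
            self.last_result = result
            for notify in self.subscribers:
                notify(result)  # tell researchers about the new result
        return True


if __name__ == "__main__":
    inbox: List[str] = []
    ro = ExecutableResearchObject(
        workflow=lambda d: f"{len(d.split())} records",  # toy workflow
        subscribers=[inbox.append],
    )
    ro.refresh(b"a b c")      # first run
    ro.refresh(b"a b c")      # same data, skipped
    ro.refresh(b"a b c d")    # data updated, re-run and notify
    print(inbox)
```

Calling `refresh` on a schedule would cover the ‘execute periodically to validate’ case; calling it from a data-update hook would cover the ‘run automatically’ case.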

Phil Bourne, UCSD
• Big need: keep track of the data in my lab!
• Other need: know what I did/what other people
did. Yolanda Gil made a workflow representation;
it was hard to remember what we did…
• Need: better ways to record, share, archive what
we did.
• New role for the publisher >

Deborah McGuinness, RPI
• Future Web:
– ‘if everything is everywhere, how do we find
it/know what we want?’
– Internet, Web, Grid, Cloud, Semantic Grid
Middleware
• Xinformatics:
– Where X = geo, eco, econo…
– Linked Data to Semantics
• Semantic Foundations:
– Pushing the boundaries of
Semantic Web standards
– Ontology evolution

Theme 3: New Models for Access/Attribution
• Issues:
– User-created content, crowdsourcing means (scientific)
impact is measured very differently from the past
– Need new models for copyright/IP
– Citizen scientists participate as well
• Some efforts to address this:

Paul Groth, VU Amsterdam
Altmetrics: “the creation and study of
new metrics based on the Social Web
for analyzing and informing scholarship.”
Including:
- Downloads
- Where readers read
- Data citation
- Social network diffusion
- Slide reuse
- Peer review contributions
- YouTube views

Leslie Chan, U. Toronto Scarborough
• ElPub conference series, which focuses
on globally connecting information scientists
• Bioline International system: “a not-for-profit
scholarly publishing cooperative committed to
providing open access to quality research journals
published in developing countries”

John Wilbanks, Kauffman/CC
• As data becomes more accessible, need:
– raw metadata
– standards processes
– consensus processes
– document submission standards
– data archives
• Ways of governing access:
– Privacy vs. IP vs. policies
• Technology only helps so much…
• This is mostly a social/policy issue

Cameron Neylon, Cambridge
• Main arguments for Open Access:
– Citizen science is becoming more important
– Science changes when it is crowdsourced:
Tim Gowers: ‘This is to normal research as
driving is to pushing a car’
• Three principles:
– Scale and connectivity
– Reduced friction to access
– Demand-side filters

In summary, scientists are working on:
• Tools for knowledge…
– Visualisation (Steve Pettifer)
– Modeling (Gully Burns)
– Annotation (Tim Clark)
• Ways to link to
– Workflows (Dave De Roure)
– Lab data (Phil Bourne)
– Linked research data (Deborah McGuinness)
• And models for
– Attribution/credit (Paul Groth)
– Allowing new players to participate (Leslie Chan)
– Copyright/IP rights (John Wilbanks)
– Networked science (Cameron Neylon).

New roles for publishers and libraries
• Technically, there is no reason to publish in a
journal, or for that matter, to publish a paper at all:
• Perhaps a good blog post linked to workflows and
data with some validation from peers and good
download statistics might work just as well?
• Is publishing in journals mostly a habit?

“Publishers have been thinking we’re going out of
business for 20 years, what has suddenly changed?”
The internet! Not the technical web, but the social web….
‘The value of a […] network is proportional to the square of
the number of users of the system (n²)’ Metcalfe’s Law
1990s: Big Player
2000s: Medium Participant
2015: Irrelevant!

What do we need?
Internet of Things (Bleecker [1]):
Interact with ‘objects that blog’ or ‘Blogjects’, that:
• track where they are and where they’ve been;
• have histories of their encounters and experiences;
• have agency: an assertive voice on the social web [1]
Research Objects (Bechhofer et al. [2]):
Create semantically rich aggregations of resources,
that can possess some scientific intent or support
some research objective
Networked Knowledge (Neylon [3]):
If we care about taking advantage of the web and
internet for research then we must tackle the building
of scholarly communication networks.
These networks will have two critical characteristics:
scale and a lack of friction. [3]
[1] Bleecker, J. ‘A Manifesto for Networked Objects: Cohabiting with Pigeons, Arphids and Aibos in the Internet of Things’, http://nearfuturelaboratory.com/2006/02/26/a-manifesto-for-networked-objects/
[2] Bechhofer, S., De Roure, D., Gamble, M., Goble, C. and Buchan, I. (2010) ‘Research Objects: Towards Exchange and Reuse of Digital Knowledge’. In: The Future of the Web for Collaborative Science (FWCS 2010), April 2010, Raleigh, NC, USA. http://precedings.nature.com/documents/4626/version/1
[3] Neylon, C. ‘Network Enabled Research: Maximise scale and connectivity, minimise friction’, http://cameronneylon.net/blog/network-enabled-research/

Some examples of networked science:
• Galaxy Zoo: citizen science: classify galaxies in the
comfort of your own home – like Hanny!
• Tim Gowers, Polymath: “…the real contributors will be
the process owners and project leaders that are able to
provide horizontal leadership. To support this shift,
organizations will need to reward and recognize
horizontal contributions as much, if not more, than
hierarchical positions.”
• Mathoverflow: virtual network of mathemagicians
working collectively to answer big, small, clear and
fuzzy questions

Executable Papers
• E.g.:
http://www.vistrails.org/index.php/User:Tohline/CPM/Levels2and3

Wrapping a story around your data:
Concept developed with Ed Hovy, Phil Bourne,
Gully Burns and Cartic Ramakrishnan
1. Research: Each item in the system has metadata (including
provenance) and relations to other data items added to it.
2. Workflow: All data items created in the lab are added to a
(lab-owned) workflow system.
3. Authoring: A paper is written in an authoring tool which can pull
data with provenance from the workflow tool in the appropriate
representation into the document.
4. Editing and review: Once the co-authors agree, the paper is
‘exposed’ to the editors, who in turn expose it to reviewers.
Reports are stored in the authoring/editing system, the paper gets
updated, until it is validated.
5. Publishing and distribution: When a paper is published, a
collection of validated information is exposed to the world. It
remains connected to its related data item, and its heritage can
be traced.
6. User applications: distributed applications run on this
‘exposed data’ universe.
[Figure: data items, each tagged with metadata, feed a paper that cycles through Review, Edit and Revise; a box is labelled ‘Some other publisher’. Sample paper text: “Rats were subjected to two grueling tests (click on fig 2 to see underlying data). These results suggest that the neurological pain pro-”]
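
Steps 1, 2, 3 and 5 above can be sketched as a data item that carries provenance and a paper that keeps a link from each sentence back to the item behind it. This is a minimal sketch, assuming hypothetical names (`DataItem`, `Paper`, `add_sentence`, `trace`) and placeholder values; it is not the design of any authoring or workflow tool named in the talk.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class DataItem:
    """A lab data item with provenance metadata (steps 1 and 2)."""
    item_id: str
    summary: str
    provenance: str  # who or what produced the item


@dataclass
class Paper:
    """A paper whose sentences stay linked to the data items behind them."""
    title: str
    sentences: List[str] = field(default_factory=list)
    sources: Dict[int, DataItem] = field(default_factory=dict)

    def add_sentence(self, text: str, item: DataItem) -> int:
        """Step 3: pull a data item into the text and remember the link."""
        self.sentences.append(text)
        index = len(self.sentences) - 1
        self.sources[index] = item
        return index

    def trace(self, index: int) -> str:
        """Step 5: trace a published sentence back to its provenance."""
        item = self.sources[index]
        return f"{item.item_id}: {item.provenance}"


if __name__ == "__main__":
    # Placeholder values for illustration only.
    fig2 = DataItem("fig2-data", "placeholder measurement table",
                    "hypothetical run recorded in the lab workflow system")
    paper = Paper("Placeholder title")
    i = paper.add_sentence("Rats were subjected to two grueling tests", fig2)
    print(paper.trace(i))
```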

Creating claim-evidence networks:
• DOMEO: connect to ScienceDirect
• Rich Boyce’s drug-drug interactions: tracing
heritage of claims
• Foundation for that: linguistic markers for identifying
cited/own knowledge:

How a claim becomes a fact:
• Voorhoeve, 2006: “These miRNAs neutralize p53-mediated CDK
inhibition, possibly through direct inhibition of the expression of the
tumor suppressor LATS2.”
• Kloosterman and Plasterk, 2006: “In a genetic screen, miR-372 and
miR-373 were found to allow proliferation of primary human cells that
express oncogenic RAS and active p53, possibly by inhibiting the tumor
suppressor LATS2 (Voorhoeve et al., 2006).”
• Yabuta et al., 2007: “[On the other hand,] two miRNAs, miRNA-372 and
-373, function as potential novel oncogenes in testicular germ cell
tumors by inhibition of LATS2 expression, which suggests that Lats2 is an
important tumor suppressor (Voorhoeve et al., 2006).”
• Okada et al., 2011: “Two oncogenic miRNAs, miR-372 and miR-373,
directly inhibit the expression of Lats2, thereby allowing tumorigenic
growth in the presence of p53 (Voorhoeve et al., 2006).”
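
One crude way to look at the linguistic markers in this citation chain is to list the hedging words in each sentence. This is a minimal sketch: the cue list in `HEDGE_PATTERN` is an assumption chosen for illustration, not the marker set used in the work described here, and a word list cannot tell which proposition a hedge applies to.

```python
import re
from typing import Dict, List

# Assumed, illustrative cue list; real hedging detection needs far more than this.
HEDGE_PATTERN = re.compile(r"\b(?:possibly|potential|suggests?|may|might)\b")

QUOTES: Dict[str, str] = {
    "Voorhoeve, 2006": (
        "These miRNAs neutralize p53-mediated CDK inhibition, possibly "
        "through direct inhibition of the expression of the tumor "
        "suppressor LATS2."
    ),
    "Kloosterman and Plasterk, 2006": (
        "In a genetic screen, miR-372 and miR-373 were found to allow "
        "proliferation of primary human cells that express oncogenic RAS "
        "and active p53, possibly by inhibiting the tumor suppressor LATS2."
    ),
    "Yabuta et al., 2007": (
        "[On the other hand,] two miRNAs, miRNA-372 and -373, function as "
        "potential novel oncogenes in testicular germ cell tumors by "
        "inhibition of LATS2 expression, which suggests that Lats2 is an "
        "important tumor suppressor."
    ),
    "Okada et al., 2011": (
        "Two oncogenic miRNAs, miR-372 and miR-373, directly inhibit the "
        "expression of Lats2, thereby allowing tumorigenic growth in the "
        "presence of p53."
    ),
}


def hedges(sentence: str) -> List[str]:
    """Return the hedging cues found in a sentence, in order of appearance."""
    return HEDGE_PATTERN.findall(sentence.lower())


if __name__ == "__main__":
    for source, sentence in QUOTES.items():
        print(source, hedges(sentence))
```

With this cue list, the two 2006 sentences each contain ‘possibly’, while the 2011 sentence contains no cue at all, which matches the slide's point that the hedge has disappeared by the time the claim is restated.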

Working on ontology:
1. Add to formal knowledge representations, e.g. Biological
Expression Language; add {V = 3, S = N, B = 0}:
• SET Evidence = "Arterial cells are highly susceptible to oxidative stress, which can induce both necrosis and apoptosis (programmed cell death) [1,2]"
• biologicalProcess(GO:"response to oxidative stress") increases biologicalProcess(GO:"apoptotic process")
• biologicalProcess(GO:"response to oxidative stress") increases biologicalProcess(GO:necrosis)
2. Improve triple search engines, e.g. compare in iHop:
• The Lats2 tumor suppressor protein has been implicated earlier in promoting p53 activation in response to mitotic apparatus stress {V = 2, S = NN, B = 0}
• Our findings reveal that miR-373 would be a potential oncogene and it participates in the carcinogenesis of human esophageal cancer {V = 1/2?, S = A, B = D}
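
The `{V = …, S = …, B = …}` annotations above can be attached to a statement as structured data. This is a minimal sketch, assuming hypothetical names (`parse_annotation`, `AnnotatedStatement`); it does not use any BEL library, and it treats the V, S and B codes as opaque strings because the slide does not define them.

```python
import re
from dataclasses import dataclass
from typing import Dict

# Matches "key = value" pairs inside an annotation such as {V = 3, S = N, B = 0}.
ANNOTATION_FIELD = re.compile(r"(\w+)\s*=\s*([^,}\s]+)")


def parse_annotation(text: str) -> Dict[str, str]:
    """Parse an annotation string into a dict of opaque codes."""
    return dict(ANNOTATION_FIELD.findall(text))


@dataclass
class AnnotatedStatement:
    """A subject-relation-object statement plus its annotation codes."""
    subject: str
    relation: str
    obj: str
    annotation: Dict[str, str]


if __name__ == "__main__":
    statement = AnnotatedStatement(
        subject='biologicalProcess(GO:"response to oxidative stress")',
        relation="increases",
        obj='biologicalProcess(GO:"apoptotic process")',
        annotation=parse_annotation("{V = 3, S = N, B = 0}"),
    )
    print(statement.annotation)
    print(parse_annotation("{V = 1/2?, S = A, B = D}"))
```

Keeping the codes as strings preserves uncertain values such as `1/2?` exactly as written, so a search engine could later filter or rank statements by them.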

Application: Elsevier/Philips Use Case:
3 Content Sources, 2 Link Steps
A. Philips’ Electronic Patient Records
B. Elsevier-published Clinical Guideline
C. Elsevier (or other publisher’s) Research Report or Data
Step 1: Patient data + diagnosis link to Guideline recommendation
Step 2: Guideline recommendation links to research report/data
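
The two link steps can be pictured as two lookups chained together. This is a minimal sketch with entirely hypothetical placeholder identifiers and a hypothetical `follow_links` helper; it says nothing about how the Philips or Elsevier systems actually store or link their content.

```python
from typing import Dict, List, Optional

# All identifiers below are hypothetical placeholders, not real records.
PATIENT_RECORDS: Dict[str, str] = {"patient-1": "diagnosis-A"}      # source A
GUIDELINE: Dict[str, str] = {"diagnosis-A": "recommendation-1"}     # source B
RESEARCH: Dict[str, List[str]] = {                                  # source C
    "recommendation-1": ["report-1", "dataset-1"],
}


def follow_links(patient_id: str) -> Optional[Dict[str, object]]:
    """Follow step 1 and step 2; return None if either link is missing."""
    diagnosis = PATIENT_RECORDS.get(patient_id)
    if diagnosis is None:
        return None
    recommendation = GUIDELINE.get(diagnosis)       # step 1
    if recommendation is None:
        return None
    evidence = RESEARCH.get(recommendation, [])     # step 2
    return {"recommendation": recommendation, "evidence": evidence}


if __name__ == "__main__":
    print(follow_links("patient-1"))
    print(follow_links("patient-2"))
```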

Application: Find ‘Claimed Knowledge Updates’
Work done with Agnes Sandor,
Xerox Research Europe

FORCE11 Community of Practice
• Workshop in August of 2011: 35 invited attendees from different
parts of science, industry, funding agencies, data centers
• Goal: map main obstacles preventing new models of science
publishing and develop ways to overcome them
• Just received funding from the Sloan Foundation to:
– Start online community
– Hold next workshop
– Collaboratively work on new efforts

Summary:
• Ten people who are changing scholarly publishing:
• We (publishers, editors, libraries, etc.) need to revisit whether
and how we are needed
• Some projects are underway to help us move towards
these new models:
– Networked science
– Workflow/data integration
– Identifying claim-evidence trails
• … happy to collaborate on others!
http://elsatglabs.com/labs/anita
a.dewaard@elsevier.com
