Stories, that persuade with data.
   What’s inside scientific papers,
   and should it be reengineered?


  Anita de Waard, a.dewaard@elsevier.com
Disruptive Technologies Director, Elsevier Labs
Scientific papers are stories, that persuade with data.
The Story of Goldilocks and              Story        Part        Paper              The AXH Domain of Ataxin-1 Mediates
the Three Bears                                                                      Neurodegeneration through Its Interaction with Gfi-1/
                                                                                     Senseless Proteins
Once upon a time                         Time         Setting     Background         The mechanisms mediating SCA1 pathogenesis are still not fully
                                                                                     understood, but some general principles have emerged.
a little girl named Goldilocks           Characters               Objects of study   the Drosophila Atx-1 homolog (dAtx-1) which lacks a polyQ tract,

She went for a walk in the forest.       Location                 Experimental       studied and compared in vivo effects and interactions to those of the
Pretty soon, she came upon a house.                               setup              human protein

She knocked and, when no one             Goal         Theme       Research           Gain insight into how Atx-1's function contributes to SCA1
answered,                                                         goal               pathogenesis. How these interactions might contribute to the disease
                                                                                     process and how they might cause toxicity in only a subset of neurons in
                                                                                     SCA1 is not fully understood.
she walked right in.                     Attempt                  Hypothesis         Atx-1 may play a role in the regulation of gene expression

At the table in the kitchen, there       Name         Episode 1   Name               dAtx-1 and hAtx-1 Induce Similar Phenotypes When Overexpressed in
were three bowls of porridge.                                                        Flies
Goldilocks was hungry.                   Subgoal                  Subgoal            test the function of the AXH domain
She tasted the porridge from the         Attempt                  Method             overexpressed dAtx-1 in flies using the GAL4/UAS system (Brand and
first bowl.                                                                           Perrimon, 1993) and compared its effects to those of hAtx-1.
This porridge is too hot! she            Outcome                  Results            Overexpression of dAtx-1 by Rhodopsin1(Rh1)-GAL4, which drives
exclaimed.                                                                           expression in the differentiated R1-R6 photoreceptor cells (Mollereau et
                                                                                     al., 2000 and O'Tousa et al., 1985), results in neurodegeneration in the
                                                                                     eye, as does overexpression of hAtx-1[82Q]. Although at 2 days after
                                                                                     eclosion, overexpression of either Atx-1 does not show obvious
                                                                                     morphological changes in the photoreceptor cells
So, she tasted the porridge from the     Activity                 Data               (data not shown),
second bowl.
This porridge is too cold, she said      Outcome                  Results            both genotypes show many large holes and loss of cell integrity at 28
                                                                                     days
So, she tasted the last bowl of          Activity                 Data               (Figures 1B-1D).
porridge.
Ahhh, this porridge is just right, she   Outcome                  Results            Overexpression of dAtx-1 using the GMR-GAL4 driver also induces eye
said happily and                                                                     abnormalities. The external structures of the eyes that overexpress
                                                                                     dAtx-1 show disorganized ommatidia and loss of interommatidial bristles
she ate it all up.                       Outcome                  Data               (Figure 1F),
Story analysis of scientific text:
ORB vs. Medium-grained structure
See work at http://www.w3.org/wiki/HCLSIG/SWANSIOC
Episode-level access through Linked Data standards:
the XML is fixed, but the structure is open! It allows for layers of annotation:

  <ce:section id=#123>   this says   mice like cheese
      said @anita on April 5, 2011
          but we all know she was deluded then
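
The layering of annotations on a fixed XML section can be sketched with nested subject-predicate-object statements, in the spirit of Linked Data triples. This is only an illustration: the `Statement` class, the predicate strings, and the identifier `section#123` are hypothetical and do not reflect any actual Elsevier schema.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Statement:
    """A subject-predicate-object triple; the subject may itself be a Statement."""
    subject: Union[str, "Statement"]
    predicate: str
    obj: str


# Base layer: the fixed XML section and what it says.
claim = Statement("section#123", "this says", "mice like cheese")
# Second layer: an annotation recording who said it, and when.
attribution = Statement(claim, "said", "@anita on April 5, 2011")
# Third layer: a later comment on the attributed claim.
comment = Statement(attribution, "but we all know", "she was deluded then")


def layers(statement):
    """Return the chain of statements from the outermost layer down to the base."""
    chain = [statement]
    while isinstance(chain[-1].subject, Statement):
        chain.append(chain[-1].subject)
    return chain


depth = len(layers(comment))
base = layers(comment)[-1]
```

Because each new layer only points at the one below it, the underlying section never has to change when an annotation is added.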
Satellite Format:
Linked Data repository for all Elsevier content

                            Dublin Core and SKOS




            SWAN’s PAV (Provenance, Authoring and Versioning) ontology
Scientific papers are stories,
                that persuade with data.

Both seminomas and the EC component of
nonseminomas share features with ES cells. To
exclude that the detection of miR-371-3 merely
reflects its expression pattern in ES cells, we tested
by RPA miR-302a-d, another ES cells-specific
miRNA cluster (Suh et al, 2004). In many of the
miR-371-3 expressing seminomas and
nonseminomas, miR-302a-d was undetectable (Figs
S7 and S8), suggesting that miR-371-3 expression
is a selective event during tumorigenesis.
Scientific papers are stories,
                 that persuade with data.

Both seminomas and the EC component of                     Fact              Conceptual knowledge
nonseminomas share features with ES cells.
To exclude that                                            Goal
the detection of miR-371-3 merely reflects its             Hypothesis
expression pattern in ES cells,
we tested by RPA miR-302a-d, another ES cells-             Method            Experimental evidence
specific miRNA cluster (Suh et al, 2004).
In many of the miR-371-3 expressing seminomas              Result
and nonseminomas, miR-302a-d was undetectable
(Figs S7 and S8),
suggesting that                                            Reg-Implication
miR-371-3 expression is a selective event during           Implication
tumorigenesis.
Realms of persuasive experimental discourse:

Concepts, models, ‘facts’ (‘State’, present tense):

(1) Both seminomas               (2) b. the detection of                  (3) c. miR-371-3
and the EC component             miR-371-3 merely                         expression is a
of nonseminomas share            reflects its expression                  selective event during
features with ES cells.          pattern in ES cells,                     tumorigenesis.

Transitions:

      (2) a. To exclude that                                              (3) b. suggesting that

Experiment (‘Narrative’, past tense):

            (2) c. we tested by RPA                    (3) a. In many of the miR-371-3
            miR-302a-d, another ES                     expressing seminomas and
            cells-specific miRNA cluster               nonseminomas, miR-302a-d was
            (Suh et al, 2004).                         undetectable (Figs S7 and S8),
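
As a rough sketch, the three realms can be treated as a lookup from segment type to realm, which lets the example passage be regrouped mechanically. The segment types and the passage come from the slides; the `REALM_OF` table and the function name are illustrative assumptions, not a published schema.

```python
from collections import defaultdict

# Assumed mapping from segment type to realm, read off the diagram:
# (1), (2)b, (3)c sit with concepts; (2)a, (3)b are transitions;
# (2)c, (3)a belong to the experiment.
REALM_OF = {
    "Fact": "concepts",
    "Hypothesis": "concepts",
    "Implication": "concepts",
    "Goal": "transitions",
    "Reg-Implication": "transitions",
    "Method": "experiment",
    "Result": "experiment",
}

segments = [
    ("Fact", "Both seminomas and the EC component of nonseminomas share features with ES cells."),
    ("Goal", "To exclude that"),
    ("Hypothesis", "the detection of miR-371-3 merely reflects its expression pattern in ES cells,"),
    ("Method", "we tested by RPA miR-302a-d, another ES cells-specific miRNA cluster (Suh et al, 2004)."),
    ("Result", "In many of the miR-371-3 expressing seminomas and nonseminomas, miR-302a-d was undetectable (Figs S7 and S8),"),
    ("Reg-Implication", "suggesting that"),
    ("Implication", "miR-371-3 expression is a selective event during tumorigenesis."),
]


def group_by_realm(typed_segments):
    """Group segment texts by realm, preserving their order in the passage."""
    grouped = defaultdict(list)
    for seg_type, text in typed_segments:
        grouped[REALM_OF[seg_type]].append(text)
    return dict(grouped)


by_realm = group_by_realm(segments)
```

Reading `by_realm["transitions"]` on its own shows how little text carries the move from concept to experiment and back.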
Fact creation through citations:

Voorhoeve et al, Cell, 2006:
To investigate the possibility that miR-372 and miR-373 suppress the           Hypothesis
expression of LATS2, we...

Therefore, these results point to LATS2 as a mediator of the miR-372 and
miR-373 effects on cell proliferation and tumorigenicity,                      Implication


Raver-Shapira et al., JMolCell 2007                                            Cited Implication
... two miRNAs, miRNA-372 and -373, function as potential novel oncogenes in
testicular germ cell tumors by inhibition of LATS2 expression, which suggests
that Lats2 is an important tumor suppressor (Voorhoeve et al., 2006).


Yabuta, JBioChem 2007:                                                         Fact
miR-372 and miR-373 target the Lats2 tumor suppressor (Voorhoeve et al., 2006)
“[Y]ou can transform a fact into fiction or a fiction into fact
    just by adding or subtracting references [and data]”
          – Bruno Latour, ‘Science in Action’,1987
How is this rhetoric instantiated?
Rhetorical            Utterance {Proposition}                                    S=      V=
goal                                                                             H, B    C, E

Indicate lack of      {The role of untranslated exons in the CCR3 gene}          NN      0
knowledge             has not been studied.
Evaluate other        Recently, CCR3 has been shown to {be                       N, D    3
work                  upregulated on neutrophils by interferons in vitro [..]}
Offer                 it is thought that {these transcription factors            NN, R   2
hypotheses            affect transcription of the gene through interactions
                      with the RNA transcription complex.}
Interpret results     these data suggested that {5' untranslated exon            A, D    2
                      1 may have a regulatory function.}
Assess validity of    Since {this was not the case with other lines,} {we        A, D    1
interpretations       suspect {it is integration-site specific}}
State                 While we expected {the transcript to be about 1            A, D    2, S+
correspondence        kb in size (Figure 4A),} {two bands ~4 and 5 kb were
to expectations       apparent.}
Comparison to         It is important that {this data be viewed                  A, R/   2, F+
other work            with {what is known about other myeloid-                   NN/
                      specific promoters,}}                                      D
Eventually: trace roots of a claim:
how many independent data points is it based on?

                   PHC       undergo Growth arrest



Paper A:                              Paper B:
            implication                           implication
    method                  fact                 method                 fact
                                   underpinning
     goal                   fact             goal               fact
               results
                                                     results

  data 1
                                           data 4
             data 2       data 3
                                                     data 5     data 6
Eventually: trace roots of a claim:
how many independent data points is it based on?

                   PHC        undergo Growth arrest



Paper A:                               Paper B:
            implication                             implication
    method                  fact           method                 fact
                         method link
     goal                   fact            goal                  fact
               results
                                                       results

  data 1
                                           data 4
             data 2       data 3
                                                      data 5      data 6
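
One way to picture tracing a claim to its roots is as a walk over a support graph that collects the distinct data items at the bottom. The sketch below is a minimal illustration: the node names follow the slide (Paper A, Paper B, data 1 to 6), but the edges, including Paper B leaning on Paper A's results, are assumptions made up for the example.

```python
# Hypothetical support graph: each node lists the nodes it rests on.
SUPPORTS = {
    "claim: PHC undergo growth arrest": ["A.implication", "B.implication"],
    "A.implication": ["A.results"],
    "A.results": ["data 1", "data 2", "data 3"],
    # Assumption: Paper B builds on its own results and on Paper A's.
    "B.implication": ["B.results", "A.results"],
    "B.results": ["data 4", "data 5", "data 6"],
}


def data_roots(node, graph, seen=None):
    """Collect the distinct leaf nodes (raw data) reachable from `node`."""
    if seen is None:
        seen = set()
    if node in seen:
        return set()  # already visited, so its leaves are already counted
    seen.add(node)
    children = graph.get(node, [])
    if not children:
        return {node}  # a node with no support below it is a data root
    roots = set()
    for child in children:
        roots |= data_roots(child, graph, seen)
    return roots


roots = data_roots("claim: PHC undergo growth arrest", SUPPORTS)
n_independent = len(roots)
```

Because visited nodes are tracked, evidence that two papers share is counted once, which is the point of asking how many independent data points a claim rests on.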
Scientific papers are stories,
that persuade with data.
Sometimes the link to data is good:
And sometimes it’s not so good:
Data-driven papers?                                                   Work done with Ed Hovy, Phil Bourne,
                                                                                      Gully Burns and Cartic Ramakrishnan

                                                                 1. Research: Each item in the system has metadata
                                        metadata                 (including provenance) and relations to other data items
                                                   metadata      added to it.
                                                                 2. Workflow: All data items created in the lab are added
        metadata
                                                                 to a (lab-owned) workflow system.
                                                                 3. Authoring: A paper is written in an authoring tool which
                                                                 can pull data with provenance from the workflow tool in the
                                                                 appropriate representation into the document.

                 metadata

                                                      metadata




Rats were subjected to two grueling
tests
(click on fig 2 to see underlying
data). These results suggest that the
neurological pain pro-
Data-driven papers?                                                   Work done with Ed Hovy, Phil Bourne,
                                                                                          Gully Burns and Cartic Ramakrishnan

                                                                     1. Research: Each item in the system has metadata
                                            metadata                 (including provenance) and relations to other data items
                                                       metadata      added to it.
                                                                     2. Workflow: All data items created in the lab are added
            metadata
                                                                     to a (lab-owned) workflow system.
                                                                     3. Authoring: A paper is written in an authoring tool which
                                                                     can pull data with provenance from the workflow tool in the
                                                                     appropriate representation into the document.

                     metadata                                        4. Editing and review: Once the co-authors agree, the
                                                                     paper is ‘exposed’ to the editors, who in turn expose it to
                                                          metadata   reviewers. Reports are stored in the authoring/editing
                                                                     system, the paper gets updated, until it is validated.




    Rats were subjected to two grueling
    tests
    (click on fig 2 to see underlying
    data). These results suggest that the
    neurological pain pro-



Review
                                   Revise
                   Edit
Data-driven papers?                                                   Work done with Ed Hovy, Phil Bourne,
                                                                                          Gully Burns and Cartic Ramakrishnan

                                                                     1. Research: Each item in the system has metadata
                                            metadata                 (including provenance) and relations to other data items
                                                       metadata      added to it.
                                                                     2. Workflow: All data items created in the lab are added
            metadata
                                                                     to a (lab-owned) workflow system.
                                                                     3. Authoring: A paper is written in an authoring tool which
                                                                     can pull data with provenance from the workflow tool in the
                                                                     appropriate representation into the document.

                     metadata                                        4. Editing and review: Once the co-authors agree, the
                                                                     paper is ‘exposed’ to the editors, who in turn expose it to
                                                          metadata   reviewers. Reports are stored in the authoring/editing
                                                                     system, the paper gets updated, until it is validated.
                                                                     5. Publishing and distribution: When a paper is
                                                                     published, a collection of validated information is
                                                                     exposed to the world. It remains connected to its related
    Rats were subjected to two grueling                              data item, and its heritage can be traced.
    tests
    (click on fig 2 to see underlying
    data). These results suggest that the
    neurological pain pro-



Review
                                   Revise
                   Edit
Data-driven papers?
Work done with Ed Hovy, Phil Bourne, Gully Burns and Cartic Ramakrishnan

1. Research: Each item in the system has metadata (including provenance) and relations to other data items added to it.
2. Workflow: All data items created in the lab are added to a (lab-owned) workflow system.
3. Authoring: A paper is written in an authoring tool which can pull data with provenance from the workflow tool in the appropriate representation into the document.
4. Editing and review: Once the co-authors agree, the paper is ‘exposed’ to the editors, who in turn expose it to reviewers. Reports are stored in the authoring/editing system, the paper gets updated, until it is validated.
5. Publishing and distribution: When a paper is published, a collection of validated information is exposed to the world. It remains connected to its related data item, and its heritage can be traced.
6. User applications: distributed applications run on this ‘exposed data’ universe.

[Diagram: data items tagged with metadata feed a paper excerpt ("Rats were subjected to two grueling tests (click on fig 2 to see underlying data). These results suggest that the neurological pain pro-") that passes through a Review / Edit / Revise cycle; a box labeled "Some other publisher" sits alongside.]
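As a rough illustration of steps 1 to 3 and the "heritage can be traced" idea in step 5, the sketch below models data items that carry metadata and a provenance link, a lab-owned store, and an authoring helper that embeds a figure reference with its provenance trail. All names here (`DataItem`, `WorkflowSystem`, `embed_figure`, the item ids) are hypothetical and chosen for illustration; the slides do not describe a concrete data model.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DataItem:
    """A lab data item with metadata and a link to the item it was derived from."""
    item_id: str
    metadata: Dict[str, str]
    derived_from: Optional[str] = None  # provenance: id of the parent item, if any


@dataclass
class WorkflowSystem:
    """Hypothetical lab-owned store holding every data item created in the lab."""
    items: Dict[str, DataItem] = field(default_factory=dict)

    def add(self, item: DataItem) -> None:
        self.items[item.item_id] = item

    def heritage(self, item_id: str) -> List[str]:
        """Follow derived_from links from an item back to its original source."""
        chain: List[str] = []
        current: Optional[str] = item_id
        # Stop at the root, at an unknown id, or if a cycle is detected.
        while current is not None and current not in chain:
            chain.append(current)
            parent = self.items.get(current)
            current = parent.derived_from if parent else None
        return chain


def embed_figure(system: WorkflowSystem, item_id: str) -> str:
    """Render a figure reference that keeps its provenance trail attached."""
    trail = " <- ".join(system.heritage(item_id))
    return f"[fig: {item_id} | provenance: {trail}]"


lab = WorkflowSystem()
lab.add(DataItem("raw-1", {"kind": "measurement"}))
lab.add(DataItem("fig-2", {"kind": "plot"}, derived_from="raw-1"))
print(embed_figure(lab, "fig-2"))
```

Keeping the provenance link on the item itself, rather than in the paper, is one way a published figure could stay connected to its underlying data after publication.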
One step: encouraging submission of structured workflows
Another step: ScienceDirect app store

           - Eclipse SDK platform accessing all
             ScienceDirect/Scopus content
           - Build applications on top of content
           - Offer to users in marketplace
A third step: Executable Paper Challenge
Goal: invite the computer science community to help develop formats that:
-   add executable files and reproducible data to computer science papers;
-   handle storage and validation of very large files;
-   help validate data and code, and decrease the reviewer’s workload.
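One small piece of the validation goal can be sketched with a checksum: the author records a digest when submitting a file, and a reviewer's tooling recomputes it to confirm the file is unchanged. This is only an assumed approach for illustration, not something the Challenge prescribes; the function names are hypothetical, and the file is read in chunks so that very large files do not need to fit in memory.

```python
import hashlib
import os
import tempfile


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA-256 digest of a file, reading it one chunk at a time."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def validate(path: str, expected_hex: str) -> bool:
    """Return True when the file on disk matches the digest recorded at submission."""
    return sha256_of_file(path) == expected_hex


# Usage: write a small stand-in data file, record its digest, then validate it.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"example dataset contents")
    data_path = tmp.name

recorded = sha256_of_file(data_path)
print(validate(data_path, recorded))
os.remove(data_path)
```

A digest check confirms integrity only; whether the data and code actually support the paper's claims still needs a reviewer.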
In Summary:
1. Stories:

    -   ORB, Satellite: link to any part of content - bring it on!
2. Persuasion:

    -   Logical structure for biological propositions; trace a claim
        through successive citations
3. Data:

    -   Better data linking, better structuring of methods.
In conclusion: is the research paper going away?
I don’t think so! But it will be:
    -   Structured better: authors will need to justify claims directly
    -   Connected better: more traceable, better links to data and
        workflow components, and to other work
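The "trace a claim through successive citations" idea from the Persuasion point can be pictured as walking a chain of records, each holding how a paper words the claim and which earlier paper it cites for it. The records, paper ids and sentences below are invented for illustration and do not come from any real corpus.

```python
from typing import Dict, List, NamedTuple, Optional, Tuple


class ClaimRecord(NamedTuple):
    """How one paper states a claim, and which earlier paper it cites for it."""
    statement: str
    cites: Optional[str]


# Hypothetical example data: the wording hardens as the claim is re-cited.
records: Dict[str, ClaimRecord] = {
    "paper-A": ClaimRecord("These results suggest that X may regulate Y.", None),
    "paper-B": ClaimRecord("X has been shown to regulate Y.", "paper-A"),
    "paper-C": ClaimRecord("X regulates Y.", "paper-B"),
}


def trace_claim(start: str, data: Dict[str, ClaimRecord]) -> List[Tuple[str, str]]:
    """Walk citation links from a paper back to the claim's earliest source."""
    path: List[Tuple[str, str]] = []
    seen = set()
    current: Optional[str] = start
    # Stop at the source, at an unknown paper, or if citations loop.
    while current is not None and current in data and current not in seen:
        seen.add(current)
        path.append((current, data[current].statement))
        current = data[current].cites
    return path


for paper_id, statement in trace_claim("paper-C", records):
    print(f"{paper_id}: {statement}")
```

Reading the chain from the citing paper back to the source makes it easy to see where hedging words were dropped along the way.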
Thank you!

W3C group on Discourse Structure:
http://www.w3.org/wiki/HCLSIG/SWANSIOC
SciVerse: http://developer.sciverse.com
Pangea project: http://bit.ly/98haOw
Parsing rhetoric: http://elsatglabs.com/labs/anita/
Fact creation demo: http://elsatglabs.com/labs/anita/demos/LATSDemo102007/
Methods Navigator: http://www.methodsnavigator.com
SciVerse APIs: http://developer.sciverse.com
Executable Paper Challenge: http://www.executablepapers.com

Or mail me at:
Anita de Waard, a.dewaard@elsevier.com

From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 

Reengineering the scientific research paper

  • 1. Stories, that persuade with data. What’s inside scientific papers, and should it be reengineered? Anita de Waard, a.dewaard@elsevier.com Disruptive Technologies Director, Elsevier Labs
  • 2. Scientific papers are stories, that persuade with data.
      Columns: story text | story element | part | paper element | paper text
      The Story of Goldilocks and the Three Bears | Story | Part | Paper | The AXH Domain of Ataxin-1 Mediates Neurodegeneration through Its Interaction with Gfi-1/Senseless Proteins
      Once upon a time | Time | Setting | Background | The mechanisms mediating SCA1 pathogenesis are still not fully understood, but some general principles have emerged.
      a little girl named Goldilocks | Characters | | Objects of study | the Drosophila Atx-1 homolog (dAtx-1) which lacks a polyQ tract,
      She went for a walk in the forest. Pretty soon, she came upon a house. | Location | | Experimental setup | studied and compared in vivo effects and interactions to those of the human protein
      She knocked and, when no one answered, | Goal | Theme | Research goal | Gain insight into how Atx-1's function contributes to SCA1 pathogenesis. How these interactions might contribute to the disease process and how they might cause toxicity in only a subset of neurons in SCA1 is not fully understood.
      she walked right in. | Attempt | | Hypothesis | Atx-1 may play a role in the regulation of gene expression
      At the table in the kitchen, there were three bowls of porridge. | Name | Episode 1 | Name | dAtx-1 and hAtx-1 Induce Similar Phenotypes When Overexpressed in Flies
      Goldilocks was hungry. | Subgoal | | Subgoal | test the function of the AXH domain
      She tasted the porridge from the first bowl. | Attempt | | Method | overexpressed dAtx-1 in flies using the GAL4/UAS system (Brand and Perrimon, 1993) and compared its effects to those of hAtx-1.
      This porridge is too hot! she exclaimed. | Outcome | | Results | Overexpression of dAtx-1 by Rhodopsin1(Rh1)-GAL4, which drives expression in the differentiated R1-R6 photoreceptor cells (Mollereau et al., 2000 and O'Tousa et al., 1985), results in neurodegeneration in the eye, as does overexpression of hAtx-1[82Q]. Although at 2 days after eclosion, overexpression of either Atx-1 does not show obvious morphological changes in the photoreceptor cells
      So, she tasted the porridge from the second bowl. | Activity | | Data | (data not shown),
      This porridge is too cold, she said | Outcome | | Results | both genotypes show many large holes and loss of cell integrity at 28 days
      So, she tasted the last bowl of porridge. | Activity | | Data | (Figures 1B-1D).
      Ahhh, this porridge is just right, she said happily and | Outcome | | Results | Overexpression of dAtx-1 using the GMR-GAL4 driver also induces eye abnormalities. The external structures of the eyes that overexpress dAtx-1 show disorganized ommatidia and loss of interommatidial bristles
      she ate it all up. | Outcome | | Data | (Figure 1F),
  • 5. Story analysis of scientific text: ORB vs. Medium-grained structure See work at http://www.w3.org/wiki/HCLSIG/SWANSIOC
  • 9. Episode-level access through Linked Data standards: the XML is fixed, but the structure is open! Allows for layers of annotation, from innermost to outermost:
      - <ce:section id=#123>
      - this says: mice like cheese
      - said @anita on April 5, 2011
      - but we all know she was deluded then
  • 12. Satellite Format: Linked Data repository for all Elsevier content Dublin Core and SKOS SWAN’s PAV (Provenance, Authoring and Versioning) ontology
  • 13. Scientific papers are stories, that persuade with data. Both seminomas and the EC component of nonseminomas share features with ES cells. To exclude that the detection of miR-371-3 merely reflects its expression pattern in ES cells, we tested by RPA miR-302a-d, another ES cells-specific miRNA cluster (Suh et al, 2004). In many of the miR-371-3 expressing seminomas and nonseminomas, miR-302a-d was undetectable (Figs S7 and S8), suggesting that miR-371-3 expression is a selective event during tumorigenesis.
  • 17. Scientific papers are stories, that persuade with data.
      Fact: Both seminomas and the EC component of nonseminomas share features with ES cells.
      Goal: To exclude that
      Hypothesis: the detection of miR-371-3 merely reflects its expression pattern in ES cells,
      Method: we tested by RPA miR-302a-d, another ES cells-specific miRNA cluster (Suh et al, 2004).
      Result: In many of the miR-371-3 expressing seminomas and nonseminomas, miR-302a-d was undetectable (Figs S7 and S8),
      Reg-Implication: suggesting that
      Implication: miR-371-3 expression is a selective event during tumorigenesis.
      Side labels: Conceptual knowledge; Experimental Evidence.
  • 21. Realms of persuasive experimental discourse:
      Concepts, models, ‘facts’ (‘State’, present tense):
        (1) Both seminomas and the EC component of nonseminomas share features with ES cells.
        (2) b. the detection of miR-371-3 merely reflects its expression pattern in ES cells,
        (3) c. miR-371-3 expression is a selective event during tumorigenesis.
      Transitions:
        (2) a. To exclude that
        (3) b. suggesting that
      Experiment (‘Narrative’, past tense):
        (2) c. we tested by RPA miR-302a-d, another ES cells-specific miRNA cluster (Suh et al, 2004).
        (3) a. In many of the miR-371-3 expressing seminomas and nonseminomas, miR-302a-d was undetectable (Figs S7 and S8),
  • 28. Fact creation through citations:
      Voorhoeve et al., Cell, 2006:
        Hypothesis: To investigate the possibility that miR-372 and miR-373 suppress the expression of LATS2, we...
        Implication: Therefore, these results point to LATS2 as a mediator of the miR-372 and miR-373 effects on cell proliferation and tumorigenicity,
      Raver-Shapira et al., JMolCell 2007:
        Cited Implication: ... two miRNAs, miRNA-372 and-373, function as potential novel oncogenes in testicular germ cell tumors by inhibition of LATS2 expression, which suggests that Lats2 is an important tumor suppressor (Voorhoeve et al., 2006).
      Yabuta, JBioChem 2007:
        Fact: miR-372 and miR-373 target the Lats2 tumor suppressor (Voorhoeve et al., 2006)
  • 29. “[Y]ou can transform a fact into fiction or a fiction into fact just by adding or subtracting references [and data]” – Bruno Latour, ‘Science in Action’, 1987
  • 31. How is this rhetoric instantiated?
      Columns: Rhetorical goal | Utterance {Proposition} | S= | V= | H, B C, E
      Indicate lack of knowledge | {The role of untranslated exons in the CCR3 gene} has not been studied. | NN | 0
      Evaluate other work | Recently, CCR3 has been shown to {be upregulated on neutrophils by interferons in vitro [..]} | N, D | 3
      Offer hypotheses | it is thought that {these transcription factors affect transcription of the gene through interactions with the RNA transcription complex.} | NN, R | 2
      Interpret results | these data suggested that {5' untranslated exon 1 may have a regulatory function.} | A, D | 2
      Assess validity of interpretations | Since {this was not the case with other lines,} {we suspect {it is integration-site specific}} | A, D | 1
      State correspondence to expectations | While we expected {the transcript to be about 1 kb in size (Figure 4A),} {two bands ~4 and 5 kb were apparent.} | A, D | 2, S+
      Comparison to other work | It is important that {this data be viewed with {what is known about other myeloid-specific promoters,}} | A,R/NN/D | 2, F+
  • 38. Eventually: trace roots of a claim: how many independent data points is it based on?
      Claim: PHC undergo Growth arrest
      Paper A: implication; method; fact; goal; fact; results; data 1, data 2, data 3
      Paper B: implication; method; fact; goal; fact; results; data 4, data 5, data 6
      Link label: underpinning
  • 40. Eventually: trace roots of a claim: how many independent data points is it based on?
      Claim: PHC undergo Growth arrest
      Paper A: implication; method; fact; goal; fact; results; data 1, data 2, data 3
      Paper B: implication; method; fact; goal; fact; results; data 4, data 5, data 6
      Link label between the two method nodes: method link
  • 41. Scientific papers are stories, that persuade with data.
  • 44. Sometimes the link to data is good:
  • 45. And sometimes it’s not so good:
  • 55. Data-driven papers? Work done with Ed Hovy, Phil Bourne, Gully Burns and Cartic Ramakrishnan
      1. Research: Each item in the system has metadata (including provenance) and relations to other data items added to it.
      2. Workflow: All data items created in the lab are added to a (lab-owned) workflow system.
      3. Authoring: A paper is written in an authoring tool which can pull data with provenance from the workflow tool in the appropriate representation into the document.
      4. Editing and review: Once the co-authors agree, the paper is ‘exposed’ to the editors, who in turn expose it to reviewers. Reports are stored in the authoring/editing system, the paper gets updated, until it is validated.
      5. Publishing and distribution: When a paper is published, a collection of validated information is exposed to the world. It remains connected to its related data item, and its heritage can be traced.
      6. User applications: distributed applications run on this ‘exposed data’ universe.
      Figure labels: metadata; Review, Revise, Edit; Some other publisher.
      Sample paper text in the figure: Rats were subjected to two grueling tests (click on fig 2 to see underlying data). These results suggest that the neurological pain pro-
  • 56. One step: encouraging submission of structured workflows
  • 58. Another step: ScienceDirect app store - Eclipse SDK platform accessing all ScienceDirect/Scopus content - Build applications on top of content - Offer to users in marketplace
  • 59. A third step: Executable Paper Challenge Goal: invite computer science community to help develop formats that: - add executable files and reproducible data to computer science papers; - handle storage and validation of very large files - help validation of data and code, and decrease the reviewer’s workload
  • 66. In Summary:
      1. Stories: ORB, Satellite: link to any part of content - bring it on!
      2. Persuasion: Logical structure for biological propositions; trace a claim through successive citations
      3. Data: Better data linking, better structuring of methods.
      In conclusion: is the research paper going away? I don’t think so! But it will be:
      - Structured better: authors will need to justify claims directly
      - Connected better: more traceable, better links to data and workflow components, and to other work
  • 67. Thank you!
      W3C group on Discourse Structure: http://www.w3.org/wiki/HCLSIG/SWANSIOC
      SciVerse: http://developer.sciverse.com
      Pangea project: http://bit.ly/98haOw
      Parsing rhetoric: http://elsatglabs.com/labs/anita/
      Fact creation demo: http://elsatglabs.com/labs/anita/demos/LATSDemo102007/
      Methods Navigator: http://www.methodsnavigator.com
      SciVerse APIs: http://developer.sciverse.com
      Executable Paper Challenge: http://www.executablepapers.com
      Or mail me at: Anita de Waard, a.dewaard@elsevier.com
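
The claim-tracing idea from the "trace roots of a claim" slides can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the system described in the talk: the `Segment` class, its `supports` links, the `independent_data` function and the extra "Paper C" are hypothetical names introduced here. Only the segment kinds (implication, results, data) and the Paper A / Paper B layout with data 1 to 6 come from the slides.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Segment:
    """One discourse segment of a paper (hypothetical model for illustration)."""
    paper: str
    kind: str   # e.g. "implication", "fact", "results", "data"
    text: str
    supports: list[Segment] = field(default_factory=list)  # segments this one rests on


def independent_data(claim: Segment) -> set[tuple[str, str]]:
    """Follow support links below a claim and collect the distinct data items."""
    found: set[tuple[str, str]] = set()
    seen: set[int] = set()
    stack = [claim]
    while stack:
        node = stack.pop()
        if id(node) in seen:  # skip segments already visited via another citation
            continue
        seen.add(id(node))
        if node.kind == "data":
            found.add((node.paper, node.text))
        stack.extend(node.supports)
    return found


def make_paper(name: str, data_labels: list[str]) -> Segment:
    """Build a tiny paper: implication <- results <- data items."""
    data = [Segment(name, "data", label) for label in data_labels]
    results = Segment(name, "results", "results", supports=data)
    return Segment(name, "implication", "PHC undergo growth arrest", supports=[results])


paper_a = make_paper("Paper A", ["data 1", "data 2", "data 3"])
paper_b = make_paper("Paper B", ["data 4", "data 5", "data 6"])
# Hypothetical third paper: restates the claim as a fact, citing only Paper A.
paper_c = Segment("Paper C", "fact", "PHC undergo growth arrest", supports=[paper_a])

claim = Segment("(claim)", "fact", "PHC undergo growth arrest",
                supports=[paper_a, paper_b, paper_c])
evidence = independent_data(claim)
print(len(evidence), "independent data items")
```

In this sketch the citing Paper C adds no data of its own, so the count stays at the six items from Papers A and B: a citation repeats a claim without adding evidence for it.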