Yannick Wurm
http://wurmlab.github.io
Steps to avoid having to
retract your analysis
Guarujá
2018
Towards improved efficiency,
reliability & reproducibility
Biology has changed.
BIG
Geoffrey Chang: Crystallographer
• Beckman Foundation Young Investigator Award
• Presidential Early Career Award
Journal of Molecular Biology (2003) Chang. Structure
of MsbA from Vibrio cholera: a multidrug resistance ABC
transporter homolog in a closed conformation.
PNAS (2004) Ma & Chang. Structure of the multidrug
resistance efflux transporter EmrE from Escherichia coli.
Science (2005) Reyes & Chang. Structure of the ABC
transporter MsbA in complex with ADP vanadate and
lipopolysaccharide.
Science (2005) Pornillos et al. X-ray structure of the
EmrE multidrug transporter in complex with a substrate.
Science (2001) Chang & Roth. Structure of MsbA from
E. coli: a homolog of the multidrug resistance ATP binding
cassette (ABC) transporters.
Flipping fiasco. The structures of MsbA (purple) and Sav1866 (green) overlap
little (left) until MsbA is inverted (right).
Sav1866 Dawson & Locher (2006) Nature
Science (2001) Chang & Roth.
Comparison with 3D structure of ortholog
LETTERS
edited by Etta Kavanagh
Retraction
We wish to retract our Research Article “Structure of MsbA from E. coli: A homolog of the multidrug resistance ATP binding cassette (ABC) transporters” and both of our Reports “Structure of the ABC transporter MsbA in complex with ADP•vanadate and lipopolysaccharide” and “X-ray structure of the EmrE multidrug transporter in complex with a substrate” (1–3).
The recently reported structure of Sav1866 (4) indicated that our MsbA structures (1, 2, 5) were incorrect in both the hand of the structure and the topology. Thus, our biological interpretations based on these inverted models for MsbA are invalid.
An in-house data reduction program introduced a change in sign for anomalous differences. This program, which was not part of a conventional data processing package, converted the anomalous pairs (I+ and I-) to (F- and F+), thereby introducing a sign change. As the diffraction data collected for each set of MsbA crystals and for the EmrE crystals were processed with the same program, the structures reported in (1–3, 5, 6) had the wrong hand.
The error in the topology of the original MsbA structure was a consequence of the low resolution of the data as well as breaks in the electron density for the connecting loop regions. Unfortunately, the use of the multicopy refinement procedure still allowed us to obtain reasonable refinement values for the wrong structures.
The Protein Data Bank (PDB) files 1JSQ, 1PF4, and 1Z2R for MsbA and 1S7B and 2F2M for EmrE have been moved to the archive of obsolete PDB entries. The MsbA and EmrE structures will be recalculated from the original data using the proper sign for the anomalous differences, and the new Cα coordinates and structure factors will be deposited.
We very sincerely regret the confusion that these papers have caused and, in particular, subsequent research efforts that were unproductive as a result of our original findings.
GEOFFREY CHANG, CHRISTOPHER B. ROTH,
CHRISTOPHER L. REYES, OWEN PORNILLOS,
YEN-JU CHEN, ANDY P. CHEN
Department of Molecular Biology, The Scripps Research Institute, La Jolla, CA 92037, USA.
References
1. G. Chang, C. B. Roth, Science 293, 1793 (2001).
2. C. L. Reyes, G. Chang, Science 308, 1028 (2005).
3. O. Pornillos, Y.-J. Chen, A. P. Chen, G. Chang, Science 310, 1950 (2005).
4. R. J. Dawson, K. P. Locher, Nature 443, 180 (2006).
5. G. Chang, J. Mol. Biol. 330, 419 (2003).
6. C. Ma, G. Chang, Proc. Natl. Acad. Sci. U.S.A. 101, 2852 (2004).
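The failure mode described in the retraction letter (a home-made data-reduction step that swaps a pair of values and thereby flips a sign) can be sketched in a few lines of R. This is a toy illustration only, not the program used by the authors of the letter: the function names `reduce_pairs()` and `reduce_pairs_buggy()`, the column names and the numbers are all hypothetical. The point is that a two-row input with a known answer is enough to expose the swap.

```r
# Toy illustration (hypothetical names and numbers): a data-reduction
# step that should return the difference plus - minus for each row.

# Intended version.
reduce_pairs <- function(pairs) {
  pairs$plus - pairs$minus
}

# Buggy version: the two columns are swapped,
# so every difference comes out with the wrong sign.
reduce_pairs_buggy <- function(pairs) {
  pairs$minus - pairs$plus
}

# Tiny input with a known answer: the differences must be +2 and -1.
known <- data.frame(plus = c(5, 3), minus = c(3, 4))
expected <- c(2, -1)

correct_ok <- isTRUE(all.equal(reduce_pairs(known), expected))
buggy_ok   <- isTRUE(all.equal(reduce_pairs_buggy(known), expected))

print(correct_ok)  # TRUE
print(buggy_ok)    # FALSE: the known-answer check catches the sign flip
```

A check like this costs minutes to write and runs in milliseconds, which is the trade-off the rest of the talk argues for.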
😥
SCIENTIFIC PUBLISHING
A Scientist’s Nightmare: Software Problem Leads to Five Retractions
Until recently, Geoffrey Chang’s career was on a trajectory most young scientists only dream about. In 1999, at the age of 28, the protein crystallographer landed a faculty position at the prestigious Scripps Research Institute in San Diego, California. The next year, in a ceremony at the White House, Chang received a Presidential Early Career Award for Scientists and Engineers, the country’s highest honor for young researchers. His lab generated a stream of high-profile papers detailing the molecular structures of important proteins embedded in cell membranes.
Then the dream turned into a nightmare. In September, Swiss researchers published a paper in Nature that cast serious doubt on a protein structure Chang’s group had described in a 2001 Science paper. When he investigated, Chang was horrified to discover that a homemade data-analysis program had flipped two columns of data, inverting the electron-density map from which his team had derived the final protein structure. […]
[…] 2001 Science paper, which described the structure of a protein called MsbA, isolated from the bacterium Escherichia coli. MsbA belongs to a huge and ancient family of molecules that use energy from adenosine triphosphate to transport molecules across cell membranes. These so-called ABC transporters perform many […]
This is costly
For:
• the individual
• collaborators
• the institution
• 1000s of researchers performing follow-up work
• science
• society
This changes
everything.
Cost to sequence 1,000,000 nucleotides
Any lab can
sequence
anything!
We generate 50,000x more data per $
than 10 years ago
Llorente et al 2015 Science
20 types of cancer, 6,000 samples total
Methylation profiles analysed
-> cancer cells are 36 years older than normal cells
>500 citations
-> actually no general pattern
• Understanding/visualising/analysing/massaging big data is hard.
• Biology/life is complex.
• Biologists lack computational training.
• Field is young.
• Analysis tools (generally) suck:
• badly written
• badly tested
• hard to install
• output quality… often questionable.
• Data sizes keep growing!
• Data formats keep changing :(
Genome bioinformatics is hard
Biology is harder than (many) other data sciences
We need great
approaches.
Some sources of inspiration
• Avoid costly mistakes
• Be faster: “stand on the shoulders of giants”
• Increase impact / visibility
Community Page
Best Practices for Scientific Computing
Greg Wilson1*, D. A. Aruliah2, C. Titus Brown3, Neil P. Chue Hong4, Matt Davis5, Richard T. Guy6¤, Steven H. D. Haddock7, Kathryn D. Huff8, Ian M. Mitchell9, Mark D. Plumbley10, Ben Waugh11, Ethan P. White12, Paul Wilson13
1 Mozilla Foundation, Toronto, Ontario, Canada, 2 University of Ontario Institute of Technology, Oshawa, Ontario, Canada, 3 Michigan State University, East Lansing,
Michigan, United States of America, 4 Software Sustainability Institute, Edinburgh, United Kingdom, 5 Space Telescope Science Institute, Baltimore, Maryland, United
States of America, 6 University of Toronto, Toronto, Ontario, Canada, 7 Monterey Bay Aquarium Research Institute, Moss Landing, California, United States of America,
8 University of California Berkeley, Berkeley, California, United States of America, 9 University of British Columbia, Vancouver, British Columbia, Canada, 10 Queen Mary
University of London, London, United Kingdom, 11 University College London, London, United Kingdom, 12 Utah State University, Logan, Utah, United States of America,
13 University of Wisconsin, Madison, Wisconsin, United States of America
Introduction
Scientists spend an increasing amount of time building and
using software. However, most scientists are never taught how to
do this efficiently. As a result, many are unaware of tools and
practices that would allow them to write more reliable and
maintainable code with less effort. We describe a set of best
practices for scientific software development that have solid
foundations in research and experience, and that improve
scientists’ productivity and the reliability of their software.
Software is as important to modern scientific research as
telescopes and test tubes. From groups that work exclusively on
computational problems, to traditional laboratory and field
scientists, more and more of the daily operation of science revolves
around developing new algorithms, managing and analyzing the
large amounts of data that are generated in single research
projects, combining disparate datasets to assess synthetic problems,
and other computational tasks.
Scientists typically develop their own software for these purposes
because doing so requires substantial domain-specific knowledge.
As a result, recent studies have found that scientists typically spend
30% or more of their time developing software [1,2]. However,
90% or more of them are primarily self-taught [1,2], and therefore
lack exposure to basic software development practices such as
writing maintainable code, using version control and issue […]
[…] error from another group’s code was not discovered until after
publication [6]. As with bench experiments, not everything must be
done to the most exacting standards; however, scientists need to be
aware of best practices both to improve their own approaches and
for reviewing computational work by others.
This paper describes a set of practices that are easy to adopt and
have proven effective in many research settings. Our recommendations
are based on several decades of collective experience both
building scientific software and teaching computing to scientists
[17,18], reports from many other groups [19–25], guidelines for
commercial and open source software development [26,27], and on
empirical studies of scientific computing [28–31] and software
development in general (summarized in [32]). None of these practices
will guarantee efficient, error-free software development, but used in
concert they will reduce the number of errors in scientific software,
make it easier to reuse, and save the authors of the software time and
effort that can be used for focusing on the underlying scientific questions.
Our practices are summarized in Box 1; labels in the main text
such as ‘‘(1a)’’ refer to items in that summary. For reasons of space,
we do not discuss the equally important (but independent) issues of
reproducible research, publication and citation of code and data,
and open science. We do believe, however, that all of these will be
much easier to implement if scientists have the skills we describe.
Education
A Quick Guide to Organizing Computational Biology
Projects
William Stafford Noble1,2*
1 Department of Genome Sciences, School of Medicine, University of Washington, Seattle, Washington, United States of America, 2 Department of Computer Science and
Engineering, University of Washington, Seattle, Washington, United States of America
Introduction
Most bioinformatics coursework focuses on algorithms, with perhaps some components devoted to learning programming skills and learning how to use existing bioinformatics software. Unfortunately, for students who are preparing for a research career, this type of curriculum fails to address many of the day-to-day organizational challenges associated with performing computational experiments. In practice, the principles behind organizing and documenting computational experiments are often learned on the fly, and this learning is strongly influenced by personal predilections as well as by chance interactions with collaborators or colleagues.
The purpose of this article is to describe one good strategy for carrying out computational experiments. I will not describe profound issues such as how to formulate hypotheses, design experiments, or draw conclusions. Rather, I will focus on relatively mundane issues such as organizing files and directories and documenting […]
[…] understanding your work or who may be evaluating your research skills. Most commonly, however, that “someone” is you. A few months from now, you may not remember what you were up to when you created a particular set of files, or you may not remember what conclusions you drew. You will either have to then spend time reconstructing your previous experiments or lose whatever insights you gained from those experiments.
This leads to the second principle, which is actually more like a version of Murphy’s Law: Everything you do, you will probably have to do over again. Inevitably, you will discover some flaw in your initial preparation of the data being analyzed, or you will get access to new data, or you will decide that your parameterization of a particular model was not broad enough. This means that the experiment you did last week, or even the set of experiments you’ve been working on over the past month, will probably need to be redone. If you have organized and documented your work clearly, then repeating the experiment with the new […]
[…] under a common root directory. The exception to this rule is source code or scripts that are used in multiple projects. Each such program might have a project directory of its own.
Within a given project, I use a top-level organization that is logical, with chronological organization at the next level, and logical organization below that. A sample project, called msms, is shown in Figure 1. At the root of most of my projects, I have a data directory for storing fixed data sets, a results directory for tracking computational experiments performed on that data, a doc directory with one subdirectory per manuscript, and directories such as src for source code and bin for compiled binaries or scripts.
Within the data and results directories, it is often tempting to apply a similar logical organization. For example, you may have two or three data sets against which you plan to benchmark your algorithms, so you could create one directory for each of them under data. In my experience, this approach is risky because the logical structure of your final […]
http://software.ac.uk
Specific Approaches/Tools
1. Write code for humans
Write code for humans (not computers!)
• For
• yourself today, in 6 months & in 3 years
• colleagues / collaborators
• reviewers
• other random people who may reuse/improve your code
• Respect conventions (e.g., a style guide)
Damian Conway
Use whitespace/indentation!
Same information
Line length
Strive to limit your code to 80 characters per line. This fits comfortably on a printed page with a
reasonably sized font. If you find yourself running out of room, this is a good indication that you
should encapsulate some of the work in a separate function.

# Hard to read: everything crammed onto one long line
ant_measurements <- read.table(file = '~/Downloads/Web/ant_measurements.txt', header=TRUE, sep='\t', col.names = c('colony', 'individual', 'headwidth', 'mass'))

# Easier to read: one argument per line, each line under 80 characters
ant_measurements <- read.table(
  file      = '~/Downloads/Web/ant_measurements.txt',
  header    = TRUE,
  sep       = '\t',
  col.names = c('colony', 'individual', 'headwidth', 'mass')
)
Subset of R style guide: http://r-pkgs.had.co.nz/style.html
Write code for humans (not computers!)
• If it runs "fast enough", no need to optimise (generally…)
Code reviews: ask a peer to (critically) read your analysis code.
And/or do pair-programming sessions.
Specific Approaches/Tools
1. Write code for humans
2. Organise mindfully
Organise mindfully http://bit.ly/projectstruct
Choose a standard/template and stick to it!
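One way to stick to a template is to create it with a script instead of by hand. The sketch below builds the top-level directories named in Noble's guide quoted earlier (data, results, doc, src, bin). `create_project()` is a hypothetical helper rather than part of any package, and the README file is an added assumption, not something the slide prescribes.

```r
# Sketch: create a project skeleton with the top-level directories
# named in Noble's guide (data, results, doc, src, bin).
# `create_project()` is a hypothetical helper, not a package function.
create_project <- function(root) {
  subdirs <- c("data", "results", "doc", "src", "bin")
  for (d in file.path(root, subdirs)) {
    dir.create(d, recursive = TRUE, showWarnings = FALSE)
  }
  # Assumption: a README at the root records the project name and date.
  writeLines(
    c(paste("Project:", basename(root)),
      paste("Created:", format(Sys.Date()))),
    file.path(root, "README.txt")
  )
  invisible(file.path(root, subdirs))
}

# Example: build the skeleton in a temporary location.
project_root <- file.path(tempdir(), "msms")
created <- create_project(project_root)
print(list.files(project_root))
```

Because every project then starts from the same layout, collaborators (and you in six months) know where to look for raw data, results and code.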
Specific Approaches/Tools
1. Write code for humans
2. Organise mindfully
3. Plan for mistakes
Create code tests that are easy to run
• Unit tests == checking edge cases to see if the function works
# do your stuff
# e.g. define speed() function
library(testthat)
expect_equal(speed(km = 0, minutes = 60), 0)
expect_equal(speed(km = 60, minutes = 60), 1)
expect_s3_class(my_model, "lm")
• Integration tests
• == "full analysis" but on small/fake data with known results
• e.g. on a fake VCF genotype file of 2 loci (one true positive, one true negative)
• Add "sanity checks". Nonsensical commands should fail!
speed(km = "twenty", minutes = 20) # should fail
speed(km = -4, minutes = 60) # should fail
expect_error(speed(km = -4, minutes = 60))
expect_equal(nrow(significant_SNPs), 42)
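The slide refers to a `speed()` function without showing it. A minimal sketch of what such a function might look like follows, written in base R so it runs without extra packages. The implementation, the error messages and the `fails()` helper are assumptions made for illustration; the unit (km per minute) is inferred from the slide's expectation that 60 km in 60 minutes equals 1.

```r
# Hypothetical implementation of the speed() function from the slide.
# Returns speed in km per minute and refuses nonsensical input.
speed <- function(km, minutes) {
  if (!is.numeric(km) || !is.numeric(minutes)) {
    stop("km and minutes must be numeric")
  }
  if (any(km < 0)) {
    stop("km must not be negative")
  }
  if (any(minutes <= 0)) {
    stop("minutes must be positive")
  }
  km / minutes
}

# Helper (hypothetical): TRUE if evaluating `expr` raises an error.
fails <- function(expr) {
  inherits(try(expr, silent = TRUE), "try-error")
}

# Unit tests: edge cases with known answers.
stopifnot(speed(km = 0, minutes = 60) == 0)
stopifnot(speed(km = 60, minutes = 60) == 1)

# Sanity checks: nonsensical commands should fail.
stopifnot(fails(speed(km = "twenty", minutes = 20)))
stopifnot(fails(speed(km = -4, minutes = 60)))
```

The input checks live inside the function, so every caller benefits from them, while the `stopifnot()` lines play the role of the testthat expectations on the slide.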
Automatically check consistency with style guide
install.packages("lintr") # once
library(lintr) # every time
lint("file_to_check.R")
"Continuous integration":
Tests should run automagically.
So you don't have to remember (or find time) to do it.
💾 http://github.com
Tests run automatically: http://travis-ci.org
If unexpected result: 📬
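Under the hood, a continuous-integration service runs a command and looks at its exit status. The sketch below shows, in base R, the kind of script such a service could call with `Rscript`. `run_checks()` and the two example checks are hypothetical; this is not the API of Travis CI or of any other service.

```r
# Sketch of a test runner that a CI service could call via `Rscript`.
# `run_checks()` and the checks below are hypothetical examples.
run_checks <- function(checks) {
  vapply(checks, function(check) {
    # A check passes if it runs without raising an error.
    !inherits(try(check(), silent = TRUE), "try-error")
  }, logical(1))
}

checks <- list(
  arithmetic_works    = function() stopifnot(1 + 1 == 2),
  deliberately_broken = function() stopifnot(nchar("ACGT") == 5)
)

results <- run_checks(checks)
print(results)

# Exit status 0 means success; any other value marks the run as failed,
# which is what triggers the notification.
exit_status <- if (all(results)) 0L else 1L
# In a real script, the last line would be: quit(status = exit_status)
```

The second check fails on purpose, to show that one broken check is enough to turn the whole run red.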
Write less code, less information
[Figure: amount of bugs vs. amount you write]
Let the data (filenames) provide information?
Use tried & tested code where possible
DRY: Don’t Repeat Yourself
& don't reinvent the wheel.
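As a small illustration of DRY, compare pasting the same calculation once per group with writing it once and applying it to every group using a well-tested base R function. The data frame below is made up; its column names reuse those of the `ant_measurements` example earlier in the talk.

```r
# Repetitive version (shown as comments): the same step pasted per colony.
# A typo in any one copy would go unnoticed.
# mean_a <- mean(colony_a$headwidth, na.rm = TRUE)
# mean_b <- mean(colony_b$headwidth, na.rm = TRUE)

# DRY version: write the step once, apply it to every group.
# The measurements are invented for illustration.
ants <- data.frame(
  colony    = c("A", "A", "B", "B", "C"),
  headwidth = c(1.0, 1.2, 0.8, NA, 1.5)
)

# tapply() is part of base R: tried, tested and already documented.
mean_headwidth <- tapply(ants$headwidth, ants$colony, mean, na.rm = TRUE)
print(mean_headwidth)
```

Adding a fourth colony now needs no new code at all, only new data.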
Specific Approaches/Tools
1. Write code for humans
2. Organise mindfully
3. Plan for mistakes
4. Use tools that reduce risks
Use tools that reduce risks
• Ensure computers are set up for productivity. E.g.,:
• using GNU parallel on a 40-core machine is in some cases more appropriate than submitting to a queue
• If you need to make a "pipeline", use software designed for this.
E.g.:
• Nextflow
• Snakemake
• (etc)
• too many examples to discuss here
knitr/rmarkdown/jupyter
Analysis & report in one.
analysis.Rmd
A minimal R Markdown example
I know the value of pi is 3.1416, and 2 times pi is 6.2832. To c[…]
library(knitr); knit("minimal.Rmd")
A paragraph here. A code chunk below:
1+1
## [1] 2
.4-.7+.3 # what? it is not zero!
## [1] 5.551e-17
Graphics work too
library(ggplot2)
qplot(speed, dist, data = cars) + geom_smooth()
Figure 1: A scatterplot of cars (x-axis: speed, y-axis: dist)
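The `.4-.7+.3` surprise in the R Markdown example above is ordinary floating-point rounding, and it matters whenever an analysis tests numbers for equality. A minimal sketch of the usual remedy in base R, comparing with a tolerance via `all.equal()` instead of `==`:

```r
x <- 0.4 - 0.7 + 0.3

# Exact comparison fails because of binary rounding error.
print(x == 0)                   # FALSE

# Comparing with a tolerance gives the answer a human expects.
print(isTRUE(all.equal(x, 0)))  # TRUE
```

`all.equal()` returns a description of the difference rather than `FALSE` when values differ, which is why it is wrapped in `isTRUE()` here.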
GitHub: Facebook for code
• Easily keep track of old versions
• Modify without risks
• Easily collaborate
• Random people use your stuff
• And find problems and fix and improve it!
• Greater impact / better world
• Identify new tools
• Build online reputation
Demo
Learn at: https://try.github.io/
Getting help.
• In real life: Make friends with people. Talk to them.
• Online:
• Specific discussion mailing lists (e.g.: R, bioruby, MAKER...)
• Programming: http://stackoverflow.com
• Bioinformatics: http://www.biostars.org
• Sequencing-related: http://seqanswers.com
• Stats: http://stats.stackexchange.com , R-help mailing list.
(5. Visualize, visualize, visualize!)
In many, many manners.
SequenceServer
“Can you BLAST this for me?”
BLAST is the most commonly used tool: >100,000 citations
But:
• convoluted interface
• challenging on custom data
Antgenomes.org SequenceServer
BLAST made easy
http://www.sequenceserver.com/
1. Installing
gem install sequenceserver
2. Launch
sequenceserver
If no config file: asks interactive setup questions.
If needed: downloads BLAST binaries.
If needed: formats FASTA into BLAST database.
### Launched SequenceServer at: http://0.0.0.0:4567
Anurag Priyam - @yeban
All queries vs all hits circos plot overview
1 query: length distribution of all hits
1 query v 1 hit "kablammo" overview
GeneValidator
Gene prediction/identification
Dozens of software algorithms: dozens of predictions
5-10% failure rate:
• missing pieces
• extra pieces
• incorrect merging
• incorrect splitting
Yandell & Ence 2013 NRG
GTTTTtACCTGTTTTtGAAAAGGTAATTTTCTTTAGATATATTATGTTGAATaTTAGGGTTTTTATAAAGAATGTGTATATTGUTTACAATATAAAAGACACAATTGCAAACTAGCATGATTGTAAACAATTGCTAAACGGATCAATATAAATTAAAATTGTAATATTAAGTATCAAACCGATAATTTTTA
Evidence
Consensus:
Bad gene predictions
"poison" analyses.
Many false positives.
https://github.com/wurmlab/genevalidator
Summary: Specific Approaches/Tools
1. Write code for humans
2. Organise mindfully
3. Plan for mistakes
4. Use tools that reduce risks
(5. Visualize, visualize, visualize!)
Bruno Vieira
Anurag Priyam
Ismail Moghul
Roddy Pracana
Joe Colgan
+ Emeline Favreau
+ Eckart Stolle
+ Leandro Santiago
+ Carlos Martinez-Ruiz

2016 05-30-monday-assemblyYannick Wurm
 
2016 05-29-intro-sib-springschool-leuker bad
2016 05-29-intro-sib-springschool-leuker bad2016 05-29-intro-sib-springschool-leuker bad
2016 05-29-intro-sib-springschool-leuker badYannick Wurm
 
2015 11-17-programming inr.key
2015 11-17-programming inr.key2015 11-17-programming inr.key
2015 11-17-programming inr.keyYannick Wurm
 
2015 11-10-bio-in-docker-oswitch
2015 11-10-bio-in-docker-oswitch2015 11-10-bio-in-docker-oswitch
2015 11-10-bio-in-docker-oswitchYannick Wurm
 
Week 5 genetic basis of evolution
Week 5   genetic basis of evolutionWeek 5   genetic basis of evolution
Week 5 genetic basis of evolutionYannick Wurm
 
Biol113 week4 evolution
Biol113 week4 evolutionBiol113 week4 evolution
Biol113 week4 evolutionYannick Wurm
 
2015 10-7-11am-reproducible research
2015 10-7-11am-reproducible research2015 10-7-11am-reproducible research
2015 10-7-11am-reproducible researchYannick Wurm
 
2015 10-7-9am regex-functions-loops.key
2015 10-7-9am regex-functions-loops.key2015 10-7-9am regex-functions-loops.key
2015 10-7-9am regex-functions-loops.keyYannick Wurm
 
2015 9-30-sbc361-research methcomm
2015 9-30-sbc361-research methcomm2015 9-30-sbc361-research methcomm
2015 9-30-sbc361-research methcommYannick Wurm
 
2015 09-29-sbc322-methods.key
2015 09-29-sbc322-methods.key2015 09-29-sbc322-methods.key
2015 09-29-sbc322-methods.keyYannick Wurm
 
2015 09-28 bio721 intro
2015 09-28 bio721 intro2015 09-28 bio721 intro
2015 09-28 bio721 introYannick Wurm
 
Sustainable software institute Collaboration workshop
Sustainable software institute Collaboration workshopSustainable software institute Collaboration workshop
Sustainable software institute Collaboration workshopYannick Wurm
 
2014 10-15-Nextbug edinburgh
2014 10-15-Nextbug edinburgh2014 10-15-Nextbug edinburgh
2014 10-15-Nextbug edinburghYannick Wurm
 

Plus de Yannick Wurm (20)

2018 09-03-ses open-fair_practices_in_evolutionary_genomics
2018 09-03-ses open-fair_practices_in_evolutionary_genomics2018 09-03-ses open-fair_practices_in_evolutionary_genomics
2018 09-03-ses open-fair_practices_in_evolutionary_genomics
 
2016 09-16-fairdom
2016 09-16-fairdom2016 09-16-fairdom
2016 09-16-fairdom
 
2016 05-31-wurm-social-chromosome
2016 05-31-wurm-social-chromosome2016 05-31-wurm-social-chromosome
2016 05-31-wurm-social-chromosome
 
2016 05-30-monday-assembly
2016 05-30-monday-assembly2016 05-30-monday-assembly
2016 05-30-monday-assembly
 
2016 05-29-intro-sib-springschool-leuker bad
2016 05-29-intro-sib-springschool-leuker bad2016 05-29-intro-sib-springschool-leuker bad
2016 05-29-intro-sib-springschool-leuker bad
 
2015 11-17-programming inr.key
2015 11-17-programming inr.key2015 11-17-programming inr.key
2015 11-17-programming inr.key
 
2015 11-10-bio-in-docker-oswitch
2015 11-10-bio-in-docker-oswitch2015 11-10-bio-in-docker-oswitch
2015 11-10-bio-in-docker-oswitch
 
Week 5 genetic basis of evolution
Week 5   genetic basis of evolutionWeek 5   genetic basis of evolution
Week 5 genetic basis of evolution
 
Biol113 week4 evolution
Biol113 week4 evolutionBiol113 week4 evolution
Biol113 week4 evolution
 
Evolution week3
Evolution week3Evolution week3
Evolution week3
 
2015 10-7-11am-reproducible research
2015 10-7-11am-reproducible research2015 10-7-11am-reproducible research
2015 10-7-11am-reproducible research
 
2015 10-7-9am regex-functions-loops.key
2015 10-7-9am regex-functions-loops.key2015 10-7-9am regex-functions-loops.key
2015 10-7-9am regex-functions-loops.key
 
Evolution week2
Evolution week2Evolution week2
Evolution week2
 
2015 9-30-sbc361-research methcomm
2015 9-30-sbc361-research methcomm2015 9-30-sbc361-research methcomm
2015 9-30-sbc361-research methcomm
 
2015 09-29-sbc322-methods.key
2015 09-29-sbc322-methods.key2015 09-29-sbc322-methods.key
2015 09-29-sbc322-methods.key
 
Sbc322 intro.key
Sbc322 intro.keySbc322 intro.key
Sbc322 intro.key
 
2015 09-28 bio721 intro
2015 09-28 bio721 intro2015 09-28 bio721 intro
2015 09-28 bio721 intro
 
Sustainable software institute Collaboration workshop
Sustainable software institute Collaboration workshopSustainable software institute Collaboration workshop
Sustainable software institute Collaboration workshop
 
2014 10-15-Nextbug edinburgh
2014 10-15-Nextbug edinburgh2014 10-15-Nextbug edinburgh
2014 10-15-Nextbug edinburgh
 
2014 12-09-oulu
2014 12-09-oulu2014 12-09-oulu
2014 12-09-oulu
 

Dernier

(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
Online banking management system project.pdf
Online banking management system project.pdfOnline banking management system project.pdf
Online banking management system project.pdfKamal Acharya
 
Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)simmis5
 
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...roncy bisnoi
 
Introduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxIntroduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxupamatechverse
 
AKTU Computer Networks notes --- Unit 3.pdf
AKTU Computer Networks notes ---  Unit 3.pdfAKTU Computer Networks notes ---  Unit 3.pdf
AKTU Computer Networks notes --- Unit 3.pdfankushspencer015
 
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCollege Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCall Girls in Nagpur High Profile
 
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Christo Ananth
 
Introduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxIntroduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxupamatechverse
 
result management system report for college project
result management system report for college projectresult management system report for college project
result management system report for college projectTonystark477637
 
UNIT-III FMM. DIMENSIONAL ANALYSIS
UNIT-III FMM.        DIMENSIONAL ANALYSISUNIT-III FMM.        DIMENSIONAL ANALYSIS
UNIT-III FMM. DIMENSIONAL ANALYSISrknatarajan
 
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Call Girls in Nagpur High Profile
 
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Dr.Costas Sachpazis
 
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 BookingVIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Bookingdharasingh5698
 
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...ranjana rawat
 
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxAsutosh Ranjan
 

Dernier (20)

(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
Online banking management system project.pdf
Online banking management system project.pdfOnline banking management system project.pdf
Online banking management system project.pdf
 
Water Industry Process Automation & Control Monthly - April 2024
Water Industry Process Automation & Control Monthly - April 2024Water Industry Process Automation & Control Monthly - April 2024
Water Industry Process Automation & Control Monthly - April 2024
 
Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)
 
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
 
Introduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxIntroduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptx
 
AKTU Computer Networks notes --- Unit 3.pdf
AKTU Computer Networks notes ---  Unit 3.pdfAKTU Computer Networks notes ---  Unit 3.pdf
AKTU Computer Networks notes --- Unit 3.pdf
 
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCollege Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
 
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
 
(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7
(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7
(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7
 
Introduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxIntroduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptx
 
result management system report for college project
result management system report for college projectresult management system report for college project
result management system report for college project
 
UNIT-III FMM. DIMENSIONAL ANALYSIS
UNIT-III FMM.        DIMENSIONAL ANALYSISUNIT-III FMM.        DIMENSIONAL ANALYSIS
UNIT-III FMM. DIMENSIONAL ANALYSIS
 
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
 
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
 
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 BookingVIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
 
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
 
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptx
 

2018 08-reduce risks of genomics research

  • 1. Yannick Wurm http://wurmlab.github.io Steps to avoid having to retract your analysis Guarujá 2018
  • 4. BIG
  • 5.
  • 6. Geoffrey Chang: Crystallographer
      • Beckman Foundation Young Investigator Award
      • Presidential Early Career Award
    Journal of Molecular Biology (2003) Chang. Structure of MsbA from Vibrio cholera: a multidrug resistance ABC transporter homolog in a closed conformation.
    PNAS (2004) Ma & Chang. Structure of the multidrug resistance efflux transporter EmrE from Escherichia coli.
    Science (2005) Reyes & Chang. Structure of the ABC transporter MsbA in complex with ADP vanadate and lipopolysaccharide.
    Science (2005) Pornillos et al. X-ray structure of the EmrE multidrug transporter in complex with a substrate.
    Science (2001) Chang & Roth. Structure of MsbA from E. coli: a homolog of the multidrug resistance ATP binding cassette (ABC) transporters.
  • 7. Comparison with 3D structure of ortholog. MsbA: Science (2001) Chang & Roth. Sav1866: Dawson & Locher (2006) Nature.
    Figure caption: Flipping fiasco. The structures of MsbA (purple) and Sav1866 (green) overlap little (left) until MsbA is inverted (right).
  • 8. LETTERS, edited by Etta Kavanagh. Retraction
    We wish to retract our Research Article "Structure of MsbA from E. coli: A homolog of the multidrug resistance ATP binding cassette (ABC) transporters" and both of our Reports "Structure of the ABC transporter MsbA in complex with ADP•vanadate and lipopolysaccharide" and "X-ray structure of the EmrE multidrug transporter in complex with a substrate" (1–3).
    The recently reported structure of Sav1866 (4) indicated that our MsbA structures (1, 2, 5) were incorrect in both the hand of the structure and the topology. Thus, our biological interpretations based on these inverted models for MsbA are invalid.
    An in-house data reduction program introduced a change in sign for anomalous differences. This program, which was not part of a conventional data processing package, converted the anomalous pairs (I+ and I-) to (F- and F+), thereby introducing a sign change. As the diffraction data collected for each set of MsbA crystals and for the EmrE crystals were processed with the same program, the structures reported in (1–3, 5, 6) had the wrong hand.
    The error in the topology of the original MsbA structure was a consequence of the low resolution of the data as well as breaks in the electron density for the connecting loop regions. Unfortunately, the use of the multicopy refinement procedure still allowed us to obtain reasonable refinement values for the wrong structures.
    The Protein Data Bank (PDB) files 1JSQ, 1PF4, and 1Z2R for MsbA and 1S7B and 2F2M for EmrE have been moved to the archive of obsolete PDB entries. The MsbA and EmrE structures will be recalculated from the original data using the proper sign for the anomalous differences, and the new Cα coordinates and structure factors will be deposited.
    We very sincerely regret the confusion that these papers have caused and, in particular, subsequent research efforts that were unproductive as a result of our original findings.
    Geoffrey Chang, Christopher B. Roth, Christopher L. Reyes, Owen Pornillos, Yen-Ju Chen, Andy P. Chen. Department of Molecular Biology, The Scripps Research Institute, La Jolla, CA 92037, USA.
    References
    1. G. Chang, C. B. Roth, Science 293, 1793 (2001).
    2. C. L. Reyes, G. Chang, Science 308, 1028 (2005).
    3. O. Pornillos, Y.-J. Chen, A. P. Chen, G. Chang, Science 310, 1950 (2005).
    4. R. J. Dawson, K. P. Locher, Nature 443, 180 (2006).
    5. G. Chang, J. Mol. Biol. 330, 419 (2003).
    6. C. Ma, G. Chang, Proc. Natl. Acad. Sci. U.S.A. 101, 2852 (2004).
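The sign flip described in the retraction is the kind of error that a very small regression test can catch. The sketch below is purely illustrative and is not the program used by Chang's group: the record layout and the function names (`anomalous_differences`, `anomalous_differences_swapped`, `check_sign_convention`) are assumptions invented for this example. It shows that reading the two columns of an anomalous pair in the wrong order gives values of the right magnitude but the opposite sign, which can look plausible unless the output is compared against a hand-computed case.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical record layout for this sketch: (reflection_id, I_plus, I_minus)
Row = Tuple[str, float, float]


def anomalous_differences(rows: List[Row]) -> Dict[str, float]:
    """Return I(+) - I(-) for each reflection."""
    return {rid: i_plus - i_minus for rid, i_plus, i_minus in rows}


def anomalous_differences_swapped(rows: List[Row]) -> Dict[str, float]:
    """Same calculation, but with the two columns used in the wrong order."""
    return {rid: i_minus - i_plus for rid, i_plus, i_minus in rows}


def check_sign_convention(func: Callable[[List[Row]], Dict[str, float]]) -> bool:
    """Compare a function's output with a tiny case worked out by hand."""
    known_input = [("h1", 10.0, 7.0), ("h2", 3.0, 8.0)]
    expected = {"h1": 3.0, "h2": -5.0}
    return func(known_input) == expected


print(check_sign_convention(anomalous_differences))          # True
print(check_sign_convention(anomalous_differences_swapped))  # False
```

A check like this takes minutes to write, and it could be rerun every time the data-processing script changes.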
  • 9. 😥 Geoffrey Chang
      • Beckman Foundation Young Investigator Award
      • Presidential Early Career Award
    Science (2001) Chang & Roth. Structure of MsbA from E. coli: a homolog of the multidrug resistance ATP binding cassette (ABC) transporters.
    Journal of Molecular Biology (2003) Chang. Structure of MsbA from Vibrio cholera: a multidrug resistance ABC transporter homolog in a closed conformation.
    PNAS (2004) Ma & Chang. Structure of the multidrug resistance efflux transporter EmrE from Escherichia coli.
    Science (2005) Reyes & Chang. Structure of the ABC transporter MsbA in complex with ADP vanadate and lipopolysaccharide.
    Science (2005) Pornillos et al. X-ray structure of the EmrE multidrug transporter in complex with a substrate.
    SCIENTIFIC PUBLISHING. A Scientist's Nightmare: Software Problem Leads to Five Retractions
    Until recently, Geoffrey Chang's career was on a trajectory most young scientists only dream about. In 1999, at the age of 28, the protein crystallographer landed a faculty position at the prestigious Scripps Research Institute in San Diego, California. The next year, in a ceremony at the White House, Chang received a Presidential Early Career Award for Scientists and Engineers, the country's highest honor for young researchers. His lab generated a stream of high-profile papers detailing the molecular structures of important proteins embedded in cell membranes.
    Then the dream turned into a nightmare. In September, Swiss researchers published a paper in Nature that cast serious doubt on a protein structure Chang's group had described in a 2001 Science paper. When he investigated, Chang was horrified to discover that a homemade data-analysis pro…
    … 2001 Science paper, which described the structure of a protein called MsbA, isolated from the bacterium Escherichia coli. MsbA belongs to a huge and ancient family of molecules that use energy from adenosine triphosphate to transport molecules across cell membranes. These so-called ABC transporters perform many …
  • 10. Retraction (excerpt): … reasonable refinement values for the wrong structures.
    The Protein Data Bank (PDB) files 1JSQ, 1PF4, and 1Z2R for MsbA and 1S7B and 2F2M for EmrE have been moved to the archive of obsolete PDB entries. The MsbA and EmrE structures will be recalculated from the original data using the proper sign for the anomalous differences, and the new Cα coordinates and structure factors will be deposited.
    We very sincerely regret the confusion that these papers have caused and, in particular, subsequent research efforts that were unproductive as a result of our original findings.
    Geoffrey Chang, Christopher B. Roth, Christopher L. Reyes, Owen Pornillos, Yen-Ju Chen, Andy P. Chen. Department of Molecular Biology, The Scripps Research Institute, La Jolla, CA 92037, USA.
  • 11. This is costly for:
      • the individual
      • collaborators
      • the institution
      • 1000s of researchers performing follow-up work
      • science
      • society
  • 12. This changes everything. Cost to sequence 1,000,000 nucleotides. Any lab can sequence anything! We generate 50,000x more data per $ than 10 years ago.
  • 13.
  • 14. Llorente et al 2015 Science
  • 15. 20 types of cancer, 6,000 samples total. Methylation profiles analysed -> cancer cells are 36 years older than normal cells. >500 citations.
    -> actually no general pattern
  • 16. Genome bioinformatics is hard. Biology is harder than (many) other data sciences.
      • Understanding/visualising/analysing/massaging big data is hard.
      • Biology/life is complex.
      • Biologists lack computational training.
      • Field is young.
      • Analysis tools (generally) suck:
          • badly written
          • badly tested
          • hard to install
          • output quality… often questionable.
      • Data sizes keep growing!
      • Data formats keep changing :(
  • 17.
  • 19. Some sources of inspiration
      • Avoid costly mistakes
      • Be faster: "stand on the shoulders of giants"
      • Increase impact / visibility
  • 20.
  • 21. http://wurmlab.github.io Community Page Best Practices for Scientific Computing Greg Wilson1 *, D. A. Aruliah2 , C. Titus Brown3 , Neil P. Chue Hong4 , Matt Davis5 , Richard T. Guy6¤ , Steven H. D. Haddock7 , Kathryn D. Huff8 , Ian M. Mitchell9 , Mark D. Plumbley10 , Ben Waugh11 , Ethan P. White12 , Paul Wilson13 1 Mozilla Foundation, Toronto, Ontario, Canada, 2 University of Ontario Institute of Technology, Oshawa, Ontario, Canada, 3 Michigan State University, East Lansing, Michigan, United States of America, 4 Software Sustainability Institute, Edinburgh, United Kingdom, 5 Space Telescope Science Institute, Baltimore, Maryland, United States of America, 6 University of Toronto, Toronto, Ontario, Canada, 7 Monterey Bay Aquarium Research Institute, Moss Landing, California, United States of America, 8 University of California Berkeley, Berkeley, California, United States of America, 9 University of British Columbia, Vancouver, British Columbia, Canada, 10 Queen Mary University of London, London, United Kingdom, 11 University College London, London, United Kingdom, 12 Utah State University, Logan, Utah, United States of America, 13 University of Wisconsin, Madison, Wisconsin, United States of America Introduction Scientists spend an increasing amount of time building and using software. However, most scientists are never taught how to do this efficiently. As a result, many are unaware of tools and practices that would allow them to write more reliable and maintainable code with less effort. We describe a set of best practices for scientific software development that have solid foundations in research and experience, and that improve scientists’ productivity and the reliability of their software. Software is as important to modern scientific research as telescopes and test tubes. 
From groups that work exclusively on computational problems, to traditional laboratory and field scientists, more and more of the daily operation of science revolves around developing new algorithms, managing and analyzing the large amounts of data that are generated in single research projects, combining disparate datasets to assess synthetic problems, and other computational tasks. Scientists typically develop their own software for these purposes because doing so requires substantial domain-specific knowledge. As a result, recent studies have found that scientists typically spend 30% or more of their time developing software [1,2]. However, 90% or more of them are primarily self-taught [1,2], and therefore lack exposure to basic software development practices such as writing maintainable code, using version control and issue error from another group’s code was not discovered until after publication [6]. As with bench experiments, not everything must be done to the most exacting standards; however, scientists need to be aware of best practices both to improve their own approaches and for reviewing computational work by others. This paper describes a set of practices that are easy to adopt and have proven effective in many research settings. Our recommenda- tions are based on several decades of collective experience both building scientific software and teaching computing to scientists [17,18], reports from many other groups [19–25], guidelines for commercial and open source software development [26,27], and on empirical studies of scientific computing [28–31] and software development in general (summarized in [32]). None of these practices will guarantee efficient, error-free software development, but used in concert they will reduce the number of errors in scientific software, make it easier to reuse, and save the authors of the software time and effort that can used for focusing on the underlying scientific questions. 
Our practices are summarized in Box 1; labels in the main text such as ‘‘(1a)’’ refer to items in that summary. For reasons of space, we do not discuss the equally important (but independent) issues of reproducible research, publication and citation of code and data, and open science. We do believe, however, that all of these will be much easier to implement if scientists have the skills we describe. Education A Quick Guide to Organizing Computational Biology Projects William Stafford Noble1,2 * 1 Department of Genome Sciences, School of Medicine, University of Washington, Seattle, Washington, United States of America, 2 Department of Computer Science and Engineering, University of Washington, Seattle, Washington, United States of America Introduction Most bioinformatics coursework focus- es on algorithms, with perhaps some components devoted to learning pro- gramming skills and learning how to use existing bioinformatics software. Un- fortunately, for students who are prepar- ing for a research career, this type of curriculum fails to address many of the day-to-day organizational challenges as- sociated with performing computational experiments. In practice, the principles behind organizing and documenting computational experiments are often learned on the fly, and this learning is strongly influenced by personal predilec- tions as well as by chance interactions with collaborators or colleagues. The purpose of this article is to describe one good strategy for carrying out com- putational experiments. I will not describe profound issues such as how to formulate hypotheses, design experiments, or draw conclusions. Rather, I will focus on relatively mundane issues such as organiz- ing files and directories and documenting understanding your work or who may be evaluating your research skills. Most com- monly, however, that ‘‘someone’’ is you. 
A few months from now, you may not remember what you were up to when you created a particular set of files, or you may not remember what conclusions you drew. You will either have to then spend time reconstructing your previous experiments or lose whatever insights you gained from those experiments. This leads to the second principle, which is actually more like a version of Murphy’s Law: Everything you do, you will probably have to do over again. Inevitably, you will discover some flaw in your initial preparation of the data being analyzed, or you will get access to new data, or you will decide that your parameterization of a particular model was not broad enough. This means that the experiment you did last week, or even the set of experiments you’ve been working on over the past month, will probably need to be redone. If you have organized and documented your work clearly, then repeating the experiment with the new […] under a common root directory. The exception to this rule is source code or scripts that are used in multiple projects. Each such program might have a project directory of its own. Within a given project, I use a top-level organization that is logical, with chronological organization at the next level, and logical organization below that. A sample project, called msms, is shown in Figure 1. At the root of most of my projects, I have a data directory for storing fixed data sets, a results directory for tracking computational experiments performed on that data, a doc directory with one subdirectory per manuscript, and directories such as src for source code and bin for compiled binaries or scripts. Within the data and results directories, it is often tempting to apply a similar logical organization.
For example, you may have two or three data sets against which you plan to benchmark your algorithms, so you could create one directory for each of them under data. In my experience, this approach is risky because the logical structure of your fina[…]
http://software.ac.uk
  • 23. http://wurmlab.github.io Write code for humans (not computers!) • For • yourself today, in 6 months & in 3 years • colleagues / collaborators • reviewers • other random people who may reuse/improve your code • Respect conventions (e.g., a style guide)
  • 24. Use whitespace/indentation! Same information. (Damian Conway)
  • 25. Line length: Strive to limit your code to 80 characters per line. This fits comfortably on a printed page with a reasonably sized font. If you find yourself running out of room, this is a good indication that you should encapsulate some of the work in a separate function.
      ant_measurements <- read.table(file = '~/Downloads/Web/ant_measurements.txt', header=TRUE, se

      ant_measurements <- read.table(file = '~/Downloads/Web/ant_measurements.txt',
                                     header = TRUE,
                                     sep = '\t',
                                     col.names = c('colony', 'individual', 'headwidth', 'mass')
                                     )

      ant_measurements <- read.table(file = '~/Downloads/Web/ant_measurements.txt', header=TRUE,
       sep='\t', col.names = c('colony', 'individual', 'headwidth', 'mass'))
    Subset of R style guide http://r-pkgs.had.co.nz/style.html
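One way to act on the "encapsulate some of the work in a separate function" advice is sketched below. The function name `read_ant_measurements` and the made-up example file are assumptions for illustration; the column names come from the slide.

```r
# Hypothetical wrapper: keeps the long read.table() call in one place,
# so analysis scripts stay short and within the 80-character limit.
read_ant_measurements <- function(path) {
  read.table(
    file = path,
    header = TRUE,
    sep = "\t",
    col.names = c("colony", "individual", "headwidth", "mass")
  )
}

# Usage on a small made-up file (not real data), so the sketch runs anywhere.
example_path <- tempfile(fileext = ".txt")
writeLines(
  c("col\tind\thw\tmass", "A\t1\t1.2\t3.4", "A\t2\t1.1\t3.0"),
  example_path
)
ant_measurements <- read_ant_measurements(example_path)
print(ant_measurements)
```

Each script that needs the data then calls the wrapper with a path, and any change to the file format is made once.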
  • 28. http://wurmlab.github.io Write code for humans (not computers!) • For • yourself today, in 6 months & in 3 years • colleagues / collaborators • reviewers • other random people who may want to reuse your code • Respect conventions (e.g., a style guide) • If it runs "fast enough", no need to optimise (generally…)
  • 29. http://wurmlab.github.io Code reviews: ask a peer to (critically) read your analysis code. And/or do peer-programming sessions
  • 30. http://wurmlab.github.io Specific Approaches/Tools 1. Write code for humans 2. Organise mindfully
  • 34. http://wurmlab.github.io Specific Approaches/Tools 1. Write code for humans 2. Organise mindfully 3. Plan for mistakes
  • 35. http://wurmlab.github.io Create code tests that are easy to run
    • Unit tests == checking edge cases to see if the function works
        # do your stuff
        # e.g. define speed() function
        library(testthat)
        expect_that(speed(km = 0, minutes = 60), equals(0))
        expect_that(speed(km = 60, minutes = 60), equals(1))
        expect_that(my_model, is_a("lm"))
    • Integration tests == "full analysis" but on small/fake data with known results
      • e.g. on fake VCF genotype file of 2 loci (one true positive, one true negative)
    • Add "sanity checks". Nonsensical commands should fail!
        speed(km = "twenty", minutes = 20) # should fail
        speed(km = -4, minutes = 60) # should fail
        expect_that(speed(km = -4, minutes = 60), throws_error())
        expect_that(nrow(significant_SNPs), equals(42))
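A minimal sketch of a `speed()` function that would pass the checks on this slide. It assumes speed is reported in km per minute, which matches the slide's expectation that 60 km in 60 minutes equals 1. The validation rules and the `fails()` helper are illustrative assumptions, and only base R is used so the sketch runs without testthat.

```r
# Hypothetical implementation of the slide's speed() function.
# Assumption: the result is in km per minute.
speed <- function(km, minutes) {
  # Sanity checks: nonsensical input should fail loudly, not return garbage.
  stopifnot(
    is.numeric(km), is.numeric(minutes),
    length(km) == 1, length(minutes) == 1,
    km >= 0, minutes > 0
  )
  km / minutes
}

# Illustrative helper: TRUE if evaluating the expression raises an error.
fails <- function(expr) {
  inherits(try(expr, silent = TRUE), "try-error")
}

# The same checks as on the slide, written with base R.
stopifnot(speed(km = 0, minutes = 60) == 0)
stopifnot(speed(km = 60, minutes = 60) == 1)
stopifnot(fails(speed(km = "twenty", minutes = 20)))
stopifnot(fails(speed(km = -4, minutes = 60)))
```

Keeping the checks in a file that runs with a single command makes it cheap to rerun them after every change.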
  • 36. Automatically check consistency with style guide
      install.packages("lintr")  # once
      library(lintr)             # every time
      lint("file_to_check.R")
  • 37. http://wurmlab.github.io "Continuous integration": Tests should run automagically. So you don't have to remember (or find time) to do it. 💾http://github.org Tests run automatically http://travis-ci.org If unexpected result: 📬
  • 38. Write less code, less information. [Figure: amount of bugs plotted against amount you write] Let the data (filenames) provide information? Use tried & tested code where possible. DRY: Don’t Repeat Yourself & don't reinvent the wheel.
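One possible reading of "let the data (filenames) provide information": keep sample metadata in a consistent filename pattern and parse it with a single small function instead of retyping it for each sample. The naming scheme, file paths and function name below are hypothetical.

```r
# Hypothetical naming scheme: <colony>_<caste>_<replicate>.<extension>
parse_sample_name <- function(paths) {
  # Drop the directory and extension, then split on underscores.
  stems <- tools::file_path_sans_ext(basename(paths))
  fields <- strsplit(stems, "_", fixed = TRUE)
  # Fail early if any file does not follow the scheme.
  stopifnot(all(lengths(fields) == 3))
  data.frame(
    file = paths,
    colony = vapply(fields, `[`, character(1), 1),
    caste = vapply(fields, `[`, character(1), 2),
    replicate = vapply(fields, `[`, character(1), 3),
    stringsAsFactors = FALSE
  )
}

# Made-up file names, used only to show the idea.
samples <- parse_sample_name(c("data/c12_queen_rep1.fastq",
                               "data/c12_worker_rep2.fastq"))
print(samples)
```

Because the metadata is derived rather than typed, adding a sample means adding a file, not editing code in several places.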
  • 39. http://wurmlab.github.io Specific Approaches/Tools 1. Write code for humans 2. Organise mindfully 3. Plan for mistakes 4. Use tools that reduce risks
  • 40. http://wurmlab.github.io Use tools that reduce risks • Ensure computers are set up for productivity. E.g.: • using GNU parallel on a 40-core machine is in some cases more appropriate than submitting to a queue • If you need to make a "pipeline", use software designed for this. E.g.: • Nextflow • Snakemake • (etc.) • too many examples to discuss here
  • 41. knitr/rmarkdown/jupyter: Analysis & report in one. analysis.Rmd
      A minimal R Markdown example
      I know the value of pi is 3.1416, and 2 times pi is 6.2832. To c[…]
      library(knitr); knit("minimal.Rmd")
      A paragraph here. A code chunk below:
      1+1
      ## [1] 2
      .4-.7+.3 # what? it is not zero!
      ## [1] 5.551e-17
      Graphics work too
      library(ggplot2)
      qplot(speed, dist, data = cars) + geom_smooth()
      Figure 1: A scatterplot of cars (x axis: speed, y axis: dist)
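The non-zero result of `.4-.7+.3` in this example is ordinary floating-point rounding. A short sketch of comparing such values with a tolerance instead of `==`, using base R's `all.equal()`; the variable names are illustrative.

```r
x <- .4 - .7 + .3

# Exact comparison is fragile for floating-point results.
exact_match <- x == 0

# all.equal() compares within a small tolerance. Wrap it in isTRUE()
# because it returns a character description when the values differ.
tolerant_match <- isTRUE(all.equal(x, 0))

print(c(exact = exact_match, tolerant = tolerant_match))
```

The same pattern applies in tests: check that a result is within a tolerance of the expected value rather than exactly equal to it.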
  • 43. Github: Facebook for code • Easily keep track of old versions • Modify without risks • Easily collaborate • Random people use your stuff • And find problems and fix and improve it! • Greater impact / better world • Identify new tools • Build online reputation Demo Learn at: https://try.github.io/
  • 44. Getting help. • In real life: Make friends with people. Talk to them. • Online: • Specific discussion mailing lists (e.g.: R, bioruby, MAKER...) • Programming: http://stackoverflow.com • Bioinformatics: http://www.biostars.org • Sequencing-related: http://seqanswers.com • Stats: http://stats.stackexchange.com , R-help mailing list.
  • 47. in many many manners. (5. Visualize visualise visualize!)
  • 49. “Can you BLAST this for me?” BLAST is the most commonly used tool: >100,000 citations. But: • convoluted interface • challenging on custom data. Antgenomes.org. SequenceServer: BLAST made easy
  • 50. http://www.sequenceserver.com/
      1. Installing
          gem install sequenceserver
      2. Launch
          sequenceserver
          ### Launched SequenceServer at: http://0.0.0.0:4567
      If no config file: Asks interactive setup questions. If needed: Downloads BLAST binaries. If needed: Formats FASTA into BLAST database.
      Anurag Priyam - @yeban
  • 53. All queries vs all hits circos plot overview
  • 54. 1 query: length distribution of all hits 1 query v 1 hit "kablammo" overview
  • 56. Gene prediction/identification. Dozens of software algorithms: dozens of predictions. 5-10% failure rate: • missing pieces • extra pieces • incorrect merging • incorrect splitting (Yandell & Ence 2013 NRG). [Figure: example DNA sequence with tracks labelled Evidence and Consensus] Bad gene predictions "poison" analyses. Many false positives.
  • 58. http://wurmlab.github.io Summary: Specific Approaches/Tools 1. Write code for humans 2. Organise mindfully 3. Plan for mistakes 4. Use tools that reduce risks Bruno Vieira Anurag Priyam Ismail Moghul Roddy Pracana Joe Colgan +Emeline Favreau +Eckart Stolle +Leandro Santiago +Carlos Martinez-Ruiz (5. Visualize visualize visualize!)