Open Data 
Open Notebook Science 
Peter Murray-Rust, 
Open Science, Rio, BR, 2014-08-22
Retrieved 2014-08-08 
Lancet 2011 
31 USD 
For 1 day 
PMR: Closed Access Means People Die
Overview 
• Most scientific data is lost; costs many billions… 
• … AND LIVES. 
• Human problem; lack of vision + active 
opposition. 
• Born-open data and Open Notebook Science 
• Jean-Claude Bradley 
• Panton Principles and Fellows (OKFN) 
• Digital Enlightenment or Digital Darkness?
Reasons for Open Data/Science 
• Moral: Closed can be unjust 
• Ethical: Community norms expect it 
• Utilitarian: Greater communal good
• Personal: Greater personal benefit
RCUK 
Wellcome 
ERC 
NSF 
FWF… 
require 
fully OPEN 
[Neelie Kroes at Research Data Alliance:] we are entering a new “era of open science”, which will be “good
for citizens, good for scientists and good for society”. 
She explicitly highlighted the transformative potential of open access, open data, open 
software and open educational resources – mentioning the EU’s policy requiring open access 
to all publications and data resulting from EU funded research. 
http://blog.okfn.org/2013/03/21/we-are-entering-an-era-of-open-science-says-eu-vp-neelie-kroes/#sthash.3SWDXDE6.dpuf
Scientific and Medical publication (STM)[+] 
• World Citizens pay $400,000,000,000… 
• … for research in 1,500,000 articles … 
• … cost $300,000 each to create … 
• … $7000 each to “publish” [*]… 
• … $10,000,000,000 from academic libraries … 
• … to “publishers” who forbid access to 99.9% of 
citizens of the world … 
[+] Figures probably ±50%
[*] arXiv preprint server costs $7 USD per paper
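The per-article figures above appear to follow from the totals by simple division. A minimal sketch of that arithmetic, using only the numbers on this slide (the variable names are illustrative, and the slide itself puts the uncertainty at about ±50%):

```python
# All figures are taken from the slide; variable names are illustrative only.
total_research_spend_usd = 400_000_000_000  # paid by world citizens
library_spend_usd = 10_000_000_000          # paid by academic libraries
articles = 1_500_000                        # articles produced
arxiv_cost_per_paper_usd = 7                # arXiv figure quoted on the slide

# Cost to create one article: total research spend divided by article count.
cost_to_create = total_research_spend_usd / articles

# Cost to "publish" one article: library spend divided by article count.
cost_to_publish = library_spend_usd / articles

# How many times more expensive than the quoted arXiv cost per paper.
publish_vs_arxiv = cost_to_publish / arxiv_cost_per_paper_usd

print(f"create:  ${cost_to_create:,.0f} per article (slide rounds to $300,000)")
print(f"publish: ${cost_to_publish:,.0f} per article (slide rounds to $7000)")
print(f"publish / arXiv: about {publish_vs_arxiv:,.0f}x")
```

On these numbers, conventional publication costs roughly a thousand times the quoted arXiv figure per paper.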
US Taxpayers spend 139 Billion USD / yr 
on Scientific Research 
4 Billion USD on human genome 
yielded 800 Billion USD and 4 M job-years
Bad publication wastes science 
…three problems—flawed design, non-publication, 
and poor reporting—together 
meant >85% of research funds were wasted, a 
global total loss >100 billion USD per year.
[Lancet 2009, http://www.thelancet.com/journals/lancet/article/PIIS0140-6736%2809%2960329-9/fulltext]
[Even more] waste clearly occurs after 
publication: from poor access, poor 
dissemination, and poor uptake of the findings 
of research. 
[PLOS Medicine 2014-05-27 DOI: 10.1371/journal.pmed.1001651]
Authors don’t deposit data (Ross Mounce)
C) What’s the problem with this spectrum? 
Original thanks to ChemBark 
Org. Lett., 2011, 13 (15), pp 4084–4087
After AMI2 processing….. 
… AMI2 has detected a square
PM-R writes about 
how Open gave him 
5 jobs 
August 2014 
Marcus Hanwell 
http://opensource.com/tags/open-science 
Ross Mounce
Traditional Research and Publication
[Diagram labels: “Lab” work; write, rewrite, re-experiment; paper/thesis; publish; ???; Validation??; DATA; process “belongs” to publisher; output “belongs” to publisher; walls of academia]
Free/Open Software Development
[Diagram labels: CODE; REPOSITORY; world community; validate, rewrite, fork, re-use; GitHub, BitBucket, StackOverflow, Apache; inspires; OSI; NO WALLS; BORN-OPEN-SOURCE]
Example: ContentMine at
http://github.com/ContentMine/quickscrape
BornOS commits in 4 hours
Continuous integration in PMR group 
does the code still work?
Open data
Restrictions on Re-use of Crystallographic data 
NOTE: The CCDC is based on data contributed by 
scientists as part of publication and validation
Elsevier wants to control Open Data 
Vice-Chancellor, Cambridge
[asked by Michelle Brook]
Licences destroy Content Mining 
WE WALKED OUT 
• British Library
• JISC 
• RLUK 
• OKFN 
• … 
• Ross Mounce 
• PM-R 
STM Publishers Licence 
2012_03_15_Sample_Licence_Text_Data_Mining.pdf 
(Summary: PMR has NO rights) 
• [cannot publish to: ] “libraries, repositories, or archives” 
• [cannot] “Make the results of any TDM Output available on an externally facing server or 
website” 
• “Subscriber shall pay a […] fee” 
Heather Piwowar: “negotiating with publishers [made me physically ill]”
Human Genome Project 
https://en.wikipedia.org/wiki/Bermuda_Principles 
• Automatic release of sequence assemblies larger than 1 
kb (preferably within 24 hours). 
• Immediate publication of finished annotated 
sequences. 
• Aim to make the entire sequence freely available in the 
public domain for both research and development in 
order to maximise benefits to society.
Panton Principles for Open Data in
Science (2010)
• PUBLISH YOUR DATA OPENLY 
• …make an explicit and robust statement of your wishes. 
• Use a recognized waiver or license that is appropriate for 
data. 
• open as defined by the Open Knowledge/Data Definition 
(… NOT non-commercial) 
• Explicit dedication of data … into the public domain via 
PDDL or CCZero 
Peter Murray-Rust, Cameron Neylon, Rufus Pollock, John 
Wilbanks
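The licensing points above can be read as a simple decision rule. The sketch below is only an illustration of that reading, not part of the Panton Principles: the function name `panton_status`, the identifier spellings, and the returned labels are all assumptions made for this example.

```python
from typing import Optional

# Hypothetical identifiers for the public domain dedications named on the slide
# ("PDDL or CCZero"); the exact spellings accepted here are an assumption.
PUBLIC_DOMAIN_DEDICATIONS = {"PDDL", "CC0", "CCZERO"}


def panton_status(licence: Optional[str]) -> str:
    """Classify a licence identifier against the points listed on the slide."""
    if not licence or not licence.strip():
        # The slide asks for an explicit and robust statement of your wishes.
        return "no explicit statement"
    name = licence.strip().upper()
    if name in PUBLIC_DOMAIN_DEDICATIONS:
        return "public domain dedication"
    # Non-commercial clauses are not open under the slide's reading.
    if "NC" in name.replace("-", " ").split():
        return "not open (non-commercial)"
    # Anything else needs a human check against the Open Definition.
    return "check against the Open Definition"


for example in (None, "CC0", "CC-BY-NC", "CC-BY"):
    print(example, "->", panton_status(example))
```

A real check would consult the full text of the licence; the string matching here only mirrors the bullet points.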
Panton Authors and Fellows
Open Notebook Science
Open notebook science is the practice of 
making the entire primary record of a research 
project publicly available online as it is 
recorded. (WP) 
Jean-Claude Bradley was a chemist who 
actively promoted Open Science in 
chemistry,… He coined the term Open 
Notebook Science. … A memorial 
symposium was held July 14, 2014 at 
Cambridge University, UK.
Open Source software inspires Open Science 
Jean-Claude Bradley 2006
Open Notebook Science, ONS 
Jean-Claude Bradley 2006
Volunteer community in chemistry: Open Data/Source/Standards
Award of Blue Obelisk 
Jean-Claude Bradley Egon Willighagen
Realising OpenNotebookScience 
When a distinguished but elderly scientist states that something is 
possible, he is almost certainly right. When he states that something is 
impossible, he is very probably wrong. 
http://en.wikipedia.org/wiki/Clarke's_three_laws 
Open Inspirations (some are zero budget) 
• Open Street Map 
• Journal Of Machine Learning Research 
• Blue Obelisk 
• arXiv
• Protein Data Bank 
• Galaxy Zoo
Self-benefit drives Open 
• I put my data/papers in a repository because I 
HAVE TO 
• I commit my code to GitHub because I WANT 
TO: 
– It’s safe 
– It’s validated 
– I know it works 
– There are tools to search it 
– Other coders improve and add to it
http://en.wikipedia.org/wiki/Reinventing_Discovery 
http://michaelnielsen.org/blog/reinventing-discovery/
The Polymath project 
Tim Gowers and the world 
http://polymathprojects.org/2013/11/04/polymath9-pnp/#comments 
http://gowers.wordpress.com/2013/11/03/dbd1-initial-post/
Open Notebook Science
[Diagram labels: TOOLS; open engineered repository; INSTRUMENT; world community; validate, merge, calibrate; MODEL, CODE, DATA, knowledge; machines and humans working together]
Problems are solved communally;
Nothing is needlessly duplicated; “publication” is
continuous; data are SEMANTIC
Sophie Kershaw, Panton Fellow
Benefits of OpenNotebookScience 
• Fraud is virtually impossible 
• Priority and credit are algorithmically established 
• It is difficult to be scooped… 
• Data and ideas cannot be lost 
• The world discovers you and you the world 
• Time to announcement is much advanced 
(?years) 
• The “publication process” is vastly less onerous 
• … but others may use your work in other ways
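One way to read "priority and credit are algorithmically established" is that each notebook entry carries a timestamp and a hash linking it to the entry before it, so that later tampering is detectable. The sketch below illustrates that idea only. It is an assumption for this example, not a description of how any actual Open Notebook Science system works, and the names `add_entry` and `verify` are hypothetical.

```python
import hashlib
import json

FIELDS = ("author", "text", "timestamp", "prev")


def _digest(body: dict) -> str:
    # Hash a canonical JSON form so the same entry always gives the same digest.
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def add_entry(notebook: list, author: str, text: str, timestamp: str) -> str:
    """Append an entry that is chained to the hash of the previous entry."""
    prev = notebook[-1]["hash"] if notebook else ""
    body = {"author": author, "text": text, "timestamp": timestamp, "prev": prev}
    entry = dict(body, hash=_digest(body))
    notebook.append(entry)
    return entry["hash"]


def verify(notebook: list) -> bool:
    """Return True only if no entry has been altered or reordered."""
    prev = ""
    for entry in notebook:
        body = {key: entry[key] for key in FIELDS}
        if entry["prev"] != prev or entry["hash"] != _digest(body):
            return False
        prev = entry["hash"]
    return True


notebook = []
add_entry(notebook, "A. Chemist", "Measured solubility, run 1", "2014-08-22T10:00:00Z")
add_entry(notebook, "A. Chemist", "Repeated with fresh solvent", "2014-08-22T11:30:00Z")
print(verify(notebook))   # chain is intact
notebook[0]["text"] = "Measured solubility, run 1 (edited later)"
print(verify(notebook))   # the edit breaks the chain
```

The timestamps here are self-asserted; in practice the chain would need to be published continuously in a public repository for the ordering to be trusted by others.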
http://www.budapestopenaccessinitiative.org/read 
… an unprecedented public good. … 
… completely free and unrestricted access to [peer-reviewed 
literature] by all scientists, scholars, teachers, 
students, and other curious minds. … 
…Removing access barriers to this literature will 
accelerate research, enrich education, share the 
learning of the rich with the poor and the poor with 
the rich, make this literature as useful as it can be, and 
lay the foundation for uniting humanity in a common 
intellectual conversation and quest for knowledge. 
(Budapest Open Access Initiative, 2002)
Open Notebook Science
[Diagram labels: TOOLS; ONS repository; world community; INSTRUMENT; validate, merge, calibrate; MODEL, CODE, DATA, knowledge; machines and humans working together; CC-BY]
Problems are solved communally;
Nothing is needlessly duplicated; “publication” is
continuous and immediate
Traditional Research and Publication
[Diagram labels: “Lab” work; write, rewrite, re-experiment; paper/thesis; publish; ???; Validation??; DATA; output “belongs” to publisher]
Is there anything we can do with this?
Open Notebook Science
[Diagram labels: TOOLS; ONS repository; world community; INSTRUMENT; validate, merge, calibrate; MODEL, CODE, DATA, knowledge; machines and humans working together; CC-BY/0]
Problems are solved communally;
Nothing is needlessly duplicated; “publication” is
continuous and immediate

Open data and Open Science

  • 1. Open Data Open Notebook Science Peter Murray-Rust, Open Science, Rio, BR, 2014-08-22
  • 2. Retrieved 2014-08-08 Lancet 2011 31 USD For 1 day PMR: Closed Access Means People Die
  • 3. Overview • Most scientific data is lost; costs many billions… • … AND LIVES. • Human problem; lack of vision + active opposition. • Born-open data and Open Notebook Science • Jean-Claude Bradley • Panton Principles and Fellows (OKFN) • Digital Enlightenment or Digital Darkness?
  • 4. Reasons for Open Data/Science • Moral: Closed can be unjust • Ethical: Community norms expect it • Utilitarian: Greater communal good f • Personal: Greater personal benefit
  • 5. RCUK Wellcome ERC NSF FWF… require fully OPEN [at Research Data Alliance, we are entering a new “era of open science”, which will be “good for citizens, good for scientists and good for society”. She explicitly highlighted the transformative potential of open access, open data, open software and open educational resources – mentioning the EU’s policy requiring open access to all publications and data resulting from EU funded research. http://blog.okfn.org/2013/03/21/we-are-entering-an-era-of-open-science-says-eu-vp-neelie-kroes/# sthash.3SWDXDE6.dpuf
  • 6. Scientific and Medical publication (STM)[+] • World Citizens pay $400,000,000,000… • … for research in 1,500,000 articles … • … cost $300,000 each to create … • … $7000 each to “publish” [*]… • … $10,000,000,000 from academic libraries … • … to “publishers” who forbid access to 99.9% of citizens of the world … [+] Figures probably +- 50 % [*] arXiV preprint server costs $7 USD per paper
  • 7. US Taxpayers spend 139 Billion USD / yr on Scientific Research 4 Billion USD on human genome yielded 800 Billion USD and 4 M job-years
  • 8. Bad publication wastes science …three problems—flawed design, non-publication, and poor reporting—together meant >85% of research funds were wasted, a global total loss >100 billion USD per year. [Lancet 2009http://www.thelancet.com/journals/lancet /article/PIIS0140-6736%2809%2960329- 9/fu lltext.] [Even more] waste clearly occurs after publication: from poor access, poor dissemination, and poor uptake of the findings of research. [PLOS Medicine 2014-05-27 DOI: 10.1371/journal.pmed.1001651]
  • 9. Authors don’t deposit data (Ross Mounce)
  • 10. C) What’s the problem with this spectrum? Original thanks to ChemBark Org. Lett., 2011, 13 (15), pp 4084–4087
  • 11. After AMI2 processing….. … AMI2 has detected a square
  • 12.
  • 13. PM-R writes about how Open gave him 5 jobs August 2014 Marcus Hanwell http://opensource.com/tags/open-science Ross Mounce
  • 14. Traditional Research and Publication “Lab” work paper/th esis Write rewrite Re-experiment process “belongs” to publisher publish ??? Validation?? DATA output “belongs” to publisher Walls of academia
  • 15. Free/Open Software Development CODE REPOSITORY World community CODE validate rewrite CODE fork CODE Re-use CODE Re-use Github, BitBucket StackOverflow, Apache inspires OSI NO WALLS BORN-OPEN-SOURCE Example: ContentMine at http://github.com/ContentMine/quickscrape
  • 16. BornOS commits in 4 hours
  • 17. Continuous integration in PMR group does the code still work?
  • 19. Restrictions on Re-use of Crystallographic data NOTE: The CCDC is based on data contributed by scientists as part of publication and validation
  • 20. Elsevier wants to control Open Data ViceChancellor Cambridge [asked by Michelle Brook]
  • 21. Licences destroy Content Mining WE WALKED OUT • Brit Library • JISC • RLUK • OKFN • … • Ross Mounce • PM-R STM Publishers Licence 2012_03_15_Sample_Licence_Text_Data_Mining.pdf (Summary: PMR has NO rights) • [cannot publish to: ] “libraries, repositories, or archives” • [cannot] “Make the results of any TDM Output available on an externally facing server or website” • “Subscriber shall pay a […] fee” Heather Piwowar: “negotiating with publishers [made me physically ill]”
  • 22. Human Genome Project https://en.wikipedia.org/wiki/Bermuda_Principles • Automatic release of sequence assemblies larger than 1 kb (preferably within 24 hours). • Immediate publication of finished annotated sequences. • Aim to make the entire sequence freely available in the public domain for both research and development in order to maximise benefits to society.
  • 23. Panton Principles for Open Data in science (2010) • PUBLISH YOUR DATA OPENLY • …make an explicit and robust statement of your wishes. • Use a recognized waiver or license that is appropriate for data. • open as defined by the Open Knowledge/Data Definition (… NOT non-commercial) • Explicit dedication of data … into the public domain via PDDL or CCZero Peter Murray-Rust, Cameron Neylon, Rufus Pollock, John Wilbanks
  • 27. Open notebook science is the practice of making the entire primary record of a research project publicly available online as it is recorded. (WP) Jean-Claude Bradley was a chemist who actively promoted Open Science in chemistry,… He coined the term Open Notebook Science. … A memorial symposium was held July 14, 2014 at Cambridge University, UK.
  • 29. Open Source software inspires Open Science Jean-Claude Bradley 2006
  • 30. Open Notebook Science, ONS Jean-Claude Bradley 2006
  • 34. Volunteer community in chemistry: Open Data/Source/Standards
  • 35. Award of Blue Obelisk Jean-Claude Bradley Egon Willighagen
  • 36. Realising OpenNotebookScience When a distinguished but elderly scientist states that something is possible, he is almost certainly right. When he states that something is impossible, he is very probably wrong. http://en.wikipedia.org/wiki/Clarke's_three_laws Open Inspirations (some are zero budget) • OpenStreetMap • Journal of Machine Learning Research • Blue Obelisk • arXiv • Protein Data Bank • Galaxy Zoo
  • 37. Self-benefit drives Open • I put my data/papers in a repository because I HAVE TO • I commit my code to GitHub because I WANT TO: – It’s safe – It’s validated – I know it works – There are tools to search it – Other coders improve and add to it
  • 39. The Polymath project Tim Gowers and the world http://polymathprojects.org/2013/11/04/polymath9-pnp/#comments http://gowers.wordpress.com/2013/11/03/dbd1-initial-post/
  • 40. Open Notebook Science TOOLS Open engineered repository INSTRUMENT World community validate merge MODEL CODE DATA DATA knowledge calibrate Machines and humans Working together Problems are solved communally; Nothing is needlessly duplicated; “publication” is continuous; data are SEMANTIC
  • 42. Open Notebook Science TOOLS Open engineered repository INSTRUMENT World community validate merge MODEL CODE DATA DATA knowledge calibrate Machines and humans Working together Problems are solved communally; Nothing is needlessly duplicated; “publication” is continuous; data are SEMANTIC
  • 43. Benefits of OpenNotebookScience • Fraud is virtually impossible • Priority and credit are algorithmically established • It is difficult to be scooped… • Data and ideas cannot be lost • The world discovers you and you the world • Time to announcement is much advanced (?years) • The “publication process” is vastly less onerous • … but others may use your work in other ways
  • 44. http://www.budapestopenaccessinitiative.org/read … an unprecedented public good. … … completely free and unrestricted access to [peer-reviewed literature] by all scientists, scholars, teachers, students, and other curious minds. … …Removing access barriers to this literature will accelerate research, enrich education, share the learning of the rich with the poor and the poor with the rich, make this literature as useful as it can be, and lay the foundation for uniting humanity in a common intellectual conversation and quest for knowledge. (Budapest Open Access Initiative, 2002)
  • 45. Open Notebook Science TOOLS ONS repository World community INSTRUMENT validate merge MODEL CODE DATA DATA knowledge calibrate Machines and humans working together CC-BY Problems are solved communally; Nothing is needlessly duplicated; “publication” is continuous and immediate
  • 46. Traditional Research and Publication “Lab” work paper/thesis Write rewrite Re-experiment publish ??? Validation?? DATA output “belongs” to publisher Is there anything we can do with this?
  • 47. Open Notebook Science TOOLS ONS repository World community INSTRUMENT validate merge MODEL CODE DATA DATA knowledge calibrate Machines and humans working together CC-BY/0 Problems are solved communally; Nothing is needlessly duplicated; “publication” is continuous and immediate

Editor's notes

  1. ChemBark