From Open Data to Open Science
Geoffrey Boulton
University of Edinburgh & CODATA
“Learn” Workshop
University College, London
January 2016
Knowledge and understanding - the engines of material progress
depend on technologies that enable their accumulation and communication
1454 2002
Openness – the bedrock of science in the
modern era
Henry Oldenburg
Scientific self correction
The Challenge: the “Data Storm” is undermining
“self correction”
THEN AND NOW
A crisis of reproducibility and credibility?
Why such low levels of reproducibility?
• Misconduct/fraud
• Invalid reasoning
• Absent or inadequate data and/or metadata
[Figure: The digital revolution. Global information storage capacity, in optimally compressed bytes, analogue versus digital storage, 1986 to 2007, showing the explosion of the digital revolution. Labelled values: 19 exabytes and 280 exabytes; 2014: 4000 exabytes. 1 exabyte = 10^18 bytes. Based on: http://www.martinhilbert.net/WorldOnfoCapacity.html]
http://www.wired.co.uk/news/archive/2014-01/15/1000-dollar-genome/viewgallery/331679
Data acquisition: Cost down – Flux up
Information: how much is crystallised into knowledge?
Reinventing reproducibility
for the digital age
How do we retain an essential principle?
The data providing the evidence for a published
concept MUST be concurrently published, together
with necessary metadata and computer code.
To do otherwise is scientific MALPRACTICE
Ozone Levels
Four key drivers of change for science
• Big data
• Semantically-linked data
• Open data
• Cost reduction
Micro-satellite
Looking at clouds
Pillars of the Digital Revolution
Big Data
Volume
Velocity
Variety
Linked
Open
Data
Many
databases
Semantic
Relations
Deeper
meaning
Foundations : Openness
Machine analysis & learning Text and data mining
The opportunity: data from “simple” to complex systems
from uncoupled to highly coupled behaviour
Uncoupled
systems
Simulating behaviour of
highly coupled systems
Simulating system dynamics Mapping a complex state
Image of brain cells in a rat
Emergent behaviour of a specific
6-component coupled system
• patterns not hitherto seen
• unsuspected relationships
• complex systems
e.g. complexity: dynamic evolution and system state
Scientific opportunities
Satellite observation Surface monitoring
The opportunity: data-modelling: iterative integration
Initial conditions
Model forecast
Model-data iteration - forecast correction
Linear regression
Cluster analysis
Dynamic/complex behaviour
Complex systems
No mathematical pipeline
Simple relationships
Classical statistics
System characterisations: from simple to complex
Glucose in type II diabetes
Topological analysis
A barrier to openness? - Analytic overload.
E.g. - Global Earth Observation System of Systems
• What is the human role?
• Can we analyse & scrutinise what is in the
black box? And who owns the box?
• What does it mean to be a researcher in a
data intensive age?
A disconnect between machine
analysis & human cognition?
Mathematics related discussions
Tim Gowers
- crowd-sourced mathematics
An unsolved problem posed on
his blog.
32 days – 27 people – 800
substantive contributions
Emerging contributions rapidly
developed or discarded
Problem solved!
“It's like driving a car whilst
normal research is like pushing
it”
What inhibits such processes?
- The criteria for credit and
promotion
– ALTMETRICS THE ANSWER?
New modes of technology-
enabled creativity:
e.g. crowd-sourcing
The Open Data Iceberg
The Technical Challenge
The Consent Challenge
The Ecosystem Challenge
The Funding Challenge
The Support Challenge
The Skills Challenge
The Incentives Challenge
The Mindset Challenge
Processes &
Organisation
People
motivation and ethos.
Developed from: Deetjen, U., E. T. Meyer and R. Schroeder (2015).
A National Infrastructure
Technology
The “Science International” Accord:
principles of open data
(www.icsu.org/science-international)
Responsibilities
1-2. Scientists
3. Research institutions & universities
4. Publishers
5. Funding agencies
6. Scholarly societies and academies
7. Libraries & repositories
8. Boundaries of openness
Enabling practices
9. Citation and provenance
10. Interoperability
11. Non-restrictive re-use
12. Linkability
Responsibilities
Scientists
i. Publicly funded scientists have a responsibility to contribute to the
public good through the creation and communication of new
knowledge, of which associated data are intrinsic parts. They
should make such data openly available to others as soon as
possible after their production in ways that permit them to be
re-used and re-purposed.
ii. The data that provide evidence for published scientific claims
should be made concurrently and publicly available in an
intelligently open form. This should permit the logic of the link
between data and claim to be rigorously scrutinised and the
validity of the data to be tested by replication of experiments or
observations. To the extent possible, data should be deposited in
well-managed and trusted repositories with low access barriers.
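One way to picture what "concurrently published, together with metadata and code" could look like in practice is a small sidecar record deposited next to each data file. The sketch below is only an illustration under stated assumptions: the function name `write_deposit_record`, the field names and the `.meta.json` suffix are hypothetical and are not taken from the Accord or from any repository's actual schema.

```python
import hashlib
import json
from pathlib import Path


def write_deposit_record(data_path, claim, code_url, licence):
    """Write a JSON sidecar describing a data file that backs a published claim.

    All field names here are hypothetical; a real repository would
    prescribe its own metadata schema.
    """
    data_path = Path(data_path)
    # A checksum lets a later reader confirm they hold the same bytes.
    digest = hashlib.sha256(data_path.read_bytes()).hexdigest()
    record = {
        "file": data_path.name,
        "sha256": digest,
        "bytes": data_path.stat().st_size,
        "supports_claim": claim,      # the published claim the data evidences
        "analysis_code": code_url,    # where the code that produced the result lives
        "licence": licence,           # terms for re-use and re-purposing
    }
    sidecar = data_path.with_name(data_path.name + ".meta.json")
    sidecar.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return record
```

Calling `write_deposit_record("obs.csv", "claim text", "https://example.org/code", "CC-BY-4.0")` would leave an `obs.csv.meta.json` file beside the data; the URL and licence string are placeholders.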
African Open Data/Open Science Platform
Platform Forum
Coordination
Government
Priority setting
Funders
Funding
Incentives
Capacity Building
Training and Skills
Infrastructure
Roadmaps
Flagship
Co-Designed Data
Intensive Projects
International
Standards
Programmes
Shared infrastructure investment; shared good practice; capacity building;
system development
EMBL-EBI services
Labs around the
world send us
their data and
we…
Archive it
Classify it
Share it with
other data
providers
Analyse, add
value and
integrate it
…provide
tools to help
researchers
use it
A collaborative
enterprise
Disciplinary communities can lead the way
e.g. Elixir programme in life sciences/bio-informatics
Regional Platforms for Open Science
African
Platform?
Asian
Platform?
Australian
Platform
Shared investment in infrastructure; harvesting and circulating good ideas;
spreading and supporting good practice; capacity building; promoting
applications; linking to international programmes and standards.
S.
American
Platform?
Inputs Outputs
Open access
Administrative
data (held by
public
authorities e.g.
prescription
data)
Public Sector
Research data
(e.g. Met
Office weather
data)
Research
Data (e.g.
CERN,
generated in
universities)
Research
publications
(i.e. papers in
journals)
Open data
Open science
“science as a public enterprise”
Collecting the
data
Doing
research
Doing science
openly
Researchers - Govt & Public sector - Businesses - Citizens - Citizen scientists
(communication/dialogue – joint production of knowledge)
Stakeholders
• Communication/dialogue must be audience-sensitive
• Is it – with all stakeholder groups?
Open Science
Data / Publications
Researchers
Mono/MultiInterTransdisciplinary
Stakeholders
RigourInnovationPolicySolutions
Open Knowledge
Institutional
management and support
National policies
& e-infrastructure
Open
Research
Data
Big Data
Analytics
Knowledge
Output
EXPLOITING THE DATA REVOLUTION
Scientific inference
Institutional
management & support
National policies
& e-infrastructure
A national data-intensive system
International Research Data Collaboration
CODATA
• Policies & practice
• Frontiers of data science
• Capacity Building
WDS
• Data stewardship
• Data standards
RDA
• Interoperability
Why openness & sharing?
1. Maintaining “self-correction”
2. Open knowledge is creative & productive
“If you have an apple and I have an apple and we
exchange these apples, then you and I will still
each have one apple. But if you have an idea and I
have an idea and we exchange these ideas, then
each of us will have two ideas.”
George Bernard Shaw
3. Open data enables semantic linking
• Openly collected science is already helping policy
makers.
• AshTag app allows users to submit photos and
locations of sightings to a team who will refer them on
to the Forestry Commission, which is leading efforts to
stop the disease's spread with the Department for
Environment, Food and Rural Affairs (Defra).
Chalara spread: 1992-2012
Citizen Science

  • 1. From Open Data to Open Science Geoffrey Boulton University of Edinburgh & CODATA “Learn” Workshop University College, London January 2016
  • 2. Knowledge and understanding - the engines of material progress depend on technologies that enable their accumulation and communication 1454 2002
  • 3. Openness – the bedrock of science in the modern era Henry Oldenburg
  • 6. A crisis of reproducibility and credibility? Why such low levels of reproducibility? • Misconduct/fraud • Invalid reasoning • Absent or inadequate data and/or metadata
  • 7. The digital revolution: explosion of global information storage capacity, in optimally compressed bytes (analogue storage vs digital storage). Time axis: 1986, 1993, 2000, 2007; 2014: 4000 exabytes. Labelled values: 19 exabytes, 280 exabytes. 1 exabyte = 10^18 bytes. Based on: http://www.martinhilbert.net/WorldOnfoCapacity.html
  • 9. Information: how much is crystallised into knowledge?
  • 10. Reinventing reproducibility for the digital age How do we retain an essential principle? The data providing the evidence for a published concept MUST be concurrently published, together with necessary metadata and computer code. To do otherwise is scientific MALPRACTICE
  • 11.
  • 12. Ozone Levels Four key drivers of change for science • Big data • Semantically-linked data • Open data • Cost reduction Micro-satellite Looking at clouds
  • 13. Pillars of the Digital Revolution Big Data Volume Velocity Variety Linked Open Data Many databases Semantic Relations Deeper meaning Foundations : Openness Machine analysis & learning Text and data mining
  • 14. The opportunity: data from “simple” to complex systems from uncoupled to highly coupled behaviour Uncoupled systems Simulating behaviour of highly coupled systems
  • 15. Simulating system dynamics Mapping a complex state Image of brain cells in a rat Emergent behaviour of a specific 6-component coupled system • patterns not hitherto seen • unsuspected relationship • complex systems e.g. complexity: dynamic evolution and system state Scientific opportunities
  • 16. Satellite observation Surface monitoring The opportunity: data-modelling: iterative integration Initial conditions Model forecast Model-data iteration - forecast correction
  • 17. Linear regression Cluster analysis Dynamic/complex behaviour Complex systems No mathematical pipeline Simple relationships Classical statistics System characterisations: from simple to complex Glucose in type II diabetes Topological analysis
  • 18. A barrier to openness? - Analytic overload. E.g. - Global Earth Observation System of Systems • What is the human role? • Can we analyse & scrutinise what is in the black box? - & who owns the box? • What does it mean to be a researcher in a data intensive age? A disconnect between machine analysis & human cognition?
  • 19. Mathematics related discussions Tim Gowers - crowd-sourced mathematics An unsolved problem posed on his blog. 32 days – 27 people – 800 substantive contributions Emerging contributions rapidly developed or discarded Problem solved! “It's like driving a car whilst normal research is like pushing it” What inhibits such processes? - The criteria for credit and promotion – ALTMETRICS THE ANSWER? New modes of technology-enabled creativity: e.g. crowd-sourcing
  • 20. The Open Data Iceberg The Technical Challenge The Consent Challenge The Ecosystem Challenge The Funding Challenge The Support Challenge The Skills Challenge The Incentives Challenge The Mindset Challenge Processes & Organisation People motivation and ethos. Developed from: Deetjen, U., E. T. Meyer and R. Schroeder (2015). A National Infrastructure Technology
  • 21. The “Science International” Accord: principles of open data (www.icsu.org/science-international) Responsibilities 1-2. Scientists 3. Research institutions & universities 4. Publishers 5. Funding agencies 6. Scholarly societies and academies 7. Libraries & repositories 8. Boundaries of openness Enabling practices 9. Citation and provenance 10. Interoperability 11. Non-restrictive re-use 12. Linkability
  • 22. Responsibilities Scientists i. Publicly funded scientists have a responsibility to contribute to the public good through the creation and communication of new knowledge, of which associated data are intrinsic parts. They should make such data openly available to others as soon as possible after their production in ways that permit them to be re-used and re-purposed. ii. The data that provide evidence for published scientific claims should be made concurrently and publicly available in an intelligently open form. This should permit the logic of the link between data and claim to be rigorously scrutinised and the validity of the data to be tested by replication of experiments or observations. To the extent possible, data should be deposited in well-managed and trusted repositories with low access barriers.
  • 23. CODATA / ICSU African Open Data/Open Science Platform Platform Forum Coordination Government Priority setting Funders Funding Incentives Capacity Building Training and Skills Infrastructure Roadmaps Flagship Co-Designed Data Intensive Projects International Standards Programmes Shared infrastructure investment; shared good practice; capacity building; system development
  • 24. EMBL-EBI services Labs around the world send us their data and we… Archive it Classify it Share it with other data providers Analyse, add value and integrate it …provide tools to help researchers use it A collaborative enterprise Disciplinary communities can lead the way e.g. Elixir programme in life sciences/bio-informatics
  • 25. Regional Platforms for Open Science African Platform? Asian Platform? Australian Platform Shared investment in infrastructure; harvesting and circulating good ideas; spreading and supporting good practice; capacity building; promoting applications; linking to international programmes and standards. S. American Platform?
  • 26. Inputs Outputs Open access Administrative data (held by public authorities e.g. prescription data) Public Sector Research data (e.g. Met Office weather data) Research Data (e.g. CERN, generated in universities) Research publications (i.e. papers in journals) Open data Open science “science as a public enterprise” Collecting the data Doing research Doing science openly Researchers - Govt & Public sector - Businesses - Citizens - Citizen scientists (communication/dialogue – joint production of knowledge) Stakeholders • Communication/dialogue must be audience-sensitive • Is it – with all stakeholder groups?
  • 27. Open Science Data / Publications Researchers Mono/Multi/Inter/Transdisciplinary Stakeholders Rigour, Innovation, Policy, Solutions Open Knowledge
  • 28. Institutional management and support National policies & e-infrastructure Open Research Data Big Data Analytics Knowledge Output EXPLOITING THE DATA REVOLUTION Scientific inference Institutional management & support National policies & e-infrastructure A national data-intensive system
  • 29. CODATA / ICSU International Research Data Collaboration CODATA • Policies & practice • Frontiers of data science • Capacity Building WDS • Data stewardship • Data standards RDA • Interoperability
  • 30. 1. Maintaining “self-correction” 2. Open knowledge is creative & productive “If you have an apple and I have an apple and we exchange these apples, then you and I will still each have one apple. But if you have an idea and I have an idea and we exchange these ideas, then each of us will have two ideas.” 3. Open data enables semantic linking George Bernard Shaw Why openness & sharing?
  • 31. • Openly collected science is already helping policy makers. • AshTag app allows users to submit photos and locations of sightings to a team who will refer them on to the Forestry Commission, which is leading efforts to stop the disease's spread with the Department for Environment, Food and Rural Affairs (Defra). Chalara spread: 1992-2012 Citizen Science

Editor's Notes

  1. The material advance of human society has been based on the acquisition and use of knowledge, and science, as it has been practised in the last 300 years, has proved to be the most effective way of gaining reliable knowledge. I want to talk about the processes whereby science is done and how they need to adapt to a novel environment in which we are able to acquire, store, manipulate and communicate data of unprecedented volume and complexity. What challenges does this environment offer to the essential processes of science, how can we exploit the opportunities that it offers, and what barriers inhibit necessary changes? This is not about openness for its own sake, but about open processes in the doing of science. Open science is not new. It was the bedrock on which the extraordinary scientific revolutions of the 18th and 19th centuries were built. But we do need to reinvent it for a data-rich era. So let us start with a little history.
  2. This is Henry Oldenburg, the first secretary of the newly formed Royal Society in the early 1660s. Henry was an inveterate correspondent with those we would now call scientists, both in Europe and beyond. Rather than keep this correspondence private, he thought it would be a good idea to publish it, and persuaded the new Society to do so by creating the Philosophical Transactions, which remains a top-flight journal to the present day. But he demanded two things of his correspondents: that they should submit in the vernacular and not Latin; and that evidence (data) that supported a concept must be published together with the concept. This permitted others to scrutinise the logic of the concept and the extent to which it was supported by the data, and permitted replication and re-use. Open publication of concept and evidence is the basis of “scientific self-correction”, which historians of science argue was the crucial building block on which the scientific revolution of the 18th and 19th centuries was built, and which remains fundamental to the progress of science. Openness to scrutiny by scientific peers is the most powerful form of peer review.
  3. The fundamental challenge is to scientific self-correction. Journals can no longer contain the data, and neither scientists nor journals have taken the obvious step of having data relevant to a publication concurrently available in an electronic database. An example is last year’s Nature paper revealing that only 11% of results in 50 benchmark papers in pre-clinical oncology were replicable. If lack of Oldenburg’s rigour in presenting evidence is widespread, a failure of replicability risks undermining science as a reliable way of acquiring knowledge and can therefore undermine its credibility.
  4. Lots of interchangeable and fluid terms but many shared principles. The word “science” is used to mean the systematic organisation of knowledge that can be rationally explained and reliably applied. It is not exclusively restricted to “natural science”.
  5. Human and technical requirements for a sustainable data infrastructure. Network of world data centres. Data policies and data science: bringing data experts together with research scientists.
  6. Ash dieback, caused by the fungus Chalara fraxinea, was found in the UK in October outside of plantations and nurseries in East Anglia, raising fears of a repeat of Dutch elm disease, which killed 25 million mature elms in the 1970s and 80s. In an attempt to map and help prevent the spread of the disease across the country, a team of developers and academics worked through the weekend to create an app that smartphone owners can use to report suspected cases of infection. Infected ash trees are recognisable by lesions on their bark, dieback of leaves at the tree's crown, and leaves turning brown, though experts say the arrival of autumn makes the latter harder to accurately spot. The AshTag app for iOS and Android devices allows users to submit photos and locations of sightings to a team who will refer them on to the Forestry Commission, which is leading efforts to stop the disease's spread with the Department for Environment, Food and Rural Affairs (Defra).