SlideShare une entreprise Scribd logo
1  sur  38
The Fourth Paradigm:
Data-Intensive Scientific Discovery
and Open Science
Tony Hey
Chief Data Scientist
UK Science and Technology Facilities Council
tony.hey@stfc.ac.uk
The birthplace of the industrial revolution
The Fourth Paradigm:
Data-Intensive Science
Much of Science is now Data-Intensive
Number of Researchers
Data Volume
• Extremely large data sets
• Expensive to move
• Domain standards
• High computational needs
• Supercomputers, HPC, Grids
e.g. High Energy Physics, Astronomy
• Large data sets
• Some Standards within Domains
• Shared Datacenters & Clusters
• Research Collaborations
e.g. Genomics, Financial
• Medium & Small data sets
• Flat Files, Excel
• Widely diverse data; Few standards
• Local Servers & PCs
e.g. Social Sciences, Humanities
Four “V’s” of Data
•Volume
•Variety
•Velocity
•Veracity
‘The Long Tail of
Science’
The ‘Cosmic Genome Project’:
The Sloan Digital Sky Survey
• Survey of more than ¼ of the night sky
• Survey produces 200 GB of data per night
• Two surveys in one – images and spectra
• Nearly 2M astronomical objects, including
800,000 galaxies, 100,000 quasars
• 100’s of TB of data, and data is public
• Started in 1992, ‘finished’ in 2008
The SkyServer Web Service was
built at JHU by team led by Alex
Szalay and Jim Gray
The University of Chicago
Princeton University
The Johns Hopkins University
The University of Washington
New Mexico State University
Fermi National Accelerator Laboratory
US Naval Observatory
The Japanese Participation Group
The Institute for Advanced Study
Max Planck Inst, Heidelberg
Sloan Foundation, NSF, DOE, NASA
Open Data: Public Use of the Sloan Data
• SkyServer web service has
had over 400 million web
• About 1M distinct users
vs 10,000 astronomers
• >1600 refereed papers!
• Delivered 50,000 hours
of lectures to high schools
 New publishing paradigm:
data is published before
analysis by astronomers
 Platform for ‘citizen science’
with GalaxyZoo project
Posterchild in 21st century data publishing
Thousand years ago – Experimental Science
• Description of natural phenomena
Last few hundred years – Theoretical Science
• Newton’s Laws, Maxwell’s Equations…
Last few decades – Computational Science
• Simulation of complex phenomena
Today – Data-Intensive Science
• Scientists overwhelmed with data sets
from many different sources
• Data captured by instruments
• Data generated by simulations
• Data generated by sensor networks
eScience and the Fourth Paradigm
2
2
2.
3
4
a
cG
a
a










eScience is the set of tools and technologies
to support data federation and collaboration
• For analysis and data mining
• For data visualization and exploration
• For scholarly communication and dissemination
(With thanks to Jim Gray)
Genomics and Personalized medicine
Use genetic markers (e.g. SNPs)
to…
 Understand causes of disease
 Diagnose a disease
 Infer propensity to get a disease
 Predict reaction to a drug
Genomics, Machine Learning and the Cloud
• Wellcome Trust data for seven
common diseases
• Look at all SNP pairs (about 60
billion)
• Analysis with state-of-the-art
Machine Learning algorithm
requires 1,000 compute years and
produces 20 TB output
• Using 27,000 compute cores in
Microsoft’s Cloud, the analysis
was completed in 13 days
First result: SNP pair implicated in
coronary artery disease
The Problem
NSF’s Ocean Observatory Initiative
Slide courtesy of John Delaney
Oceans and Life
Slide courtesy of John Delaney
CEDA: Centre for Environmental Data Analysis
To support environmental science, further environmental
data archival practices, and develop and deploy new
technologies to enhance access to data
Centre for Environmental Data Analysis:
JASMIN infrastructure
Part data store, part supercomputer, part private cloud…
Data-Intensive Scientific Discovery
The Fourth Paradigm
Science@Microsoft http://research.microsoft.com
Amazon.com
Open Access, Open Data,
Open Science
The US National Library of Medicine
• The NIH Public Access Policy
ensures that the public has access
to the published results of NIH
funded research.
• Requires scientists to submit final
peer-reviewed journal manuscripts
that arise from NIH funds to the
digital archive PubMed Central upon
acceptance for publication.
• Policy requires that these papers
are accessible to the public on
PubMed Central no later than 12
months after publication.
Nucleotide
sequences
Protein
sequences
Taxon
Phylogeny MMDB
3 -D
Structure
PubMed
abstracts
Complete
Genomes
PubMed Entrez
Genomes
Publishers Genome
Centers
Entrez cross-database search
Serious problems of research reproducibility
in bioinformatics
During a decade as head of global
cancer research at Amgen, C. Glenn
Begley identified 53 "landmark"
publications -- papers in top journals,
from reputable labs -- for his team to
reproduce.
Result: 47 of the 53 could not be
replicated!
US White House Memo on increased public
access to the results of federally-funded research
• Directive required the major Federal Funding agencies “to
develop a plan to support increased public access to the results
of research funded by the Federal Government.”
• The memo defines research results to encompass not only the
research paper but also “… the digital recorded factual
material commonly accepted in the scientific community as
necessary to validate research findings including data sets
used to support scholarly publications ….”
22 February 2013
EPSRC Expectations for Data Preservation
•Research organisations will ensure that EPSRC-funded
research data is securely preserved for a minimum of
10 years from the date that any researcher ‘privileged
access’ period expires
•Research organisations will ensure that effective data
curation is provided throughout the full data lifecycle
Digital Curation Centre
• Maintenance of digital data and research results over
entire data life-cycle
• Data curation and digital preservation
• Research in tools and technologies to support data
integration, annotation, provenance, metadata, security…
Established by Jisc in 2004 with the mission to work
with librarians, researchers and computer scientists to
examine and support all aspects of research data
curation and preservation
Now DCC services more important than ever
44 % of data
links from
2001 broken in
2011
Pepe et al. 2012
Sustainability of Data Links?
Datacite and ORCID
DataCite
• International consortium to establish easier access to scientific research
data
• Increase acceptance of research data as legitimate, citable contributions
to the scientific record
• Support data archiving that will permit results to be verified and re-
purposed for future study.
ORCID - Open Research & Contributor ID
• Aims to solve the author/contributor name ambiguity problem in scholarly
communications
• Central registry of unique identifiers for individual researchers
• Open and transparent linking mechanism between ORCID and other
current author ID schemes.
• Identifiers can be linked to the researcher’s output to enhance the
scientific discovery process
Progress in Data Curation in last 10 years?
• Biggest change is funding agency mandate:
• NSF’s insistence on a Data Management Plan for all proposals
has made scientists (pretend?) to take data curation seriously.
• There are better curated databases and metadata now …
• … but not sure that the quality fraction is increasing!
• Frew’s laws of metadata:
• First law: scientists don’t write metadata
• Second law: any scientist can be forced to write bad metadata
 Should automate creation of metadata as far as possible
 Scientists need to work with metadata specialists with
domain knowledge a.k.a. science librarians
With thanks to Jim Frew, UCSB
Role of Research Libraries?
Institutional Research Repositories
End-to end Network Support for
Data-intensive Research?
The goal of ‘campus bridging’ is to enable the
seamlessly integrated use among:
• a researcher’s personal cyberinfrastructure
• cyberinfrastructure at other campuses
• cyberinfrastructure at the regional, national
and international levels
so that they all function as if they were
proximate to the scientist
NSF Task Force on ‘Campus Bridging’ (2011)
STFC Harwell Site Experimental Facilities
ISIS
CLF
Pacific Research Platform
• NSF funding $5M award to UC San Diego and UC
Berkeley to establish a science-driven high-
capacity data-centric “freeway system” on a large
regional scale.
• This network infrastructure will give the research
institutions the ability to move data 1,000 times
faster compared to speeds on today’s Internet.
August 2015
“PRP will enable researchers to use standard
tools to move data to and from their labs and
their collaborators’ sites, supercomputer centers
and data repositories distant from their campus
IT infrastructure, at speeds comparable to
accessing local disks,” said co-PI Tom DeFanti
The Future
What is a Data Scientist?
Data Engineer People who are expert at
• Operating at low levels close to the data, write code that manipulates
• They may have some machine learning background.
• Large companies may have teams of them in-house or they may look to third party
specialists to do the work.
Data Analyst People who explore data through statistical and analytical methods
• They may know programming; May be an spreadsheet wizard.
• Either way, they can build models based on low-level data.
• They eat and drink numbers; They know which questions to ask of the data. Every
company will have lots of these.
Data Steward People who think to managing, curating, and preserving data.
• They are information specialists, archivists, librarians and compliance officers.
• This is an important role: if data has value, you want someone to manage it, make it
discoverable, look after it and make sure it remains usable.
What is a data scientist? Microsoft UK Enterprise Insights Blog, Kenji Takeda
http://blogs.msdn.com/b/microsoftenterpriseinsight/archive/2013/01/31/what-is-a-data-scientist.aspx
Slide thanks to Bryan Lawrence
Scientist career paths?
Jim Gray’s Vision: All Scientific Data Online
• Many disciplines overlap and use data
from other sciences.
• Internet can unify all literature and
data
• Go from literature to computation to
data back to literature.
• Information at your fingertips –
For everyone, everywhere
• Increase Scientific Information
Velocity
• Huge increase in Science Productivity
(From Jim Gray’s last talk)
Literature
Derived and
recombined data
Raw Data
Outline
•The Fourth Paradigm – Data-intensive Science
•Open Access, Open Data, Open Science
• End-to-end Network Support for Data-intensive Research
•The Future
The Tolkien Trail in Hall Green and Moseley

Contenu connexe

Tendances

Fairness in Machine Learning and AI
Fairness in Machine Learning and AIFairness in Machine Learning and AI
Fairness in Machine Learning and AISeth Grimes
 
Nlp research presentation
Nlp research presentationNlp research presentation
Nlp research presentationSurya Sg
 
Ethics in the use of Data & AI
Ethics in the use of Data & AI Ethics in the use of Data & AI
Ethics in the use of Data & AI Kalilur Rahman
 
Lecture 4 Decision Trees (2): Entropy, Information Gain, Gain Ratio
Lecture 4 Decision Trees (2): Entropy, Information Gain, Gain RatioLecture 4 Decision Trees (2): Entropy, Information Gain, Gain Ratio
Lecture 4 Decision Trees (2): Entropy, Information Gain, Gain RatioMarina Santini
 
Data centric business and knowledge graph trends
Data centric business and knowledge graph trendsData centric business and knowledge graph trends
Data centric business and knowledge graph trendsAlan Morrison
 
Machine Learning by Analogy
Machine Learning by AnalogyMachine Learning by Analogy
Machine Learning by AnalogyColleen Farrelly
 
Data Modeling for Big Data
Data Modeling for Big DataData Modeling for Big Data
Data Modeling for Big DataDATAVERSITY
 
HEART DISEASE PREDICTION USING NAIVE BAYES ALGORITHM
HEART DISEASE PREDICTION USING NAIVE BAYES ALGORITHMHEART DISEASE PREDICTION USING NAIVE BAYES ALGORITHM
HEART DISEASE PREDICTION USING NAIVE BAYES ALGORITHMamiteshg
 
AI Governance and Ethics - Industry Standards
AI Governance and Ethics - Industry StandardsAI Governance and Ethics - Industry Standards
AI Governance and Ethics - Industry StandardsAnsgar Koene
 
Deep learning for biomedical discovery and data mining I
Deep learning for biomedical discovery and data mining IDeep learning for biomedical discovery and data mining I
Deep learning for biomedical discovery and data mining IDeakin University
 
Explainable AI in Industry (KDD 2019 Tutorial)
Explainable AI in Industry (KDD 2019 Tutorial)Explainable AI in Industry (KDD 2019 Tutorial)
Explainable AI in Industry (KDD 2019 Tutorial)Krishnaram Kenthapadi
 
Big Data Stockholm v 7 | "Federated Machine Learning for Collaborative and Se...
Big Data Stockholm v 7 | "Federated Machine Learning for Collaborative and Se...Big Data Stockholm v 7 | "Federated Machine Learning for Collaborative and Se...
Big Data Stockholm v 7 | "Federated Machine Learning for Collaborative and Se...Dataconomy Media
 
Responsible AI
Responsible AIResponsible AI
Responsible AINeo4j
 
Big Data Helsinki v 3 | "Federated Learning and Privacy-preserving AI" - Oguz...
Big Data Helsinki v 3 | "Federated Learning and Privacy-preserving AI" - Oguz...Big Data Helsinki v 3 | "Federated Learning and Privacy-preserving AI" - Oguz...
Big Data Helsinki v 3 | "Federated Learning and Privacy-preserving AI" - Oguz...Dataconomy Media
 

Tendances (20)

Fairness in Machine Learning and AI
Fairness in Machine Learning and AIFairness in Machine Learning and AI
Fairness in Machine Learning and AI
 
Nlp research presentation
Nlp research presentationNlp research presentation
Nlp research presentation
 
Introduction to AI Governance
Introduction to AI GovernanceIntroduction to AI Governance
Introduction to AI Governance
 
Ethics in the use of Data & AI
Ethics in the use of Data & AI Ethics in the use of Data & AI
Ethics in the use of Data & AI
 
Lecture 4 Decision Trees (2): Entropy, Information Gain, Gain Ratio
Lecture 4 Decision Trees (2): Entropy, Information Gain, Gain RatioLecture 4 Decision Trees (2): Entropy, Information Gain, Gain Ratio
Lecture 4 Decision Trees (2): Entropy, Information Gain, Gain Ratio
 
Data centric business and knowledge graph trends
Data centric business and knowledge graph trendsData centric business and knowledge graph trends
Data centric business and knowledge graph trends
 
Machine Learning for dummies!
Machine Learning for dummies!Machine Learning for dummies!
Machine Learning for dummies!
 
Machine Learning by Analogy
Machine Learning by AnalogyMachine Learning by Analogy
Machine Learning by Analogy
 
Data Modeling for Big Data
Data Modeling for Big DataData Modeling for Big Data
Data Modeling for Big Data
 
HEART DISEASE PREDICTION USING NAIVE BAYES ALGORITHM
HEART DISEASE PREDICTION USING NAIVE BAYES ALGORITHMHEART DISEASE PREDICTION USING NAIVE BAYES ALGORITHM
HEART DISEASE PREDICTION USING NAIVE BAYES ALGORITHM
 
Ml in genomics
Ml in genomicsMl in genomics
Ml in genomics
 
AI Governance and Ethics - Industry Standards
AI Governance and Ethics - Industry StandardsAI Governance and Ethics - Industry Standards
AI Governance and Ethics - Industry Standards
 
Data science ppt
Data science pptData science ppt
Data science ppt
 
Deep learning for biomedical discovery and data mining I
Deep learning for biomedical discovery and data mining IDeep learning for biomedical discovery and data mining I
Deep learning for biomedical discovery and data mining I
 
Explainable AI in Industry (KDD 2019 Tutorial)
Explainable AI in Industry (KDD 2019 Tutorial)Explainable AI in Industry (KDD 2019 Tutorial)
Explainable AI in Industry (KDD 2019 Tutorial)
 
Big Data Stockholm v 7 | "Federated Machine Learning for Collaborative and Se...
Big Data Stockholm v 7 | "Federated Machine Learning for Collaborative and Se...Big Data Stockholm v 7 | "Federated Machine Learning for Collaborative and Se...
Big Data Stockholm v 7 | "Federated Machine Learning for Collaborative and Se...
 
1.Introduction to deep learning
1.Introduction to deep learning1.Introduction to deep learning
1.Introduction to deep learning
 
Responsible AI
Responsible AIResponsible AI
Responsible AI
 
Federated Learning
Federated LearningFederated Learning
Federated Learning
 
Big Data Helsinki v 3 | "Federated Learning and Privacy-preserving AI" - Oguz...
Big Data Helsinki v 3 | "Federated Learning and Privacy-preserving AI" - Oguz...Big Data Helsinki v 3 | "Federated Learning and Privacy-preserving AI" - Oguz...
Big Data Helsinki v 3 | "Federated Learning and Privacy-preserving AI" - Oguz...
 

En vedette

Responsible metrics for research - Jisc Digifest 2016
Responsible metrics for research - Jisc Digifest 2016Responsible metrics for research - Jisc Digifest 2016
Responsible metrics for research - Jisc Digifest 2016Jisc
 
The user -driven evolution of Janet - Jisc Digifest 2016
The user -driven evolution of Janet - Jisc Digifest 2016The user -driven evolution of Janet - Jisc Digifest 2016
The user -driven evolution of Janet - Jisc Digifest 2016Jisc
 
Understanding Prevent - the role of education in tackling radicalisation - Ji...
Understanding Prevent - the role of education in tackling radicalisation - Ji...Understanding Prevent - the role of education in tackling radicalisation - Ji...
Understanding Prevent - the role of education in tackling radicalisation - Ji...Jisc
 
Working in partnership to develop student employability - Jisc Digifest 2016
Working in partnership to develop student employability - Jisc Digifest 2016Working in partnership to develop student employability - Jisc Digifest 2016
Working in partnership to develop student employability - Jisc Digifest 2016Jisc
 
Leveraging the digital - capability, capacity and change in higher and furthe...
Leveraging the digital - capability, capacity and change in higher and furthe...Leveraging the digital - capability, capacity and change in higher and furthe...
Leveraging the digital - capability, capacity and change in higher and furthe...Jisc
 
Benefits and efficiencies with Vscene - Jisc Digifest 2016
Benefits and efficiencies with Vscene - Jisc Digifest 2016Benefits and efficiencies with Vscene - Jisc Digifest 2016
Benefits and efficiencies with Vscene - Jisc Digifest 2016Jisc
 
Introducing the open citation experiment - Jisc Digifest 2016
Introducing the open citation experiment - Jisc Digifest 2016Introducing the open citation experiment - Jisc Digifest 2016
Introducing the open citation experiment - Jisc Digifest 2016Jisc
 
Universities as e-textbook publishers - Jisc Digifest 2016
Universities as e-textbook publishers - Jisc Digifest 2016Universities as e-textbook publishers - Jisc Digifest 2016
Universities as e-textbook publishers - Jisc Digifest 2016Jisc
 
Transforming assessment and feedback with technology - Jisc Digifest 2016
Transforming assessment and feedback with technology - Jisc Digifest 2016Transforming assessment and feedback with technology - Jisc Digifest 2016
Transforming assessment and feedback with technology - Jisc Digifest 2016Jisc
 
The future of cloud computing - Jisc Digifest 2016
The future of cloud computing - Jisc Digifest 2016The future of cloud computing - Jisc Digifest 2016
The future of cloud computing - Jisc Digifest 2016Jisc
 
Figshare for institutions - Jisc Digifest 2016
Figshare for institutions - Jisc Digifest 2016Figshare for institutions - Jisc Digifest 2016
Figshare for institutions - Jisc Digifest 2016Jisc
 
The evolution of FELTAG - Jisc Digifest 2016
The evolution of FELTAG - Jisc Digifest 2016The evolution of FELTAG - Jisc Digifest 2016
The evolution of FELTAG - Jisc Digifest 2016Jisc
 
Business intelligence: making more informed decisions - Jisc Digifest 2016
Business intelligence: making more informed decisions - Jisc Digifest 2016Business intelligence: making more informed decisions - Jisc Digifest 2016
Business intelligence: making more informed decisions - Jisc Digifest 2016Jisc
 
The value of Jisc Collections - Jisc Digifest 2016
The value of Jisc Collections - Jisc Digifest 2016The value of Jisc Collections - Jisc Digifest 2016
The value of Jisc Collections - Jisc Digifest 2016Jisc
 
The power of digital for teaching and learning - Jisc Digifest 2016
The power of digital for teaching and learning - Jisc Digifest 2016The power of digital for teaching and learning - Jisc Digifest 2016
The power of digital for teaching and learning - Jisc Digifest 2016Jisc
 
New emerging assistive technologies - Jisc Digifest 2016
New emerging assistive technologies - Jisc Digifest 2016New emerging assistive technologies - Jisc Digifest 2016
New emerging assistive technologies - Jisc Digifest 2016Jisc
 
Delivering online learning - are you ready? - Jisc Digifest 2016
Delivering online learning - are you ready? - Jisc Digifest 2016Delivering online learning - are you ready? - Jisc Digifest 2016
Delivering online learning - are you ready? - Jisc Digifest 2016Jisc
 
Unlocking the potential of cloud in research and education - Jisc Digifest 2016
Unlocking the potential of cloud in research and education - Jisc Digifest 2016Unlocking the potential of cloud in research and education - Jisc Digifest 2016
Unlocking the potential of cloud in research and education - Jisc Digifest 2016Jisc
 
Build your own university app in under an hour - Jisc Digifest 2016
Build your own university app in under an hour - Jisc Digifest 2016Build your own university app in under an hour - Jisc Digifest 2016
Build your own university app in under an hour - Jisc Digifest 2016Jisc
 
Liberating facts from the scientific literature - Jisc Digifest 2016
Liberating facts from the scientific literature - Jisc Digifest 2016Liberating facts from the scientific literature - Jisc Digifest 2016
Liberating facts from the scientific literature - Jisc Digifest 2016Jisc
 

En vedette (20)

Responsible metrics for research - Jisc Digifest 2016
Responsible metrics for research - Jisc Digifest 2016Responsible metrics for research - Jisc Digifest 2016
Responsible metrics for research - Jisc Digifest 2016
 
The user -driven evolution of Janet - Jisc Digifest 2016
The user -driven evolution of Janet - Jisc Digifest 2016The user -driven evolution of Janet - Jisc Digifest 2016
The user -driven evolution of Janet - Jisc Digifest 2016
 
Understanding Prevent - the role of education in tackling radicalisation - Ji...
Understanding Prevent - the role of education in tackling radicalisation - Ji...Understanding Prevent - the role of education in tackling radicalisation - Ji...
Understanding Prevent - the role of education in tackling radicalisation - Ji...
 
Working in partnership to develop student employability - Jisc Digifest 2016
Working in partnership to develop student employability - Jisc Digifest 2016Working in partnership to develop student employability - Jisc Digifest 2016
Working in partnership to develop student employability - Jisc Digifest 2016
 
Leveraging the digital - capability, capacity and change in higher and furthe...
Leveraging the digital - capability, capacity and change in higher and furthe...Leveraging the digital - capability, capacity and change in higher and furthe...
Leveraging the digital - capability, capacity and change in higher and furthe...
 
Benefits and efficiencies with Vscene - Jisc Digifest 2016
Benefits and efficiencies with Vscene - Jisc Digifest 2016Benefits and efficiencies with Vscene - Jisc Digifest 2016
Benefits and efficiencies with Vscene - Jisc Digifest 2016
 
Introducing the open citation experiment - Jisc Digifest 2016
Introducing the open citation experiment - Jisc Digifest 2016Introducing the open citation experiment - Jisc Digifest 2016
Introducing the open citation experiment - Jisc Digifest 2016
 
Universities as e-textbook publishers - Jisc Digifest 2016
Universities as e-textbook publishers - Jisc Digifest 2016Universities as e-textbook publishers - Jisc Digifest 2016
Universities as e-textbook publishers - Jisc Digifest 2016
 
Transforming assessment and feedback with technology - Jisc Digifest 2016
Transforming assessment and feedback with technology - Jisc Digifest 2016Transforming assessment and feedback with technology - Jisc Digifest 2016
Transforming assessment and feedback with technology - Jisc Digifest 2016
 
The future of cloud computing - Jisc Digifest 2016
The future of cloud computing - Jisc Digifest 2016The future of cloud computing - Jisc Digifest 2016
The future of cloud computing - Jisc Digifest 2016
 
Figshare for institutions - Jisc Digifest 2016
Figshare for institutions - Jisc Digifest 2016Figshare for institutions - Jisc Digifest 2016
Figshare for institutions - Jisc Digifest 2016
 
The evolution of FELTAG - Jisc Digifest 2016
The evolution of FELTAG - Jisc Digifest 2016The evolution of FELTAG - Jisc Digifest 2016
The evolution of FELTAG - Jisc Digifest 2016
 
Business intelligence: making more informed decisions - Jisc Digifest 2016
Business intelligence: making more informed decisions - Jisc Digifest 2016Business intelligence: making more informed decisions - Jisc Digifest 2016
Business intelligence: making more informed decisions - Jisc Digifest 2016
 
The value of Jisc Collections - Jisc Digifest 2016
The value of Jisc Collections - Jisc Digifest 2016The value of Jisc Collections - Jisc Digifest 2016
The value of Jisc Collections - Jisc Digifest 2016
 
The power of digital for teaching and learning - Jisc Digifest 2016
The power of digital for teaching and learning - Jisc Digifest 2016The power of digital for teaching and learning - Jisc Digifest 2016
The power of digital for teaching and learning - Jisc Digifest 2016
 
New emerging assistive technologies - Jisc Digifest 2016
New emerging assistive technologies - Jisc Digifest 2016New emerging assistive technologies - Jisc Digifest 2016
New emerging assistive technologies - Jisc Digifest 2016
 
Delivering online learning - are you ready? - Jisc Digifest 2016
Delivering online learning - are you ready? - Jisc Digifest 2016Delivering online learning - are you ready? - Jisc Digifest 2016
Delivering online learning - are you ready? - Jisc Digifest 2016
 
Unlocking the potential of cloud in research and education - Jisc Digifest 2016
Unlocking the potential of cloud in research and education - Jisc Digifest 2016Unlocking the potential of cloud in research and education - Jisc Digifest 2016
Unlocking the potential of cloud in research and education - Jisc Digifest 2016
 
Build your own university app in under an hour - Jisc Digifest 2016
Build your own university app in under an hour - Jisc Digifest 2016Build your own university app in under an hour - Jisc Digifest 2016
Build your own university app in under an hour - Jisc Digifest 2016
 
Liberating facts from the scientific literature - Jisc Digifest 2016
Liberating facts from the scientific literature - Jisc Digifest 2016Liberating facts from the scientific literature - Jisc Digifest 2016
Liberating facts from the scientific literature - Jisc Digifest 2016
 

Similaire à The fourth paradigm: data intensive scientific discovery - Jisc Digifest 2016

Jonathan Tedds Distinguished Lecture at DLab, UC Berkeley, 12 Sep 2013: "The ...
Jonathan Tedds Distinguished Lecture at DLab, UC Berkeley, 12 Sep 2013: "The ...Jonathan Tedds Distinguished Lecture at DLab, UC Berkeley, 12 Sep 2013: "The ...
Jonathan Tedds Distinguished Lecture at DLab, UC Berkeley, 12 Sep 2013: "The ...Jonathan Tedds
 
Finding and Accessing Human Genomics Datasets
Finding and Accessing Human Genomics DatasetsFinding and Accessing Human Genomics Datasets
Finding and Accessing Human Genomics DatasetsManuel Corpas
 
dkNET Webinar: Creating and Sustaining a FAIR Biomedical Data Ecosystem 10/09...
dkNET Webinar: Creating and Sustaining a FAIR Biomedical Data Ecosystem 10/09...dkNET Webinar: Creating and Sustaining a FAIR Biomedical Data Ecosystem 10/09...
dkNET Webinar: Creating and Sustaining a FAIR Biomedical Data Ecosystem 10/09...dkNET
 
Dataverse in the Universe of Data by Christine L. Borgman
Dataverse in the Universe of Data by Christine L. BorgmanDataverse in the Universe of Data by Christine L. Borgman
Dataverse in the Universe of Data by Christine L. Borgmandatascienceiqss
 
Why science needs open data – Jisc and CNI conference 10 July 2014
Why science needs open data – Jisc and CNI conference 10 July 2014Why science needs open data – Jisc and CNI conference 10 July 2014
Why science needs open data – Jisc and CNI conference 10 July 2014Jisc
 
Meeting Federal Research Requirements for Data Management Plans, Public Acces...
Meeting Federal Research Requirements for Data Management Plans, Public Acces...Meeting Federal Research Requirements for Data Management Plans, Public Acces...
Meeting Federal Research Requirements for Data Management Plans, Public Acces...ICPSR
 
Why Data Science Matters - 2014 WDS Data Stewardship Award Lecture
Why Data Science Matters - 2014 WDS Data Stewardship Award LectureWhy Data Science Matters - 2014 WDS Data Stewardship Award Lecture
Why Data Science Matters - 2014 WDS Data Stewardship Award LectureXiaogang (Marshall) Ma
 
Data-knowledge transition zones within the biomedical research ecosystem
Data-knowledge transition zones within the biomedical research ecosystemData-knowledge transition zones within the biomedical research ecosystem
Data-knowledge transition zones within the biomedical research ecosystemMaryann Martone
 
Data management: The new frontier for libraries
Data management: The new frontier for librariesData management: The new frontier for libraries
Data management: The new frontier for librariesLEARN Project
 
Presentation of science 2.0 at European Astronomical Society
Presentation of science 2.0 at European Astronomical SocietyPresentation of science 2.0 at European Astronomical Society
Presentation of science 2.0 at European Astronomical Societyosimod
 
06 e science-bio diversity@ pacc 18.07.2014
06 e science-bio diversity@ pacc 18.07.201406 e science-bio diversity@ pacc 18.07.2014
06 e science-bio diversity@ pacc 18.07.2014VinothkumaR Ramu
 
Alain Frey Research Data for universities and information producers
Alain Frey Research Data for universities and information producersAlain Frey Research Data for universities and information producers
Alain Frey Research Data for universities and information producersIncisive_Events
 
Data Literacy: Creating and Managing Reserach Data
Data Literacy: Creating and Managing Reserach DataData Literacy: Creating and Managing Reserach Data
Data Literacy: Creating and Managing Reserach Datacunera
 
Open science in RIKEN-KI doctorial course on March 20, 2019
Open science in RIKEN-KI doctorial course on March 20, 2019Open science in RIKEN-KI doctorial course on March 20, 2019
Open science in RIKEN-KI doctorial course on March 20, 2019Takeya Kasukawa
 
Big Data Europe SC6 WS 3: Ron Dekker, Director CESSDA European Open Science A...
Big Data Europe SC6 WS 3: Ron Dekker, Director CESSDA European Open Science A...Big Data Europe SC6 WS 3: Ron Dekker, Director CESSDA European Open Science A...
Big Data Europe SC6 WS 3: Ron Dekker, Director CESSDA European Open Science A...BigData_Europe
 
Managing 'Big Data' in the social sciences: the contribution of an analytico-...
Managing 'Big Data' in the social sciences: the contribution of an analytico-...Managing 'Big Data' in the social sciences: the contribution of an analytico-...
Managing 'Big Data' in the social sciences: the contribution of an analytico-...CILIP MDG
 
How to Execute A Research Paper
How to Execute A Research PaperHow to Execute A Research Paper
How to Execute A Research PaperAnita de Waard
 
Graham Pryor
Graham PryorGraham Pryor
Graham PryorEduserv
 

Similaire à The fourth paradigm: data intensive scientific discovery - Jisc Digifest 2016 (20)

Jonathan Tedds Distinguished Lecture at DLab, UC Berkeley, 12 Sep 2013: "The ...
Jonathan Tedds Distinguished Lecture at DLab, UC Berkeley, 12 Sep 2013: "The ...Jonathan Tedds Distinguished Lecture at DLab, UC Berkeley, 12 Sep 2013: "The ...
Jonathan Tedds Distinguished Lecture at DLab, UC Berkeley, 12 Sep 2013: "The ...
 
Finding and Accessing Human Genomics Datasets
Finding and Accessing Human Genomics DatasetsFinding and Accessing Human Genomics Datasets
Finding and Accessing Human Genomics Datasets
 
Big Data
Big Data Big Data
Big Data
 
dkNET Webinar: Creating and Sustaining a FAIR Biomedical Data Ecosystem 10/09...
dkNET Webinar: Creating and Sustaining a FAIR Biomedical Data Ecosystem 10/09...dkNET Webinar: Creating and Sustaining a FAIR Biomedical Data Ecosystem 10/09...
dkNET Webinar: Creating and Sustaining a FAIR Biomedical Data Ecosystem 10/09...
 
Dataverse in the Universe of Data by Christine L. Borgman
Dataverse in the Universe of Data by Christine L. BorgmanDataverse in the Universe of Data by Christine L. Borgman
Dataverse in the Universe of Data by Christine L. Borgman
 
Why science needs open data – Jisc and CNI conference 10 July 2014
Why science needs open data – Jisc and CNI conference 10 July 2014Why science needs open data – Jisc and CNI conference 10 July 2014
Why science needs open data – Jisc and CNI conference 10 July 2014
 
Meeting Federal Research Requirements for Data Management Plans, Public Acces...
Meeting Federal Research Requirements for Data Management Plans, Public Acces...Meeting Federal Research Requirements for Data Management Plans, Public Acces...
Meeting Federal Research Requirements for Data Management Plans, Public Acces...
 
Why Data Science Matters - 2014 WDS Data Stewardship Award Lecture
Why Data Science Matters - 2014 WDS Data Stewardship Award LectureWhy Data Science Matters - 2014 WDS Data Stewardship Award Lecture
Why Data Science Matters - 2014 WDS Data Stewardship Award Lecture
 
Data-knowledge transition zones within the biomedical research ecosystem
Data-knowledge transition zones within the biomedical research ecosystemData-knowledge transition zones within the biomedical research ecosystem
Data-knowledge transition zones within the biomedical research ecosystem
 
Data management: The new frontier for libraries
Data management: The new frontier for librariesData management: The new frontier for libraries
Data management: The new frontier for libraries
 
Presentation of science 2.0 at European Astronomical Society
Presentation of science 2.0 at European Astronomical SocietyPresentation of science 2.0 at European Astronomical Society
Presentation of science 2.0 at European Astronomical Society
 
06 e science-bio diversity@ pacc 18.07.2014
06 e science-bio diversity@ pacc 18.07.201406 e science-bio diversity@ pacc 18.07.2014
06 e science-bio diversity@ pacc 18.07.2014
 
Alain Frey Research Data for universities and information producers
Alain Frey Research Data for universities and information producersAlain Frey Research Data for universities and information producers
Alain Frey Research Data for universities and information producers
 
Open Access as a Means to Produce High Quality Data
Open Access as a Means to Produce High Quality DataOpen Access as a Means to Produce High Quality Data
Open Access as a Means to Produce High Quality Data
 
Data Literacy: Creating and Managing Reserach Data
Data Literacy: Creating and Managing Reserach DataData Literacy: Creating and Managing Reserach Data
Data Literacy: Creating and Managing Reserach Data
 
Open science in RIKEN-KI doctorial course on March 20, 2019
Open science in RIKEN-KI doctorial course on March 20, 2019Open science in RIKEN-KI doctorial course on March 20, 2019
Open science in RIKEN-KI doctorial course on March 20, 2019
 
Big Data Europe SC6 WS 3: Ron Dekker, Director CESSDA European Open Science A...
Big Data Europe SC6 WS 3: Ron Dekker, Director CESSDA European Open Science A...Big Data Europe SC6 WS 3: Ron Dekker, Director CESSDA European Open Science A...
Big Data Europe SC6 WS 3: Ron Dekker, Director CESSDA European Open Science A...
 
Managing 'Big Data' in the social sciences: the contribution of an analytico-...
Managing 'Big Data' in the social sciences: the contribution of an analytico-...Managing 'Big Data' in the social sciences: the contribution of an analytico-...
Managing 'Big Data' in the social sciences: the contribution of an analytico-...
 
How to Execute A Research Paper
How to Execute A Research PaperHow to Execute A Research Paper
How to Execute A Research Paper
 
Graham Pryor
Graham PryorGraham Pryor
Graham Pryor
 

Plus de Jisc

Jisc's value to HE: the University of Sheffield
Jisc's value to HE: the University of SheffieldJisc's value to HE: the University of Sheffield
Jisc's value to HE: the University of SheffieldJisc
 
Towards a code of practice for AI in AT.pptx
Towards a code of practice for AI in AT.pptxTowards a code of practice for AI in AT.pptx
Towards a code of practice for AI in AT.pptxJisc
 
Jamworks pilot and AI at Jisc (20/03/2024)
Jamworks pilot and AI at Jisc (20/03/2024)Jamworks pilot and AI at Jisc (20/03/2024)
Jamworks pilot and AI at Jisc (20/03/2024)Jisc
 
Wellbeing inclusion and digital dystopias.pptx
Wellbeing inclusion and digital dystopias.pptxWellbeing inclusion and digital dystopias.pptx
Wellbeing inclusion and digital dystopias.pptxJisc
 
Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)Jisc
 
Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...Jisc
 
International students’ digital experience: understanding and mitigating the ...
International students’ digital experience: understanding and mitigating the ...International students’ digital experience: understanding and mitigating the ...
International students’ digital experience: understanding and mitigating the ...Jisc
 
Digital Storytelling Community Launch!.pptx
Digital Storytelling Community Launch!.pptxDigital Storytelling Community Launch!.pptx
Digital Storytelling Community Launch!.pptxJisc
 
Open Access book publishing understanding your options (1).pptx
Open Access book publishing understanding your options (1).pptxOpen Access book publishing understanding your options (1).pptx
Open Access book publishing understanding your options (1).pptxJisc
 
Scottish Universities Press supporting authors with requirements for open acc...
Scottish Universities Press supporting authors with requirements for open acc...Scottish Universities Press supporting authors with requirements for open acc...
Scottish Universities Press supporting authors with requirements for open acc...Jisc
 
How Bloomsbury is supporting authors with UKRI long-form open access requirem...
How Bloomsbury is supporting authors with UKRI long-form open access requirem...How Bloomsbury is supporting authors with UKRI long-form open access requirem...
How Bloomsbury is supporting authors with UKRI long-form open access requirem...Jisc
 
Jisc Northern Ireland Strategy Forum 2023
Jisc Northern Ireland Strategy Forum 2023Jisc Northern Ireland Strategy Forum 2023
Jisc Northern Ireland Strategy Forum 2023Jisc
 
Jisc Scotland Strategy Forum 2023
Jisc Scotland Strategy Forum 2023Jisc Scotland Strategy Forum 2023
Jisc Scotland Strategy Forum 2023Jisc
 
Jisc stakeholder strategic update 2023
Jisc stakeholder strategic update 2023Jisc stakeholder strategic update 2023
Jisc stakeholder strategic update 2023Jisc
 
JISC Presentation.pptx
JISC Presentation.pptxJISC Presentation.pptx
JISC Presentation.pptxJisc
 
Community-led Open Access Publishing webinar.pptx
Community-led Open Access Publishing webinar.pptxCommunity-led Open Access Publishing webinar.pptx
Community-led Open Access Publishing webinar.pptxJisc
 
The Open Access Community Framework (OACF) 2023 (1).pptx
The Open Access Community Framework (OACF) 2023 (1).pptxThe Open Access Community Framework (OACF) 2023 (1).pptx
The Open Access Community Framework (OACF) 2023 (1).pptxJisc
 
Are we onboard yet University of Sussex.pptx
Are we onboard yet University of Sussex.pptxAre we onboard yet University of Sussex.pptx
Are we onboard yet University of Sussex.pptxJisc
 
JiscOAWeek_LAIR_slides_October2023.pptx
JiscOAWeek_LAIR_slides_October2023.pptxJiscOAWeek_LAIR_slides_October2023.pptx
JiscOAWeek_LAIR_slides_October2023.pptxJisc
 
UWP OA Week Presentation (1).pptx
UWP OA Week Presentation (1).pptxUWP OA Week Presentation (1).pptx
UWP OA Week Presentation (1).pptxJisc
 

Plus de Jisc (20)

Jisc's value to HE: the University of Sheffield
Jisc's value to HE: the University of SheffieldJisc's value to HE: the University of Sheffield
Jisc's value to HE: the University of Sheffield
 
Towards a code of practice for AI in AT.pptx
Towards a code of practice for AI in AT.pptxTowards a code of practice for AI in AT.pptx
Towards a code of practice for AI in AT.pptx
 
Jamworks pilot and AI at Jisc (20/03/2024)
Jamworks pilot and AI at Jisc (20/03/2024)Jamworks pilot and AI at Jisc (20/03/2024)
Jamworks pilot and AI at Jisc (20/03/2024)
 
Wellbeing inclusion and digital dystopias.pptx
Wellbeing inclusion and digital dystopias.pptxWellbeing inclusion and digital dystopias.pptx
Wellbeing inclusion and digital dystopias.pptx
 
Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)
 
Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...
 
International students’ digital experience: understanding and mitigating the ...
International students’ digital experience: understanding and mitigating the ...International students’ digital experience: understanding and mitigating the ...
International students’ digital experience: understanding and mitigating the ...
 
Digital Storytelling Community Launch!.pptx
Digital Storytelling Community Launch!.pptxDigital Storytelling Community Launch!.pptx
Digital Storytelling Community Launch!.pptx
 
Open Access book publishing understanding your options (1).pptx
Open Access book publishing understanding your options (1).pptxOpen Access book publishing understanding your options (1).pptx
Open Access book publishing understanding your options (1).pptx
 
Scottish Universities Press supporting authors with requirements for open acc...
Scottish Universities Press supporting authors with requirements for open acc...Scottish Universities Press supporting authors with requirements for open acc...
Scottish Universities Press supporting authors with requirements for open acc...
 
How Bloomsbury is supporting authors with UKRI long-form open access requirem...
How Bloomsbury is supporting authors with UKRI long-form open access requirem...How Bloomsbury is supporting authors with UKRI long-form open access requirem...
How Bloomsbury is supporting authors with UKRI long-form open access requirem...
 
Jisc Northern Ireland Strategy Forum 2023
Jisc Northern Ireland Strategy Forum 2023Jisc Northern Ireland Strategy Forum 2023
Jisc Northern Ireland Strategy Forum 2023
 
Jisc Scotland Strategy Forum 2023
Jisc Scotland Strategy Forum 2023Jisc Scotland Strategy Forum 2023
Jisc Scotland Strategy Forum 2023
 
Jisc stakeholder strategic update 2023
Jisc stakeholder strategic update 2023Jisc stakeholder strategic update 2023
Jisc stakeholder strategic update 2023
 
JISC Presentation.pptx
JISC Presentation.pptxJISC Presentation.pptx
JISC Presentation.pptx
 
Community-led Open Access Publishing webinar.pptx
Community-led Open Access Publishing webinar.pptxCommunity-led Open Access Publishing webinar.pptx
Community-led Open Access Publishing webinar.pptx
 
The Open Access Community Framework (OACF) 2023 (1).pptx
The Open Access Community Framework (OACF) 2023 (1).pptxThe Open Access Community Framework (OACF) 2023 (1).pptx
The Open Access Community Framework (OACF) 2023 (1).pptx
 
Are we onboard yet University of Sussex.pptx
Are we onboard yet University of Sussex.pptxAre we onboard yet University of Sussex.pptx
Are we onboard yet University of Sussex.pptx
 
JiscOAWeek_LAIR_slides_October2023.pptx
JiscOAWeek_LAIR_slides_October2023.pptxJiscOAWeek_LAIR_slides_October2023.pptx
JiscOAWeek_LAIR_slides_October2023.pptx
 
UWP OA Week Presentation (1).pptx
UWP OA Week Presentation (1).pptxUWP OA Week Presentation (1).pptx
UWP OA Week Presentation (1).pptx
 

Dernier

會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽
會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽
會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽中 央社
 
PSYPACT- Practicing Over State Lines May 2024.pptx
PSYPACT- Practicing Over State Lines May 2024.pptxPSYPACT- Practicing Over State Lines May 2024.pptx
PSYPACT- Practicing Over State Lines May 2024.pptxMarlene Maheu
 
Đề tieng anh thpt 2024 danh cho cac ban hoc sinh
Đề tieng anh thpt 2024 danh cho cac ban hoc sinhĐề tieng anh thpt 2024 danh cho cac ban hoc sinh
Đề tieng anh thpt 2024 danh cho cac ban hoc sinhleson0603
 
AIM of Education-Teachers Training-2024.ppt
AIM of Education-Teachers Training-2024.pptAIM of Education-Teachers Training-2024.ppt
AIM of Education-Teachers Training-2024.pptNishitharanjan Rout
 
8 Tips for Effective Working Capital Management
8 Tips for Effective Working Capital Management8 Tips for Effective Working Capital Management
8 Tips for Effective Working Capital ManagementMBA Assignment Experts
 
OSCM Unit 2_Operations Processes & Systems
OSCM Unit 2_Operations Processes & SystemsOSCM Unit 2_Operations Processes & Systems
OSCM Unit 2_Operations Processes & SystemsSandeep D Chaudhary
 
會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文
會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文
會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文中 央社
 
UChicago CMSC 23320 - The Best Commit Messages of 2024
UChicago CMSC 23320 - The Best Commit Messages of 2024UChicago CMSC 23320 - The Best Commit Messages of 2024
UChicago CMSC 23320 - The Best Commit Messages of 2024Borja Sotomayor
 
ĐỀ THAM KHẢO KÌ THI TUYỂN SINH VÀO LỚP 10 MÔN TIẾNG ANH FORM 50 CÂU TRẮC NGHI...
ĐỀ THAM KHẢO KÌ THI TUYỂN SINH VÀO LỚP 10 MÔN TIẾNG ANH FORM 50 CÂU TRẮC NGHI...ĐỀ THAM KHẢO KÌ THI TUYỂN SINH VÀO LỚP 10 MÔN TIẾNG ANH FORM 50 CÂU TRẮC NGHI...
ĐỀ THAM KHẢO KÌ THI TUYỂN SINH VÀO LỚP 10 MÔN TIẾNG ANH FORM 50 CÂU TRẮC NGHI...Nguyen Thanh Tu Collection
 
SPLICE Working Group: Reusable Code Examples
SPLICE Working Group:Reusable Code ExamplesSPLICE Working Group:Reusable Code Examples
SPLICE Working Group: Reusable Code ExamplesPeter Brusilovsky
 
ANTI PARKISON DRUGS.pptx
ANTI         PARKISON          DRUGS.pptxANTI         PARKISON          DRUGS.pptx
ANTI PARKISON DRUGS.pptxPoojaSen20
 
Basic Civil Engineering notes on Transportation Engineering & Modes of Transport
Basic Civil Engineering notes on Transportation Engineering & Modes of TransportBasic Civil Engineering notes on Transportation Engineering & Modes of Transport
Basic Civil Engineering notes on Transportation Engineering & Modes of TransportDenish Jangid
 
e-Sealing at EADTU by Kamakshi Rajagopal
e-Sealing at EADTU by Kamakshi Rajagopale-Sealing at EADTU by Kamakshi Rajagopal
e-Sealing at EADTU by Kamakshi RajagopalEADTU
 
MOOD STABLIZERS DRUGS.pptx
MOOD     STABLIZERS           DRUGS.pptxMOOD     STABLIZERS           DRUGS.pptx
MOOD STABLIZERS DRUGS.pptxPoojaSen20
 
Spellings Wk 4 and Wk 5 for Grade 4 at CAPS
Spellings Wk 4 and Wk 5 for Grade 4 at CAPSSpellings Wk 4 and Wk 5 for Grade 4 at CAPS
Spellings Wk 4 and Wk 5 for Grade 4 at CAPSAnaAcapella
 
When Quality Assurance Meets Innovation in Higher Education - Report launch w...
When Quality Assurance Meets Innovation in Higher Education - Report launch w...When Quality Assurance Meets Innovation in Higher Education - Report launch w...
When Quality Assurance Meets Innovation in Higher Education - Report launch w...Gary Wood
 

Dernier (20)

會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽
會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽
會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽
 
PSYPACT- Practicing Over State Lines May 2024.pptx
PSYPACT- Practicing Over State Lines May 2024.pptxPSYPACT- Practicing Over State Lines May 2024.pptx
PSYPACT- Practicing Over State Lines May 2024.pptx
 
Mattingly "AI & Prompt Design: Named Entity Recognition"
Mattingly "AI & Prompt Design: Named Entity Recognition"Mattingly "AI & Prompt Design: Named Entity Recognition"
Mattingly "AI & Prompt Design: Named Entity Recognition"
 
Đề tieng anh thpt 2024 danh cho cac ban hoc sinh
Đề tieng anh thpt 2024 danh cho cac ban hoc sinhĐề tieng anh thpt 2024 danh cho cac ban hoc sinh
Đề tieng anh thpt 2024 danh cho cac ban hoc sinh
 
AIM of Education-Teachers Training-2024.ppt
AIM of Education-Teachers Training-2024.pptAIM of Education-Teachers Training-2024.ppt
AIM of Education-Teachers Training-2024.ppt
 
8 Tips for Effective Working Capital Management
8 Tips for Effective Working Capital Management8 Tips for Effective Working Capital Management
8 Tips for Effective Working Capital Management
 
OSCM Unit 2_Operations Processes & Systems
OSCM Unit 2_Operations Processes & SystemsOSCM Unit 2_Operations Processes & Systems
OSCM Unit 2_Operations Processes & Systems
 
會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文
會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文
會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文
 
UChicago CMSC 23320 - The Best Commit Messages of 2024
UChicago CMSC 23320 - The Best Commit Messages of 2024UChicago CMSC 23320 - The Best Commit Messages of 2024
UChicago CMSC 23320 - The Best Commit Messages of 2024
 
Including Mental Health Support in Project Delivery, 14 May.pdf
Including Mental Health Support in Project Delivery, 14 May.pdfIncluding Mental Health Support in Project Delivery, 14 May.pdf
Including Mental Health Support in Project Delivery, 14 May.pdf
 
Supporting Newcomer Multilingual Learners
Supporting Newcomer  Multilingual LearnersSupporting Newcomer  Multilingual Learners
Supporting Newcomer Multilingual Learners
 
ĐỀ THAM KHẢO KÌ THI TUYỂN SINH VÀO LỚP 10 MÔN TIẾNG ANH FORM 50 CÂU TRẮC NGHI...
ĐỀ THAM KHẢO KÌ THI TUYỂN SINH VÀO LỚP 10 MÔN TIẾNG ANH FORM 50 CÂU TRẮC NGHI...ĐỀ THAM KHẢO KÌ THI TUYỂN SINH VÀO LỚP 10 MÔN TIẾNG ANH FORM 50 CÂU TRẮC NGHI...
ĐỀ THAM KHẢO KÌ THI TUYỂN SINH VÀO LỚP 10 MÔN TIẾNG ANH FORM 50 CÂU TRẮC NGHI...
 
VAMOS CUIDAR DO NOSSO PLANETA! .
VAMOS CUIDAR DO NOSSO PLANETA!                    .VAMOS CUIDAR DO NOSSO PLANETA!                    .
VAMOS CUIDAR DO NOSSO PLANETA! .
 
SPLICE Working Group: Reusable Code Examples
SPLICE Working Group:Reusable Code ExamplesSPLICE Working Group:Reusable Code Examples
SPLICE Working Group: Reusable Code Examples
 
ANTI PARKISON DRUGS.pptx
ANTI         PARKISON          DRUGS.pptxANTI         PARKISON          DRUGS.pptx
ANTI PARKISON DRUGS.pptx
 
Basic Civil Engineering notes on Transportation Engineering & Modes of Transport
Basic Civil Engineering notes on Transportation Engineering & Modes of TransportBasic Civil Engineering notes on Transportation Engineering & Modes of Transport
Basic Civil Engineering notes on Transportation Engineering & Modes of Transport
 
e-Sealing at EADTU by Kamakshi Rajagopal
e-Sealing at EADTU by Kamakshi Rajagopale-Sealing at EADTU by Kamakshi Rajagopal
e-Sealing at EADTU by Kamakshi Rajagopal
 
MOOD STABLIZERS DRUGS.pptx
MOOD     STABLIZERS           DRUGS.pptxMOOD     STABLIZERS           DRUGS.pptx
MOOD STABLIZERS DRUGS.pptx
 
Spellings Wk 4 and Wk 5 for Grade 4 at CAPS
Spellings Wk 4 and Wk 5 for Grade 4 at CAPSSpellings Wk 4 and Wk 5 for Grade 4 at CAPS
Spellings Wk 4 and Wk 5 for Grade 4 at CAPS
 
When Quality Assurance Meets Innovation in Higher Education - Report launch w...
When Quality Assurance Meets Innovation in Higher Education - Report launch w...When Quality Assurance Meets Innovation in Higher Education - Report launch w...
When Quality Assurance Meets Innovation in Higher Education - Report launch w...
 

The fourth paradigm: data intensive scientific discovery - Jisc Digifest 2016

  • 1. The Fourth Paradigm: Data-Intensive Scientific Discovery and Open Science Tony Hey Chief Data Scientist UK Science and Technology Facilities Council tony.hey@stfc.ac.uk
  • 2. The birthplace of the industrial revolution
  • 4. Much of Science is now Data-Intensive Number of Researchers Data Volume • Extremely large data sets • Expensive to move • Domain standards • High computational needs • Supercomputers, HPC, Grids e.g. High Energy Physics, Astronomy • Large data sets • Some Standards within Domains • Shared Datacenters & Clusters • Research Collaborations e.g. Genomics, Financial • Medium & Small data sets • Flat Files, Excel • Widely diverse data; Few standards • Local Servers & PCs e.g. Social Sciences, Humanities Four “V’s” of Data •Volume •Variety •Velocity •Veracity ‘The Long Tail of Science’
  • 5. The ‘Cosmic Genome Project’: The Sloan Digital Sky Survey • Survey of more than ¼ of the night sky • Survey produces 200 GB of data per night • Two surveys in one – images and spectra • Nearly 2M astronomical objects, including 800,000 galaxies, 100,000 quasars • 100’s of TB of data, and data is public • Started in 1992, ‘finished’ in 2008 The SkyServer Web Service was built at JHU by team led by Alex Szalay and Jim Gray The University of Chicago Princeton University The Johns Hopkins University The University of Washington New Mexico State University Fermi National Accelerator Laboratory US Naval Observatory The Japanese Participation Group The Institute for Advanced Study Max Planck Inst, Heidelberg Sloan Foundation, NSF, DOE, NASA
  • 6. Open Data: Public Use of the Sloan Data • SkyServer web service has had over 400 million web • About 1M distinct users vs 10,000 astronomers • >1600 refereed papers! • Delivered 50,000 hours of lectures to high schools  New publishing paradigm: data is published before analysis by astronomers  Platform for ‘citizen science’ with GalaxyZoo project Posterchild in 21st century data publishing
  • 7. Thousand years ago – Experimental Science • Description of natural phenomena Last few hundred years – Theoretical Science • Newton’s Laws, Maxwell’s Equations… Last few decades – Computational Science • Simulation of complex phenomena Today – Data-Intensive Science • Scientists overwhelmed with data sets from many different sources • Data captured by instruments • Data generated by simulations • Data generated by sensor networks eScience and the Fourth Paradigm 2 2 2. 3 4 a cG a a           eScience is the set of tools and technologies to support data federation and collaboration • For analysis and data mining • For data visualization and exploration • For scholarly communication and dissemination (With thanks to Jim Gray)
  • 8. Genomics and Personalized medicine Use genetic markers (e.g. SNPs) to…  Understand causes of disease  Diagnose a disease  Infer propensity to get a disease  Predict reaction to a drug
  • 9. Genomics, Machine Learning and the Cloud • Wellcome Trust data for seven common diseases • Look at all SNP pairs (about 60 billion) • Analysis with state-of-the-art Machine Learning algorithm requires 1,000 compute years and produces 20 TB output • Using 27,000 compute cores in Microsoft’s Cloud, the analysis was completed in 13 days First result: SNP pair implicated in coronary artery disease The Problem
  • 10. NSF’s Ocean Observatory Initiative Slide courtesy of John Delaney
  • 11. Oceans and Life Slide courtesy of John Delaney
  • 12. CEDA: Centre for Environmental Data Analysis To support environmental science, further environmental data archival practices, and develop and deploy new technologies to enhance access to data
  • 13. Centre for Environmental Data Analysis: JASMIN infrastructure Part data store, part supercomputer, part private cloud…
  • 14. Data-Intensive Scientific Discovery The Fourth Paradigm Science@Microsoft http://research.microsoft.com Amazon.com
  • 15. Open Access, Open Data, Open Science
  • 16. The US National Library of Medicine • The NIH Public Access Policy ensures that the public has access to the published results of NIH funded research. • Requires scientists to submit final peer-reviewed journal manuscripts that arise from NIH funds to the digital archive PubMed Central upon acceptance for publication. • Policy requires that these papers are accessible to the public on PubMed Central no later than 12 months after publication. Nucleotide sequences Protein sequences Taxon Phylogeny MMDB 3 -D Structure PubMed abstracts Complete Genomes PubMed Entrez Genomes Publishers Genome Centers Entrez cross-database search
  • 17. Serious problems of research reproducibility in bioinformatics During a decade as head of global cancer research at Amgen, C. Glenn Begley identified 53 "landmark" publications -- papers in top journals, from reputable labs -- for his team to reproduce. Result: 47 of the 53 could not be replicated!
  • 18. US White House Memo on increased public access to the results of federally-funded research • Directive required the major Federal Funding agencies “to develop a plan to support increased public access to the results of research funded by the Federal Government.” • The memo defines research results to encompass not only the research paper but also “… the digital recorded factual material commonly accepted in the scientific community as necessary to validate research findings including data sets used to support scholarly publications ….” 22 February 2013
  • 19. EPSRC Expectations for Data Preservation •Research organisations will ensure that EPSRC-funded research data is securely preserved for a minimum of 10 years from the date that any researcher ‘privileged access’ period expires •Research organisations will ensure that effective data curation is provided throughout the full data lifecycle
  • 20. Digital Curation Centre • Maintenance of digital data and research results over entire data life-cycle • Data curation and digital preservation • Research in tools and technologies to support data integration, annotation, provenance, metadata, security… Established by Jisc in 2004 with the mission to work with librarians, researchers and computer scientists to examine and support all aspects of research data curation and preservation Now DCC services more important than ever
  • 21. 44 % of data links from 2001 broken in 2011 Pepe et al. 2012 Sustainability of Data Links?
  • 22. Datacite and ORCID DataCite • International consortium to establish easier access to scientific research data • Increase acceptance of research data as legitimate, citable contributions to the scientific record • Support data archiving that will permit results to be verified and re- purposed for future study. ORCID - Open Research & Contributor ID • Aims to solve the author/contributor name ambiguity problem in scholarly communications • Central registry of unique identifiers for individual researchers • Open and transparent linking mechanism between ORCID and other current author ID schemes. • Identifiers can be linked to the researcher’s output to enhance the scientific discovery process
  • 23. Progress in Data Curation in last 10 years? • Biggest change is funding agency mandate: • NSF’s insistence on a Data Management Plan for all proposals has made scientists (pretend?) to take data curation seriously. • There are better curated databases and metadata now … • … but not sure that the quality fraction is increasing! • Frew’s laws of metadata: • First law: scientists don’t write metadata • Second law: any scientist can be forced to write bad metadata  Should automate creation of metadata as far as possible  Scientists need to work with metadata specialists with domain knowledge a.k.a. science librarians With thanks to Jim Frew, UCSB
  • 24.
  • 25. Role of Research Libraries?
  • 27. End-to end Network Support for Data-intensive Research?
  • 28. The goal of ‘campus bridging’ is to enable the seamlessly integrated use among: • a researcher’s personal cyberinfrastructure • cyberinfrastructure at other campuses • cyberinfrastructure at the regional, national and international levels so that they all function as if they were proximate to the scientist NSF Task Force on ‘Campus Bridging’ (2011)
  • 29.
  • 30.
  • 31. STFC Harwell Site Experimental Facilities ISIS CLF
  • 32. Pacific Research Platform • NSF funding $5M award to UC San Diego and UC Berkeley to establish a science-driven high- capacity data-centric “freeway system” on a large regional scale. • This network infrastructure will give the research institutions the ability to move data 1,000 times faster compared to speeds on today’s Internet. August 2015 “PRP will enable researchers to use standard tools to move data to and from their labs and their collaborators’ sites, supercomputer centers and data repositories distant from their campus IT infrastructure, at speeds comparable to accessing local disks,” said co-PI Tom DeFanti
  • 34. What is a Data Scientist? Data Engineer People who are expert at • Operating at low levels close to the data, write code that manipulates • They may have some machine learning background. • Large companies may have teams of them in-house or they may look to third party specialists to do the work. Data Analyst People who explore data through statistical and analytical methods • They may know programming; May be an spreadsheet wizard. • Either way, they can build models based on low-level data. • They eat and drink numbers; They know which questions to ask of the data. Every company will have lots of these. Data Steward People who think to managing, curating, and preserving data. • They are information specialists, archivists, librarians and compliance officers. • This is an important role: if data has value, you want someone to manage it, make it discoverable, look after it and make sure it remains usable. What is a data scientist? Microsoft UK Enterprise Insights Blog, Kenji Takeda http://blogs.msdn.com/b/microsoftenterpriseinsight/archive/2013/01/31/what-is-a-data-scientist.aspx
  • 35. Slide thanks to Bryan Lawrence Scientist career paths?
  • 36. Jim Gray’s Vision: All Scientific Data Online • Many disciplines overlap and use data from other sciences. • Internet can unify all literature and data • Go from literature to computation to data back to literature. • Information at your fingertips – For everyone, everywhere • Increase Scientific Information Velocity • Huge increase in Science Productivity (From Jim Gray’s last talk) Literature Derived and recombined data Raw Data
  • 37. Outline •The Fourth Paradigm – Data-intensive Science •Open Access, Open Data, Open Science • End-to-end Network Support for Data-intensive Research •The Future
  • 38. The Tolkien Trail in Hall Green and Moseley

Notes de l'éditeur

  1. Citizen Science If people do not understand what a cell is how can they understand the ethics and implications of stem-cell research? If the general public does not understand molecules and DNA how can they understand the principals of heredity and risks in healthcare and disease management? Or, put another way, scientific illiteracy undermines citizens' ability to take part in the democratic process (30). Although the NSF is not focused on broad-scale education it can catalyze community engagement in exciting scientific discovery and, through this, both advance scientific discovery and help educate US citizens in key scientific principles. There are now many examples of meaningful citizen science engagement however Galaxy Zoo (15) activities give a useful indication of the latent appetite for scientific engagement in society. This is a collection of online astronomy projects which invite members of the public to assist in classifying galaxies. In the first year, the initial project boasted over 50 million classifications made by 150,000 individuals in the general public – it quickly became the world's largest database of galaxy shapes. So successful was the original project that it spawned Galaxy Zoo 2 in February 2009 to classify another 250,000 SDSS galaxies. The project included unique scientific discoveries such as Hanny’s Voorwerp (31) and ‘Green Pea’ galaxies.  
  2. Microsoft’s cloud platform Tens of thousands of nodes available (~30k)
  3. Takes me back to our codesign research at risk – here is what the sector were asking us to work with them on. There was a strong desire for shared services, but getting to identify them wasn’t straight forward.