The Evolution
of e-Research
Machines, Methods and Music
David De Roure
[Figure: timeline of research themes, 1981 to 2010. Labels: Maths, Physics, Music, Electronics, Programming, Medical electronics, PhD in distributed declarative programming language design, Transputers, Hypermedia, Temporal Media, Large scale Distributed Systems, Agents, Advanced Knowledge Technologies, Semantic Web, Semantic Grid, Grid, e-Science, Equator, Amorphous Computing, Devices, Environmental sensing, Semantic Sensor Networks, Process Networks, Workflows, e-Laboratories, VREs, myExperiment, Web 2, Linked Data, Web Science, Digital Social Research, Statistics, QBH, Computational Musicology]
Overview
Generation 1: Early adopters
Generation 2: Embedding
Generation 3: Radical sharing
SALAMI
A case study in 3rd generation e-Research
e-Science
• e-Science was defined by John Taylor (Director
General of the UK Research Councils) as
global collaboration in key areas of science
and the next generation of infrastructure that
will enable it
• e-Science was the name of the destination
• It became the name of the journey
• When we arrive, the destination is just called
science
“e-research extends
e-Science and
cyberinfrastructure
to other disciplines,
including the
humanities and
social sciences.”
e-Research
http://mitpress.mit.edu/catalog/item/default.asp?tid=12185&ttype=2
2000 – 2005
Generation 1
...the imminent flood of
scientific data expected
from the next generation of
experiments, simulations,
sensors and satellites
Tony Hey and Anne Trefethen
Source: CERN, CERN-EX-0712023, http://cdsweb.cern.ch/record/1203203
Jeremy Frey
• Workflows are the new rock
and roll
• Machinery for coordinating
the execution of (scientific)
services and linking together
(scientific) resources
• The era of Service Oriented
Applications
• Repetitive and mundane
boring stuff made easier
Carole Goble
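The idea of a workflow as machinery that coordinates services can be illustrated with a minimal sketch. It is not how Taverna, Kepler or any other system named in this deck is implemented; the step functions and the name `run_workflow` are hypothetical stand-ins for remote scientific services.

```python
from typing import Any, Callable, List, Tuple

# A step is a (name, service) pair; the service is any callable.
Step = Tuple[str, Callable[[Any], Any]]


def run_workflow(steps: List[Step], data: Any) -> Tuple[Any, List[str]]:
    """Run each step in order, feeding one step's output to the next.

    Returns the final result plus a simple log of the steps executed.
    """
    log = []
    for name, service in steps:
        data = service(data)
        log.append(name)
    return data, log


# Hypothetical stand-ins for remote scientific services.
def fetch_sequences(ids):
    return [f"SEQ-{i}" for i in ids]


def filter_long(seqs):
    return [s for s in seqs if len(s) > 5]


def count(seqs):
    return len(seqs)


workflow = [("fetch", fetch_sequences), ("filter", filter_long), ("count", count)]
result, log = run_workflow(workflow, [1, 22, 333])
```

Because the workflow is just data (a list of named steps), it can be stored, shared and rerun on new inputs without change, which is what makes the repetitive work easier.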
E. Science laboris
Kepler
Triana
BPEL
Taverna
Trident
Meandre
Galaxy
co-shaping
co-design
co-creation
co-constitution
co-evolution
co-construction
co-
co-realisation
http://webscience.org
Box of Chemists
My Chemistry Experiment
CombeChem
CombeChem
empower
to equip or supply with an ability;
enable
service
the performance of duties or the
duties performed as or by a
waiter or servant
Early adopters of tools.
Characterised by researchers using tools within their
particular problem area, with some re-use of tools, data
and methods within the discipline.
Traditional publishing is supplemented by publication of
some digital artefacts like workflows and links to data.
Science is accelerated and practice is beginning to shift to
emphasise in silico work.
1st Generation Summary
Thanks to Iain Buchan
and the chipmunks
2005 – 2010
Generation 2
• Paul writes workflows for identifying biological
pathways implicated in resistance to
Trypanosomiasis in cattle
• Paul meets Jo. Jo is investigating Whipworm in
mouse.
• Jo reuses one of Paul’s workflows without change.
• Jo identifies the biological pathways involved in
sex dependence in the mouse model, believed to
be involved in the ability of mice to expel the
parasite.
• Previously a manual two year study by Jo had
failed to do this.
Reuse, Recycling, Repurposing
Carole Goble
Carole Goble “e-Science
is me-Science: What do
Scientists want?”, EGEE
2006
“There are these great
collaboration tools that
12-year-olds are using.
It’s all back to front.”
Robert Stevens
“A biologist would rather share their
toothbrush than their gene name”
Mike Ashburner and others
Professor in Dept of Genetics,
University of Cambridge, UK
Data mining: my data’s mine and your
data’s mine
workflows
photos
movies
slides
mySpace for scientists!
Facebook
Not too open!
too passé!
Open
Repositories
Researchers
Social
Networkers
Developers
Social
Scientists
• “Facebook for Scientists”
...but different to Facebook!
• A repository of research
methods
• A community social network
of people and things
• A Social Virtual Research
Environment
• A probe into researcher
behaviour
• Open source (BSD) Ruby on
Rails app
• REST and SPARQL interfaces,
supports Linked Data
• Inspiration for: BioCatalogue,
MethodBox and SysMO-SEEK
myExperiment currently has 4400 members, 236 groups, 1336
workflows, 351 files and 141 packs
http://www.myexperiment.org/
Visits to www.myexperiment.org (Oct 2010)
Global collaboration
in key areas of
science and the next
generation of
infrastructure that
will enable it
http://wiki.myexperiment.org
data
method
Methods should be first class citizens
Celebrate the flux! Let the data flow
through the pipelines. Nail down the
methods not the data!
Towards “Linked Open Methods”
Though this be madness, yet there is method in it
* Polonius in Hamlet
** Sean Bechhofer in Manchester
*** Not the e-Science Envoy
Data bonanza => Methods bonanza!
It’s not just the data
And what other people do with it
...that you never thought of
It’s what you do with it that counts
[Figure: Paul’s Pack, shown as Paul’s Research Object. It links Workflow 16, Workflow 13, QTL, Common pathways, Results, Logs, Metadata, Paper and Slides through “produces”, “feeds into”, “included in” and “published in” relationships.]
Research Objects enable data-intensive research to be:
1. Replayable – go back and see what happened
2. Repeatable – run the experiment again
3. Reproducible – independent expt to reproduce
4. Reusable – use as part of new experiments
5. Repurposeable – reuse the pieces in new expt
6. Reliable – robust under automation
7. Referenceable – citable and traceable
The Six Rs of Research Object Behaviours
http://blog.openwetware.org/deroure/?p=56
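A Research Object can be pictured as an aggregation of resources plus the relationships between them. The sketch below is one hypothetical way to model that; the class name, method names and the example wiring are assumptions for illustration, not the myExperiment pack format or any published Research Object model.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ResearchObject:
    """A hypothetical, minimal aggregation of resources and their relationships."""
    identifier: str
    resources: List[str] = field(default_factory=list)
    relations: List[Tuple[str, str, str]] = field(default_factory=list)

    def aggregate(self, resource: str) -> None:
        # Add a resource once, keeping insertion order.
        if resource not in self.resources:
            self.resources.append(resource)

    def relate(self, subject: str, predicate: str, obj: str) -> None:
        # Record a relationship and make sure both ends are aggregated.
        self.aggregate(subject)
        self.aggregate(obj)
        self.relations.append((subject, predicate, obj))

    def produced_by(self, resource: str) -> List[str]:
        """Trace back: which resources produced this one?"""
        return [s for s, p, o in self.relations if p == "produces" and o == resource]


# Illustrative wiring only; the real pack's relationships may differ.
pack = ResearchObject("pauls-pack")
pack.relate("workflow-16", "produces", "results")
pack.relate("workflow-16", "produces", "logs")
pack.relate("results", "published in", "paper")
```

Keeping the relationships explicit is what lets a reader go back and see what happened: `pack.produced_by("results")` traces a result to the workflow that generated it.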
Semantically enhanced publication versus
Shared digital Research Objects
Challenging the mindset of paper-sized chunks
Documents
under glass
Projects delivering now.
Some institutional embedding.
Key characteristic is re-use – of the increasing pool of
tools, data and methods across areas/disciplines.
Contain some freestanding, recombinant, reproducible
research objects.
New scientific practices are established and opportunities
arise for completely new scientific investigations.
Some expert curation.
2nd Generation Summary
2010 – 2015
Generation 3
4th Paradigm
The Fourth Paradigm:
Data-Intensive
Scientific Discovery
Presenting the first
broad look at the rapidly
emerging field of data-
intensive science
http://research.microsoft.com/en-us/collaboration/fourthparadigm/
http://blogs.nature.com/fourthparadigm/
BioEssays, 26(1):99–105, January 2004
Doug Kell
Francois Belleau
“…to discover proteins that interact with transmembrane
proteins, particularly those that can be related to neuro-
degenerative diseases in which amyloids play a significant role”
1) Taverna provenance exposed as RDF
2) myExperiment RDF document for a protein discovery workflow
3) Mocked-up BioCatalogue document using myExperiment RDF
data as example
4) Provisional RDF documents obtained from the ConceptWiki
(conceptwiki.org) development server
5) An RDF document for an example protein, obtained from the RDF
interface of the UniProt web site
A Bioinformatics Experiment Scott Marshall
Marco Roos
LifeGuide http://www.lifeguideonline.org/
Lucy Yardley
MethodBox http://www.methodbox.org/
Enable cross
disciplinary research
into Major Public
Health problems
Ease handling data
and sharing results
and insights
http://www.galaxyzoo.org/
Arfon Smith
http://www.zooniverse.org/
The solutions we'll be delivering in 5 years
Characterised by global reuse of tools, data and methods
across any discipline, and surfacing the right levels of
complexity for the researcher.
Routine use.
Key characteristic is radical sharing.
Research is significantly data driven – plundering the
backlog of data, results and methods.
Publishing by the social network
Increasing automation and decision-support for the
researcher – the VRE becomes assistive.
Curation is autonomic and social.
3rd Generation Summary
Easy and low risk to start
Progress to advanced skills
For researchers
No obligation
Go as far as you want
Find a service & relax
Intellectual ramps
Malcolm Atkinson
NRAO/AUI/NSF
Datascopes: telescopes for the naked mind
Malcolm Atkinson
From Signal to Understanding
Jeannette M. Wing COMMUNICATIONS OF THE ACM March 2006/Vol. 49, No. 3 Pages 33-35
2010 – 2011
and beyond
Music and Linked Data
http://www.openarchives.org/ore/terms/aggregates
http://eprints.ecs.soton.ac.uk/id/eprint/20817
It’s about enabling the join
Ben Fields, 6th October 2010
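“Enabling the join” can be sketched with two small sets of triples that share identifiers. Everything below is made up for illustration: the `example.org` URIs, the predicate names and the `join` helper are assumptions, and a real system would typically use an RDF store and SPARQL rather than Python lists.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]

# Hypothetical triples from two independent sources; all URIs are invented.
collection: List[Triple] = [
    ("http://example.org/track/1", "performedBy", "http://example.org/artist/a"),
    ("http://example.org/track/2", "performedBy", "http://example.org/artist/b"),
]
artists: List[Triple] = [
    ("http://example.org/artist/a", "basedIn", "Nashville"),
    ("http://example.org/artist/b", "basedIn", "Southampton"),
]


def join(left, left_pred, right, right_pred):
    """Pair subjects of `left` with objects of `right`
    wherever an object in `left` is a subject in `right`."""
    index = {s: o for s, p, o in right if p == right_pred}
    return [(s, index[o]) for s, p, o in left if p == left_pred and o in index]


track_places = join(collection, "performedBy", artists, "basedIn")
```

The join only works because both sources use the same URI for the same artist; that shared identifier is the point of publishing music metadata as Linked Data.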
SALAMI: Structural Analysis
of Large Amounts of Music
Information
David De Roure
J. Stephen Downie
Ichiro Fujinaga
www.diggingintodata.org
The SALAMI collaboration
• DDeR (e-Research South), J. Stephen Downie (Illinois) and
Ichiro Fujinaga (McGill)
• NCSA donating 250,000 supercomputer hours
• 350,000 pieces of music (23,000 hours)
– Internet Archive, DRAM, IMIRSEL, McGill
• Feature analysis and structural analysis
• Music Ontology by Yves Raimond (BBC)
• Musicologists from McGill and Southampton
• Sharing of analyses
http://salami.music.mcgill.ca
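The figures above allow a quick back-of-envelope check of scale. The calculation assumes, purely for illustration, that the donated compute time is spread evenly across the whole collection; the variable names are not from the project.

```python
# Figures quoted on the slide.
pieces = 350_000
music_hours = 23_000
compute_hours = 250_000

# Average length of a piece, in minutes.
avg_piece_minutes = music_hours * 60 / pieces

# Supercomputer hours available per hour of recorded music.
compute_per_music_hour = compute_hours / music_hours

# Supercomputer minutes available per piece, if spread evenly (an assumption).
compute_minutes_per_piece = compute_hours * 60 / pieces

print(round(avg_piece_minutes, 2))          # 3.94
print(round(compute_per_music_hour, 2))     # 10.87
print(round(compute_minutes_per_piece, 2))  # 42.86
```

So the average piece is just under four minutes long, and the allocation amounts to roughly eleven supercomputer hours for every hour of audio.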
Digital Music
Collections
Crowdsourced
ground truth
Community
Software
Linked Data
Repositories
Supercomputer
23,000 hours of
recorded music
250,000 hours NCSA
Supercomputer time
Music Information
Retrieval Community
Ashley Burgoyne
http://www.sonicvisualiser.org/
MIREX Overview
• Began in 2005
• Tasks defined by community debate
• Data sets collected and/or donated
• Participants submit code to IMIRSEL
• Code rarely works first try
• Huge labour consumption getting
programs to work
• Meet at ISMIR to discuss results
Stephen Downie
http://www.music-ir.org/mirex
MIREX TASKS
Audio Artist Identification
Audio Beat Tracking
Audio Chord Detection
Audio Classical Composer ID
Audio Cover Song Identification
Audio Drum Detection
Audio Genre Classification
Audio Key Finding
Audio Melody Extraction
Audio Mood Classification
Audio Music Similarity
Audio Onset Detection
Audio Tag Classification
Audio Tempo Extraction
Multiple F0 Estimation
Multiple F0 Note Detection
Query-by-Singing/Humming
Query-by-Tapping
Score Following
Symbolic Genre Classification
Symbolic Key Finding
Symbolic Melodic Similarity
Meandre
seasr.org/meandre
“Signal”
Digital Audio
“Ground Truth”
Community
It’s web-like!
Q. If and when should community-generated content be assimilated into managed repositories?
Structural
Analysis
How country is
my country?
Kevin Page and Ben Fields
http://www.nema.ecs.soton.ac.uk/countrycountry/
Stephen Downie
Music and computational thinking
“Again, it [the Analytical
Engine] might act upon
other things besides
number, were objects
found whose mutual
fundamental relations
could be expressed by
those of the abstract
science of operations,
and which should be
also susceptible of
adaptations to the action
of the operating notation
and mechanism of the
engine...”
“Supposing, for instance,
that the fundamental
relations of pitched
sounds in the science of
harmony and of musical
composition were
susceptible of such
expression and
adaptations, the engine
might compose elaborate
and scientific pieces of
music of any degree of
complexity or extent.”
Ada, The Enchantress of
Numbers: Poetical Science
by Betty Alexandra Toole
http://www.well.com/user/adatoole/
Betty Alexandra Toole
I can write a workflow that creates
workflows based on those of others, and
automatically modify it – think genetic
mutation and crossovers. Who owns it?
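The “genetic mutation and crossovers” idea can be made concrete with a toy sketch in which a workflow is simply a list of step names. The step names and the `crossover` and `mutate` helpers are hypothetical, and the cut point and mutation are chosen by hand rather than at random so the result is easy to follow.

```python
from typing import List

Workflow = List[str]


def crossover(a: Workflow, b: Workflow, point: int) -> Workflow:
    """Take the first `point` steps from `a` and the rest from `b`."""
    return a[:point] + b[point:]


def mutate(wf: Workflow, index: int, new_step: str) -> Workflow:
    """Return a copy of `wf` with one step replaced."""
    child = list(wf)
    child[index] = new_step
    return child


# Two hypothetical workflows written by different people.
parent_a = ["fetch", "align", "annotate"]
parent_b = ["load", "cluster", "report"]

child = mutate(crossover(parent_a, parent_b, 1), 2, "visualise")
```

The child, `["fetch", "cluster", "visualise"]`, contains a step from each parent and one that neither author wrote, which is exactly why the ownership question arises.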
I can register a query over an increasing
number and diversity of “linked data”
sources to ask new research questions.
http://eresearch-ethics.org/
The computer can learn from the activities of 1,000,000
scientists – and be indistinguishable from them?
What about the ethics of Citizen Social Science? Of citizens
designing experiments?
Co-*
Methods
Access ramps
Research Objects
Computational thinking
Ethics of e-Research at scale
david.deroure@oerc.ox.ac.uk
Thanks to: Jeremy Frey & CombeChem; Carole Goble, myGrid and
myExperiment; Iain Buchan & Obesity e-Lab; Sean Bechhofer; Doug Kell;
Marco Roos; Lucy Yardley; Arfon Smith; Malcolm Atkinson; Stephen
Downie, Kevin Page, Ben Fields, Ashley Burgoyne and NEMA/SALAMI;
Betty Toole.
http://www.myexperiment.org/packs/153
The Evolution of e-Research: Machines, Methods and Music

More Related Content

What's hot

Data Sharing & Data Citation
Data Sharing & Data CitationData Sharing & Data Citation
Data Sharing & Data CitationMicah Altman
 
A Lifecycle Approach to Information Privacy
A Lifecycle Approach to Information PrivacyA Lifecycle Approach to Information Privacy
A Lifecycle Approach to Information PrivacyMicah Altman
 
Share: Science Information Life Cycle
Share: Science Information Life CycleShare: Science Information Life Cycle
Share: Science Information Life Cyclekauberry
 
Expertise for the future: harnessing the power of digital technologies
Expertise for the future: harnessing the power of digital technologiesExpertise for the future: harnessing the power of digital technologies
Expertise for the future: harnessing the power of digital technologiesEFSA EU
 
Increasing transparency in Medical Education through Open Data
Increasing transparency in Medical Education through Open Data Increasing transparency in Medical Education through Open Data
Increasing transparency in Medical Education through Open Data Rebecca Grant
 
Big data divided (24 march2014)
Big data divided (24 march2014)Big data divided (24 march2014)
Big data divided (24 march2014)Han Woo PARK
 
DATA CENTRIC EDUCATION & LEARNING
 DATA CENTRIC EDUCATION & LEARNING DATA CENTRIC EDUCATION & LEARNING
DATA CENTRIC EDUCATION & LEARNINGdatasciencekorea
 
Bioinformatics in the Era of Open Science and Big Data
Bioinformatics in the Era of Open Science and Big DataBioinformatics in the Era of Open Science and Big Data
Bioinformatics in the Era of Open Science and Big DataPhilip Bourne
 
The Horizon 2020 Open Data Pilot
The Horizon 2020 Open Data PilotThe Horizon 2020 Open Data Pilot
The Horizon 2020 Open Data PilotMartin Donnelly
 
The Scientific and Technical Foundation for Altmetrics in the United States
The Scientific and Technical Foundation for Altmetrics in the United StatesThe Scientific and Technical Foundation for Altmetrics in the United States
The Scientific and Technical Foundation for Altmetrics in the United StatesWilliam Gunn
 
Helping Users Discover Perspectives: Enhancing Opinion Mining with Joint Topi...
Helping Users Discover Perspectives: Enhancing Opinion Mining with Joint Topi...Helping Users Discover Perspectives: Enhancing Opinion Mining with Joint Topi...
Helping Users Discover Perspectives: Enhancing Opinion Mining with Joint Topi...TimDraws
 
EDRD*6000 - SlideShare Presentation - Paul Simon
EDRD*6000 - SlideShare Presentation - Paul SimonEDRD*6000 - SlideShare Presentation - Paul Simon
EDRD*6000 - SlideShare Presentation - Paul SimonPaul Simon
 
Elsevier CWTS Open Data Report Presentation at RDA meeting in Barcelona
Elsevier CWTS Open Data Report Presentation at RDA meeting in Barcelona Elsevier CWTS Open Data Report Presentation at RDA meeting in Barcelona
Elsevier CWTS Open Data Report Presentation at RDA meeting in Barcelona Elsevier
 
The NIH as a Digital Enterprise: Implications for PAG
The NIH as a Digital Enterprise: Implications for PAGThe NIH as a Digital Enterprise: Implications for PAG
The NIH as a Digital Enterprise: Implications for PAGPhilip Bourne
 
Linking Data to Publications through Citation and Virtual Archives
Linking Data to Publications through Citation and Virtual ArchivesLinking Data to Publications through Citation and Virtual Archives
Linking Data to Publications through Citation and Virtual ArchivesMicah Altman
 
"Reproducibility from the Informatics Perspective"
"Reproducibility from the Informatics Perspective""Reproducibility from the Informatics Perspective"
"Reproducibility from the Informatics Perspective"Micah Altman
 

What's hot (20)

Data Sharing & Data Citation
Data Sharing & Data CitationData Sharing & Data Citation
Data Sharing & Data Citation
 
A Lifecycle Approach to Information Privacy
A Lifecycle Approach to Information PrivacyA Lifecycle Approach to Information Privacy
A Lifecycle Approach to Information Privacy
 
Knoesis Student Achievement
Knoesis Student AchievementKnoesis Student Achievement
Knoesis Student Achievement
 
Share: Science Information Life Cycle
Share: Science Information Life CycleShare: Science Information Life Cycle
Share: Science Information Life Cycle
 
Expertise for the future: harnessing the power of digital technologies
Expertise for the future: harnessing the power of digital technologiesExpertise for the future: harnessing the power of digital technologies
Expertise for the future: harnessing the power of digital technologies
 
Kno.e.sis Review: late 2012 to mid 2013
Kno.e.sis Review: late 2012 to mid 2013Kno.e.sis Review: late 2012 to mid 2013
Kno.e.sis Review: late 2012 to mid 2013
 
Increasing transparency in Medical Education through Open Data
Increasing transparency in Medical Education through Open Data Increasing transparency in Medical Education through Open Data
Increasing transparency in Medical Education through Open Data
 
Big data divided (24 march2014)
Big data divided (24 march2014)Big data divided (24 march2014)
Big data divided (24 march2014)
 
DATA CENTRIC EDUCATION & LEARNING
 DATA CENTRIC EDUCATION & LEARNING DATA CENTRIC EDUCATION & LEARNING
DATA CENTRIC EDUCATION & LEARNING
 
Bioinformatics in the Era of Open Science and Big Data
Bioinformatics in the Era of Open Science and Big DataBioinformatics in the Era of Open Science and Big Data
Bioinformatics in the Era of Open Science and Big Data
 
Data Analytics
Data AnalyticsData Analytics
Data Analytics
 
The Horizon 2020 Open Data Pilot
The Horizon 2020 Open Data PilotThe Horizon 2020 Open Data Pilot
The Horizon 2020 Open Data Pilot
 
The Scientific and Technical Foundation for Altmetrics in the United States
The Scientific and Technical Foundation for Altmetrics in the United StatesThe Scientific and Technical Foundation for Altmetrics in the United States
The Scientific and Technical Foundation for Altmetrics in the United States
 
Helping Users Discover Perspectives: Enhancing Opinion Mining with Joint Topi...
Helping Users Discover Perspectives: Enhancing Opinion Mining with Joint Topi...Helping Users Discover Perspectives: Enhancing Opinion Mining with Joint Topi...
Helping Users Discover Perspectives: Enhancing Opinion Mining with Joint Topi...
 
EDRD*6000 - SlideShare Presentation - Paul Simon
EDRD*6000 - SlideShare Presentation - Paul SimonEDRD*6000 - SlideShare Presentation - Paul Simon
EDRD*6000 - SlideShare Presentation - Paul Simon
 
Elsevier CWTS Open Data Report Presentation at RDA meeting in Barcelona
Elsevier CWTS Open Data Report Presentation at RDA meeting in Barcelona Elsevier CWTS Open Data Report Presentation at RDA meeting in Barcelona
Elsevier CWTS Open Data Report Presentation at RDA meeting in Barcelona
 
The NIH as a Digital Enterprise: Implications for PAG
The NIH as a Digital Enterprise: Implications for PAGThe NIH as a Digital Enterprise: Implications for PAG
The NIH as a Digital Enterprise: Implications for PAG
 
From byte to mind
From byte to mindFrom byte to mind
From byte to mind
 
Linking Data to Publications through Citation and Virtual Archives
Linking Data to Publications through Citation and Virtual ArchivesLinking Data to Publications through Citation and Virtual Archives
Linking Data to Publications through Citation and Virtual Archives
 
"Reproducibility from the Informatics Perspective"
"Reproducibility from the Informatics Perspective""Reproducibility from the Informatics Perspective"
"Reproducibility from the Informatics Perspective"
 

Similar to The Evolution of e-Research: Machines, Methods and Music

Scholarly Social Machines
Scholarly Social MachinesScholarly Social Machines
Scholarly Social MachinesDavid De Roure
 
New e-Science Edinburgh Late Edition
New e-Science Edinburgh Late EditionNew e-Science Edinburgh Late Edition
New e-Science Edinburgh Late EditionDavid De Roure
 
OII Summer Doctoral Programme 2010: Global brain by Meyer & Schroeder
OII Summer Doctoral Programme 2010: Global brain by Meyer & SchroederOII Summer Doctoral Programme 2010: Global brain by Meyer & Schroeder
OII Summer Doctoral Programme 2010: Global brain by Meyer & SchroederEric Meyer
 
Social Machines of Science and Scholarship
Social Machines of Science and ScholarshipSocial Machines of Science and Scholarship
Social Machines of Science and ScholarshipDavid De Roure
 
Open Data in a Big Data World: easy to say, but hard to do?
Open Data in a Big Data World: easy to say, but hard to do?Open Data in a Big Data World: easy to say, but hard to do?
Open Data in a Big Data World: easy to say, but hard to do?LEARN Project
 
The culture of researchData
The culture of researchData The culture of researchData
The culture of researchData TheContentMine
 
The Culture of Research Data, by Peter Murray-Rust
The Culture of Research Data, by Peter Murray-RustThe Culture of Research Data, by Peter Murray-Rust
The Culture of Research Data, by Peter Murray-RustLEARN Project
 
The culture of researchData
The culture of researchDataThe culture of researchData
The culture of researchDatapetermurrayrust
 
The Future of Research (Science and Technology)
The Future of Research (Science and Technology)The Future of Research (Science and Technology)
The Future of Research (Science and Technology)Duncan Hull
 
Looking for Data: Finding New Science
Looking for Data: Finding New ScienceLooking for Data: Finding New Science
Looking for Data: Finding New ScienceAnita de Waard
 
Open science / open research
Open science / open researchOpen science / open research
Open science / open researchheila1
 
The End(s) of e-Research
The End(s) of e-ResearchThe End(s) of e-Research
The End(s) of e-ResearchEric Meyer
 
What Academia Can Learn from Open Source
What Academia Can Learn from Open SourceWhat Academia Can Learn from Open Source
What Academia Can Learn from Open SourceAll Things Open
 
ISMB/ECCB 2013 Keynote Goble Results may vary: what is reproducible? why do o...
ISMB/ECCB 2013 Keynote Goble Results may vary: what is reproducible? why do o...ISMB/ECCB 2013 Keynote Goble Results may vary: what is reproducible? why do o...
ISMB/ECCB 2013 Keynote Goble Results may vary: what is reproducible? why do o...Carole Goble
 
Results Vary: The Pragmatics of Reproducibility and Research Object Frameworks
Results Vary: The Pragmatics of Reproducibility and Research Object FrameworksResults Vary: The Pragmatics of Reproducibility and Research Object Frameworks
Results Vary: The Pragmatics of Reproducibility and Research Object FrameworksCarole Goble
 
e-Research and the Demise of the Scholarly Article
e-Research and the Demise of the Scholarly Articlee-Research and the Demise of the Scholarly Article
e-Research and the Demise of the Scholarly ArticleDavid De Roure
 
Open sciencerefresher2019
Open sciencerefresher2019Open sciencerefresher2019
Open sciencerefresher2019heila1
 
Social Machines of Scholarly Collaboration
Social Machines of Scholarly CollaborationSocial Machines of Scholarly Collaboration
Social Machines of Scholarly CollaborationDavid De Roure
 

Similar to The Evolution of e-Research: Machines, Methods and Music (20)

Scholarly Social Machines
Scholarly Social MachinesScholarly Social Machines
Scholarly Social Machines
 
New e-Science Edinburgh Late Edition
New e-Science Edinburgh Late EditionNew e-Science Edinburgh Late Edition
New e-Science Edinburgh Late Edition
 
OII Summer Doctoral Programme 2010: Global brain by Meyer & Schroeder
OII Summer Doctoral Programme 2010: Global brain by Meyer & SchroederOII Summer Doctoral Programme 2010: Global brain by Meyer & Schroeder
OII Summer Doctoral Programme 2010: Global brain by Meyer & Schroeder
 
Social Machines of Science and Scholarship
Social Machines of Science and ScholarshipSocial Machines of Science and Scholarship
Social Machines of Science and Scholarship
 
Open Data in a Big Data World: easy to say, but hard to do?
Open Data in a Big Data World: easy to say, but hard to do?Open Data in a Big Data World: easy to say, but hard to do?
Open Data in a Big Data World: easy to say, but hard to do?
 
The culture of researchData
The culture of researchData The culture of researchData
The culture of researchData
 
The Culture of Research Data, by Peter Murray-Rust
The Culture of Research Data, by Peter Murray-RustThe Culture of Research Data, by Peter Murray-Rust
The Culture of Research Data, by Peter Murray-Rust
 
The culture of researchData
The culture of researchDataThe culture of researchData
The culture of researchData
 
The Future of Research (Science and Technology)
The Future of Research (Science and Technology)The Future of Research (Science and Technology)
The Future of Research (Science and Technology)
 
The Internet, Science, and Transformations of Knowledge (Ralph Schroeder)
The Internet, Science, and Transformations of Knowledge (Ralph Schroeder)The Internet, Science, and Transformations of Knowledge (Ralph Schroeder)
The Internet, Science, and Transformations of Knowledge (Ralph Schroeder)
 
E research overview gahegan bioinformatics workshop 2010
E research overview gahegan bioinformatics workshop 2010E research overview gahegan bioinformatics workshop 2010
E research overview gahegan bioinformatics workshop 2010
 
Looking for Data: Finding New Science
Looking for Data: Finding New ScienceLooking for Data: Finding New Science
Looking for Data: Finding New Science
 
Open science / open research
Open science / open researchOpen science / open research
Open science / open research
 
The End(s) of e-Research
The End(s) of e-ResearchThe End(s) of e-Research
The End(s) of e-Research
 
What Academia Can Learn from Open Source
What Academia Can Learn from Open SourceWhat Academia Can Learn from Open Source
What Academia Can Learn from Open Source
 
ISMB/ECCB 2013 Keynote Goble Results may vary: what is reproducible? why do o...
ISMB/ECCB 2013 Keynote Goble Results may vary: what is reproducible? why do o...ISMB/ECCB 2013 Keynote Goble Results may vary: what is reproducible? why do o...
ISMB/ECCB 2013 Keynote Goble Results may vary: what is reproducible? why do o...
 
Results Vary: The Pragmatics of Reproducibility and Research Object Frameworks
Results Vary: The Pragmatics of Reproducibility and Research Object FrameworksResults Vary: The Pragmatics of Reproducibility and Research Object Frameworks
Results Vary: The Pragmatics of Reproducibility and Research Object Frameworks
 
e-Research and the Demise of the Scholarly Article
e-Research and the Demise of the Scholarly Articlee-Research and the Demise of the Scholarly Article
e-Research and the Demise of the Scholarly Article
 
Open sciencerefresher2019
Open sciencerefresher2019Open sciencerefresher2019
Open sciencerefresher2019
 
Social Machines of Scholarly Collaboration
Social Machines of Scholarly CollaborationSocial Machines of Scholarly Collaboration
Social Machines of Scholarly Collaboration
 

More from David De Roure

Emerging Scholarly Practice and Scholarly Primitives: a Case Study in Music a...
Emerging Scholarly Practice and Scholarly Primitives: a Case Study in Music a...Emerging Scholarly Practice and Scholarly Primitives: a Case Study in Music a...
Emerging Scholarly Practice and Scholarly Primitives: a Case Study in Music a...David De Roure
 
Digital Humanities RSE Landscape
Digital Humanities RSE LandscapeDigital Humanities RSE Landscape
Digital Humanities RSE LandscapeDavid De Roure
 
Digital Research Infrastructure
Digital Research InfrastructureDigital Research Infrastructure
Digital Research InfrastructureDavid De Roure
 
Alter: an ensemble work composed with and about AI
Alter: an ensemble work composed with and about AIAlter: an ensemble work composed with and about AI
Alter: an ensemble work composed with and about AIDavid De Roure
 
Digital Scholarship: Intersection, Automation, and Scholarly Social Machines
Digital Scholarship: Intersection, Automation, and Scholarly Social MachinesDigital Scholarship: Intersection, Automation, and Scholarly Social Machines
Digital Scholarship: Intersection, Automation, and Scholarly Social MachinesDavid De Roure
 
Lovelace’s Legacy : Creative Algorithmic Interventions for Live Performance
Lovelace’s Legacy: Creative Algorithmic Interventions for Live PerformanceLovelace’s Legacy: Creative Algorithmic Interventions for Live Performance
Lovelace’s Legacy : Creative Algorithmic Interventions for Live PerformanceDavid De Roure
 
Experimental Humanities: An Adventure with Lovelace and Babbage
Experimental Humanities: An Adventure with Lovelace and BabbageExperimental Humanities: An Adventure with Lovelace and Babbage
Experimental Humanities: An Adventure with Lovelace and BabbageDavid De Roure
 
Creativity in Digital Scholarship
Creativity in Digital ScholarshipCreativity in Digital Scholarship
Creativity in Digital ScholarshipDavid De Roure
 
The Imagination of Ada Lovelace
The Imagination of Ada LovelaceThe Imagination of Ada Lovelace
The Imagination of Ada LovelaceDavid De Roure
 
Scholarly Social Machines Essay
Scholarly Social Machines EssayScholarly Social Machines Essay
Scholarly Social Machines EssayDavid De Roure
 
Social Machines and how to study them
Social Machines and how to study themSocial Machines and how to study them
Social Machines and how to study themDavid De Roure
 
New and Emerging Forms of Data
New and Emerging Forms of DataNew and Emerging Forms of Data
New and Emerging Forms of DataDavid De Roure
 
Plans and Performances
Plans and PerformancesPlans and Performances
Plans and PerformancesDavid De Roure
 
Description of Process
Description of ProcessDescription of Process
Description of ProcessDavid De Roure
 
The Short and the Long of Web Science
The Short and the Long of Web ScienceThe Short and the Long of Web Science
The Short and the Long of Web ScienceDavid De Roure
 
Short and Long of Data Driven Innovation
Short and Long of Data Driven InnovationShort and Long of Data Driven Innovation
Short and Long of Data Driven InnovationDavid De Roure
 
New Data `New Computation
New Data `New ComputationNew Data `New Computation
New Data `New ComputationDavid De Roure
 
Emerging Forms of Data and Analytics
Emerging Forms of Data and AnalyticsEmerging Forms of Data and Analytics
Emerging Forms of Data and AnalyticsDavid De Roure
 

The Evolution of e-Research: Machines, Methods and Music

  • 1. The Evolution of e-Research Machines, Methods and Music David De Roure
  • 2. Maths Physics Medical electronics PhD in distributed declarative programming language design Hypermedia Large scale Distributed Systems Semantic Sensor Networks Web Science Devices Amorphous Computing Digital Social Research Equator e-Science Music Electronics Programming Transputers Temporal Media Computational Musicology Advanced Knowledge Technologies Semantic Web Process Networks myExperiment Web 2 Statistics Grid Linked Data 1981 2010 Environmental sensing Networks VREs MITAJGH PH WH PEOPLEOPLE Agents Semantic Grid e-Laboratories Workflows QBH
  • 3. Overview Generation 1: Early adopters Generation 2: Embedding Generation 3: Radical sharing SALAMI A case study in 3rd generation e-Research
  • 4. e-Science • e-Science was defined by John Taylor (Director General of the UK Research Councils) as global collaboration in key areas of science and the next generation of infrastructure that will enable it • e-Science was the name of the destination • It became the name of the journey • When we arrive, the destination is just called science
  • 5. “e-research extends e-Science and cyberinfrastructure to other disciplines, including the humanities and social sciences.” e-Research http://mitpress.mit.edu/catalog/item/default.asp?tid=12185&ttype=2
  • 7. ...the imminent flood of scientific data expected from the next generation of experiments, simulations, sensors and satellites Tony Hey and Anne Trefethen Source: CERN, CERN-EX-0712023, http://cdsweb.cern.ch/record/1203203
  • 8. 26/2/2007 | myExperiment | Slide 8 Jeremy Frey
  • 9. • Workflows are the new rock and roll • Machinery for coordinating the execution of (scientific) services and linking together (scientific) resources • The era of Service Oriented Applications • Repetitive and mundane boring stuff made easier Carole Goble E. Science laboris
  • 13. Box of Chemists My Chemistry Experiment CombeChem
  • 15. empower to equip or supply with an ability; enable service the performance of duties or the duties performed as or by a waiter or servant
  • 16. Early adopters of tools. Characterised by researchers using tools within their particular problem area, with some re-use of tools, data and methods within the discipline. Traditional publishing is supplemented by publication of some digital artefacts like workflows and links to data. Science is accelerated and practice is beginning to shift to emphasise in silico work. 1st Generation Summary Thanks to Iain Buchan and the chipmunks
  • 18. • Paul writes workflows for identifying biological pathways implicated in resistance to Trypanosomiasis in cattle • Paul meets Jo. Jo is investigating Whipworm in mouse. • Jo reuses one of Paul’s workflows without change. • Jo identifies the biological pathways involved in sex dependence in the mouse model, believed to be involved in the ability of mice to expel the parasite. • Previously a manual two year study by Jo had failed to do this. Reuse, Recycling, Repurposing Carole Goble
  • 19. Carole Goble “e-Science is me-Science: What do Scientists want?”, EGEE 2006 “There are these great collaboration tools that 12-year-olds are using. It’s all back to front.” Robert Stevens
  • 20. “A biologist would rather share their toothbrush than their gene name” Mike Ashburner and others Professor in Dept of Genetics, University of Cambridge, UK
  • 21. Data mining: my data’s mine and your data’s mine
  • 25. • “Facebook for Scientists” ...but different to Facebook! • A repository of research methods • A community social network of people and things • A Social Virtual Research Environment • A probe into researcher behaviour • Open source (BSD) Ruby on Rails app • REST and SPARQL interfaces, supports Linked Data • Inspiration for: BioCatalogue, MethodBox and SysMO-SEEK myExperiment currently has 4400 members, 236 groups, 1336 workflows, 351 files and 141 packs
  • 27. Visits to www.myexperiment.org (Oct 2010) Global collaboration in key areas of science and the next generation of infrastructure that will enable it http://wiki.myexperiment.org
  • 29. Methods should be first class citizens Celebrate the flux! Let the data flow through the pipelines. Nail down the methods not the data! Towards “Linked Open Methods” Though this be madness, yet there is method in it * Polonius in Hamlet ** Sean Bechhofer in Manchester *** Not the e-Science Envoy * *** ** Data bonanza => Methods bonanza!
  • 30. It’s not just the data And what other people do with it ...that you never thought of It’s what you do with it that counts
  • 31. Results Logs Results Metadata Paper Slides Feeds into produces Included in produces Published in produces Included in Included in Included in Published in Workflow 16 Workflow 13 Common pathways QTL Paul’s Pack Paul’s Research Object
  • 32. Research Objects enable data-intensive research to be: 1. Replayable – go back and see what happened 2. Repeatable – run the experiment again 3. Reproducible – independent expt to reproduce 4. Reusable – use as part of new experiments 5. Repurposeable – reuse the pieces in new expt 6. Reliable – robust under automation 7. Referenceable – citable and traceable The Six Rs of Research Object Behaviours http://blog.openwetware.org/deroure/?p=56
  • 33. Semantically enhanced publication versus Shared digital Research Objects Challenging the mindset of paper-sized chunks
  • 36. Projects delivering now. Some institutional embedding. Key characteristic is re-use – of the increasing pool of tools, data and methods across areas/disciplines. Contain some freestanding, recombinant, reproducible research objects. New scientific practices are established and opportunities arise for completely new scientific investigations. Some expert curation. 2nd Generation Summary
  • 38. 4th Paradigm The Fourth Paradigm: Data-Intensive Scientific Discovery Presenting the first broad look at the rapidly emerging field of data-intensive science http://research.microsoft.com/en-us/collaboration/fourthparadigm/
  • 43. “…to discover proteins that interact with transmembrane proteins, particularly those that can be related to neurodegenerative diseases in which amyloids play a significant role” 1) Taverna provenance exposed as RDF 2) myExperiment RDF document for a protein discovery workflow 3) Mocked-up BioCatalogue document using myExperiment RDF data as example 4) Provisional RDF documents obtained from the ConceptWiki (conceptwiki.org) development server 5) An RDF document for an example protein, obtained from the RDF interface of the UniProt web site A Bioinformatics Experiment Scott Marshall Marco Roos
  • 45. MethodBox http://www.methodbox.org/ Enable cross disciplinary research into Major Public Health problems Ease handling data and sharing results and insights
  • 48. The solutions we'll be delivering in 5 years. Characterised by global reuse of tools, data and methods across any discipline, and surfacing the right levels of complexity for the researcher. Routine use. Key characteristic is radical sharing. Research is significantly data driven – plundering the backlog of data, results and methods. Publishing by the social network. Increasing automation and decision-support for the researcher – the VRE becomes assistive. Curation is autonomic and social. 3rd Generation Summary
  • 50. Easy and low risk to start Progress to advanced skills For researchers No obligation Go as far as you want Find a service & relax Intellectual ramps Malcolm Atkinson
  • 51. NRAO/AUI/NSF telescopes for the naked mind Datascopes Malcolm Atkinson From Signal to Understanding
  • 52. Jeannette M. Wing COMMUNICATIONS OF THE ACM March 2006/Vol. 49, No. 3 Pages 33-35
  • 53. 2010 – 2011 and beyond Music and Linked Data
  • 56. It’s about enabling the join Ben Fields, 6th October 2010
  • 57. SALAMI: Structural Analysis of Large Amounts of Music Information David De Roure J. Stephen Downie Ichiro Fujinaga
  • 59. The SALAMI collaboration • DDeR (e-Research South), J. Stephen Downie (Illinois) and Ichiro Fujinaga (McGill) • NCSA donating 250,000 supercomputer hours • 350,000 pieces of music (23,000 hours) – Internet Archive, DRAM, IMIRSEL, McGill • Feature analysis and structural analysis • Music Ontology by Yves Raimond (BBC) • Musicologists from McGill and Southampton • Sharing of analyses http://salami.music.mcgill.ca
  • 60. Digital Music Collections Crowdsourced ground truth Community Software Linked Data Repositories Supercomputer 23,000 hours of recorded music 250,000 hours NCSA Supercomputer time Music Information Retrieval Community
  • 62. MIREX Overview • Began in 2005 • Tasks defined by community debate • Data sets collected and/or donated • Participants submit code to IMIRSEL • Code rarely works first try • Huge labour consumption getting programs to work • Meet at ISMIR to discuss results Stephen Downie http://www.music-ir.org/mirex
  • 63. MIREX TASKS: Audio Artist Identification; Audio Onset Detection; Audio Beat Tracking; Audio Tag Classification; Audio Chord Detection; Audio Tempo Extraction; Audio Classical Composer ID; Multiple F0 Estimation; Audio Cover Song Identification; Multiple F0 Note Detection; Audio Drum Detection; Query-by-Singing/Humming; Audio Genre Classification; Query-by-Tapping; Audio Key Finding; Score Following; Audio Melody Extraction; Symbolic Genre Classification; Audio Mood Classification; Symbolic Key Finding; Audio Music Similarity; Symbolic Melodic Similarity
  • 65. “Signal” Digital Audio “Ground Truth” Community It’s web-like! Q. If and when should community-generated content be assimilated into managed repositories? Structural Analysis
  • 66. How country is my country? Kevin Page and Ben Fields http://www.nema.ecs.soton.ac.uk/countrycountry/
  • 67. Stephen Downie Music and computational thinking
  • 68. “Again, it [the Analytical Engine] might act upon other things besides number, were objects found whose mutual fundamental relations could be expressed by those of the abstract science of operations, and which should be also susceptible of adaptations to the action of the operating notation and mechanism of the engine...”
  • 69. “Supposing, for instance, that the fundamental relations of pitched sounds in the science of harmony and of musical composition were susceptible of such expression and adaptations, the engine might compose elaborate and scientific pieces of music of any degree of complexity or extent.” Ada, The Enchantress of Numbers: Poetical Science by Betty Alexandra Toole http://www.well.com/user/adatoole/ Betty Alexandra Toole
  • 70. I can write a workflow that creates workflows based on those of others, and automatically modify it – think genetic mutation and crossovers. Who owns it? I can register a query over an increasing number and diversity of “linked data” sources to ask new research questions. http://eresearch-ethics.org/ The computer can learn from the activities of 1,000,000 scientists – and be indistinguishable from them? What about the ethics of Citizen Social Science? Of citizens designing experiments?
  • 71. Co-* Methods Access ramps Research Objects Computational thinking Ethics of e-Research at scale
  • 72. david.deroure@oerc.ox.ac.uk Thanks to: Jeremy Frey & CombeChem; Carole Goble, myGrid and myExperiment; Iain Buchan & Obesity e-Lab; Sean Bechhofer; Doug Kell; Marco Roos; Lucy Yardley; Arfon Smith; Malcolm Atkinson; Stephen Downie, Kevin Page, Ben Fields, Ashley Burgoyne and NEMA/SALAMI; Betty Toole. http://www.myexperiment.org/packs/153

Editor's Notes

  1. Today I’m going to talk about the trajectory of e-Science – from its conception through examples of 3 generations, and I’ll reflect on how we are moving from generation 2 to generation 3. Different disciplines and especially communities may be in different stages of evolution.
  2. First something about words. This definition of e-Science is important – it reminds us that it isn’t just about technology but about people working together and being empowered by technology – and the emphasis on “science” reminds us that ultimately success is measured by new scientific outcome. At the turn of the decade this was a vision of the future. A programme was created called e-Science. The projects doing the innovation were labelled as “e-Science”. By the time we arrive, it’s just “science”. So “e-Science” has become the name of the journey rather than the destination. Note that the innovation that takes us to the destination isn’t solely in the custody of e-Science projects – there’s a lot of relevant work going on that doesn’t carry that label. Note also that when we say “e-Science” we actually mean “e-Research”! We sometimes forget to say that.
  5. CERN teams up with Leaders in Information Technology to build giant Data Grid. Data accumulation rate: 10 Petabytes per year (equivalent to about 20 million CD-ROMs). http://public.web.cern.ch/press/pressreleases/Releases2001/PR11.01ECERNopenlab.html
  6. Scientific workflow systems are a key automation technique for systematically handling the data deluge and giving us the “workflow” as a new sharable artefact of digital science – to record, repeat, reproduce and repurpose an experiment. This is an iconic slide by Carole Goble which is much repeated, reproduced and repurposed!
  8. What we didn’t see much in phase 1 was sharing and reuse, but this is essential to harnessing the new technology. The story on this slide involves sharing in a corridor and we will go on to see how we do it digitally! But it’s an important motivation. It led to new science.
  9. myExperiment in one slide! It’s a “boutique” Web site with the largest public collection of scientific workflows. For lots more information see the myExperiment wiki (http://wiki.myexperiment.org/). BioCatalogue is a registry of Web Services in the life sciences and is directly based on the myExperiment experience. SysMO and MethodBox grew from the myExperiment codebase – MethodBox is an e-Social Science e-Laboratory for sharing and analysing data, and SysMO is customised to the systems biology domain. See http://www.biocatalogue.org/, http://www.methodbox.org/ and http://www.sysmo-db.org/
  10. This is reflected in a third distinctive feature – the pack. This is Paul Fisher’s pack from the Tryps example. Some packs contain example input and output data so workflows can be checked for “decay” (they don’t actually rot, but the world changes round them). While others are looking at semantically enhanced publication, we are asking “what is the shared artefact of future research?” We come at the same problem from the other side. We have it surrounded! Our approach relieves us of the paper mindset – so, for example, a Research Object could contain information for many audiences and purposes, with a commonly interpreted core (social scientists will recognise the idea of a “boundary object”).
  13. Now we look at myExperiment as a probe into the future behaviour of researchers. For example, these workflows by Francois Belleau show what could be described as another level of working – building on the new tooling.
  14. Here we see bioinformaticians assembling the resources they need to answer a research question – and also demonstrating what the methods section of the future paper needs to look like. They are using Linked Data. We see the power – ease of assembly. This could be where the new computer science challenges lie in e-Research.
  15. From The Galileo Project web site, http://galileo.rice.edu/sci/instruments/telescope.html: the earliest known illustration of a telescope; Giovanpattista della Porta included this sketch in a letter written in August 1609 (porta-sketch). Johannes Hevelius (Poland, 1611-1687) observing with one of his telescopes (Source: Selenographia, 1647). Hubble_earth_horz and hubble from http://hubble.nasa.gov/. Very Large Array from http://images.nrao.edu/Telescopes. Copyright requirement: include "NRAO/AUI/NSF" on slide.
  17. That example comes from a Digging into Data project with the best project acronym ever. The project is conducting a massive structural analysis of music in the Internet Archive, to support musicologists. It illustrates many of the things we are now seeing in e-Research – crowdsourcing, annotation, community software development, high performance computation, data publication. This project involves UIUC, McGill and Oxford – and the supercomputer time is donated by NCSA.