Open Data – Where Do We
Stand from A Researcher's
       Perspective?


         Philip E. Bourne
University of California San Diego
      pbourne@ucsd.edu
My Perspective …
• Mine is a biomedical sciences perspective
• My lab distributes, free of charge, data equivalent to ¼ of the
  Library of Congress every month
• I am a supporter of open access (provided there is a
  business/sustainability model) and founding editor in
  chief of PLOS Computational Biology
• I am Co-founder of SciVee Inc. and believe
  innovation comes from open access to knowledge
• I recently became UCSD’s AVC of Innovation, which is
  giving me a more institutional perspective

   I Readily Acknowledge Each Discipline is Different
My General Opinion:
Where Does the Open Access Debate
          Stand Today?

• It’s not a question of “if” but a question of
  “when” and “how” for most disciplines
• We are at the tip of the iceberg in our
  ability to use OA content
• OA will gain momentum in an increasingly
  knowledge-based economy
The State of Play:
 UC Open Access Policy Debate:
       Opt Out vs Opt in
• For
  – Publicly funded research should be public
  – Institutional perspective: the open provision of data and
    knowledge derived from these data appears to be an
    unidentified asset at this time
• Against
  – Cost to some disciplines
  – Impact on societies
  – Journal quality re promotion
  – Extra work
  – Administration
  – UC as “Big Brother”
We will come back to this, but
 first let us explore why open
knowledge is so important (to
            me at least)
Open Data May Save Lives? *

[Figure: Structure Summary page activity for H1N1 Influenza related structures, Jan. 2008 to Jul. 2010]
  – 3B7E: Neuraminidase of A/Brevig Mission/1/1918 H1N1 strain in complex with zanamivir
  – 1RUZ: 1918 H1 Hemagglutinin

* http://www.cdc.gov/h1n1flu/estimates/April_March_13.htm
Open Science Can Accelerate
   the Scientific Process…

For some people the change may
  be too slow to save their life
Josh Sommer – A Remarkable Young Man
Co-founder & Executive Director of the Chordoma Foundation




                                 http://sagecongress.org/Presentations/Sommer.pdf
Chordoma
                                                            • A rare form of brain
                                                              cancer
                                                            • No known drugs
                                                            • Treatment – surgical
                                                              resection followed by
                                                              intense radiation
                                                              therapy


http://upload.wikimedia.org/wikipedia/commons/2/2b/Chordoma.JPG
http://sagecongress.org/Presentations/Sommer.pdf
If I have seen further it is only by
standing on the shoulders of giants
                     Isaac Newton




From Josh’s point of view the climb
up just takes too long


> 15 years and > $850M to be
more precise




                                Adapted: http://sagecongress.org/Presentations/Sommer.pdf
http://fora.tv/2010/04/23/Sage_Commons_Josh_Sommer_Chordoma_Foundation
The Story of Meredith
What Does Meredith Tell Us?
• The Wikipedia / Khan Academy / YouTube
  generation knows no bounds
• Bounds are too often imposed by tradition
  rather than what makes the most sense

• Another example of an underexploited
  asset at this time?
Another Way of Thinking About
 the Implications of What Josh
and Meredith Represent Is the
    Need for New Forms of
 Knowledge Management and
            Access

Let’s Explore this Notion with
   An Emphasis on Data
The Silos of Data & Knowledge Are
       Starting to Coalesce




          Is a Biological Database Really Different than a Biological Journal?
          PLoS Comp. Biol. 2005 1(3) e34
The Silos of Data & Knowledge Are
       Starting to Coalesce



• Supplemental information has exploded
• Data journals are emerging
• The use of rich media is increasing
• Software and other processes are becoming available

• Databases are now knowledgebases
• Science can be done on the fly
• Biocuration is a respected career

PLoS Comp. Biol. 2008. 4(7): e1000136
Where Does That Take Us?
• A paper is an artifact of a previous era
• It is not the logical end product of eScience,
  hence:
  –   Work is omitted
  –   Article vs supplement is a mess
  –   Visualization may be limited
  –   Interaction and enquiry are non-existent
  –   Rich media can help, but barriers remain
Where Does That Take Us?
         Data Sharing Policies

• From the NSF:

• Investigators are expected to share with other researchers, at
  no more than incremental cost and within a reasonable time,
  the primary data, samples, physical collections and other
  supporting materials created or gathered in the course of
  work under NSF grants. Grantees are expected to encourage
  and facilitate such sharing. See Award & Administration Guide
  (AAG) Chapter VI.D.4.
Big Data is Off…
        • March 2012 OSTP
          commits $200M to
          Big Data
        • NSF, DOD, NIH all
          announce programs
        • GBMF think tank
          leads to soon-to-be-
          announced
          institutional awards
Where Does That Take Us?
         Add into the Mix:

• Reproducibility: It really is a myth!
• Maintainability: DNA doubles in 5 months
• Usability: Go ahead and try!
• Reward: Tenure for data – no way


    Notwithstanding, dreams do emerge …
                Here is mine
The Knowledge and Data Cycle
0. Full text of PLoS papers stored in a database
1. A link brings up figures from the paper
2. Clicking the paper figure retrieves data from the PDB which is analyzed
3. A composite view of journal and database content results
4. The composite view has links to pertinent blocks of literature text and back to the PDB

Here is What I Want
1. User clicks on thumbnail
2. Metadata and a webservices call provide a renderable image that can be annotated
3. Selecting a feature provides a database/literature mashup
4. That leads to new papers

PLoS Comp. Biol. 2005 1(3) e34
The Knowledge Economy Begins


Cardiac Disease
Literature




                     Immunology Literature
Simultaneously, Discovery
  Informatics Emerges
            • Google will not suffice as a scientific
              knowledge discovery tool
            • Google is broad but shallow
            • Science is cross-disciplinary, narrower
              and deeper
NSF Discovery Informatics
                       Workshop
                                                             • Discoveries surpass
                                                               an individual’s ability -
                                                               need intelligent tools
                                                             • Need to increase
                                                               connections between
                                                               knowledge and data
                                                             • Need to combine
                                                               diverse human
                                                               abilities

Discovery informatics - computer scientists, domain scientists,
social scientists -
http://www.isi.edu/~gil/diw2012/NSFDiscoveryInformatics2012-FinalReport.pdf
This is Just the Beginning of
       Discovery Informatics
• Each evening the lab’s “Evernote”
  notebooks are scanned for commonalities
  from the day’s activities. These are seeds
  in a deep search of the web for knowledge
  and data that have become available since
  last searched. Results are ranked and
  presented for consideration over coffee
  the next morning

http://www.discoveryinformaticsinitiative.org/diw2012
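The nightly scenario above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a description of any existing tool: notebooks are plain strings, a "commonality" is a term that appears in at least two notebooks, and candidate documents are ranked by how many of those seed terms they mention. The function names (terms, common_seeds, rank) and the sample data are hypothetical.

```python
import re
from collections import Counter

# Assumption: a tiny stopword list is enough for illustration.
STOPWORDS = {"the", "and", "for", "with", "from", "that", "this"}


def terms(text):
    """Return the set of lower-cased tokens of length >= 3, minus stopwords."""
    return {
        word
        for word in re.findall(r"[a-z0-9]+", text.lower())
        if len(word) >= 3 and word not in STOPWORDS
    }


def common_seeds(notebooks, min_notebooks=2):
    """Terms that occur in at least `min_notebooks` of the day's notebooks."""
    counts = Counter()
    for text in notebooks.values():
        counts.update(terms(text))  # a set, so each notebook counts a term once
    return {term for term, n in counts.items() if n >= min_notebooks}


def rank(seeds, candidates):
    """Order candidate documents by number of seed terms mentioned (ties by name)."""
    scored = [(len(seeds & terms(text)), name) for name, text in candidates.items()]
    ordered = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [name for score, name in ordered if score > 0]


# Hypothetical notebook entries and newly available documents.
notebooks = {
    "alice": "Docked zanamivir against neuraminidase structure 3B7E",
    "bob": "Compared neuraminidase mutants; zanamivir binding looks weaker",
    "carol": "Updated the hemagglutinin alignment",
}
candidates = {
    "doc_a": "New neuraminidase structure in complex with zanamivir",
    "doc_b": "A review of hemagglutinin evolution",
    "doc_c": "Survey of zanamivir resistance",
}
seeds = common_seeds(notebooks)
print(rank(seeds, candidates))
```

A real system would need far better term extraction and a real search back end; the point here is only the shape of the loop: extract, intersect, search, rank.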
Unimaginable Connections Made Automatically
                Through RDF Descriptions




http://richard.cyganiak.de/2007/10/lod/lod-datasets_2010-09-22_colored.html
Before We Get Too Heady Let’s
    Look at the Realities of the
  Situation from My Perspective

• Data repositories are broken

• There is a “high noon” effect

• NCBI has been a wonderful model to
  date…
Data/Institutional Repositories
• “Build it and they will come” fails most of the
  time
• Institutional repository is an oxymoron
• NCBI works because:
  – It is an act of the US Congress
  – It has strong leadership
  – It has a monopoly on the literature
  – It has IT thought out over many years
                         Innkeeper at the Roach Motel D. Salo 2008
                         http://muse.jhu.edu/journals/library_trends/v057/57.2.salo.html
Data/Institutional Repositories
• “High Noon” Effect

  – Publishers make getting knowledge in very difficult,
    but at least getting knowledge out, albeit limited, is
    consistent, intuitive and easy to use

  – Data repositories make getting data in and data out
    very difficult – they strive to be different when
    in fact users want them to be the same
Data and Journals
• That journals are thinking about data is
  good
• Dryad etc. are welcome but a stopgap
  measure
• Fully functional data journals will not occur
  without a change to the reward system
• Data papers can help shift the reward
  system
• Are PLoS Topic Pages a sign?
Interim Solution:
Use the Traditional Reward System
The Wikipedia Experiment – Topic Pages

                             Identify areas of Wikipedia that
                              relate to the journal that are
                              missing or stubs
                             Develop a Wikipedia page in the
                              sandbox
                             Have a Topic Page Editor Review
                              the page
                             Publish the copy of record with
                              associated rewards
                             Release the living version into
                              Wikipedia
Think Globally Act Locally:

What Can Our Institutions Do
Now To Move Us in The Right
        Direction?
Institutional Response
• Have repositories that are useful
  – Use common standards
  – Are vetted by the community
  – Are fully open and searchable


• Reward all forms of scholarship

• Leverage the asset …
Most Laboratories
         • We are the long tail
         • Goodbye to the
           student is goodbye to
           the data
         • Very few of us have
           complied (or will
           comply) with the data
           management plans
           we write into grants
UCSD Dropbox
• Simple!!!!
• Can drop large files easily
• Asks for limited metadata and permissions to
  “discover”
• Has guaranteed quality of service and
  security not available in the cloud
• Is the data management plan and charged
  against grants
• Is a rich campus corpus open to discovery
  informatics
The UCSD Dropbox
             Discovery Environment
• Scenarios:
  – Fosters known collaborations through
    simplified data exchange
  – Discovers new collaborators through the
    same or related data elements
  – A corpus whose intrinsic value is as yet
    unknown
What Do I Want by 2020 or
    Earlier as a Researcher?
• Answer biological questions not just
  retrieve data

• Understand all there is to know about the
  availability and quality of a unit of
  biological data

• Operate on data in a way that is simpler,
  more productive, and reproducible
What Do We Need to Do to Get
  There? A Data Registry?
• Individual repositories register their
  metadata which includes access
  statistics, commentary etc. – DataCite
  is a beginning
• Identify identical data objects and their
  respective metadata for comparative
  analysis
• Funders support registration
• Publishers support registration
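One way to picture the registry idea is a lookup keyed on a content fingerprint, so that the same data object deposited in two repositories shows up side by side with both sets of metadata. The sketch below is a minimal illustration, assuming that "identical" means byte-for-byte identical (a real registry would likely need format normalization first). The DataRegistry class, its methods, and the repository names are hypothetical and are not part of DataCite or any existing service.

```python
import hashlib
from collections import defaultdict


def fingerprint(data: bytes) -> str:
    """SHA-256 hex digest used as the identity of a data object (assumption)."""
    return hashlib.sha256(data).hexdigest()


class DataRegistry:
    """Toy registry: repositories register metadata under a content fingerprint."""

    def __init__(self):
        self._records = defaultdict(list)

    def register(self, repository, identifier, data, **metadata):
        """Record one repository's metadata for a data object; return its key."""
        key = fingerprint(data)
        record = {"repository": repository, "identifier": identifier}
        record.update(metadata)
        self._records[key].append(record)
        return key

    def shared_objects(self):
        """Objects held by more than one repository, with all their metadata."""
        return {
            key: records
            for key, records in self._records.items()
            if len({r["repository"] for r in records}) > 1
        }


# Hypothetical usage: two repositories hold the same bytes, a third does not.
registry = DataRegistry()
registry.register("repo_a", "A-001", b"ATOM 1 N MET", downloads=120)
registry.register("repo_b", "B-417", b"ATOM 1 N MET", downloads=35)
registry.register("repo_c", "C-009", b"ATOM 1 N GLY", downloads=8)
for key, records in registry.shared_objects().items():
    print(key[:12], [(r["repository"], r["downloads"]) for r in records])
```

Keeping access statistics and commentary in the metadata, as the slide suggests, is what makes the side-by-side comparison useful.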
What Do We Need to Do to
    Get There? An App+ Store?
• The App model
  – Think of it operating on a content base
    rather than a mobile device
  – Simple and consistent user interface
  – Needs to pass some quality control
  – Has a reward
• The App+ Model
  – Apps interoperate through a generic
    workflow interface
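The App+ idea can be illustrated with a very small sketch: if every app accepts and returns the same kind of content record, a generic runner can chain apps without knowing what any of them does. The interface chosen here (a dict in, a dict out), the run_workflow function, and the two sample apps are assumptions made for illustration only.

```python
from typing import Callable, Dict, List

# Assumption: the "generic workflow interface" is a function from one
# content record (a dict) to an updated content record.
App = Callable[[Dict], Dict]


def run_workflow(apps: List[App], content: Dict) -> Dict:
    """Pass a content record through each app in order and return the result."""
    for app in apps:
        content = app(dict(content))  # copy so an app cannot mutate its input
    return content


def count_residues(content: Dict) -> Dict:
    """Hypothetical app: annotate a sequence record with its length."""
    content["length"] = len(content["sequence"])
    return content


def flag_short(content: Dict) -> Dict:
    """Hypothetical app: flag records shorter than 10 residues."""
    content["short"] = content["length"] < 10
    return content


result = run_workflow([count_residues, flag_short], {"sequence": "MKTAYIAK"})
print(result)
```

Because the second app depends only on the record, not on the first app, either could be replaced by another app that honors the same interface.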
In Summary
• We have at hand the means to accelerate
  the rate of discovery
• To do so we need to place more value on
  the data, the individuals that produce it
  and the institutions that maintain it
• We are all stakeholders in this endeavor
• Here is one way to get involved….
Get Involved: FORCE11
                     • Tools and Resource
                       catalog
                     • Article database in
                       Mendeley
                     • Discussion Forum via
                       Google
                     • Blogs courtesy of blog
                       sites and RSS feeds
                     • Web site via Drupal
                     • Announcements via
                       Twitter


http://force11.org
General References
• Force11 Manifesto

• Fourth Paradigm: Data Intensive Scientific
  Discovery
  http://research.microsoft.com/enus/collaboration/fourthparadigm/
pbourne@ucsd.edu

Contenu connexe

Tendances

Managing and Sharing Research Data: Good practices for an ideal world...in th...
Managing and Sharing Research Data: Good practices for an ideal world...in th...Managing and Sharing Research Data: Good practices for an ideal world...in th...
Managing and Sharing Research Data: Good practices for an ideal world...in th...Martin Donnelly
 
Needs for Data Management & Citation Throughout the Information Lifecycle
Needs for Data Management & Citation Throughout  the Information LifecycleNeeds for Data Management & Citation Throughout  the Information Lifecycle
Needs for Data Management & Citation Throughout the Information LifecycleMicah Altman
 
The real world of ontologies and phenotype representation: perspectives from...
The real world of ontologies and phenotype representation:  perspectives from...The real world of ontologies and phenotype representation:  perspectives from...
The real world of ontologies and phenotype representation: perspectives from...Neuroscience Information Framework
 

Tendances (7)

Dapra
DapraDapra
Dapra
 
Managing and Sharing Research Data: Good practices for an ideal world...in th...
Managing and Sharing Research Data: Good practices for an ideal world...in th...Managing and Sharing Research Data: Good practices for an ideal world...in th...
Managing and Sharing Research Data: Good practices for an ideal world...in th...
 
Needs for Data Management & Citation Throughout the Information Lifecycle
Needs for Data Management & Citation Throughout  the Information LifecycleNeeds for Data Management & Citation Throughout  the Information Lifecycle
Needs for Data Management & Citation Throughout the Information Lifecycle
 
The real world of ontologies and phenotype representation: perspectives from...
The real world of ontologies and phenotype representation:  perspectives from...The real world of ontologies and phenotype representation:  perspectives from...
The real world of ontologies and phenotype representation: perspectives from...
 
NISO Forum, Denver, Sept. 24, 2012: DataCite and Campus Data Services
NISO Forum, Denver, Sept. 24, 2012: DataCite and Campus Data ServicesNISO Forum, Denver, Sept. 24, 2012: DataCite and Campus Data Services
NISO Forum, Denver, Sept. 24, 2012: DataCite and Campus Data Services
 
Data-Intensive Research
Data-Intensive ResearchData-Intensive Research
Data-Intensive Research
 
NISO Forum, Denver, Sept. 24, 2012: Scientific discovery and innovation in an...
NISO Forum, Denver, Sept. 24, 2012: Scientific discovery and innovation in an...NISO Forum, Denver, Sept. 24, 2012: Scientific discovery and innovation in an...
NISO Forum, Denver, Sept. 24, 2012: Scientific discovery and innovation in an...
 

En vedette

Research Data Alliance March 19, 2013
Research Data Alliance March 19, 2013Research Data Alliance March 19, 2013
Research Data Alliance March 19, 2013Philip Bourne
 
Hiring and Supervising
Hiring and Supervising Hiring and Supervising
Hiring and Supervising Philip Bourne
 
Next Generation Data and Opportunities for Clinical Pharmacologists
Next Generation Data and Opportunities for Clinical PharmacologistsNext Generation Data and Opportunities for Clinical Pharmacologists
Next Generation Data and Opportunities for Clinical PharmacologistsPhilip Bourne
 
Professional Development Presented to ACS Student Group Oct 16, 2013
Professional Development Presented to ACS Student Group Oct 16, 2013Professional Development Presented to ACS Student Group Oct 16, 2013
Professional Development Presented to ACS Student Group Oct 16, 2013Philip Bourne
 
Overview of Digital Publishing
Overview of Digital PublishingOverview of Digital Publishing
Overview of Digital PublishingPhilip Bourne
 
Communicating Systems Biology - Why and How We Should Do Better in a Digital ...
Communicating Systems Biology - Why and How We Should Do Better in a Digital ...Communicating Systems Biology - Why and How We Should Do Better in a Digital ...
Communicating Systems Biology - Why and How We Should Do Better in a Digital ...Philip Bourne
 
Towards Biomedical Research as a Digital Enterprise
Towards Biomedical Research as a Digital EnterpriseTowards Biomedical Research as a Digital Enterprise
Towards Biomedical Research as a Digital EnterprisePhilip Bourne
 
UCSD Progress in Innovation
UCSD Progress in InnovationUCSD Progress in Innovation
UCSD Progress in InnovationPhilip Bourne
 
Ten Simple Rules for Building and Maintaining a Scientific Reputation
Ten Simple Rules for Building and Maintaining a Scientific ReputationTen Simple Rules for Building and Maintaining a Scientific Reputation
Ten Simple Rules for Building and Maintaining a Scientific ReputationPhilip Bourne
 

En vedette (9)

Research Data Alliance March 19, 2013
Research Data Alliance March 19, 2013Research Data Alliance March 19, 2013
Research Data Alliance March 19, 2013
 
Hiring and Supervising
Hiring and Supervising Hiring and Supervising
Hiring and Supervising
 
Next Generation Data and Opportunities for Clinical Pharmacologists
Next Generation Data and Opportunities for Clinical PharmacologistsNext Generation Data and Opportunities for Clinical Pharmacologists
Next Generation Data and Opportunities for Clinical Pharmacologists
 
Professional Development Presented to ACS Student Group Oct 16, 2013
Professional Development Presented to ACS Student Group Oct 16, 2013Professional Development Presented to ACS Student Group Oct 16, 2013
Professional Development Presented to ACS Student Group Oct 16, 2013
 
Overview of Digital Publishing
Overview of Digital PublishingOverview of Digital Publishing
Overview of Digital Publishing
 
Communicating Systems Biology - Why and How We Should Do Better in a Digital ...
Communicating Systems Biology - Why and How We Should Do Better in a Digital ...Communicating Systems Biology - Why and How We Should Do Better in a Digital ...
Communicating Systems Biology - Why and How We Should Do Better in a Digital ...
 
Towards Biomedical Research as a Digital Enterprise
Towards Biomedical Research as a Digital EnterpriseTowards Biomedical Research as a Digital Enterprise
Towards Biomedical Research as a Digital Enterprise
 
UCSD Progress in Innovation
UCSD Progress in InnovationUCSD Progress in Innovation
UCSD Progress in Innovation
 
Ten Simple Rules for Building and Maintaining a Scientific Reputation
Ten Simple Rules for Building and Maintaining a Scientific ReputationTen Simple Rules for Building and Maintaining a Scientific Reputation
Ten Simple Rules for Building and Maintaining a Scientific Reputation
 

Similaire à Open Data - Where Do We Stand from a Researcher's Perspective?

Biomedical Research as Part of the Digital Enterprise
Biomedical Research as Part of the Digital EnterpriseBiomedical Research as Part of the Digital Enterprise
Biomedical Research as Part of the Digital EnterprisePhilip Bourne
 
PhRMA Some Early Thoughts
PhRMA Some Early ThoughtsPhRMA Some Early Thoughts
PhRMA Some Early ThoughtsPhilip Bourne
 
Next Generation Preprint Service
Next Generation Preprint ServiceNext Generation Preprint Service
Next Generation Preprint ServicePhilip Bourne
 
Looking for Data: Finding New Science
Looking for Data: Finding New ScienceLooking for Data: Finding New Science
Looking for Data: Finding New ScienceAnita de Waard
 
How to Execute A Research Paper
How to Execute A Research PaperHow to Execute A Research Paper
How to Execute A Research PaperAnita de Waard
 
Data at the NIH: Some Early Thoughts
Data at the NIH: Some Early ThoughtsData at the NIH: Some Early Thoughts
Data at the NIH: Some Early ThoughtsPhilip Bourne
 
Open Data in a Global Ecosystem
Open Data in a Global EcosystemOpen Data in a Global Ecosystem
Open Data in a Global EcosystemPhilip Bourne
 
There is No Intelligent Life Down Here
There is No Intelligent Life Down HereThere is No Intelligent Life Down Here
There is No Intelligent Life Down HerePhilip Bourne
 
Laurie Goodman at #SSPBoston: Article+Data+Tools Reproducibility, Reuse, & Ra...
Laurie Goodman at #SSPBoston: Article+Data+ToolsReproducibility, Reuse, & Ra...Laurie Goodman at #SSPBoston: Article+Data+ToolsReproducibility, Reuse, & Ra...
Laurie Goodman at #SSPBoston: Article+Data+Tools Reproducibility, Reuse, & Ra...GigaScience, BGI Hong Kong
 
NGP Retreat Open Science 2015
NGP Retreat Open Science 2015NGP Retreat Open Science 2015
NGP Retreat Open Science 2015Jackie Wirz, PhD
 
Big Data in Biomedicine – An NIH Perspective
Big Data in Biomedicine – An NIH PerspectiveBig Data in Biomedicine – An NIH Perspective
Big Data in Biomedicine – An NIH PerspectivePhilip Bourne
 
One Scientist’s Wish List for Scientific Publishers
One Scientist’s Wish List for Scientific PublishersOne Scientist’s Wish List for Scientific Publishers
One Scientist’s Wish List for Scientific PublishersPhilip Bourne
 
Jonathan Tedds Distinguished Lecture at DLab, UC Berkeley, 12 Sep 2013: "The ...
Jonathan Tedds Distinguished Lecture at DLab, UC Berkeley, 12 Sep 2013: "The ...Jonathan Tedds Distinguished Lecture at DLab, UC Berkeley, 12 Sep 2013: "The ...
Jonathan Tedds Distinguished Lecture at DLab, UC Berkeley, 12 Sep 2013: "The ...Jonathan Tedds
 
Introduction to Research Data Management for postgraduate students
Introduction to Research Data Management for postgraduate studentsIntroduction to Research Data Management for postgraduate students
Introduction to Research Data Management for postgraduate studentsMarieke Guy
 
Knowledge Exchange, Nov 2011, Bonn
Knowledge Exchange, Nov 2011, BonnKnowledge Exchange, Nov 2011, Bonn
Knowledge Exchange, Nov 2011, BonnTodd Vision
 
FORCE11: Creating a data and tools ecosystem
FORCE11:  Creating a data and tools ecosystemFORCE11:  Creating a data and tools ecosystem
FORCE11: Creating a data and tools ecosystemMaryann Martone
 
The Future of Research (Science and Technology)
The Future of Research (Science and Technology)The Future of Research (Science and Technology)
The Future of Research (Science and Technology)Duncan Hull
 
The Thinking Behind Big Data at the NIH
The Thinking Behind Big Data at the NIHThe Thinking Behind Big Data at the NIH
The Thinking Behind Big Data at the NIHPhilip Bourne
 

Similaire à Open Data - Where Do We Stand from a Researcher's Perspective? (20)

Biomedical Research as Part of the Digital Enterprise
Biomedical Research as Part of the Digital EnterpriseBiomedical Research as Part of the Digital Enterprise
Biomedical Research as Part of the Digital Enterprise
 
PhRMA Some Early Thoughts
PhRMA Some Early ThoughtsPhRMA Some Early Thoughts
PhRMA Some Early Thoughts
 
Data at the NIH
Data at the NIHData at the NIH
Data at the NIH
 
Next Generation Preprint Service
Next Generation Preprint ServiceNext Generation Preprint Service
Next Generation Preprint Service
 
Looking for Data: Finding New Science
Looking for Data: Finding New ScienceLooking for Data: Finding New Science
Looking for Data: Finding New Science
 
How to Execute A Research Paper
How to Execute A Research PaperHow to Execute A Research Paper
How to Execute A Research Paper
 
Cartegena051811
Cartegena051811Cartegena051811
Cartegena051811
 
Data at the NIH: Some Early Thoughts
Data at the NIH: Some Early ThoughtsData at the NIH: Some Early Thoughts
Data at the NIH: Some Early Thoughts
 
Open Data in a Global Ecosystem
Open Data in a Global EcosystemOpen Data in a Global Ecosystem
Open Data in a Global Ecosystem
 
There is No Intelligent Life Down Here
There is No Intelligent Life Down HereThere is No Intelligent Life Down Here
There is No Intelligent Life Down Here
 
Laurie Goodman at #SSPBoston: Article+Data+Tools Reproducibility, Reuse, & Ra...
Laurie Goodman at #SSPBoston: Article+Data+ToolsReproducibility, Reuse, & Ra...Laurie Goodman at #SSPBoston: Article+Data+ToolsReproducibility, Reuse, & Ra...
Laurie Goodman at #SSPBoston: Article+Data+Tools Reproducibility, Reuse, & Ra...
 
NGP Retreat Open Science 2015
NGP Retreat Open Science 2015NGP Retreat Open Science 2015
NGP Retreat Open Science 2015
 
Big Data in Biomedicine – An NIH Perspective
Big Data in Biomedicine – An NIH PerspectiveBig Data in Biomedicine – An NIH Perspective
Big Data in Biomedicine – An NIH Perspective
 
One Scientist’s Wish List for Scientific Publishers
One Scientist’s Wish List for Scientific PublishersOne Scientist’s Wish List for Scientific Publishers
One Scientist’s Wish List for Scientific Publishers
 
Jonathan Tedds Distinguished Lecture at DLab, UC Berkeley, 12 Sep 2013: "The ...
Jonathan Tedds Distinguished Lecture at DLab, UC Berkeley, 12 Sep 2013: "The ...Jonathan Tedds Distinguished Lecture at DLab, UC Berkeley, 12 Sep 2013: "The ...
Jonathan Tedds Distinguished Lecture at DLab, UC Berkeley, 12 Sep 2013: "The ...
 
Introduction to Research Data Management for postgraduate students
Introduction to Research Data Management for postgraduate studentsIntroduction to Research Data Management for postgraduate students
Introduction to Research Data Management for postgraduate students
 
Knowledge Exchange, Nov 2011, Bonn
Knowledge Exchange, Nov 2011, BonnKnowledge Exchange, Nov 2011, Bonn
Knowledge Exchange, Nov 2011, Bonn
 
FORCE11: Creating a data and tools ecosystem
FORCE11:  Creating a data and tools ecosystemFORCE11:  Creating a data and tools ecosystem
FORCE11: Creating a data and tools ecosystem
 
The Future of Research (Science and Technology)
The Future of Research (Science and Technology)The Future of Research (Science and Technology)
The Future of Research (Science and Technology)
 
The Thinking Behind Big Data at the NIH
The Thinking Behind Big Data at the NIHThe Thinking Behind Big Data at the NIH
The Thinking Behind Big Data at the NIH
 

Plus de Philip Bourne

Data Science and AI in Biomedicine: The World has Changed
Data Science and AI in Biomedicine: The World has ChangedData Science and AI in Biomedicine: The World has Changed
Data Science and AI in Biomedicine: The World has ChangedPhilip Bourne
 
AI in Medical Education A Meta View to Start a Conversation
AI in Medical Education A Meta View to Start a ConversationAI in Medical Education A Meta View to Start a Conversation
AI in Medical Education A Meta View to Start a ConversationPhilip Bourne
 
AI+ Now and Then How Did We Get Here And Where Are We Going
AI+ Now and Then How Did We Get Here And Where Are We GoingAI+ Now and Then How Did We Get Here And Where Are We Going
AI+ Now and Then How Did We Get Here And Where Are We GoingPhilip Bourne
 
Thoughts on Biological Data Sustainability
Thoughts on Biological Data SustainabilityThoughts on Biological Data Sustainability
Thoughts on Biological Data SustainabilityPhilip Bourne
 
What is FAIR Data and Who Needs It?
What is FAIR Data and Who Needs It?What is FAIR Data and Who Needs It?
What is FAIR Data and Who Needs It?Philip Bourne
 
Data Science Meets Biomedicine, Does Anything Change
Data Science Meets Biomedicine, Does Anything ChangeData Science Meets Biomedicine, Does Anything Change
Open Data - Where Do We Stand from a Researcher's Perspective?

  • 1. Open Data – Where Do We Stand from A Researcher's Perspective? Philip E. Bourne University of California San Diego pbourne@ucsd.edu
  • 2. My Perspective … • Mine is a biomedical sciences perspective • My lab. distributes for free data equivalent to ¼ the Library of Congress every month • I am a supporter of open access (provided there is a business/sustainability model) and founding editor in chief of PLOS Computational Biology • I am Co-founder of SciVee Inc. and believe innovation comes from open access to knowledge • Recently became UCSD’s AVC of Innovation which is giving me a more institutional perspective I Readily Acknowledge Each Discipline is Different
  • 3. My General Opinion: Where Does the Open Access Debate Stand Today? • It's not a question of “if” but a question of “when” and “how” for most disciplines • We are at the tip of the iceberg in our ability to use OA content • OA will gain momentum in an increasingly knowledge-based economy
  • 4. The State of Play: UC Open Access Policy Debate: Opt Out vs Opt In • For: – Publicly funded research should be public – Institutional perspective: the open provision of data and knowledge derived from these data appears to be an unidentified asset at this time • Against: – Cost to some disciplines – Impact on societies – Journal quality re promotion – Extra work – Administration – UC as “Big Brother”
  • 5. We will come back to this, but first let us explore why open knowledge is so important (to me at least)
  • 6. Open Data May Save Lives?* Structure Summary page activity for H1N1 Influenza related structures (time axis: Jan. 2008 to Jul. 2010) • 3B7E: Neuraminidase of A/Brevig Mission/1/1918 H1N1 strain in complex with zanamivir • 1RUZ: 1918 H1 Hemagglutinin * http://www.cdc.gov/h1n1flu/estimates/April_March_13.htm
  • 7. Open Science Can Accelerate the Scientific Process… For some people the change may be too slow to save their life
  • 8. Josh Sommer – A Remarkable Young Man Co-founder & Executive Director of the Chordoma Foundation http://sagecongress.org/Presentations/Sommer.pdf
  • 9. Chordoma • A rare form of brain cancer • No known drugs • Treatment – surgical resection followed by intense radiation therapy http://upload.wikimedia.org/wikipedia/commons/2/2b/Chordoma.JPG
  • 13. “If I have seen further it is only by standing on the shoulders of giants” (Isaac Newton). From Josh’s point of view the climb up just takes too long: > 15 years and > $850M to be more precise. Adapted: http://sagecongress.org/Presentations/Sommer.pdf
  • 17. The Story of Meredith
  • 18. What Does Meredith Tell Us? • The Wikipedia / Khan Academy / YouTube generation knows no bounds • Bounds are too often imposed by tradition rather than what makes the most sense • Another example of an underexploited asset at this time?
  • 19. Another Way of Thinking About the Implications of What Josh and Meredith Represent Is the Need for New Forms of Knowledge Management and Access. Let's Explore this Notion with An Emphasis on Data
  • 20. The Silos of Data & Knowledge Are Starting to Coalesce Is a Biological Database Really Different than a Biological Journal? PLoS Comp. Biol. 2005 1(3) e34
  • 21. The Silos of Data & Knowledge Are Starting to Coalesce • Supplemental information has exploded • Data journals are emerging • The use of rich media is increasing • Software and other processes are becoming available • Databases are now knowledgebases • Science can be done on the fly • Biocuration is a respected career PLoS Comp. Biol. 2008. 4(7): e1000136
  • 22. Where Does That Take Us? • A paper is an artifact of a previous era • It is not the logical end product of eScience, hence: – Work is omitted – Article vs supplement is a mess – Visualization may be limited – Interaction and enquiry are non-existent – Rich media can help, but barriers remain
  • 23. Where Does That Take Us? Data Sharing Policies • From the NSF: • Investigators are expected to share with other researchers, at no more than incremental cost and within a reasonable time, the primary data, samples, physical collections and other supporting materials created or gathered in the course of work under NSF grants. Grantees are expected to encourage and facilitate such sharing. See Award & Administration Guide (AAG) Chapter VI.D.4.
  • 24. Big Data is Off… • March 2012 OSTP commits $200M to Big Data • NSF, DOD, NIH all announce programs • GBMF think tank leads to soon-to-be-announced institutional awards
  • 25. Where Does That Take Us? Add into the Mix: • Reproducibility – It really is a myth! • Maintainability – DNA doubles in 5 months • Usability – Go ahead and try! • Reward – Tenure for data: no way. Notwithstanding, dreams do emerge … Here is mine
  • 26. Here is What I Want: The Knowledge and Data Cycle • From the paper: 0. Full text of PLoS papers stored in a database 1. A link brings up figures from the paper 2. Clicking the paper figure retrieves data from the PDB which is analyzed 3. A composite view of journal and database content results 4. The composite view has links to pertinent blocks of literature text and back to the PDB • From the database: 1. User clicks on thumbnail 2. Metadata and a webservices call provide a renderable image that can be annotated 3. Selecting a feature provides a database/literature mashup 4. That leads to new papers PLoS Comp. Biol. 2005 1(3) e34
  • 27. The Knowledge Economy Begins Cardiac Disease Literature Immunology Literature
  • 28. Simultaneously Discovery Informatics Emerges • Google will not suffice as a scientific knowledge discovery tool • Google is broad but shallow • Science is cross-disciplinary, narrower and deeper
  • 29. NSF Discovery Informatics Workshop • Discoveries surpass an individual's ability - need intelligent tools • Need to increase connections between knowledge and data • Need to combine diverse human abilities Discovery informatics - computer scientists, domain scientists, social scientists - http://www.isi.edu/~gil/diw2012/NSFDiscoveryInformatics2012-FinalReport.pdf
  • 30. This is Just the Beginning of Discovery Informatics • Each evening the lab's “Evernote” notebooks are scanned for commonalities from the day's activities. These are seeds in a deep search of the web for knowledge and data that have become available since the last search. Results are ranked and presented for consideration over coffee the next morning http://www.discoveryinformaticsinitiative.org/diw2012
  • 31. Unimaginable Connections Made Automatically Through RDF Descriptions http://richard.cyganiak.de/2007/10/lod/lod-datasets_2010-09-22_colored.html
  • 32. Before We Get Too Heady Let's Look at the Realities of the Situation from My Perspective • Data repositories are broken • There is a “high noon” effect • NCBI has been a wonderful model to date…
  • 33. Data/Institutional Repositories • “Build it and they will come” fails most of the time • Institutional repository is an oxymoron • NCBI works because: – It is an act of the US Congress – It has strong leadership – It has a monopoly on the literature – It has IT thought out over many years Innkeeper at the Roach Motel, D. Salo 2008 http://muse.jhu.edu/journals/library_trends/v057/57.2.salo.html
  • 34. Data/Institutional Repositories • “High Noon” Effect – Publishers make knowledge in very difficult, but at least knowledge out, albeit limited, is consistent, intuitive and easy to use – Data repositories make data in and data out very difficult – they strive to be different when in fact users want them to be the same
  • 35. Data and Journals • That journals are thinking about data is good • Dryad etc. are welcome but a stopgap measure • Fully functional data journals will not occur without a change to the reward system • Data papers can help shift the reward system • Are PLoS Topic Pages a sign?
  • 36. Interim Solution: Use the Traditional Reward System The Wikipedia Experiment – Topic Pages • Identify areas of Wikipedia that relate to the journal that are missing or stubs • Develop a Wikipedia page in the sandbox • Have a Topic Page Editor review the page • Publish the copy of record with associated rewards • Release the living version into Wikipedia
  • 37. Think Globally Act Locally: What Can Our Institutions Do Now To Move Us in The Right Direction?
  • 38. Institutional Response • Have repositories that are useful – Use common standards – Are vetted by the community – Are fully open and searchable • Reward all forms of scholarship • Leverage the asset …
  • 39. Most Laboratories • We are the long tail • Goodbye to the student is goodbye to the data • Very few of us have complied (or will comply) with the data management plans we write into grants
  • 40. UCSD Dropbox • Simple!!!! • Can drop large files easily • Asks for limited metadata and permissions to “discover” • Has guaranteed quality of service and security not available in the cloud • Is the data management plan and charged against grants • Is a rich campus corpus open to discovery informatics
  • 41. The UCSD Dropbox Discovery Environment • Scenarios: – Fosters known collaborations through simplified data exchange – Discovers new collaborators through the same or related data elements – A corpus whose intrinsic value is as yet unknown
  • 42. What Do I Want by 2020 or Earlier as a Researcher? • Answer biological questions not just retrieve data • Understand all there is to know about the availability and quality of a unit of biological data • Operate on data in a way that is simpler, more productive, and reproducible
  • 43. What Do We Need to Do to Get There? A Data Registry? • Individual repositories register their metadata which includes access statistics, commentary etc. – DataCite is a beginning • Identify identical data objects and their respective metadata for comparative analysis • Funders support registration • Publishers support registration
  • 44. What Do We Need to Do to Get There? An App+ Store? • The App model – Think of it operating on a content base rather than a mobile device – Simple and consistent user interface – Needs to pass some quality control – Has a reward • The App+ Model – Apps interoperate through a generic workflow interface
  • 45. In Summary • We have at hand the means to accelerate the rate of discovery • To do so we need to place more value on the data, the individuals that produce it and the institutions that maintain it • We are all stakeholders in this endeavor • Here is one way to get involved….
  • 46. Get Involved: FORCE11 • Tools and Resource catalog • Article database in Mendeley • Discussion Forum via Google • Blogs courtesy of blog sites and RSS feeds • Web site via Drupal • Announcements via Twitter http://force11.org
  • 47. General References • Force11 Manifesto • Fourth Paradigm: Data Intensive Scientific Discovery http://research.microsoft.com/en-us/collaboration/fourthparadigm/
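
The data registry proposed on slide 43 (repositories register their metadata, and the registry identifies identical data objects for comparison) can be sketched in a few lines. The Python below is only an illustration under stated assumptions: the `Record` fields, the `Registry` class, the repository names, and the use of a SHA-256 content checksum to decide that two data objects are identical are all hypothetical choices, not something the talk or DataCite specifies.

```python
import hashlib
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Record:
    """Metadata one repository registers for one data object (hypothetical schema)."""
    repository: str
    identifier: str
    checksum: str   # SHA-256 of the object's bytes (assumed identity criterion)
    downloads: int  # stand-in for the "access statistics" mentioned on the slide


def checksum(data: bytes) -> str:
    """Return the hex SHA-256 digest used here as the object's fingerprint."""
    return hashlib.sha256(data).hexdigest()


class Registry:
    """Toy in-memory registry that groups registered records by checksum."""

    def __init__(self) -> None:
        self._by_checksum: Dict[str, List[Record]] = defaultdict(list)

    def register(self, record: Record) -> None:
        self._by_checksum[record.checksum].append(record)

    def duplicates(self) -> Dict[str, List[Record]]:
        """Checksums held under more than one record, with all their metadata."""
        return {c: recs for c, recs in self._by_checksum.items() if len(recs) > 1}


registry = Registry()
payload = b"example bytes of one data object"
registry.register(Record("repo-a", "a:001", checksum(payload), downloads=120))
registry.register(Record("repo-b", "b:XYZ", checksum(payload), downloads=45))
registry.register(Record("repo-b", "b:ABC", checksum(b"other"), downloads=3))

# Compare the metadata that different repositories hold for the same object.
for digest, records in registry.duplicates().items():
    print(digest[:12], [(r.repository, r.identifier, r.downloads) for r in records])
```

A byte-level checksum only catches exact copies; reformatted or re-annotated versions of the same data would need a domain-specific notion of identity, which this sketch does not attempt.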