www.software.ac.uk




    Where does it go from here?
The Place of Software in Digital Repositories
                   12 July 2012
               OR2012, Edinburgh
             Neil Chue Hong (@npch)
           N.ChueHong@software.ac.uk

              Software Sustainability Institute

Software is pervasive in research




The Software Sustainability Institute



A national facility for building better software
• Better software enables better research
• Software reaches boundaries in its
  development cycle that prevent
  improvement, growth and adoption
• Providing the expertise and services
  needed to negotiate to the next stage
   •   Software reviews and refactoring, collaborations
       to develop your project, guidance and best practice
       on software development, project management,
       community building, publicity and more…
Supported by EPSRC Grant EP/H043160/1

Software Sustainability: preservation vs sustainability




Sustainability?
Image courtesy of London Permaculture under CC-by-nc-sa license

Preservation?
Image courtesy of Mortati under CC-by-nc-nd

Why are you considering software sustainability?




Purpose:
• Achieve legal compliance
• Create heritage value
• Enable continued access to data
• Encourage software reuse

JISC-funded, with Curtis+Cartwright
http://www.software.ac.uk/resources/preserving-software-resources

How are you going to choose the right approach?



Approach:
• Preservation (techno-centric)
• Emulation (data-centric)
• Migration (functionality-centric)
• Transition (process-centric)
• Hibernation (knowledge-centric)
• Deprecation

Software Carpentry



• Helping scientists be more productive by
  teaching them basic computing skills
• How to use repositories properly is a key skill
• http://software-carpentry.org




Just the Nature of the problem?


Statistics courtesy of Greg Wilson, Software Carpentry, from Nature article




Published online 13 October 2010 | Nature 467, 775-777 (2010)
doi:10.1038/467775a

Maintenance is not fun
Hacking is fun

“Re-” is the new black


Slide from Carole Goble, JCDL 2012
[Diagram: the “Re-” terms Reuse, Review, Refresh, Rerun, Repeat, Reproduce with new Data, Replay, Repurpose, Recover, Reconstruct, Repair and Reproduce with new Method, laid out against labels for what is published (Publication only, Data, Data Provenance, Same State, New State) and for how the method is captured (Method Documentation, Method Provenance (link data and code), Method Execution), with a “Good enough to verify” marker.]

Drummond C, Replicability is not Reproducibility: Nor is it Good Science, online
Peng RD, Reproducible Research in Computational Science, Science 2 Dec 2011: 1226-1227.

The most important: Reward



• How do we reward people for important software
  contributions?

• Traditionally: publish a research paper that happens to
  mention software
   •   Can we provide more direct, acceptable software citations?
• A Research Software Impact Manifesto
   •   http://www.software.ac.uk/blog/2011-05-02-publish-or-be-damned-alternative-impact-manifesto-research-software
   •   NB Authorship is hard


Isn’t software just data?
http://beyond-impact.org/?p=175

Boundary




What do we choose to keep:
- Workflow?
- Software that runs workflow?
- Software referenced by workflow?
- Software dependencies?
What’s the minimum citable part?

Granularity

• Function
• Algorithm
• Program
• Library / Suite / Package
• …

Versioning

Why do we version?


- To indicate a change
- To allow sharing
- To confer special status

[Diagram: public releases (Public v1, v2, v3) shown above a branching history of personal versions (Personal v1, v2, v2a, v3, v3a)]

Backup, Sharing, Archiving

Differing roles, different repositories




backup, sharing, archiving

• Timescales
• Policy
• Licensing
• Ingest
• Metadata
• Assurance

Software Metapapers



   • Create a complete scholarly record including “standard”
     publication, method, dataset and models, and software
          e.g. modelling and simulation, statistical analysis
          Enable replay, reproduction and reuse
   • Pragmatic approach is to create a metadata record for
     the software, and link it to a copy of the software in
     some storage infrastructure
          This is a software metapaper
          Peer-review the metadata, not the software
   • Journal of Open Research Software:
          http://openresearchsoftware.metajnl.com/
See: http://openresearchsoftware.metajnl.com/faq/ and the work by B. Matthews et al: The Significant Properties of Software: A Study
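
As a rough illustration of the pragmatic approach above, a metapaper can be pictured as a small metadata record that points at a stored copy of the software. The sketch below is only an assumption for discussion: the `SoftwareMetapaper` class, its field names and the example values (including the `repository.example.org` URL) are hypothetical, and it does not represent the actual schema used by the Journal of Open Research Software.

```python
from dataclasses import dataclass, field, asdict
import json


@dataclass
class SoftwareMetapaper:
    """Hypothetical metadata record describing one deposited software version."""
    title: str
    authors: list
    version: str
    license: str          # e.g. an identifier such as "Apache-2.0"
    repository_url: str   # where the copy of the software is stored
    related_papers: list = field(default_factory=list)
    related_datasets: list = field(default_factory=list)

    def missing_fields(self):
        """Return the names of required fields that were left empty."""
        required = ["title", "authors", "version", "license", "repository_url"]
        return [name for name in required if not getattr(self, name)]

    def to_json(self):
        """Serialise the record so it can be stored or reviewed separately from the code."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)


# Illustrative values only; none of these refer to a real deposit.
record = SoftwareMetapaper(
    title="Example simulation code",
    authors=["A. Researcher"],
    version="1.0.0",
    license="Apache-2.0",
    repository_url="https://repository.example.org/software/example-simulation/1.0.0",
)
```

The point of the design is that peer review can look at the record (is it complete, does the link resolve, is the licence stated) without having to review the software itself.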
An acceptable repository



• Metapaper references an instance of software,
  stored in a “suitable” repository
     Clear access / deposit / preservation policy
     Adherence to standards
     Ability to easily “transfer”
     Sustainability of hosting organisation
     Ability to monitor, check integrity (obsolescence?)
• We may be storing
   Binaries, source code (as text or archived), virtual
    machines(!)
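
One concrete reading of the "ability to monitor, check integrity" is a fixity check: record a checksum when the software is deposited, then recompute it later to detect corruption. The following is a minimal sketch under that assumption. The choice of SHA-256 and the function names `sha256_of` and `check_fixity` are illustrative and are not taken from the talk or from any particular repository platform.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_fixity(path, expected_digest):
    """Return True if the file exists and still matches the digest recorded at deposit."""
    return Path(path).is_file() and sha256_of(path) == expected_digest
```

A check like this works the same way for binaries, source archives and virtual machine images, although it says nothing about whether the software can still be run (the obsolescence question).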

Potential for confusion



• ‘The right license for all parts of the scholarly record’
    Victoria Stodden, Enabling Reproducible Research: Open
     Licensing for Scientific Innovation
• Commonly used OSI approved licenses include:
   •   Apache License, 2.0 (Apache-2.0)
   •   BSD 3-Clause “New” or “Revised” license (BSD-3-Clause)
   •   BSD 2-Clause “Simplified” or “FreeBSD” license (BSD-2-Clause)
   •   GNU General Public License (GPL)
   •   GNU Library or “Lesser” General Public License (LGPL)
   •   MIT license (MIT)
   •   Mozilla Public License 2.0 (MPL-2.0)
   •   Common Development and Distribution License (CDDL-1.0)
   •   Eclipse Public License (EPL-1.0)
• Does enabling the deposit of software just confuse
  those already depositing publications/data?

5 Stars of Software?



• Do we need a 5-star scheme for software?
   •   Existence – there is accurate metadata that defines the software
   •   Availability – you can access and run the software
   •   Openness – the software has an open, permissive license
   •   Assured – the software provides ways of assuring its correctness
   •   Linked – the related data, dependencies and papers are indicated
• c.f. 5 Stars of Linked Data (Berners-Lee) and 5 Stars of Online Journals (Shotton)
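
To make the proposed stars a little more tangible, here is a hedged sketch of how a repository might record which stars a deposit earns. The slide only poses the question, so everything here is an assumption: the `SoftwareDeposit` fields, the idea of using tests as a proxy for "Assured", and the small `OPEN_LICENSES` set (a subset of the identifiers listed earlier in this talk) are all hypothetical choices.

```python
from dataclasses import dataclass


@dataclass
class SoftwareDeposit:
    """Hypothetical description of a deposited piece of software."""
    has_metadata: bool = False   # Existence
    can_be_run: bool = False     # Availability
    license_id: str = ""         # Openness (checked against the example set below)
    has_tests: bool = False      # Assured (tests used here as one possible proxy)
    links: tuple = ()            # Linked: related data, dependencies, papers


# Example subset of licence identifiers; an assumption, not a complete list.
OPEN_LICENSES = {
    "Apache-2.0", "BSD-3-Clause", "BSD-2-Clause",
    "MIT", "MPL-2.0", "CDDL-1.0", "EPL-1.0",
}


def stars(deposit):
    """Return the names of the stars a deposit earns, in the order listed above."""
    checks = [
        ("Existence", deposit.has_metadata),
        ("Availability", deposit.can_be_run),
        ("Openness", deposit.license_id in OPEN_LICENSES),
        ("Assured", deposit.has_tests),
        ("Linked", len(deposit.links) > 0),
    ]
    return [name for name, passed in checks if passed]
```

Unlike the Linked Data stars, these criteria are treated as independent rather than cumulative in this sketch; whether they should build on one another is part of the open question.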

Take home points


1) Researchers are developing more software than ever, and trying to do it better
2) They want to be rewarded for creating a complete scholarly record – this includes software
3) We still don’t know the best way to shift from one repository role to another when it comes to software!

Backup -> sharing -> archiving

 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 

Where does it go from here? The role of software in digital repositories

  • 1. www.software.ac.uk Where does it go from here? The Place of Software in Digital Repositories 12 July 2012 OR2012, Edinburgh Neil Chue Hong (@npch) N.ChueHong@software.ac.uk Software Sustainability Institute
  • 2. Software is pervasive in research
  • 3. The Software Sustainability Institute: a national facility for building better software • Better software enables better research • Software reaches boundaries in its development cycle that prevent improvement, growth and adoption • Providing the expertise and services needed to negotiate to the next stage • Software reviews and refactoring, collaborations to develop your project, guidance and best practice on software development, project management, community building, publicity and more… Supported by EPSRC Grant EP/H043160/1
  • 4. Software Sustainability: preservation vs sustainability. Sustainability? Image courtesy of London Permaculture under CC-by-nc-sa license. Image courtesy of Mortati under CC-by-nc-nd. Preservation?
  • 5. Why are you considering software sustainability? Purpose: Achieve legal compliance; Create heritage value; Enable continued access to data; Encourage software reuse. JISC-funded, with Curtis+Cartwright: http://www.software.ac.uk/resources/preserving-software-resources
  • 6. How are you going to choose the right approach? Approach: Preservation (techno-centric); Emulation (data-centric); Migration (functionality-centric); Transition (process-centric); Hibernation (knowledge-centric); Deprecation
  • 7. Software Carpentry • Helping scientists be more productive by teaching them basic computing skills • How to use repositories properly is a key skill • http://software-carpentry.org
  • 8. Just the Nature of the problem? Maintenance is not fun. Hacking is fun. Statistics courtesy of Greg Wilson, Software Carpentry, from Nature article: published online 13 October 2010 | Nature 467, 775-777 (2010) doi:10.1038/467775a
  • 9. “Re-” is the new black
  • 10. Slide from Carole Goble, JCDL 2012. [Figure: the “Re-” terms. Terms shown: Reuse, Review, Refresh, Rerun, Repeat, Reproduce with new Data, Replay, Repurpose, Recover, Reconstruct, Repair, Reproduce with new Method. Other labels: New State, Same State, Good enough, To Verify, Data, Provenance, Publication, Method, Method only, Documentation, Execution (link data and code).] References: Drummond C, Replicability is not Reproducibility: Nor is it Good Science, online; Peng RD, Reproducible Research in Computational Science, Science 2 Dec 2011: 1226-1227.
  • 11. The most important: Reward • How do we reward people for important software contributions? • Traditionally: publish a research paper that happens to mention software - Can we provide more direct, acceptable software citations? • A Research Software Impact Manifesto - http://www.software.ac.uk/blog/2011-05-02-publish-or-be-damned-alternative-impact-manifesto-research-software - NB Authorship is hard
  • 13. Boundary. What do we choose to keep: - Workflow? - Software that runs workflow? - Software referenced by workflow? - Software dependencies? What’s the minimum citable part?
  • 14. Granularity: Function; Library / Suite / Package; Algorithm; Program; …
  • 15. Versioning. Why do we version? - To indicate a change - To allow sharing - To confer special status. [Diagram: public versions v1, v2, v3 alongside personal versions v1, v2, v2a, v3, v3a.]
  • 16. Backup, Sharing, Archiving
  • 17. Differing roles, different repositories: backup -> sharing -> archiving. Timescales Ingest Policy Metadata Licensing Assurance
  • 18. Software Metapapers • Create a complete scholarly record including “standard” publication, method, dataset and models, and software - e.g. modelling and simulation, statistical analysis - Enable replay, reproduction and reuse • Pragmatic approach is to create a metadata record for the software, and link it to a copy of the software in some storage infrastructure - This is a software metapaper - Peer-review the metadata, not the software • Journal of Open Research Software: - http://openresearchsoftware.metajnl.com/ See: http://openresearchsoftware.metajnl.com/faq/ and the work by B. Matthews et al: The Significant Properties of Software: A Study
  • 19. An acceptable repository • Metapaper references an instance of software, stored in a “suitable” repository - Clear access / deposit / preservation policy - Adherence to standards - Ability to easily “transfer” - Sustainability of hosting organisation - Ability to monitor, check integrity (obsolescence?) • We may be storing - Binaries, source code (as text or archived), virtual machines(!)
  • 20. Potential for confusion • ‘The right license for all parts of the scholarly record’ - Victoria Stodden, Enabling Reproducible Research: Open Licensing for Scientific Innovation • Commonly used OSI approved licenses include: - Apache License, 2.0 (Apache-2.0) - BSD 3-Clause “New” or “Revised” license (BSD-3-Clause) - BSD 2-Clause “Simplified” or “FreeBSD” license (BSD-2-Clause) - GNU General Public License (GPL) - GNU Library or “Lesser” General Public License (LGPL) - MIT license (MIT) - Mozilla Public License 2.0 (MPL-2.0) - Common Development and Distribution License (CDDL-1.0) - Eclipse Public License (EPL-1.0) • Does enabling the deposit of software just confuse those already depositing publications/data?
  • 21. 5 Stars of Software? • Do we need 5 stars for software? - Existence – there is accurate metadata that defines the software - Availability – you can access and run the software - Openness – the software has an open permissible license - Assured – the software provides ways of assuring its correctness - Linked – the related data, dependencies and papers are indicated. c.f. 5 Stars of Linked Data (Berners-Lee), 5 Stars of Online Journals (Shotton)
  • 22. Take home points 1) Researchers are developing more software than ever, and trying to do it better 2) They want to be rewarded for creating a complete scholarly record – this includes software 3) We still don’t know the best way to shift from one repository role to another when it comes to software! Backup -> sharing -> archiving
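
The metadata record of slide 18, the integrity check of slide 19 and the five proposed stars of slide 21 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a format proposed in the talk: the `SoftwareRecord` fields, the `stars` and `integrity_ok` helpers, the use of `has_tests` as a stand-in for "assured", and the example URL and DOI are all hypothetical.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, List

# Illustrative subset of the OSI-approved licenses listed on slide 20.
OPEN_LICENSES = {"Apache-2.0", "BSD-3-Clause", "BSD-2-Clause", "MIT", "MPL-2.0"}


@dataclass
class SoftwareRecord:
    """Hypothetical metadata record that points at a stored copy of the software."""
    title: str
    version: str
    license_id: str = ""       # short license identifier, e.g. "MIT"
    archive_url: str = ""      # location of the deposited copy
    sha256: str = ""           # checksum recorded at deposit time
    has_tests: bool = False    # assumed proxy for "ways of assuring correctness"
    related: List[str] = field(default_factory=list)  # data, dependencies, papers


def sha256_of(payload: bytes) -> str:
    """Return the hex SHA-256 digest of the deposited bytes."""
    return hashlib.sha256(payload).hexdigest()


def integrity_ok(record: SoftwareRecord, payload: bytes) -> bool:
    """Check that the stored copy still matches the checksum in the record."""
    return record.sha256 == sha256_of(payload)


def stars(record: SoftwareRecord) -> Dict[str, bool]:
    """Map the five proposed stars onto simple checks of the record."""
    return {
        "existence": bool(record.title and record.version),
        "availability": bool(record.archive_url),
        "openness": record.license_id in OPEN_LICENSES,
        "assured": record.has_tests,
        "linked": bool(record.related),
    }


if __name__ == "__main__":
    payload = b"contents of the deposited archive"
    record = SoftwareRecord(
        title="demo-tool",
        version="1.0",
        license_id="MIT",
        archive_url="https://example.org/demo-tool-1.0.tar.gz",
        sha256=sha256_of(payload),
        has_tests=True,
        related=["doi:10.0000/example-dataset"],
    )
    print(stars(record))
    print("integrity ok:", integrity_ok(record, payload))
```

Keeping the checks on the record rather than on the software itself mirrors the "peer-review the metadata, not the software" approach: a repository could evaluate such a record without having to run the deposited code.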

Editor's notes

  1. Steven Gray here at CASA has produced a proof of concept showing the last hour's snowfall in the UK as Tweets and the last 24 hours in postcode districts (the important part here is the data underneath, not the Tweets as such). Based on Ben Marsh’s work.
  2. I ended up doing this because we needed to fix the basics: reproducible research; software credit / career paths; software skills. Drawing on a pool of specialists to drive the continued improvement and impact of research software developed by and for researchers. Providing services for research software users and developers. Developing research community interactions and capacity. Promoting research software best practice and capability.
  3. Clarifying the Purposes and Benefits of Software Preservation: http://softwarepreservation.jiscinvolve.org/wp/about/
  4. There is a spectrum of approaches
  5. Statistics from Greg Wilson. Are academics software developers? Can research consortia manage production? Are timing constraints different? What is the role of the PI in software development management? Are the skills for software and research the same?
  6. c.f. work of James Howison
  7. Based on study done for Cameron Neylon’s Beyond Impact workshop
  8. Is it more important to sustain the software that this workflow references, or the workflow itself?
  9. At what level do you reference, at what level do you deposit?
  10. Made more difficult than data because of the fluidly changing collaborative nature of software development – not just adding to the contributor pool
  11. Based on OR2012 workshop outputs
  12. Want to move towards OSI licenses which are similar in spirit to CC-BY e.g. BSD, Apache
  13. C.f. 5 Stars of Linked Data (Berners-Lee): Available w/ open license, machine-readable, non-proprietary format, open standards, linked to provide context. 5 Stars of Online Journals (Shotton): Peer Review, Open Access, Enriched Content, Available Datasets, Machine-readable metadata. What about community?