SlideShare une entreprise Scribd logo
1  sur  18
Télécharger pour lire hors ligne
1Yolanda GilUSC Information Sciences Institute gil@isi.edu
OntoSoft: A Distributed Semantic Registry for
Scientific Software
Yolanda Gil, Daniel Garijo, Saurabh Mishra, Varun Ratnakar
Information Sciences Institute
and Department of Computer Science
University of Southern California
@yolandagil, @dgarijov
{gil,dgarijo,saurabhm,varunr}@isi.edu
http://www.ontosoft.org
Building Block
Yolanda Gil, Daniel Garijo, Saurabh Mishra, Varun Ratnakar eScience 2016
2Yolanda GilUSC Information Sciences Institute gil@isi.edu
We have all been here…
Yolanda Gil, Daniel Garijo, Saurabh Mishra, Varun Ratnakar eScience 2016
3Yolanda GilUSC Information Sciences Institute gil@isi.edu
The Value of Software: Reproducibility
Financial
Human lives
Reliability
Scientific
integrity
Financial
Trust
5/ 29/ 15, 1:49 AMRetracted Scientific Studies: A Growing List - NYTimes.com
Sections Home Search Skip to content
Advertisement
Email
Share
Tweet
More
Search
Subscribe
Log In 0 Settings
Close search
SUBSCRIBE NOW
5/ 29/ 15, 1:49 AM
a study of changing attitudes about gay marriage is
wal of research results from scientific literature.
e last. A 2011 study in Nature found a 10-fold
during the preceding decade.
ster outside of the scientific field. But in some
ere clawed back made major waves in societal
dealt with. This list recounts some prominent
d since 1980.
Photo
h medical journal,
rew Wakefield
children was
ine for measles,
The Lancet
a review of Dr.
ds and financial
dy, Dr.
rong effect on
d
Yolanda Gil, Daniel Garijo, Saurabh Mishra, Varun Ratnakar eScience 2016
4Yolanda GilUSC Information Sciences Institute gil@isi.edu
Quantifying the Value of Software through
“Reproducibility Maps” [Bourne & Gil et al 12]
 2 months of effort in reproducing published method (in PLoS’10)
 Authors expertise was required
Comparison of
ligand binding
sites
Comparison of dissimilar
protein structures
Graph network
generation
Molecular Docking
Work with P. Bourne of UCSD
5Yolanda GilUSC Information Sciences Institute gil@isi.edu
Software Today
 There are repositories of domain specific software (e.g.,
geosciences)
 There are general software repositories with no standard
metadata
 Most scientists are not aware of the value of their software
Yolanda Gil, Daniel Garijo, Saurabh Mishra, Varun Ratnakar eScience 2016
6Yolanda GilUSC Information Sciences Institute gil@isi.edu
“Dark Software”
 Models that are not
published
• Eg from a PhD thesis
 Data preparation
software
• Data pre-processing and
QC can take up to 80% of a
project’s effort
 Visualization software
“Dark Software” is the counterpart of “Dark Data” [Heidorn 2008]
Yolanda Gil, Daniel Garijo, Saurabh Mishra, Varun Ratnakar eScience 2016
7Yolanda GilUSC Information Sciences Institute gil@isi.edu
Why Is Software Not Shared?
 “Noone would use my code if I shared it”
 “My code is really bad”
 “My code is not ready to be shared”
 “Sharing my software will take a lot of time”
 “I won’t get anything out of sharing my software”
 “I’ve shared software before, bad things happened”
 “I work for the government”
 “I want to commercialize my software”
 “I don’t want anyone to sell my software”
 “I don’t know where to start!”
Yolanda Gil, Daniel Garijo, Saurabh Mishra, Varun Ratnakar eScience 2016
8Yolanda GilUSC Information Sciences Institute gil@isi.edu
Contributions: OntoSoft
Registry for software
• Complements code repositories
• Scientist-centered software metadata
• Community curated software metadata
• Training scientists on best practices
Yolanda Gil, Daniel Garijo, Saurabh Mishra, Varun Ratnakar eScience 2016
9Yolanda GilUSC Information Sciences Institute gil@isi.edu
OntoSoft Architecture
OntoSo
So ware
Metadata
Repository
Ontologies
Geo
sciences
OntoSo
so ware
metadataimportpublish
query
OntoSo User Interface
Publish
Browse/
Search
query
External
Repository
Push
GitHub
Apache
SVN
CSDMS
…
Adapters
(eg, BMI)
CSDMS CF ESMF …
Domain-Specific UI
Standard
Names
OntoSo Training
Lessons
VM
Environment
Generator
Docker
Vagrant
…
…
Solr
Search
Index
Videos
Domain
Ontologies
…
External
Repository
Pull
5/31/2016
Recommend
NOAA
…
OntoSo components
External components
Legend
Other OntoSo
Installa ons
PROV
Web
Access
Control
Metadata
Access
Control
10Yolanda GilUSC Information Sciences Institute gil@isi.edu
The OntoSoft Ontology for Describing
Scientific Software Metadata [Gil et al 2015]
 An ontology for scientific software metadata
• Intended to describe scientific software
• Designed with scientists in mind to guide them to deposit and
describe their software in a software registry
 Major categories of metadata: what does a scientist need?
1. identify software
2. understand what it does and its utility for research,
3. execute the software,
4. get support if questions arise,
5. do research with it, and
6. contribute to its development
Yolanda Gil, Daniel Garijo, Saurabh Mishra, Varun Ratnakar eScience 2016
11Yolanda GilUSC Information Sciences Institute gil@isi.edu
OntoSoft Metadata Categories
http://www.ontosoft.org/software
Yolanda Gil, Daniel Garijo, Saurabh Mishra, Varun Ratnakar eScience 2016
12Yolanda GilUSC Information Sciences Institute gil@isi.edu
Describing Scientific Software in OntoSoft
http://www.ontosoft.org/portal
Metadata can be exported in
several formats (HTML, RDF,
JSON)
Metadata for 3DDY Software
Metadata properties
collected through
simple questions
Set permissions for 3DDY
Metadata properties
organized into categories that
make sense to scientists
Automatic import of metadata
from other repositories
Indicators of metadata
completeness
13Yolanda GilUSC Information Sciences Institute gil@isi.edu
Access control
http://www.ontosoft.org/portal
Yolanda Gil, Daniel Garijo, Saurabh Mishra, Varun Ratnakar eScience 2016
Users and permissions for
the 3DDY software
component
Setting permissions for
editing 3DDY metadata
W3CWeb access control Ontology
14Yolanda GilUSC Information Sciences Institute gil@isi.edu
Software entries
from distributed
repositories are
readily accessible
Semantic
search
Comparison matrix
of software entries
PIHM PIHMgis DrEICH TauDEM WBMsed
nto$
o%$
Metadata
completion
highlighted
Software is
contrasted
by property
15Yolanda GilUSC Information Sciences Institute gil@isi.edu
Community
Learning
UK Software
Institute
Software
Carpentry
CIG
ESMF
Critical Zone
Observatory
Early Career
Advisory Board
FES/
ESIP
CSDMS
EarthCube
Building Blocks
Recommender system
Ÿ Interoperability
Publication
Community
Learning
Structured metadata
Ÿ Interactive advice
Ÿ Best practices
Ÿ Multimedia lessons
Collaborating with SEN C4P EC3
EarthCube
RCNs
Publication
Yolanda Gil, Daniel Garijo, Saurabh Mishra, Varun Ratnakar eScience 2016
Omics
Code meta initiative
16Yolanda GilUSC Information Sciences Institute gil@isi.edu
Conclusions
Software is a valuable
research product
• Must embed best practices of
software sharing into
research activities
Improve productivity,
quality, reproducibility
OntoSoft contributions
• Ontology of scientific
software metadata
• Portal for software registry Do you want to use Ontosoft?
Let us know!
http://www.ontosoft.org
http://www.ontosoft.org/software
http://www.ontosoft.org/portal
Yolanda Gil, Daniel Garijo, Saurabh Mishra, Varun Ratnakar eScience 2016
17Yolanda GilUSC Information Sciences Institute gil@isi.edu
More Information
http://www.ontosoft.org
http://www.ontosoft.org/software
http://www.ontosoft.org/portal
http://www.ontosoft.org/gpf
 OntoSoft: Capturing Scientific Software Metadata. Yolanda Gil, Varun Ratnakar, and Daniel Garijo. Proceedings of the
Eighth ACM International Conference on Knowledge Capture (K-CAP), 2015.
 OntoSoft: A Distributed Semantic Registry for Scientific Software. Yolanda Gil, Daniel Garijo, Saurabh Mishra, and
Varun Ratnakar. Under review, 2016.
 DRAT: An Unobtrusive, Scalable Approach to Large Scale Software License Analysis. Chris A. Mattmann, Ji-Hyun Oh,
Tyler Palsulich, Lewis John McGibbney, Yolanda Gil, and Varun Ratnakar. Proceedings of the Fourth International Workshop
on Software Mining, held in conjunction with the 30th IEEE/ACM International Conference on Automated Software Engineering
(ASE), 2015.
 Cyber-Innovated Watershed Research at the Shale Hills Critical Zone Observatory. Xuan Yu, Chris Duffy, Yolanda Gil,
Lorne Leonard, Gopal Bhatt, and Evan Thomas. IEEE Systems Journal, to appear.
 Collaborative Software Development Needs in Geosciences. Yolanda Gil, Eunyoung Moon and James Howison.
Proceedings of the Second Workshop on Sustainable Software for Science: Practice and Experiences (WSSSPE2), held in conjunction
with the IEEE ACM International Conference on High Performance Computing (SC), New Orleans, LA, November 2014.
 Workflow Reuse in Practice: A Study of Neuroimaging Pipeline Users. Daniel Garijo, Oscar Corcho, Yolanda Gil,
Meredith N. Braskie, Derrek Hibar, Xue Hua, Neda Jahanshad and, Paul Thompson and Arthur W. Toga. Proceedings of
the IEEE Conference on e-Science, 2014.
 FragFlow: Automated Fragment Detection in Scientific Workflows. Daniel Garijo, Oscar Corcho, Yolanda Gil, Boris A.
Gutman, Ivo D. Dinov, Paul Thompson and Arthur W. Toga. Proceedings of the IEEE Conference on e-Science, Guarujua,
Brazil, October 2014.
 An Overview of Mobile Applications for Field Science. Anna Zeng, Kevin Zeng, Yolanda Gil, and Matty Mookerjee.
GeoSoft Project Report, September 2014.
 The CSDMS Standard Names: Cross-Domain Naming Conventions for Describing Process Models, Data Sets and
Their Associated Variables. Scott D. Peckham. Proceedings of the Seventh International Congress on Environmental Modeling
and Software, San Diego, CA, June 2014.
 Web Applications that Share Level-12 HUC Data and Models of the CONUS. Lorne Leonard and Chris Duffy.
Proceedings of the Seventh International Congress on Environmental Modeling and Software, San Diego, CA, June 2014.
 Intelligent Workflow Systems and Provenance-Aware Software. Yolanda Gil. Proceedings of the Seventh International
Congress on Environmental Modeling and Software, San Diego, CA, June 2014.
Yolanda Gil, Daniel Garijo, Saurabh Mishra, Varun Ratnakar eScience 2016
18Yolanda GilUSC Information Sciences Institute gil@isi.edu
Acknowledgements
 The OntoSoft project team includes Chris Duffy (PSU), Chris Mattmann (JPL),
Scott Pechkam (CU), Ji-Hyun Oh (USC), Varun Ratnakar (USC), and Erin
Robinson (ESIP)
 Thank you to James Howison (UT), Lisa Kempler (Matworks), and Greg Wilson
(Software Carpentry) for their feedback on best practices for software sharing
 Thank you to the scientists and other colleagues that have contributed ideas
and asked hard questions about software stewardship
 Thank you to the National Science Foundation and the EarthCube program for
supporting this work
EarthCube!
ICER-1440323
ICER-1343800
http://www.ontosoft.org
http://www.ontosoft.org/software
http://www.ontosoft.org/portal
http://www.ontosoft.org/gpf
Yolanda Gil, Daniel Garijo, Saurabh Mishra, Varun Ratnakar eScience 2016

Contenu connexe

En vedette

Automated Hypothesis Testing with Large Scale Scientific Workflows
Automated Hypothesis Testing with Large Scale Scientific WorkflowsAutomated Hypothesis Testing with Large Scale Scientific Workflows
Automated Hypothesis Testing with Large Scale Scientific Workflowsdgarijo
 
Software Metadata: Describing "dark software" in GeoSciences
Software Metadata: Describing "dark software" in GeoSciencesSoftware Metadata: Describing "dark software" in GeoSciences
Software Metadata: Describing "dark software" in GeoSciencesdgarijo
 
Towards Workflow Ecosystems Through Semantic and Standard Representations
Towards Workflow Ecosystems Through Semantic and Standard RepresentationsTowards Workflow Ecosystems Through Semantic and Standard Representations
Towards Workflow Ecosystems Through Semantic and Standard Representationsdgarijo
 
Common Motifs in Scientific Workflows: An Empirical Analysis
Common Motifs in Scientific Workflows: An Empirical AnalysisCommon Motifs in Scientific Workflows: An Empirical Analysis
Common Motifs in Scientific Workflows: An Empirical Analysisdgarijo
 
LDP4j: A framework for the development of interoperable read-write Linked Da...
LDP4j: A framework for the development of interoperable read-write Linked Da...LDP4j: A framework for the development of interoperable read-write Linked Da...
LDP4j: A framework for the development of interoperable read-write Linked Da...Nandana Mihindukulasooriya
 
Why do they call it Linked Data when they want to say...?
Why do they call it Linked Data when they want to say...?Why do they call it Linked Data when they want to say...?
Why do they call it Linked Data when they want to say...?Oscar Corcho
 
Towards Automating Data Narratives
Towards Automating Data NarrativesTowards Automating Data Narratives
Towards Automating Data Narrativesdgarijo
 

En vedette (7)

Automated Hypothesis Testing with Large Scale Scientific Workflows
Automated Hypothesis Testing with Large Scale Scientific WorkflowsAutomated Hypothesis Testing with Large Scale Scientific Workflows
Automated Hypothesis Testing with Large Scale Scientific Workflows
 
Software Metadata: Describing "dark software" in GeoSciences
Software Metadata: Describing "dark software" in GeoSciencesSoftware Metadata: Describing "dark software" in GeoSciences
Software Metadata: Describing "dark software" in GeoSciences
 
Towards Workflow Ecosystems Through Semantic and Standard Representations
Towards Workflow Ecosystems Through Semantic and Standard RepresentationsTowards Workflow Ecosystems Through Semantic and Standard Representations
Towards Workflow Ecosystems Through Semantic and Standard Representations
 
Common Motifs in Scientific Workflows: An Empirical Analysis
Common Motifs in Scientific Workflows: An Empirical AnalysisCommon Motifs in Scientific Workflows: An Empirical Analysis
Common Motifs in Scientific Workflows: An Empirical Analysis
 
LDP4j: A framework for the development of interoperable read-write Linked Da...
LDP4j: A framework for the development of interoperable read-write Linked Da...LDP4j: A framework for the development of interoperable read-write Linked Da...
LDP4j: A framework for the development of interoperable read-write Linked Da...
 
Why do they call it Linked Data when they want to say...?
Why do they call it Linked Data when they want to say...?Why do they call it Linked Data when they want to say...?
Why do they call it Linked Data when they want to say...?
 
Towards Automating Data Narratives
Towards Automating Data NarrativesTowards Automating Data Narratives
Towards Automating Data Narratives
 

Similaire à OntoSoft: A Distributed Semantic Registry for Scientific Software

Towards Knowledge Graphs of Reusable Research Software Metadata
Towards Knowledge Graphs of Reusable Research Software MetadataTowards Knowledge Graphs of Reusable Research Software Metadata
Towards Knowledge Graphs of Reusable Research Software Metadatadgarijo
 
Software Sustainability: Better Software Better Science
Software Sustainability: Better Software Better ScienceSoftware Sustainability: Better Software Better Science
Software Sustainability: Better Software Better ScienceCarole Goble
 
Better Software, Better Research
Better Software, Better ResearchBetter Software, Better Research
Better Software, Better ResearchCarole Goble
 
PhD Defense Øyvind Hauge
PhD Defense Øyvind HaugePhD Defense Øyvind Hauge
PhD Defense Øyvind HaugeØyvind Hauge
 
Gridforum David De Roure Newe Science 20080402
Gridforum David De Roure Newe Science 20080402Gridforum David De Roure Newe Science 20080402
Gridforum David De Roure Newe Science 20080402vrij
 
8 better ways of doing your engineering project
8 better ways of doing your engineering project8 better ways of doing your engineering project
8 better ways of doing your engineering projecttalkingkarthik
 
RDA BoF on Sustainability - my experience with ISA tools
RDA BoF on Sustainability - my experience with ISA toolsRDA BoF on Sustainability - my experience with ISA tools
RDA BoF on Sustainability - my experience with ISA toolsSusanna-Assunta Sansone
 
Code for science (rev 2)
Code for science (rev 2)Code for science (rev 2)
Code for science (rev 2)Andy Lenards
 
Acting as Advocate? Seven steps for libraries in the data decade
Acting as Advocate? Seven steps for libraries in the data decadeActing as Advocate? Seven steps for libraries in the data decade
Acting as Advocate? Seven steps for libraries in the data decadeLizLyon
 
Usability, Reusability and Reproducibility of Bioinformatic Applications
 Usability, Reusability and Reproducibility of Bioinformatic Applications  Usability, Reusability and Reproducibility of Bioinformatic Applications
Usability, Reusability and Reproducibility of Bioinformatic Applications Sandra Gesing
 
Learning Open Source through GSOC
Learning Open Source through GSOC Learning Open Source through GSOC
Learning Open Source through GSOC smarru
 
A New Paradigm on Analytic-Driven Information and Automation V2.pdf
A New Paradigm on Analytic-Driven Information and Automation V2.pdfA New Paradigm on Analytic-Driven Information and Automation V2.pdf
A New Paradigm on Analytic-Driven Information and Automation V2.pdfArmyTrilidiaDevegaSK
 
The New e-Science (Bangalore Edition)
The New e-Science (Bangalore Edition)The New e-Science (Bangalore Edition)
The New e-Science (Bangalore Edition)David De Roure
 
Spark Summit Europe: Share and analyse genomic data at scale
Spark Summit Europe: Share and analyse genomic data at scaleSpark Summit Europe: Share and analyse genomic data at scale
Spark Summit Europe: Share and analyse genomic data at scaleAndy Petrella
 
Data Science and AI in Biomedicine: The World has Changed
Data Science and AI in Biomedicine: The World has ChangedData Science and AI in Biomedicine: The World has Changed
Data Science and AI in Biomedicine: The World has ChangedPhilip Bourne
 
ACS Summer Institute - Emerging Roles of Librarians - 14_0731
ACS Summer Institute - Emerging Roles of Librarians - 14_0731ACS Summer Institute - Emerging Roles of Librarians - 14_0731
ACS Summer Institute - Emerging Roles of Librarians - 14_0731jeffreylancaster
 
Better software, better service, better research: The Software Sustainabilit...
Better software, better service, better research: The Software Sustainabilit...Better software, better service, better research: The Software Sustainabilit...
Better software, better service, better research: The Software Sustainabilit...Carole Goble
 
myExperiment - Defining the Social Virtual Research Environment
myExperiment - Defining the Social Virtual Research EnvironmentmyExperiment - Defining the Social Virtual Research Environment
myExperiment - Defining the Social Virtual Research EnvironmentDavid De Roure
 
Doing Science Properly In The Digital Age - Rutgers Seminar
Doing Science Properly In The Digital Age - Rutgers SeminarDoing Science Properly In The Digital Age - Rutgers Seminar
Doing Science Properly In The Digital Age - Rutgers SeminarNeil Chue Hong
 
Introduction to Biological Network Analysis and Visualization with Cytoscape ...
Introduction to Biological Network Analysis and Visualization with Cytoscape ...Introduction to Biological Network Analysis and Visualization with Cytoscape ...
Introduction to Biological Network Analysis and Visualization with Cytoscape ...Keiichiro Ono
 

Similaire à OntoSoft: A Distributed Semantic Registry for Scientific Software (20)

Towards Knowledge Graphs of Reusable Research Software Metadata
Towards Knowledge Graphs of Reusable Research Software MetadataTowards Knowledge Graphs of Reusable Research Software Metadata
Towards Knowledge Graphs of Reusable Research Software Metadata
 
Software Sustainability: Better Software Better Science
Software Sustainability: Better Software Better ScienceSoftware Sustainability: Better Software Better Science
Software Sustainability: Better Software Better Science
 
Better Software, Better Research
Better Software, Better ResearchBetter Software, Better Research
Better Software, Better Research
 
PhD Defense Øyvind Hauge
PhD Defense Øyvind HaugePhD Defense Øyvind Hauge
PhD Defense Øyvind Hauge
 
Gridforum David De Roure Newe Science 20080402
Gridforum David De Roure Newe Science 20080402Gridforum David De Roure Newe Science 20080402
Gridforum David De Roure Newe Science 20080402
 
8 better ways of doing your engineering project
8 better ways of doing your engineering project8 better ways of doing your engineering project
8 better ways of doing your engineering project
 
RDA BoF on Sustainability - my experience with ISA tools
RDA BoF on Sustainability - my experience with ISA toolsRDA BoF on Sustainability - my experience with ISA tools
RDA BoF on Sustainability - my experience with ISA tools
 
Code for science (rev 2)
Code for science (rev 2)Code for science (rev 2)
Code for science (rev 2)
 
Acting as Advocate? Seven steps for libraries in the data decade
Acting as Advocate? Seven steps for libraries in the data decadeActing as Advocate? Seven steps for libraries in the data decade
Acting as Advocate? Seven steps for libraries in the data decade
 
Usability, Reusability and Reproducibility of Bioinformatic Applications
 Usability, Reusability and Reproducibility of Bioinformatic Applications  Usability, Reusability and Reproducibility of Bioinformatic Applications
Usability, Reusability and Reproducibility of Bioinformatic Applications
 
Learning Open Source through GSOC
Learning Open Source through GSOC Learning Open Source through GSOC
Learning Open Source through GSOC
 
A New Paradigm on Analytic-Driven Information and Automation V2.pdf
A New Paradigm on Analytic-Driven Information and Automation V2.pdfA New Paradigm on Analytic-Driven Information and Automation V2.pdf
A New Paradigm on Analytic-Driven Information and Automation V2.pdf
 
The New e-Science (Bangalore Edition)
The New e-Science (Bangalore Edition)The New e-Science (Bangalore Edition)
The New e-Science (Bangalore Edition)
 
Spark Summit Europe: Share and analyse genomic data at scale
Spark Summit Europe: Share and analyse genomic data at scaleSpark Summit Europe: Share and analyse genomic data at scale
Spark Summit Europe: Share and analyse genomic data at scale
 
Data Science and AI in Biomedicine: The World has Changed
Data Science and AI in Biomedicine: The World has ChangedData Science and AI in Biomedicine: The World has Changed
Data Science and AI in Biomedicine: The World has Changed
 
ACS Summer Institute - Emerging Roles of Librarians - 14_0731
ACS Summer Institute - Emerging Roles of Librarians - 14_0731ACS Summer Institute - Emerging Roles of Librarians - 14_0731
ACS Summer Institute - Emerging Roles of Librarians - 14_0731
 
Better software, better service, better research: The Software Sustainabilit...
Better software, better service, better research: The Software Sustainabilit...Better software, better service, better research: The Software Sustainabilit...
Better software, better service, better research: The Software Sustainabilit...
 
myExperiment - Defining the Social Virtual Research Environment
myExperiment - Defining the Social Virtual Research EnvironmentmyExperiment - Defining the Social Virtual Research Environment
myExperiment - Defining the Social Virtual Research Environment
 
Doing Science Properly In The Digital Age - Rutgers Seminar
Doing Science Properly In The Digital Age - Rutgers SeminarDoing Science Properly In The Digital Age - Rutgers Seminar
Doing Science Properly In The Digital Age - Rutgers Seminar
 
Introduction to Biological Network Analysis and Visualization with Cytoscape ...
Introduction to Biological Network Analysis and Visualization with Cytoscape ...Introduction to Biological Network Analysis and Visualization with Cytoscape ...
Introduction to Biological Network Analysis and Visualization with Cytoscape ...
 

Plus de dgarijo

FOOPS!: An Ontology Pitfall Scanner for the FAIR principles
FOOPS!: An Ontology Pitfall Scanner for the FAIR principlesFOOPS!: An Ontology Pitfall Scanner for the FAIR principles
FOOPS!: An Ontology Pitfall Scanner for the FAIR principlesdgarijo
 
FAIR Workflows: A step closer to the Scientific Paper of the Future
FAIR Workflows: A step closer to the Scientific Paper of the FutureFAIR Workflows: A step closer to the Scientific Paper of the Future
FAIR Workflows: A step closer to the Scientific Paper of the Futuredgarijo
 
Towards Reusable Research Software
Towards Reusable Research SoftwareTowards Reusable Research Software
Towards Reusable Research Softwaredgarijo
 
SOMEF: a metadata extraction framework from software documentation
SOMEF: a metadata extraction framework from software documentationSOMEF: a metadata extraction framework from software documentation
SOMEF: a metadata extraction framework from software documentationdgarijo
 
A Template-Based Approach for Annotating Long-Tailed Datasets
A Template-Based Approach for Annotating Long-Tailed DatasetsA Template-Based Approach for Annotating Long-Tailed Datasets
A Template-Based Approach for Annotating Long-Tailed Datasetsdgarijo
 
OBA: An Ontology-Based Framework for Creating REST APIs for Knowledge Graphs
OBA: An Ontology-Based Framework for Creating REST APIs for Knowledge GraphsOBA: An Ontology-Based Framework for Creating REST APIs for Knowledge Graphs
OBA: An Ontology-Based Framework for Creating REST APIs for Knowledge Graphsdgarijo
 
Scientific Software Registry Collaboration Workshop: From Software Metadata r...
Scientific Software Registry Collaboration Workshop: From Software Metadata r...Scientific Software Registry Collaboration Workshop: From Software Metadata r...
Scientific Software Registry Collaboration Workshop: From Software Metadata r...dgarijo
 
WDPlus: Leveraging Wikidata to Link and Extend Tabular Data
WDPlus: Leveraging Wikidata to Link and Extend Tabular DataWDPlus: Leveraging Wikidata to Link and Extend Tabular Data
WDPlus: Leveraging Wikidata to Link and Extend Tabular Datadgarijo
 
OKG-Soft: An Open Knowledge Graph With Mathine Readable Scientific Software M...
OKG-Soft: An Open Knowledge Graph With Mathine Readable Scientific Software M...OKG-Soft: An Open Knowledge Graph With Mathine Readable Scientific Software M...
OKG-Soft: An Open Knowledge Graph With Mathine Readable Scientific Software M...dgarijo
 
Towards Human-Guided Machine Learning - IUI 2019
Towards Human-Guided Machine Learning - IUI 2019Towards Human-Guided Machine Learning - IUI 2019
Towards Human-Guided Machine Learning - IUI 2019dgarijo
 
Capturing Context in Scientific Experiments: Towards Computer-Driven Science
Capturing Context in Scientific Experiments: Towards Computer-Driven ScienceCapturing Context in Scientific Experiments: Towards Computer-Driven Science
Capturing Context in Scientific Experiments: Towards Computer-Driven Sciencedgarijo
 
A Controlled Crowdsourcing Approach for Practical Ontology Extensions and Met...
A Controlled Crowdsourcing Approach for Practical Ontology Extensions and Met...A Controlled Crowdsourcing Approach for Practical Ontology Extensions and Met...
A Controlled Crowdsourcing Approach for Practical Ontology Extensions and Met...dgarijo
 
WIDOCO: A Wizard for Documenting Ontologies
WIDOCO: A Wizard for Documenting OntologiesWIDOCO: A Wizard for Documenting Ontologies
WIDOCO: A Wizard for Documenting Ontologiesdgarijo
 
OEG tools for supporting Ontology Engineering
OEG tools for supporting Ontology EngineeringOEG tools for supporting Ontology Engineering
OEG tools for supporting Ontology Engineeringdgarijo
 
Reproducibility Using Semantics: An Overview
Reproducibility Using Semantics: An OverviewReproducibility Using Semantics: An Overview
Reproducibility Using Semantics: An Overviewdgarijo
 
PhD Thesis: Mining abstractions in scientific workflows
PhD Thesis: Mining abstractions in scientific workflowsPhD Thesis: Mining abstractions in scientific workflows
PhD Thesis: Mining abstractions in scientific workflowsdgarijo
 
Publicación de datos y métodos científicos en investigación
Publicación de datos y métodos científicos en investigaciónPublicación de datos y métodos científicos en investigación
Publicación de datos y métodos científicos en investigacióndgarijo
 
EDBT 2015: Summer School Overview
EDBT 2015: Summer School OverviewEDBT 2015: Summer School Overview
EDBT 2015: Summer School Overviewdgarijo
 
Similarity in Wikipedia Articles (EDBT Summer School)
Similarity in Wikipedia Articles (EDBT Summer School)Similarity in Wikipedia Articles (EDBT Summer School)
Similarity in Wikipedia Articles (EDBT Summer School)dgarijo
 
Semantic web 101: Benefits for geologists
Semantic web 101: Benefits for geologistsSemantic web 101: Benefits for geologists
Semantic web 101: Benefits for geologistsdgarijo
 

Plus de dgarijo (20)

FOOPS!: An Ontology Pitfall Scanner for the FAIR principles
FOOPS!: An Ontology Pitfall Scanner for the FAIR principlesFOOPS!: An Ontology Pitfall Scanner for the FAIR principles
FOOPS!: An Ontology Pitfall Scanner for the FAIR principles
 
FAIR Workflows: A step closer to the Scientific Paper of the Future
FAIR Workflows: A step closer to the Scientific Paper of the FutureFAIR Workflows: A step closer to the Scientific Paper of the Future
FAIR Workflows: A step closer to the Scientific Paper of the Future
 
Towards Reusable Research Software
Towards Reusable Research SoftwareTowards Reusable Research Software
Towards Reusable Research Software
 
SOMEF: a metadata extraction framework from software documentation
SOMEF: a metadata extraction framework from software documentationSOMEF: a metadata extraction framework from software documentation
SOMEF: a metadata extraction framework from software documentation
 
A Template-Based Approach for Annotating Long-Tailed Datasets
A Template-Based Approach for Annotating Long-Tailed DatasetsA Template-Based Approach for Annotating Long-Tailed Datasets
A Template-Based Approach for Annotating Long-Tailed Datasets
 
OBA: An Ontology-Based Framework for Creating REST APIs for Knowledge Graphs
OBA: An Ontology-Based Framework for Creating REST APIs for Knowledge GraphsOBA: An Ontology-Based Framework for Creating REST APIs for Knowledge Graphs
OBA: An Ontology-Based Framework for Creating REST APIs for Knowledge Graphs
 
Scientific Software Registry Collaboration Workshop: From Software Metadata r...
Scientific Software Registry Collaboration Workshop: From Software Metadata r...Scientific Software Registry Collaboration Workshop: From Software Metadata r...
Scientific Software Registry Collaboration Workshop: From Software Metadata r...
 
WDPlus: Leveraging Wikidata to Link and Extend Tabular Data
WDPlus: Leveraging Wikidata to Link and Extend Tabular DataWDPlus: Leveraging Wikidata to Link and Extend Tabular Data
WDPlus: Leveraging Wikidata to Link and Extend Tabular Data
 
OKG-Soft: An Open Knowledge Graph With Mathine Readable Scientific Software M...
OKG-Soft: An Open Knowledge Graph With Mathine Readable Scientific Software M...OKG-Soft: An Open Knowledge Graph With Mathine Readable Scientific Software M...
OKG-Soft: An Open Knowledge Graph With Mathine Readable Scientific Software M...
 
Towards Human-Guided Machine Learning - IUI 2019
Towards Human-Guided Machine Learning - IUI 2019Towards Human-Guided Machine Learning - IUI 2019
Towards Human-Guided Machine Learning - IUI 2019
 
Capturing Context in Scientific Experiments: Towards Computer-Driven Science
Capturing Context in Scientific Experiments: Towards Computer-Driven ScienceCapturing Context in Scientific Experiments: Towards Computer-Driven Science
Capturing Context in Scientific Experiments: Towards Computer-Driven Science
 
A Controlled Crowdsourcing Approach for Practical Ontology Extensions and Met...
A Controlled Crowdsourcing Approach for Practical Ontology Extensions and Met...A Controlled Crowdsourcing Approach for Practical Ontology Extensions and Met...
A Controlled Crowdsourcing Approach for Practical Ontology Extensions and Met...
 
WIDOCO: A Wizard for Documenting Ontologies
WIDOCO: A Wizard for Documenting OntologiesWIDOCO: A Wizard for Documenting Ontologies
WIDOCO: A Wizard for Documenting Ontologies
 
OEG tools for supporting Ontology Engineering
OEG tools for supporting Ontology EngineeringOEG tools for supporting Ontology Engineering
OEG tools for supporting Ontology Engineering
 
Reproducibility Using Semantics: An Overview
Reproducibility Using Semantics: An OverviewReproducibility Using Semantics: An Overview
Reproducibility Using Semantics: An Overview
 
PhD Thesis: Mining abstractions in scientific workflows
PhD Thesis: Mining abstractions in scientific workflowsPhD Thesis: Mining abstractions in scientific workflows
PhD Thesis: Mining abstractions in scientific workflows
 
Publicación de datos y métodos científicos en investigación
Publicación de datos y métodos científicos en investigaciónPublicación de datos y métodos científicos en investigación
Publicación de datos y métodos científicos en investigación
 
EDBT 2015: Summer School Overview
EDBT 2015: Summer School OverviewEDBT 2015: Summer School Overview
EDBT 2015: Summer School Overview
 
Similarity in Wikipedia Articles (EDBT Summer School)
Similarity in Wikipedia Articles (EDBT Summer School)Similarity in Wikipedia Articles (EDBT Summer School)
Similarity in Wikipedia Articles (EDBT Summer School)
 
Semantic web 101: Benefits for geologists
Semantic web 101: Benefits for geologistsSemantic web 101: Benefits for geologists
Semantic web 101: Benefits for geologists
 

Dernier

%in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park %in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park masabamasaba
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...panagenda
 
AI & Machine Learning Presentation Template
AI & Machine Learning Presentation TemplateAI & Machine Learning Presentation Template
AI & Machine Learning Presentation TemplatePresentation.STUDIO
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfkalichargn70th171
 
BUS PASS MANGEMENT SYSTEM USING PHP.pptx
BUS PASS MANGEMENT SYSTEM USING PHP.pptxBUS PASS MANGEMENT SYSTEM USING PHP.pptx
BUS PASS MANGEMENT SYSTEM USING PHP.pptxalwaysnagaraju26
 
%in kempton park+277-882-255-28 abortion pills for sale in kempton park
%in kempton park+277-882-255-28 abortion pills for sale in kempton park %in kempton park+277-882-255-28 abortion pills for sale in kempton park
%in kempton park+277-882-255-28 abortion pills for sale in kempton park masabamasaba
 
The Top App Development Trends Shaping the Industry in 2024-25 .pdf
The Top App Development Trends Shaping the Industry in 2024-25 .pdfThe Top App Development Trends Shaping the Industry in 2024-25 .pdf
The Top App Development Trends Shaping the Industry in 2024-25 .pdfayushiqss
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfkalichargn70th171
 
%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrandmasabamasaba
 
Exploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdfExploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdfproinshot.com
 
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdf
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdfAzure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdf
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdfryanfarris8
 
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM TechniquesAI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM TechniquesVictorSzoltysek
 
The title is not connected to what is inside
The title is not connected to what is insideThe title is not connected to what is inside
The title is not connected to what is insideshinachiaurasa2
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsAlberto González Trastoy
 
Chinsurah Escorts ☎️8617697112 Starting From 5K to 15K High Profile Escorts ...
Chinsurah Escorts ☎️8617697112  Starting From 5K to 15K High Profile Escorts ...Chinsurah Escorts ☎️8617697112  Starting From 5K to 15K High Profile Escorts ...
Chinsurah Escorts ☎️8617697112 Starting From 5K to 15K High Profile Escorts ...Nitya salvi
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Steffen Staab
 
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...Shane Coughlan
 
Sector 18, Noida Call girls :8448380779 Model Escorts | 100% verified
Sector 18, Noida Call girls :8448380779 Model Escorts | 100% verifiedSector 18, Noida Call girls :8448380779 Model Escorts | 100% verified
Sector 18, Noida Call girls :8448380779 Model Escorts | 100% verifiedDelhi Call girls
 
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrainmasabamasaba
 
ManageIQ - Sprint 236 Review - Slide Deck
ManageIQ - Sprint 236 Review - Slide DeckManageIQ - Sprint 236 Review - Slide Deck
ManageIQ - Sprint 236 Review - Slide DeckManageIQ
 

Dernier (20)

%in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park %in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
 
AI & Machine Learning Presentation Template
AI & Machine Learning Presentation TemplateAI & Machine Learning Presentation Template
AI & Machine Learning Presentation Template
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
 
BUS PASS MANGEMENT SYSTEM USING PHP.pptx
BUS PASS MANGEMENT SYSTEM USING PHP.pptxBUS PASS MANGEMENT SYSTEM USING PHP.pptx
BUS PASS MANGEMENT SYSTEM USING PHP.pptx
 
%in kempton park+277-882-255-28 abortion pills for sale in kempton park
%in kempton park+277-882-255-28 abortion pills for sale in kempton park %in kempton park+277-882-255-28 abortion pills for sale in kempton park
%in kempton park+277-882-255-28 abortion pills for sale in kempton park
 
The Top App Development Trends Shaping the Industry in 2024-25 .pdf
The Top App Development Trends Shaping the Industry in 2024-25 .pdfThe Top App Development Trends Shaping the Industry in 2024-25 .pdf
The Top App Development Trends Shaping the Industry in 2024-25 .pdf
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
 
%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand
 
Exploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdfExploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdf
 
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdf
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdfAzure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdf
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdf
 
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM TechniquesAI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
 
The title is not connected to what is inside
The title is not connected to what is insideThe title is not connected to what is inside
The title is not connected to what is inside
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
 
Chinsurah Escorts ☎️8617697112 Starting From 5K to 15K High Profile Escorts ...
Chinsurah Escorts ☎️8617697112  Starting From 5K to 15K High Profile Escorts ...Chinsurah Escorts ☎️8617697112  Starting From 5K to 15K High Profile Escorts ...
Chinsurah Escorts ☎️8617697112 Starting From 5K to 15K High Profile Escorts ...
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
 
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
 
Sector 18, Noida Call girls :8448380779 Model Escorts | 100% verified
Sector 18, Noida Call girls :8448380779 Model Escorts | 100% verifiedSector 18, Noida Call girls :8448380779 Model Escorts | 100% verified
Sector 18, Noida Call girls :8448380779 Model Escorts | 100% verified
 
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
 
ManageIQ - Sprint 236 Review - Slide Deck
ManageIQ - Sprint 236 Review - Slide DeckManageIQ - Sprint 236 Review - Slide Deck
ManageIQ - Sprint 236 Review - Slide Deck
 

OntoSoft: A Distributed Semantic Registry for Scientific Software

  • 1. 1Yolanda GilUSC Information Sciences Institute gil@isi.edu OntoSoft: A Distributed Semantic Registry for Scientific Software Yolanda Gil, Daniel Garijo, Saurabh Mishra, Varun Ratnakar Information Sciences Institute and Department of Computer Science University of Southern California @yolandagil, @dgarijov {gil,dgarijo,saurabhm,varunr}@isi.edu http://www.ontosoft.org Building Block Yolanda Gil, Daniel Garijo, Saurabh Mishra, Varun Ratnakar eScience 2016
  • 2. 2Yolanda GilUSC Information Sciences Institute gil@isi.edu We have all been here… Yolanda Gil, Daniel Garijo, Saurabh Mishra, Varun Ratnakar eScience 2016
  • 3. 3Yolanda GilUSC Information Sciences Institute gil@isi.edu The Value of Software: Reproducibility Financial Human lives Reliability Scientific integrity Financial Trust 5/ 29/ 15, 1:49 AMRetracted Scientific Studies: A Growing List - NYTimes.com Sections Home Search Skip to content Advertisement Email Share Tweet More Search Subscribe Log In 0 Settings Close search SUBSCRIBE NOW 5/ 29/ 15, 1:49 AM a study of changing attitudes about gay marriage is wal of research results from scientific literature. e last. A 2011 study in Nature found a 10-fold during the preceding decade. ster outside of the scientific field. But in some ere clawed back made major waves in societal dealt with. This list recounts some prominent d since 1980. Photo h medical journal, rew Wakefield children was ine for measles, The Lancet a review of Dr. ds and financial dy, Dr. rong effect on d Yolanda Gil, Daniel Garijo, Saurabh Mishra, Varun Ratnakar eScience 2016
  • 4. 4Yolanda GilUSC Information Sciences Institute gil@isi.edu Quantifying the Value of Software through “Reproducibility Maps” [Bourne & Gil et al 12]  2 months of effort in reproducing published method (in PLoS’10)  Authors expertise was required Comparison of ligand binding sites Comparison of dissimilar protein structures Graph network generation Molecular Docking Work with P. Bourne of UCSD
  • 5. 5Yolanda GilUSC Information Sciences Institute gil@isi.edu Software Today  There are repositories of domain specific software (e.g., geosciences)  There are general software repositories with no standard metadata  Most scientists are not aware of the value of their software Yolanda Gil, Daniel Garijo, Saurabh Mishra, Varun Ratnakar eScience 2016
  • 6. 6Yolanda GilUSC Information Sciences Institute gil@isi.edu “Dark Software”  Models that are not published • Eg from a PhD thesis  Data preparation software • Data pre-processing and QC can take up to 80% of a project’s effort  Visualization software “Dark Software” is the counterpart of “Dark Data” [Heidorn 2008] Yolanda Gil, Daniel Garijo, Saurabh Mishra, Varun Ratnakar eScience 2016
  • 7. 7Yolanda GilUSC Information Sciences Institute gil@isi.edu Why Is Software Not Shared?  “Noone would use my code if I shared it”  “My code is really bad”  “My code is not ready to be shared”  “Sharing my software will take a lot of time”  “I won’t get anything out of sharing my software”  “I’ve shared software before, bad things happened”  “I work for the government”  “I want to commercialize my software”  “I don’t want anyone to sell my software”  “I don’t know where to start!” Yolanda Gil, Daniel Garijo, Saurabh Mishra, Varun Ratnakar eScience 2016
  • 8. 8Yolanda GilUSC Information Sciences Institute gil@isi.edu Contributions: OntoSoft Registry for software • Complements code repositories • Scientist-centered software metadata • Community curated software metadata • Training scientists on best practices Yolanda Gil, Daniel Garijo, Saurabh Mishra, Varun Ratnakar eScience 2016
  • 9. 9Yolanda GilUSC Information Sciences Institute gil@isi.edu OntoSoft Architecture OntoSo So ware Metadata Repository Ontologies Geo sciences OntoSo so ware metadataimportpublish query OntoSo User Interface Publish Browse/ Search query External Repository Push GitHub Apache SVN CSDMS … Adapters (eg, BMI) CSDMS CF ESMF … Domain-Specific UI Standard Names OntoSo Training Lessons VM Environment Generator Docker Vagrant … … Solr Search Index Videos Domain Ontologies … External Repository Pull 5/31/2016 Recommend NOAA … OntoSo components External components Legend Other OntoSo Installa ons PROV Web Access Control Metadata Access Control
  • 10. 10Yolanda GilUSC Information Sciences Institute gil@isi.edu The OntoSoft Ontology for Describing Scientific Software Metadata [Gil et al 2015]  An ontology for scientific software metadata • Intended to describe scientific software • Designed with scientists in mind to guide them to deposit and describe their software in a software registry  Major categories of metadata: what does a scientist need? 1. identify software 2. understand what it does and its utility for research, 3. execute the software, 4. get support if questions arise, 5. do research with it, and 6. contribute to its development Yolanda Gil, Daniel Garijo, Saurabh Mishra, Varun Ratnakar eScience 2016
  • 11. 11Yolanda GilUSC Information Sciences Institute gil@isi.edu OntoSoft Metadata Categories http://www.ontosoft.org/software Yolanda Gil, Daniel Garijo, Saurabh Mishra, Varun Ratnakar eScience 2016
  • 12. 12Yolanda GilUSC Information Sciences Institute gil@isi.edu Describing Scientific Software in OntoSoft http://www.ontosoft.org/portal Metadata can be exported in several formats (HTML, RDF, JSON) Metadata for 3DDY Software Metadata properties collected through simple questions Set permissions for 3DDY Metadata properties organized into categories that make sense to scientists Automatic import of metadata from other repositories Indicators of metadata completeness
  • 13. 13Yolanda GilUSC Information Sciences Institute gil@isi.edu Access control http://www.ontosoft.org/portal Yolanda Gil, Daniel Garijo, Saurabh Mishra, Varun Ratnakar eScience 2016 Users and permissions for the 3DDY software component Setting permissions for editing 3DDY metadata W3CWeb access control Ontology
  • 14. 14Yolanda GilUSC Information Sciences Institute gil@isi.edu Software entries from distributed repositories are readily accessible Semantic search Comparison matrix of software entries PIHM PIHMgis DrEICH TauDEM WBMsed nto$ o%$ Metadata completion highlighted Software is contrasted by property
  • 15. 15Yolanda GilUSC Information Sciences Institute gil@isi.edu Community Learning UK Software Institute Software Carpentry CIG ESMF Critical Zone Observatory Early Career Advisory Board FES/ ESIP CSDMS EarthCube Building Blocks Recommender system Ÿ Interoperability Publication Community Learning Structured metadata Ÿ Interactive advice Ÿ Best practices Ÿ Multimedia lessons Collaborating with SEN C4P EC3 EarthCube RCNs Publication Yolanda Gil, Daniel Garijo, Saurabh Mishra, Varun Ratnakar eScience 2016 Omics Code meta initiative
  • 16. 16Yolanda GilUSC Information Sciences Institute gil@isi.edu Conclusions Software is a valuable research product • Must embed best practices of software sharing into research activities Improve productivity, quality, reproducibility OntoSoft contributions • Ontology of scientific software metadata • Portal for software registry Do you want to use Ontosoft? Let us know! http://www.ontosoft.org http://www.ontosoft.org/software http://www.ontosoft.org/portal Yolanda Gil, Daniel Garijo, Saurabh Mishra, Varun Ratnakar eScience 2016
  • 17. 17Yolanda GilUSC Information Sciences Institute gil@isi.edu More Information http://www.ontosoft.org http://www.ontosoft.org/software http://www.ontosoft.org/portal http://www.ontosoft.org/gpf  OntoSoft: Capturing Scientific Software Metadata. Yolanda Gil, Varun Ratnakar, and Daniel Garijo. Proceedings of the Eighth ACM International Conference on Knowledge Capture (K-CAP), 2015.  OntoSoft: A Distributed Semantic Registry for Scientific Software. Yolanda Gil, Daniel Garijo, Saurabh Mishra, and Varun Ratnakar. Under review, 2016.  DRAT: An Unobtrusive, Scalable Approach to Large Scale Software License Analysis. Chris A. Mattmann, Ji-Hyun Oh, Tyler Palsulich, Lewis John McGibbney, Yolanda Gil, and Varun Ratnakar. Proceedings of the Fourth International Workshop on Software Mining, held in conjunction with the 30th IEEE/ACM International Conference on Automated Software Engineering (ASE), 2015.  Cyber-Innovated Watershed Research at the Shale Hills Critical Zone Observatory. Xuan Yu, Chris Duffy, Yolanda Gil, Lorne Leonard, Gopal Bhatt, and Evan Thomas. IEEE Systems Journal, to appear.  Collaborative Software Development Needs in Geosciences. Yolanda Gil, Eunyoung Moon and James Howison. Proceedings of the Second Workshop on Sustainable Software for Science: Practice and Experiences (WSSSPE2), held in conjunction with the IEEE ACM International Conference on High Performance Computing (SC), New Orleans, LA, November 2014.  Workflow Reuse in Practice: A Study of Neuroimaging Pipeline Users. Daniel Garijo, Oscar Corcho, Yolanda Gil, Meredith N. Braskie, Derrek Hibar, Xue Hua, Neda Jahanshad and, Paul Thompson and Arthur W. Toga. Proceedings of the IEEE Conference on e-Science, 2014.  FragFlow: Automated Fragment Detection in Scientific Workflows. Daniel Garijo, Oscar Corcho, Yolanda Gil, Boris A. Gutman, Ivo D. Dinov, Paul Thompson and Arthur W. Toga. Proceedings of the IEEE Conference on e-Science, Guarujua, Brazil, October 2014.  An Overview of Mobile Applications for Field Science. Anna Zeng, Kevin Zeng, Yolanda Gil, and Matty Mookerjee. GeoSoft Project Report, September 2014.  The CSDMS Standard Names: Cross-Domain Naming Conventions for Describing Process Models, Data Sets and Their Associated Variables. Scott D. Peckham. Proceedings of the Seventh International Congress on Environmental Modeling and Software, San Diego, CA, June 2014.  Web Applications that Share Level-12 HUC Data and Models of the CONUS. Lorne Leonard and Chris Duffy. Proceedings of the Seventh International Congress on Environmental Modeling and Software, San Diego, CA, June 2014.  Intelligent Workflow Systems and Provenance-Aware Software. Yolanda Gil. Proceedings of the Seventh International Congress on Environmental Modeling and Software, San Diego, CA, June 2014. Yolanda Gil, Daniel Garijo, Saurabh Mishra, Varun Ratnakar eScience 2016
  • 18. 18Yolanda GilUSC Information Sciences Institute gil@isi.edu Acknowledgements  The OntoSoft project team includes Chris Duffy (PSU), Chris Mattmann (JPL), Scott Pechkam (CU), Ji-Hyun Oh (USC), Varun Ratnakar (USC), and Erin Robinson (ESIP)  Thank you to James Howison (UT), Lisa Kempler (Matworks), and Greg Wilson (Software Carpentry) for their feedback on best practices for software sharing  Thank you to the scientists and other colleagues that have contributed ideas and asked hard questions about software stewardship  Thank you to the National Science Foundation and the EarthCube program for supporting this work EarthCube! ICER-1440323 ICER-1343800 http://www.ontosoft.org http://www.ontosoft.org/software http://www.ontosoft.org/portal http://www.ontosoft.org/gpf Yolanda Gil, Daniel Garijo, Saurabh Mishra, Varun Ratnakar eScience 2016