Data Science Meets Open Scholarship – What Comes Next?
1. Data Science Meets Open Scholarship – What Comes Next?
Philip E. Bourne
Founding Dean & Prof of Data Science and Biomedical
Engineering
University of Virginia, USA
Beilstein Open Science Symposium, October 6, 202
2. My Perspective/Bias
• US centric – sorry
• Long standing biomedical researcher
• Co-Founder and Founding Editor-in-Chief of PLOS Computational Biology
• First President of FORCE11
• Involved in FAIR
• Associate Director for Data Science, NIH – preprints, data sharing etc.
• Institutional builder – Dean, School of Data Science
3. Data Science – in 40+ Years in Academia I Have Never Seen Anything Like It
• It is part of the digital transformation of society
• It is touching every discipline
• We cannot keep the students out of our classes
• Cause – large amounts of digital data
• Effect – ever greater openness
4. In 40+ Years in Academia I Have Never Seen Anything Like It… But There is a Precedent
http://www.ornl.gov/hgmis
• High throughput DNA digital data changed how we think about biomedicine
• Spawned a new field – bioinformatics / computational biology / systems biology / biomedical data science
6. Genomics - A Culture of Sharing
NIH Timeline
• 1999 – Research Tools Policy
• 2003 – NIH Data Sharing Policy
• 2004 – Model Organism Policy
• 2007 – Genome-wide Association (GWAS) Policy
• 2008 – NIH Public Access Policy (Publications)
• 2012 – Big Data to Knowledge (BD2K) Initiative
• 2013 – White House Initiative (“Holdren Memo”)
• 2014 – Genomic Data Sharing (GDS) Policy
• Modernization of NIH Clinical Trials
• 2023 – Revised NIH Data Sharing Policy
7. What is Data Science?
https://techstory.in/mastering-data-science-answers-to-the-most-common-questions-and-answers/
Touching all domains
9. One Example – UN Goal 10 Reduce Inequalities
• Historian meets data scientist to exploit parish records to understand the social networks of Native Americans and how and why they were displaced
• Takeaways –
• To study history is not to repeat it (maybe)
• It's not all about STEM
• History, like every other field, is awash in digital data (in this case, text that can be mined), which changes the academy in profound ways
10. Without open data, methods, protocols, workflows etc. it is questionable whether data science would exist as a field
11. Borgman and Bourne 2021
https://arxiv.org/pdf/2109.01694.pdf
https://theapopkavoice.com/it-takes-a-village-to-raise-a-child-lets-talk-about-it/
12. Institutions are Part of That Village
• Open scholarship is the remit of the VP for Research –
• Bad news - many responsibilities
• Good news – compliance
• Open scholarship not well understood by leadership
• Fractionated funding models impede institutional action
• Libraries are under some threat
• Higher education more generally is in a state of transition
13. Questions Institutions Should Ask Themselves?
• What impact will the field of data science have?
• Will eating one’s own dog food do what it did for tech, retail, finance etc.?
• Will movement away from institutional compute centers towards cloud computing impact open scholarship?
• Will funding mandates move the needle in institutions?
• Will the researchers themselves make a difference?
• Will libraries play a key role?
16. Guiding Principles
• Be constantly strategic and nimble - think supply chain
• Be sustainable - do not over reach
• Be interdisciplinary
• Be an organization without walls
• Be diverse, accessible and open
• Be team-driven, not individually driven
• Strive for quality not quantity in education, research & service
• Be innovative and translational
17. Recall - Without open data, methods and protocols it is questionable whether data science would exist as a field
It is only fair that data scientists give back
But how???
18. Timeline
• August 2020 – Dean popularizes the idea of an open access policy for the School
• Sept 2020 – Faculty subcommittee debates the pros and cons; ex officio member from the library advises on copyright etc.
• Jan 2021 – A school cannot set a policy, only guidelines
• Feb 2021 – Unanimously approved by school faculty and broadened to all research products
• Feb 2021 – SPARC features the guidelines
• May 2021 – University-wide faculty senate adopts guidelines
• ?? – Will become university policy
• Oct 2021 – NASEM convenes meeting of university leaders to form a cohort in support of open scholarship
19. Future Timeline
• Working with our library, their instance of Dataverse and other tools
• Committed to ORCID as our bibliographic reference
• Seeking philanthropy
• Overall – adoption still requires resources and incentives