5-14-13 An Introduction to VIVO Presentation Slides

May 14, 2013 Hot Topics: DuraSpace Community Webinar Series
Hot Topics: The DuraSpace
Community Webinar Series
Series Five:
“VIVO: Research Discovery &
Networking”
Curated by Dean Krafft

Webinar 1: Overview of VIVO
Presented by:
Brian Lowe, Semantic Applications Programmer, Cornell
Jon Corson-Rikert, VIVO Development Lead, Cornell
Dean Krafft, Chief Technology Strategist at Cornell
University Library and Chair of the VIVO-DuraSpace
Management Committee

What is VIVO?
• A semantic-web-based researcher and
research discovery tool
– People plus much more
• Institution-wide, publicly-visible
information
– For external as well as internal audiences
• An open, shared platform for connecting
scholars, communities, campuses, and
countries using Linked Open Data

How did we get here?
31 authors
6 institutions

A brief VIVO history
2003-2005 First realization for the life sciences at
Cornell, as a relational database
2006-2008 Expansion to all disciplines at Cornell,
and conversion to Semantic Web
2009-2012 National Institutes of Health-sponsored
VIVO: Enabling the National Networking
of Scientists project transforms VIVO to
a multi-institutional open source
platform
2013-2014 VIVO Incubator Project with DuraSpace
for open community development

Major opportunity, 2009
NIH … “invites applications
designed to develop, enhance, or
extend infrastructure for
connecting people and
resources to facilitate
national discovery of
individuals and of scientific
resources by scientists and
students to encourage
interdisciplinary
collaboration and scientific
exchange.”

VIVO Collaboration
Cornell University
Dean Krafft (Cornell PI)
Manolo Bevia
Jim Blake
Nick Cappadona
Brian Caruso
Jon Corson-Rikert
Elly Cramer
Medha Devare
Elizabeth Hines
Huda Khan
Depak Konidena
Brian Lowe
Joseph McEnerney
Holly Mistlebauer
Stella Mitchell
Anup Sawant
Christopher Westling
Tim Worrall
Rebecca Younes
University of Florida
Mike Conlon (VIVO and UF PI)
Beth Auten
Michael Barbieri
Chris Barnes
Kaitlin Blackburn
Cecilia Botero
Kerry Britt
Erin Brooks
Amy Buhler
Ellie Bushhousen
Linda Butson
Chris Case
Christine Cogar
Valrie Davis
Mary Edwards
Nita Ferree
Rolando Garcia-Milan
George Hack
Chris Haines
Sara Henning
Rae Jesano
Margeaux Johnson
Meghan Latorre
Yang Li
Jennifer Lyon
Paula Markes
Hannah Norton
James Pence
Narayan Raum
Nicholas Rejack
Alexander Rockwell
Sara Russell Gonzalez
Nancy Schaefer
Dale Scheppler
Nicholas Skaggs
Matthew Tedder
Michele R. Tennant
Alicia Turner
Stephen Williams
Indiana University
Katy Borner (IU PI)
Kavitha Chandrasekar
Bin Chen
Shanshan Chen
Ryan Cobine
Jeni Coffey
Suresh Deivasigamani
Ying Ding
Russell Duhon
Jon Dunn
Poornima Gopinath
Julie Hardesty
Brian Keese
Namrata Lele
Micah Linnemeier
Nianli Ma
Robert H. McDonald
Asik Pradhan Gongaju
Mark Price
Michael Stamper
Yuyin Sun
Chintan Tank
Alan Walsh
Brian Wheeler
Feng Wu
Angela Zoss
Ponce School of Medicine
Richard J. Noel, Jr. (Ponce PI)
Ricardo Espada Colon
Damaris Torres Cruz
Michael Vega Negrón
This project is funded by the National Institutes of Health, U24 RR029822
“VIVO: Enabling National Networking of Scientists”
The Scripps Research
Institute
Gerald Joyce (Scripps PI)
Catherine Dunn
Sam Katkov
Brant Kelley
Paula King
Angela Murrell
Barbara Noble
Cary Thomas
Michaeleen Trimarchi
Washington University School of
Medicine in St. Louis
Rakesh Nagarajan (WUSTL PI)
Kristi L. Holmes
Caerie Houchins
George Joseph
Sunita B. Koul
Leslie D. McIntosh
Weill Cornell Medical College
Curtis Cole (Weill PI)
Paul Albert
Victor Brodsky
Mark Bronnimann
Adam Cheriff
Oscar Cruz
Dan Dickinson
Richard Hu
Chris Huang
Itay Klaz
Kenneth Lee
Peter Michelini
Grace Migliorisi
John Ruffing
Jason Specland
Tru Tran
Vinay Varughese
Virgil Wong

What does VIVO do?
• Integrates multiple sources of data
– Systems of record
– Faculty activity reporting
– External sources (e.g., Scopus, PubMed,
NIH RePORTER)
• Provides a review and editing interface
– Single sign-on for self-editing or by
proxy
• Provides integrated, filterable feeds to
other websites

Structured data for
visualizations

Enabling an (inter)national network
• Open software
• Open data
• Local control
• Decentralized infrastructure

What does VIVO model?
• People and more
– Organizations, grants, programs, projects,
publications, events, facilities, and research
resources
• Relationships among the above
– Meaningful
– Bidirectional
– Navigable context
• Links to URIs elsewhere
– Concepts, identifiers
– People, places, organizations, events

Typical data sources
• HR – people, appointments
• Research administration – grants & contracts
• Registrar – courses
• Faculty reporting system(s)
– publications, service, research areas, awards
• Events calendar
• Internal and external news
• External repositories – e.g., PubMed, Scopus

Value for institutions
• Common data substrate
– Public, granular and direct
– Discovery via external and internal search
engines
– Available for reuse at many levels
• Distributed curation
– E.g., affiliations beyond what HR system tracks
– Data coordination across functional silos
– Feeding changes back to systems of record
– Direct linking across campuses
• Data that is visible gets fixed

The Semantic Web
• Turn data into a web of simple links
• Use ontology to explain how things are
linked
• Use reasoning to add new links
automatically
• Be flexible and extensible
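
The bullets above can be illustrated with a minimal sketch. It assumes data stored as plain (subject, predicate, object) links, a tiny "ontology" that only declares inverse properties, and a reasoning step that adds the inverse links automatically. All identifiers (`ex:alice`, `ex:authorOf`, and so on) are invented for illustration and are not terms from the VIVO ontology.

```python
# Illustrative sketch only: every identifier here is hypothetical,
# not taken from the VIVO ontology or any VIVO installation.

# Data as a web of simple links: (subject, predicate, object).
data = {
    ("ex:alice", "ex:authorOf", "ex:paper1"),
    ("ex:bob", "ex:authorOf", "ex:paper1"),
    ("ex:alice", "ex:memberOf", "ex:libraryDept"),
}

# A tiny "ontology": each property is declared with its inverse.
inverse_of = {
    "ex:authorOf": "ex:hasAuthor",
    "ex:memberOf": "ex:hasMember",
}


def infer_inverses(triples, inverses):
    """Return the input triples plus the inverse of each link whose
    predicate has a declared inverse."""
    inferred = set(triples)
    for subject, predicate, obj in triples:
        if predicate in inverses:
            inferred.add((obj, inverses[predicate], subject))
    return inferred


def links_from(triples, node):
    """List the (predicate, object) pairs that start at a node."""
    return sorted((p, o) for s, p, o in triples if s == node)


graph = infer_inverses(data, inverse_of)
print(links_from(graph, "ex:paper1"))
```

After reasoning, the paper can be navigated back to its authors even though only the author-to-paper direction was entered, which is the sense in which relationships become bidirectional.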

The VIVO ontology
• Describe people and organizations in
the process of doing research
• Stay discipline neutral
• Use existing scientific domain
terminology to describe content of
research

What is Linked Open Data (LOD)?
• Data
– Structured information, not just documents
with text
– A common, simple format
• Open
– Available, visible, mine-able
– Anyone can post, consume, and reuse
• Linked
– Directly by reference
– Indirectly through common references and
inference
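
As a rough sketch of "indirectly through common references": two sites publish their own data, never link to each other, yet become connected because both point at the same concept URI. The URIs, the `researchArea` predicate, and the function name below are assumptions made up for this example.

```python
from collections import defaultdict

# Hypothetical data published independently by two sites.
site_a = [
    ("http://a.example.edu/person/1", "researchArea",
     "http://vocab.example.org/concept/climate"),
    ("http://a.example.edu/person/2", "researchArea",
     "http://vocab.example.org/concept/genomics"),
]
site_b = [
    ("http://b.example.edu/person/9", "researchArea",
     "http://vocab.example.org/concept/climate"),
]


def people_by_shared_concept(*datasets):
    """Group people by the concept URI they reference and keep only
    concepts referenced by more than one person."""
    by_concept = defaultdict(set)
    for dataset in datasets:
        for subject, predicate, obj in dataset:
            if predicate == "researchArea":
                by_concept[obj].add(subject)
    return {
        concept: sorted(people)
        for concept, people in by_concept.items()
        if len(people) > 1
    }


shared = people_by_shared_concept(site_a, site_b)
print(shared)
```

Neither site mentions the other; the connection exists only because both reuse one identifier for the concept.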

Linked data indexed for search
[Diagram. Linked Open Data sources shown: Ponce VIVO, WashU VIVO, IU VIVO, Cornell Ithaca VIVO, Weill Cornell VIVO, UF VIVO, Scripps VIVO, other VIVOs, eagle-i research resources, Harvard Profiles RDF, Digital Vita RDF, Iowa Loki RDF. Other labels: vivosearch.org, Solr search index, another Solr index.]

Implementation challenges
• A simple idea – take the basic public
information about researchers at Cornell
and make it easy to find for academic
purposes
• Why is this hard?

Policy issues
• Dirty data
• Lack even of common definitions of
organizations or who’s faculty
• Data ownership
• Many dimensions of privacy
• Short-term “go it alone” vs. common
good

Enter data once, use it many times

Weill Cornell research reporting
• How has the number of publications
co-authored with other institutions
changed year to year?
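
A question like this reduces to a simple count once publications carry structured institution links. The sketch below assumes each publication record has a year and a set of contributing institutions; the records, the institution names, and the function name are invented for illustration and are not actual Weill Cornell data or a VIVO API.

```python
from collections import Counter

# Invented sample records; not real publication data.
HOME = "Home Institution"
publications = [
    {"year": 2010, "institutions": {"Home Institution"}},
    {"year": 2010, "institutions": {"Home Institution", "Institution A"}},
    {"year": 2011, "institutions": {"Home Institution", "Institution B"}},
    {"year": 2011, "institutions": {"Home Institution", "Institution A",
                                    "Institution C"}},
    {"year": 2011, "institutions": {"Home Institution"}},
]


def external_coauthorship_by_year(pubs, home):
    """Count, per year, the publications with at least one
    institution other than the home institution."""
    counts = Counter()
    for pub in pubs:
        if pub["institutions"] - {home}:
            counts[pub["year"]] += 1
    return dict(sorted(counts.items()))


by_year = external_coauthorship_by_year(publications, HOME)
print(by_year)
```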

Multi-institutional scenarios for VIVO
• Multiple campuses of one university
• University and federal lab connections
– E.g., Colorado ties with regional federal
labs
• Consortia – 60 CTSAs
• International
– 13 Netherlands universities and the
National Library
– AgriVIVO

Benefits across institutions
• Sharing experience provides clarity and new
ideas
• Incentives from sharing development, tools,
customizations
• Potential data-level connectivity
– Research is happening increasingly in
teams that span institutions
– Meeting the needs of short and long-term
virtual organizations

From outputs to outcomes
• Outputs like papers and patents can be tracked
– Collaborative ontology effort to adequately
represent the humanities
• Outcomes such as economic impact or societal
benefit are much harder to identify
• Questions about return on research investment
beg for consistent, comparable data
– over time
– across institutions
– across domains

Partnerships – ORCID
• Open Researcher and Contributor ID
– Attribution for works of any type
• ORCID and VIVO
– ORCID is an attribute in a VIVO profile
– Tools being tested for submission of
researcher registrations from VIVO
http://orcid.org

VIVO/DuraSpace Partnership
• DuraSpace is a not-for-profit organization
supporting the DSpace and Fedora repositories
• Serves as the open source community home for
future VIVO development
• Provides a legal and financial framework,
extensive tools, and proven track record of
managing community developed open source
projects
• Joint two-year initial governance based on
founding sponsors, management team, and
dedicated development and leadership effort

Meeting about VIVO
• 2nd Australian VIVO Days in February
• CU Boulder hosted 50 attendees for the 3rd
VIVO Implementation Fest in April
• May 20th VIVO event for New York City
area institutions
• August 2013 will be the 4th Annual VIVO
Conference – approximately 200-250
attendees, with workshops, papers,
keynotes, invited talks, and posters

Research Informatics Infrastructure
• USDA adopting for intramural research,
and also using VIVO to knit together
data from their 7 major agencies to
fulfill reporting mandates to Office of
Science & Technology Policy and
Congress
• National Center for Atmospheric
Research (NCAR) is piloting VIVO to
coordinate large, multi-year,
multi-institutional, multi-instrument
research projects

Research Informatics Infrastructure –
cont.
• Accurate, structured VIVO data can feed
external profiling and discovery systems
(ORCID, Google Scholar, Academic
Analytics, etc.)
• VIVO extensibility allows it to represent
research resources and tie them to
research datasets, publications, and
researchers, promoting data discovery
and reuse

VIVO for atmospheric and space physics

CTSAconnect and the ISF
• VIVO and eagle-i team members won NIH
funding in 2012 for a project to unify their
ontologies and extend both in the clinical
domain
• The unified ontology is known as the
Integrated Semantic Framework, or ISF
• VIVO 1.6 and eagle-i’s next release will use the
ISF
• This combined ontology is modular to allow
selective data population based on local needs

Tying biomedical research to clinical delivery

Challenges
• Communicating VIVO’s goals to faculty,
administrators, funders, and other
institutions
• Adapting to constant changes in data
sources
• Fully exploiting the opportunities provided
by VIVO linked open data
• Co-existing in a world where not everyone
uses VIVO
• Positioning VIVO on a sustainable path

Next Webinar: Case Studies
• Tuesday, June 4
• Colorado
• Duke
• Brown
• Weill Cornell Medical College

3rd Webinar – Technical Deep Dive
• Tuesday, June 11
• Ontology & Linked Data
• Open source technologies used
• What’s coming in v1.6
• VIVO technical community touch points
• Many ways to participate, benefit, and
contribute

Questions?
