An introduction to VIVO, an open source semantic web application that enables discovery of research and scholarship across institutions, and to one library's role in its implementation and development.
VIVO: enabling the discovery of research and scholarship
5. What One Can (and Potentially Can) Do With VIVO: showcase credentials, expertise, skills, and professional achievements; connect within focus areas and geographic expertise; simplify reporting tasks and link data to external applications, e.g., to generate biosketches or CVs; publish the URL or populate profiles on other websites; find potential colleagues by research area, authorship, and collaborations; display visualizations of complex research networks and relationships.
10. VIVO Collaborators
Federal agencies – NIH, USDA, EPA, NSF, OSTP, FDP, STAR Metrics, NLM, VA, CENDI, …
Publishers and Aggregators – ORCID, Thomson Reuters, CiteSeer, arXiv, PLoS, BioMed Central, DSpace, Elsevier/Collexis, …
Press – Nature, Science, Cell, Genome Tech, SEED, The Chronicle, AP, Information Today, IEEE, Wired, …
Professional Societies – APA, AAAS, AIRI, AAMC, ABRF, APA, …
International – Australia, China, Netherlands, UK, Brazil, India, Costa Rica, Mexico, …
Semantic Web community – eagle-i, DERI, Tim Berners-Lee, MyExperiment, ConceptWeb, Open Pharma Space (EU), Linked Data, …
Social Network Analysis Community – Northwestern, Davis, UCF, …
Schools and Consortia – CTSAs, CIC, SURA, FLR, Iowa, Harvard, UCSF, Pittsburgh, Stanford, MIT, Brown, Michigan, Colorado, OHSU, Duke, Minnesota, many more
Sub-awards to Duke, Pittsburgh, Leicester, Stony Brook, Weill, Indiana
Other Collaborators – Symplectic Elements, Wellspring, …
Application and service providers – over 100
SourceForge file downloads – 10,000+ since May 2010
Contact list – over 1,500
18. Local Dataflow into VIVO [Diagram: local systems of record and external sources; data ingest; RDF harvest; interactive input; ontologies (RDF); VIVO (RDF); shared as RDF; SPARQL endpoint]
43. VIVO Collaboration
Cornell University: Dean Krafft (Cornell PI), Manolo Bevia, Jim Blake, Nick Cappadona, Brian Caruso, Jon Corson-Rikert, Elly Cramer, Medha Devare, Elizabeth Hines, Huda Khan, Deepak Konidena, Brian Lowe, Joseph McEnerney, Holly Mistlebauer, Stella Mitchell, Anup Sawant, Christopher Westling, Tim Worrall, Rebecca Younes
University of Florida: Mike Conlon (VIVO and UF PI), Beth Auten, Michael Barbieri, Chris Barnes, Kaitlin Blackburn, Cecilia Botero, Kerry Britt, Erin Brooks, Amy Buhler, Ellie Bushhousen, Linda Butson, Chris Case, Christine Cogar, Valrie Davis, Mary Edwards, Nita Ferree, Rolando Garcia-Milan, George Hack, Chris Haines, Sara Henning, Rae Jesano, Margeaux Johnson, Meghan Latorre, Yang Li, Jennifer Lyon, Paula Markes, Hannah Norton, James Pence, Narayan Raum, Nicholas Rejack, Alexander Rockwell, Sara Russell Gonzalez, Nancy Schaefer, Dale Scheppler, Nicholas Skaggs, Matthew Tedder, Michele R. Tennant, Alicia Turner, Stephen Williams
Indiana University: Katy Borner (IU PI), Kavitha Chandrasekar, Bin Chen, Shanshan Chen, Ryan Cobine, Jeni Coffey, Suresh Deivasigamani, Ying Ding, Russell Duhon, Jon Dunn, Poornima Gopinath, Julie Hardesty, Brian Keese, Namrata Lele, Micah Linnemeier, Nianli Ma, Robert H. McDonald, Asik Pradhan Gongaju, Mark Price, Michael Stamper, Yuyin Sun, Chintan Tank, Alan Walsh, Brian Wheeler, Feng Wu, Angela Zoss
Ponce School of Medicine: Richard J. Noel, Jr. (Ponce PI), Ricardo Espada Colon, Damaris Torres Cruz, Michael Vega Negrón
The Scripps Research Institute: Gerald Joyce (Scripps PI), Catherine Dunn, Sam Katkov, Brant Kelley, Paula King, Angela Murrell, Barbara Noble, Cary Thomas, Michaeleen Trimarchi
Washington University School of Medicine in St. Louis: Rakesh Nagarajan (WUSTL PI), Kristi L. Holmes, Caerie Houchins, George Joseph, Sunita B. Koul, Leslie D. McIntosh
Weill Cornell Medical College: Curtis Cole (Weill PI), Paul Albert, Victor Brodsky, Mark Bronnimann, Adam Cheriff, Oscar Cruz, Dan Dickinson, Richard Hu, Chris Huang, Itay Klaz, Kenneth Lee, Peter Michelini, Grace Migliorisi, John Ruffing, Jason Specland, Tru Tran, Vinay Varughese, Virgil Wong
This project is funded by the National Institutes of Health, U24 RR029822, "VIVO: Enabling National Networking of Scientists."
Editor's Notes
Institutional VIVOs and other compatible profiling applications are producing data to form a rich network of information that can be searched to foster collaboration across institutions and enable open sharing of research discovery.
A prototype of a national research resource discovery network, one that will help biomedical scientists search for and find previously invisible but highly valuable resources.
Also: easily move institutional data from one system to another; recruit graduate students; assemble specialized review panels; plan budgets, services, and resources.
The technology can easily accommodate changes based on the demands that users make as they use VIVO. Other data sources we're working toward include grants, patent data, course responsibilities, and more.
Data is reused and repurposed in a wide array of tools and settings. Cornell University has done a stellar job of this, using VIVO data to provide current information about faculty and their interests for department and college websites. The University of Florida reuses data from its VIVO for its CTSI member database, a move that other institutions will make as well. VIVO data is available for reuse by web pages, applications, and other consumers both within and outside the institution. Individuals may access the browse and search functions of VIVO anytime via the web. Researchers, scholars, students, administrators, funding agencies, donors, and members of the general public may all benefit from using VIVO.
Here is a map of all of VIVO's "adoptions, partners, and implementation work." More than 50 U.S.-based sites are at some stage of implementation, and more than 19 countries have implementation sites or are moving forward with work around VIVO, including Brazil, Costa Rica, Puerto Rico, Mexico, U.S., Canada, England, Netherlands, India, Bangladesh, Saudi Arabia, Jordan, Australia, and China. Many of these sites are starting to think about VIVO as a back end for data.
We have a lot of exciting collaborations ongoing with a number of groups. For those of you who will give up enterprise software only when it is pried from your cold, dead hands, you can pay for third-party support (Wellspring).
Hell no. 1. VIVO is open source. You may download the application, manipulate the source code, do whatever you want to it. 2. The data in VIVO is open and in a standard accessible format. It can be freely consumed by any third party as machine-readable metadata. 3. VIVO data is authoritative. In this age of information glut, the essential question is no longer "Does the data exist?" but "What is its provenance? Where did it come from? Is it reliable?" VIVO is designed to easily and recurrently draw from authoritative resources like Faculty Affairs or a grants database or, say, PubMed.
Many of us have experience updating web pages, so we know that when you add a bit of content to a plain HTML page, you don't specify what it is. But making a semantic assertion is different. With semantic data, you are adding content, specifying what type of content it is (for example: is it a grant? A presentation? A department? An NGO?), and specifying any relationships that data has with other data.
Each element (subject, predicate, object) is governed by ontologies with semantics. VIVO 1.2 includes an ontology module representing research resources such as biological specimens, human studies, instruments, organisms, protocols, reagents, and research opportunities. This module is aligned with the top-level ontology classes and properties from the NIH-funded eagle-i Project (https://www.eagle-i.org/home/). We're also developing extensions to the ontology that would allow more diverse types of efforts to be included in the profiles (blogs are currently available, wiki edits are not), which is very important for microattribution/nanopublication efforts.
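The subject, predicate, object structure can be sketched in a few lines of Python. This is only an illustration, not VIVO's own code: the `example.org` URIs, the `Triple` type, and the `to_ntriples` helper are hypothetical names made up for the example; only the `rdf:type` and `rdfs:label` URIs are standard.

```python
from typing import NamedTuple

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
EX = "http://example.org/"  # hypothetical namespace, not a real VIVO ontology


class Triple(NamedTuple):
    subject: str    # URI of the thing being described
    predicate: str  # URI of the property or relationship
    obj: str        # URI of another thing, or a plain literal
    literal: bool = False


def to_ntriples(triples):
    """Serialize triples as N-Triples lines (one statement per line)."""
    lines = []
    for t in triples:
        if t.literal:
            # Escape backslashes and quotes; newlines are not handled in this sketch.
            escaped = t.obj.replace("\\", "\\\\").replace('"', '\\"')
            obj = f'"{escaped}"'
        else:
            obj = f"<{t.obj}>"
        lines.append(f"<{t.subject}> <{t.predicate}> {obj} .")
    return "\n".join(lines)


grant = EX + "grant/1"
person = EX + "person/1"
assertions = [
    Triple(grant, RDF_TYPE, EX + "Grant"),                      # what type of content it is
    Triple(grant, RDFS_LABEL, "Example grant", literal=True),   # the content itself
    Triple(grant, EX + "hasInvestigator", person),              # its relationship to other data
]
print(to_ntriples(assertions))
```

Each of the three statements corresponds to one of the things a semantic assertion adds over plain HTML: a type, the content, and a relationship.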
Some of the developers at Indiana have developed a SPARQL query builder.
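The Indiana tool itself is not shown in these notes. As a rough idea of what a query builder produces, here is a minimal sketch: `build_select` is a hypothetical helper, and the choice of `foaf:Person` and `rdfs:label` in the usage is an assumption about how the target data is modeled.

```python
def build_select(variables, patterns, limit=None):
    """Assemble a SPARQL SELECT query from (subject, predicate, object) patterns.

    Terms are passed through as written, so callers supply variables as
    "?name" and URIs as "<...>". No escaping or validation is attempted.
    """
    head = "SELECT " + " ".join(variables)
    body = "\n".join(f"  {s} {p} {o} ." for s, p, o in patterns)
    query = f"{head}\nWHERE {{\n{body}\n}}"
    if limit is not None:
        query += f"\nLIMIT {int(limit)}"
    return query


# Assumed modeling: people typed as foaf:Person with an rdfs:label name.
query = build_select(
    ["?person", "?name"],
    [
        ("?person", "a", "<http://xmlns.com/foaf/0.1/Person>"),
        ("?person", "<http://www.w3.org/2000/01/rdf-schema#label>", "?name"),
    ],
    limit=10,
)
print(query)
```

The resulting string could be submitted to any SPARQL endpoint; a real builder would also need to validate and escape its inputs.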
Anyone can create a third-party VIVO to search across multiple VIVOs.
For example, here is a site that leverages the power of semantic data to search across VIVO sites. This exemplar contains RDF from the seven implementation sites and Harvard Profiles. In addition, the University of Iowa, which uses a system called Loki, has announced that it can produce VIVO RDF, and Elsevier has said it will in the future. While searches for people are an obvious requirement for researcher networking, we don't want to limit ourselves to searching for people. VIVO's ontology-based data model is not limited to profiles of people, but includes organizations, events, publications, grants, and many other types of data. This enables VIVO to represent the relationships among people and other types of data as an interconnected network that can be accessed in many ways. We'll demo this exemplar later.
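Why shared RDF makes cross-site search simple can be shown with a small sketch. It assumes each site's data is already in memory as (subject, predicate, object) tuples; the site URLs, sample labels, and the helpers `merge_sites` and `search_labels` are hypothetical and do not describe how the exemplar is actually built.

```python
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

# Hypothetical sample data: each site exposes (subject, predicate, object) triples.
site_a = {
    ("http://vivo.site-a.example/individual/n1", RDFS_LABEL, "Smith, Jane"),
    ("http://vivo.site-a.example/individual/g7", RDFS_LABEL, "Malaria vaccine grant"),
}
site_b = {
    ("http://vivo.site-b.example/individual/n9", RDFS_LABEL, "Malaria Research Center"),
}


def merge_sites(*sites):
    """Union the triple sets; globally unique URIs keep the sources distinct."""
    merged = set()
    for triples in sites:
        merged |= triples
    return merged


def search_labels(triples, term):
    """Return sorted (label, uri) pairs whose label contains the term, ignoring case."""
    needle = term.lower()
    return sorted(
        (obj, subj)
        for subj, pred, obj in triples
        if pred == RDFS_LABEL and needle in obj.lower()
    )


hits = search_labels(merge_sites(site_a, site_b), "malaria")
for label, uri in hits:
    print(label, uri)
```

Note that the search returns a grant and an organization, not only people, because every kind of entity is described with the same triple structure.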
Another visualization (not shown) is available for co-investigator relationships.
ScienceMap. Examine collections of publications for individuals, work groups, and institutions.
The combination of high quality and semantic data means you can do some exciting things with that data. Here's a screenshot from an application Alex Rockwell created that allows users to map out relations in an organizational hierarchy.
Let's take a look at how this is done.
New applications that take advantage of VIVO's linked open data are being developed from within the project as well as from a broad open source community.
Miles Worthington. Image from Dr. Barend Mons, Scientific Director of the Netherlands Bioinformatics Institute. Allows experts to be found, but also ties the object to specific concepts.
This figure displays "collaborative publications" found at two or more Researcher Networking websites. The idea that institutions don't work together and that biomedical research is conducted in silos is not true. Researchers, even when separated by great distances, are in fact willing to work together, and this visualization demonstrates that they often do. Contact: Nick Benik (nbenik@gmail.com)
We talked about Harvester. Here's a great tool for dealing with dirty data, one we developed at Weill Cornell. Google Refine + VIVO is an extension of Google Refine. It allows users to align a dataset to the VIVO core ontologies, assert relationships to help transform data from a grid to a semantic web graph, and export the data into a standardized triple format (such as N3) so that it can be easily ingested by a VIVO installation. We have used it to ingest data from our course management system.
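The grid-to-graph step can be illustrated with a small sketch. This is not the Google Refine extension: the column names, the `example.org` namespace, the `taughtBy` property, and both helper functions are hypothetical. The output is N-Triples, a simple line-based triple format that is also valid N3.

```python
import csv
import io

EX = "http://example.org/"  # hypothetical namespace standing in for a real ontology
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

# Hypothetical grid, e.g. an export from a course management system.
GRID = """course_id,title,instructor_id
c101,Intro to Genetics,p1
c102,Biostatistics,p2
"""


def grid_to_triples(text):
    """Turn each row into statements about a course and its instructor."""
    triples = []
    for row in csv.DictReader(io.StringIO(text)):
        course = f"<{EX}course/{row['course_id']}>"
        person = f"<{EX}person/{row['instructor_id']}>"
        # Escape backslashes and quotes so the title is a valid literal.
        title = row["title"].replace("\\", "\\\\").replace('"', '\\"')
        triples.append((course, f"<{RDFS_LABEL}>", f'"{title}"'))
        triples.append((course, f"<{EX}taughtBy>", person))
    return triples


def serialize(triples):
    """Write one 'subject predicate object .' statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


triples = grid_to_triples(GRID)
print(serialize(triples))
```

Each row of the grid becomes two statements: one carrying the cell value as a label, and one asserting the relationship between the two entities the row mentions.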
Most people here are a fraction of an FTE.
VIVO chooses to meet these participation challenges with the help of libraries, neutral campus entities that understand their user communities and their research environment. Libraries are increasingly involved in IT decisions. Libraries have subject experts: many librarians who work to support VIVO also provide reference and instructional support to the agricultural sciences, basic sciences, social sciences, and humanities. Librarians have good knowledge of their user communities and understand what drives their institution's research environment. Of course, librarians have a strong service ethic and a long history of providing academic support. They can also facilitate the resolution of data integration issues endemic to legacy data and faculty reporting systems.
Harvester, a third-party app for getting data into VIVO, only took us so far because of author disambiguation issues in conjunction with PubMed. A harvester for Scopus and WoS with author disambiguation built in (and, if not, researchers can supply a WoS ResearcherID or Scopus Author ID) is slated to be completed in the next 6 months.
Here is a model of where we get data. Call us lazy, but we are disinclined to manually key in anything. As we all know, that type of practice is just not sustainable, particularly when some of our prolific authors have 500 papers to their name. In some cases (MSK and Hunter), we have yet to get data. Another case where we still need to get data is publications. We have a pilot project with Thomson to bring in publications from Web of Science. Getting publications into VIVO remains the most important thing we can do.
Also….
Screenshot from a prototype.
Screenshot from a prototype. Ver 1.4
As you can see, the VIVO project itself is a rather large, geographically dispersed team: 7 institutions. Project areas: development, implementation, ontology, and outreach. Special thanks to Kristi Holmes from Washington University and Jon Corson-Rikert from Cornell University for help with the slides for this presentation.