Research Cyberinfrastructure at UCSD
David Minor
UC San Diego Libraries, San Diego Supercomputer Center
Presentation at Research Data Access & Preservation Summit
22 March 2012
1. David Minor
Director, Preservation Initiatives
UC San Diego Libraries
San Diego Supercomputer Center
March, 2012
2. A brief history …
• 2008: Campus-wide needs assessment
– What do campus users need today?
– What do they think they need tomorrow?
– What is hindering their research?
3. A brief history …
• 70% indicated need for short-term storage
– 1-3 years
• 64% indicated need for long-term preservation of data sets
• Also needed data management help, metadata
creation, tools for sharing, etc.
4. A brief history …
• April, 2009: Blueprint for the Digital University
– Publicly available
– Indicates directions and goals for campus
5. A brief history …
• April, 2010: Cyberinfrastructure Planning and
Operations Committee Report issued
– Operationalized the Blueprint
– Actual plans, budgets and projections
6. A brief history …
• January, 2011: RCI Oversight Committee formed
– Business plan accepted, oversight committee charged
– Let’s go DO this
8. High-performance computing
• Triton Resource: a cost-effective and accessible
high-performance computing system primarily for UC San
Diego and UC researchers
• Triton Affiliates and Partners Program (TAPP):
high-performance cluster computing time at a
reasonable cost
• New developments include “condo” computing
http://www.sdsc.edu/us/tapp
9. Data center colocation
• Standard rack provided with ISO-Base seismic
protection, aisle containment, and 2x30A power
distribution
• 10+ Gb networking fabric connectivity both throughout
SDSC aggregation fabric and into CENIC
• 24/7 operations staff providing facility oversight and
emergency "remote hands" hardware assistance
http://rci.ucsd.edu/services/colocation.html
10. Networking and other services
• Web Hosting
• Database Hosting
• 10GigE research network throughout campus
http://rci.ucsd.edu/services/other-services.html
11. Storage
Storage Type          Option                                         Cost per Terabyte-Year  Availability  Application Performance
Parallel File System  Designed for HPC users                         (not listed)            99.5%         Up to 100 GB/s
Project Storage       Standard Availability, Single-Site Durability  (not listed)            99.5%         Up to 1 GB/s
Project Storage       High Availability, Multiple-Site Durability    (not listed)            99.95%        Up to 1 GB/s
Cloud Storage         Single-Site Durability                         (not listed)            99.5%         Up to 100 MB/s
Cloud Storage         Triple Copy                                    (not listed)            99.5%         Up to 100 MB/s
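To put the availability and performance figures for the storage tiers in concrete terms, the sketch below converts an availability percentage into the downtime it allows per year, and a peak transfer rate into the shortest possible time to move one terabyte. It is illustrative arithmetic only, not part of the RCI service definition. The assumptions: a 365-day year, decimal units (1 TB = 10^12 bytes), and sustained transfer at the quoted "up to" rate, so the transfer times are lower bounds. The function names and the `tiers` list are hypothetical.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: non-leap year

TB = 10**12  # assumption: decimal units
GB = 10**9
MB = 10**6


def downtime_hours_per_year(availability_percent: float) -> float:
    """Hours of downtime per year permitted by an availability percentage."""
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


def transfer_seconds(size_bytes: float, rate_bytes_per_s: float) -> float:
    """Best-case time to move size_bytes at a sustained peak rate."""
    return size_bytes / rate_bytes_per_s


# (tier label, availability in percent, peak rate in bytes per second),
# taken from the storage slide; the labels are shortened for display.
tiers = [
    ("Parallel File System", 99.5, 100 * GB),
    ("Project Storage, standard", 99.5, 1 * GB),
    ("Project Storage, high availability", 99.95, 1 * GB),
    ("Cloud Storage", 99.5, 100 * MB),
]

for name, availability, rate in tiers:
    downtime = downtime_hours_per_year(availability)
    seconds = transfer_seconds(1 * TB, rate)
    print(f"{name}: {downtime:.2f} h/year downtime allowed, "
          f"1 TB in at least {seconds:.0f} s")
```

Under these assumptions, 99.5% availability allows about 43.8 hours of downtime per year, while 99.95% allows about 4.4 hours. That tenfold difference is the practical distinction between the two Project Storage options.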
12. Data curation
• Starting with a two-year pilot phase
• Using existing tools whenever possible
– Storage at SDSC
– Digital Asset Management System at UCSD Libraries
– Campus high-speed networking
– Chronopolis digital preservation network
http://rci.ucsd.edu/services/data-curation.html
http://rci.ucsd.edu/pilots
15. The Brain Observatory
Preserve and curate the digital version of the
brain of patient HM, the most studied
neuropsychological patient in modern medicine.
16. The Brain Observatory
• Aspects of image preservation
• Interaction with a commercial site
• Work with combinations of physical slides,
images, pyramidal structures
18. NSF OpenTopography Facility
• Preservation of raw data
• Provide DOIs for complex datasets
• Information passing between portals
19. Levantine Archaeology Laboratory
Focuses on archaeological investigations
concerning the evolution of societies in the
southern Levant from the Neolithic to Islamic
periods.
20. Levantine Archaeology Laboratory
• Cyber-archaeology
• Tools for uniting field work, objects in cold storage,
and digital imagery
• Develop the infrastructure needed to curate the cultural
heritage data generated by new visualization and
analysis tools
21. Scripps Institution of Oceanography
Geological Collections
The Sediment Core collection contains samples
collected from as early as 1916. The Cored Sediment
Collection is a growing archive of sea-floor samples and
associated data supporting a diverse variety of scientific
research.
22. Scripps Institution of Oceanography
Geological Collections
• Work with local data and a national community
• Assist with the creation of a standards-based access,
discovery and preservation system for one of the
largest collections of marine geology samples in the
United States.
23. The Laboratory for Computational
Astrophysics
Dedicated to advancing the state-of-the-art of
astrophysical simulation through the development and
dissemination of community codes, and through large-
scale simulations of astrophysical and cosmological
systems.
24. The Laboratory for Computational
Astrophysics
• Provide data management and curation to improve
collaborations with other researchers
• Support publishing simulations of astrophysical
phenomena in cosmology, star formation and
turbulence
• Provide metadata support
25. Data management plans
• Resources and contacts available to UCSD researchers
• Examples from submitted proposals
• Guidance, tips and recommendations for DMP
preparation
• UCSD-centered version of DMP Tool
http://rci.ucsd.edu/dmp/index.html