In this presentation from the Dell booth at SC13, Scott Nolin from the University of Wisconsin describes how Big Data gets handled for HPC at the University.
"As science drives a rapidly growing need for storage, existing environments face increasing pressure to expand capabilities while controlling costs. Many researchers, scientists and engineers find that they are outgrowing their current system, but fear their organizations may be too small to cover the cost and support needed for more storage. Join these experts for a lively discussion on how you can take control and solve the HPC data deluge."
Watch the video presentation: http://insidehpc.com/2013/12/03/panel-discussion-solving-hpc-data-deluge/
2. About SSEC
The University of Wisconsin Space Science and Engineering Center (SSEC) is a research and development
center focusing on geophysical research and technology to enhance our understanding of the atmosphere of
Earth, the other planets in our Solar System, and the cosmos.
Major SSEC Initiatives
§ atmospheric studies of Earth and other planets
§ interactive computing, data access, and image processing
§ spaceflight hardware development and fabrication
Noted Scientific Work
§ satellite-based and other weather observing instruments
§ remote sensing applications in earth and atmospheric science
§ spaceflight instrumentation
§ planetary meteorology
§ data analysis and visualization
§ diagnostic and numerical studies of the atmosphere
3. What Shapes our Storage Solutions?
§ The SSEC contains many groups and projects, often with different funding
sources. Individual groups and projects ultimately control their own
expenditures.
  § The Technical Computing Group provides an infrastructure to share and combine
  these resources.
§ Atmospheric research via satellite meteorology is the major focus of SSEC.
  § This has large data needs. Files are typically fairly large – we use 100 MB files as a
  generic representation of a lot of our data.
§ Most projects are research and not strictly operational, so high availability is usually
not required, but data integrity is critical.
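The two requirements above (roughly 100 MB files, and integrity over availability) suggest a simple way to sanity-check a file system. The following is a minimal sketch, not a tool described in the talk: the function names, the 4 MB chunk size, and the choice of SHA-256 are all assumptions made for illustration. It writes one representative file, reads it back, and compares checksums.

```python
import hashlib
import os
import tempfile
import time

FILE_SIZE = 100 * 1000 * 1000  # 100 MB (decimal), the representative size from the slide
CHUNK = 4 * 1000 * 1000        # I/O chunk size; an arbitrary choice for this sketch


def write_test_file(path, size=FILE_SIZE, chunk=CHUNK):
    """Write `size` bytes of random data; return (SHA-256 hex digest, seconds)."""
    digest = hashlib.sha256()
    start = time.perf_counter()
    with open(path, "wb") as f:
        remaining = size
        while remaining > 0:
            block = os.urandom(min(chunk, remaining))
            digest.update(block)
            f.write(block)
            remaining -= len(block)
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push data to storage before stopping the clock
    return digest.hexdigest(), time.perf_counter() - start


def read_test_file(path, chunk=CHUNK):
    """Read the file back; return (SHA-256 hex digest, seconds)."""
    digest = hashlib.sha256()
    start = time.perf_counter()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk), b""):
            digest.update(block)
    return digest.hexdigest(), time.perf_counter() - start


def check_file_system(directory, size=FILE_SIZE):
    """Round-trip one test file in `directory` and report integrity and rough rates."""
    fd, path = tempfile.mkstemp(dir=directory)
    os.close(fd)
    try:
        written, w_secs = write_test_file(path, size)
        read_back, r_secs = read_test_file(path)
    finally:
        os.remove(path)
    mb = size / 1e6
    return {
        "intact": written == read_back,
        "write_mb_s": mb / w_secs,
        "read_mb_s": mb / r_secs,
    }
```

Calling `check_file_system("/path/to/mount")` with a directory on the file system of interest returns the result dictionary. The rates are only rough: the write timing includes random-data generation and hashing, and the read is likely to be served from the client page cache, so a real benchmark would need to account for both.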
5. High Performance Storage Infrastructure
SSEC has recently created a generally available storage infrastructure, largely
focused on Lustre. Making large file systems generally available, and not just on
compute clusters, has changed how people work with large data.
Key components include:
§ InfiniBand
  § Once you have taken the step to InfiniBand infrastructure (which was required by
  some of our NWP needs) it’s hard to pay more for lower performance with Ethernet.
§ Lustre Routers
  § Allow connecting the InfiniBand resources (storage, compute clusters) to 10G and
  1G Ethernet systems.
§ Patchless Lustre Clients
  § This is particularly useful for extending Lustre file systems to non-cluster nodes. Linux
  is the main focus here; our ‘clients’ are what many would consider servers, typically
  Ethernet connected.
§ BETA - Providing a mountable file system aggregating all resources (NAS,
Gluster, Lustre) as a single file system to desktops.
  § Work in progress: a custom application developed in house by a research group.
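To put the 1G and 10G Ethernet figures in context, here is a small back-of-the-envelope calculation of how long one representative 100 MB file takes to cross each link. It is a sketch under stated assumptions: decimal units, the ideal line rate, and no protocol or file system overhead, so real transfers will be slower. The talk does not state the InfiniBand generation in use, so no InfiniBand figure is included.

```python
FILE_BYTES = 100 * 10**6  # 100 MB in decimal units (assumption)


def ideal_transfer_seconds(size_bytes, link_gbit_per_s):
    """Best-case transfer time at full line rate, ignoring all overhead."""
    return size_bytes * 8 / (link_gbit_per_s * 10**9)


for name, rate in (("1G Ethernet", 1), ("10G Ethernet", 10)):
    seconds = ideal_transfer_seconds(FILE_BYTES, rate)
    print(f"{name}: {seconds:.2f} s per 100 MB file")
# prints:
# 1G Ethernet: 0.80 s per 100 MB file
# 10G Ethernet: 0.08 s per 100 MB file
```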