The document discusses content centric applications, which require new storage architectures optimized for processing and analyzing vast amounts of content. These applications are driving major increases in storage capacity needs. Content centric storage systems must provide high performance, scalability, simplicity of access, and cost effectiveness without compromising functionality. The NetApp E-Series storage system is designed to meet the unique demands of content centric applications through its performance, efficiency, reliability, and ability to integrate with operational environments.
5. Content Centric Applications
Efficient storage systems must be designed from the outset to meet the needs of a particular class of
workloads: those demanding efficiency, performance, and now reliability.
NetApp's focus on creating more efficient storage systems started with
the FAS systems and continues with the E‐Series.
Lower Cost and Complexity, Greater Performance
The new generation of applications such as video content, streaming media, Big Data and cloud
infrastructure has different requirements than traditional IT.
Data analytics and Big Data processing applications are designed with a different philosophy than in the
past. Their problems can be broken into multiple pieces and analyzed in parallel, each piece independent
of the others. These applications typically operate using a “MapReduce” model that maps, or breaks, a
problem apart, distributes the pieces to processing elements, and finally collects the results, thereby
reducing the data. This model is common for Big Data analytics and requires high‐speed processing with
direct, low‐overhead access to storage for greater performance.
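The map, distribute and collect steps described above can be sketched in a few lines of Python. This is only an illustration of the general pattern, not the implementation used by Hadoop or any other product; the word-count task and the function names (`split_into_chunks`, `map_chunk`, `reduce_counts`, `word_count`) are assumptions made for the example, and a thread pool stands in for the cluster of processing elements.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def split_into_chunks(records, n_chunks):
    """Break the problem apart: divide the records into roughly equal pieces."""
    size = max(1, -(-len(records) // n_chunks))  # ceiling division
    return [records[i:i + size] for i in range(0, len(records), size)]


def map_chunk(chunk):
    """Map step: analyze one piece independently of all the others."""
    counts = Counter()
    for line in chunk:
        counts.update(line.lower().split())
    return counts


def reduce_counts(left, right):
    """Reduce step: merge two partial results into one smaller summary."""
    return left + right


def word_count(records, n_workers=4):
    """Distribute the chunks to workers, then collect and reduce the results."""
    chunks = split_into_chunks(records, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(map_chunk, chunks))
    return reduce(reduce_counts, partials, Counter())


if __name__ == "__main__":
    sample = [
        "big data needs fast storage",
        "fast storage needs bandwidth",
        "big data big results",
    ]
    print(word_count(sample).most_common(3))
```

Each worker reads only its own chunk, which is why this style of processing favors direct, high bandwidth access to storage over shared caching layers.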
Reliability and Content Centric Applications
A critical difference between research processing and new content centric applications is their
importance to businesses. As these applications move into the mainstream of business processing, the
need for reliability also increases. These applications often developed from laboratory research
environments with a more relaxed view of reliability; the jobs could simply be restarted. With more
business criticality, the reliability, availability and serviceability (RAS) aspects of the entire system
become critical. For example, a problem in a storage system that is not highly reliable or available
may require the analysis to be run again, which would be acceptable in a research environment. In a
business environment, where the results are used to drive decisions and other business criteria, the
delays caused by problems in a less robust system have a financial impact.
Integration into Operational Environments
Storage platforms such as NetApp’s FAS and V‐Series have a long history of delivering high levels of
business application integration, as well as addressing multiprotocol and multi‐tenant needs.
These unified platforms are industry leading in their support of data management and data protection
capabilities, and in particular in the high level of integration between NetApp FAS / V‐Series and
business applications for process centric information in a traditional or clustered environment.
These feature rich storage platforms are designed for process centric applications such as MS Exchange,
SQL Server, SharePoint, as well as Oracle, SAP and other applications that benefit from integrated data
protection features. These traditional workloads rely upon finite sets of structured data, and have
different requirements from emerging content centric workloads.
In contrast, a growing number of workloads demand high bandwidth access to data while providing their own
data protection, redundancy or other data management capabilities. These applications do not benefit
from feature rich storage designed for traditional workloads. Performance is critical to the design of
applications such as video content, parallel filesystems based on Lustre, seismic processing or analytic
processing with Hadoop.
Page 4 of 8 Copyright 2012, Evaluator Group, Inc.
The E‐Series Heritage
Design and Architecture
The NetApp E‐Series has a long history of innovation, including contributing to the development of the
SCSI block protocol, delivering the multi‐disk controller using RAID techniques, and providing multiple
RAID levels on a system. More recent leading technologies include an early Fibre Channel RAID system,
native InfiniBand connectivity and early endorsement of SPC‐1 storage benchmark results.
NetApp is continuing this heritage of innovation on the E‐Series with further advancements, including
faster interface technology. Continued development of the features, packaging, scalability and
performance needed for the new generation of data intensive applications is expected, building on the
half a million systems already in use.
The E‐Series includes integrated capabilities for data protection and movement of replicated copies of
data that can be externally managed by content centric applications.
Performance
Attaining high performance for content centric applications relies upon fast access to disks without
bottlenecks. Cache can help improve performance for some workloads, but some high bandwidth
applications rely upon direct access to the underlying media. This is in contrast to more traditional
transaction or business processing applications that often benefit from large data caches or tiering of
data.
Providing the industry’s highest speed connections through FC, InfiniBand, SAS, and 10 Gb/s iSCSI allows
E‐Series systems to communicate over high speed links with the least amount of latency while
integrating into the network of choice. This design is used for distributed compute clusters, SAN
filesystems, data analytics and other content centric applications.
Storage Density
High storage density is another critical concern in some environments. Large capacity applications with
limited physical space require the highest storage density possible. The NetApp E‐Series uses high‐
density packaging for both Large Form Factor and Small Form Factor disk technology. Coupled with the
density is the ability to intermix different drive technologies including SAS, nearline SAS, and SSD in the
same, dense enclosures.
RAS Characteristics
In environments with thousands, or tens of thousands, of disk drives, drive failures are a common
occurrence. While RAID can protect against loss of data when a drive fails, other operations associated
with maintenance can also lead to downtime for storage systems that aren’t designed to achieve high
levels of Reliability, Availability and Serviceability (RAS).
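A rough calculation shows why drive failures become routine at this scale. The sketch below is illustrative only: the 2% annualized failure rate (AFR) is a hypothetical figure chosen for the example, not a number from this paper or from any drive vendor, and drive failures are assumed to be independent and to occur at a constant rate.

```python
def expected_failures_per_year(n_drives, afr):
    """Expected number of drive failures per year for a given AFR."""
    return n_drives * afr


def prob_at_least_one_failure(n_drives, afr, days):
    """Probability that at least one drive fails within `days`.

    Assumes independent failures at a constant rate, so one drive
    survives the window with probability (1 - afr) ** (days / 365).
    """
    all_survive = (1 - afr) ** (n_drives * days / 365)
    return 1 - all_survive


if __name__ == "__main__":
    ASSUMED_AFR = 0.02  # hypothetical 2% annualized failure rate
    for drives in (1_000, 10_000):
        per_year = expected_failures_per_year(drives, ASSUMED_AFR)
        weekly = prob_at_least_one_failure(drives, ASSUMED_AFR, days=7)
        print(f"{drives} drives: about {per_year:.0f} failures per year, "
              f"P(at least one failure in a week) = {weekly:.2f}")
```

Under these assumptions the expected number of failures grows linearly with the drive count, so drive replacement and RAID rebuilds become everyday maintenance operations rather than exceptional events.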
E‐Series systems have delivered high reliability since their inception, with continued improvements as
the design has been expanded to incorporate the latest technologies. Reliability is only one aspect of
RAS, and it depends on design choices that enable continued availability without maintenance or
scheduled downtime.
Measuring real‐world availability includes counting downtime for maintenance, upgrades, and capacity or
configuration changes. E‐Series systems are able to upgrade disk and controller firmware without
downtime. Adding storage capacity and reconfiguring or even migrating RAID levels are
features that have all enhanced the real‐world availability of E‐Series systems.
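The effect of counting planned downtime can be shown with a simple availability calculation. This is a minimal sketch; the downtime hours below are hypothetical values chosen for illustration and are not measurements of any E‐Series or competing system.

```python
HOURS_PER_YEAR = 24 * 365


def availability(unplanned_hours, planned_hours=0.0):
    """Fraction of the year the system is usable.

    Real-world availability counts planned downtime (maintenance,
    upgrades, capacity or configuration changes) as well as failures.
    """
    downtime = unplanned_hours + planned_hours
    return (HOURS_PER_YEAR - downtime) / HOURS_PER_YEAR


if __name__ == "__main__":
    # Hypothetical figures: one hour of unplanned outage per year, plus
    # eight hours of maintenance windows on a system that must be taken
    # offline for firmware upgrades and configuration changes.
    failures_only = availability(unplanned_hours=1.0)
    with_maintenance = availability(unplanned_hours=1.0, planned_hours=8.0)
    print(f"Counting failures only:    {failures_only:.5%}")
    print(f"Counting planned downtime: {with_maintenance:.5%}")
```

In this hypothetical case the planned maintenance windows account for most of the lost hours, which is why online firmware upgrades and online reconfiguration matter as much to real-world availability as component reliability.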
System Maturity and Testability
Another component of RAS is the serviceability and implied testability of a system. The design of the
E‐Series has been refined and proven over its 20‐plus‐year history. Many industry standard features, such
as mirrored cache in the controllers and support for multiple RAID levels, emerged on E‐Series systems.
The evolutionary approach to improvements and feature additions has resulted in a system that is
consistently highly available and reliable.