WHITE PAPER
Scale-Out Network-Attached Storage Addresses Storage
Problems for Private Cloud Deployments
Sponsored by: IBM
Amita Potnis Benjamin Woo
November 2011
Global Headquarters: 5 Speen Street Framingham, MA 01701 USA
P.508.872.8200 F.508.935.4015 www.idc.com
IDC OPINION
IDC has forecast that 29.7% of surveyed users intend to deploy private clouds as a way to
address the problem of improving IT utilization and operational efficiency. The explosion in
data compounds the problem, and the enormity of the growth in unstructured data
compared with the growth in structured data makes this problem even greater.
File and content sharing not only within a datacenter but also across datacenters that
may be geographically dispersed adds to the complexity of data management.
IBM Scale-Out Network-Attached Storage (SONAS) with IBM Active Cloud Engine is
designed to meet these challenges.
IN THIS WHITE PAPER
In this White Paper, IDC analyzes the IBM SONAS system and how it enables
organizations to achieve economies of scale in capacity, performance, and cost.
IBM has combined a number of its key technologies, in particular its
experience with the General Parallel File System (GPFS), a clustered parallel file system,
to enable large enterprises to develop private clouds that span multiple geographies.
SITUATION OVERVIEW
What Is a Private Cloud?
According to IDC, a private cloud is designed for, and access is restricted to, a single
enterprise (or extended enterprise); an internal shared resource, not a commercial
offering; an IT organization as "vendor" of a shared/standard service to its users (see
Worldwide Private and Public Cloud Financing 2010–2014 Forecast: Sizing the Cloud
Financing Opportunity for Private and Public Cloud, IDC #226226, December 2010).
According to IDC's 2011 Private Cloud QuickPoll Survey, 29.7% of 229 IT and business
professionals and executives had a strategy to deploy private cloud. Although private
cloud is not yet widely adopted, the effects of the economic recession and aging server and
storage systems are prompting companies to evaluate private cloud investments.
The architectural design (capacity and performance that scale elastically and independently
of each other) and pay-as-you-go approach encourage efficient use of storage assets. This
flexibility is a prime attraction of cloud architectures in general.
The leading workloads in the private cloud include but are not limited to file
sharing/collaboration and messaging. Other use cases include on-demand storage
and backup, which has become a very important part of file sharing and collaboration.
What Organizations Are Looking for from Their Private Cloud
Scalability and operational efficiency. Per IDC's Worldwide File-Based
Storage 2010–2014 Forecast Update (IDC #226267, December 2010), between
2009 and 2014, spending on file-based storage (FBS) solutions will grow at a
compound annual growth rate (CAGR) of 10.7% and reach $23.3 billion.
Between 2009 and 2014, capacity shipped for FBS will increase at a CAGR of
60.3% (and reach 64.73EB in 2014) compared with 22.5% for block-based
storage capacity shipped. With this immense growth in FBS, achieving
economies of scale is difficult but unavoidable.
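As a rough check on the growth figures above, the compound annual growth rate formula can be inverted to estimate the 2009 starting values implied by the 2014 forecasts. The sketch below assumes five annual compounding periods between 2009 and 2014; the function name `implied_base` is illustrative and does not come from the IDC forecast.

```python
def implied_base(final_value: float, cagr: float, years: int) -> float:
    """Invert final = base * (1 + cagr) ** years to recover the base value."""
    return final_value / (1.0 + cagr) ** years


YEARS = 5  # assumption: 2009 -> 2014 treated as five compounding periods

# Figures quoted in the text: $23.3 billion at 10.7% CAGR, 64.73EB at 60.3% CAGR.
spend_2009 = implied_base(23.3, 0.107, YEARS)      # billions of dollars
capacity_2009 = implied_base(64.73, 0.603, YEARS)  # exabytes

print(f"Implied 2009 FBS spending: ${spend_2009:.1f} billion")
print(f"Implied 2009 FBS capacity shipped: {capacity_2009:.2f}EB")
print(f"Capacity growth multiple over {YEARS} years: {(1 + 0.603) ** YEARS:.1f}x")
```

Under this assumption, capacity shipped grows roughly tenfold over the period while spending grows by about two-thirds, which illustrates why the cost per unit of capacity has to fall sharply.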
Economies of scale are related to both capacity and performance, though
each is independent of the other. Generally, while some attributes are
common across workloads, most workloads tend to exhibit unique requirements.
For example, rich media files generally require higher bandwidth and moderate
capacity compared with an archiving workload that demands higher capacity,
less performance, and a much lower cost envelope.
While managing such different workloads, possibly tiered within the same cloud,
organizations need to keep them transparent to each user. It is imperative to keep
workloads isolated from each other to maintain accuracy and consistency, which
gives rise to the need for security and the ability to support multiple tenancy.
Requirements that stem from scalability lead to costs as capacity and
performance demands increase. Therefore, it is necessary to evaluate
operational efficiencies by controlling and optimizing administrative costs as well
as the consumption of power, space, cooling, etc., by choosing a solution that
can help keep spending in check.
Disaster recovery and business continuity. Large capacity in a storage
environment necessitates adequate backup, disaster recovery, and high-
availability plans.
When calculating the total cost of ownership (TCO), organizations should include
the costs related to backup and disaster recovery. Modern storage technologies
are exploring innovative ways of improving the backup task. Most contemporary
approaches involve maintaining a second replica to improve time to recovery.
For highly virtualized environments, the requirements around disaster recovery
and business continuity are compounded by the need to balance I/O traffic from
highly interlaced virtual machines that look to balance loads not only across
virtual and physical servers but also across storage assets.
2 #231122 ©2011 IDC
Information governance. Evolving and increasingly stringent government
compliance regulations require businesses to store their data for extended
periods of time.
Over time, as the amount of data increases (often as a result of regulatory or
legislative demands), the task of managing and archiving data becomes equally
critical in any datacenter.
It is a complex task to understand and maintain a proper chain of custody for an
entire organization's digital assets. Understanding what to keep, where to keep it,
how to keep it, for how long to keep it, and who has access to it requires more
intelligent storage.
Management and administration. It is true that the amount of data across
organizations is increasing. However, the rate at which data grows differs
between organizations. For some, it may be easy to host and maintain data in-
house, but for others, the increased complexity associated with managing data in
the context of information governance may drain their IT budgets.
It is in the latter group that the opportunity for the cloud, private or public, lies.
Ultimately, increased infrastructure investments will be required, whether they
are made by corporations or by third-party service providers. In either case, a
shift is occurring from traditional capitalized on-premises storage investments to
implementations acquired through pay-as-you-grow models.
How to Address the Challenge
To meet the challenge, organizations are turning to scale-out FBS as a solution to
address their business and operational challenges.
Scale-out refers to FBS solutions that use a distributed file system or object-based
storage model to span multiple server hosts or controllers while presenting a
single namespace. The scale-out architecture allows for flexible scalability in
performance and capacity independent of each other (see Worldwide File-Based
Storage 2010–2014 Forecast: Consolidation, Efficiency, and Objects Shape Market,
IDC #223558, June 2010).
Several major storage suppliers have already made major investments by either
developing or acquiring scale-out FBS products in light of the projected growth of
unstructured data and the adoption of cloud-based architectures. While most
suppliers have taken the easy road of acquiring such technologies and integrating
them into their portfolios, a few vendors, including IBM, foresaw the evolution toward
unstructured data and corporate information governance and developed scale-out
FBS solutions organically.
To meet the increasing capacity demands, while minimizing the cost impact to end
users, scale-out FBS vendors have leveraged industry-standard x86 platforms, with
high-speed interconnects (such as InfiniBand) along with intelligent, extensible, and
parallel file systems to develop their solutions.
However, despite these advances, many large enterprises are already geographically
dispersed, and the need to be able to scale not only within the datacenter but also
across datacenters has become one of the biggest challenges to the storage industry.
The Solution
IBM has tackled these problems head-on with the development of its SONAS system.
IBM SONAS offers a number of unique features, including the following:
Simplicity. IBM SONAS, using the Active Cloud Engine, offers a single self-
managing global namespace and automated, policy-driven tiered storage system
that simplifies management.
Scalability. Performance and capacity can be scaled through additional storage
server nodes and/or storage expansion units. SONAS is capable of managing up
to 7,200 drives for a total of over 21PB of storage (with 3TB NL SAS drives). It
has different disk tiers within, and optionally, it can move data to tape storage
with integrated Tivoli Storage Manager (TSM) and hierarchical storage
management (HSM) support. It utilizes InfiniBand as the cluster data network,
which has the highest throughput, highest IOPS, and lowest latency of all
available storage network interconnects.
Manageability. Through the use of IBM Active Cloud Engine, SONAS supports a
policy-based data placement and management strategy locally or globally that
does not require any manual intervention. Policy-based placement, migration,
deletion, backup/replicate, and restore/retrieve functions are also available to
further automate management of the storage system. Administration is further
simplified with an intuitive, easy-to-use graphical user interface. IBM leverages its
TSM clients, which are preintegrated into SONAS, to improve backup times for
those already using TSM as their main backup software. IBM also provides
Information Lifecycle Management with preintegrated TSM/HSM clients.
High availability and data protection. Offering highly redundant components,
IBM SONAS allows nondisruptive access to data for its users. IBM SONAS
supports antivirus software to protect data. Its "Scan on demand" or "Scan on
access" capability lets the system scan for any viruses depending on the access
patterns of the enterprise. IBM also supports "backup at scale" with multinode,
multithreaded Network Data Management Protocol (NDMP)–based backups and
allows a scheduled/automated full or incremental backup.
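The maximum-capacity figure in the Scalability item above follows from simple arithmetic. A minimal check, assuming decimal units (1PB = 1,000TB) and raw capacity before any RAID or file system overhead, neither of which the text specifies:

```python
MAX_DRIVES = 7_200   # maximum drive count quoted in the text
DRIVE_TB = 3         # 3TB NL SAS drives
TB_PER_PB = 1_000    # assumption: decimal (SI) units

# Raw capacity only; usable capacity would be lower after redundancy overhead.
raw_pb = MAX_DRIVES * DRIVE_TB / TB_PER_PB
print(f"Raw capacity: {raw_pb}PB")
```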
Active Cloud Engine Enables the Private Cloud Like Never Before
SONAS is built on the IBM-developed Active Cloud Engine and is designed to
specifically target the challenges around the private cloud and traditional FBS
solutions. IBM Active Cloud Engine is a policy-driven engine that efficiently manages
large amounts of data globally. The Active Cloud Engine is tightly integrated with IBM
General Parallel File System (GPFS) to allow multiple sites to collaborate on
information exchange.
The Active Cloud Engine creates the appearance of a single system for its users. In
essence, large repositories of data can be made accessible to users as a single
resource independent of where the user or application resides. While the files may be
located somewhere else, namespace virtualization makes them appear to users as if they
are local. Users can view not only local files but also remotely located files that
they have permission to see. IBM Active Cloud Engine provides data consistency
across multiple sites by employing an "exclusive writer" cache capability that allows
multiple sites to access a file in the global namespace, but only one site at a time is
allowed to modify the file. Exclusive writer ownership can be transferred between
sites to provide for multisite collaboration.
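The exclusive-writer rule described above can be illustrated with a toy model. This is a sketch of the general idea only, not IBM's implementation: the class `ExclusiveWriterFile`, its methods, and the site names are hypothetical, and a real system would also have to handle caching, locking, and site failures.

```python
class ExclusiveWriterFile:
    """Toy model: any site may read; only the current owner site may write."""

    def __init__(self, name: str, owner_site: str, content: str = "") -> None:
        self.name = name
        self.owner_site = owner_site
        self.content = content

    def read(self, site: str) -> str:
        # Reads are allowed from every site in the global namespace.
        return self.content

    def write(self, site: str, content: str) -> None:
        # Writes are rejected unless the site holds exclusive-writer ownership.
        if site != self.owner_site:
            raise PermissionError(f"{site} is not the exclusive writer of {self.name}")
        self.content = content

    def transfer_ownership(self, from_site: str, to_site: str) -> None:
        # Only the current owner can hand ownership to another site.
        if from_site != self.owner_site:
            raise PermissionError(f"{from_site} does not own {self.name}")
        self.owner_site = to_site


doc = ExclusiveWriterFile("design.doc", owner_site="london")
doc.write("london", "draft 1")
try:
    doc.write("tokyo", "draft 2")      # rejected: tokyo is not the owner yet
    tokyo_write_rejected = False
except PermissionError:
    tokyo_write_rejected = True
doc.transfer_ownership("london", "tokyo")
doc.write("tokyo", "draft 2")          # accepted after the transfer
```

Because only one site can modify the file at any moment, readers at other sites never have to reconcile conflicting versions; collaboration proceeds by passing ownership between sites.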
When a user accesses a file, the Active Cloud Engine will ensure that the latest copy
of the file is available locally by checking against the main copy at the remote site.
When a remote site makes changes to the main file, IBM Active Cloud Engine
understands what changes are being made and sends only those incremental
changes. The result is not only fast but also cost-efficient because of the reduced
amount of bandwidth used.
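The text does not say how the Active Cloud Engine identifies incremental changes. One common way to send only what changed is to compare fixed-size blocks by digest, which the sketch below assumes purely for illustration; the block size and all function names are hypothetical.

```python
import hashlib

BLOCK_SIZE = 4  # tiny block size so the example is easy to follow (assumption)


def split_blocks(data: bytes) -> list[bytes]:
    return [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]


def block_digests(data: bytes) -> list[str]:
    return [hashlib.sha256(block).hexdigest() for block in split_blocks(data)]


def compute_delta(old: bytes, new: bytes) -> dict[int, bytes]:
    """Return only the blocks of `new` whose digest differs from `old`."""
    old_digests = block_digests(old)
    delta = {}
    for index, block in enumerate(split_blocks(new)):
        unchanged = (
            index < len(old_digests)
            and hashlib.sha256(block).hexdigest() == old_digests[index]
        )
        if not unchanged:
            delta[index] = block
    return delta


def apply_delta(old: bytes, delta: dict[int, bytes], new_length: int) -> bytes:
    """Rebuild the new version from the old copy plus the changed blocks."""
    blocks = split_blocks(old)
    for index, block in delta.items():
        if index < len(blocks):
            blocks[index] = block
        else:
            blocks.append(block)
    return b"".join(blocks)[:new_length]


old_copy = b"AAAABBBBCCCC"
new_copy = b"AAAAXXXXCCCC"
delta = compute_delta(old_copy, new_copy)          # only the middle block changed
rebuilt = apply_delta(old_copy, delta, len(new_copy))
```

In this example one block out of three crosses the network instead of the whole file, which is the bandwidth saving the paragraph above describes.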
In a private cloud setting, a data set can be administered remotely through the central
repository or each site can be administered independently. Datacenters can easily
maintain backups and archives through the management (tiering) capabilities of the
Active Cloud Engine.
IBM has also extended functionality to include the ability to search the entire data set,
independent of the physical (or geographical) location of the data set itself. This
enables users to share, edit, and delete data quickly. For security, customers can define
permissions that determine which files can be accessed. They can further
determine which files should remain centrally located versus those that should reside
at remote sites.
Use Cases
Several verticals such as education, life sciences and healthcare, media and
entertainment, and government can take advantage of IBM SONAS and IBM Active
Cloud Engine.
Education
File sharing at a university, for example, is an ideal use case for IBM SONAS.
Universities struggle constantly to provide a file system that is manageable yet able to
handle disparate workloads and levels of access. Any university has to
cater to the storage requirements of its research departments, administration, and
students, among many other workloads. IBM SONAS can tier storage so that
available resources are allocated according to need. High-performance storage can be
allocated to the research department, while low-cost storage can be allocated for
student home directories. The Active Cloud Engine's appearance of a single system
combined with its ability to share files makes it easy for universities with multiple
campuses to maintain an accurate data set while enhancing collaboration. Policy-
based searches help keep undesired files, such as MP3s, off university storage.
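The university scenario above can be expressed as a small set of placement rules. The sketch below is illustrative only: the rule format, tier names, and paths are hypothetical and do not reflect the SONAS policy language. The text describes policy-based searches for undesired files; for brevity the sketch folds that check into placement.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import Optional


@dataclass(frozen=True)
class Rule:
    description: str
    path_prefix: str               # directory the rule applies to
    tier: Optional[str]            # target tier, or None to reject the file
    suffix: Optional[str] = None   # optional file extension filter


# Hypothetical policy for a university; the first matching rule wins.
RULES = [
    Rule("no MP3s on university storage", "/", None, suffix=".mp3"),
    Rule("research data on fast disk", "/research", "high_performance"),
    Rule("student home directories on low-cost disk", "/home", "low_cost"),
]
DEFAULT_TIER = "standard"


def place(path: str) -> Optional[str]:
    """Return the tier for a new file, or None if policy rejects it."""
    p = PurePosixPath(path)
    for rule in RULES:
        in_scope = p.is_relative_to(rule.path_prefix)
        suffix_ok = rule.suffix is None or p.suffix.lower() == rule.suffix
        if in_scope and suffix_ok:
            return rule.tier
    return DEFAULT_TIER


for example in ("/research/genome/run1.dat", "/home/alice/essay.txt",
                "/home/alice/song.MP3", "/admin/payroll.xls"):
    print(example, "->", place(example))
```

Ordering the rules from most to least specific keeps the policy readable: the exclusion is checked first, then workload-specific tiers, then a default.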
Life Sciences and Healthcare
Modern-day science focuses largely on researching treatments for diseases such as
cancer. Institutions such as genomics centers, cancer research laboratories, and
university research groups generate gene sequencing data and other vital information for studies.
While institutions that are geographically dispersed collect this information, they may
want the information to be centralized for analysis. The collection and analysis process
can take an extended period of time. IBM SONAS' tiering capability helps move data to
the appropriate storage, and its Active Cloud Engine helps keep the data centralized
and accurate. Leveraging IBM SONAS' hierarchical storage management capabilities,
these institutions can automatically store older files on lower-cost storage tiers, including
tape, while still allowing those files to be visible to the user. In future years, data stored
on tape can be recalled and compared with newer genetics results.
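The hierarchical storage management behavior described above, in which older files move to tape but remain visible, can be sketched as follows. This is a simplified model under stated assumptions: the one-year threshold, the `FileEntry` structure, and the function names are hypothetical, not taken from SONAS or TSM/HSM.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class FileEntry:
    name: str
    last_access: date
    tier: str = "disk"   # where the file's data currently lives


MIGRATE_AFTER = timedelta(days=365)  # hypothetical age threshold


def migrate_cold_files(namespace: dict[str, FileEntry], today: date) -> list[str]:
    """Move files not accessed within MIGRATE_AFTER to tape; keep them listed."""
    moved = []
    for entry in namespace.values():
        if entry.tier == "disk" and today - entry.last_access > MIGRATE_AFTER:
            entry.tier = "tape"   # the data moves; the namespace entry stays
            moved.append(entry.name)
    return moved


def recall(namespace: dict[str, FileEntry], name: str, today: date) -> FileEntry:
    """Bring a file back to disk when a user opens it."""
    entry = namespace[name]
    entry.tier = "disk"
    entry.last_access = today
    return entry


today = date(2011, 11, 1)
namespace = {
    "genome_2008.seq": FileEntry("genome_2008.seq", date(2008, 5, 1)),
    "genome_2011.seq": FileEntry("genome_2011.seq", date(2011, 10, 15)),
}
moved = migrate_cold_files(namespace, today)
print("Migrated to tape:", moved)
print("Still visible:", sorted(namespace))
```

The namespace dictionary never loses an entry, which mirrors the point that migrated files stay visible to the user and can be recalled later for comparison with newer results.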
Media and Entertainment
The media and entertainment industry needs to use a wide range of applications from
content creation to pre- and post-production to distribution. IBM SONAS' built-in
architecture supports content aggregation from dispersed locations along with policy-
based content archiving (tiering). IBM SONAS also supports content distribution by
searching for or streaming media while supporting real-time applications such as
pay-per-view and video on demand.
Government
Many governments worldwide have set up storage infrastructures to support a variety of
data sets such as video surveillance and archive records. Various geographically
dispersed investigative agencies often refer to video/audio content from numerous
locations. This data needs to be centralized and dispersed efficiently and quickly.
Various records may need to be archived for an extended period of time and therefore
need to be moved to low-cost storage. IBM SONAS combined with IBM Active Cloud
Engine provides an efficient means to handle such government workloads.
FUTURE OUTLOOK
In terms of capacity shipped, of all the FBS segments, the scale-out NAS appliance
segment is expected to grow at a CAGR of 147.5% through 2014. While this still
represents roughly one-eighth of non–file server FBS capacity
shipments in 2014, IDC expects this exponential growth to continue through the end
of the decade.
IDC expects scale-out NAS to be the largest segment of the FBS storage market (not
including those FBS systems acquired as file servers) by 2016.
IBM's investment in SONAS and leverage of a successful parallel file system (GPFS),
along with SONAS' simplicity and scalability aspects, should make the solution one of
the major contenders in this segment of the market. The private cloud will benefit from
the following core features of IBM Active Cloud Engine:
Namespace virtualization and ubiquitous access of data across the globe
Ability to reduce administrative and network costs by automatically distributing
files closer to users
Ability to move files to appropriate storage tier based on policy
Ability to improve data protection
CHALLENGES/OPPORTUNITIES
For organizations that are less geographically dispersed but that still need to store
and manage very large data sets, such as national laboratories, oil and gas
corporations, and research facilities, IBM SONAS offers a simple approach to
addressing the data storage problem.
Despite the strengths of scale-out NAS solutions, the offerings are not without their
challenges.
The idea of scale-out NAS is still nascent. Many organizations are still dealing with
the adoption of and/or conversion to traditional file-based storage solutions after many
years (perhaps even decades) of using block-based storage networks.
Additionally, many parallel file systems are currently in use. IBM's GPFS is but one of
them. Some of the challenges that surround GPFS are less technical and more
philosophical. Those experienced in parallel file systems often come from a
high-performance computing background, where they may have used competing parallel
file systems and may be philosophically opposed to GPFS.
That said, large enterprises that need to manage, distribute, protect, and aggregate
massive amounts of data should find IBM SONAS a compelling offering.
Such opportunities include organizations that may leverage its geographically
distributable design to store compliance data. The ability to move data across borders
(or, conversely, retain data within borders) while having a singular, centralized
management console can dramatically reduce the cost of managing these large
data sets.
CONCLUSION
An increasing number of enterprises see the benefit of private clouds as a way to:
Rein in the ever-rising costs of IT
Provide a more business-friendly, pay-as-you-use chargeback model
Manage the explosion of data within the organization
Do all of this with minimal (or no) increase in head count
Deliver IT services within a service-level agreement (SLA)
Being a cloud user and a cloud vendor, IBM is an experienced player in the cloud
arena with a deep understanding of the private cloud requirements. IBM has created
an efficient, scalable, and easy-to-use private cloud environment with SONAS to help
enterprises meet the previously mentioned goals. The private cloud is slated to grow
and will be adopted aggressively over the next decade. Customers leaning toward
adopting the private cloud should consider SONAS as a way to improve IT utilization
and operational efficiency.
Copyright Notice
External Publication of IDC Information and Data — Any IDC information that is to be
used in advertising, press releases, or promotional materials requires prior written
approval from the appropriate IDC Vice President or Country Manager. A draft of the
proposed document should accompany any such request. IDC reserves the right to
deny approval of external usage for any reason.
Copyright 2011 IDC. Reproduction without written permission is completely forbidden.