A Special Report on Infrastructure Futures:
Keeping Pace in the Era of Big Data Growth
How big data analytics impose huge challenges for storage
professionals and the keys for preparing for the future
David Vellante, David Floyer
Analysis from The Wikibon Project May 2012
A Wikibon Reprint
2. Wikibon.org 1 of 13
The cumulative effect of decades of IT infrastructure investment around a diverse
set of technologies and processes has stifled innovation at organizations around the
globe. Layer upon layer of complexity to accommodate a staggering array of
applications has created hardened processes that make changes to systems difficult
and cumbersome.
The result has been an escalation of labor costs over the years to support this
complexity. Ironically, computers are supposed to automate manual tasks, but the
statistics show some alarming data that flies in the face of this industry promise. In
particular, the percent of spending for both internal and outsourced IT staff has
exploded over the past 15 years. According to Wikibon estimates, of the $250B
spent on server- and storage-related hardware and staffing costs last year, nearly
60% was spent on labor. IDC figures provide further evidence of this trend. The
research firm’s forecasts are even more aggressive than Wikibon’s, with estimates
that suggest labor costs will approach 70% by 2013 (see Figure 1 below).
The situation is untenable for most IT organizations and is compounded by the
explosion of data. Marketers often cite Gartner’s three V’s of Big Data (volume,
velocity, and variety), which refer respectively to data growth, the speed at which
organizations are ingesting data, and the diversity in data texture (e.g., structured,
unstructured, video, etc.). There is a fourth V that is often overlooked: Value.
WikiTrend: By 2015, the majority of IT organizations will come to the
realization that big data analytics is tipping the scales and making
information a source of competitive value that can be monetized and not
just a liability that needs to be managed. Organizations that
cannot capitalize on data as an opportunity risk losing market share.
From an infrastructure standpoint, Wikibon sees five keys to achieving this vision:
▪ Simplifying IT infrastructure through tighter integration across the hardware
stack;
▪ Creating end-to-end virtualization beyond servers into networks, storage, and
applications;
▪ Exploiting flash and managing a changing hardware stack by intelligently
matching data and media characteristics;
▪ Containing data growth by making storage optimization a fundamental capability
of the system;
▪ Developing a service orientation by automating business and IT processes
through infrastructure that can support applications across the portfolio,
versus within a silo, and provide infrastructure-as-a-service that is
“application aware.”
This research note is the latest in a series of efforts to aggregate the experiences of
users within the Wikibon community and put forth a vision for the future of
infrastructure management.
The IT Labor Problem
The trend toward IT consumerization, led by Web giants servicing millions of users,
often with a single or very few applications, has ushered in a new sense of urgency
for IT organizations. C-level and business line executives have far better
experiences with Web apps from Google, Facebook, and Zynga than with their
internal IT systems as these services have become the poster children of simplicity,
rapid change, speed, and a great user experience.
In an effort to simplify IT and reduce costs, traditional IT organizations have
aggressively adopted server virtualization and built private clouds. Yet relative to
the Web leaders, most IT organizations are still far behind the Internet innovators.
The reasons are quite obvious as large Web properties had the luxury of starting
with a clean sheet of paper and have installed highly homogeneous infrastructure
built for scale.
Both vendor and user communities are fond of citing statistics that 70% of IT
spending is allocated to “Running the Business”, while only 30% goes toward
growth and innovation. Why is this? The answer can be found by observing IT labor
costs over time.
Data derived from researcher IDC (see Figure 1) shows that in 1996, around $30B
was spent on IT infrastructure labor costs, which at the time represented only
about 30% of total infrastructure costs. By next year, the data says that more than
$170B will be spent on managing infrastructure (i.e. labor), which will account for
nearly 70% of the total infrastructure costs (including capex and opex). This is a
whopping 6X increase in labor costs, while overall spending has only increased 2.5X
in those 15+ years.
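As a rough check on the growth multiples above, the totals implied by the quoted labor figures and shares can be worked out directly. This is a minimal sketch that uses only the rounded numbers cited in the text; the implied totals are derived here rather than taken from IDC, and the variable names are illustrative.

```python
# Rounded figures quoted in the text (derived from IDC data).
labor_1996 = 30e9    # about $30B spent on infrastructure labor in 1996
share_1996 = 0.30    # labor was about 30% of total infrastructure cost
labor_2013 = 170e9   # more than $170B projected for labor
share_2013 = 0.70    # labor approaching 70% of total infrastructure cost

# Totals are implied by labor / share; they are not quoted in the text.
total_1996 = labor_1996 / share_1996
total_2013 = labor_2013 / share_2013

labor_growth = labor_2013 / labor_1996
total_growth = total_2013 / total_1996

print(f"Implied total 1996: ${total_1996 / 1e9:.0f}B")
print(f"Implied total 2013: ${total_2013 / 1e9:.0f}B")
print(f"Labor growth: {labor_growth:.1f}X")
print(f"Total growth: {total_growth:.1f}X")
```

With these rounded inputs, labor grows by about 5.7X and the implied total by about 2.4X, which is in line with the approximate 6X and 2.5X cited above.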
Figure 1 – IT Labor Cost Over Time
Data Source: IDC 2012
What does this data tell us? It says we live in a labor-intensive IT economy and
something has to change. The reality is IT investments primarily go toward labor
and this labor-intensity is slowing down innovation. This trend is a primary reason
that IT is not keeping pace with business today — it simply doesn’t have the
economic model to respond quickly at scale. In order for customers to go in new
directions and break this gridlock, vendors must address the real cost of
computing: people.
The answer is one part technology, one part people, and one part process.
Virtualization/cloud is the dominant technology trend, and we live in a world where
IT infrastructure and applications, and the security that protects data sources, are
viewed as virtual, not physical entities. The other four dominant technology
themes reported by Wikibon community practitioners are:
1. A move toward pre-engineered and integrated systems (aka converged
infrastructure) that eliminate or at least reduce mundane tasks such as patch
management;
2. Much more aggressive adoption of virtualization beyond servers;
3. A flash-oriented storage hierarchy that exploits automated operations and a
reduction in the manual movement of data — i.e. “smarter systems” that are
both automated and application aware — meaning infrastructure can support
applications across the portfolio and adjust based on quality of service
requirements and policy;
4. Products that are inherently efficient and make data reduction features like
compression and de-duplication fundamental capabilities, not optional add-
ons, along with new media such as flash and the ability to automate
management of the storage infrastructure.
From a people standpoint, organizations are updating skills and training people in
emerging disciplines including data science, devops (the intersection of application
development and infrastructure operations), and other emerging fields that will
enable the monetization of data and deliver hyper increases in productivity.
The goal is that the combination of improved technologies and people skills will lead
to new processes that begin to reshape decades of complexity and deliver a much
more streamlined set of services that are cloud-like and services-oriented.
The hard reality is that this is a difficult task for most organizations, and an
intelligent mix of internal innovation with external sourcing will be required to meet
these objectives and close the gap with the Web giants and emerging cloud service
providers.
New Models of Infrastructure Management
IT infrastructure management is changing to keep pace as new models challenge
existing management practices. Traditional approaches use purpose-built
configurations that meet specific application performance, resilience, and space
requirements. These are proving wasteful, as infrastructure is often over-
provisioned and underutilized.
The transformative model is to build flexible, self-administered services from
industry-standard components that can be shared and deployed on an as-needed
basis, with usage levels adjusted up or down according to business need. These IT
services building blocks can come as services from public cloud and SaaS providers,
as services provided by the IT department (private clouds), or increasingly as
hybrids between private and public infrastructure.
Efforts by most IT organizations to self-assemble this infrastructure have led to a
repeat of current problems, namely that the specification and maintenance of all
the parts requires significant staff overhead to build and service the infrastructure.
Increasingly, vendors are providing a complete stack of components, including
compute, storage, networking, operating system, and infrastructure management
software.
Creating and maintaining such a stack is not a trivial task. It will not be sufficient
for vendors or systems integrators to create a marketing or sales bundle of
component parts and then hand over the maintenance to the IT department; the
savings from such a model are minimal over traditional approaches. The stack must
be completely integrated, tested, and maintained by the supplier as a single SKU,
or as a well-documented solution with codified best practices that can be applied for
virtually any application. The resultant stack has to be simple enough that a single
IT group can completely manage the system and resolve virtually any issue on its
own.
Equally important, the cost of the stack must be reasonable and must scale out
efficiently. Service providers are effectively using open-source software and focused
specialist skills to decrease the cost of their services. Internal IT will not be able to
compete with service providers if their software costs are out of line.
The risk to this integrated approach, according to members of the Wikibon
practitioner community, is lock-in. Buyers are concerned that sellers will, over time,
gain pricing power and return to the days of mainframe-like economics. This
concern has merit. Sellers of converged systems today are providing large
incentives to buyers in the form of aggressive pricing and white glove service in an
effort to maintain account control and essentially lock customers into their specific
offering. The best advice is as follows:
▪ Consider converged infrastructure in situations where cloud-like services provide
clear strategic advantage, and the value offsets the risk of lock-in down the
road.
▪ Design processes so that data doesn’t become siloed. In other words, make sure
your data can be migrated easily to other infrastructure.
▪ Don’t sole source. Many providers of integrated infrastructure have realized they
must provide choice of various components such as hypervisor, network, and
server. Keep your options open with a dual-sourcing strategy.
WikiTrend: Despite the risk of lock-in, by 2017, more than 60% of
infrastructure will be purchased as some type of integrated system, either
as a single SKU or a pre-tested reference architecture.
The goal of installing integrated or converged infrastructure is to deliver a world
without stovepipes, where hardware and software can support applications across
the portfolio. The tradeoff of this strategy is that it lessens the benefits of tailor-made
infrastructure that exactly meets the needs of an application. For the few
applications that are critical to revenue generation, this will continue to be a viable
model. However, Wikibon users indicate that 90% or more of the applications do
not need a purpose-built approach, and Wikibon has used financial models to
determine that a converged infrastructure environment will cut the operational
costs by more than 50%.
Figure 2 – Traditional Stove-piped Infrastructure Model
Source: Wikibon 2012
The key to exploiting this model is tackling the 90% long tail of applications by
aggregating common technology building blocks into a converged infrastructure.
There are two major objectives in taking this approach:
1. Drive down operational costs by using an integrated stack of hardware,
operating systems, and middleware;
2. Accelerate the deployment of applications.
Figure 3 – Infrastructure 2.0 Services Model
Source: Wikibon 2012
Virtualization: Moving Beyond Servers
Volume servers that came from the consumer space only had the capability of
running one application per server. The result was servers that had very low
utilization rates, usually well below 10%. Specialized servers that can run multiple
applications can achieve higher utilization rates but at much higher system and
software costs.
Hypervisors, such as VMware, Microsoft’s Hyper-V, Xen, and hypervisors from IBM
and Oracle, have changed the equation. The hypervisors virtualize the system
resources and allow them to be shared among multiple operating systems. Each
operating system thinks that it has control of a complete hardware system, but the
hypervisor is sharing those resources among them.
The result of this innovation is that volume servers can be driven to much higher
utilization levels, three to four times that of stand-alone systems. This makes low-cost
volume servers that are derived directly from volume consumer products such
as PCs much more attractive as a foundation for processing and much cheaper than
specialized servers and mainframes. There will still be a place for very high-
performance specialized servers for some applications such as certain performance-
critical databases, but the volume will be much lower.
The impact of server virtualization on storage is profound. The I/O path to a server
provides service to many different operating systems and applications. The result is
that the access patterns as seen by the storage devices are much less predictable
and more random. The impact of higher server utilization (and of multi-core
processors) is that IO volumes (IOPS, IOs per second) will be much higher.
Increasingly, fewer processor cycles will be available for housekeeping activities such
as backup.
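The loss of predictability described above can be illustrated with a small simulation. The sketch below assumes four guests that each read sequentially from their own region of a shared device, and it uses a simple round-robin interleave as a stand-in for the hypervisor. Real hypervisor scheduling is more complex, and all function names here are hypothetical.

```python
def sequential_fraction(lbas):
    """Fraction of requests that land right after the previous request."""
    if len(lbas) < 2:
        return 1.0
    hits = sum(1 for prev, cur in zip(lbas, lbas[1:]) if cur == prev + 1)
    return hits / (len(lbas) - 1)


def vm_stream(base, length):
    """One guest reading `length` consecutive blocks starting at `base`."""
    return [base + i for i in range(length)]


def blend(streams):
    """Interleave the guests' requests round-robin onto one I/O path.

    Round-robin is an assumption made for determinism, not a model of
    any particular hypervisor.
    """
    mixed = []
    for requests in zip(*streams):
        mixed.extend(requests)
    return mixed


single = vm_stream(base=0, length=1000)
guests = [vm_stream(base=vm * 1_000_000, length=1000) for vm in range(4)]

print("one guest:", sequential_fraction(single))
print("four guests blended:", sequential_fraction(blend(guests)))
```

Each guest's stream is fully sequential on its own, yet once the four streams share one path the storage device sees no two consecutive requests that are adjacent on disk.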
Server virtualization is changing the way that storage is allocated, monitored, and
managed. Instead of defining LUNs and RAID levels, virtual systems are defining
virtual disks and expect array information to reflect these virtual machines and
virtual disks and the applications they are running. Storage virtualization engines
are enabling the pooling of multiple heterogeneous arrays, providing both
investment protection and flexibility for IT organizations with diverse asset bases.
As well, virtualizing the storage layer dramatically simplifies storage provisioning
and management, much in the same way that server virtualization attacked the
problem of underutilized assets.
Conclusions for Storage: Storage arrays will have to serve much higher volumes
of random read and write IOs with applications using multiple protocols. In
addition, storage arrays will need to work across heterogeneous assets and
virtualized systems and speak the language of virtualized administrators. Newer
storage controllers (often implemented as virtual machines) are evolving that will
completely hide the complexities of traditional storage (e.g., LUNs and RAID
structures) and replace them with automated, virtual machine (VM)-focused storage
that provides the metrics virtual machine operators (e.g., VMware administrators)
need to monitor performance, resource utilization, and service level agreements
(SLAs) at a business application level.
Storage networks will have to adapt to providing a shared transport for the
different protocols. Adaptors and switches will increasingly use lossless Ethernet as
the transport mechanism, with different protocols running underneath.
Backup processes will need to be re-architected and linked to the application versus
a one-size-fits-all approach. Application-consistent snapshots and continuous backup
processes are some of the technologies that will become increasingly important
over time.
WikiTrend: Virtualization is moving beyond just servers and will impact the
entire infrastructure stack, from storage, backup, and networks to infrastructure
management and security. Overall, the strong trend toward a converged
infrastructure, where storage function placement is more dynamic, being
staged optimally in arrays, in virtual machines, or in servers, will
necessitate an end-to-end and more intelligent management paradigm.
Flash Storage: Implications to the Stack
Consumers are happy to pay premiums for flash memory over the price of disk
because of the convenience of flash. For example, the disk drives in early iPods
were replaced by flash because flash required very little battery power
and had no moving parts. The results were much smaller iPods that would work for
days without recharging and would work after being dropped. This led to huge
consumer volume shipments and flash storage costs dropped dramatically.
In the data center, systems and operating system architectures have had to
contend with the volatility of processors and high-speed RAM storage. If power was
lost to the system, all data in flight was lost. The solutions were either to protect
the processors and RAM with complicated and expensive battery backup systems or
to write the data out to disk storage, which is non-volatile. The difference between
the speed of disk drives (measured in milliseconds, 10^-3 seconds) and processor speed
(measured in nanoseconds, 10^-9 seconds) is huge and is a major constraint on system
speed. All systems wait for I/O at the same speed. This is especially true for
database systems.
Flash storage is much faster than disk drives (microseconds, 10^-6 seconds) and is persistent:
when the power is removed, the data is not lost. It can provide an additional
memory level between disk drives and RAM. The impact of flash memory is also
being seen in the iPad effect. The iPad is always on, and the response time for
applications compared with traditional PC systems is amazing. Applications are
being rewritten to take advantage of this capability, and operating systems are
being changed to take advantage of this additional layer. iPads and similar devices
are forecast to have a major impact on portable PCs, and the technology transfer
will have a major impact within the data center, both at the infrastructure level and
in the design of all software.
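The gap between these time scales is easier to appreciate as ratios. The sketch below uses only the order-of-magnitude figures given in this section (nanoseconds for the processor, microseconds for flash, milliseconds for disk). Actual device latencies vary widely, so the constants are illustrative assumptions and not measurements.

```python
# Order-of-magnitude time scales from the text, in seconds.
PROCESSOR = 1e-9   # nanoseconds
FLASH = 1e-6       # microseconds
DISK = 1e-3        # milliseconds


def cpu_units_per_access(access_time, cpu_time=PROCESSOR):
    """Processor-scale time units that elapse during one storage access."""
    return round(access_time / cpu_time)


print("processor units per disk access: ", cpu_units_per_access(DISK))
print("processor units per flash access:", cpu_units_per_access(FLASH))
print("disk-to-flash ratio:", round(DISK / FLASH))
```

At these scales a processor waits on the order of a million of its own time units for one disk access, but only about a thousand for one flash access, which is why flash fits naturally as a memory level between RAM and disk.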
IO Centric Processing: Big Data Goes Real-time
Wikibon has written extensively about the potential of flash to disrupt industries
and designing systems and infrastructure in the Big Data IO Centric era. The model
developed by Wikibon is shown in Figure 4.
Figure 4 – Real-time Big Data Processing with IO Centric Storage
Source: Wikibon 2012
The key to this capability is the ability to directly address the flash storage from the
processor with lockable atomic writes, as explained in a previous Wikibon discussion
on designing systems and infrastructure in the Big Data IO Centric era. This
technology has brought down the cost of IO intensive systems by two orders of
magnitude (100 times), whereas the cost of hard disk-only solutions has remained
constant. This trend will continue.
This technology removes the constraints of disk storage and allows the real-time
parallel ingest of transactional, operational, and social media data streams. It also
provides sufficient IO at low enough cost to allow Big Data transactional systems to
be processed in parallel with the Big Data indexing and metadata
processing that drive Big Data Analytics.
WikiTrend: Flash will enable changes in system and application design that
are profound. Transactional systems will evolve, as flash architectures will
remove locking constraints at the highest performance tier. Big Data
analytics will be integrated with operational systems and Big Data streams
will become direct inputs to applications, people, devices, and machines.
Metadata extraction, index data and other summary data will become
direct inputs to operational Big Data streams and enable more value to be
derived at lower costs from archival and backup systems.
Conclusions for Storage: Flash will become a ubiquitous technology that will be
used in processors as an additional memory level, in storage arrays as read/write
“Flash cache”, and as a high-speed disk device. Systems management software will
focus high I/O “hot-spots” and low latency I/O on flash technology and allow high-
density disk drives to store the less active data.
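One simple way to picture this hot-spot placement is a policy that ranks extents by access count and gives the busiest ones to flash. The sketch below is a toy illustration under that assumption. It is not how any particular array implements tiering, and the names `place_extents` and `flash_slots` are hypothetical.

```python
from collections import Counter


def place_extents(access_log, flash_slots):
    """Assign the most frequently accessed extents to flash, the rest to disk.

    access_log  -- iterable of extent ids, one entry per I/O
    flash_slots -- how many extents fit in the flash tier (hypothetical unit)
    """
    counts = Counter(access_log)
    # Rank by descending access count; break ties by extent id so the
    # result is deterministic.
    ranked = sorted(counts, key=lambda ext: (-counts[ext], ext))
    hot = set(ranked[:flash_slots])
    return {ext: ("flash" if ext in hot else "disk") for ext in counts}


log = ["e1", "e2", "e1", "e3", "e1", "e2", "e4"]
placement = place_extents(log, flash_slots=2)
for extent in sorted(placement):
    print(extent, "->", placement[extent])
```

A production system would also need to age out old counts and limit how often data moves between tiers; those concerns are left out here to keep the idea visible.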
Overall within the data center, flash storage will pull storage closer to the
processor. Because of heat density constraints, it is much
easier to put low power flash memory rather than disk drives very close to the
processor.
The result of more storage being closer to the processor will be for some storage
functionality to move away from storage arrays and filers and closer to the
processor, a trend that is made easier by multi-core processors that have cycles to
spare. The challenge for storage management will be to provide the ability to share
a much more distributed storage resource between processors. Future storage
management will have to contend with sharing storage that is within servers as well
as traditional SANs and filers outside servers.
Storage Efficiency Technologies
Storage efficiency is the ability to reduce the amount of physical data on the disk
drives required to store the logical copies of the data as seen by the file systems.
Many of the technologies have become or are becoming mainstream capabilities.
Key technologies include:
▪ Storage virtualization:
Storage virtualization allows volumes to be logically broken into
smaller pieces and mapped onto physical storage. This allows much
greater efficiency in storing data, which previously had to be stored
contiguously. This technology also allows dynamic migration of data
within arrays that can also be used for dynamic tiering systems.
Sophisticated tiering systems, which allow small chunks of data (sub-
LUN) to be migrated to the best place in the storage hierarchy, have
become a standard feature in most arrays.
▪ Thin provisioning:
Thin provisioning is the ability to provision storage dynamically from a
pool of storage that is shared between volumes. This capability has
been extended to include techniques for detecting zeros (blanks) in file
systems and using no physical space to store them. This again has
become a standard feature expected in storage arrays.
▪ Snapshot technologies:
Space-efficient snapshot technologies can be used to store just the
changed blocks and therefore reduce the space required for copies.
This provides the foundation of a new way of backing up systems
using periodic space-efficient snapshots and replicating these copies
remotely.
▪ Data de-duplication:
Data de-duplication was initially introduced for backup systems, where
many copies of the same or nearly the same data were being stored
for recovery purposes. This technology is now extending to inline
production data, and is set to become a standard feature on storage
controllers.
▪ Data compression:
Originally data compression was an offline process used to reduce the
data held. Data compression is used in almost all tape systems, is now
being extended to online production disk storage systems, and is set
to become a standard feature in many storage controllers. The
standard compression algorithms used are based on LZ (Lempel and
Ziv), and give a compression ratio between 2:1 and 3:1. Compression
is not effective on files that have compression built-in (e.g., JPEG
image files, most audio visual files). The trend is toward real time
compression where performance is not compromised.
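Two of the techniques listed above, de-duplication and compression, can be sketched in a few lines. The example assumes fixed-size blocks identified by a SHA-256 digest, and it uses Python's zlib module (DEFLATE, which combines LZ77 with Huffman coding) as a stand-in for the LZ-based algorithms mentioned. The block size, the sample data, and the function names are hypothetical, and random bytes stand in for already-compressed content such as JPEG files.

```python
import hashlib
import os
import zlib

BLOCK_SIZE = 4096  # hypothetical block size


def dedup_store(data, block_size=BLOCK_SIZE):
    """Split data into fixed-size blocks and keep one copy per unique block.

    Returns (store, recipe): store maps SHA-256 digest -> block bytes,
    recipe is the ordered list of digests needed to rebuild the data.
    """
    store, recipe = {}, []
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        digest = hashlib.sha256(block).hexdigest()
        store.setdefault(digest, block)
        recipe.append(digest)
    return store, recipe


def rebuild(store, recipe):
    """Reassemble the original data from the stored unique blocks."""
    return b"".join(store[digest] for digest in recipe)


def compression_ratio(data):
    """Logical size divided by zlib-compressed size."""
    return len(data) / len(zlib.compress(data))


# Ten identical "backup" copies of one block of repetitive text.
block = (b"quarterly sales report " * 200)[:BLOCK_SIZE]
backups = block * 10
store, recipe = dedup_store(backups)
print(len(recipe), "logical blocks,", len(store), "stored")
print(f"text ratio: {compression_ratio(block):.1f}:1")

# Random bytes behave like data that is already compressed.
noise = os.urandom(BLOCK_SIZE)
print(f"random ratio: {compression_ratio(noise):.2f}:1")
```

The sample text is artificially repetitive, so its ratio is far higher than the 2:1 to 3:1 typical of real data; the point is only the contrast with the incompressible input, whose ratio stays at or below roughly 1:1.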
WikiTrend: Storage efficiency technologies will have a significant impact
on the amount of storage saved. However, they will not affect the number
of I/Os and the bandwidth required to transfer I/Os. Storage efficiency
techniques will be applied to the most appropriate part of the
infrastructure and become increasingly embedded into systems and
storage design.
Milestones for Next Generation Infrastructure Exploitation
Some key milestones are required to exploit new infrastructure directions in general
and storage infrastructure in particular:
1. Sell the vision to senior business managers.
2. Create a Next Generation Infrastructure Team, including cloud
infrastructure.
3. Set aggressive targets for Infrastructure implementation and cost
savings, in line with external IT service offerings.
4. Select a stack for each set of application suites:
▪ Choose a single vendor Infrastructure stack from a large vendor
that can supply and maintain the hardware and software as a single
stack. The advantage of this approach is the cost of maintenance
within the IT department can be dramatically reduced if the software is
treated as a single SKU and updated as such, and the hardware
firmware is treated the same way. The disadvantage is lack of choice
for components of the stack, and a higher degree of lock-in.
▪ Limit lock-in with a sourcing strategy. Choose an Ecosystem
Infrastructure Stack of software and hardware components that can
be intermixed. The advantage of this approach is greater choice and
less lock-in, at the expense of significantly increased costs of internal
IT maintenance.
5. Reorganize and flatten IT support by stack(s), and move away from an
organization supporting stovepipes. Give application development and
support groups the responsibility to determine the service levels required,
and the Next Generation Infrastructure team the responsibility to provide the
infrastructure services to meet the SLA. Included in this initiative should be a
move to DevOps, where application development and infrastructure
operation teams are cross-trained with the goal of achieving hyper
productivity.
6. Create a self-service IT environment with a service catalogue and
integrate charge-back or show-back controls.
From a strategic point of view, it will be important for IT to compete with external
IT infrastructure suppliers where internal data proximity or privacy requirements
dictate the use of private clouds, and use complementary external cloud services
where internal clouds are not economic.
Overall Storage Directions and Conclusions
Storage infrastructure will change significantly with the implementation of a new
generation of infrastructure across the portfolio. There will be a small percentage of
application suites that will require a siloed stack and large scale-up monolithic
arrays, but the long tail (90% of application suites) will require standard storage
services that are inherently efficient and automated. These storage services will be
more distributed within the stack with increasing amounts of flash devices and
distributed within private and public cloud services. Storage software functionality
will become more elastic and will reside in or migrate to the part of the stack that
makes the most practical sense, either in the array or in the server or in a combination
of the two.
The I/O connections between storage and servers will become virtualized, with a
combination of virtualized network adapters and other virtual I/O mechanisms. This
approach will save space, drastically reduce cabling, and allow dynamic
reconfiguration of resources. The transport fabrics will be lossless Ethernet with
some use of InfiniBand or other high speed interconnects for inter-processor
communication. Storage will become protocol agnostic. Where possible, storage will
follow a scale-out model, with meta-data management a key component.
The storage infrastructure will allow dynamic transport of data across the network
when required, for instance to support business continuity, and with some
balancing of workloads. However, data volumes and bandwidth are growing at
approximately the same rate, and large-scale movement of data between sites will
not be a viable strategy. Instead, applications (especially business intelligence and
analytics applications) will often be moved to where the data is (the Hadoop model)
rather than pushing data to the code. This will be especially true of Big Data
environments, where vast amounts of semi-structured data will be available within
the private and public clouds.
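The argument against bulk data movement comes down to simple arithmetic: transfer time is data size divided by bandwidth, so if both grow at the same rate the time never improves. The sketch below uses hypothetical figures (100 TB over a dedicated 1 Gbit/s link) chosen only for illustration, and it ignores protocol overhead unless an `efficiency` factor is supplied.

```python
def transfer_days(data_bytes, link_bits_per_sec, efficiency=1.0):
    """Days needed to push data_bytes over a link at the given line rate."""
    seconds = data_bytes * 8 / (link_bits_per_sec * efficiency)
    return seconds / 86_400


TB = 10**12     # decimal terabyte, in bytes
GBPS = 10**9    # gigabit per second, in bits per second

# Hypothetical starting point: 100 TB over a dedicated 1 Gbit/s link.
today = transfer_days(100 * TB, 1 * GBPS)
# If data volume and bandwidth both double, the transfer takes just as long.
later = transfer_days(200 * TB, 2 * GBPS)

print(f"today: {today:.1f} days")
print(f"later: {later:.1f} days")
```

Under these assumptions a full copy takes more than nine days in both cases, which is the motivation for shipping the application to the data instead.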
The criteria for selecting storage vendors will change in the future. Storage vendors
will have significant opportunities for innovation within the stack. They will have to
take a systems approach to storage and be able to move the storage software
functionality to the optimal place within the stack in an automated and intelligent
manner. Distributed storage management function will be a critical component of
this strategy, together with seamless integration into backup, recovery, and business
continuance. Storage vendors will need to forge close links with the stack providers,
so that there is a single support system (e.g., remote support), a single update
mechanism for maintenance, and a single stack management system.
Action Item: Next generation storage infrastructure is coming to a theater
near you. The bottom line is that, in order to scale and “compete” with cloud
service providers, internal IT organizations must spend less time on labor-intensive
infrastructure management and more effort on automation and
providing efficient storage services at scale. The path to this vision will go
through integration in the form of converged infrastructure across the
stack with intelligent management of new types of storage (e.g. flash) and
the integration of Big Data analytics with operational systems to extract
new value from information sources.