Integrating Analytics into the
Operational Fabric of Your Business
A combined platform for optimizing analytics and operations
April 2012
A White Paper by
Dr. Barry Devlin, 9sight Consulting
barry@9sight.com
Business is running ever faster—generating, collecting and using increas-
ing volumes of data about every aspect of the interactions between sup-
pliers, manufacturers, retailers and customers. Within these mountains of
data are seams of gold—patterns of behavior that can be interpreted,
classified and analyzed to allow predictions of real value. Which treat-
ment is likely to be most effective for this patient? What can we offer that
this particular customer is more likely to buy? Can we identify if that
transaction is fraudulent before the sale is closed?
To these questions and more, operational analytics—the combination of
deep data analysis and transaction processing systems—has an answer.
This paper describes what operational analytics is and what it offers to the
business. We explore its relationship to business intelligence (BI) and see
how traditional data warehouse architectures struggle to support it. Now,
the combination of advanced hardware and software technologies provides
the opportunity to create a new integrated platform delivering powerful
operational analytics within the existing IT fabric of the enterprise.
With the IBM DB2 Analytics Accelerator, a new hardware/software offer-
ing on System z, the power of the massively parallel processing (MPP) IBM
Netezza is closely integrated with the mainframe and accessed directly and
transparently via DB2 on z/OS. The IBM DB2 Analytics Accelerator brings
enormous query performance gains to analytic queries and enables direct
integration with operational processes.
This integrated environment also enables distributed data marts to be re-
turned to the mainframe environment, enabling significant reductions in
data management and total ownership costs.
Contents
Operational analytics—diamonds in the detail, magic in the moment
Data warehousing and the evolution of species
An integrated platform for OLTP and operational analytics
Business benefits and architectural advantages
Conclusions
A large multichannel retailer discovered some of its customers were receiving up to 60 catalog
mailings a year through multiple marketing campaigns. Customer satisfaction was at risk and
profits were slowing; increased mailing did not drive higher sales. A shift in thinking was
needed: from “finding customers for my products” to “finding the right products for my customers.”
That meant analyzing customer behavior, from what they searched for on the website to what they
bought and even returned in order to know what to offer them. As a result, the retailer saw an extra
US$3.5 million in profit, a 7% drop in mailings as well as increased customer satisfaction.1
The airline industry has long been using historical information about high-value customers, such as
customer preferences, flights taken, recent flight disruptions and more, to enable operational deci-
sions to be taken about who gets priority treatment when, for example, a delayed arrival breaks con-
nections for passengers. That’s using historical data in near real-time. Now, carriers are analyzing
real-time and historical data from customers browsing their website to make pricing decisions on the
fly (no pun intended!) to maximize seat occupancy and profit.2
The wheels of commerce turn ever faster. Business models grow more complex. Channels to cus-
tomers and suppliers multiply. Making the right decision at the right time becomes ever more diffi-
cult. And ever more vital. Analysis followed by action is the key…
Operational analytics—diamonds in the detail, magic in the moment
“Sweet Analytics, 'tis thou hast ravished me.”3
Business Analytics. Predictive analytics. Operational Analytics. “Insert-attractive-word-here
Analytics” is a popular marketing game. Even Dr. Faustus espoused “Sweet Analytics”, as
Christopher Marlowe wrote at the end of the 16th century! The definitions of the terms overlap
significantly. The opportunities for confusion multiply. So, let’s define operational analytics:
Analytics
Wikipedia offers a practical definition4: “analytics is the process of developing optimal or realistic
decision recommendations based on insights derived through the application of statistical models and
analysis against existing and/or simulated future data.” This is a good start. It covers all the variants
above and emphasizes recommendations for decisions as the goal. Analysis for the sake of under-
standing the past is interesting, but only analysis that influences future decisions offers return on
investment, and then only where decisions lead to actions.
Operational
Business intelligence (BI) practitioners understand “operational” as the day-to-day actions required
to run the business—the online transaction processing (OLTP) systems that record and manage the
detailed, real-time activities between the business, its customers, suppliers, etc. This is in contrast to
informational systems where data is analyzed and reported upon.
Every day-to-day action demands one or more real-time decisions. Sometimes the answer is so ob-
vious that we don’t even see the question. An online retailer receives an order for an in-stock shirt
from a signed-in customer; without question, the order is accepted. But the implicit question—what
should we do with this order?—is much clearer if the item is out of stock, or if we have a higher mar-
gin shirt available that the customer might like. Every operational transaction has a decision associated
with it; every action is preceded by a decision. The decision may be obvious, but sometimes it is worth
asking: is a better outcome possible if we made a different decision and thus took a different action?
Operational Analytics
We can thus define operational analytics as the process of developing optimal or realistic recommen-
dations for real-time, operational decisions based on insights derived through the application of statis-
tical models and analysis against existing and/or simulated future data, and applying these recommen-
dations in real-time interactions. This definition leads directly to a four-step process (a minimal code
sketch follows the list):
1. Perform statistical analysis on a significant sample of historical transactional data to discover the likelihood of possible outcomes
2. Predict outcomes (a model) of different actions during future operational interactions
3. Apply this knowledge in real-time as an operational activity is occurring
4. Note the result and feed it back into the analysis stage.
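To make the loop concrete, here is a minimal sketch in Python, assuming scikit-learn and entirely synthetic data; the model choice, feature layout, action names and threshold are illustrative, not a prescription.

```python
"""Sketch of the four-step operational analytics loop (synthetic data;
model, features, actions and threshold are illustrative only)."""
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(42)

# Steps 1-2: offline analysis -- fit a statistical model to a sample of
# historical transactions (features) and their known outcomes (0/1).
hist_features = rng.normal(size=(10_000, 5))
hist_outcomes = (hist_features[:, 0] + rng.normal(size=10_000) > 0).astype(int)
model = LogisticRegression().fit(hist_features, hist_outcomes)

# Step 3: apply the model in real time, inside the operational activity.
def decide(txn_features, threshold=0.7):
    """Recommend an action for one in-flight transaction."""
    p = model.predict_proba(np.asarray(txn_features).reshape(1, -1))[0, 1]
    return ("make_offer" if p >= threshold else "standard_handling"), p

# Step 4: note the result and feed it back into the next analysis run.
feedback_log = []
action, score = decide(rng.normal(size=5))
feedback_log.append({"action": action, "score": score, "outcome": None})
```

Note that only decide() and the feedback write sit on the fast operational path; the model fitting runs offline, which matches the split in processing characteristics discussed next.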
From an IT perspective, steps (1) and (2) have very different processing characteristics than (3) and
(4). The former involve reading and number-crunching of potentially large volumes of data with rela-
tively undemanding constraints on the time taken. The latter require the exact opposite—fast re-
sponse time for writing small data volumes. This leads to a key conclusion. Operational analytics is a
process that requires a combination of informational and operational processing.
Operational BI
While the term operational analytics is very much flavor of the year, operational BI has been around
for years now. Is there any difference between the two? Some analysts and vendors suggest that
analytics is future oriented, while BI is backward-looking and report oriented. While there may be
some historical truth in this distinction, in practical terms today, the difference is limited. Analytics
typically includes more statistical analysis and modeling to reach conclusions, as in steps (1) and (2) of
the above process. Operational BI may include this but also other, simpler approaches to drawing
conclusions for input to operational activity, such as rule-based selection.
Operational analytics—why now and what for?
“Analytics themselves don't constitute a strategy, but using them to optimize
a distinctive business capability certainly constitutes a strategy.”5
What we’ve been discussing sounds a lot like data mining, a concept that has been around since the
early 1990s. And beyond advances in technology, there is indeed little difference. So, why is opera-
tional analytics suddenly a hot topic? The answers are simple:
1. Business operations are increasingly automated and digitized via websites, providing ever larger quantities of data for statistical analysis
2. Similarly, Web 2.0 is driving further volumes and varieties of analyzable data
3. As the speed of business change continues to accelerate, competition for business is intense
4. Data storage and processing continue to increase in power and decrease in cost, making operational analytics a financially viable process for smaller businesses
5. Making many small, low-value decisions better can make a bigger contribution to the bottom line than a few high-value ones; and the risk of failure is more widely spread
And, as enterprise decision management expert James Taylor points out6, operational data volumes
are large enough to provide statistically significant results, and the outcomes of decisions taken can
be seen and tracked over relatively short timeframes. Operational analytics thus offers a perfect
platform to begin to apply the technological advances in predictive analytics and test their validity.
So, let’s look briefly at the sort of things leading-edge companies are doing with operational analytics.
Marketing: what’s the next best action?
Cross-selling, upselling, next best offer and the like are marketing approaches that all stem from one
basic premise. It’s far easier to sell to an existing customer (or even a prospect who is in the process
of deciding to buy something) than it is to somebody with whom you have no prior interaction. They
all require that—or, at least, work best when—you know enough about (1) the prospective buyer, (2)
the context of the interaction and (3) your products, to make a sensible decision about what to do
next. Knowing the answers to those three questions can prove tricky; get them wrong and you risk
losing the sale altogether, alienating the customer, or simply selling something unprofitably. With
the growth of inbound marketing via websites and call centers, finding an automated approach to
answering these questions is vital. Operational analytics is that answer.
Analyzing a prospect’s previous buying behavior, and even pattern of browsing, can give insight into
interests, stage of life, and other indicators of what may be an appropriate next action from the cus-
tomer’s point of view. A detailed knowledge of the characteristics of your product range supplies the
other side of the equation. The goal is to bring this information together in the form of a predicted
best outcome during the short window of opportunity while the prospect is on the check-out web
page or in conversation with the call center agent.
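As a toy illustration of bringing the two sides of the equation together, the sketch below ranks candidate offers by expected margin; the products, margins and purchase probabilities are hypothetical, and a real system would derive the probabilities from the behavioral analysis just described.

```python
"""Toy next-best-offer scoring. All products, margins and probabilities
are hypothetical; in practice the probabilities come from analysis of
the prospect's buying and browsing behavior."""

# Product side of the equation: margin per sale.
margins = {"shirt_basic": 4.0, "shirt_premium": 11.0, "tie_silk": 7.5}

# Prospect side: estimated probability this prospect buys each product.
purchase_prob = {"shirt_basic": 0.30, "shirt_premium": 0.08, "tie_silk": 0.15}

def next_best_offer(margins, purchase_prob):
    """Rank offers by expected margin = P(buy) * margin."""
    scored = {p: purchase_prob[p] * m for p, m in margins.items()}
    return max(scored, key=scored.get), scored

offer, scores = next_best_offer(margins, purchase_prob)
print(offer)  # shirt_basic: 0.30 * 4.0 = 1.20 beats tie_silk's 1.125
```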
Consider Marriott International Inc., for example. The group has over 3,500 properties worldwide
and handles around three-quarters of a million new reservations daily. Marriott’s goal is to maximize
customer satisfaction and room occupancy simultaneously using an operational analytics approach.
Factors considered include the customer’s loyalty card status and history, stay length and timing. On
the room inventory side, rooms in the area of interest are categorized according to under- or over-
sold status, room features, etc. This information is brought together in a “best price, best yield” sce-
nario for both the customer and Marriott in under a second while the customer is shopping.
Risk: will the customer leave… and do I care?
“The top 20% of customers… typically generate more than 120% of an organization's profits.
The bottom 20% generate losses equaling more than 100% of profits.”7
Customer retention is a central feature of all businesses that have an ongoing relationship with their
customers for the provision of a service such as banking or insurance or a utility such as telecoms,
power or water. In the face of competition, the question asked at contract renewal time is: how like-
ly is this customer to leave? The subsidiary, and equally important, question is: do I care?
In depth analysis using techniques such as logistic regression, a decision tree, or survival analysis of
long-term customer behavior identifies potential churn based on indicators such as dissatisfaction
with service provided, complaints, billing errors or disputes, or a decrease in the number of transac-
tions. In most cases, the result of this analysis of potential churners is combined with an estimate of
likely lifetime value of the customers to aid in prioritization of actions to be taken. In high value cases,
the action may be proactive, involving outbound marketing. In other cases, customers may be
flagged for particular treatment when they next make contact.
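A minimal sketch of this churn-and-value triage, assuming scikit-learn and synthetic data; the behavioral indicators, lifetime-value thresholds and treatments are illustrative only.

```python
"""Illustrative churn-and-value triage: a churn probability (here from a
logistic regression, as in the text) combined with an estimated customer
lifetime value to prioritize retention actions. Data is synthetic and
all thresholds are hypothetical."""
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(7)

# Historical behavior per customer: [complaints, billing_disputes,
# transaction_trend], with known churn outcomes (1 = left).
X_hist = rng.normal(size=(5_000, 3))
y_hist = (0.8 * X_hist[:, 0] + 0.5 * X_hist[:, 1] - X_hist[:, 2]
          + rng.normal(size=5_000) > 0.5).astype(int)
churn_model = LogisticRegression().fit(X_hist, y_hist)

def retention_action(features, lifetime_value):
    """Decide the treatment for one customer at contract-renewal time."""
    p_churn = churn_model.predict_proba(
        np.asarray(features).reshape(1, -1))[0, 1]
    if p_churn < 0.3:
        return "no_action"
    if lifetime_value > 5_000:    # high value: proactive outbound contact
        return "outbound_offer"
    if lifetime_value > 0:        # positive value: flag for next contact
        return "flag_at_next_contact"
    return "allow_to_leave"       # unprofitable: do I care? No.

print(retention_action([1.2, 0.4, -0.8], lifetime_value=8_000))
```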
Fraud: is it really what it claims to be?
Detecting fraud is something best done as quickly as possible—preferably while in progress. This
clearly points to an operational aspect of implementation. In some cases, like credit card fraud, the
window of opportunity is even shorter than an OLTP transaction—suspect transactions must be caught in flight.
This requires real-time analysis of the event streams in flight, a topic beyond this paper, but one
where IBM and other vendors are offering existing and new tools to meet this growing need. But
there exist many types of fraud in insurance, social services, banking and other areas where opera-
tional analytics, as we’ve defined it, plays a key role in detection and prevention.
As in our previous examples, the first step is the analysis of historical data to discover patterns of be-
havior that can be correlated with proven outcomes, in this case with instances of deliberate fraud in
financial transactions, and even negligent or unthinking use of unnecessarily expensive procedures in
maintenance or medical treatment. Micro-segmentation of the customer base leads to clusters of
people with similar behaviors, groups of which correlate to fraud. Applying analytics on an opera-
tional timeframe can detect the emergence of these patterns in near real-time, allowing preventative
action to be taken.
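The sketch below illustrates the micro-segmentation idea under stated assumptions: scikit-learn k-means on synthetic behavior vectors, with each cluster scored by its historical fraud rate. The cluster count, feature layout and decision threshold are illustrative.

```python
"""Illustrative micro-segmentation for fraud detection: cluster
historical behavior, measure each cluster's proven fraud rate, then
score incoming activity by the fraud rate of its nearest cluster.
All data is synthetic."""
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(11)

# Historical behavior vectors with confirmed fraud labels.
X_hist = rng.normal(size=(20_000, 4))
fraud = (X_hist[:, 0] > 1.5).astype(int)   # synthetic "proven outcomes"

# Micro-segment the population into clusters of similar behavior.
km = KMeans(n_clusters=50, n_init=10, random_state=0).fit(X_hist)
labels = km.labels_

# Which micro-segments correlate with fraud?
cluster_fraud_rate = np.array(
    [fraud[labels == c].mean() for c in range(km.n_clusters)])

def fraud_score(behavior_vector):
    """Near-real-time scoring: fraud rate of the closest micro-segment."""
    c = km.predict(np.asarray(behavior_vector).reshape(1, -1))[0]
    return cluster_fraud_rate[c]

if fraud_score([2.0, 0.1, -0.3, 0.7]) > 0.5:
    print("hold transaction for review")
```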
Data warehousing and the evolution of species
With the recognition that operational analytics bridges traditional informational (data warehousing /
BI) and operational (OLTP) environments, it makes sense to examine how this
distinction evolved and how, in recent years, it has begun to break down as a result of
the ever increasing speed of response to change demanded by business today.
Genesis
Data warehousing and System z are cousins. The first data warehousing architecture was conceived
in IBM Europe and implemented on S/370 in the mid-1980s. As I and Paul Murphy documented in an
IBM Systems Journal article8 in 1988, the primary driver for data warehousing was the creation of an
integrated, consistent and reliable repository of historical information for decision support in IBM's
own sales and administration functions. The architecture proposed as a solution a “Business Data
Warehouse (BDW)… [a] single logical storehouse of all the information used to report on the business…
In relational terms, a view / number of views that… may have been obtained from different tables”. The
BDW was largely normalized, and the stored data reconciled and cleansed
through an integrated interface to the operational environment. Figure 1a shows this architecture.

[Figure 1: Evolution of the data warehouse architecture. 1a: a single business data warehouse fed from operational systems (adapted from Devlin & Murphy, 1988); 1b: enterprise data warehouse with dependent data marts, operational data store, data staging area and metadata (adapted from Devlin, 1997); 1c: today's environment, adding cubes, spreadsheets, personal and public data, and access via mashups, portals, SOA and federation.]
The split between operational and informational processing, driven by
both business and technological considerations, thus goes back to the
very foundations of data warehousing. At that time, business users wanted consistency of informa-
tion across both information sources and time; they wanted to see reports of trends over days and
weeks rather than the minute by minute variations of daily business. This suited IT well. Heavily
loaded and finely tuned OLTP systems would struggle to deliver such reports and might collapse in
the face of ad hoc queries. The architectural solution was obvious—extract, transform and load (ETL)
data from the OLTP systems into the data warehouse on a monthly, weekly and, eventually, daily ba-
sis as business began to value more timely data.
Middle Ages
The elegant simplicity of a single informational layer quickly succumbed to the limitations of early
relational databases, which were optimized for OLTP. As shown in figure 1b9, the informational layer
was further split into an enterprise data warehouse (EDW) and data marts fed from it. This architec-
tural structure and the rapid growth of commodity servers throughout the 1990s and 2000s, coupled
with functional empowerment of business units, have led to the highly distributed, massively replicated
and often incoherently managed BI environment that is common in most medium and large enterpris-
es today. While commodity hardware has undoubtedly reduced physical implementation costs, the
overall total cost of ownership (TCO) has soared in terms of software licenses, data and ETL adminis-
tration, as well as change management. The risks associated with inconsistent data have also soared.
In parallel, many more functional components have been incorporated into the architecture as shown
in figure 1c, mainly to address the performance needs of specific applications. Of particular interest
for operational analytics is the operational data store (ODS), first described10 in the mid-1990s. This
was the first attempt to bridge the gap that had emerged between operational and informational
systems. According to Bill Inmon’s oft-quoted definitions, both the data warehouse and ODS are sub-
ject oriented and enterprise-level integrated data stores. While the data warehouse is non-volatile
and time variant, the ODS contains current-valued, volatile, detailed corporate data. In essence, what
this means is that the data warehouse is optimized for reading large quantities of data typical of BI
applications, while the ODS is better suited for reading and writing individual records.
The ODS construct continues to be widely used, especially in support of master data management.
However, it and other components introduce further layers and additional copies of data into an al-
ready overburdened architecture. Furthermore, as business requires ever closer to real-time analysis,
the ETL environment must run faster and faster to keep up. Clearly, new thinking is required.
Modern times
Data warehousing / business intelligence stands at a crossroads today. The traditional layered archi-
tecture (figure 1b) recommended by many BI experts is being disrupted from multiple directions:
1. Business applications such as operational BI and analytics increasingly demand near real-time or even real-time data access for analysis
2. Business users no longer appreciate the distinction between operational and informational processes; as a result, they are merging together
3. Rapidly growing data volumes and numbers of copies are amplifying data management problems
4. Hardware and software advances—discussed next—drive “flatter” architectural approaches
This pressure is reflected in the multiple and varied hardware and software solutions on offer in
the BI marketplace today. Each of these approaches addresses different aspects of this archi-
tectural disruption to varying degrees. What is required is a more inclusive and integrated approach,
which is enabled by recent advances in technology.
An integrated platform for OLTP and operational analytics
Advances in processing and storage technology as well as in database design over the past
decade have been widely and successfully applied to traditional BI needs—running analytic
queries faster over ever larger data sets. Massively parallel processing (MPP)—where each
processor has its own memory and disks—has been highly beneficial for problems amenable to being
broken up into smaller, highly independent parts. Columnar databases—storing all the fields in each
column physically together, as opposed to traditional row-based databases where the fields of a sin-
gle record are stored sequentially—are also very effective in reducing query time for many types of BI
application, which typically require only a subset of the fields in each row. More recently, technolo-
gical advances and price reductions in solid-state memory devices—either in memory or on solid
state disks (SSD)—present the opportunity to reduce the I/O bottleneck of disk storage for all data-
base applications, including BI.
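To make the row-versus-column contrast concrete, the toy NumPy sketch below compares the data a query must touch when it needs only one field out of twenty. It is an intuition aid under simplifying assumptions, not a benchmark; real columnar engines add compression and vectorized execution.

```python
"""Toy illustration of why columnar layout suits analytic queries that
touch a subset of fields. Purely illustrative."""
import numpy as np

n_rows, n_cols = 1_000_000, 20

# Row-based layout: all fields of a record are stored contiguously.
row_store = np.random.rand(n_rows, n_cols)

# Columnar layout: all values of each field are stored contiguously.
col_store = [row_store[:, c].copy() for c in range(n_cols)]

# "SELECT SUM(col_3) FROM t" needs one field out of twenty: the row
# layout strides across every record (reading roughly 20x the data in
# a real engine), while the column layout scans one contiguous array.
total_from_rows = row_store[:, 3].sum()
total_from_cols = col_store[3].sum()
assert np.isclose(total_from_rows, total_from_cols)
```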
Each of these diverse techniques has its own strengths, as well as its weaknesses. The same is true of
traditional row-based relational databases running on symmetric multi-processing (SMP) machines
where multiple processors share common memory and disks. SMP is well suited to running high per-
formance OLTP systems like airline reservations, as well as BI processing, such as reporting and key
performance indicator (KPI) production. However, the move towards near real-time BI and opera-
tional analytics, in particular, is shifting the focus to the ever closer relationship between operational
and informational needs. For technology, the emphasis is moving from systems optimized for partic-
ular tasks to those with high performance across multiple areas. We thus see hybrid systems emerg-
ing, where vendors blend differing technologies—SMP and MPP, solid-state and disk storage, row-
and column-based database techniques—in various combinations to address complex business needs.
Operational analytics, as we’ve seen, demands an environment equally capable of handling opera-
tional and informational tasks. Furthermore, these tasks can be invoked in any sequence at any time.
Therefore, in such hybrid systems, the technologies used must be blended seamlessly together,
transparently to users and applications, and automatically managed by the database technology to
ease data management.
Beyond pure technology considerations, operational analytics has operating characteristics that dif-
fer significantly from traditional BI. Because operational analytics is, by definition, integrated into the
operational processes of the business, the entire operational analytics process must have the same
performance, reliability, availability and security (RAS) characteristics as the traditional operational
systems themselves. Processes that include operational analytics will be expected to return results
with the same response time—often sub-second—as standard transactions. They must have the
same high availability—often greater than 99.9%—and the same high levels of security and traceabili-
ty. Simply put, operational analytics systems “inherit” the service level agreements (SLAs) and secu-
rity needs of the OLTP systems rather than those of the data warehouse.
If we consider the usage characteristics of operational analytics systems, we see two aspects. First,
there is the more traditional analysis and modeling that is familiar to BI users. Second, there is the
operational phase that is the preserve of front-office users. While the first group comprises skilled
and experienced BI analysts, the second has more limited computer skills, as well as less time and
inclination to learn them. In addition, it is the front-office users who have daily interaction with the
system. As a result, usage characteristics such as usability, training, and support must also lean to-
wards those of the OLTP environment.
These operating and usage characteristics lead to the conclusion that the hybrid technology envi-
ronment required for operational analytics should preferably be built out from the existing OLTP
environment rather than from its data warehouse counterpart. Such an approach avoids upgrading the
RAS characteristics of the data warehouse—a potentially complex and expensive procedure that has
little or no benefit for traditional BI processes. Furthermore, it can allow a reduction in copying of
data from the OLTP to the BI environment—a particularly attractive option given that near real-time
data is often needed in the operational analytic environment.
IBM System z operational and informational processing
IBM System z with DB2 for z/OS continues to be the premier platform for OLTP systems,
providing high reliability, availability and security as well as high performance and throughput. For
higher performance, IMS is the database of choice. Despite numerous obituaries since the 1990s,
over 70% of global Fortune 500 companies still run high performance OLTP on System z. DB2 for z/OS
has always been highly optimized for OLTP rather than the very different processing and access cha-
racteristics of heavy analytic workloads, although DB2 10 redresses the balance somewhat.
So, given the wealth of transaction data on DB2 or IMS on z/OS, the question has long arisen as to
where BI data and applications should be located. Following the traditional layered EDW / data mart
architecture shown in figure 1b, a number of options were traditionally considered:
1. EDW and data marts together on DB2 on z/OS in a partition separate from OLTP systems
This option offers minimal data movement and an environment that takes full advantage of z/OS
skills and RAS strengths. However, in the past, mainframe processing was seen as comparatively
expensive, existing systems were already heavily utilized for OLTP and many common BI tools
were unavailable on this platform.
2. EDW and/or data marts distributed to other physical servers running different operating systems
Faced with the issues above, customers had to choose between distributing only their data marts
or both EDW and marts to a different platform. When both EDW and data marts were used for
extensive analysis, customers often chose the latter to optimize BI processing on dedicated BI
platforms, such as Teradata. Distributing data marts alone was often driven by specific depart-
mental needs for specialized analysis tools.
The major drawback with this approach is that it drives an enormous proliferation of servers and
data stores. Data center, data management and distribution costs all increase dramatically.
3. EDW on DB2 on z/OS and data marts distributed to other operating systems and/or servers, managed by z/OS
In recent years, IBM has extended the System z environment in a number of ways to provide op-
timal support for BI processing. Linux, available since the early 2000s, enables customers to run
BI (and other) applications developed for this platform on System z. The IBM zEnterprise Blade-
Center Extension (zBX), a hardware solution introduced in 2010, runs Windows and AIX systems
under the control and management of System z, further expanding customers’ options for run-
ning non-native BI applications under the control and management of z/OS.
These approaches support both EDW and data marts, although typical EDW reporting and staging-
area processing can be optimized very well on DB2 on z/OS and are often placed there.
This third option offers significant benefits. Reducing the number and variety of servers simplifies
and reduces data center TCO. Distribution of data is reduced, leading to lower networking costs.
Fewer copies of data cut storage costs but, most importantly, diminish the costs of managing it as
business needs change. In addition, zBX is an effective approach to moving BI processing to more
appropriate platforms and freeing up mainframe cycles for other purposes.
A 2010 paper11 by Rubin Worldwide, an analyst organization specializing in Technology Economics,
provides statistical evidence of the value of option 3 in a more general sense. It compares the aver-
age cost of goods across industries between companies that are mainframe-biased and those that
favor a distributed server approach. The figures show an average additional cost of over 25% for the
distributed model. Only in the case of Web-centric businesses is the balance reversed. A more de-
tailed analysis of the financial services sector12
shows a stronger case for the mainframe-centric ap-
proach. It appears that customers have begun to take notice too—the last two years have seen the
beginnings of an upward trend in mainframe purchase and an expansion in use cases.
IBM DB2 Analytics Accelerator—to System z and DB2, just add Netezza
Available since November 2011, the IBM DB2 Analytics Accelerator (which, for ease of use, I’ll abbre-
viate to IDAA) 2.1 is a hardware/software appliance that deeply integrates the Netezza server, ac-
quired by IBM just one year earlier, with the System z and DB2 on z/OS. From a DB2 user and applica-
tion perspective on z/OS, only one thing changes—vastly improved analytic response times at lower
cost. The DB2 code remains the same. User access is exactly the same as it always was. Reliability,
availability and security are at the same level as for System z. Data management is handled by DB2.
IDAA hardware
With Netezza, IBM acquired a hardware-assisted, MPP, row-based relational database appliance,
shown in figure 2. At left, two redundant SMP hosts manage the massively parallel environment to
the right, as well as handling all SQL compilation, planning, and administration. Parallel processing
is provided by up to 12 Snippet Blades™ (S-Blades) with 96 CPUs, 8 per blade, in each cabinet. Each
S-Blade, with 16GB of dedicated memory, is a high-performance database engine for streaming joins,
aggregations, sorts, etc. The real performance boosters are the 4 dual-core field programmable gate
arrays (FPGAs) on each blade that mediate data from the disks, uncompressing it and filtering out
columns and rows that are irrelevant to the particular query being processed. The CPUs then perform
all remaining SQL functions and pass results back to the host. Each S-Blade has its own dedicated
disk array, holding up to 128TB of uncompressed data per cabinet. In the near future, up to 10
cabinets can be combined, giving a total effective data capacity of 1.25 petabytes and nearly 2,000
processors.

[Figure 2: Structure of the IBM Netezza appliance. Redundant SMP hosts connect through a network fabric to the S-Blades (each with FPGA, memory and CPU), which front dedicated disk enclosures.]
The IDAA appliance is simply a Netezza box (or boxes) attached via the twin SMP hosts to the System z
via two dedicated 10Gb networks through which all data and communications pass, a design that en-
sures there is no single point of failure. All network access to the appliance is through these dedicat-
ed links, providing load speeds of up to 1.5TB/hour, and offering the high levels of security and sys-
tems management for which System z is renowned. Additional deployment options allow multiple
IDAAs attached to one System z, and multiple System z machines sharing one or more IDAAs.
IDAA software
IDAA software consists of an update to DB2 and a Data Studio plug-in that manage a set of stored
procedures running in DB2 9 or 10 for z/OS. Figure 3 shows the basic configuration and operation.

[Figure 3: Positioning IDAA with DB2 on z/OS. Application queries enter through the DB2 application interface and optimizer; queries suitable for acceleration are routed via a DRDA requestor to the IDAA appliance (SMP host and S-Blades), while queries that cannot or should not be off-loaded execute in the DB2 for z/OS runtime.]
The DB2 optimizer analyzes queries received from an application or user. Any judged suitable for ac-
celeration by the IDAA appliance are passed to it via the distributed relational database architecture
(DRDA) interface, and results flow back by the same route. Any queries that cannot or should not be
passed to IDAA run as normal in DB2 for z/OS. Because DB2 mediates all queries to IDAA, the
appliance is invisible from a user or application viewpoint. Analytic queries simply run faster.
DB2 applications that ran previously against DB2 on z/OS run without any code
change on the upgraded system. Dynamic SQL is currently supported; static SQL is coming soon. All
DB2 functions such as EXPLAIN and billing stats work as before even when the query is routed in
whole or in part to the Netezza box. IDAA is so closely integrated into DB2 that it appears to a user or
administrator as an internal DB2 process, much like the lock manager or resource manager.
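Because the routing happens inside DB2, client code does not change at all. The sketch below shows what that looks like from an application, assuming a generic ODBC connection via pyodbc; the DSN, credentials, table and column names are hypothetical, and nothing in the code refers to the accelerator.

```python
"""From the application's viewpoint the accelerator is invisible: the
same SQL runs whether DB2 executes it natively or routes it to IDAA.
DSN, credentials, table and column names below are hypothetical."""
import pyodbc

conn = pyodbc.connect("DSN=DB2ZOS;UID=appuser;PWD=secret")
cur = conn.cursor()

# A scan-heavy analytic query of the kind the DB2 optimizer may choose
# to off-load; nothing in the statement refers to the accelerator.
cur.execute("""
    SELECT region, COUNT(*) AS orders, SUM(amount) AS revenue
    FROM sales_history
    WHERE order_date >= '2011-01-01'
    GROUP BY region
    ORDER BY revenue DESC
""")
for region, orders, revenue in cur.fetchall():
    print(region, orders, revenue)
conn.close()
```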
Some or all of the data in DB2 on z/OS must, of course, be copied onto the IDAA box and sliced across
its disks before any queries can run there. The tables to be deployed on the IDAA box are defined
through a client application, the Data Studio plug-in, which guides the DBA through the
process and creates stored procedures to deploy, load and update tables, create appropriate meta-
data on DB2 and on the IDAA box and run all administrative tasks. Incremental update of IDAA tables
is planned in the near future.
IDAA implementation and results
Given the prerequisite hardware and software, installation of the IDAA appliance and getting it up
and running is a remarkably simple and speedy exercise. In most cases, it takes less than a couple of
days to physically connect the appliance, install the software and define and deploy the tables and
data onto the box. Because there are no changes to existing DB2 SQL, previously developed applica-
tions can be run immediately with little or no testing. Users and applications see immediate benefits.
Performance improvements achieved clearly depend on the type of query involved, as well as on the
size of the base table and the number of rows / columns in the result set. However, customer results
speak for themselves. At the high end, queries that take over 2 hours to run on DB2 on z/OS return
results in 5 seconds on IDAA—a performance improvement of over 1,500 times. Of course, other
queries show smaller benefits. As queries run faster, they also save CPU resources, costing less and
reducing analysts’ waiting time for delivery of results. Even where the speed gain is smaller, it often
still makes sense to offload queries onto the IDAA platform, freeing up more costly mainframe re-
sources to be used for other tasks and taking advantage of the lower power and cooling needs of the
Netezza box. The actual mix of queries determines the overall performance improvement, and how
the freed-up mainframe cycles are redeployed affects the level of savings achieved. However, one
customer anticipates a return on investment in less than four months.
Business benefits and architectural advantages
Business benefits
We’ve already seen the direct bottom-line benefit of faster processing and reduced CPU
loads, freeing up the mainframe to do the work it is optimized for. Of more interest,
perhaps, is the opportunity for users to move to an entirely new approach to analytics,
testing multiple hypotheses in the time they could previously try only one. Innovation is
accelerated by orders of magnitude as analysts can work at the speed of their thinking, rather than the
speed of the slowest query.
In terms of operational analytics and operational BI applications, the division of labor between the
two environments is particularly appropriate. Furthermore, it is entirely transparent. Complex, ana-
lytical queries requiring extensive table scans of large, historical data sets run on IDAA. Results re-
turned from the analysis can be joined with current or near real-time data in the data warehouse on
the System z to deliver immediate recommendations, creating, in effect, a high performance opera-
tional BI service.
Recall that the OLTP application environment also resides on the mainframe. We can thus envisage,
for example, a bank call center application running in the OLTP environment with direct, real-time
access to customer account balances and the most recent transactions. When more complete, cross-
account, historical information is needed, it can be obtained from the data warehouse environment
via a service oriented architecture (SOA) approach. If more extensive analytics is required for cross-
or up-selling, the CPU-intensive analysis is delegated to IDAA, providing the possibility to do analyses
in seconds that previously would have taken far longer than the customer would remain on the line.
What we see here is the emergence of an integrated information environment that spans traditional
OLTP and informational uses. This is in line with today’s and future business needs that erase the old
distinction between the two worlds. Furthermore, the TCO benefits of a consolidated mainframe-
based platform, discussed earlier, suggest that there are significant cost savings to be achieved
with this approach, driving further bottom-line business benefit.
Architectural advantages
Returning to our list of architectural deployment options above, we can see that the IDAA ap-
proach is essentially an extension of option 3: EDW on DB2 on z/OS and data marts distributed to
other operating systems and/or servers, managed by z/OS. The data in DB2 on z/OS has the characte-
ristics of an EDW; that on the IDAA is a dependent (fed from the EDW) data mart. The important
point is that, while the IDAA data mart is implemented on another physical server, it is managed en-
tirely by the same DBMS as the EDW. This management function extends from loading and updating
the data mart to providing the single point of interface for both the EDW and data mart.
Using the database management system (DBMS) to manage load and update—as opposed to using
an extract, transform and load (ETL) tool—may seem like a small step. However, it is an important
first step in simplifying the overall data warehouse environment. As we saw in the business benefits,
mixed workload applications are becoming more and more important. Such applications demand
that equivalent data be stored in two (or maybe more) formats for efficient processing. Bringing the
management and synchronization of these multiple copies into the DBMS is key to ensuring data
quality and consistency within increasingly tight time constraints.
The operational BI / call center application mentioned in the previous section can be generalized into
the architectural view shown in figure 4. In this we see both the operational and informational envi-
ronment implemented on the System z, both benefiting from the advanced RAS characteristics of the
mainframe environment. ETL within the same platform maximizes the efficiency of loading and up-
dating the warehouse. Within the warehouse, the DB2 DBMS takes responsibility for loading and up-
dating the IDAA analytic data mart as previously described. Other data marts can also be consolidat-
ed from distributed platforms into the mainframe-based data warehouse for reasons of performance
or security. These data marts are also maintained by the DBMS, using extract, load and transform
(ELT) techniques. Communication between the operational and informational systems may be via
SOA, as shown in figure 4; of course, other techniques such as DRDA could be used.
[Figure 4: A new operational / informational architecture. An OLTP application on an operational database (IMS or DB2 on z/OS) communicates via an SOA interface with an informational application on the data warehouse (DB2 on z/OS), which holds the EDW and data marts and feeds the IDAA analytic data mart through the optimizer and DRDA requestor; ETL and ELT flows load the warehouse and marts, and the whole environment is System z managed and secured.]
Conclusions
“It is not my job to have all the answers, but it is my job to ask lots of penetrating,
disturbing and occasionally almost offensive questions as part of
the analytic process that leads to insight and refinement.”13
Businesses today face increasing pressure to act quickly and appropriately in all aspects of op-
erations, from supply chain management to customer engagement and everything in be-
tween and beyond. This combination of right time and right answer can be challenging. The
right answer—in terms of consistent, quality data—comes from the data warehouse. The right time
is typically the concern of operational systems. Operational BI spans the gap and, in particular, where
there are large volumes of information available, operational analytics provides the answers.
The current popularity of operational analytics stems from the enormous and rapidly increasing vo-
lumes of data now available and the technological advances that enable far more rapid processing of
such volumes. However, when implemented in the traditional data warehouse architecture, opera-
tional BI and analytics have encountered some challenges, including data transfer volumes, RAS limi-
tations and restrictions in connection to the operational environment.
The IBM DB2 Analytics Accelerator appliance directly addresses these challenges. Running complete-
ly transparently under DB2 on z/OS, the appliance is an IBM Netezza MPP machine directly attached
to the System z. Existing and new queries with demanding data access characteristics are automati-
cally routed to the appliance. Performance gains of over 1,500x have been recorded for some query
types. The combination of MPP query performance and the System z’s renowned security and relia-
bility characteristics provide an ideal platform to build a high-availability operational analytics envi-
ronment to enable business users to act at the speed of their thinking.
For customers who run a large percentage of their OLTP systems on z/OS and have chosen DB2 on
z/OS as their data warehouse platform, IDAA is an obvious choice to turbo-charge query performance
for analytic applications. For those who long ago chose to place their data warehouse elsewhere, it
may be the reason to revisit that decision.
This approach reflects what IBM calls freedom by design, as it simplifies the systems architecture for
the business.
It also provides an ideal platform for consolidating data marts from distributed systems back to the
mainframe environment for clear data management benefits for IT and significant reductions in total
cost of ownership for the whole computing environment. For business, the clear benefit is the close
link from BI analysis to immediate business actions of real value.
For more information, please go to www.ibm.com/systemzdata
Dr. Barry Devlin is among the foremost authorities on business insight and one of the founders of data
warehousing, having published the first architectural paper on the topic in 1988. With over 30 years of IT
experience, including 20 years with IBM as a Distinguished Engineer, he is a widely respected analyst,
consultant, lecturer and author of the seminal book, “Data Warehouse—from Architecture to Imple-
mentation” and numerous White Papers.
Barry is founder and principal of 9sight Consulting. He specializes in the human, organizational and IT
implications of deep business insight solutions that combine operational, informational and collabora-
tive environments. A regular contributor to BeyeNETWORK, Focus, SmartDataCollective and TDWI, Barry
is based in Cape Town, South Africa and operates worldwide.
Brand and product names mentioned in this paper are the trademarks or registered trademarks of IBM.
This paper was sponsored by IBM.
1. IBM Institute for Business Value, “Customer analytics pay off”, GBE03425-USEN-00, (2011)
2. “Business analytics will enable tailored flight pricing, says American Airlines”, Computer Weekly, http://bit.ly/znTJrc, 28 October 2010, accessed 14 February 2012
3. Marlowe, C., “Doctor Faustus”, act 1, scene 1, (c.1592)
4. http://en.wikipedia.org/wiki/Analytics, accessed 24 January 2012
5. Davenport, T. H. and Harris, J. G., “Competing on Analytics: The New Science of Winning”, Harvard Business School Press, (2007)
6. Taylor, J., “Where to Begin with Predictive Analytics”, http://bit.ly/yr333L, 1 September 2011, accessed 8 February 2012
7. Selden, L. and Colvin, G., “Killer Customers: Tell the Good from the Bad and Crush Your Competitors”, Portfolio, (2004)
8. Devlin, B. A. and Murphy, P. T., “An architecture for a business and information system”, IBM Systems Journal, Volume 27, Number 1, Page 60 (1988) http://bit.ly/EBIS1988
9. Devlin, B., “Data Warehouse—From Architecture to Implementation”, Addison-Wesley, (1997)
10. Inmon, W. H., Imhoff, C. and Battas, G., “Building the Operational Data Store”, John Wiley & Sons, (1996) http://bit.ly/ODS1995
11. Rubin, H. R., “Economics of Computing—The Internal Combustion Mainframe”, (2010), http://bit.ly/zQ1y8D, accessed 16 March 2012
12. Rubin, H. R., “Technology Economics: The Cost Effectiveness of Mainframe Computing”, (2010), http://bit.ly/wsBHRb, accessed 16 March 2012
13. Gary Loveman, Chairman of the Board, President and CEO, Harrah's, quoted in Accenture presentation “Knowing Beats Guessing”, http://bit.ly/AvlAao, June 2008, accessed 5 March 2012

Contenu connexe

Tendances

Big Data Analytics in light of Financial Industry
Big Data Analytics in light of Financial Industry Big Data Analytics in light of Financial Industry
Big Data Analytics in light of Financial Industry
Capgemini
 

Tendances (19)

State Farm presentation at the Chief Analytics Officer Forum East Coast USA (...
State Farm presentation at the Chief Analytics Officer Forum East Coast USA (...State Farm presentation at the Chief Analytics Officer Forum East Coast USA (...
State Farm presentation at the Chief Analytics Officer Forum East Coast USA (...
 
Predictive and prescriptive analytics: Transform the finance function with gr...
Predictive and prescriptive analytics: Transform the finance function with gr...Predictive and prescriptive analytics: Transform the finance function with gr...
Predictive and prescriptive analytics: Transform the finance function with gr...
 
Enova presentation at the Chief Analytics Officer Forum East Coast USA (#CAOF...
Enova presentation at the Chief Analytics Officer Forum East Coast USA (#CAOF...Enova presentation at the Chief Analytics Officer Forum East Coast USA (#CAOF...
Enova presentation at the Chief Analytics Officer Forum East Coast USA (#CAOF...
 
Business Analytics
 Business Analytics  Business Analytics
Business Analytics
 
Business Intelligence Overview
Business Intelligence OverviewBusiness Intelligence Overview
Business Intelligence Overview
 
Best Practices In Predictive Analytics
Best Practices In Predictive AnalyticsBest Practices In Predictive Analytics
Best Practices In Predictive Analytics
 
TIBCO presentation at the Chief Analytics Officer Forum East Coast 2016 (#CAO...
TIBCO presentation at the Chief Analytics Officer Forum East Coast 2016 (#CAO...TIBCO presentation at the Chief Analytics Officer Forum East Coast 2016 (#CAO...
TIBCO presentation at the Chief Analytics Officer Forum East Coast 2016 (#CAO...
 
Business Partner Product Enablement Roadmap, IBM Predictive Analytics
Business Partner Product Enablement Roadmap, IBM Predictive AnalyticsBusiness Partner Product Enablement Roadmap, IBM Predictive Analytics
Business Partner Product Enablement Roadmap, IBM Predictive Analytics
 
Business analytics and data mining
Business analytics and data miningBusiness analytics and data mining
Business analytics and data mining
 
Business Analytics and Decision Making
Business Analytics and Decision MakingBusiness Analytics and Decision Making
Business Analytics and Decision Making
 
Modern Finance and Best Use of Analytics - Oracle Accenture Case Study
Modern Finance and Best Use of Analytics - Oracle Accenture Case StudyModern Finance and Best Use of Analytics - Oracle Accenture Case Study
Modern Finance and Best Use of Analytics - Oracle Accenture Case Study
 
Risk mgmt-analysis-wp-326822
Risk mgmt-analysis-wp-326822Risk mgmt-analysis-wp-326822
Risk mgmt-analysis-wp-326822
 
Big Data Analytics in light of Financial Industry
Big Data Analytics in light of Financial Industry Big Data Analytics in light of Financial Industry
Big Data Analytics in light of Financial Industry
 
Rd big data & analytics v1.0
Rd big data & analytics v1.0Rd big data & analytics v1.0
Rd big data & analytics v1.0
 
FINANCIAL ANALYTICS
FINANCIAL ANALYTICSFINANCIAL ANALYTICS
FINANCIAL ANALYTICS
 
Business Analytics
Business Analytics Business Analytics
Business Analytics
 
Analytics in business
Analytics in businessAnalytics in business
Analytics in business
 
Advanced Analytics in Banking, CITI
Advanced Analytics in Banking, CITIAdvanced Analytics in Banking, CITI
Advanced Analytics in Banking, CITI
 
Pi cube banking on predictive analytics151
Pi cube   banking on predictive analytics151Pi cube   banking on predictive analytics151
Pi cube banking on predictive analytics151
 

En vedette

Jim Cobb Containers Inspectors Certificate
Jim Cobb Containers Inspectors CertificateJim Cobb Containers Inspectors Certificate
Jim Cobb Containers Inspectors Certificate
Jim Cobb
 
BLINK LIGHTING BROCH pdf ok
BLINK LIGHTING BROCH pdf okBLINK LIGHTING BROCH pdf ok
BLINK LIGHTING BROCH pdf ok
Emanuela Nazzani
 
Impress presentazionecifaldijessica
Impress presentazionecifaldijessicaImpress presentazionecifaldijessica
Impress presentazionecifaldijessica
JessicaCifaldi
 

En vedette (13)

Jim Cobb Containers Inspectors Certificate
Jim Cobb Containers Inspectors CertificateJim Cobb Containers Inspectors Certificate
Jim Cobb Containers Inspectors Certificate
 
Introdução
IntroduçãoIntrodução
Introdução
 
Prece aoanjoguardião
Prece aoanjoguardiãoPrece aoanjoguardião
Prece aoanjoguardião
 
Smith Global Leadership emba 2016
Smith Global Leadership emba 2016Smith Global Leadership emba 2016
Smith Global Leadership emba 2016
 
Shooting schedule
Shooting scheduleShooting schedule
Shooting schedule
 
Festival de navidad
Festival de navidadFestival de navidad
Festival de navidad
 
BLINK LIGHTING BROCH pdf ok
BLINK LIGHTING BROCH pdf okBLINK LIGHTING BROCH pdf ok
BLINK LIGHTING BROCH pdf ok
 
Copper Mould Plate
Copper Mould PlateCopper Mould Plate
Copper Mould Plate
 
Impress presentazionecifaldijessica
Impress presentazionecifaldijessicaImpress presentazionecifaldijessica
Impress presentazionecifaldijessica
 
Palestra o carater educativo da dor
Palestra  o carater educativo da dorPalestra  o carater educativo da dor
Palestra o carater educativo da dor
 
Fiscal Theory
Fiscal TheoryFiscal Theory
Fiscal Theory
 
Egipto
EgiptoEgipto
Egipto
 
Unit 3 Management Information System
Unit 3 Management Information SystemUnit 3 Management Information System
Unit 3 Management Information System
 

Similaire à Integrating Analytics into the Operational Fabric of Your Business

BIG DATA & BUSINESS ANALYTICS
BIG DATA & BUSINESS ANALYTICSBIG DATA & BUSINESS ANALYTICS
BIG DATA & BUSINESS ANALYTICS
Vikram Joshi
 
Rethinking Supply Chain Analytics - report - 23 JAN 2018
Rethinking Supply Chain Analytics - report - 23 JAN 2018Rethinking Supply Chain Analytics - report - 23 JAN 2018
Rethinking Supply Chain Analytics - report - 23 JAN 2018
Lora Cecere
 
Improving Data Extraction Performance
Improving Data Extraction PerformanceImproving Data Extraction Performance
Improving Data Extraction Performance
Data Scraping and Data Extraction
 
Sit717 enterprise business intelligence 2019 t2 copy1
Sit717 enterprise business intelligence 2019 t2 copy1Sit717 enterprise business intelligence 2019 t2 copy1
Sit717 enterprise business intelligence 2019 t2 copy1
NellutlaKishore
 

Similaire à Integrating Analytics into the Operational Fabric of Your Business (20)

BIG DATA & BUSINESS ANALYTICS
BIG DATA & BUSINESS ANALYTICSBIG DATA & BUSINESS ANALYTICS
BIG DATA & BUSINESS ANALYTICS
 
Whitepaper - Simplifying Analytics Adoption in Enterprise
Whitepaper - Simplifying Analytics Adoption in EnterpriseWhitepaper - Simplifying Analytics Adoption in Enterprise
Whitepaper - Simplifying Analytics Adoption in Enterprise
 
RAPP Open insight edition 1
RAPP Open insight edition 1RAPP Open insight edition 1
RAPP Open insight edition 1
 
RAPP OpenInsight Edition 1
RAPP OpenInsight Edition 1RAPP OpenInsight Edition 1
RAPP OpenInsight Edition 1
 
Data mining & data warehousing
Data mining & data warehousingData mining & data warehousing
Data mining & data warehousing
 
The wheels of commerce turn ever faster. Business models grow more complex. Channels to customers and suppliers multiply. Making the right decision at the right time becomes ever more difficult. And ever more vital. Analysis followed by action is the key…

Operational analytics—diamonds in the detail, magic in the moment

"Sweet Analytics, 'tis thou hast ravished me." 3

Business Analytics. Predictive analytics. Operational analytics. "Insert-attractive-word-here Analytics" is a popular marketing game. Even Dr. Faustus espoused "Sweet Analytics", as Christopher Marlowe wrote at the end of the 16th century! The definitions of these terms overlap significantly, and the opportunities for confusion multiply. So, let's define operational analytics:

Analytics
Wikipedia offers a practical definition4: "analytics is the process of developing optimal or realistic decision recommendations based on insights derived through the application of statistical models and analysis against existing and/or simulated future data." This is a good start. It covers all the variants above and emphasizes recommendations for decisions as the goal. Analysis for the sake of understanding the past is interesting, but only analysis that influences future decisions offers return on investment. And only where decisions lead to actions.

Operational
Business intelligence (BI) practitioners understand "operational" as the day-to-day actions required to run the business—the online transaction processing (OLTP) systems that record and manage the detailed, real-time activities between the business, its customers, suppliers and others. This is in contrast to informational systems, where data is analyzed and reported upon.

Every day-to-day action demands one or more real-time decisions. Sometimes the answer is so obvious that we don't even see the question. An online retailer receives an order for an in-stock shirt from a signed-in customer; without question, the order is accepted. But the implicit question—what should we do with this order?—is much clearer if the item is out of stock, or if we have a higher-margin shirt available that the customer might like.
Every operational transaction has a decision associated with it; every action is preceded by a decision. The decision may be obvious, but sometimes it is worth asking: is a better outcome possible if we made a different decision and thus took a different action?
Operational Analytics
We can thus define operational analytics as the process of developing optimal or realistic recommendations for real-time, operational decisions based on insights derived through the application of statistical models and analysis against existing and/or simulated future data, and applying these recommendations in real-time interactions. This definition leads directly to a process:

1. Perform statistical analysis on a significant sample of historical transactional data to discover the likelihood of possible outcomes
2. Predict outcomes (a model) of different actions during future operational interactions
3. Apply this knowledge in real time as an operational activity is occurring
4. Note the result and feed it back into the analysis stage

From an IT perspective, steps (1) and (2) have very different processing characteristics than (3) and (4). The former involve reading and number-crunching of potentially large volumes of data with relatively undemanding constraints on the time taken. The latter require the exact opposite—fast response time for writing small data volumes. This leads to a key conclusion: operational analytics is a process that requires a combination of informational and operational processing.

Operational BI
While the term operational analytics is very much flavor of the year, operational BI has been around for years now. Is there any difference between the two? Some analysts and vendors suggest that analytics is future oriented, while BI is backward-looking and report oriented. While there may be some historical truth in this distinction, in practical terms today the difference is limited. Analytics typically includes more statistical analysis and modeling to reach conclusions, as in steps (1) and (2) of the above process. Operational BI may include this, but also other, simpler approaches to drawing conclusions for input to operational activity, such as rule-based selection.

Operational analytics—why now and what for?

"Analytics themselves don't constitute a strategy, but using them to optimize a distinctive business capability certainly constitutes a strategy." 5

What we've been discussing sounds a lot like data mining, a concept that has been around since the early 1990s. And beyond advances in technology, there is indeed little difference. So, why is operational analytics suddenly a hot topic? The answers are simple:

1. Business operations are increasingly automated and digitized via websites, providing ever larger quantities of data for statistical analysis
2. Similarly, Web 2.0 is driving further volumes and varieties of analyzable data
3. As the speed of business change continues to accelerate, competition for business is intense
4. Data storage and processing continue to increase in power and decrease in cost, making operational analytics a financially viable process for smaller businesses
5. Making many small, low-value decisions better can make a bigger contribution to the bottom line than a few high-value ones, and the risk of failure is more widely spread

And, as enterprise decision management expert James Taylor points out6, operational data volumes are large enough to provide statistically significant results, and the outcomes of decisions taken can be seen and tracked over relatively short timeframes. Operational analytics thus offers a perfect platform to begin to apply the technological advances in predictive analytics and test their validity.
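To make the split between steps (1)-(2) and (3)-(4) concrete, here is a minimal sketch of the loop in Python. It is an illustration under stated assumptions, not a method prescribed by this paper: scikit-learn stands in for the statistical modeling, and every file and field name (interaction_history.csv, accepted_offer and so on) is invented.

import pandas as pd
from sklearn.linear_model import LogisticRegression

# Steps 1-2: offline and read-heavy. Analyze a large historical sample
# and build a predictive model of the outcome of an action.
history = pd.read_csv("interaction_history.csv")        # hypothetical extract
features = ["recency_days", "frequency", "avg_basket_value"]
model = LogisticRegression().fit(history[features], history["accepted_offer"])

# Step 3: online and latency-critical. Score one in-flight interaction
# with a single, fast call while the transaction is still open.
def recommend_offer(interaction: dict) -> bool:
    X = pd.DataFrame([interaction])[features]
    return model.predict_proba(X)[0, 1] > 0.5           # threshold: a tuning choice

# Step 4: record what actually happened, feeding the next training run.
def log_outcome(interaction: dict, offered: bool, accepted: bool) -> None:
    with open("outcomes.csv", "a") as f:
        f.write(f"{interaction['customer_id']},{offered},{accepted}\n")

The point is the division of labor: the read-heavy fit runs offline against history, recommend_offer makes one fast scoring call inside the transaction path, and log_outcome closes the loop.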
So, let's look briefly at the sort of things leading-edge companies are doing with operational analytics.

Marketing: what's the next best action?
Cross-selling, upselling, next best offer and the like are marketing approaches that all stem from one basic premise: it's far easier to sell to an existing customer (or even a prospect who is in the process of deciding to buy something) than to somebody with whom you have no prior interaction. They all require that—or, at least, work best when—you know enough about (1) the prospective buyer, (2) the context of the interaction and (3) your products to make a sensible decision about what to do next. Knowing the answers to those three questions can prove tricky; get them wrong and you risk losing the sale altogether, alienating the customer, or simply selling something unprofitably. With the growth of inbound marketing via websites and call centers, finding an automated approach to answering these questions is vital. Operational analytics is that answer.

Analyzing a prospect's previous buying behavior, and even pattern of browsing, can give insight into interests, stage of life and other indicators of what may be an appropriate next action from the customer's point of view. A detailed knowledge of the characteristics of your product range supplies the other side of the equation. The goal is to bring this information together in the form of a predicted best outcome during the short window of opportunity while the prospect is on the check-out web page or in conversation with the call center agent.

Consider Marriott International Inc., for example. The group has over 3,500 properties worldwide and handles around three-quarters of a million new reservations daily. Marriott's goal is to maximize customer satisfaction and room occupancy simultaneously using an operational analytics approach. Factors considered include the customer's loyalty card status and history, stay length and timing. On the room inventory side, rooms in the area of interest are categorized according to under- or over-sold status, room features and so on. This information is brought together in a "best price, best yield" scenario for both the customer and Marriott in under a second, while the customer is shopping.

Risk: will the customer leave… and do I care?

"The top 20% of customers… typically generate more than 120% of an organization's profits. The bottom 20% generate losses equaling more than 100% of profits." 7

Customer retention is a central feature of all businesses that have an ongoing relationship with their customers for the provision of a service, such as banking or insurance, or a utility, such as telecoms, power or water. In the face of competition, the question asked at contract renewal time is: how likely is this customer to leave? The subsidiary, and equally important, question is: do I care?

In-depth analysis of long-term customer behavior, using techniques such as logistic regression, a decision tree or survival analysis, identifies potential churn based on indicators such as dissatisfaction with service provided, complaints, billing errors or disputes, or a decrease in the number of transactions. In most cases, the result of this analysis of potential churners is combined with an estimate of the likely lifetime value of the customers to aid in prioritization of the actions to be taken, as sketched below.
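As an illustration only, this sketch combines the two questions, assuming scikit-learn and invented column names; the paper names logistic regression as one candidate technique, and a decision tree or survival analysis would slot in the same way.

import pandas as pd
from sklearn.linear_model import LogisticRegression

customers = pd.read_csv("customer_behavior.csv")        # invented extract
indicators = ["complaints", "billing_disputes", "txn_count_trend"]

# "Will the customer leave?" Fit churn likelihood from behavioral indicators.
churn_model = LogisticRegression().fit(customers[indicators], customers["churned"])
customers["churn_risk"] = churn_model.predict_proba(customers[indicators])[:, 1]

# "Do I care?" Weight the risk by estimated lifetime value and rank.
customers["priority"] = customers["churn_risk"] * customers["lifetime_value"]
print(customers.nlargest(10, "priority")[["customer_id", "churn_risk", "priority"]])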
In high-value cases, the action may be proactive, involving outbound marketing. In other cases, customers may be flagged for particular treatment when they next make contact.

Fraud: is it really like it claims to be?
Detecting fraud is something best done as quickly as possible—preferably while the fraud is in progress. This clearly points to an operational aspect of implementation. In some cases, like credit card fraud, the window of opportunity is even shorter than a typical OLTP interaction—suspect transactions must be caught in flight. This requires real-time analysis of the event streams themselves, a topic beyond this paper, but one where IBM and other vendors are offering existing and new tools to meet this growing need. But there exist many types of fraud in insurance, social services, banking and other areas where operational analytics, as we've defined it, plays a key role in detection and prevention.

As in our previous examples, the first step is the analysis of historical data to discover patterns of behavior that can be correlated with proven outcomes—in this case with instances of deliberate fraud in financial transactions, and even negligent or unthinking use of unnecessarily expensive procedures in maintenance or medical treatment. Micro-segmentation of the customer base leads to clusters of people with similar behaviors, some of which correlate to fraud. Applying analytics on an operational timeframe can detect the emergence of these patterns in near real-time, allowing preventative action to be taken.
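To illustrate the micro-segmentation step, the sketch below clusters historical transaction behavior and builds a watchlist of segments with a high proven-fraud rate. All names, features and thresholds are assumptions made for the example.

import pandas as pd
from sklearn.cluster import KMeans

txns = pd.read_csv("historical_transactions.csv")       # invented extract
behavior = ["amount", "merchant_risk_score", "hour_of_day"]

# Micro-segment the base into clusters of similar behavior.
segments = KMeans(n_clusters=20, n_init=10, random_state=0).fit(txns[behavior])
txns["segment"] = segments.labels_

# Correlate each segment with proven fraud from past investigations.
fraud_rate = txns.groupby("segment")["proven_fraud"].mean()
watchlist = set(fraud_rate[fraud_rate > 0.10].index)    # threshold: a tuning choice

def flag_for_review(txn: dict) -> bool:
    # Near real-time check: does this activity fall into a risky segment?
    return segments.predict(pd.DataFrame([txn])[behavior])[0] in watchlist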
Data warehousing and the evolution of species

With the recognition that operational analytics bridges traditional informational (data warehousing / BI) and operational (OLTP) environments, it makes sense to examine how this distinction evolved and how, in recent years, it has begun to break down as a result of the ever increasing speed of response to change demanded by business today.

Genesis
Data warehousing and System z are cousins. The first data warehousing architecture was conceived in IBM Europe and implemented on S/370 in the mid-1980s. As Paul Murphy and I documented in an IBM Systems Journal article8 in 1988, the primary driver for data warehousing was the creation of an integrated, consistent and reliable repository of historical information for decision support in IBM's own sales and administration functions. The architecture proposed as a solution a "Business Data Warehouse (BDW)… [a] single logical storehouse of all the information used to report on the business… In relational terms, a view / number of views that… may have been obtained from different tables". The BDW was largely normalized, and the stored data reconciled and cleansed through an integrated interface to the operational environment. Figure 1a shows this architecture. The split between operational and informational processing, driven by both business and technological considerations, thus goes back to the very foundations of data warehousing.

Figure 1: Evolution of the data warehouse architecture (1a adapted from Devlin & Murphy, 1988; 1b adapted from Devlin, 1997; 1c today's extended environment)
At that time, business users wanted consistency of information across both information sources and time; they wanted to see reports of trends over days and weeks rather than the minute-by-minute variations of daily business. This suited IT well. Heavily loaded and finely tuned OLTP systems would struggle to deliver such reports and might collapse in the face of ad hoc queries. The architectural solution was obvious—extract, transform and load (ETL) data from the OLTP systems into the data warehouse on a monthly, weekly and, eventually, daily basis as business began to value more timely data.

Middle Ages
The elegant simplicity of a single informational layer quickly succumbed to the limitations of early relational databases, which were optimized for OLTP. As shown in figure 1b9, the informational layer was further split into an enterprise data warehouse (EDW) and data marts fed from it. This architectural structure and the rapid growth of commodity servers throughout the 1990s and 2000s, coupled with functional empowerment of business units, have led to the highly distributed, massively replicated and often incoherently managed BI environment that is common in most medium and large enterprises today. While commodity hardware has undoubtedly reduced physical implementation costs, the overall total cost of ownership (TCO) has soared in terms of software licenses, data and ETL administration, and change management. The risks associated with inconsistent data have also soared.

In parallel, many more functional components have been incorporated into the architecture, as shown in figure 1c, mainly to address the performance needs of specific applications. Of particular interest for operational analytics is the operational data store (ODS), first described10 in the mid-1990s. This was the first attempt to bridge the gap that had emerged between operational and informational systems. According to Bill Inmon's oft-quoted definitions, both the data warehouse and the ODS are subject-oriented and enterprise-level integrated data stores. While the data warehouse is non-volatile and time-variant, the ODS contains current-valued, volatile, detailed corporate data. In essence, this means that the data warehouse is optimized for reading large quantities of data, typical of BI applications, while the ODS is better suited to reading and writing individual records.

The ODS construct continues to be widely used, especially in support of master data management. However, it and other components introduce further layers and additional copies of data into an already overburdened architecture. Furthermore, as business requires ever closer to real-time analysis, the ETL environment must run faster and faster to keep up. Clearly, new thinking is required.

Modern times
Data warehousing / business intelligence stands at a crossroads today. The traditional layered architecture (figure 1b) recommended by many BI experts is being disrupted from multiple directions:

1. Business applications such as operational BI and analytics increasingly demand near real-time or even real-time data access for analysis
2. Business users no longer appreciate the distinction between operational and informational processes; as a result, the two are merging
3. Rapidly growing data volumes and numbers of copies are amplifying data management problems
4. Hardware and software advances—discussed next—drive "flatter" architectural approaches
This pressure is reflected in the many and varied hardware and software solutions on offer in the BI marketplace today. Each of these approaches addresses different aspects of this architectural disruption to varying degrees. What is required is a more inclusive and integrated approach, enabled by recent advances in technology.
An integrated platform for OLTP and operational analytics

Advances in processing and storage technology, as well as in database design, over the past decade have been widely and successfully applied to traditional BI needs—running analytic queries faster over ever larger data sets. Massively parallel processing (MPP)—where each processor has its own memory and disks—has been highly beneficial for problems amenable to being broken up into smaller, highly independent parts. Columnar databases—storing all the fields in each column physically together, as opposed to traditional row-based databases where the fields of a single record are stored sequentially—are also very effective in reducing query time for many types of BI application, which typically require only a subset of the fields in each row. More recently, technological advances and price reductions in solid-state memory devices—either in memory or on solid-state disks (SSDs)—present the opportunity to reduce the I/O bottleneck of disk storage for all database applications, including BI.

Each of these diverse techniques has its own strengths, as well as its weaknesses. The same is true of traditional row-based relational databases running on symmetric multi-processing (SMP) machines, where multiple processors share common memory and disks. SMP is well suited to running high-performance OLTP systems like airline reservations, as well as BI processing such as reporting and key performance indicator (KPI) production. However, the move towards near real-time BI and operational analytics, in particular, is shifting the focus to the ever closer relationship between operational and informational needs. For technology, the emphasis is moving from systems optimized for particular tasks to those with high performance across multiple areas. We thus see hybrid systems emerging, where vendors blend differing technologies—SMP and MPP, solid-state and disk storage, row- and column-based database techniques—in various combinations to address complex business needs.
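The row-versus-column trade-off can be seen in miniature in plain Python. This toy illustration is mine, not a description of any product's storage engine:

# The same three records in row and column layouts. A query that needs
# only "amount" touches one contiguous list in the columnar form, but
# must walk every whole record in the row form.
rows = [
    {"id": 1, "region": "EMEA", "amount": 120.0},
    {"id": 2, "region": "APAC", "amount": 75.5},
    {"id": 3, "region": "AMER", "amount": 210.0},
]

columns = {
    "id":     [1, 2, 3],
    "region": ["EMEA", "APAC", "AMER"],
    "amount": [120.0, 75.5, 210.0],
}

total_row_based = sum(r["amount"] for r in rows)    # reads whole records
total_columnar = sum(columns["amount"])             # reads one column only
assert total_row_based == total_columnar == 405.5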
Operational analytics, as we've seen, demands an environment equally capable of handling operational and informational tasks. Furthermore, these tasks can be invoked in any sequence at any time. Therefore, in such hybrid systems, the technologies used must be blended seamlessly together, transparently to users and applications, and automatically managed by the database technology to ease data management.

Beyond pure technology considerations, operational analytics has operating characteristics that differ significantly from traditional BI. Because operational analytics is, by definition, integrated into the operational processes of the business, the entire operational analytics process must have the same performance, reliability, availability and security (RAS) characteristics as the traditional operational systems themselves. Processes that include operational analytics will be expected to return results with the same response time—often sub-second—as standard transactions. They must have the same high availability—often greater than 99.9%—and the same high levels of security and traceability. Simply put, operational analytics systems "inherit" the service level agreements (SLAs) and security needs of the OLTP systems rather than those of the data warehouse.

If we consider the usage characteristics of operational analytics systems, we see two aspects. First, there is the more traditional analysis and modeling that is familiar to BI users. Second, there is the operational phase that is the preserve of front-office users. While the first group comprises skilled and experienced BI analysts, the second has more limited computer skills, as well as less time and inclination to learn them. In addition, it is the front-office users who have daily interaction with the system. As a result, usage characteristics such as usability, training and support must also lean towards those of the OLTP environment.

These operating and usage characteristics lead to the conclusion that the hybrid technology environment required for operational analytics should preferably be built out from the existing OLTP environment rather than from its data warehouse counterpart.
Such an approach avoids upgrading the RAS characteristics of the data warehouse—a potentially complex and expensive procedure that has little or no benefit for traditional BI processes. Furthermore, it can allow a reduction in the copying of data from the OLTP to the BI environment—a particularly attractive option, given that near real-time data is often needed in the operational analytic environment.

IBM System z operational and informational processing
IBM System z with DB2 for z/OS continues to be the premier platform for OLTP systems, providing high reliability, availability and security as well as high performance and throughput. For higher performance still, IMS is the database of choice. Despite numerous obituaries since the 1990s, over 70% of global Fortune 500 companies still run high-performance OLTP on System z. DB2 for z/OS has always been highly optimized for OLTP rather than the very different processing and access characteristics of heavy analytic workloads, although DB2 10 redresses the balance somewhat. So, given the wealth of transaction data in DB2 or IMS on z/OS, the question has long arisen as to where BI data and applications should be located. Following the traditional layered EDW / data mart architecture shown in figure 1b, a number of options were traditionally considered:

1. EDW and data marts together on DB2 on z/OS, in a partition separate from OLTP systems. This option offers minimal data movement and an environment that takes full advantage of z/OS skills and RAS strengths. However, in the past, mainframe processing was seen as comparatively expensive, existing systems were already heavily utilized for OLTP and many common BI tools were unavailable on this platform.

2. EDW and/or data marts distributed to other physical servers running different operating systems. Faced with the issues above, customers had to choose between distributing only their data marts or both EDW and marts to a different platform. When both EDW and data marts were used for extensive analysis, customers often chose the latter to optimize BI processing on dedicated BI platforms, such as Teradata. Distributing data marts alone was often driven by specific departmental needs for specialized analysis tools. The major drawback of this approach is that it drives an enormous proliferation of servers and data stores. Data center, data management and distribution costs all increase dramatically.

3. EDW on DB2 on z/OS and data marts distributed to other operating systems and/or servers, managed by z/OS. In recent years, IBM has extended the System z environment in a number of ways to provide optimal support for BI processing. Linux, available since the early 2000s, enables customers to run BI (and other) applications developed for this platform on System z. The IBM zEnterprise BladeCenter Extension (zBX), a hardware solution introduced in 2010, runs Windows and AIX systems under the control and management of System z, further expanding customers' options for running non-native BI applications under the control and management of z/OS. These approaches support both EDW and data marts, although typical reporting EDW and staging-area processing can be optimized very well on DB2 on z/OS and are often placed there.

This third option offers significant benefits. Reducing the number and variety of servers simplifies the data center and reduces its TCO.
Distribution of data is reduced, leading to lower networking costs. Fewer copies of data cut storage costs and, most importantly, diminish the costs of managing the data as business needs change. In addition, zBX is an effective approach to moving BI processing to more appropriate platforms and freeing up mainframe cycles for other purposes.
A 2010 paper11 by Rubin Worldwide, an analyst organization specializing in technology economics, provides statistical evidence of the value of option 3 in a more general sense. It compares the average cost of goods across industries between companies that are mainframe-biased and those that favor a distributed server approach. The figures show an average additional cost of over 25% for the distributed model. Only in the case of Web-centric businesses is the balance reversed. A more detailed analysis of the financial services sector12 shows a stronger case for the mainframe-centric approach. It appears that customers have begun to take notice too—the last two years have seen the beginnings of an upward trend in mainframe purchases and an expansion in use cases.

IBM DB2 Analytics Accelerator—to System z and DB2, just add Netezza
Available since November 2011, the IBM DB2 Analytics Accelerator (which, for ease of use, I'll abbreviate to IDAA) 2.1 is a hardware/software appliance that deeply integrates the Netezza server, acquired by IBM just one year earlier, with System z and DB2 on z/OS. From a DB2 user and application perspective on z/OS, only one thing changes—vastly improved analytic response times at lower cost. The DB2 code remains the same. User access is exactly the same as it always was. Reliability, availability and security are at the same level as for System z. Data management is handled by DB2.

IDAA hardware
With Netezza, IBM acquired a hardware-assisted, MPP, row-based relational database appliance, shown in figure 2. Two redundant SMP hosts manage the massively parallel environment, as well as handling all SQL compilation, planning and administration. Parallel processing is provided by up to 12 Snippet Blades™ (S-Blades) with 96 CPUs, 8 per blade, in each cabinet. Each S-Blade, with 16GB of dedicated memory, is a high-performance database engine for streaming joins, aggregations, sorts and so on. The real performance boosters are the four dual-core field programmable gate arrays (FPGAs) on each blade, which mediate data from the disks, uncompressing it and filtering out columns and rows that are irrelevant to the particular query being processed. The CPU then performs all remaining SQL functions and passes results back to the host. Each S-Blade has its own dedicated disk array, holding up to 128TB of uncompressed data per cabinet. In the near future, up to 10 cabinets can be combined, giving a total effective data capacity of 1.25 petabytes and nearly 2,000 processors.

Figure 2: Structure of the IBM Netezza appliance (SMP hosts, S-Blades combining FPGA, CPU and memory, disk enclosures and network fabric)

The IDAA appliance is simply a Netezza box (or boxes) attached via the twin SMP hosts to the System z over two dedicated 10Gb networks through which all data and communications pass, a design that ensures there is no single point of failure. All network access to the appliance is through these dedicated links, providing load speeds of up to 1.5TB/hour and offering the high levels of security and systems management for which System z is renowned. Additional deployment options allow multiple IDAAs to be attached to one System z, and multiple System z machines to share one or more IDAAs.

IDAA software
IDAA software consists of an update to DB2 and a Data Studio plug-in that manage a set of stored procedures running in DB2 9 or 10 for z/OS. Figure 3 shows the basic configuration and operation. The DB2 optimizer analyzes queries received from an application or user.
Any queries judged suitable for acceleration by the IDAA appliance are passed to it via the distributed relational database architecture (DRDA) interface, and results flow back by the same route.
Any queries that cannot or should not be passed to IDAA are run as normal in DB2 for z/OS. Because DB2 mediates all queries to IDAA, the appliance is invisible from a user or application viewpoint. Analytic queries simply run faster.

Figure 3: Positioning IDAA with DB2 on z/OS (the DB2 optimizer routes eligible queries to the accelerator via a DRDA requester; all other queries execute in DB2 as before)

DB2 applications that previously ran against DB2 on z/OS run without any code change on the upgraded system. Dynamic SQL is currently supported; static SQL is coming soon. All DB2 functions, such as EXPLAIN and billing statistics, work as before, even when the query is routed in whole or in part to the Netezza box. IDAA is so closely integrated into DB2 that it appears to a user or administrator as an internal DB2 process, much like the lock manager or resource manager.

Some or all of the data in DB2 on z/OS must, of course, be copied onto the IDAA box and sliced across the disks there before any queries can run there. The tables to be deployed on the IDAA box are defined through a client application, the Data Studio plug-in, which guides the DBA through the process and creates stored procedures to deploy, load and update tables, create appropriate metadata on DB2 and on the IDAA box, and run all administrative tasks. Incremental update of IDAA tables is planned for the near future.

IDAA implementation and results
Given the prerequisite hardware and software, installing the IDAA appliance and getting it up and running is a remarkably simple and speedy exercise. In most cases, it takes less than a couple of days to physically connect the appliance, install the software and define and deploy the tables and data onto the box. Because there are no changes to existing DB2 SQL, previously developed applications can be run immediately with little or no testing. Users and applications see immediate benefits.

The performance improvements achieved clearly depend on the type of query involved, as well as on the size of the base table and the number of rows and columns in the result set. However, customer results speak for themselves. At the high end, queries that take over two hours to run on DB2 on z/OS return results in 5 seconds on IDAA—a performance improvement of over 1,500 times. Of course, other queries show smaller benefits. As queries run faster, they also save CPU resources, costing less and reducing analysts' waiting time for delivery of results. Even where the speed gain is smaller, it often still makes sense to offload queries onto the IDAA platform, freeing up more costly mainframe resources for other tasks and taking advantage of the lower power and cooling needs of the Netezza box. The actual mix of queries determines the overall performance improvement, and how the freed-up mainframe cycles are redeployed affects the level of savings achieved. One customer, however, anticipates a return on investment in less than four months.
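From the application side, exploiting the accelerator can be as simple as running the same SQL as before. The sketch below uses IBM's ibm_db Python driver to show the general shape; the connection string, credentials and table name are placeholders, and the CURRENT QUERY ACCELERATION special register shown is, to my knowledge, the DB2 10 mechanism for influencing routing of dynamic SQL, so check version-specific documentation for exact settings and defaults.

import ibm_db

# Placeholder DSN and credentials for a DB2 for z/OS data warehouse.
conn = ibm_db.connect("DATABASE=DWHDB;HOSTNAME=zhost;PORT=446;"
                      "PROTOCOL=TCPIP;UID=analyst;PWD=secret", "", "")

# Ask DB2 to consider the accelerator for eligible dynamic SQL.
ibm_db.exec_immediate(conn, "SET CURRENT QUERY ACCELERATION = ENABLE")

# The query text is unchanged; the optimizer decides where it runs.
stmt = ibm_db.exec_immediate(conn, """
    SELECT region, SUM(amount)
    FROM sales_history
    GROUP BY region
""")
row = ibm_db.fetch_tuple(stmt)
while row:
    print(row)
    row = ibm_db.fetch_tuple(stmt)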
Business benefits and architectural advantages

Business benefits
We've already seen the direct bottom-line benefit of faster processing and reduced CPU loads, freeing up mainframe resources to do the work they are optimized for. Of more interest, perhaps, is the opportunity for users to move to an entirely new approach to analytics, testing multiple hypotheses in the time they could previously try only one. Innovation is accelerated by orders of magnitude as analysts can work at the speed of their thinking, rather than the speed of the slowest query.

In terms of operational analytics and operational BI applications, the division of labor between the two environments is particularly appropriate. Furthermore, it is entirely transparent. Complex analytical queries requiring extensive table scans of large, historical data sets run on IDAA. Results returned from the analysis can be joined with current or near real-time data in the data warehouse on the System z to deliver immediate recommendations, creating, in effect, a high-performance operational BI service.

Recall that the OLTP application environment also resides on the mainframe. We can thus envisage, for example, a bank call center application running in the OLTP environment with direct, real-time access to customer account balances and the most recent transactions. When more complete, cross-account, historical information is needed, it can be obtained from the data warehouse environment via a service oriented architecture (SOA) approach. If more extensive analytics is required for cross- or up-selling, the CPU-intensive analysis is delegated to IDAA, providing the possibility of doing analyses in seconds that previously would have taken far longer than the customer would remain on the line.

What we see here is the emergence of an integrated information environment that spans traditional OLTP and informational uses. This is in line with today's and future business needs, which erase the old distinction between the two worlds. Furthermore, the TCO benefits of a consolidated mainframe-based platform, as discussed earlier, suggest that there are significant cost savings to be achieved with this approach, driving further bottom-line business benefit.

Architectural advantages
Returning to our earlier list of architectural deployment options, we can see that the IDAA approach is essentially an extension of option 3: EDW on DB2 on z/OS and data marts distributed to other operating systems and/or servers, managed by z/OS. The data in DB2 on z/OS has the characteristics of an EDW; that on the IDAA is a dependent data mart, fed from the EDW. The important point is that, while the IDAA data mart is implemented on another physical server, it is managed entirely by the same DBMS as the EDW. This management function extends from loading and updating the data mart to providing the single point of interface for both the EDW and the data mart.

Using the database management system (DBMS) to manage load and update—as opposed to using an extract, transform and load (ETL) tool—may seem like a small step. However, it is an important first step in simplifying the overall data warehouse environment. As we saw in the business benefits, mixed-workload applications are becoming more and more important. Such applications demand that equivalent data be stored in two (or maybe more) formats for efficient processing. Bringing the management and synchronization of these multiple copies into the DBMS is key to ensuring data quality and consistency within increasingly tight time constraints.
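To see what DBMS-managed deployment and load look like, here is a deliberately schematic sketch. The SYSPROC.ACCEL_* stored procedure names are part of IDAA's documented administration interface, but the parameter lists below are simplified assumptions (the real procedures take XML specification documents whose layout varies by version), and the accelerator and table names are invented.

import ibm_db

# Placeholder DSN and credentials; administration runs through DB2 itself.
conn = ibm_db.connect("DATABASE=DWHDB;HOSTNAME=zhost;PORT=446;"
                      "PROTOCOL=TCPIP;UID=dba;PWD=secret", "", "")

accelerator = "IDAA1"                                   # hypothetical name
table_spec = "<tableSpecifications>...</tableSpecifications>"  # elided XML

# Define the table on the accelerator, then load it from DB2 -- the point
# being that no external ETL tool is involved; DB2 manages both copies.
ibm_db.callproc(conn, "SYSPROC.ACCEL_ADD_TABLES", (accelerator, table_spec, ""))
ibm_db.callproc(conn, "SYSPROC.ACCEL_LOAD_TABLES", (accelerator, table_spec, ""))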
The operational BI / call center application mentioned in the previous section can be generalized into the architectural view shown in figure 4.
In this view, we see both the operational and informational environments implemented on the System z, both benefiting from the advanced RAS characteristics of the mainframe environment. ETL within the same platform maximizes the efficiency of loading and updating the warehouse. Within the warehouse, the DB2 DBMS takes responsibility for loading and updating the IDAA analytic data mart, as previously described. Other data marts can also be consolidated from distributed platforms into the mainframe-based data warehouse for reasons of performance or security. These data marts are also maintained by the DBMS, using extract, load and transform (ELT) techniques. Communication between the operational and informational systems may be via SOA, as shown in the figure; of course, other techniques such as DRDA could be used.

Figure 4: A new operational / informational architecture (OLTP applications on IMS or DB2 on z/OS communicating via an SOA interface with the DB2 on z/OS data warehouse, whose analytic data mart resides on IDAA, all managed and secured by System z)
Conclusions

"It is not my job to have all the answers, but it is my job to ask lots of penetrating, disturbing and occasionally almost offensive questions as part of the analytic process that leads to insight and refinement." 13

Businesses today face increasing pressure to act quickly and appropriately in all aspects of operations, from supply chain management to customer engagement and everything in between and beyond. This combination of right time and right answer can be challenging. The right answer—in terms of consistent, quality data—comes from the data warehouse. The right time is typically the concern of operational systems. Operational BI spans the gap and, in particular where there are large volumes of information available, operational analytics provides the answers.

The current popularity of operational analytics stems from the enormous and rapidly increasing volumes of data now available and the technological advances that enable far more rapid processing of such volumes. However, when implemented in the traditional data warehouse architecture, operational BI and analytics have encountered some challenges, including data transfer volumes, RAS limitations and restrictions in connection to the operational environment.

The IBM DB2 Analytics Accelerator appliance directly addresses these challenges. Running completely transparently under DB2 on z/OS, the appliance is an IBM Netezza MPP machine directly attached to the System z. Existing and new queries with demanding data access characteristics are automatically routed to the appliance. Performance gains of over 1,500x have been recorded for some query types. The combination of MPP query performance and System z's renowned security and reliability characteristics provides an ideal platform on which to build a high-availability operational analytics environment that enables business users to act at the speed of their thinking.

For customers who run a large percentage of their OLTP systems on z/OS and have chosen DB2 on z/OS as their data warehouse platform, IDAA is an obvious choice to turbo-charge query performance for analytic applications. For those who long ago chose to place their data warehouse elsewhere, it may be the reason to revisit that decision. This approach reflects what IBM calls freedom by design, as it simplifies the systems architecture for the business. It also provides an ideal platform for consolidating data marts from distributed systems back to the mainframe environment, with clear data management benefits for IT and significant reductions in total cost of ownership for the whole computing environment. For business, the clear benefit is the close link from BI analysis to immediate business actions of real value.

For more information, please go to www.ibm.com/systemzdata
Dr. Barry Devlin is among the foremost authorities on business insight and one of the founders of data warehousing, having published the first architectural paper on the topic in 1988. With over 30 years of IT experience, including 20 years with IBM as a Distinguished Engineer, he is a widely respected analyst, consultant, lecturer and author of the seminal book, "Data Warehouse—from Architecture to Implementation", and numerous White Papers.

Barry is founder and principal of 9sight Consulting. He specializes in the human, organizational and IT implications of deep business insight solutions that combine operational, informational and collaborative environments. A regular contributor to BeyeNETWORK, Focus, SmartDataCollective and TDWI, Barry is based in Cape Town, South Africa and operates worldwide.

Brand and product names mentioned in this paper are the trademarks or registered trademarks of IBM. This paper was sponsored by IBM.

1 IBM Institute of Business Value, "Customer analytics pay off", GBE03425-USEN-00, (2011)
2 "Business analytics will enable tailored flight pricing, says American Airlines", Computer Weekly, http://bit.ly/znTJrc, 28 October 2010, accessed 14 February 2012
3 Marlowe, C., "Doctor Faustus", act 1, scene 1, (c.1592)
4 http://en.wikipedia.org/wiki/Analytics, accessed 24 January 2012
5 Davenport, T. H. and Harris, J. G., "Competing on Analytics: The New Science of Winning", Harvard Business School Press, (2007)
6 Taylor, J., "Where to Begin with Predictive Analytics", http://bit.ly/yr333L, 1 September 2011, accessed 8 February 2012
7 Selden, L. and Colvin, G., "Killer Customers: Tell the Good from the Bad and Crush Your Competitors", Portfolio, (2004)
8 Devlin, B. A. and Murphy, P. T., "An architecture for a business and information system", IBM Systems Journal, Volume 27, Number 1, Page 60, (1988), http://bit.ly/EBIS1988
9 Devlin, B., "Data Warehouse—From Architecture to Implementation", Addison-Wesley, (1997)
10 Inmon, W. H., Imhoff, C. and Battas, G., "Building the Operational Data Store", John Wiley & Sons, (1996), http://bit.ly/ODS1995
11 Rubin, H. R., "Economics of Computing—The Internal Combustion Mainframe", (2010), http://bit.ly/zQ1y8D, accessed 16 March 2012
12 Rubin, H. R., "Technology Economics: The Cost Effectiveness of Mainframe Computing", (2010), http://bit.ly/wsBHRb, accessed 16 March 2012
13 Gary Loveman, Chairman of the Board, President and CEO, Harrah's, quoted in Accenture presentation "Knowing Beats Guessing", http://bit.ly/AvlAao, June 2008, accessed 5 March 2012