Thinking of Upgrading to Oracle SOA Suite 11g? Knowing The Right Steps Is Key (article)
Volume 18 | Number 4
Fourth Quarter 2011
For the Complete Technology & Database Professional
www.ioug.org
BUSINESS
INTELLIGENCE
Understanding Oracle BI
Components and Repository
Modeling Basics
by Abhinav Banerjee
Finding Oracle Database
Machine’s Rightful Place in Your
IT Organization’s Arsenal
by Jim Czuprynski
Going Live on Oracle Exadata
by Marc Fielding
Executive Editor
April Sims
Associate Editor
John Kanagaraj
Asia-Pacific Technical Contributor
Tony Jambu
Managing Editor
Theresa Wojtalewicz
Associate Editor
Alexa Schlosser
Contributing Editors
Ian Abramson
Gary Gordhamer
Arup Nanda
Board Liaison
Todd Sheetz
How can I contribute to SELECT Journal?
Write us a letter. Submit an article. Report on
Oracle use in all corners of the globe.
We prefer articles that conform to APA guide-
lines. Send to select@ioug.org.
Headquarters
Independent Oracle Users Group
401 North Michigan Avenue
Chicago, IL 60611-4267
USA
Phone: +1.312.245.1579
Fax: +1.312.527.6785
E-mail: ioug@ioug.org
Editorial
Theresa Wojtalewicz
Managing Editor
IOUG Headquarters
Phone: +1.312.673.5870
Fax: +1.312.245.1094
E-mail: twojtalewicz@ioug.org
How do I get the next one?
SELECT Journal is a benefit to members of
the Independent Oracle Users Group. For
more information, contact IOUG Headquarters
at +1.312.245.1579
SELECT Journal Online
For the latest updates and addendums or to
download the articles in this and past issues of
SELECT Journal, visit www.selectjournal.org.
Copyright Independent Oracle Users Group 2011 unless otherwise
indicated. All rights reserved. No part of this publication may be
reprinted or reproduced without permission from the editor.
The information is provided on an “as is” basis. The authors,
contributors, editors, publishers, the IOUG and Oracle Corporation
shall have neither liability nor responsibility to any person
or entity with respect to any loss or damages arising from the
information contained in this publication or from the use of
the programs or program segments that are included. This is
not a publication of Oracle Corporation, nor was it produced in
conjunction with Oracle Corporation.
4th Qtr 2011 ■ Page 1
CONTENTS
Volume 18, No. 4, 4th Qtr. 2011
Features
5 Understanding Oracle BI Components and Repository Modeling Basics
By Abhinav Banerjee
Abhinav discusses how unsuccessful or delayed BI implementations are most often attributed to
an improperly modeled repository not adhering to basic dimensional modeling principles.
12 Finding Oracle Database Machine’s Rightful Place in Your IT Organization’s Arsenal
By Jim Czuprynski
Jim explains how new capabilities in 11gR2 are likely to significantly improve the performance
and throughput of database applications that can be leveraged for improved database
application performance even without implementing an Exadata solution.
18 Going Live On Oracle Exadata
By Marc Fielding
Marc tells the story of a real-world Exadata Database Machine deployment integrating OBIEE
analytics and third-party ETL tools in a geographically distributed, high-availability architecture.
22 Thinking of Upgrading to Oracle SOA Suite 11g? Knowing The Right Steps Is Key
By Ahmed Aboulnaga
Ahmed delves into how upgrading from Oracle SOA Suite 10g to 11g can be costly due to the
dramatic change in the underlying architecture. This article takes you through a tried-and-
tested upgrade strategy to help you avoid the pitfalls early adopters have faced.
28 Beating the Optimizer
By Jonathan Lewis
How do you access data efficiently if there’s no perfect index? Jonathan provides insight on
how to creatively combine indexes in ways that the optimizer cannot yet manage and, by
doing so, minimize the number of table blocks there are to access.
Regular Features
2 From the Editor
3 From the IOUG President
27 Users Group Calendar
30 SELECT Star
33 Advertisers’ Index
34 Quick Study
Reviewers for This Issue
Dan Hotka
Carol B. Baldan
Kimberly Floss
Sumit Sengupta
Darryl Hickson
Chandu Patel
Aaron Diehl
First, I would like to introduce a new contributing editor, long-standing IOUG
volunteer Ian Abramson. His data warehouse expertise was a much-needed
asset to round out our editorial board.
There are also a few new features I would like to bring to readers’ attention.
We have added the list of reviewers to the Table of Contents to thank them for
their dedication and hard work. SELECT Journal depends on reviewers to
make sure each article’s technical details are correct and pertinent to readers.
Another new feature, called Quick Study, allows SELECT Journal to give a nod
to key volunteers at the IOUG. Readers are given a glimpse of the volunteers’
worlds through a few short questions.
The PowerTips feature (see example on this page), SELECT Journal’s last
enhancement, is a collection of small gems of knowledge throughout the
magazine related to the overall theme of a particular issue. Q4 is focused on
Business Intelligence, making Mark Rittman’s presentation at COLLABORATE
2011, “Oracle Business Intelligence: 11g Architecture and Internals,” a perfect
choice for this role.
2012 Themes and Topics Announcement
We are actively looking for new authors to write on the following topics:
• Q1 Theme, Tools for Change: Real Application Testing, SQL
Performance Analyzer, Oracle Application Testing Suite, SQL Plan
Management, Edition-Based Redefinition
• Q2 Theme, Time for Change: Case Studies and Migrations, including
Non-RAC to RAC, Using GoldenGate as an Upgrade, Non-ASM to Grid (RAC or
Non-RAC), Exadata/Exalogic, 11gR2+ Upgrades
• Q3 Theme, Security: Oracle Database Firewall, Audit Vault, Oracle
Label Security, Oracle Advanced Security, Hardening FMW11g, Oracle
Entitlements Server
• Q4 Theme, Business Intelligence/Data Warehouse: Big Data,
Performance, Advanced Features, Case Studies
If you are interested in writing on any of these topics, email select@ioug.org.
2012 Print vs. Digital Debate
This year, we have come up with the best compromise for digital versus paper
copies of SELECT Journal. We are printing a hard copy of two editions for
2012: Q1 and Q3. The remaining two editions, Q2 and Q4, will be digital, with
downloadable PDFs of all versions available online at http://www.ioug.org.
Welcome to the Q4 2011 issue of SELECT Journal!
From the Editor
Why are we doing this? It allows us to have paper copy of the magazine for
the IOUG’s major events, COLLABORATE and Oracle OpenWorld. The digital
version, on the other hand, allows for more features and longer articles.
Once again, we would like to thank all the authors and reviewers who
contribute to SELECT Journal for their efforts in providing high-quality
content. We always welcome your input and feedback, and we especially look
forward to you sharing your expertise via this fine medium. Email us at
select@ioug.org if you would like to submit an article or sign up to become
a reviewer.
April Sims
Executive Editor, IOUG SELECT Journal
BI Tip | Scale-out BI System
If you use the “Scale-out BI System” option within the “Enterprise” install
type to scale-out your OBIEE 11g system over additional servers, be aware
that the embedded WebLogic license that you get with OBIEE 11g is for
Standard Edition, which does not include support for WebLogic Clustering,
not Enterprise Edition. Therefore, if you wish to use this horizontal scale-out
feature, you’ll need to upgrade your WebLogic license, as a separate
transaction, from Standard to Enterprise Edition, before you can legally use
this feature.
From Mark Rittman’s COLLABORATE 11 presentation
“Oracle Business Intelligence 11g Architecture and Internals”
For Exadata customers, we are working on developing educational content for
online and in-person training offerings. We plan to launch a set of Exadata
webinars providing a curriculum-based approach to Exadata education. The
new IOUG Exadata SIG is active on LinkedIn and forming its web presence.
New for COLLABORATE 12 Conference
I always hear how critical it is to get out of the office and get some training.
I’m happy to say that the IOUG continues to present, in partnership with Quest
and OAUG, COLLABORATE 12 (April 22-26, 2012, in Las Vegas). Collectively, we
bring to the community more than 1,000 peer-driven technical sessions that
provide first-hand insight behind Oracle products. Full-week training can cost
as much as $5,000, but for less than $2,000 and by registering through the
IOUG, you can access:
• Free Full-Day Deep Dives (the cost for non-IOUG registrants is $595). Deep
dives are held the Sunday before the conference opens. See the listing of
offerings for yourself, including topics on BI and Exadata.
• Exclusive Access to Hands-On Labs. These labs are included in the cost
of your IOUG registration. Take a look at topics.
• IOUG Boot Camps. These are horizontal sessions that give you the
chance to learn about specialized topics for one or two full days during
the conference. They’re led by great minds and included in the cost of your
full conference registration. Back again this year for attendees who are
fairly new to the job, we offer our popular DBA 101 Boot Camp.
Registration for COLLABORATE 12 opens Nov. 7, 2011.
As a user group, the IOUG exists for you and because of you. Whether you’ve
recently joined or have been with us for years, I hope that we can be the source
that you turn to, again and again, to solve problems, expand your knowledge,
manage your career and, in short, make work life better.
Sincerely,
Andrew Flower
IOUG President
“IOUG provides great practical education and sharing of best practices on
Oracle technology, especially Business Intelligence,” says Ari Kaplan, analytics
manager for the Chicago Cubs and former president of IOUG. “I apply
what I have learned over the years from IOUG to help mine through large
amounts of data and find the most impactful and actionable information for
our organization.”
I hope you enjoy this issue! The IOUG is always very focused on a Business
Intelligence curriculum, which is becoming increasingly important to all data
management professionals. This issue is one great example of the caliber and
quality of the educational offerings we provide. Thank you to our terrific
editorial board for their support, leadership and vision on bringing this great
content to the community!
This Issue’s Feature: Business Intelligence
Knowing that Business Intelligence is a significant area of interest to our
members and the community, we’ve dedicated this issue of SELECT Journal to
an examination of the process of translating data into a meaningful way of
connecting the dots to drive business insight.
In-depth as it may be, this journal is just the tip of the iceberg when it comes
to the IOUG’s educational offerings. A quick tour of www.ioug.org reveals many
other opportunities for you to become educated about the latest BI issues,
including the IOUG Library, webinars, discussion forums, newsletters and
the Tips & Best Practices Booklet — and, as a member, you can access all of
this for free.
Year-Round Education
One of the best aspects of participating with the IOUG year round is access
to a Special Interest Group (SIG). Whether you are grappling with the
best way to implement a data warehouse that integrates information for the
business and underpins your analytics; making recommendations on how to
be more efficient delivering information; or looking to get the facts you need
to improve the underlying technology, such as an investment in Exadata, there
are others outside your company’s internal team that have similar interests
and objectives.
Participation in a Special Interest Group offers access to fellow members and
Oracle product managers and engineers. IOUG BIWA (Business Intelligence and
Data Warehousing) has thousands of members online and targeted content to
bring together like-minded individuals.
Dear Fellow IOUG Members…
From the IOUG President
Understanding Oracle BI
Components and Repository
Modeling Basics
By Abhinav Banerjee
The importance of Business Intelligence (BI) is rising
by the day. BI systems, which help organizations
make better and more informed decisions, are
becoming crucial for success. There still are scenarios of
huge BI investments going haywire; for example, multiple
iterations of BI investments can exceed time and budget
limits and implementations can fail user acceptance. One
of the most common reasons for unsuccessful or delayed BI
implementations is an improperly modeled repository (which stores
the metadata/business logic used by the BI server) not
adhering to basic dimensional modeling principles. This article
discusses this subject and describes the intricacies related to
repository modeling and the associated concepts.
Introduction
In an Oracle Business Intelligence (OBI) implementation, the repository plays
the most important role as the heart of the BI environment. The entire BI
implementation can go wrong because of a repository that is not well
designed. Designing and modeling the repository (RPD) is one of the most
complex processes in an OBI implementation. It relies on knowledge of a
few principles, which include dimensional modeling and data modeling.
In any implementation, we need to ensure our data and dimensional models
are well-designed. The data or dimensional model plays a significant role
depending on the reporting requirements, which might be either operational
or analytical. Once these models are in place, we need to ensure that the
physical and the business models are properly designed and developed.
It is highly recommended to have a well-designed dimensional model to
ensure better performance even if you have a requirement for operational
reporting; dimensional models are optimized for reporting, whereas the
traditional data-relational models are optimized for transactions. The
complexity increases when requirements might include level-based measures,
aggregates, multiple facts, multiple logical sources, conforming dimensions,
slowly changing dimensions or very large data volumes.
Dimensional Modeling (DM)
DM refers to the methodology used to design data warehouses that need to
support high performance for querying/reporting using the concept of facts
and dimensions.
Facts, or measures, refer to the measurable items or numeric values. These
include sales quantity, sales amount, time taken, etc.
Dimensions are the descriptors or the relative terms for the measures.
Therefore, you have facts relative to the dimensions. Some of the most
common dimensions include account, customer, product and date.
Dimensional modeling includes the design of star or snowflake schema.
Star Schema
The star schema architecture constitutes a central fact table with multiple
dimension tables surrounding it. It has one-to-many relationships
between the dimensions and the fact table. The dimensions typically have the
relative descriptive attributes that describe business entities. In case of a star
schema, no two dimensions will be joined directly; rather, all the joins between
the dimensions will be through the central fact table. The facts and dimensions
are joined through a foreign key relationship, with the dimension having the
primary key and the fact having the foreign keys to join to the dimension.
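As a minimal sketch of the star schema just described, the following Python example uses the standard sqlite3 module to create one fact table with two surrounding dimension tables. The table and column names (fact_sales, dim_product, dim_customer) and the sample rows are assumptions made for illustration only; they do not come from any Oracle sample schema.

```python
import sqlite3


def build_star_schema(conn):
    """Create one central fact table and two dimension tables (hypothetical names)."""
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE dim_product  (product_key  INTEGER PRIMARY KEY, product_name  TEXT);
        CREATE TABLE dim_customer (customer_key INTEGER PRIMARY KEY, customer_name TEXT);
        -- The fact table carries the measures plus one foreign key per dimension.
        CREATE TABLE fact_sales (
            product_key  INTEGER NOT NULL REFERENCES dim_product(product_key),
            customer_key INTEGER NOT NULL REFERENCES dim_customer(customer_key),
            sales_qty    INTEGER,
            sales_amount REAL
        );
    """)


def load_sample_rows(conn):
    """Insert a few illustrative rows; each dimension row can match many fact rows."""
    conn.executemany("INSERT INTO dim_product VALUES (?, ?)",
                     [(1, "Widget"), (2, "Gadget")])
    conn.executemany("INSERT INTO dim_customer VALUES (?, ?)",
                     [(10, "Acme"), (20, "Globex")])
    conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
                     [(1, 10, 2, 10.0), (1, 20, 3, 15.0), (2, 10, 1, 30.0)])


def sales_by_product(conn):
    """Join the fact to a dimension through its key and total the measure."""
    return conn.execute("""
        SELECT p.product_name, SUM(f.sales_amount)
        FROM fact_sales f
        JOIN dim_product p ON p.product_key = f.product_key
        GROUP BY p.product_name
        ORDER BY p.product_name
    """).fetchall()


conn = sqlite3.connect(":memory:")
build_star_schema(conn)
load_sample_rows(conn)
print(sales_by_product(conn))
```

Note that the two dimensions are never joined to each other; any query that needs both goes through fact_sales.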
Snowflake Schema
The snowflake schema architecture also has a central fact table with multiple
dimension tables and one-to-many relationships between the dimensions and the
fact table, but it also has one-to-many relationships between dimensions.
The dimensions are further normalized into multiple related tables, so
multiple dimension tables exist related to the main dimension table. A primary
key-foreign key relationship exists between the dimension and the fact tables
as well as between dimensions.
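To contrast this with a star schema, here is a small sketch, again using sqlite3, in which a product dimension is normalized into a separate category table. All names (dim_category, dim_product, fact_sales) and the data are hypothetical and chosen only to show the extra dimension-to-dimension join.

```python
import sqlite3


def build_snowflake_schema(conn):
    """Create a fact table, a product dimension, and a normalized category dimension."""
    conn.executescript("""
        CREATE TABLE dim_category (category_key INTEGER PRIMARY KEY, category_name TEXT);
        -- The product dimension points to its parent dimension (one category, many products).
        CREATE TABLE dim_product (
            product_key  INTEGER PRIMARY KEY,
            product_name TEXT,
            category_key INTEGER NOT NULL REFERENCES dim_category(category_key)
        );
        CREATE TABLE fact_sales (
            product_key  INTEGER NOT NULL REFERENCES dim_product(product_key),
            sales_amount REAL
        );
        INSERT INTO dim_category VALUES (100, 'Hardware'), (200, 'Software');
        INSERT INTO dim_product  VALUES (1, 'Widget', 100), (2, 'Gadget', 100), (3, 'App', 200);
        INSERT INTO fact_sales   VALUES (1, 10.0), (2, 30.0), (3, 5.0), (3, 7.5);
    """)


def sales_by_category(conn):
    """Reaching the category requires two joins: fact to product, product to category."""
    return conn.execute("""
        SELECT c.category_name, SUM(f.sales_amount)
        FROM fact_sales f
        JOIN dim_product  p ON p.product_key  = f.product_key
        JOIN dim_category c ON c.category_key = p.category_key
        GROUP BY c.category_name
        ORDER BY c.category_name
    """).fetchall()


conn = sqlite3.connect(":memory:")
build_snowflake_schema(conn)
print(sales_by_category(conn))
```

The additional join is the price of normalization; in a star schema the category name would simply be another column of the product dimension.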
Oracle BI Architecture
In order to understand the importance of the repository, we will need to have a
look at the Oracle Business Intelligence Enterprise Edition (OBIEE) architecture.
The OBI repository works directly with the Oracle BI server, which in turn talks
to the database, presentation services and the security service.
OBIEE is a state-of-the-art, next-generation BI platform that provides optimized
intelligence to take advantage of the relational/multidimensional database
technologies. It leverages the common industry techniques based on data
warehousing and dimensional modeling. The OBIEE engine dynamically generates
the required SQL based on the user’s inputs and designed model/definition in
the repository to fetch data for the reports from the related databases.
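The idea of generating SQL from a metadata definition can be illustrated with a toy sketch. This is not the BI server's algorithm, which is far more sophisticated; it only shows how a mapping from business names to physical columns, plus stored join conditions, is enough to build a query from a user's column selection. Every name below (LOGICAL_COLUMNS, JOINS, FACT_TABLE, generate_sql, and the tables) is hypothetical.

```python
# Hypothetical metadata: business column name -> physical expression and aggregation rule.
LOGICAL_COLUMNS = {
    "Product Name": {"table": "dim_product",
                     "expr": "dim_product.product_name",
                     "aggregate": None},
    "Sales Amount": {"table": "fact_sales",
                     "expr": "fact_sales.sales_amount",
                     "aggregate": "SUM"},
}

# Hypothetical join conditions between each dimension and the fact table.
JOINS = {"dim_product": "dim_product.product_key = fact_sales.product_key"}
FACT_TABLE = "fact_sales"


def generate_sql(requested):
    """Build a SELECT statement from the business column names a user picked."""
    select_parts, group_by, joined = [], [], []
    for name in requested:
        col = LOGICAL_COLUMNS[name]
        if col["aggregate"]:
            # Measures are wrapped in their aggregation rule.
            select_parts.append(f'{col["aggregate"]}({col["expr"]})')
        else:
            # Descriptive attributes are selected and grouped by.
            select_parts.append(col["expr"])
            group_by.append(col["expr"])
        if col["table"] != FACT_TABLE and col["table"] not in joined:
            joined.append(col["table"])
    sql = f'SELECT {", ".join(select_parts)} FROM {FACT_TABLE}'
    for table in joined:
        sql += f" JOIN {table} ON {JOINS[table]}"
    if group_by:
        sql += f' GROUP BY {", ".join(group_by)}'
    return sql


print(generate_sql(["Product Name", "Sales Amount"]))
```

The user never writes a join or a GROUP BY clause; both follow from the model definition, which is why a poorly modeled repository leads directly to poor or incorrect SQL.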
The various components of an OBI environment 11g, as shown in Fig. 1,
include Java EE Server (WebLogic), Oracle BI Server, Oracle BI Presentation
Services, Cluster Controller Services, Oracle BI Scheduler, Oracle Presentation
Catalog, Oracle BI Repository, Security Service and BI Java Host.
The various clients include Catalog Manager, BI Administration Tool, Scheduler
Tool, Scheduler Job Manager, BI Answers and Interactive Dashboards.
The next section takes a closer look at some of the major components within
OBI 11g.
Oracle BI Server
The Oracle BI server is the core of the OBIEE platform. It receives analytical
requests created by presentation services and efficiently accesses the data required
by the user using the defined metadata (the RPD). The BI server generates dynamic
SQL to query data in the physical data sources and provides data to the
presentation services based on the request received. It relies on the
definitions in the configuration files and on the metadata, which resides in the
repository, also referred to as the RPD.
Oracle BI Presentation Services
Presentation services is implemented as an extension to a web server. It is
deployed by default on OC4J, but Oracle supports additional web servers, such as
WebLogic, WebSphere and IIS, depending on the nature and scale of deployment.
It is responsible for processing the views made available to the user and presents
the data received from the BI server in an appropriate, user-friendly format to
the requesting clients. There also is an associated Oracle BI Java Host service
that is responsible for proper display of the charts and graphs. Presentation
services uses a web catalog to store the saved content.
Oracle BI Repository
The repository holds the entire business logic/metadata and the design of the
model. It can be configured through the Oracle BI administration tool. It helps
build the business model and organize the metadata properly for presentation to
users. The repository comprises three layers: physical, business model and
mapping, and presentation. Each layer appears in a separate pane when opened with
the administration tool.
Actions Services
Actions services provides dedicated web services required by the action framework.
The action framework enables users to invoke business processes based on the
values of certain defined key indicators. It exists as action links in the
presentation catalog.
Security Service
There is a paradigm shift in the security architecture in OBIEE 11g. It
implements the same common security architecture as the Fusion Middleware
stack, which leverages Oracle Platform Security Services (OPSS) and
WebLogic authenticators. The various security controls that are available
include:
• Identity Store: an embedded LDAP server in WebLogic to store users
and groups
• Policy Store: a file to store the permission grants
• Credential Store: a file to store user and system credentials for
interprocess communication
Cluster Controller Servers
There are two cluster controller servers in OBI 11g: a primary and a secondary
cluster controller. By default, they are installed in a clustered environment.
This provides a proper fallback environment in case of a single installation.
The environment consists of a cluster controller and the cluster manager.
Figure 1: OBIEE Enterprise Architecture
Oracle BI Administration Tool
The Oracle BI administration tool is the thick client used to configure the
OBIEE repository. It allows viewing the repository in three separate layers:
physical, business model and mapping, and presentation. The first step of the
development process involves creating the physical layer.
Oracle BI Development Cycle
The development process begins with gathering the initial business requirements.
You should have as many sessions with the business as possible to gather and
confirm all the requirements. Try to look at the sample reports, spreadsheets, etc.
Analyze the existing transactional system and the reporting system, if any exist,
along with their schemas.
Based on the requirements and the transaction schema, try to define the
dimensional model. There might be multiple iterations of the above steps.
Build Oracle BI Model
We can now look at how to build the Oracle BI model in the repository. Before
we begin, the dimensional model will need to be designed to meet the business
requirements. In this section, I will explain the entire process of building the
Oracle BI model. There are multiple parts to this process: import the objects if
they don’t already exist in the physical layer; build the physical layer; build the
logical business model and mapping layer; build the presentation layer; and
build the reports and dashboards based on the presentation-layer objects.
Import Objects
The first step involves creating a new RPD using the BI administration tool
and saving it. Next, we must import the objects into this repository to start
building the model as shown in Fig. 2. You will need to define the connection
type and other credentials for connecting to the database. In order to import
the tables, select the tables or just click on the schema name to bring in
the entire schema. Next, the connection pool needs to be defined in the repository;
details on how to connect to the database are stored in the OBIEE repository as
shown in Fig. 3. Once complete, the physical layer will have the imported objects.
The import populates the connection pool with default values.
Figure 2: Sample Schema for Import
Figure 3: Connection Pool Details
Build Physical Model
The next step is to build the physical model with the help of the imported tables.
It is here that we will define the objects and their relationships. To build the
model, we need to create joins between the physical tables. At the physical
layer we need to create foreign key joins; a sample is shown in Fig. 4. We
should know the join criteria between the various tables. We need to maintain
a 1:M relationship from each dimension to the fact, which can be done by
selecting the dimension first and then joining it to the fact.
Figure 4: Physical Model
While creating the join, if the fact and the dimensions have the same keys,
then by default they will appear in the Expression Builder. The expression
shows the join criteria; Fig. 5 shows a sample.
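As a rough illustration of the join criteria discussed here, the sketch below derives a default join expression from key columns that share a name in the dimension and the fact, and checks that the dimension side really is the "one" side of the 1:M relationship. This is only an assumption-laden toy, not the Administration Tool's own logic; the function names and the W_PRODUCT_D / W_SALES_F tables are hypothetical.

```python
def default_join_expression(dim_table, dim_keys, fact_table, fact_columns):
    """Return a join expression built from key columns present in both tables."""
    shared = [key for key in dim_keys if key in fact_columns]
    if not shared:
        return None  # no common key: the join criteria must be entered by hand
    return " AND ".join(
        f'"{dim_table}"."{key}" = "{fact_table}"."{key}"' for key in shared
    )


def is_one_side(dim_rows, key):
    """A dimension can be the 'one' side only if its key values are unique."""
    values = [row[key] for row in dim_rows]
    return len(values) == len(set(values))


expression = default_join_expression(
    "W_PRODUCT_D", ["PRODUCT_KEY"],
    "W_SALES_F", ["PRODUCT_KEY", "CUSTOMER_KEY", "SALES_AMT"],
)
print(expression)
print(is_one_side([{"PRODUCT_KEY": 1}, {"PRODUCT_KEY": 2}], "PRODUCT_KEY"))
```

A check like is_one_side is worth running against real data before defining a join, because a dimension with duplicate keys silently multiplies fact rows in every report that uses it.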
Once the expression is accepted, the join is created between the two selected
tables. Similarly, we need to create joins between all the other dimensions and the
fact. In the end, the physical model should look like Fig. 4. There is also a
feature to add database hints, which tell the database query optimizer how to
execute the statement, but we need to be very careful with this feature and use it
only after proper evaluation, as it may have an adverse impact in certain
scenarios. Next, we need to run a consistency check on the physical layer to
ensure there are no errors related to syntax or best practices. If there are no
consistency errors, we will see the consistency check manager screen with no
error messages.
Figure 5: Foreign Key Join
Figure 6: Logical Model
Physical Layer Best Practices
Here are some best practices that I have found important to follow to
help your project be successful:
• You should have separate connection pools for the initialization
blocks and for the regular queries generated for the reports. This ensures
better utilization of the connection pools and ultimately results in
performance improvements.
• Ensure that “Connection Pooling,” “Multithreaded Connections,” “Shared
Logon” and the appropriate call interface are selected.
• You should not have any connection pools that cannot connect to the
databases; this might lead to a BI server crash due to continuous polling
of the connection.
• It is recommended to have a dedicated database connection, preferably
using a system account, for OBI with access to all the required schemas.
• Always ensure that the proper call interface is being used in the connection
pool definition. In the case of an Oracle database, it’s better to use OCI
instead of an ODBC connection.
• Use aliases of the tables instead of the main tables; this will avoid
circular joins and caching-related issues.
• Follow a proper, consistent naming convention to identify the aliases, tables
and columns. These may include W_XXX_F (for the fact tables), W_XXX_D
(for the dimension tables), Dim_W_LOCATION_D_PLANT (for dimension
alias tables) and Fact_W_REVN_F_ROLLUP (for fact alias tables).
• Always have the cache persistence time set to ensure the data gets refreshed as
required in case caching is enabled.
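A naming convention like the one in the list above is easy to check mechanically. The following sketch classifies physical-layer object names with regular expressions. The patterns are my own reading of the four example names given above and are an assumption; adjust them to whatever convention your project actually adopts.

```python
import re

# Hypothetical patterns inferred from the example names W_XXX_F, W_XXX_D,
# Dim_W_LOCATION_D_PLANT and Fact_W_REVN_F_ROLLUP.
PATTERNS = {
    "fact table": re.compile(r"^W_[A-Z0-9_]+_F$"),
    "dimension table": re.compile(r"^W_[A-Z0-9_]+_D$"),
    "dimension alias": re.compile(r"^Dim_W_[A-Z0-9_]+_D(_[A-Z0-9_]+)?$"),
    "fact alias": re.compile(r"^Fact_W_[A-Z0-9_]+_F(_[A-Z0-9_]+)?$"),
}


def classify(name):
    """Return the kind of object a name denotes, or None if it breaks the convention."""
    for kind, pattern in PATTERNS.items():
        if pattern.match(name):
            return kind
    return None


for candidate in ["W_REVN_F", "W_LOCATION_D", "Dim_W_LOCATION_D_PLANT",
                  "Fact_W_REVN_F_ROLLUP", "EMPLOYEES"]:
    print(candidate, "->", classify(candidate))
```

Names that return None are candidates for renaming before the physical layer grows further.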
Build BMM Model
The next step is to build the business model and mapping (BMM) layer.
In the development phase, this is the second step after creating the physical
model. While designing the DM, though, it is the first step normally done
before designing the physical model. Planning the business model is
the first step in developing a usable data model for decision support. A
successful model allows the query process to become a natural process by
allowing analysts to structure queries in the same intuitive fashion as they
would ask business questions. This model must be one that business
analysts will inherently understand and one that will answer meaningful
questions correctly.
To begin, we need to give a name to the business model. Right-clicking in the
BMM section of the window opens the following window, which allows the
assignment of a name to the business model. The next step is to create the
container for the business model. The easiest way to build the BMM layer is to
either import in the tables from the physical layer or bring in the tables one by
one as per requirements and then create the joins between them. In a complex
environment, we normally do it one by one, as there might be multiple logical
table sources, calculations and other customizations involved.
Now we must look at the business model. To do that, we right-click on the HR
model and select business model diagram. That will display the BMM diagram
as shown in Fig. 6. The model is similar to the physical model. The major
difference will exist in terms of the join criteria. We do not specify any joining
columns in the logical layer; we only specify the cardinality and the type of
join in this layer, as shown in Fig. 7.
The administration tool considers a table to be a logical fact table if it is at the
many end of all logical joins that connect it to other logical tables or if it is not
joined to any other table. Fact tables are displayed in yellow. As visible in
Fig. 7, there are no expressions, so the logical join picks up the base joins from
the physical layer itself. Here in the logical layer we can configure the type of
the join (inner, left outer, right outer, full outer), the driving table (fact or
dimension) and the cardinality. Cardinality defines how rows in one table are
related to rows in the table to which it is joined. A one-to-many cardinality means
that for every row in the first logical dimension table, there are possibly 0, 1 or
many rows in the second logical table. Setting up the driving table is an
optional step; generally, it is set to none and left to the OBI server to process it.
You should note that this option should be used with extreme caution; an
improper configuration can lead to severe performance degradation.
Figure 7: Logical / Complex Join
Driving tables are used in optimizing the manner in which the Oracle BI
server processes cross-database joins when one table is very small and the
other table is very large. Specifying driving tables leads to query optimization
only when the number of rows being selected from the driving table is much
smaller than the number of rows in the table to which it is being joined.
Driving tables are not used for full outer joins. Also important to note here are
the two entries in the features tab of the database object that control and tune
driving table performance: MAX_PARAMETERS_PER_DRIVE_JOIN and
MAX_QUERIES_PER_DRIVE_JOIN.
The BMM layer allows you to create measures with custom calculations.
You can build a dimensional hierarchy by right-clicking on the dimension and
selecting “Create Dimension.” A dimensional hierarchy is created for entities
having two or more logical levels, a very common example being year, quarter,
month and day.
Figure 8: Custom Calculation
Once the customizations are finished, we need to do a consistency check
before the business model can be made available for queries. The BMM
object will have a red symbol until it passes the consistency check. We can
use the consistency check to test for errors, warnings and best practices
violations. If the connection is not working or objects have been deleted in
the database, however, this utility will not report these errors. In certain
implementations, the number of consistency failures increases after an
upgrade for a variety of reasons.
BMM Layer Best Practices
To get the most from your OBI solution, each layer must be optimized. The
following tips are some of my best practices that will help in the BMM layer:
• Always use complex joins for joining the logical tables. Never use foreign
key joins at the logical level, as they might restrict the OBIEE server from
using the most optimized path.
• Use inner joins wherever possible. Minimize the usage of outer joins,
as they normally impact the performance. An easier solution for the
problem of using outer joins is to build multiple logical table sources so
that, depending on the requirement, the appropriate logical table source is
accessed.
• There should be a hierarchy defined for every logical dimension even if it
only consists of a grand total and a detail level.
• If there is a possibility of a hierarchy, then it’s always good to have a
dimension hierarchy defined, as it helps to improve user experience.
• Ensure each level of the hierarchy has an appropriate number of elements
and the level key.
• The lowest level of the hierarchy should be the same as the lowest grain of the
dimension table. The lowest level of a dimension hierarchy must match
the primary key of its corresponding dimension logical tables. Always
arrange dimensional sources in order of aggregation from lowest level to
highest level.
• Give business-meaningful names in the BMM layer itself instead of
making the changes in the presentation layer.
• Use aggregates if required and enable the aggregation rule for all measures.
• Aggregation should always be performed from a fact logical table and not
from a dimension logical table.
• Columns that cannot be aggregated should be expressed in a dimension
logical table and not in a fact logical table.
• Nonaggregated columns can exist in a logical fact table if they are mapped
to a source that is at the lowest level of aggregation for all other dimensions.
• The content/levels should be configured properly for all the sources to
ensure that OBI generates optimized SQL queries.
• Create separate logical table sources for the dimension extensions.
Build the Presentation Layer
Once you are done with the physical and the BMM models, it is time to create
the presentation layer. To begin, drag and drop the model from the BMM layer
to the presentation layer. This approach can only be used when we have fairly
simple models or are building a new model. Next, we will need to run another
consistency check to ensure that the presentation layer and the repository
are correct in terms of syntax and best practices. Before completing the
development cycle, we will need to take a few steps to clean the repository.
We can remove all the columns not required for analysis, but we must keep in
mind not to remove the keys from the logical dimensions, as the business model
would no longer be valid. We should ensure that there are no extra objects in
the repository; it helps with the maintenance and also keeps the repository
light. Once done, the presentation layer will look as it does in Fig. 10.
Figure 9: Consistency Check
Figure 10: Repository
Presentation Layer Best Practices
The final layer of the OBI solution is the presentation layer. The best practices
that follow have improved the implementation of reporting:
•• Ensure proper order of the objects so that it allows easy access to the
required entities.
•• Have business friendly/relevant names.
•• Give a small description to serve as a tool tip for the users.
•• Avoid designing the dashboard with large data sets. The requests should be
quick and simple.
•• Avoid using too many columns and use appropriate color combinations.
•• Never combine tables and columns from mutually incompatible logical
fact and dimension tables.
•• Avoid naming catalog folders the same as presentation tables.
•• Detailed presentation catalogs should have measures from a single fact
table only as a general rule. The detailed dimensions (e.g., degenerate
facts) are nonconforming with other fact tables.
•• Never use special characters in names in the presentation layer and
dashboards.
This completes the configuration of the repository. To use it, we will need
to ensure that the BI server recognizes that this is the correct repository.
That will require configuring NQSConfig.ini and instanceconfig.xml to create
a new presentation catalog and open your reporting solution to the end users
for a robust and reliable experience.
Figure 11: Usage of NQSConfig.ini
■ ■ ■ About the Author
Abhinav Banerjee is a principal consultant working with KPI Partners.
He has more than eight years of business intelligence and data integration
experience with more than four years in OBIEE (custom and packaged
analytics). He has worked with several global clients in various domains
that include telecommunications, high tech, manufacturing, energy,
education, and oil and gas. He is also a frequent speaker at various
Oracle conferences such as COLLABORATE and Oracle OpenWorld.
Abhinav specializes in OBIA as well as custom OBIEE implementations.
He can be reached at abhinav1601@gmail.com.
Finding Oracle Database
Machine’s Rightful Place in
Your IT Organization’s Arsenal
By Jim Czuprynski
Synopsis: The Exadata Database Machine offers
intriguing opportunities to improve the performance of
Oracle Database applications. The latest release of the
Exadata X2-2 Database Machine incorporates several unique
features that are bound tightly to Oracle Database 11gR2.
This article delves into why these new capabilities are likely
to significantly improve the performance and throughput
of database applications. It also looks at how some of the
features intrinsic to Oracle Database 11gR2 can be leveraged
for improved database application performance even without
implementing an Exadata solution.
Exadata Database Machine: Basic Architecture and Concepts
Oracle introduced the first version of the Exadata Database Machine at Oracle
OpenWorld in October 2008. With the release of Database Machine X2 in 2010,
however, it’s now touted as one of the fastest database platforms in the world
based on its capability to process Oracle Database 11g Release 2 application
workloads at blistering speeds (see Table 1). Oracle currently offers the
Exadata Database Machine in three sizes: Quarter Rack, Half Rack and Full
Rack. These machines combine flash memory solid-state storage, high-speed
SAS hard disk drives (HDDs) and highly powered database servers.
iDB, InfiniBand and ZDP
Each Exadata Database Machine ships with Oracle Enterprise Linux (OEL)
version 5 pre-installed as its OS and with Oracle Database 11g Release 2
pre-installed as its RDBMS. The 11gR2 database kernel has been upgraded so
that it can leverage several features unique to the Exadata Database Machine.
The new iDB (Intelligent Database) communications protocol allows an
11gR2 database to communicate seamlessly and intelligently with the Exadata
storage cells so that, when necessary, SQL query processing can be offloaded
completely to the storage cells without having to retrieve all of the
database blocks necessary to answer the query.
Table 1. Exadata Database Machine: Rack ’Em and Stack ’Em
Configuration                        X2-2             X2-2             X2-2
                                     (Quarter Rack)   (Half Rack)      (Full Rack)
Database Servers                     2                4                8
  CPUs                               24               48               96
  Memory (GB)                        192              384              768
Storage Servers                      3                7                14
  Number of CPUs                     12               24               48
Number of InfiniBand Switches        2                3                3
SAS Drive Capacity (TB), Raw / Useable:
  High-Performance                   21.0 / 9.25      50.0 / 22.5      100.0 / 45.0
  High-Capacity                      72.0 / 31.5      168.0 / 75.0     336.0 / 150.0
Flash Drive Capacity (TB)            1.1              2.6              5.3
Theoretical Flash Cache IOPS         375,000          750,000          1,500,000
List Price (including support)       $366,000         $671,000         $1,220,000
iDB also implements the Zero-Loss Zero-Copy Datagram Protocol (ZDP).
Constructed upon Reliable Datagram Sockets (RDS) version 3, ZDP is open
source networking software: a zero-copy implementation of RDS
that is more reliable than the User Datagram Protocol (UDP). Because an
Exadata Database Machine uses 40Gb/s InfiniBand connections between its
database servers and its storage servers, there is extremely little latency
when a database server communicates with its corresponding
Exadata storage cells.
Smart Flash Cache
Smart Flash Cache is designed to overcome the limitations of individual hard
disk devices (HDDs) whenever a database application’s random access
workload requires a relatively large number of I/Os per second (IOPS) to
satisfy customer service level agreements. Storage area networks (SANs) can
overcome this limitation by placing dozens or even hundreds of HDDs in large
arrays that have a combined random-access throughput of more than 50,000
IOPS and then using large amounts of read/write cache to retain data for later
reading if an identical section of a file is still available in cache. This also
enables the SAN’s I/O subsystem to write the data back to disk at the best
possible time to minimize write contention on individual physical disks.
Exadata overcomes these limitations through its Smart Flash Cache features.
Smart Flash Cache is implemented using PCIe-based single-level-cell (SLC)
flash memory within each Exadata storage cell configured specifically for
random access — especially reads — of identical database blocks. Oracle
rates the 384GB of flash memory in an Exadata cell at 75,000 IOPS, and
because multiple cells are linked together over the Infiniband network, they
can perform huge numbers of random read operations in parallel. The largest
Exadata Database Machine configuration — the X2-2 full rack — contains
14 Exadata storage cells, so it can theoretically achieve more than one million
random-access read IOPS.
Smart Flash Cache will automatically retain the most frequently accessed
database blocks for both data and index segments, as well as the database’s
control files and datafile headers. (Oracle DBAs who still want to use Smart
Flash Cache in a similar fashion as they have been using the KEEP cache
should note that Exadata does provide a special storage attribute for
segments, CELL_FLASH_CACHE.)
Not all SQL statements will be able to leverage storage indexes because only
columns with a datatype of NUMBER, DATE or VARCHAR2 are supported.
But there is a definite tendency in data warehousing and even OLTP processing
for 90 percent of statements to be handled by 10 percent of the data. Thus,
the relatively small memory footprint of a storage index is generally a small
price to pay for avoiding the alternative: unnecessary table scans of extremely
large tables. For partitioned tables, storage indexes offer yet another
advantage. Consider a common situation where a table (INVOICES) has two
columns that have an implicit relationship, such as the date on which an
invoice is issued (ISSUE_DATE) and the date on which it’s paid (PAID_DATE).
It’s not unusual to partition the INVOICES table based on just one column
(e.g. ISSUE_DATE); however, an Exadata storage cell can take advantage of a
storage index to “prune” the result set further on the PAID_DATE column
whenever it’s used in the predicate of a SQL statement.
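As a rough illustration of the pruning just described, the sketch below filters a hypothetical INVOICES table on PAID_DATE and then checks the session statistic that Exadata exposes for I/O avoided through storage indexes. The table and its columns are assumptions taken from the example in the text; the sketch also assumes that the session can read V$MYSTAT and V$STATNAME and that the optimizer chooses a full scan against Exadata storage. On any other storage the statistic simply stays at zero.

```sql
-- Hypothetical table from the example: partitioned on ISSUE_DATE,
-- but filtered here on the related PAID_DATE column.
SELECT COUNT(*)
  FROM invoices
 WHERE paid_date >= DATE '2011-07-01'
   AND paid_date <  DATE '2011-08-01';

-- Bytes of physical I/O this session avoided thanks to storage indexes.
SELECT n.name, s.value
  FROM v$mystat s
  JOIN v$statname n ON n.statistic# = s.statistic#
 WHERE n.name = 'cell physical IO bytes saved by storage index';
```

Running the statistics query before and after the test query, and comparing the two values, gives a per-query view of how much I/O was skipped.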
Yet another feature unique to Exadata storage cells, Smart Scan, allows the
storage cell to return only the rows and/or columns necessary to satisfy a
query. This is a significant advantage when an execution plan might normally
return a huge number of database blocks because it needs to perform one or
more table scans of large database tables. The storage cell instead scans the
database blocks that need to be retrieved but then assembles just the rows
and/or columns that satisfy the request into a result set. Therefore, the
processing of many SQL statements can be offloaded from the database server
directly to one or more Exadata storage cells.
Smart Scan processing incorporates two features:
•• Predicate filtering lets a storage cell return only the rows necessary to
satisfy a query rather than all the rows that would normally be returned
when a table scan operation is needed.
•• Column filtering (also known as column projection) works in a similar
fashion: it returns only the columns necessary to answer a query’s request,
thus limiting the size of the result set returned to the database server.
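The two Smart Scan features above can be observed with a query that needs only a few columns and a subset of rows. The following is a minimal sketch, assuming the SH sample schema is installed on Exadata storage and that the session can query V$MYSTAT and V$STATNAME; the filter value is arbitrary.

```sql
-- Only two columns and a subset of rows are needed, so a storage cell can
-- apply column projection and predicate filtering before returning data.
EXPLAIN PLAN FOR
SELECT cust_id, amount_sold
  FROM sh.sales
 WHERE amount_sold > 1000;

-- On Exadata, an offloadable scan typically appears in the plan as
-- TABLE ACCESS STORAGE FULL with a storage(...) predicate.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);

-- After actually running the query, compare the bytes eligible for offload
-- with the bytes Smart Scan returned over the interconnect.
SELECT n.name, s.value
  FROM v$mystat s
  JOIN v$statname n ON n.statistic# = s.statistic#
 WHERE n.name IN ('cell physical IO bytes eligible for predicate offload',
                  'cell physical IO interconnect bytes returned by smart scan');
```

A large gap between the two statistics suggests that most of the filtering happened in the storage cells rather than on the database server.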
ASM Redundancy: Efficient Yet Robust Data Protection
Since the initial release of Oracle’s Automatic Storage Management (ASM) file
system in Oracle 10gR1, ASM’s feature set has improved dramatically. One of
ASM’s most valuable features is the capability to provide two-way mirroring
(NORMAL redundancy) or three-way mirroring (HIGH redundancy) for data
stored within an ASM allocation unit (AU). Exadata storage cells leverage
ASM’s NORMAL redundancy settings to provide essential data protection using
JBOD (Just a Bunch Of Disks) HDDs instead of the relatively more expensive
RAID that most enterprise storage systems provide. In addition, storage cells
will leverage Smart Flash Cache whenever ASM needs to write the secondary
AU that protects its counterpart primary AU. The secondary AU will not be
cached in memory but can be written immediately to disk, and ASM will
read from the primary AU retained within Smart Flash Cache.
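To see which redundancy level a disk group uses, the following sketch queries V$ASM_DISKGROUP and then shows a generic two-way mirrored disk group definition. The disk group name, failure group names and device paths are hypothetical, and the CREATE statement is a plain (non-Exadata) illustration of NORMAL redundancy rather than the way Exadata grid disks are actually provisioned.

```sql
-- TYPE reports EXTERN, NORMAL or HIGH redundancy for each disk group.
SELECT name, type, allocation_unit_size, total_mb, usable_file_mb
  FROM v$asm_diskgroup;

-- Generic sketch of two-way mirroring: each failure group holds one copy
-- of every allocation unit. Paths and names are assumptions.
CREATE DISKGROUP data_demo NORMAL REDUNDANCY
  FAILGROUP fg1 DISK '/dev/mapper/asm_disk1'
  FAILGROUP fg2 DISK '/dev/mapper/asm_disk2';
```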
When Will Exadata Most Likely Perform Miracles?
It’s a foregone conclusion that Exadata will reduce the
execution times for reasonably well-tuned data warehousing queries, OLAP
analysis and data mining operations. In fact, Oracle claims that some data
warehousing queries will see reduction in query times by as much as two
orders of magnitude (100x). For starters, Exadata’s Smart Flash Cache
features almost guarantee that the database blocks needed to answer a query
have already most likely been captured and retained within at least some of
the storage cells’ flash cache memory.
The CELL_FLASH_CACHE storage attribute will specifically retain any data,
index or LOB segment within the Smart Flash Cache.
Smart Flash Cache is able to provide these features because it is intimately
aware of not only which database blocks are stored within its confines, but
also how database applications are actively using each database block. As a
database block is retrieved from a storage cell and brought into the database
buffer caches of a database server, Exadata retains the metadata of how the
block is being utilized. Smart Flash Cache can then leverage this information
to decide how the buffer should be retained within the cache and whether that
buffer can satisfy other types of processing requests, including Recovery
Manager (RMAN) backups, DataPump Exports, and especially Exadata’s Smart
Scan and Smart Storage features.
Hybrid Columnar Compression
Unique to the Oracle Database Machine and Exadata, Hybrid Columnar
Compression (HCC) is a completely different way to store row-level data for a
database table. Once HCC is enabled for a given table, Exadata first groups the
rows into sets based on the similarity of values stored within the columns.
These row sets are then tightly packed into logical storage structures called
compression units. All the rows in a compression unit will contain similar
values, and thus Exadata can compress the rows more quickly and also store
them more efficiently. A compression unit contains row sets that encompass
extremely similar column value ranges, so Exadata can also leverage this data
homogeneity during SQL statement processing operations. Because HCC tends
to add overhead to DML operations, it is best used on static or historical data.
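As a hedged example of the syntax involved, the sketch below copies older rows from the SH.SALES sample table into a new table stored with Hybrid Columnar Compression. The target table name and the cutoff date are assumptions, and the statement is expected to succeed only when the target tablespace resides on Exadata storage.

```sql
-- QUERY LOW|HIGH favors scan speed; ARCHIVE LOW|HIGH favors compression ratio.
-- CREATE TABLE AS SELECT loads via direct path, which HCC requires.
CREATE TABLE sales_history
  COMPRESS FOR QUERY HIGH
  AS SELECT *
       FROM sh.sales
      WHERE time_id < DATE '2001-01-01';

-- Verify the compression setting recorded in the data dictionary.
SELECT table_name, compression, compress_for
  FROM user_tables
 WHERE table_name = 'SALES_HISTORY';
```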
Storage Cells, Storage Indexes, Smart Scan and Smart Storage
Exadata leverages Oracle’s Automatic Storage Management (ASM) for
formatting and controlling all HDD and flash disk storage. Each individual
storage cell is a combination of server, HDDs and even flash disks that can be
constructed optionally. Utilizing a portion of the 384GB of flash memory,
Exadata maintains a series of storage regions. These regions are automatically
aligned on the same boundaries as ASM’s allocation units (AU). Each storage
cell indexes these regions to retain metadata about the data distributions,
storing this information within in-memory region indexes. Every region index
can retain data distribution metadata for up to eight individual table columns.
A storage index comprises one or more region indexes, so each storage cell is
thus able to track in memory the value ranges for all data stored within all
columns for all tables within that cell.
Each storage cell automatically and transparently creates and uses storage
indexes for all columns that appear to be well-clustered around similar
column values. Therefore, the largest benefit typically will be obtained when a
column’s data values are ordered within the table’s rows such that similar
values are already closely clustered together, especially when an SQL statement
will access rows using selection criteria predicates against those columns
based on relatively simple equality (=), less than (<) or greater than (>)
operators. Storage indexes also are destroyed whenever a storage cell is
rebooted but will be rebuilt automatically after reboot as the storage cell sees
fit, so there are no additional objects for a DBA to construct and maintain.
Storage cells can thus use storage indexes to quickly skip much of the I/O
processing that would normally be required with a traditional B-Tree or
bitmap index. Without a storage index, it might be necessary to retrieve most
or all rows of a table to determine if a query predicate can even be applied to
the rows. And because a storage index provides intelligence on what’s already
been retrieved from disk and is already in cache within the storage cell, I/O
may be completely unnecessary.
Sorry, but bad SQL is still bad SQL. Poorly written SQL queries that
unwittingly require millions of blocks to be accessed will probably run faster
— or they may not. Using Exadata’s massively parallel architecture to improve
the performance of over-HINTed, under-optimized or otherwise untuned SQL
statements is like using a nuclear weapon to swat a fly: simply overkill.
Points Of Serialization Must Still Be Located And Resolved. While it
may indeed be possible to satisfy a larger number of user connections using
Exadata’s Smart Flash Cache features, points of serialization may still exist
within a poorly designed OLTP application. For example, an OLTP workload
will still perform poorly in a RAC database environment whenever:
•• Sequences are not used to obtain the next value for a “dumb number”
primary key value;
•• Too few sequence values are cached on each RAC
database instance to avoid possible contention during index row piece
creation; and
•• DSS or OLAP analyses are being executed simultaneously against the
same database objects — particularly indexes — that are being actively
modified via OLTP DML operations.
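The first two points can be sketched as follows. The sequence name, the cache size of 1000 and the INVOICES columns are illustrative assumptions; the appropriate cache size depends on the insert rate of the workload.

```sql
-- A surrogate ("dumb number") key drawn from a sequence with a generous cache.
-- NOORDER lets each RAC instance hand out values from its own cached range,
-- which avoids cross-instance coordination on every NEXTVAL call.
CREATE SEQUENCE invoice_id_seq
  START WITH 1
  INCREMENT BY 1
  CACHE 1000
  NOORDER;

-- Hypothetical INVOICES table keyed on INVOICE_ID.
INSERT INTO invoices (invoice_id, issue_date)
VALUES (invoice_id_seq.NEXTVAL, SYSDATE);
```

The trade-off is that cached, unordered values can leave gaps and are not guaranteed to increase in commit order across instances, which is usually acceptable for a meaningless key.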
Leveraging Non-Exadata Oracle 11gR2 Features To Simulate Exadata
Performance
Exadata’s capabilities to dramatically improve data warehousing and OLAP
query speeds are certainly well-documented, but what if an IT organization’s
current and future application workload profile really can’t benefit fully from
those features? What if the application workloads are extremely diverse
“hybridized” workloads that might benefit dramatically from only some of
Oracle’s Smart Flash Cache features? It may be possible to obtain comparable
benefits without incurring the full costs of an entire Exadata system.
Oracle Database 11gR1 added several new enhancements that were most likely
predecessors of the features enabling the tight integration of the Exadata
Database Machine and 11gR2. Interestingly, these same features might make
it possible for an astute DBA to achieve significant performance
improvements for their database application workloads.
Oracle Advanced Compression
In releases prior to Oracle Database 11gR1, table compression could only be
applied to direct-path insertion of rows into a table, which 11gR1 expresses
via the COMPRESS FOR DIRECT_LOAD OPERATIONS storage directive. Oracle 11gR1
extended this support to include table compression for conventional INSERT
and UPDATE operations as well via the COMPRESS FOR ALL OPERATIONS directive.
(In Oracle Database 11gR2, the DIRECT_LOAD OPERATIONS and ALL OPERATIONS
directives have been renamed to BASIC and OLTP, respectively.) When activated
for a table’s data segment, row pieces within the block will be compressed
whenever the PCTFREE threshold is reached. This compression continues until
Oracle has compressed all row pieces within the block’s free space to their
minimum size.
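A brief sketch of the 11gR2 syntax follows. The target table name is an assumption, the SH.CUSTOMERS sample table is used only as a convenient data source, and OLTP compression requires the separately licensed Advanced Compression option.

```sql
-- 11gR2 syntax; the 11gR1 equivalent is COMPRESS FOR ALL OPERATIONS.
CREATE TABLE customers_oltp
  COMPRESS FOR OLTP
  AS SELECT * FROM sh.customers;

-- COMPRESSION shows ENABLED and COMPRESS_FOR shows OLTP once the setting applies.
SELECT table_name, compression, compress_for
  FROM user_tables
 WHERE table_name = 'CUSTOMERS_OLTP';
```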
The ultimate compressibility of row pieces within a block certainly depends
upon the amount of CHAR and VARCHAR2 data and the number of “duplicate”
values within the columns, but Oracle claims that in most cases compression
ratios of 300 percent to 400 percent are not unlikely. Granted, this is still
considerably less than the mega-compressibility that Exadata’s HYBRID
COLUMNAR COMPRESSION feature offers, but in many cases it may significantly
boost performance of OLTP and DSS applications because three to four times
more rows can be read in a single IO operation since decompression is not
needed for table scan operations.
When a result set does need to be constructed, a storage cell is extremely likely
to use Smart Scan features to construct it with extreme efficiency because only
the rows and columns necessary to build it will be retrieved, even when a full
table scan might be required to return the result set. In addition, Oracle claims
that compression ratios of 10:1 are not uncommon with Hybrid Columnar
Compression, so if a 200GB table did need to be scanned, as little as 20GB of
disk I/O might be required. And because a single Exadata cell contains a
relatively large number of CPUs, any query that can benefit from parallel
processing will be able to take advantage of considerable CPU “horsepower”
and will execute extremely quickly.
In addition, if a query is executed with Exadata against a RAC database, then
potentially all RAC database instances could “bring their guns to bear” to
parallelize the query across those instances’ CPUs and memory. Therefore, a
RAC database running in an Exadata environment should offer a significant
opportunity to scale out parallelized query operations. Finally, should an
Exadata cell ever need to actually read database blocks from its I/O subsystem,
Exadata’s 40Gb/s InfiniBand storage network means those blocks will be
retrieved extremely quickly and with minimum overhead.
When Should Exadata Perform Reasonably Well?
As most OLTP applications apply changes to an extremely small number of
database blocks when new rows are added or existing data is changed or
removed, it’s usually much more difficult to improve the performance of OLTP
workloads. Because it may be impossible to reduce significantly the number of
blocks required to complete an OLTP transaction, extreme OLTP application
workloads demand extremely low latency when communicating with the
database server’s I/O subsystem. Each Exadata storage cell is benchmarked to
provide 75,000 IOPS, and by extension this means a full-rack Exadata
Database Machine can accommodate more than one million IOPS (14 storage
cells x 75K IOPS = 1050K IOPS). This means that even a single full-rack
Exadata Database Machine is uniquely positioned to provide the response
times that extreme OLTP workloads typically demand.
While Exadata’s Smart Flash Cache features do promote intelligent data
placement on its underlying storage, an astute Oracle DBA often knows exactly
which database segments — especially data and index segments — for an
OLTP application workload would benefit from placement on the fastest I/O
resources. Exadata’s Smart Data Placement feature gives an Oracle DBA the
flexibility to place these known objects within the most appropriate storage for
the application workload.
Many Oracle shops have implemented RAC databases across several nodes to
allow OLTP applications to scale up so that many thousands of concurrent
user connections can be serviced simultaneously. For extreme OLTP
application workloads, it’s crucial that the private interconnect networking
layer between database servers is reserved exclusively for the intense demands
of RAC Cache Fusion whenever buffers are exchanged between nodes in the
RAC cluster. The good news here is that the Exadata Database Machine’s
40Gb/s InfiniBand network is also used for the RAC private interconnect, which
means it’s ideally suited for OLTP application workloads.
When Might Exadata Yield Little or No Performance Improvements?
There’s little doubt that Exadata offers an excellent platform for extreme DSS
and OLTP application workloads. But an Oracle DBA should take a few factors
into account when evaluating whether the considerable investment into the
Exadata infrastructure will yield dramatic benefits for her application
workloads, including untuned SQL and unresolved points of serialization.
Identifying Likely Candidate Tables For Compression
Determining which tables or table partitions might benefit most from
compression means it’s necessary to pinpoint which data segments are most
heavily and frequently accessed during a typical application workload cycle:
•• Automatic Workload Repository (AWR), first introduced in Oracle Database
10gR1, offers an excellent capability to pinpoint which tables and table
partitions are most actively accessed through the Segment Statistics
section of AWR reports and/or ASH reports.
•• Another somewhat more manual alternative to using AWR performance
metadata involves a simple report against the V$SEGMENT_STATISTICS
dynamic view.
•• Oracle Database 11gR2’s new DBMS_COMPRESSION package makes
estimating the potential compression that any table or table partition
might achieve a snap. Procedure GET_COMPRESSION_RATIO allows
an Oracle DBA to calculate the potential average row length and block
compression factor for OLTP compression. The results of this procedure’s
execution against the SH.SALES data warehousing table are shown in the
script available online.
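The last two approaches in the list above might look like the following sketch. It is not the author’s online script: the scratch tablespace name USERS is an assumption, and the parameter names follow the 11gR2 signature of DBMS_COMPRESSION.GET_COMPRESSION_RATIO, which differs in later releases.

```sql
-- Top 10 non-SYS segments by logical reads since instance startup.
SELECT *
  FROM (SELECT owner, object_name, subobject_name, object_type,
               value AS logical_reads
          FROM v$segment_statistics
         WHERE statistic_name = 'logical reads'
           AND owner NOT IN ('SYS', 'SYSTEM')
         ORDER BY value DESC)
 WHERE ROWNUM <= 10;

-- Estimate what OLTP compression would achieve for SH.SALES.
SET SERVEROUTPUT ON
DECLARE
  l_blkcnt_cmp    PLS_INTEGER;
  l_blkcnt_uncmp  PLS_INTEGER;
  l_row_cmp       PLS_INTEGER;
  l_row_uncmp     PLS_INTEGER;
  l_cmp_ratio     NUMBER;
  l_comptype_str  VARCHAR2(100);
BEGIN
  DBMS_COMPRESSION.GET_COMPRESSION_RATIO(
    scratchtbsname => 'USERS',   -- assumption: any tablespace with free space
    ownname        => 'SH',
    tabname        => 'SALES',
    partname       => NULL,      -- no specific partition requested
    comptype       => DBMS_COMPRESSION.COMP_FOR_OLTP,
    blkcnt_cmp     => l_blkcnt_cmp,
    blkcnt_uncmp   => l_blkcnt_uncmp,
    row_cmp        => l_row_cmp,
    row_uncmp      => l_row_uncmp,
    cmp_ratio      => l_cmp_ratio,
    comptype_str   => l_comptype_str);

  DBMS_OUTPUT.PUT_LINE('Compression type            : ' || l_comptype_str);
  DBMS_OUTPUT.PUT_LINE('Blocks, compressed sample   : ' || l_blkcnt_cmp);
  DBMS_OUTPUT.PUT_LINE('Blocks, uncompressed sample : ' || l_blkcnt_uncmp);
  DBMS_OUTPUT.PUT_LINE('Rows per block, compressed  : ' || l_row_cmp);
  DBMS_OUTPUT.PUT_LINE('Rows per block, uncompressed: ' || l_row_uncmp);
  DBMS_OUTPUT.PUT_LINE('Estimated compression ratio : ' || l_cmp_ratio);
END;
/
```

The procedure builds temporary sample objects in the scratch tablespace, so it is best run outside peak hours against large tables.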
Smaller Is Better: Partitioning and Segment Space Allocation Improvements
Oracle Database 11gR1 also introduced several new partitioning methods.
An extension of the original RANGE partitioning method, the INTERVAL
partitioning method now eliminates the need to attempt to predict the future
size of each table partition; instead, the partition will be materialized only
when a row is first added to the pertinent partition. Oracle 11gR2 expands
upon this concept for nonpartitioned tables and indexes by providing the new
SEGMENT CREATION DEFERRED storage directive, which creates the initial
data segment for the corresponding table only when a row piece for a block
within that segment is first created.
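Both directives can be sketched with a pair of small tables. The table and column names below are hypothetical, loosely following the INVOICES example used earlier in the article.

```sql
-- INTERVAL partitioning: only the first partition is defined up front;
-- monthly partitions are materialized as rows arrive for each month.
CREATE TABLE invoices_by_month (
    invoice_id  NUMBER        NOT NULL,
    issue_date  DATE          NOT NULL,
    paid_date   DATE,
    amount      NUMBER(12,2)
)
PARTITION BY RANGE (issue_date)
INTERVAL (NUMTOYMINTERVAL(1, 'MONTH'))
(
    PARTITION p_before_2011 VALUES LESS THAN (DATE '2011-01-01')
);

-- Deferred segment creation for a nonpartitioned table (11gR2).
CREATE TABLE invoice_notes (
    invoice_id  NUMBER NOT NULL,
    note_text   VARCHAR2(400)
)
SEGMENT CREATION DEFERRED;

-- No row is returned here until the first row is inserted into INVOICE_NOTES.
SELECT segment_name, bytes
  FROM user_segments
 WHERE segment_name = 'INVOICE_NOTES';
```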
Multicolumn Statistics
As of Oracle 11gR1, it’s possible to gather extended statistics on multiple
columns that encompass data that’s likely to be used simultaneously when
rows are being selected. A few examples of these types of implicit data
relationships include vehicle make and model (e.g., Ford Mustang) and
geographical locations (e.g., Orlando, FL, USA). Whenever the Oracle 11g
query optimizer detects that equality predicates are used against these
multicolumn groups, it will use these extended statistics to build a better
execution plan, thereby dramatically increasing the speed of data searches.
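A minimal sketch of creating such a column group follows, using the SH.CUSTOMERS sample table and its CUST_STATE_PROVINCE and COUNTRY_ID columns as a stand-in for the geographic example. The choice of columns and the METHOD_OPT setting are assumptions, not a recommendation.

```sql
-- Define the column group; the function returns the system-generated name.
SELECT DBMS_STATS.CREATE_EXTENDED_STATS(
         'SH', 'CUSTOMERS', '(CUST_STATE_PROVINCE, COUNTRY_ID)'
       ) AS extension_name
  FROM dual;

-- Gather statistics so the new column group is populated.
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname    => 'SH',
    tabname    => 'CUSTOMERS',
    method_opt => 'FOR ALL COLUMNS SIZE AUTO ' ||
                  'FOR COLUMNS (CUST_STATE_PROVINCE, COUNTRY_ID) SIZE AUTO');
END;
/

-- Confirm that the extension exists.
SELECT extension_name, extension
  FROM dba_stat_extensions
 WHERE owner = 'SH'
   AND table_name = 'CUSTOMERS';
```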
Poor Man’s Smart Scan: Caching Result Sets With +RESULT_CACHE
While Exadata automatically creates and retains result sets for an application
workload’s most commonly executed queries via Smart Flash Cache and
Smart Scan, a comparable feature, SQL Result Set Caching, has been available
since Oracle Database 11gR1. Yes, SQL Result Set Caching is not as automatic
as Exadata’s implementation of Smart Scan, but any reasonably astute DBA can
determine the most common queries that her application workloads are using
through careful analysis of AWR reports. For example, she may identify that
there’s a
constant demand for up-to-date sales promotions data summarized within
promotion categories, but she also knows that the underlying data for this
query changes relatively infrequently and several user sessions could easily
take advantage of a cached result set. The following code illustrates how
simple it is to capture this result set into the database instance’s Shared Pool,
thus making any part of these results available to any other query requesting
these data:
SELECT /*+RESULT_CACHE*/
promo_category_id
,promo_category
,SUM(promo_cost) total_cost
,ROUND(AVG(promo_cost),2) avg_cost
FROM sh.promotions
GROUP BY promo_category_id, promo_category
ORDER BY promo_category_id, promo_category;
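To check whether the result set above was actually cached, a DBA might run something like the following sketch. It assumes that RESULT_CACHE_MAX_SIZE is greater than zero and that the session has access to V$RESULT_CACHE_OBJECTS and the DBMS_RESULT_CACHE package.

```sql
-- List cached result sets, newest first; a usable result has STATUS = Published.
SELECT id, type, status, name
  FROM v$result_cache_objects
 WHERE type = 'Result'
 ORDER BY creation_timestamp DESC;

-- Summarize result cache memory usage within the Shared Pool.
SET SERVEROUTPUT ON
EXECUTE DBMS_RESULT_CACHE.MEMORY_REPORT
```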
Implementing Oracle Flash Cache in Oracle 11gR2
Many Oracle DBAs haven’t taken notice yet of one of the most revolutionary
features in Oracle Database 11gR2: the ability to dramatically extend the size
of their databases’ buffer caches by enabling Flash Cache. This feature set does
require patching both Oracle 11gR2 Grid Infrastructure and 11gR2 Database
homes to at least release 11.2.0.1.2, or upgrading these homes to release
11.2.0.2.0. Once patched, it’s relatively simple to deploy Flash Cache against one
of the many server-resident PCIe IO cards that have recently become available
on the market. Table 2 lists two of the more popular high-capacity IO cards,
including their manufacturer’s specifications and approximate list price.
Table 2. Oracle Flash Cache Enablers
Intra-Server IO Card Vendor           FusionIO              Virident
List Price                            $14,000               $9,300
Memory Type (MLC, SLC)                SLC                   SLC
Card Size                             Half or Full Height   Half or Full Height
Card Capacity (GB)                    400                   350
Actual Capacity When Formatted (GB)   322.5                 300
Speed Claims (Per Manufacturer)       119,000 IOPS (i)      300,000 IOPS (ii)
(i) 119K IOPS with 512 byte block size, 75/25 R/W ratio
(ii) 300K IOPS with 4KB block size, 75/25 R/W ratio
For the sake of simplicity, I’ll reference the FusionIO card to illustrate how
an intra-server IO card can be formatted and configured for its eventual use
by a single-instance database that’s also taking advantage of the new Grid
Infrastructure features of 11gR2. I recently had the opportunity to experiment
with a FusionIO card. I installed the card within a Hitachi BDS2000 blade
server, configured it for use by installing the appropriate Linux device drivers,
and a few minutes later I was able to format it as a physical IO device. I then
used the native Linux fdisk command to create two new partitions on the
device sized at approximately 28.6 and 271.8 GB, respectively:
#> fdisk -l /dev/fioa
Disk /dev/fioa: 322.5 GB, 322553184256 bytes
255 heads, 63 sectors/track, 39214 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
    Device Boot      Start         End      Blocks   Id  System
/dev/fioa1               1        3736    30009388+  83  Linux
/dev/fioa2            3737       39214   284977035   83  Linux
At this point, the FusionIO card’s partitions can be utilized just as if they were
any other physical storage devices. The listing below shows the ASMCMD
command issued against Oracle 11gR2 Grid Infrastructure.
I placed the 280GB partition of the FusionIO card into a single ASM disk group
named +OFC:
ASMCMD> mkdg '<dg name="OFC"><dsk string="/dev/fioa2" size="280G"/></dg>'
Next, the database instance was restarted with a significantly undersized
database buffer cache of only 10MB. Note that Automatic Memory
Management (AMM) and Automatic Shared Memory Management (ASMM)
also were deactivated so that the instance could not dynamically allocate
additional memory to the database buffer cache; this ensured that
Oracle Flash Cache would be utilized fully during testing:
SQL> ALTER SYSTEM SET DB_CACHE_SIZE=128M SCOPE=SPFILE;
SQL> ALTER SYSTEM SET SGA_TARGET=0;
SQL> ALTER SYSTEM SET MEMORY_TARGET=0;
SQL> SHUTDOWN IMMEDIATE;
SQL> STARTUP;
To activate Oracle Flash Cache as an extension of the instance’s buffer cache,
I then modified just two parameters. DB_FLASH_CACHE_FILE determines the
actual physical location of the Flash Cache file, and DB_FLASH_CACHE_SIZE
restricts the ultimate size of the Flash Cache file itself. As illustrated below,
I only had to specify an ASM disk group as the target for the file; once that’s
done, the database will create a new physical file in the ASM disk group:
SQL> ALTER SYSTEM SET DB_FLASH_CACHE_FILE='+OFC' SCOPE=SPFILE;
SQL> ALTER SYSTEM SET DB_FLASH_CACHE_SIZE=280G SCOPE=SPFILE;
SQL> SHUTDOWN IMMEDIATE;
SQL> STARTUP;
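Once the instance is running with the new settings, the parameters can be read back to confirm them. The sketch below is an illustration, not part of the original test: SALES_HISTORY is a hypothetical table name, and the FLASH_CACHE storage attribute is shown only as one possible way to influence which segments are retained in the flash cache.

```sql
-- Confirm the buffer cache and flash cache settings.
SELECT name, value
  FROM v$parameter
 WHERE name IN ('db_cache_size', 'sga_target', 'memory_target',
                'db_flash_cache_file', 'db_flash_cache_size')
 ORDER BY name;

-- Optional: ask Oracle to favor this segment's blocks in the flash cache.
-- SALES_HISTORY is a hypothetical table name.
ALTER TABLE sales_history STORAGE (FLASH_CACHE KEEP);
```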
Conclusions
It’s virtually impossible to question whether the Exadata Database Machine
offers some absolutely incredible performance gains, especially for complex
data warehousing queries, OLAP queries and data mining operations.
Exadata also has the potential to dramatically improve the scale-up of OLTP
application workloads — provided, of course, that the OLTP applications are
truly scalable. But it would be equally unjust to promote Exadata as the
ultimate panacea for improving the performance of all database application
workloads. Some questions to help your team decide include:
•• What is the I/O profile of the database workloads for the server(s) and
storage subsystem(s) that Exadata is intended to replace?
•• What are the minimum application workload performance
improvement targets?
•• What’s the cost/benefit ratio of implementing Exadata Database Machine,
especially when the increased licensing costs are taken into account?
•• What are the potential complexities of migrating existing Oracle databases
to an Exadata environment, and is there a risk of any serious violations to
application service-level agreements while the migration is completed?
•• Finally, is your IT organization ready to accept at least in part the “one
throat to choke” strategy that Exadata Database Machine implies? Or
would simply deploying improved hardware (e.g., faster database servers,
more server DRAM, SSDs and Flash Memory) enable the organization to
improve application workload performance to exceed current service-level
agreements?
BI Tip | WebLogic Scripting Tool (WLST)
If you want to script administration tasks usually carried out by
Enterprise Manager, take a look at the WebLogic Scripting Tool (WLST)
and the Oracle BI Systems Management API, which comes with features
to change configuration settings, deploy repositories and perform most
other OBIEE systems administration tasks, all from a Jython-based
scripting environment.
From Mark Rittman’s COLLABORATE 11 presentation
“Oracle Business Intelligence 11g Architecture and Internals”
The potential alternatives to a purely Exadata Database Machine solution
presented in this article to solve common database workload performance
issues are offered in Table 3 below. Even if an IT organization decides that the
time for evaluating or implementing an Exadata solution is not on the future
time horizon, these solutions offer insight into exactly how tightly coupled
Oracle Database 11gR2 is with the storage solutions that only Exadata
Database Machine offers:
Table 3. Summary: Possible Alternatives to EXADATA Solutions
Problem | EXADATA Integrated Solutions | Non-EXADATA Solutions
Extending the database buffer cache’s capacity and performance | Smart Flash Cache | KEEP/RECYCLE caches; Oracle Flash Cache
Determining which objects to cache and where for most efficient usage | Smart Scan; Storage Indexes | AWR Reports; Segment Statistics
Compressing rarely used data | Hybrid Columnar Compression (Archival Compression) | Oracle Advanced Compression (BASIC and OLTP); DBMS_COMPRESSION
Compressing frequently used data | Hybrid Columnar Compression (Warehouse Compression) | Oracle Advanced Compression (BASIC and OLTP)
Offloaded SQL Processing | Smart Scan; Storage Indexes | SQL Result Set Caching; Partitioning; MultiColumn Statistics
Recovery Manager (RMAN) Backups that support RTO/RPO requirements | Block Change Tracking; Incremental Level 1 Backups | Massively Parallelized Multi-Piece Backup Sets (SECTION SIZE)
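Two of the non-Exadata alternatives listed in Table 3, multi-column statistics and SQL result set caching, can be sketched in a few statements. This is a minimal illustration under assumptions: the ORDERS table and its CUST_STATE and CUST_CITY columns are hypothetical names, not objects from the article.

```sql
-- Multi-column (extended) statistics: tell the optimizer that two columns
-- are correlated. ORDERS, CUST_STATE and CUST_CITY are hypothetical names.
SELECT DBMS_STATS.CREATE_EXTENDED_STATS(USER, 'ORDERS', '(CUST_STATE, CUST_CITY)')
  FROM dual;

-- Regather statistics so the new column group is populated.
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(ownname => USER, tabname => 'ORDERS');
END;
/

-- SQL result set caching: repeated executions can be answered from the
-- result cache instead of rescanning the table.
SELECT /*+ RESULT_CACHE */ cust_state, COUNT(*) AS order_count
  FROM orders
 GROUP BY cust_state;
```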
■ ■ ■ About the Author
Jim Czuprynski has accumulated more than 30 years of experience
during his information technology career. He has filled diverse roles
at several Fortune 1000 companies in those three decades before
becoming an Oracle database administrator in 2001. He currently
holds OCP certification for Oracle 9i, 10g and 11g. Jim teaches the core
Oracle University database administration courses as well as the
Exadata Database Machine administration course on behalf of Oracle
and its education partners throughout the United States and Canada.
He continues to write a steady stream of articles that focus on myriad
facets of Oracle database administration at databasejournal.com.
Going Live On Oracle Exadata
By Marc Fielding
This is the story of a real-world Exadata Database
Machine deployment integrating OBIEE analytics and
third-party ETL tools in a geographically distributed,
high-availability architecture. Learn about our experiences
with large-scale data migration, hybrid columnar compression
and overcoming challenges with system performance. Find out
how Exadata improved response times while reducing power
usage, data center footprint and operational complexity.
The Problem
LinkShare provides marketing services for some of the world’s largest retailers,
specializing in affiliate marketing, lead generation, and search.[1] LinkShare’s
proprietary Synergy Analytics platform gives advertisers and website owners
real-time access to online visitor and sales data, helping them manage and
optimize online marketing campaigns. Since the launch of Synergy Analytics,
request volumes have grown by a factor of 10, consequently putting a strain
on the previous database infrastructure.
This strain manifested itself not only in slower response times, but also
increasing difficulty in maintaining real-time data updates, increased
database downtime and insufficient capacity to add large clients to the system.
From the IT perspective, the legacy system was nearing its planned end-of-life
replacement period. Additionally, monthly hard disk failures would impact
performance system-wide as data was rebuilt onto hot spare drives. I/O
volumes and storage capacity were nearing limits and power limitations in the
datacenter facilities made it virtually impossible to add capacity to the existing
system. Therefore, the previous system required a complete replacement.
The Solution
The end-of-life of the previous system gave an opportunity to explore a wide
range of replacement alternatives. They included a newer version of the legacy
database system, a data warehouse system based on Google’s MapReduce[2]
data-processing framework and Oracle’s Exadata database machine.
Ultimately, Exadata was chosen as the replacement platform for a variety of
factors, including the superior failover capabilities of Oracle RAC and simple,
linear scaling that the Exadata architecture provides. It was also able to fit in
a single rack what had previously required three racks, along with an 8x
reduction in power usage. Exadata was able to deliver cost savings and
improved coverage by allowing the same DBAs that manage the existing
Oracle-based systems to manage Exadata as well.
Once the Exadata hardware arrived, initial installation and configuration were
very fast, aided by a combination of teams from implementation partner
Pythian; Oracle’s strategic customer program, Oracle Advanced Customer
Services; and LinkShare’s own DBA team. In less than a week, hardware and
software were installed and running.
The Architecture
User requests are handled through a global load balancing infrastructure,
able to balance loads across datacenters and web servers. A cluster of web
servers and application servers run Oracle Business Intelligence Enterprise
Edition (OBIEE), a business intelligence tool allowing users to gain
insight into online visitor and sales data from a familiar web browser
interface. The OBIEE application servers are then connected to an Exadata
database machine.
[1] Affiliate Programs – LinkShare, http://www.linkshare.com
[2] MapReduce: Simplified Data Processing on Large Clusters, Jeffrey Dean, Sanjay Ghemawat. http://labs.google.com/papers/mapreduce-osdi04.pdf
Figure 1. Overall System Architecture
Data flows from Oracle 11g-based OLTP systems, using a cluster of ETL
servers running Informatica PowerCenter that extract and transform data for
loading into an operational data store (ODS) schema located on the Exadata
system. The ETL servers then take the ODS data, further transforming it into a
dimensional model in a star schema. The star schema is designed for flexible
and efficient querying as well as storage space efficiency.
LinkShare’s analytics platform serves a worldwide client base and doesn’t have
the off-hours maintenance windows common to many other analytics systems.
The high availability requirements dictated an architecture (Fig. 1) that relies
not only on the internal redundancy built into the Exadata platform, but also on
two independent Exadata machines housed in geographically separated datacenter
facilities. Rather than using a traditional Oracle Data Guard configuration,
LinkShare opted to take advantage of the read-intensive nature of the analytics
application to simply double-load data from source systems using the existing
ETL platform. This configuration completely removes dependencies between
sites and also permits both sites to service active users concurrently.
In order to reduce migration risks and to permit an accelerated project
timeline, application and data model changes were kept to a bare minimum.
One of Exadata’s headline features is hybrid columnar compression, which
combines columnar storage with traditional data compression algorithms like
LZW to give higher compression ratios than traditional Oracle data compression.
One decision when implementing columnar compression is choosing a
compression level; the compression levels between QUERY LOW and ARCHIVE
HIGH offer increasing tradeoffs between space savings and compression
overhead.[3] Using a sample table to compare compression levels (Fig. 2),
we found the QUERY HIGH compression level to be at the point of diminishing
returns for space savings, while still offering competitive compression
overhead. In the initial implementation, a handful of large and infrequently
accessed table partitions were compressed with hybrid columnar compression,
with the remaining tables using OLTP compression. Based on the good results
with columnar compression, however, we plan to compress additional tables
with columnar compression to achieve further space savings.
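A comparison of this kind can be reproduced with a few CREATE TABLE AS SELECT statements. The sketch below is an illustration rather than the exact procedure used at LinkShare: SAMPLE_FACT, CLICK_FACT and the partition name are hypothetical, and the hybrid columnar levels assume Exadata storage.

```sql
-- Build one copy of a sample table per compression level.
-- SAMPLE_FACT is a hypothetical source table.
CREATE TABLE sample_fact_oltp COMPRESS FOR OLTP         AS SELECT * FROM sample_fact;
CREATE TABLE sample_fact_ql   COMPRESS FOR QUERY LOW    AS SELECT * FROM sample_fact;
CREATE TABLE sample_fact_qh   COMPRESS FOR QUERY HIGH   AS SELECT * FROM sample_fact;
CREATE TABLE sample_fact_al   COMPRESS FOR ARCHIVE LOW  AS SELECT * FROM sample_fact;
CREATE TABLE sample_fact_ah   COMPRESS FOR ARCHIVE HIGH AS SELECT * FROM sample_fact;

-- Compare the resulting segment sizes.
SELECT segment_name, ROUND(bytes / 1024 / 1024) AS size_mb
  FROM user_segments
 WHERE segment_name LIKE 'SAMPLE_FACT%'
 ORDER BY bytes DESC;

-- Recompress a large, infrequently accessed partition in place.
-- CLICK_FACT and P2010_Q1 are hypothetical names.
ALTER TABLE click_fact MOVE PARTITION p2010_q1
  COMPRESS FOR QUERY HIGH UPDATE INDEXES;
```

Timing each CREATE TABLE statement alongside the size query gives a rough view of the space-versus-overhead tradeoff described above.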
Performance Tuning
Avoiding Indexes
Improving performance was a major reason for migrating to Exadata and
made up a large part of the effort in the implementation project. To make
maximum use of Exadata’s offload functionality for the data-intensive
business intelligence workload, it was initially configured with all indexes
removed. (This approach would not be recommended for workloads involving
online transaction processing, however.) The only exceptions were primary
key indexes required to avoid duplicate rows, and even these indexes were
marked as INVISIBLE to avoid their use in query plans. Foreign key
enforcement was done at the ETL level rather than inside the database,
avoiding the need for additional foreign key indexes.
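Hiding a primary key index from the optimizer takes a single statement. The following is a minimal sketch with hypothetical object names (CLICK_FACT, CLICK_FACT_PK); it shows the general technique, not LinkShare's actual DDL.

```sql
-- Keep the primary key index for uniqueness, but hide it from the optimizer.
ALTER INDEX click_fact_pk INVISIBLE;

-- Check which indexes the optimizer can still consider.
SELECT index_name, uniqueness, visibility
  FROM user_indexes
 WHERE table_name = 'CLICK_FACT';

-- For a one-off test, a session can opt back in to invisible indexes.
ALTER SESSION SET optimizer_use_invisible_indexes = TRUE;
```

An invisible index is still maintained by DML, so uniqueness continues to be enforced while query plans ignore it.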
By removing or hiding all indexes, Oracle’s optimizer is forced to use full
scans. This may seem counterintuitive; full scans require queries to read entire
table partitions, as compared to an index scan, which reads only the rows
matching query predicates. But by avoiding index scans, Exadata’s smart
scan storage offloading capability can be brought to bear. Such offloaded
operations run inside Exadata storage servers, which can use their directly
attached disk storage to efficiently scan large volumes of data in parallel.
These smart scans avoid one of the major points of contention with rotating
storage in a database context: slow seek times inherent in single-block
random I/O endemic in index scans and ROWID-based table lookups.
Exadata storage servers have optimizations to reduce the amount of raw disk
I/O. Storage indexes cache high and low values for each storage region,
allowing I/O to be skipped entirely when there is no possibility of a match.
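Whether smart scans and storage indexes are actually being used can be checked from the database's cumulative statistics. The query below is a simple sketch; it reports instance-wide totals since startup, so comparing readings before and after a test query is assumed.

```sql
-- Instance-wide smart scan and storage index activity since startup.
SELECT name, ROUND(value / 1024 / 1024 / 1024, 1) AS gb
  FROM v$sysstat
 WHERE name IN ('cell physical IO bytes eligible for predicate offload',
                'cell physical IO interconnect bytes returned by smart scan',
                'cell physical IO bytes saved by storage index')
 ORDER BY name;
```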
The largest application code changes involved handling differences in date
manipulation syntax between Oracle and the legacy system. The logical data
model, including ODS environment and star schema, was retained.
The legacy system had a fixed and inflexible data partitioning scheme as a
by-product of its massively parallel architecture. It supported only two types of
tables: nonpartitioned tables, and partitioned tables using a single numeric
partition key, hashed across data nodes. The requirement to have equal-sized
partitions to maintain performance required the creation of a numeric
incrementing surrogate key as both primary key and partition key. The move
to Oracle opened up a whole new set of partitioning possibilities that better fit
data access patterns, all with little or no application code changes. More
flexible partitioning allows improved query performance, especially when
combined with full scans, as well as simplifying maintenance activities like the
periodic rebuild and recompression of old data. The final partition layout
ended up combining date range-based partitioning with hash-based
subpartitioning on commonly queried columns.
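A layout of this kind, date ranges subdivided by hash, might look like the following. The table, columns, partition boundaries and subpartition count are all hypothetical; the article does not give the actual definitions.

```sql
-- Range partitioning by date with hash subpartitioning on a commonly
-- queried column. All names and boundaries are hypothetical.
CREATE TABLE click_fact (
  click_date     DATE    NOT NULL,
  advertiser_id  NUMBER  NOT NULL,
  click_count    NUMBER
)
PARTITION BY RANGE (click_date)
SUBPARTITION BY HASH (advertiser_id) SUBPARTITIONS 16
(
  PARTITION p2011_q3 VALUES LESS THAN (DATE '2011-10-01'),
  PARTITION p2011_q4 VALUES LESS THAN (DATE '2012-01-01')
);
```

Queries filtering on CLICK_DATE touch only the relevant range partitions, and old partitions can be rebuilt or recompressed one at a time.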
Data Migration
Data migration was done in three separate ways, depending on the size of the
underlying tables. Small tables (less than 500MB in size) were migrated using
Oracle SQL Developer’s built-in migration tool. This tool’s GUI interface
allowed ETL developers to define migration rules independently of the DBA
team, freeing up DBA time for other tasks. Data transfer for these migrations
was done through the developers’ own desktop computers and JDBC drivers
— on a relatively slow network link — so these transfers were restricted to
small objects. The table definitions and data were loaded into a staging
schema, allowing them to be examined for correctness by QA and DBA teams
before being moved in bulk to their permanent location.
Larger objects were copied using existing Informatica PowerCenter
infrastructure and the largest objects (more than 10GB) were dumped to
text files on an NFS mount using the legacy system’s native query tools,
and loaded into the Exadata database using SQL*Loader direct path loads.
Simultaneous parallel loads on different partitions improved throughput.
Initial SQL*Loader scripts were generated from Oracle SQL Developer’s
migration tool but were edited to add the UNRECOVERABLE, PARALLEL and
PARTITION keywords, enabling direct path parallel loads. The SQL*Loader
method proved to be more than twice as fast as any other migration method,
so many of the tables originally planned to be migrated by the ETL tool were
done by SQL*Loader instead. (Although SQL*Loader was used here because
of DBA team familiarity, external tables are another high-performance
method of importing text data.)
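For comparison, the external table alternative mentioned above can be sketched as follows. This is not the method used in the migration; the directory path, file name, delimiter, date mask and column list are assumptions made for illustration.

```sql
-- Directory object pointing at the NFS mount (path is hypothetical).
CREATE OR REPLACE DIRECTORY legacy_dump_dir AS '/mnt/legacy_dump';

-- External table over a pipe-delimited text dump (layout is hypothetical).
CREATE TABLE click_fact_ext (
  click_date     DATE,
  advertiser_id  NUMBER,
  click_count    NUMBER
)
ORGANIZATION EXTERNAL (
  TYPE ORACLE_LOADER
  DEFAULT DIRECTORY legacy_dump_dir
  ACCESS PARAMETERS (
    RECORDS DELIMITED BY NEWLINE
    FIELDS TERMINATED BY '|'
    MISSING FIELD VALUES ARE NULL
    (
      click_date     CHAR(10) DATE_FORMAT DATE MASK "YYYY-MM-DD",
      advertiser_id  CHAR(20),
      click_count    CHAR(20)
    )
  )
  LOCATION ('click_fact_2011q3.txt')
)
REJECT LIMIT UNLIMITED;

-- Direct-path load from the external table into the target table.
INSERT /*+ APPEND */ INTO click_fact (click_date, advertiser_id, click_count)
SELECT click_date, advertiser_id, click_count
  FROM click_fact_ext;
COMMIT;
```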
Another tool commonly used in cross-platform migrations is Oracle
Transparent Gateways. Transparent gateways allow non-Oracle systems to
be accessed through familiar database link interfaces as if they were Oracle
systems. We ended up not pursuing this option to avoid any risk of
impacting the former production environment, and to avoid additional
license costs for a short migration period.
One of the biggest challenges in migrating data in a 24x7 environment is not
the actual data transfer; rather, it is maintaining data consistency between
source and destination systems without incurring downtime. We addressed this
issue by leveraging our existing ETL infrastructure: creating bidirectional
mappings for each table and using the ETL system’s change-tracking
capabilities to propagate data changes made in either source or destination
system. This process allowed the ETL system to keep data in the Exadata
systems up to date throughout the migration process. The process was retained
post-migration, keeping data in the legacy system up to date.
[3] Oracle Database Concepts 11g Release 2 (11.2)
Figure 2: Comparison of Compression Rates
The Exadata smart flash cache uses flash-based storage to cache the most
frequently used data, avoiding disk I/O if data is cached. The net result is that
reading entire tables can end up being faster than traditional index access,
especially when doing large data manipulations common in data warehouses
like LinkShare’s.
Benchmarking Performance
Given the radical changes between Exadata and the legacy environment,
performance benchmarks were essential to determine the ability of the
Exadata platform to handle current and future workload. Given that the
Exadata system had less than 25 percent of the raw disk spindles and therefore
less I/O capacity compared to the legacy system, business management was
concerned that Exadata performance would degrade sharply under load.
To address these concerns, the implementation team set up a benchmark
environment where the system’s behavior under load could be tested. While
Oracle-to-Oracle migrations may use Real Application Testing (RAT) to gather
workloads and replay them for performance testing, RAT does not support
non-Oracle platforms. Other replay tools involving Oracle trace file capture
were likewise not possible.
Eventually a benchmark was set up at the webserver level using the
open-source JMeter[4] tool to read existing webserver logs from the legacy production
environment and reformat them into time-synchronized, simultaneous
requests to a webserver and application stack connected to the Exadata system.
This approach had a number of advantages, including completely avoiding
impacts to the legacy environment and using testing infrastructure with which
the infrastructure team was already familiar. A side benefit of using playback
through a full application stack was that it allowed OBIEE and web layers to
be tested for performance and errors. Careful examination of OBIEE error logs
uncovered migration-related issues with report structure and query syntax
that could be corrected. Load replay was also simplified by the read-intensive
nature of the application, avoiding the need for flashback or other tools to
exactly synchronize the database content with the original capture time.
The benchmark was first run with a very small load — approximately
10 percent of the rate of production traffic. At this low rate of query volume,
overall response time was about 20 percent faster than the legacy system.
This was a disappointment when compared to the order of magnitude
improvements expected, but it was still an improvement.
The benchmark load was gradually increased to 100 percent of production
volume. Response time slowed down dramatically to the point where the
benchmark was not even able to complete successfully. Using database-level
performance tools like Oracle’s AWR and SQL monitor, the large smart scans
were immediately visible, representing the majority of response time.
Another interesting wait event was visible: enq: KO – fast object checkpoint.
These KO waits are a side effect of direct-path reads, including Exadata smart
scans. Another session was making data changes — in this case updating a
row value. But such updates are buffered and not direct-path, so they are
initially made to the in-memory buffer cache only. But direct-path reads,
which bypass the buffer cache and read directly from disk, wouldn’t see these
changes. To make sure data is consistent, Oracle introduces the enq: KO – fast
object checkpoint wait event, waiting for the updated blocks to be written to
disk. The net effect is that disk reads would hang, sometimes for long periods of
time, until block checkpoints could complete. Enq: KO – fast object checkpoint
waits can be avoided by doing direct-path data modifications. Such data
changes apply only to initially empty blocks, and once the transaction is
committed, the changed data is already on disk. Unfortunately,
direct-path data modifications can only be applied to bulk inserts using the
/*+APPEND*/ hint or CREATE TABLE AS SELECT, not UPDATE or DELETE.
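The difference between the two kinds of data modification, and a way to measure the resulting waits, can be sketched as follows. CLICK_FACT and CLICK_FACT_STAGE are hypothetical tables; the wait event query reports instance-wide totals since startup.

```sql
-- Conventional (buffered) DML: changed blocks stay in the buffer cache until
-- they are checkpointed, so a later direct-path read of the segment may wait
-- on "enq: KO - fast object checkpoint".
UPDATE click_fact
   SET click_count = click_count + 1
 WHERE advertiser_id = 42;
COMMIT;

-- Direct-path insert: rows are written to new blocks, bypassing the
-- buffer cache.
INSERT /*+ APPEND */ INTO click_fact (click_date, advertiser_id, click_count)
SELECT click_date, advertiser_id, click_count
  FROM click_fact_stage;
COMMIT;

-- Time spent on KO waits since instance startup.
SELECT event, total_waits, ROUND(time_waited_micro / 1000000) AS seconds_waited
  FROM v$system_event
 WHERE event = 'enq: KO - fast object checkpoint';
```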
Operating system level analysis on the storage servers using the Linux iostat
tool showed that the physical disk drives were achieving high read throughput
and running at 100 percent utilization, indicating that the hardware was
functioning properly but struggling with the I/O demands placed on it.
Solving the Problem
To deal with the initial slow performance, we adopted a more traditional data
warehousing feature of Oracle: bitmap indexes and star transformations.[5]
Bitmap indexes work very differently from Exadata storage offload, doing data
processing at the database server level rather than offloading to Exadata
storage servers. By doing index-based computations in advance of fact table
access, they only retrieve matching rows from fact tables. Fact tables are
generally the largest tables in a star schema; thus, bitmap-based data access
typically does much less disk I/O than smart scans, at the expense of CPU time,
disk seek time, and reduced parallelism of operations. By moving to bitmap
indexes, we also give up Exadata processing offload, storage indexes and even
partition pruning, because partition join filters don’t currently work with
bitmap indexes. With the star schema in place at LinkShare, however, bitmap
indexes on the large fact tables allowed very efficient joins of criteria from
dimension tables, along with caching benefits of the database buffer cache.
The inherent space efficiencies of bitmap indexes allowed aggregate index
size to remain less than 30 percent of the size under the legacy system.
After creating bitmap indexes on each key column on the fact tables, we ran the
same log-replay benchmark as previously. The benchmark returned excellent
results and maintained good response times even when run at load volumes of
eight times that of the legacy system, without requiring any query changes.
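The setup described above can be sketched with hypothetical names: a CLICK_FACT fact table with ADVERTISER_ID and DATE_ID key columns, and ADVERTISER_DIM and DATE_DIM dimension tables. The actual LinkShare schema is not shown in the article.

```sql
-- One bitmap index per dimension key on the fact table. Bitmap indexes on a
-- partitioned table must be LOCAL. All object names are hypothetical.
CREATE BITMAP INDEX click_fact_adv_bix  ON click_fact (advertiser_id) LOCAL;
CREATE BITMAP INDEX click_fact_date_bix ON click_fact (date_id)       LOCAL;

-- Allow the optimizer to consider star transformation in this session.
ALTER SESSION SET star_transformation_enabled = TRUE;

-- A typical star query: dimension filters are combined through the bitmap
-- indexes before any fact table rows are fetched.
SELECT d.calendar_month, a.advertiser_name, SUM(f.click_count) AS clicks
  FROM click_fact f
  JOIN date_dim d       ON d.date_id = f.date_id
  JOIN advertiser_dim a ON a.advertiser_id = f.advertiser_id
 WHERE d.calendar_year = 2011
   AND a.region = 'EMEA'
 GROUP BY d.calendar_month, a.advertiser_name;
```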
Query-Level Tuning
Even with bitmap indexes in place, AWR reports from benchmark runs
identified a handful of queries with unexpectedly high ratios of logical reads
per execution. A closer look at query plans showed the optimizer dramatically
underestimating row cardinality, and in turn choosing nested-loop joins when
hash joins would have been an order of magnitude more efficient. Tuning
options were somewhat limited because OBIEE’s SQL layer does not allow
optimizer hints to be added easily. We instead looked at the SQL tuning advisor
and SQL profiles that are part of Oracle Enterprise Manager’s tuning pack. In
some cases, the SQL tuning advisor was able to correct the row cardinality
estimates directly and resolve the query issues by creating SQL profiles with
the OPT_ESTIMATE query hint.[6] SQL profiles automatically insert optimizer
hints whenever a given SQL statement is run, without requiring application
code changes. A further complication came from OBIEE itself: like other
business intelligence tools, it generates SQL statements without bind
variables, making it difficult to apply SQL profiles to OBIEE-generated SQL
statements. Beginning in Oracle 11gR1, the FORCE_MATCH option to the
DBMS_SQLTUNE.ACCEPT_SQL_PROFILE procedure[7] comes to the rescue, matching
statements that differ only in their literal values, in a manner similar to
the CURSOR_SHARING=FORCE initialization parameter.
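Accepting an advisor-generated profile with force matching might look like the sketch below. The SQL_ID, task name and profile name are hypothetical, and the ACCEPT_SQL_PROFILE call assumes the advisor actually recommended a profile for the statement.

```sql
DECLARE
  l_task    VARCHAR2(64);
  l_profile VARCHAR2(64);
BEGIN
  -- Analyze one problem statement taken from the cursor cache.
  l_task := DBMS_SQLTUNE.CREATE_TUNING_TASK(
              sql_id    => '0abc123def456',        -- hypothetical SQL_ID
              task_name => 'OBIEE_REPORT_TASK');   -- hypothetical task name
  DBMS_SQLTUNE.EXECUTE_TUNING_TASK(task_name => l_task);

  -- Accept the recommended profile so that it also applies to statements
  -- that differ only in their literal values.
  l_profile := DBMS_SQLTUNE.ACCEPT_SQL_PROFILE(
                 task_name   => l_task,
                 name        => 'OBIEE_REPORT_PROFILE',
                 force_match => TRUE);
END;
/

-- Confirm that the profile exists and uses force matching.
SELECT name, force_matching, status
  FROM dba_sql_profiles
 WHERE name = 'OBIEE_REPORT_PROFILE';
```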
In many cases, however, the SQL tuning advisor simply recommended creating
index combinations that make no sense for star transformations. In these cases,
we manually did much of the work the SQL tuning advisor would normally do by
identifying which optimizer hints would be required to correct the incorrect
assumptions behind the problematic execution plan. We then used the
undocumented DBMS_SQLTUNE.IMPORT_SQL_PROFILE function[8] to create SQL
profiles that would add hints to SQL statements much the way the SQL tuning
advisor would normally do automatically. Analyzing these SQL statements
manually is a very time-consuming activity; fortunately, only a handful of
statements required such intervention.
[4] Apache JMeter, http://jakarta.apache.org/jmeter/
[5] Oracle Database Data Warehousing Guide 11g Release 2 (11.2)
[6] Oracle’s OPT_ESTIMATE hint: Usage Guide, Christo Kutrovsky. http://www.pythian.com/news/13469/oracles-opt_estimate-hint-usage-guide/
[7] Oracle Database Performance Tuning Guide 11g Release 2 (11.2)
Going Live
LinkShare’s Exadata go-live plan was designed
to reduce risk by slowly switching customers
from the legacy system while preserving the
ability to revert should significant problems be discovered. The ETL system’s
simultaneous loads kept all systems up to date, allowing analytics users to run
on either system. Application code was added to the initial login screen to direct
users to either the legacy system or the new system based on business-driven
criteria. Initially, internal users only were directed at Exadata, then 1 percent
of external users, ramping up to 100 percent within two weeks. Go-live impacts
on response time were immediately visible from monitoring graphs, as shown
in Fig. 3. Not only did response times improve, but they also became much
more consistent, avoiding the long outliers and query timeouts that would
plague the legacy system.
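One way to implement a percentage-based ramp of this kind is a deterministic hash on the user identifier, so that each user stays on the same system as the percentage grows. The query below is purely illustrative: the APP_USERS table, its columns and the :ramp_pct bind variable are hypothetical, and the article does not describe the actual routing criteria beyond calling them business-driven.

```sql
-- :ramp_pct is the share of external users (0 to 100) routed to Exadata.
-- ORA_HASH(user_id, 99) maps each user to a stable bucket from 0 to 99.
SELECT user_id,
       CASE
         WHEN is_internal = 'Y'                 THEN 'EXADATA'
         WHEN ORA_HASH(user_id, 99) < :ramp_pct THEN 'EXADATA'
         ELSE 'LEGACY'
       END AS target_system
  FROM app_users;
```

Raising :ramp_pct from 1 to 100 only ever moves users from the legacy system to Exadata, never back, which keeps each user's experience consistent during the ramp.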
The second data center site went live in much the same manner, using the ETL
system to keep data in sync between systems and slowly ramping up traffic to
be balanced between locations.
Operational Aspects
Given that Exadata has a high-speed InfiniBand network fabric, it makes sense
to use this same fabric for the I/O-intensive nature of database backups.
LinkShare commissioned a dedicated backup server with an InfiniBand host
channel adapter connected to one of the Exadata InfiniBand switches. RMAN
backs up the ASM data inside the Exadata storage servers using NFS over IP
over InfiniBand. Initial tests were constrained by the I/O capacity of local disk,
so storage was moved to an EMC storage area network (SAN) already in the
datacenter, using the media server simply as a NFS server for the SAN storage.
Monitoring is based on Oracle Enterprise Manager Grid Control to monitor the
entire Exadata infrastructure. Modules for each Exadata component, including
database, cluster, Exadata storage servers, and InfiniBand hardware, give a
comprehensive status view and alerting mechanism. This is combined with
Foglight[9], a third-party tool already extensively used for performance trending
within LinkShare, installed on the database servers. The monitoring is
integrated with Pythian’s remote DBA service, providing both proactive
monitoring and 24x7 incident response.
Patching in Exadata involves several different layers: database software,
Exadata storage servers, database-server operating system components like
InfiniBand drivers, and infrastructure like InfiniBand switches, ILOM
lights-out management cards in servers, and even console switches and power
distribution units. Having a second site allows us to apply the dwindling
number of patches that aren’t rolling installable by routing all traffic to one
site and installing the patch in the other.
Looking Ahead
With Exadata sites now in production, development focus is shifting to
migrating the handful of supporting applications still running on the legacy
system. Retirement of the legacy system has generated immediate savings in
data center and vendor support costs, as well as freeing up effort in DBA, ETL
and development teams to concentrate on a single platform.
On the Exadata front, the roadmap focuses on making better use of newly
available functionality in both the Exadata storage servers and the Oracle
platform in general. In particular, we’re looking at making more use of Exadata’s
columnar compression, incorporating external tables into ETL processes, and
making use of materialized views to precompute commonly queried data.
The Results
The move to Exadata has produced quantifiable benefits for LinkShare.
Datacenter footprint and power usage have dropped by factors of 4x and 8x,
respectively. The DBA team has one less platform to manage. Response times have
improved by factors of 8x or more, improving customer satisfaction. The ability
to see more current data has helped users make better and timelier decisions,
ultimately improving customer retention and new customer acquisition.
■ ■ ■ About the Author
Marc Fielding is senior consultant with Pythian’s advanced
technology group where he specializes in high availability, scalability
and performance tuning. He has worked with Oracle database
products throughout the past 10 years, from version 7.3 up to 11gR2.
His experience across the entire enterprise application stack allows
him to provide reliable, scalable, fast, creative and cost-effective
solutions to Pythian’s diverse client base. He blogs regularly on the
Pythian blog www.pythian.com/news, and is reachable via email at
fielding@pythian.com, or on twitter @pythianfielding.
Figure 3: Monitoring-server Response Times Before and After Exadata Go-Live
[8] SQL Profiles, Christian Antognini, June 2006. http://antognini.ch/papers/SQLProfiles_20060622.pdf
[9] Quest Software Foglight, http://www.quest.com/foglight/
Thinking of Upgrading to Oracle
SOA Suite 11g? Knowing The
Right Steps Is Key
By Ahmed Aboulnaga
Upgrading from Oracle SOA Suite 10g to 11g has
proven to be very costly and challenging due to the
dramatic change in the underlying architecture. This
article presents a tried-and-tested upgrade strategy that will
help you avoid the pitfalls early adopters have faced.
Oracle SOA Suite 11g is used as the backbone for systems integration and
as the foundation for integrating applications such as Fusion Applications.
It is a comprehensive suite of products to help build, deploy and manage
service-oriented architectures (SOA) and is comprised of products and
technologies that include BPEL, Mediator and Web Services Manager. It also
brings with it several great features, including support of the WebLogic Server,
the introduction of the SCA model, improved conformance to standards and
centralized access to artifacts and metadata.
Unfortunately, error correction support ends in December 2013 for the latest
release of Oracle SOA Suite 10g (specifically, 10.1.3.5). Thus, customers will
have to choose between running on an unsupported release or upgrading to
the latest version, currently SOA Suite 11g PS4 (11.1.1.5). Some customers
erroneously believe that the new technology will resolve many of the pain points
experienced in the older one, not realizing that a stabilization phase is still
required. On the other hand, those who have invested (“suffered”) in stabilizing
their current 10g environments may hold off on the upgrade because of the
effort and risk involved.
Let me be blunt. The upgrade process will be painful. Expect nearly all your
code to require at least some change. A successful upgrade from SOA Suite 10g
to 11g can only be achieved when both the development and infrastructure
teams involved have a decent enough understanding of both versions, which is
often not the case initially. A learning curve is inevitable, and typical training
does not prepare you with the necessary upgrade knowledge.
The effort involved in moving from Oracle SOA Suite 10g to 11g is both an
upgrade and a migration. The upgrade is the result of moving to a new version of
the same product, while the migration is the process of converting existing SOA
Suite 10g code to allow it to deploy and execute on SOA Suite 11g. The entire
process is sometimes unclear and laborious, and it introduces risk as a result of
the underlying code being altered. The challenges of upgrading don’t end there.
Several core concepts have fundamentally changed with the introduction of SOA
Suite 11g, and there is not enough direction as to what to do. Understandably,
Oracle documentation cannot cover every possible scenario.
Based on recent implementation successes, IPN Web, a systems integrator based
out of the Washington, D.C., area, has developed an approach to upgrading
Oracle SOA Suite 10g to 11g that should suit most implementations while
minimizing risk. This article summarizes that approach and highlights key
areas not covered by Oracle documentation.
Figure 1 is a flow diagram of the main steps involved in the upgrade process,
as documented by Oracle. Unfortunately, this approach does little to actually
help you in your project planning.
Because Oracle Fusion Middleware 11g incorporates more than just SOA Suite,
it is difficult for Oracle to produce a one-size-fits-all upgrade strategy. Secondly,
despite the immense effort Oracle has put in the 11g documentation, and though
it helps walk through various aspects of the code migration process, it is still
lacking in many technical areas.
The approach described in this article is more realistic and helps tackle this
difficult upgrade project by providing clear, correctly sequenced steps.
Furthermore, covered here are specific strategies not addressed by any Oracle
documentation.
The IPN Web SOA Suite 11g Upgrade Approach
Although Oracle delivers great documentation on both Oracle SOA Suite and
its upgrade process, it is not presented in a manner that is project-oriented
and is missing certain areas required by all implementations.
Our modified upgrade approach, based on Figure 2, is as follows:
Figure 1. The Oracle Fusion Middleware Upgrade Process