Informix warehouse and accelerator overview

Informix Warehouse & Informix Warehouse Accelerator Overview ,[object Object],[object Object]

Disclaimer ,[object Object],[object Object],[object Object],[object Object],[object Object]

Agenda ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]

Data Warehousing Industry Trends

State of Data Warehousing in 2011 ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]

State of Data Warehousing, Cont’d ,[object Object],[object Object],[object Object],[object Object]

State of Data Warehouse, Cont’d ,[object Object],[object Object],[object Object]

Data Warehouse Trends for the CIO, 2011-2012 ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]

Informix Warehouse History ,[object Object],[object Object],[object Object],[object Object]

Existing IDS Warehousing Features ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]

Informix Warehousing Moving Forward ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]

Informix Warehouse Roadmap ,[object Object],[object Object],[object Object],[object Object],Informix Warehouse with Storage Optimization/Compression ,[object Object],[object Object],[object Object],External Tables Star Join Optimization Multi-index Scan New Fragmentation Fragment Level Stats Storage Provisioning Warehouse Accelerator

Informix Warehouse 11.70 Features

Typical Data Warehouse Architecture

Source: Forrester Query Tools Analytics BPS Apps BI Apps LOB apps Databases Other transactional data sources I/O & data loading Query processing DBMS & Storage mgmt 11.70 Warehousing Features Data Loading HPL DB utilities ON utilities DataStage External Tables Online attach/detach Data & Storage Management Deep Compression Interval and List Fragmentation Online attach/detach Fragment level stats Storage provisioning Table defragmenter Query Processing Light Scans Merge Hierarchical Queries Multi-Index Scan Skip Scan Bitmap Technology Star and Snowflake join optimization Implicit PDQ Access performance

Informix Warehouse Tooling - SQW Execution DEPLOY Deployment preparation Deploy RUNTIME HTTP service ( WAS ) SQW Runtime Applications Other Servers (DataStage) DB2 Oracle SQL Server Design Studio Admin Console Deploy Data Source Databases Execution Execution Debug SQW Control DB IDS DESIGN Design Center (Eclipse) Data Flows + Control Flows Deployment package Code Units Build Profile User scripts Warehouse DB IDS SQW Execution DB IDS

SQW: Design Studio ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]

SQW: Data Modeling ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]

SQW: Data Flows ,[object Object],[object Object],[object Object],[object Object],File source Table source Table join aggregation Table target

SQW: Data Flows A simple flow ,[object Object],[object Object],[object Object],[object Object]

SQW: Control Flows ,[object Object],[object Object],[object Object],[object Object]

SQW Overview Design Studio Eclipse Based Design Environment Admin Console Production Environment in Websphere deploy ,[object Object],[object Object],[object Object],create ,[object Object],[object Object],[object Object],manage

Admin Console ,[object Object],[object Object],[object Object]

Informix 11.70 Feature: Warehouse Time-Cyclic Data Management ,[object Object],[object Object],[object Object],[object Object],[object Object],field field field field field field field Jan Feb Mar Apr May 2011 Dec 2010 enables storing data over time field field field field field field field field field field field field field field field field field field field field field field field field field field field field field field field field field field field

Interval Fragmentation ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]

Informix 11.70 Feature: Multi-Index Scan ,[object Object],[object Object],[object Object],[object Object]

Multi-Index Scan – An Example ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]

Multi-Index Scan Example ,[object Object],[object Object],[object Object],[object Object],Method retrieves rows based on the most selective constraint using only the index for that column, followed by a sequential evaluation of each of other constraints in a post-retrieval manner.

Multi-Index Scan Example ,[object Object],[object Object],[object Object],[object Object],Gender=‘m’ Zipcode=‘95032’ AND Records Sorted RIDs Income_Category=“high” Education_level = “masters” Sequential Skip Scan

Informix 11.70 Feature: Push Down Hash Join ,[object Object],[object Object],[object Object],[object Object],[object Object],Build Scan Hash Join Build Probe Probe Scan

Typical Star Schema: An Example ,[object Object],[object Object],[object Object],[object Object],[object Object],Dim (D1) Dim (D3) Fact (F) 1M rows sel : 1/1000 10K rows sel : 1/10 10K rows sel : 1/10 10K rows sel: 1/10 Dim (D2)

Prior to 11.70: Standard Left Deep Tree Solution Scan D1 1K 1M Problem Join Second Join Build Too Large Scan F Hash Join Hash Join Scan D3 Hash Join Scan D2 100K 1K 10K 1K

11.70 Feature: Pushdown Hash-Join Solution Scan F Scan D1 1K Scan D3 Scan D2 1K Join Keys Multi Index Scan of Fact Table using Join Keys and Single-Column Indexes Join Keys Pushed Down to Reduce Probe Size Hash Join Hash Join Hash Join 1K 1K 1K 1K

Informix Warehouse Accelerator (IWA)

Agenda ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]

Third Generation of Database Technology ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]

Example of 2nd Generation Database Disk I/O Issue

How Oracle/Exadata Solves That Problem: Add an I/O Layer

Sun Oracle Database Machine Full Rack ,[object Object],[object Object],[object Object],14 Exadata Storage Cells (Storage Server) per Cell up to 1.5 GB/Sec I/O Bandwidth => 21 GB/Sec per DB machine 8 Oracle RAC Database Servers InfiniBand Switches/Network InfiniBand 16 Gigabit per Channel 8 Cores 24 GB Memory 12 Disks (600 GB/2 TB) 8 Cores 24 GB Memory 12 Disks (600 GB/2 TB) 8 Cores 24 GB Memory 12 Disks (600 GB/2 TB) 8 Cores 24 GB Memory 12 Disks (600 GB/2 TB) 8 Cores 24 GB Memory 12 Disks (600 GB/2 TB) 8 Cores 72 GB Memory 8 Cores 72 GB Memory 8 Cores 72 GB Memory 8 Cores 72 GB Memory 8 Cores 72 GB Memory

Cost of Oracle/Exadata Solution ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]

Informix Warehouse Accelerator 3 rd Generation Database Technology is Here ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],What is it? The Informix Warehouse Accelerator (IWA) is a workload optimized, appliance-like, add-on, that enables the integration of business insights into operational processes to drive winning strategies. It accelerates select queries, with unprecedented response times. Breakthrough Technology Enabling New Opportunities

Breakthrough technologies for performance Row & Columnar Database Row format within IDS for transactional workloads and columnar data access via accelerator for OLAP queries. Extreme Compression Required because RAM is the limiting factor. Massive Parallelism All cores are used within used for queries Predicate evaluation on compressed data Often scans w/o decompression during evaluation Frequency Partitioning Enabler for the effective parallel access of the compressed data for scanning. Horizontal and Vertical Partition Elimination. In Memory Database 3 rd generation database technology avoids I/O. Compression allows huge databases to be completely memory resident Multi-core and Vector Optimized Algorithms Avoiding locking or synchronization 1 2 3 4 5 6 7 1 2 3 4 5 6 7

TCP/IP Informix Warehouse Accelerator Configuration ,[object Object],[object Object],[object Object],[object Object],[object Object],Bulk Loader SQL Queries (from apps) Informix Warehouse Accelerator Compressed DB partition Query Processor Data Warehouse IDS SQL (via DRDA) Query Router ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],Results

Informix Warehouse Accelerator Overview Coordinator Process Orchestrating the distributed tasks like Load or Query execution . Have all the data in main memory spread across all cores. Do the compression and query execution. IDS Query parsing and matching to the Optimizer. Routing query blocks. . . Worker Processes

Target Market: Business Intelligence (BI) ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],Dimensions Fact Table City Region Store SALES Product Period Brand Month Quarter Category

What IWA is Designed For ,[object Object],[object Object],[object Object],SELECT PRODUCT_DEPARTMENT, REGION, SUM(REVENUE) FROM FACT_SALES F INNER JOIN DIM_PRODUCT P ON F.FKP = P.PK INNER JOIN DIM_REGION R ON F.FKR = R.PK LEFT OUTER JOIN DIM_TIME T ON F.FKT = T.PK WHERE T.YEAR = 2009 AND R.GEOID = 17 AND P.TYPEID = 3 GROUP BY PRODUCT_DEPARTMENT, REGION

Case Study #1: Major U.S. Shoe Retailer ,[object Object],Our Retail users will be really happy to see such a huge improvement in the queries processing timings. This IWA extension to IDS will really bring value to the Retail BI environment. 2 secs 45 mins & up 7 2 secs 30 mins 6 2 secs 2 mins 5 4 secs 30 mins & up 4 2 secs 3 mins 40 secs 3 2 secs 1 min 3 secs 2 4 secs 22 mins 1 IDS 11.7 IWA IDS 11.5 Query

Case Study #2: Datamart at a Government Agency ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]

Case Study #3: U.S. Government Agency 15800.89% Fact Table Scan 0:00:41 1:48:58 Summarize all transactions by State, County, City, State, Zip, Program, Program Year, Commodity and Fiscal Year 5 108.41% Index Read 0:00:06 0:00:06 Detailed Report on Specific Programs in a Date Range 4 41708.49% Fact Table Scan 0:00:14 1:34:37 Summarize all transactions by State and County 3 7640.45% Fact Table Scan 0:01:05 1:22:32 Find Top 100 Members 2 6023.23% Fact Table Scan 0:01:28 1:28:22 Find Top 100 Entities 1 Improvement Notes Informix w/ IWA Informix Description Query

Row Oriented Data Store Each row stored sequentially ,[object Object],[object Object],[object Object],[object Object],[object Object],If only few columns are required the complete row is still fetched and uncompressed

Columnar Data Store Data is stored sequentially by column If attributes are not required for a specific query execution, they are skipped completely. ,[object Object],[object Object],[object Object]

Compression: Frequency Partitioning Top 64 traded goods – 6 bit code Rest Prod Origin Trade Info (volume, product, origin country) Histogram on Origin Histogram on Product Origin Product China USA GER, FRA, … Rest Table partitioned into Cells Column Partitions Vol ,[object Object],[object Object],[object Object],Cell 4 Cell 1 Cell 2 Cell 3 Cell 5 Cell 6 Common Values Rare values Number of Occurrences

Compression Process: Step 1 Male/John Input tuple Column 1 Column 2 Co-code transform Type specific transform Column 1 & 2 Column 3.A Column Code TupleCode Column Code Column 3 Column 3.B Column Code Male/John/Sat Sat 2006 Male, John, 08/10/06, Mango 101101011 001 01011101 101101011 001 01011101 p = 1/512 p = 1/8 p = 1/512 w35/Mango w35 Huffman Encode Dict Huffman Encode Dict Huffman Encode Dict Male John 08/10/06 Mango 1.5% Steven 1.9% Thomas 2.3% Richard 2.4% Mark 2.5% William 3.5% John 3.5% Robert 3.6% James 3.8% David 4.2% Michael 22% 28% 17% 15% 9% 5% 4% Female 12% 42% 23% 6% 10% 4% 3% Male Sun Sat Fri Thu Wed Tue Mon

Compression Process: Step 2 First tuple code Tuplecode — Sorted Tuplecodes 1 Previous Tuplecode Delta Huffman Encode Delta Code Append Dict Compression Block 101101011100001100 10110101110001011111 1011010111000011101 10110101110001011101 10110101110001011101 0000000000000000001 000 000 00000000000000000001 010 010 0000000000000000101 1110 1110 Look Ma, no delimiters! 101101011100010111010000101110 — — —

Data is Processed in Compressed Format ,[object Object],[object Object],[object Object],[object Object],The Register – Store is an optimization of the Column – Store approach where we try to make the best use of existing hardware. Reshuffeling small data elements at runtime into a register is time consuming and can be avoided. The Register – Store also delivers good vectorization capabilities. Predicate evaluation is done against compressed data!

Register Stores Facilitate SIMD Parallelism ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],32 bits 32 bits 32 bits 32 bits 128 bits Vector Operation A 1 D 1 G 1 A 2 D 2 G 2 A 4 D 4 G 4 Bank β 1 (32 bits) A 3 D 3 G 3 B 1 E 1 F 1 B 2 E 2 F 2 B 4 E 4 F 4 C 1 H 1 C 3 H 3 C 4 H 4 Bank β 2 (32 bits) Bank β 3 (16 bits) Cell Block B 3 E 3 F 3 C 2 H 2 Result 1 Result 2 Result 3 Result 4 Operand Operand Operand Operand

Simultaneous Evaluation of Equality Predicates State==‘CA’ && Quarter == ‘Q4’ State==01001 && Quarter==1110 Translate value query to Code query Row Mask Selection result … … … … 01001 0 1110 0 == & ,[object Object],[object Object],[object Object],[object Object],State Quarter 11111 0 1111 0

Defining, What Data to Accelerate ,[object Object],[object Object],[object Object],[object Object],Define Worker Processes Coordinator Process IDS + IWA

Distributing data from IDS (Fact tables) Data Fragment Fact Table UNLOAD UNLOAD UNLOAD UNLOAD IDS Stored Procedures Copy A copy of the IDS data is now transferred over to the Worker process. The Worker process holds a subset of the data (compressed) in main memory and is able to execute queries on this subset. The data is evenly distributed (no value based partitioning) across the cpus. Coordinator Process Worker Process Compressed Data Compressed Data Compressed Data Compressed Data Compressed Data Compressed Data Worker Process Worker Process Data Fragment Data Fragment Data Fragment

Distributing data from IDS (Dimension tables) IDS UNLOAD UNLOAD UNLOAD UNLOAD IDS Stored Procedure All dimension tables are transferred to the worker process. Coordinator Process Worker Process Worker Process Worker Process Dimension Table Dimension Table Dimension Table Dimension Table Dimension Table Dimension Table Dimension Table Dimension Table Dimension Table Dimension Table Dimension Table Dimension Table Dimension Table Dimension Table Dimension Table Dimension Table

Mapping Data from IDS to IWA Inside IWA Inside IDS Data Fragment Data Fragment Data Fragment Data Fragment Data Fragment Data Fragment Fact Table Dimension Table Dimension Table Dimension Table Dimension Table Compressed Data Fragment Data Fragment Data Fragment Data Fragment Data Fragment Data Fragment Fact Table Dimension Table Dimension Table Dimension Table Dimension Table

IWA Referenced Hardware Configuration Options: 300 GB SAS hard disk drives each 6 disks 512G Memory X7560 @ 2.27GH 4 X 8 Intel(R) Xeon(R) CPU - 16x 1.8" SAS SSDs with eXFlash or 8x 2.5" SAS HDDs - Optional MAX5 32-DIMM memory expansion - Scalable from 4 sockets and 64 DIMMs to 8 sockets and 128 DIMMs - 8-core, 6-core and 4-core processor options with up to 2.26 GHz (8-core), 2.66 GHz (six-core) and 1.86 GHz (four-core) speeds with up to 16 MB L3 cache - 4-processor, 4U rack-optimized enterprise server with Intel® Xeon® processors.

IWA Software Components ,[object Object],[object Object],[object Object],[object Object]

Informix warehouse and accelerator overview

Recommended

Recommended

More Related Content

What's hot

What's hot (20)

Similar to Informix warehouse and accelerator overview

Similar to Informix warehouse and accelerator overview (20)

More from Keshav Murthy

More from Keshav Murthy (20)

Recently uploaded

Recently uploaded (20)

Informix warehouse and accelerator overview

Editor's Notes