Performance and Scalability of
Informix® Ultimate Warehouse Edition
on Intel Xeon® 7500 and E7 processors
Session Number 2864

                 Keshava Murthy, IBM®
                     Jantz Tran, Intel®
Agenda

• Intel Inside
• IWA Overview
• Key Intel performance features
• How IWA exploits the Intel features
• Performance results




Tick-Tock Development Model
    Sustained Xeon® Microprocessor Leadership

    Process nodes, alternating tick and tock: 65nm, 45nm, 32nm, 22nm
    Products, oldest to newest: Xeon® 5100, Xeon® 5300, Xeon® 7400, Xeon® 7500, Xeon® E7, Sandy Bridge-EP/EN, Ivy Bridge EP/EN

    Intel® Core™ Microarchitecture
    • First high-volume server Quad-Core CPUs
    • Dedicated high-speed bus per CPU
    • HW-assisted virtualization (VT-x)

    Nehalem/Westmere Microarchitecture
    • Up to 10 cores and 30MB Cache
    • Integrated memory controller with DDR3 support
    • Turbo Boost, Intel HT, AES-NI
    • End-to-end HW-assisted virtualization (VT-x, -d, -c)

    Sandy Bridge/Ivy Bridge Microarchitecture
    • Up to 8 cores and 20MB Cache
    • Integrated PCI Express
    • Turbo Boost 2.0
    • Intel Advanced Vector Extensions (AVX)
Intel® Xeon® Processor Family for Business

Platforms, in order of decreasing capability:

• Intel® Xeon® processor E7 platforms
  Scalable (up to 256-way), reliable, powerful 64-bit multi-core servers offering industry-leading performance, expanded memory & I/O capacity, and advanced reliability ideal for the most demanding enterprise and mission critical workloads, large scale virtualization and large-node HPC applications.

• Intel® Xeon® processor 5000 sequence platforms (E5 in 2012)
  Versatile (up to 2-way) servers for all your infrastructure, high-density, workstation and HPC applications with features that enable optimal performance and power efficiency for the data center.

• Intel® Xeon® processor 3000 sequence platforms (E3 in 2012)
  Economical (1-way) dependable general purpose 64-bit servers well-suited for small businesses and education with features that optimize performance, uptime, and security.

Segments, by increasing capability:

• Small Business: Economical and more dependable vs. desktop
• Entry Servers and Workstations: More features and performance than traditional desktop systems

Mainstream Enterprise (best combination of performance, power efficiency, and cost)
• Enterprise Server: Versatility for infrastructure apps (up to 4S)
• Cloud Computing: Efficient, secure, and open platforms for Internet datacenters and IAAS
• High Performance Computing & Workstations: Bandwidth-optimized for high performance analytics & visualization

Scalable Enterprise (top-of-the-line performance, scalability, and reliability)
• Mission Critical: Performance and reliability for the most business critical workloads with outstanding economics
• Cloud Computing: Highest virtualization density and advanced reliability for private cloud
• High Performance Computing: Greater scaling and memory capacity
Intel® Xeon® Processor
E7-8800/4800/2800 Product Families
Building on Xeon® 7500 Leadership Capabilities

More Performance
• 10 cores / 20 threads
• 30MB of last level cache

More Expandable
• Supports 32GB DDR3 DIMMs (2TB per 4-socket system) [1]

More Security & RAS
  Security
  • Intel® Advanced Encryption Standard-New Instructions
  • Intel® Trusted Execution Technology (TXT)
  Reliability, Availability, Serviceability
  • Enhanced DRAM Double Device Data Correction
  • Fine Grained Memory Mirroring

More Efficient
• More performance within same max CPU TDP as Xeon 7500
• Lower partial active & idle power via Intel Intelligent Power Technology [2]
• Support for Low Voltage DIMMs [3]
• Reduced power memory buffers [4]

[Diagram: four E7-4800 sockets]

Delivers more Performance, Expandability and RAS while improving Energy Efficiency
  1. Up to 64 slots per standard 4 socket system x 32GB/DIMM = 2TB
  2. Uses similar core and package C6 power states enabled on Intel Xeon 5500/5600 series processors. Requires OS support.
  3. Savings dependent on workload and configuration.
  4. Memory buffer power savings of up to 1.3W active and 3W idle per buffer per Intel estimates. Slightly more savings when used with LV DIMMs
Advantages of the Xeon® E7 Platform

4-socket systems can…
…process the biggest workloads …maximize consolidation
…increase system uptime …handle highly variable workloads

Intel® Xeon® Processor E7-4800 Product Family vs. Xeon® Processor 5600 Series

Large Workloads & Max. Consolidation
• Over 2X the compute performance across a range of benchmarks [1]
• Up to 7X memory capacity for greater performance, headroom and memory DIMM savings [2]
• Up to 2X higher consolidation [3]

Highly Variable Workloads
• More performance headroom to handle peak, unexpected, or underestimated workloads
• Compute, memory and I/O scalability extends useful server life in high-growth workloads
• Denser compute resources per server maximizes performance in constrained sites

Mission Critical Class System Availability
• Protects your data by preventing errors
• Increased availability via healing, redundancy and failover technologies
• Minimized downtime via failure prediction and proactive replacement of failing components

Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.



1. Results have been estimated based on internal Intel analysis and are provided for informational purposes only. Any difference in system hardware or software design or configuration may affect actual
    performance. For more information on performance tests and on the performance of Intel products, visit http://www.intel.com/performance/resources/limits.htm
2. 64 DIMM slots vs. 18 slots for the Xeon 5600 processor series platform
3. 2X higher consolidation refresh ratio based on ROI tool comparing Xeon 7500 and Xeon 5600 vs. older generations.
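
The capacity claims on these slides can be checked with a few lines of arithmetic. The sketch below is illustrative only: the slot counts (64 and 18) and the 32GB DIMM size come from the slides, while the 16GB DIMM size for the Xeon 5600 platform is an assumption that the deck does not state; it is chosen because it makes the slot counts consistent with the "up to 7X" figure.

```python
# Memory-capacity check for the platform comparison (all sizes in GB).
GB_PER_TB = 1024


def platform_capacity_gb(dimm_slots: int, gb_per_dimm: int) -> int:
    """Maximum memory when every DIMM slot holds a DIMM of the given size."""
    return dimm_slots * gb_per_dimm


# From the slides: 64 slots per standard 4-socket E7 system, 32GB DIMMs.
e7_gb = platform_capacity_gb(dimm_slots=64, gb_per_dimm=32)

# 18 slots is from the footnote; the 16GB DIMM size is an ASSUMPTION.
xeon5600_gb = platform_capacity_gb(dimm_slots=18, gb_per_dimm=16)

capacity_ratio = e7_gb / xeon5600_gb

print(f"E7 4-socket: {e7_gb} GB ({e7_gb // GB_PER_TB} TB)")
print(f"Xeon 5600:   {xeon5600_gb} GB")
print(f"Ratio:       {capacity_ratio:.1f}x")
```

Under that assumption the ratio comes out at roughly 7.1, so the 7X claim appears to rest on both more slots and larger DIMMs rather than on slot count alone (64/18 is only about 3.6).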
Advanced Reliability Starts With Silicon
    Intel® Xeon® processor E7 family RAS Capabilities
    Memory
     • Inter-socket Memory Mirroring
     • Intel® Scalable Memory Interconnect (Intel® SMI) Lane Failover
     • Intel® SMI Clock Fail Over
     • Intel® SMI Packet Retry
     • Memory Address Parity
     • Failed DIMM Isolation
     • Memory Board Hot Add/Remove
     • Dynamic Memory Migration*
     • OS Memory On-lining*
     • Recovery from Single DRAM Device Failure (SDDC) plus random bit error
     • Memory Thermal Throttling
     • Demand and Patrol scrubbing
     • Fail Over from Single DRAM Device Failure (SDDC)
     • Enhanced DRAM Double Device Data Correction
     • Fine Grained Memory Mirroring
     • Memory DIMM and Rank Sparing
     • Intra-socket Memory Mirroring
     • Mirrored Memory Board Hot Add/Remove

    I/O Hub
     • Physical IOH Hot Add
     • OS IOH On-lining*
     • PCI-E Hot Plug

    CPU/Socket
     • Machine Check Architecture (MCA) recovery (MCA-R)
     • Corrected Machine Check Interrupt (CMCI)
     • Corrupt Data Containment Mode
     • Viral Mode
     • OS Assisted Processor Socket Migration*
     • OS CPU on-lining*
     • CPU Board Hot Add at QPI
     • Electronically Isolated (Static) Partitioning
     • Single Core Disable for Fault Resilient Boot

    Intel® QuickPath Interconnect
     • Intel QPI Packet Retry
     • Intel QPI Protocol Protection via CRC (8bit or 16bit rolling)
     • QPI Clock Fail Over
     • QPI Self-Healing

    Advanced reliability features work to maintain data integrity

Intel® Xeon® processor E5-2600 product family (Sandy Bridge-EP)
New micro-architecture on the 32nm process technology

More Efficient
• Higher performance
• Lower platform power [1]

More Intelligent
• Optimized Turbo Boost
• Intel Node Manager enhancements

More Secure
• Intel AES-NI improvements
• More robust Intel TXT solutions

More Options
• Optimized platforms for: Performance, Smaller Form Factors, Best value

Platform Features
• Up to 8 cores, 20 MB cache
• New Intel® Advanced Vector Extensions
• Optimized Turbo Boost Technology

[Diagram: Sandy Bridge-EP socket with up to 8 cores, up to 2 QPI links between CPUs, up to 4 channels of DDR3 1600 memory, and integrated PCI Express* 3.0 with up to 40 lanes per socket]

1. Lower platform power claim based on a Xeon® 5600 CPU and Sandy Bridge-EP CPU with the same TDP specification and comparable platform configurations. Platform power reduction is primarily attributed to TDP reduction from a two-chip solution based on the Intel 5520 chip set and ICH-10R, down to a one-chip south bridge solution (Patsburg chip) on the Sandy Bridge platform.
INTEL: Breakthrough technologies for performance

1. Large memory support
64-bit computing; System X with MAX5 supports up to 6TB on a single SMP box; up to 640GB on each node of blade center.

2. Large on-chip Cache
L1 cache is 64KB per core, L2 cache is 256KB per core and L3 cache is about 24-30 MB. Additional Translation lookaside buffer (TLB).

3. Frequency Partitioning
Enabler for the effective parallel access of the compressed data for scanning. Horizontal and Vertical Partition Elimination.

4. Virtualization Performance
Lower overhead: Core micro-architecture enhancements, EPT, VPID, and End-to-End HW assist.

5. Hyperthreading
2x logical processors; increases processor throughput and overall performance of threaded software.

6. Single Instruction Multiple Data
Specialized instructions for manipulating 128-bit data simultaneously.

7. Multi-core, multi-node environment
Nehalem has 8 cores and Westmere 10 cores. This trend is expected to continue.
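
The slide names frequency partitioning without defining it. A minimal sketch of one common reading follows, and that reading is an assumption here rather than a description of IWA's code: column values are split into partitions by how often they occur, each partition gets its own fixed-width dictionary code, and a scan can skip any partition whose dictionary cannot contain the predicate value. The function names and the sample column are hypothetical.

```python
from collections import Counter


def code_width(n_distinct):
    """Bits needed for a fixed-width code over n_distinct values (at least 1)."""
    return max(1, (n_distinct - 1).bit_length())


def frequency_partition(values, hot_size):
    """Split distinct values into a 'hot' (most frequent) and a 'cold' partition."""
    ranked = [v for v, _ in Counter(values).most_common()]
    return ranked[:hot_size], ranked[hot_size:]


def encoded_bits(values, partitions):
    """Total bits when each partition uses its own fixed-width dictionary code."""
    total = 0
    for part in partitions:
        if not part:
            continue
        members = set(part)
        occurrences = sum(1 for v in values if v in members)
        total += code_width(len(part)) * occurrences
    return total


def partitions_to_scan(key, partitions):
    """Partition elimination: only partitions whose dictionary holds the key."""
    return [i for i, part in enumerate(partitions) if key in part]


# Hypothetical skewed column: two frequent values and six rare ones.
values = ["US"] * 6 + ["DE"] * 5 + ["FR", "JP", "BR", "IN", "CA", "MX"]
hot, cold = frequency_partition(values, hot_size=2)

single_dictionary_bits = encoded_bits(values, [hot + cold])
partitioned_bits = encoded_bits(values, [hot, cold])
print(single_dictionary_bits, partitioned_bits)
print(partitions_to_scan("JP", [hot, cold]))
```

In this toy column, frequent values need 1 bit each instead of 3, and a predicate on a rare value only has to touch the cold partition. Fixed-width codes within a partition are also what makes the packed data easy to scan in parallel.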



Intel® Xeon® E7 Processor Architecture

     [Die diagram: Core 0 through Core 9, each with private L1 and L2 caches, around a shared L3; two IMCs; QPI (4 links)]

     Cache Architecture
     • 64K L1 Cache
     • 256K L2 Cache
     • 30MB 10-slice shared Last Level cache (L3), compared to 24MB 8-slice L3 on Xeon® 7500

     • 2 integrated memory controllers
         • Scalable Memory Interconnect (SMI) with support for up to 8 DDR channels
     • 4 Quick Path Interconnect (QPI) system interconnect links




Intel QuickPath Architecture

• Connectivity
   – Fully connected by 4 Intel® QuickPath interconnects per socket
   – 6.4, 5.86, or 4.8 GT/s on all links
   – With 2 IOHs: 82 PCIe lanes (72 Gen2 Boxboro lanes + 4 Gen1 lanes on unused ESI port + 6 Gen1 ICH10 lanes)
   – PCI-E Gen 2.0
• Memory
   – Registered DDR3 800/1066 MHz via on-board memory buffer
   – 64 DIMM support (4:1 DIMM to buffer ratio)

[Diagram: four 7500/E7 CPUs, each with its memory buffers (MB), fully connected by Intel® QuickPath interconnects, with two Boxboro IOHs]
Intel® Xeon® 7500/E7 8 Socket Configuration

4+4 (8S) configuration on IBM® System x3850 X5

• Up to 10 cores and 2.4 GHz per CPU
• Supports 8-socket mode by combining 2 systems via external QPI links
• Memory configuration
   – 4TB in 8-socket server
   – 6TB in 8-socket + MAX5
   – Continued 1066MHz support

Intel®: SIMD – Single Instruction Multiple Data technology




• The Intel Xeon® E7 processor supports up to SSE 4.2
    • SIMD capabilities will be expanded to 256-bit registers with the new AVX
      instruction set in the upcoming Intel® Xeon® E5 series processors
• Informix leverages SSE in the Warehouse Accelerator
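
The slide states only that Informix leverages SSE in the Warehouse Accelerator. To illustrate the general idea of evaluating one predicate across many packed values at once, here is a minimal sketch that emulates a 128-bit register with a Python integer holding sixteen 8-bit codes. The lane layout, the function names and the bit trick are assumptions chosen for illustration; real SSE code would use a packed compare instruction such as PCMPEQB, and IWA's actual implementation is not shown in this deck.

```python
LANES = 16        # sixteen 8-bit codes fit in one 128-bit word
LANE_BITS = 8
WORD_MASK = (1 << (LANES * LANE_BITS)) - 1
LOW7 = int.from_bytes(b"\x7f" * LANES, "little")   # 0x7f in every lane
ONES = int.from_bytes(b"\x01" * LANES, "little")   # 0x01 in every lane


def pack(codes):
    """Pack sixteen 8-bit codes into one 128-bit integer; lane 0 is lowest."""
    assert len(codes) == LANES and all(0 <= c < 256 for c in codes)
    return int.from_bytes(bytes(codes), "little")


def equals_mask(word, key):
    """Return a word with 0x80 in every lane whose code equals key."""
    x = word ^ (ONES * key)          # broadcast key; equal lanes become 0x00
    t = (x & LOW7) + LOW7            # high bit set if any low 7 bits are set
    return ~(t | x | LOW7) & WORD_MASK   # 0x80 only where the lane was zero


def matching_lanes(word, key):
    """Decode the hit mask into lane indexes (for inspection only)."""
    hits = equals_mask(word, key)
    return [i for i in range(LANES) if (hits >> (i * LANE_BITS + 7)) & 1]


codes = [3, 7, 3, 0, 255, 3, 1, 2, 3, 9, 9, 9, 3, 4, 5, 6]
word = pack(codes)
print(matching_lanes(word, 3))
```

`equals_mask` touches all sixteen codes with a handful of whole-word operations and no per-value branch, which is the property that makes packed, fixed-width columnar codes attractive for SIMD scans.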
Intel® Xeon® Processors: Virtualization Performance

Greater Virtualization Efficiency:
• Intel QPI
• DDR3 Memory bandwidth and capacity
• Intel® VT: VT-x, VT-d, VT-c

Virtualization Performance [2]
[Chart: VMmark* Performance, not reproduced]




 1 Best published VMmark results as of 20 October 2010.
 See legal information slide, speaker notes and backup foils (if needed) for notes and disclaimers.
Third Generation of Database Technology
     According to IDC's article (Carl Olofson), Feb. 2010
     1st Generation:
      - Vendor proprietary databases: IMS, IDMS, Datacom
     2nd Generation:
      - RDBMS for Open Systems, dependent on disk layout, with limitations in scalability and disk I/O
      - Database tuning by updating statistics, creating/dropping indexes, data partitioning, summary tables & cubes, forcing query plans, resource governing
     3rd Generation: IDC predicts that within 5 years:
     • Most data warehouses will be stored in a columnar fashion
     • Most OLTP databases will either be augmented by an in-memory database (IMDB) or reside entirely in memory
     • Most large-scale database servers will achieve horizontal scalability through clustering
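
To make the first prediction concrete, here is a minimal sketch contrasting a row-wise layout with a columnar one for the same aggregate query. The table, column names and values are hypothetical and exist only for illustration.

```python
# Row-wise layout: one record per order, all columns stored together.
rows = [
    {"order_id": 1, "region": "EU", "amount": 120.0},
    {"order_id": 2, "region": "US", "amount": 80.0},
    {"order_id": 3, "region": "EU", "amount": 45.5},
    {"order_id": 4, "region": "APAC", "amount": 300.0},
]


def to_columns(records):
    """Pivot a list of row dicts into one list per column."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)


def total_amount_row_store(records, region):
    """Every full record is visited, even though only two fields matter."""
    return sum(r["amount"] for r in records if r["region"] == region)


def total_amount_column_store(cols, region):
    """Only the 'region' and 'amount' columns are read; 'order_id' is untouched."""
    return sum(
        amount
        for reg, amount in zip(cols["region"], cols["amount"])
        if reg == region
    )


print(total_amount_row_store(rows, "EU"))
print(total_amount_column_store(columns, "EU"))
```

Both functions return the same total. The columnar version reads only the columns the query names, which is why warehouse-style scans and aggregates over wide tables tend to favor that layout.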


Informix Warehouse Accelerator
Step 1. Install, configure, start Informix
Step 2. Install, configure, start Accelerator
Step 3. Connect Studio to Informix & add accelerator
Step 4. Design, validate, deploy data mart
Step 5. Load data to accelerator
Ready for Queries

[Diagram: IBM Smart Analytics Studio, Informix Database Server, Informix Warehouse Accelerator and BI Applications, annotated with Steps 1 to 5 and "Ready"]




Informix Warehouse Accelerator
 3rd Generation Database Technology is Here

 What is it?
 The Informix Warehouse Accelerator (IWA) is a workload-optimized, appliance-like add-on that enables the integration of business insights into operational processes to drive winning strategies. It accelerates select queries, with unprecedented response times.

 How is it different?
 • Performance: Unprecedented response times to enable 'train of thought' analysis frequently blocked by poor query performance.
 • Integration: Connects to IDS through deep integration, providing transparency to all applications.
 • Self-managed workloads: Queries are executed in the most efficient way.
 • Transparency: Applications connected to IDS are entirely unaware of IWA.
 • Simplified administration: Appliance-like hands-free operations, eliminating many database tuning tasks.




      Breakthrough Technology Enabling New Opportunities
IWA Software Components
 •   Linux on Intel x86_64 (RHEL 5 or SUSE SLES 11)
 •   IDS 11.70 + IWA code modules including IDS Stored Procedures
     –   Linux on Intel (64 bit)
     –   AIX on Power (64 bit)
     –   HP-UX on Itanium (64 bit)
     –   Solaris on SPARC (64 bit)
 •   ISAO Studio Plug-in – GUI for Mart definition
 •   OnIWA – On Utilities for Monitoring IWA




INTEL/IWA: Breakthrough technologies for performance
     1. Large memory support
        64-bit computing; System x with MAX5 supports up to 6TB on a single SMP box; up to 640GB
        on each node of a BladeCenter. IWA: Compresses large datasets and keeps them in memory;
        totally avoids I/O.

     2. Large on-chip cache
        L1 cache is 64KB per core, L2 cache is 256KB per core and L3 cache is about 4-12 MB.
        Additional translation lookaside buffer (TLB). IWA: New algorithms to avoid pipeline
        flushing and to cache hash tables in L2/L3 cache.

     3. Frequency Partitioning
        IWA: Enabler for the effective parallel access of the compressed data for scanning.
        Horizontal and vertical partition elimination.

     4. Virtualization performance
        Lower overhead: Core micro-architecture enhancements, EPT, VPID, and end-to-end HW assist.
        IWA: Helps Informix and IWA to run and perform seamlessly in a virtualized environment.

     5. Hyperthreading
        2x logical processors; increases processor throughput and overall performance of threaded
        software. IWA: Does not exploit this since the software is written to avoid pipeline flushing.

     6. Single Instruction Multiple Data
        Specialized instructions for manipulating 128-bit data simultaneously. IWA: Compresses the
        data into deep columnar fashion optimized to exploit SIMD. Used in parallel predicate
        evaluation in scans.

     7. Multi-core, multi-node environment
        Nehalem has 8 cores and Westmere 10 cores. This trend is expected to continue. IWA:
        Parallelizes the scan, join, and group operations. Keeps copies of dimensions to avoid
        cross-node synchronization.

19
IWA: Multi-core and Multi-node environment


     Step 1. Submit SQL: applications and BI tools send SQL to Informix
             DB protocol: SQLI or DRDA
             Network: TCP/IP, SHM
     Step 2. Informix: query matching and redirection technology
             (the diagram also shows a local execution path inside Informix)
     Step 3. Offload SQL to the coordinator: DRDA over TCP/IP
     Step 4. Results return from the coordinator to Informix: DRDA over TCP/IP
     Step 5. Return results/describe/error to the application
             Database protocol: SQLI or DRDA
             Network: TCP/IP, SHM

     The coordinator drives four workers. Each worker holds compressed data in memory
     and keeps a memory image on disk.


20
IWA: Multi-core and Multi-node environment
     Step 1: SQL arrives from Informix at the coordinator
     Step 2: The coordinator sends the query to all the workers
     Step 3: Each worker scans, filters, joins and groups its own compressed in-memory data
     Step 4: The coordinator merges intermediate results and applies ORDER BY, FIRST N
     Step 5: The coordinator sends the results back to the Informix server
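The coordinator and worker steps on this slide follow a scatter-gather pattern. The sketch below is only an illustration of that pattern in Python, not IWA code: the partitions, the `worker_scan` and `coordinator` functions, and the sample rows are all hypothetical, and threads stand in for worker processes.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory partitions, one per worker: (region, amount) rows.
PARTITIONS = [
    [("east", 10), ("west", 5), ("east", 7)],
    [("west", 3), ("north", 8)],
    [("east", 1), ("north", 2), ("south", 9)],
    [("south", 4), ("west", 6)],
]

def worker_scan(rows, min_amount):
    """Step 3: scan, filter and group one partition into a partial result."""
    partial = Counter()
    for region, amount in rows:
        if amount >= min_amount:        # filter
            partial[region] += amount   # group and aggregate
    return partial

def coordinator(min_amount, first_n):
    """Steps 2 and 4: fan the query out, then merge, ORDER BY and FIRST N."""
    with ThreadPoolExecutor(max_workers=len(PARTITIONS)) as pool:
        partials = list(pool.map(lambda p: worker_scan(p, min_amount), PARTITIONS))
    merged = Counter()
    for partial in partials:
        merged.update(partial)          # Counter.update adds the partial sums
    # ORDER BY total DESC, region ASC, then keep the first N groups.
    ordered = sorted(merged.items(), key=lambda kv: (-kv[1], kv[0]))
    return ordered[:first_n]

top_regions = coordinator(min_amount=3, first_n=2)
print(top_regions)
```

Only the small per-group partial results travel back to the coordinator, which is why the merge step stays cheap compared with the scans.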



21
IWA: Multi-core and Multi-node environment



     [Figure: the query executor works from the dictionaries and the compressed, partitioned
      data, which is split into Cell 1, Cell 2 and Cell 3; each cell is handled by cores with
      their own cache, labeled "core + $ (HT)".]




  •   Cell is also the unit of processing, each cell…
      – Assigned to one core
      – Has its own hash table in cache (so no shared object that needs latching!)
  •   Main operator: SCAN over compressed, main-memory table
      – Do selections, GROUP BY, and aggregation as part of this SCAN
      – Only need to decompress for aggregation
  •   Response time ∝ (database size) / (# cores x # nodes)
      – Embarrassing Parallelism – little data exchange across nodes
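The proportionality on this slide can be turned into a quick back-of-the-envelope estimate. The function below is a hypothetical helper that assumes ideal linear scaling with no coordination overhead; the 500 GB size and the 32 versus 40 core counts are illustrative inputs, not measurements.

```python
def relative_response_time(db_size_gb, cores_per_node, nodes):
    """Idealized model: time is proportional to size / (cores x nodes)."""
    return db_size_gb / (cores_per_node * nodes)

# Hypothetical comparison: the same 500 GB on one node with 32 cores vs. 40 cores.
t32 = relative_response_time(500, cores_per_node=32, nodes=1)
t40 = relative_response_time(500, cores_per_node=40, nodes=1)
ratio = t40 / t32
print(f"expected time on 40 cores relative to 32 cores: {ratio:.2f}")
```

Under this model, moving from 32 to 40 cores would cut response time to about 80% of the original; real results depend on memory bandwidth and data skew across cells.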
Exploiting Larger Memory: Row Oriented Data Store
Each row stored sequentially


     • Optimized for record I/O
     • Fetch and decompress entire
      row, every time
     • Result –
       • Very efficient for
         transactional workloads
       • Not always efficient for
         analytical workloads




                                     If only a few columns are required, the complete row is still
                                                     fetched and decompressed


23
Exploiting Larger Memory: Data is Processed in Compressed Format

     • Within a register store, several columns
       are grouped together.
     • The sum of the widths of the compressed
       columns doesn't exceed a register-compatible
       width. This utilizes the full
       capabilities of a 64-bit system. It doesn't
       matter how many columns are placed within
       the register-wide data element.
     • It is beneficial to place commonly used
       columns within the same register-wide
       data element. But this requires dynamic
       knowledge about the executed workload
       (runtime statistics).
     • Having multiple columns within the same
       register-wide data element prevents
       ANDing of different results.


      Predicate evaluation is done against compressed data!
      The register store is an optimization of the column store approach where we try to make the best use
      of existing hardware. Reshuffling small data elements at runtime into a register is time consuming and can
      be avoided. The register store also delivers good vectorization capabilities.
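As a rough illustration of the register store idea, the sketch below packs several compressed column codes at fixed bit offsets into one 64-bit word and reads a single column back without touching the others. The column names, bit widths, and helper functions are assumptions made for this example; the slides do not describe IWA's actual layout.

```python
# Hypothetical compressed column widths in bits; together they must fit one 64-bit word.
WIDTHS = {"state": 5, "quarter": 4, "product": 6}

def build_layout(widths, word_bits=64):
    """Assign each column a fixed bit offset inside the register-wide word."""
    layout, offset = {}, 0
    for name, width in widths.items():
        layout[name] = (offset, width)
        offset += width
    if offset > word_bits:
        raise ValueError("compressed columns exceed the register width")
    return layout

def pack(codes, layout):
    """Combine the dictionary codes of one row into a single integer word."""
    word = 0
    for name, (offset, width) in layout.items():
        code = codes[name]
        if code >= (1 << width):
            raise ValueError(f"code for {name} does not fit in {width} bits")
        word |= code << offset
    return word

def unpack(word, layout, name):
    """Extract one column's code with a shift and a mask."""
    offset, width = layout[name]
    return (word >> offset) & ((1 << width) - 1)

LAYOUT = build_layout(WIDTHS)
row = pack({"state": 0b01001, "quarter": 0b1110, "product": 37}, LAYOUT)
print(row, unpack(row, LAYOUT, "quarter"))
```

Because the offsets are fixed for every row, no data has to be reshuffled at query time; the same shift and mask apply to every word.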



24
Exploiting Large memory: Compression: Frequency Partitioning

     [Figure: a Trade Info table (volume, product, origin country) with columns Vol, Prod, Origin.
      A histogram on Origin (number of occurrences) separates common values from rare values and
      yields the column partitions "China, USA", "GER, FRA, …" and "Rest". A histogram on Product
      separates the top 64 traded goods (6-bit code) from the rest. Combining the Origin and
      Product partitions splits the table into Cells 1 to 6.]

           •    Field lengths vary between cells
                   • Higher frequencies → shorter codes (approximate Huffman)
           •    Field lengths fixed within cells
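The code-length idea behind frequency partitioning can be sketched in a few lines. This is a simplified two-partition illustration, not the IWA algorithm: the `frequency_partition` function, the partition names, and the sample data are hypothetical.

```python
from collections import Counter
from math import ceil, log2

def frequency_partition(values, top_n):
    """Split distinct values into a 'common' and a 'rest' partition by frequency
    and give each partition its own fixed-width dictionary codes."""
    ranked = [value for value, _ in Counter(values).most_common()]
    partitions = {"common": ranked[:top_n], "rest": ranked[top_n:]}
    result = {}
    for name, members in partitions.items():
        # Fixed code width within a partition: enough bits to number its members.
        bits = max(1, ceil(log2(len(members)))) if members else 0
        result[name] = {"bits": bits, "codes": {v: i for i, v in enumerate(members)}}
    return result

# Hypothetical origin-country column with a skewed distribution.
origins = ["China"] * 50 + ["USA"] * 30 + ["GER"] * 10 + ["FRA"] * 5 + ["BRA", "IND", "JPN"]
parts = frequency_partition(origins, top_n=2)
print(parts["common"]["bits"], parts["rest"]["bits"])
```

The frequent values share a short code, and the rare values pay for longer codes only in the cells where they occur. With 64 common values the code width comes out to 6 bits, matching the "top 64 traded goods" example on the slide.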


25
IWA: SIMD: Register Stores Facilitate SIMD Parallelism
• Access only the banks referenced in the query (like a column store):
     –SELECT SUM (T.G)
     –FROM     T
     –WHERE T.A > 5
     –GROUP BY T.D
• Pack multiple rows from the same bank into the 128-bit register
• Enables yet another layer of parallelism: SIMD (Single-Instruction, Multiple-Data)!

     [Figure: a cell block holding rows 1 to 4 in three banks: Bank β1 (32 bits) with columns
      A, D, G; Bank β2 (32 bits) with columns B, E, F; Bank β3 (16 bits) with columns C, H.
      Four 32-bit operands from the same bank are packed into one 128-bit register, and a single
      vector operation produces Result1 to Result4.]

26
IWA:SIMD: Simultaneous Evaluation of Equality Predicates


     • CPU operates on 128-bit units
       • Lots of fields fit in 128 bits
     • These fields are at fixed offsets
     • Apply predicates to all columns simultaneously!

     Example: State == 'CA' && Quarter == 'Q4'
     Translate the value query to a code query: State == 01001 && Quarter == 1110

                        State     …      Quarter    …
       Row                …       …         …       …
       & Mask           11111     0       1111      0
       == Selection     01001     0       1110      0      → result
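The mask-and-compare step can be emulated with plain integers. The sketch below treats a Python integer as a 128-bit register holding four 32-bit rows and applies one AND and one XOR to all rows at once; the bit positions, helper names, and sample rows are assumptions for illustration, and real SIMD hardware would do the per-lane comparison with a vector instruction rather than a loop.

```python
LANE_BITS = 32
LANES = 4                      # 4 x 32 bits emulate one 128-bit register
QUARTER_SHIFT = 5              # assumed layout: State in bits 0-4, Quarter in bits 5-8
STATE_MASK = 0b11111
QUARTER_MASK = 0b1111 << QUARTER_SHIFT

def make_row(state, quarter, other=0):
    """Pack one row: state code, quarter code, then unrelated column bits."""
    return state | (quarter << QUARTER_SHIFT) | (other << 9)

def broadcast(value):
    """Repeat a 32-bit pattern in every lane of the emulated register."""
    return sum(value << (i * LANE_BITS) for i in range(LANES))

def select(register, state_code, quarter_code):
    """Evaluate State == code AND Quarter == code for all lanes together."""
    mask = broadcast(STATE_MASK | QUARTER_MASK)
    wanted = broadcast(make_row(state_code, quarter_code))
    diff = (register & mask) ^ wanted          # one AND and one XOR for all lanes
    lane_mask = (1 << LANE_BITS) - 1
    return [((diff >> (i * LANE_BITS)) & lane_mask) == 0 for i in range(LANES)]

rows = [
    make_row(0b01001, 0b1110, other=7),   # CA, Q4: matches
    make_row(0b01001, 0b0001),            # CA, other quarter
    make_row(0b00010, 0b1110),            # other state, Q4
    make_row(0b01001, 0b1110),            # CA, Q4: matches
]
register = sum(r << (i * LANE_BITS) for i, r in enumerate(rows))
hits = select(register, 0b01001, 0b1110)
print(hits)
```

Both predicates are checked by the same pair of operations because the mask zeroes out every bit that does not belong to State or Quarter.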
27
Exploiting Large on-chip Cache


•Encoding makes grouping simple!
    –Coded values assigned densely (by construction)
    –Hence, in principle, grouping is simple: aggTable[group] += aggValue
•Challenges:
    –Fitting hash table in L2 cache
    –Avoiding all branches in hash table lookup
•IWA adaptively uses one of 2 techniques, depending on # of distinct groups
    1.Use dictionary code as a perfect hash (i.e. collision-free), OR
         •aggTable[groupCode] += aggValue
         •No branches, no hash function computation
         •Works great if groupCode is dense
        – i.e., single column, or multiple column with little correlation
    2.Use usual linear probing
        •Involves branches, random access, …
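The two grouping techniques named on this slide can be contrasted in a short sketch. Both functions are illustrative assumptions rather than IWA code; the probing version also assumes the table has more slots than distinct keys, otherwise the probe loop would not terminate.

```python
def aggregate_dense(group_codes, values, num_codes):
    """Technique 1: the dictionary code is the array index (a perfect hash)."""
    agg_table = [0] * num_codes
    for code, value in zip(group_codes, values):
        agg_table[code] += value           # no branch, no hash computation
    return agg_table

def aggregate_probing(group_keys, values, num_slots):
    """Technique 2: open addressing with linear probing for sparse keys.
    Assumes num_slots is larger than the number of distinct keys."""
    keys = [None] * num_slots
    sums = [0] * num_slots
    for key, value in zip(group_keys, values):
        slot = hash(key) % num_slots
        while keys[slot] is not None and keys[slot] != key:
            slot = (slot + 1) % num_slots  # collision: probe the next slot
        keys[slot] = key
        sums[slot] += value
    return {k: s for k, s in zip(keys, sums) if k is not None}

dense = aggregate_dense([0, 2, 1, 2, 0], [10, 20, 30, 40, 50], num_codes=3)
sparse = aggregate_probing([(1, 7), (3, 2), (1, 7), (9, 9)], [5, 6, 7, 8], num_slots=8)
print(dense, sparse)
```

The dense version touches a small contiguous array that can stay in cache, while the probing version pays for a hash computation, data-dependent branches, and scattered memory access.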
Case Study #1: U.S. Government Agency




29
Case Study #2: Datamart at a Government Agency

     • A MicroStrategy report was run, which generates
         • 667 SQL statements, of which 537 were SELECT statements
     • The datamart for this report has 250 tables and 30 GB of data
     • The original report on XPS and a Sun SPARC M9000 took 90 minutes
     • With IDS 11.7 on a Linux Intel box, it took 40 minutes
     • With IWA, it took 67 seconds.




30
Case Study #3: Skechers, USA. Shoe Retailer
 •   Top 7 time-consuming queries in Retail BI and Warehouse:
      (Against 1 Billion rows Fact Tables)

              Query           IDS 11.5              IDS 11.7 IWA
                1             22 mins                  4 secs
                2           1 min 3 secs               2 secs
                3          3 mins 40 secs              2 secs
                4          30 mins & up                4 secs
                5             2 mins                   2 secs
                6             30 mins                  2 secs
                7          45 mins & up                2 secs



     Query acceleration 30x to 1400x; average acceleration 450x
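The acceleration figures can be checked against the timings in the table. The short calculation below transcribes the seven rows into seconds; entries marked "& up" use their lower bound, which is an assumption, so the true speedups for queries 4 and 7 may be higher.

```python
# (IDS 11.5 seconds, IDS 11.7 with IWA seconds) per query, from the table above.
TIMINGS = {
    1: (22 * 60, 4),
    2: (63, 2),
    3: (3 * 60 + 40, 2),
    4: (30 * 60, 4),      # "30 mins & up": lower bound
    5: (2 * 60, 2),
    6: (30 * 60, 2),
    7: (45 * 60, 2),      # "45 mins & up": lower bound
}

speedups = {query: before / after for query, (before, after) in TIMINGS.items()}
mean_speedup = sum(speedups.values()) / len(speedups)

for query, factor in speedups.items():
    print(f"query {query}: {factor:.1f}x")
print(f"min {min(speedups.values()):.1f}x, max {max(speedups.values()):.1f}x, "
      f"mean {mean_speedup:.1f}x")
```

With these lower bounds the speedups range from about 31x to 1350x with a mean near 462x, consistent with the rounded figures quoted on the slide.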
31
Systems Tested
 • 4S Intel® Xeon® 7560 (whitebox)
     – 2.26 GHz 8C CPU
 • 4S Intel® Xeon® E7 4870 (whitebox)
     – 2.40 GHz 10C CPU
     – 256GB 1066MHz DDR3 memory
 • 8S Intel® Xeon® E7 7560 (IBM® System x3850 X5)
     – 2.26 GHz 8C CPU
     – 2TB 1066MHz DDR3 memory




32
POPS schema

     [Schema diagram: daily_sales (350 million rows) and daily_forecast (1 billion rows),
      linked to Customer, Product, Store, Promotion and Period]




33
Systems Tested
 • 8S Intel® Xeon® E7 7560 (IBM® System x3850 X5)
     – 2.26 GHz 8C CPU
     – 2TB 1066MHz DDR3 memory




37
500 GB SSED
Store Sales ER-Diagram


     [ER diagram; table row counts shown: 73,049; 402; 204,000; 4,594,771,672; 86,400; 1000;
      1,920,800; 1,000,000; 7200; 20; 2,000,000]
    IWA1    IWA2    IWA3    IWA4     IWA AVG      IDS1      IDS2      IDS3       IDS AVG  Improvement

  109046  104246   92653   97666   100902.75   3294554   3338352   3341873   3324926.333  3295.179104
   31190   27175   26927   27417    28177.25   1538219   1538364   1538959       1538514  5460.128295
   93377   97192   95638   92691     94724.5   1910772   1884782   1899916       1898490  2004.222772
  119587  117053  117513  117902   118013.75   1765145   1722746   1690400       1726097  1462.623635
   37587   33551   35579   31651       34592   3167302   3173656   3150876   3163944.667  9146.463537
   28228   29301   24602   29846    27994.25   1525738   1526089   1528724   1526850.333  5454.156955
   27644   28075   30083   29362       28791   2201956   2211549   2517291   2310265.333  8024.262212
  119871  123030  123593  117572    121016.5   5963515   6044626   5947525       5985222  4945.790037
   38346   46412   44463   44918    43534.75   1578035   1557525   1544912   1560157.333  3583.705737
   48450   46470   50032   43668       47155   1526529   1547404   1563874   1545935.667  3278.413035
   43823   42441   45837   43215       43829  21990513  22354449  21903105      22082689  50383.73908
   47400   46582   46573   47031     46896.5   2251672   2278167   2281946       2270595  4841.715267
   56961   58315   56437   60119       57958   5295930   5310507   5325095   5310510.667  9162.687923
    9037    9132    8724    9083        8994   2523942   2529234   2522585   2525253.667  28077.09214
   47062   52354   51374   49932     50180.5   1546319   1570163   1568083   1561521.667    3111.8097
   47643   50415   55660   52788     51626.5   2274649   2264463   2269677   2269596.333  4396.184776
   85154   85711   83824   91692    86595.25   1620173   1656098   1606029   1627433.333  1879.356354
   59766   59341   55436   58522    58266.25   5311906   5307202   5266918       5295342   9088.18055
    8230    8207    8054    8115      8151.5   2159777   2179435   2181312       2173508  26663.90235
  152764  152408  149153  151100   151356.25   2050590   2065049   2060862   2058833.667  1360.256789
   30991   29582   27391   24197    28040.25   2025557   2037336   2040515   2034469.333  7255.532077
  141504  145702  142908  139664    142444.5   5363204   5165693   5393336       5307411  3725.950107

 1383661 1392695 1372454 1368151  1379240.25                                    79262889  (column totals)
                                                             ArithMean of Improvement:   8936.42511
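The Improvement column in the results table appears to be the IDS average divided by the IWA average, expressed as a percentage. The sketch below reproduces the first row under that reading; the function name is hypothetical and the slides do not state the unit of the raw timings.

```python
def improvement(iwa_runs, ids_runs):
    """Return (IWA average, IDS average, IDS/IWA ratio as a percentage)."""
    iwa_avg = sum(iwa_runs) / len(iwa_runs)
    ids_avg = sum(ids_runs) / len(ids_runs)
    return iwa_avg, ids_avg, 100.0 * ids_avg / iwa_avg

# First row of the table: four IWA runs and three IDS runs.
iwa_avg, ids_avg, pct = improvement(
    [109046, 104246, 92653, 97666],
    [3294554, 3338352, 3341873],
)
print(f"IWA avg {iwa_avg:.2f}, IDS avg {ids_avg:.3f}, improvement {pct:.3f}%")
```

Read this way, a value of about 3295 means the IWA runs were roughly 33 times faster than the IDS runs for that query.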
Thank You!
     Your Feedback is Important to Us
     • Access your personal session survey list and complete via SmartSite
        – Your smart phone or web browser at: iodsmartsite.com
        – Any SmartSite kiosk onsite
        – Each completed session survey increases your chance to win
           an Apple iPod Touch with daily drawing sponsored by Alliance
           Tech


                 Session Number 2864


41
Thank you!

More Related Content

What's hot

IBM System x en BladeCenter overzicht (june 2012)
IBM System x en BladeCenter overzicht (june 2012)IBM System x en BladeCenter overzicht (june 2012)
IBM System x en BladeCenter overzicht (june 2012)ibmserverblog
 
9sept2009 concept electronics
9sept2009 concept electronics9sept2009 concept electronics
9sept2009 concept electronicsAgora Group
 
Infrastruttura Efficiente Di Sun E Amd -Virtualise with Confidence
Infrastruttura Efficiente Di Sun E Amd -Virtualise with ConfidenceInfrastruttura Efficiente Di Sun E Amd -Virtualise with Confidence
Infrastruttura Efficiente Di Sun E Amd -Virtualise with ConfidenceWalter Moriconi
 
Presentation from physical to virtual to cloud emc
Presentation   from physical to virtual to cloud emcPresentation   from physical to virtual to cloud emc
Presentation from physical to virtual to cloud emcxKinAnx
 
Achieving Lowest Latencies at Highest Message Rates: Solarflare & Intel webcast
Achieving Lowest Latencies at Highest Message Rates: Solarflare & Intel webcastAchieving Lowest Latencies at Highest Message Rates: Solarflare & Intel webcast
Achieving Lowest Latencies at Highest Message Rates: Solarflare & Intel webcastfinteligent
 
Cots moves to multicore: AMD
Cots moves to multicore: AMDCots moves to multicore: AMD
Cots moves to multicore: AMDKonrad Witte
 
Embedded Solutions 2010: Intel Multicore by Eastronics
Embedded Solutions 2010:  Intel Multicore by Eastronics Embedded Solutions 2010:  Intel Multicore by Eastronics
Embedded Solutions 2010: Intel Multicore by Eastronics New-Tech Magazine
 
NUMA Performance Considerations in VMware vSphere
NUMA Performance Considerations in VMware vSphereNUMA Performance Considerations in VMware vSphere
NUMA Performance Considerations in VMware vSphereAMD
 
Sun fire x4100 m2, x4200 m2 server customer presentation
Sun fire x4100 m2, x4200 m2 server customer presentationSun fire x4100 m2, x4200 m2 server customer presentation
Sun fire x4100 m2, x4200 m2 server customer presentationxKinAnx
 
Tech Ed09 India Ver M New
Tech Ed09 India Ver M NewTech Ed09 India Ver M New
Tech Ed09 India Ver M Newrsnarayanan
 
IBM Storwize V7000 — unikátní virtualizační diskové pole
IBM Storwize V7000 — unikátní virtualizační diskové poleIBM Storwize V7000 — unikátní virtualizační diskové pole
IBM Storwize V7000 — unikátní virtualizační diskové poleJaroslav Prodelal
 
2013 02 08 annunci power 7 plus sito cta
2013 02 08 annunci power 7 plus sito cta2013 02 08 annunci power 7 plus sito cta
2013 02 08 annunci power 7 plus sito ctaLorenzo Corbetta
 
Nevmug Amd January 2009
Nevmug   Amd January 2009 Nevmug   Amd January 2009
Nevmug Amd January 2009 csharney
 
Vnx series-technical-review-110616214632-phpapp02
Vnx series-technical-review-110616214632-phpapp02Vnx series-technical-review-110616214632-phpapp02
Vnx series-technical-review-110616214632-phpapp02Newlink
 
IBM Solid State in eX5 servers
IBM Solid State in eX5 serversIBM Solid State in eX5 servers
IBM Solid State in eX5 serversTony Pearson
 

What's hot (20)

IBM System x en BladeCenter overzicht (june 2012)
IBM System x en BladeCenter overzicht (june 2012)IBM System x en BladeCenter overzicht (june 2012)
IBM System x en BladeCenter overzicht (june 2012)
 
9sept2009 concept electronics
9sept2009 concept electronics9sept2009 concept electronics
9sept2009 concept electronics
 
Infrastruttura Efficiente Di Sun E Amd -Virtualise with Confidence
Infrastruttura Efficiente Di Sun E Amd -Virtualise with ConfidenceInfrastruttura Efficiente Di Sun E Amd -Virtualise with Confidence
Infrastruttura Efficiente Di Sun E Amd -Virtualise with Confidence
 
Big Data Smarter Networks
Big Data Smarter NetworksBig Data Smarter Networks
Big Data Smarter Networks
 
EMC - 8sept2011
EMC - 8sept2011EMC - 8sept2011
EMC - 8sept2011
 
Presentation from physical to virtual to cloud emc
Presentation   from physical to virtual to cloud emcPresentation   from physical to virtual to cloud emc
Presentation from physical to virtual to cloud emc
 
Achieving Lowest Latencies at Highest Message Rates: Solarflare & Intel webcast
Achieving Lowest Latencies at Highest Message Rates: Solarflare & Intel webcastAchieving Lowest Latencies at Highest Message Rates: Solarflare & Intel webcast
Achieving Lowest Latencies at Highest Message Rates: Solarflare & Intel webcast
 
6dec2011 - DELL
6dec2011 - DELL6dec2011 - DELL
6dec2011 - DELL
 
Cots moves to multicore: AMD
Cots moves to multicore: AMDCots moves to multicore: AMD
Cots moves to multicore: AMD
 
Embedded Solutions 2010: Intel Multicore by Eastronics
Embedded Solutions 2010:  Intel Multicore by Eastronics Embedded Solutions 2010:  Intel Multicore by Eastronics
Embedded Solutions 2010: Intel Multicore by Eastronics
 
NUMA Performance Considerations in VMware vSphere
NUMA Performance Considerations in VMware vSphereNUMA Performance Considerations in VMware vSphere
NUMA Performance Considerations in VMware vSphere
 
ThinkServer TS430
ThinkServer TS430ThinkServer TS430
ThinkServer TS430
 
Sun fire x4100 m2, x4200 m2 server customer presentation
Sun fire x4100 m2, x4200 m2 server customer presentationSun fire x4100 m2, x4200 m2 server customer presentation
Sun fire x4100 m2, x4200 m2 server customer presentation
 
Tech Ed09 India Ver M New
Tech Ed09 India Ver M NewTech Ed09 India Ver M New
Tech Ed09 India Ver M New
 
IBM Storwize V7000 — unikátní virtualizační diskové pole
IBM Storwize V7000 — unikátní virtualizační diskové poleIBM Storwize V7000 — unikátní virtualizační diskové pole
IBM Storwize V7000 — unikátní virtualizační diskové pole
 
2013 02 08 annunci power 7 plus sito cta
2013 02 08 annunci power 7 plus sito cta2013 02 08 annunci power 7 plus sito cta
2013 02 08 annunci power 7 plus sito cta
 
Nevmug Amd January 2009
Nevmug   Amd January 2009 Nevmug   Amd January 2009
Nevmug Amd January 2009
 
Sun Microsystems
Sun MicrosystemsSun Microsystems
Sun Microsystems
 
Vnx series-technical-review-110616214632-phpapp02
Vnx series-technical-review-110616214632-phpapp02Vnx series-technical-review-110616214632-phpapp02
Vnx series-technical-review-110616214632-phpapp02
 
IBM Solid State in eX5 servers
IBM Solid State in eX5 serversIBM Solid State in eX5 servers
IBM Solid State in eX5 servers
 

Similar to Performance and scalability of Informix ultimate warehouse edtion on Intel Xeon 7500 and E7 processors

Intel_Embedded Intel Core Processors Do More Now and in the Future
Intel_Embedded Intel Core Processors Do More Now and in the FutureIntel_Embedded Intel Core Processors Do More Now and in the Future
Intel_Embedded Intel Core Processors Do More Now and in the FutureIşınsu Akçetin
 
AMD Opteron 6200 and 4200 Series Presentation
AMD Opteron 6200 and 4200 Series PresentationAMD Opteron 6200 and 4200 Series Presentation
AMD Opteron 6200 and 4200 Series PresentationAMD
 
What's under the hood of Exadata X2-2 and X2-8?
What's under the hood of Exadata X2-2 and X2-8?What's under the hood of Exadata X2-2 and X2-8?
What's under the hood of Exadata X2-2 and X2-8?Enkitec
 
Intel(R)Core(Tm)I7 Desktop Processor Product Brief
Intel(R)Core(Tm)I7 Desktop Processor Product BriefIntel(R)Core(Tm)I7 Desktop Processor Product Brief
Intel(R)Core(Tm)I7 Desktop Processor Product BriefOscar del Rio
 
Ibm and Erb's Presentation Insider's Edition Event . September 2010
Ibm and Erb's Presentation Insider's Edition Event .  September 2010Ibm and Erb's Presentation Insider's Edition Event .  September 2010
Ibm and Erb's Presentation Insider's Edition Event . September 2010Erb's Marketing
 
Webinář: Dell VRTX - datacentrum vše-v-jednom za skvělou cenu / 7.10.2013
Webinář: Dell VRTX - datacentrum vše-v-jednom za skvělou cenu / 7.10.2013Webinář: Dell VRTX - datacentrum vše-v-jednom za skvělou cenu / 7.10.2013
Webinář: Dell VRTX - datacentrum vše-v-jednom za skvělou cenu / 7.10.2013Jaroslav Prodelal
 
云计算核心技术架构分论坛 一石三鸟 性能 功耗及成本
云计算核心技术架构分论坛 一石三鸟 性能 功耗及成本云计算核心技术架构分论坛 一石三鸟 性能 功耗及成本
云计算核心技术架构分论坛 一石三鸟 性能 功耗及成本Riquelme624
 
IBM: Servery a datová úložiště vhodná pro virtualizaci a budování privátních ...
IBM: Servery a datová úložiště vhodná pro virtualizaci a budování privátních ...IBM: Servery a datová úložiště vhodná pro virtualizaci a budování privátních ...
IBM: Servery a datová úložiště vhodná pro virtualizaci a budování privátních ...Jaroslav Prodelal
 
Opti plex family-one-pager
Opti plex family-one-pagerOpti plex family-one-pager
Opti plex family-one-pagerArsalan Qureshi
 
Workload consolidation on ATCA with the advantech mic 5333 universal platform
Workload consolidation on ATCA with the advantech mic 5333 universal platformWorkload consolidation on ATCA with the advantech mic 5333 universal platform
Workload consolidation on ATCA with the advantech mic 5333 universal platformPaul Stevens
 
Citrix Xen Desktop Solution White Paper
Citrix Xen Desktop Solution White PaperCitrix Xen Desktop Solution White Paper
Citrix Xen Desktop Solution White PaperReadWriteEnterprise
 
Filename intelvmwaresolutionbrief asset4
Filename intelvmwaresolutionbrief asset4Filename intelvmwaresolutionbrief asset4
Filename intelvmwaresolutionbrief asset4ReadWrite
 
Intel Roadmap 2010
Intel Roadmap 2010Intel Roadmap 2010
Intel Roadmap 2010Umair Mohsin
 
Webinář: Provozujte datacentrum v kanceláři (Dell VRTX) / 5.9.2013
Webinář: Provozujte datacentrum v kanceláři (Dell VRTX) / 5.9.2013Webinář: Provozujte datacentrum v kanceláři (Dell VRTX) / 5.9.2013
Webinář: Provozujte datacentrum v kanceláři (Dell VRTX) / 5.9.2013Jaroslav Prodelal
 
Fujitsu World Tour 2017 - Compute Platform For The Digital World
Fujitsu World Tour 2017 - Compute Platform For The Digital WorldFujitsu World Tour 2017 - Compute Platform For The Digital World
Performance and scalability of Informix Ultimate Warehouse Edition on Intel Xeon 7500 and E7 processors

  • 1. Performance and Scalability of Informix® Ultimate Warehouse Edition on Intel Xeon® 7500 and E7 processors Session Number 2864 Keshava Murthy, IBM® Jantz Tran, Intel®
  • 2. Agenda • Intel Inside • IWA Overview • Key performance features in Intel • How IWA is exploiting the Intel features • Performance results
  • 3. Tick-Tock Development Model: Sustained Xeon® Microprocessor Leadership. Process nodes (alternating tick and tock): 65nm, 45nm, 32nm, 22nm. Products: Xeon® 5100, Xeon® 5300, Xeon® 7400, Xeon® 7500, Xeon® E7, Sandy Bridge-EP/EN, Ivy Bridge-EP/EN. • Intel® Core™ Microarchitecture: first high-volume server quad-core CPUs; dedicated high-speed bus per CPU; HW-assisted virtualization (VT-x) • Nehalem/Westmere Microarchitecture: up to 10 cores and 30MB cache; integrated memory controller with DDR3 support; Turbo Boost, Intel HT, AES-NI; end-to-end HW-assisted virtualization (VT-x, -d, -c) • Sandy Bridge/Ivy Bridge Microarchitecture: up to 8 cores and 20MB cache; integrated PCI Express; Turbo Boost 2.0; Intel Advanced Vector Extensions (AVX)
  • 4. Intel® Xeon® Processor Family for Business (tiers arranged by increasing capability) • Small Business: economical and more dependable vs. desktop. Intel® Xeon® processor 3000 sequence platforms (E3 in 2012): economical (1-way), dependable general-purpose 64-bit servers well suited for small businesses and education, with features that optimize performance, uptime, and security. Entry Servers and Workstations: more features and performance than traditional desktop systems. • Mainstream Enterprise: best combination of performance, power efficiency, and cost. Intel® Xeon® processor 5000 sequence platforms (E5 in 2012): versatile (up to 2-way) servers for all your infrastructure, high-density, workstation and HPC applications, with features that enable optimal performance and power efficiency for the data center. Enterprise Server: versatility for infrastructure apps (up to 4S). Cloud Computing: efficient, secure, and open platforms for Internet datacenters and IaaS. High Performance Computing & Workstations: bandwidth-optimized for high-performance analytics & visualization. • Scalable Enterprise: top-of-the-line performance, scalability, and reliability. Intel® Xeon® processor E7 platforms: scalable (up to 256-way), reliable, powerful 64-bit multi-core servers offering industry-leading performance, expanded memory & I/O capacity, and advanced reliability, ideal for the most demanding enterprise and mission-critical workloads, large-scale virtualization and large-node HPC applications. Mission Critical: performance and reliability for the most business-critical workloads with outstanding economics. Cloud Computing: highest virtualization density and advanced reliability for private cloud. High Performance Computing: greater scaling and memory capacity.
  • 5. Intel® Xeon® Processor E7-8800/4800/2800 Product Families: Building on Xeon® 7500 Leadership Capabilities • More Performance: 10 cores / 20 threads; 30MB of last level cache • More Expandable: supports 32GB DDR3 DIMMs (2TB per 4-socket system) [1] • More Efficient: more performance within same max CPU TDP as Xeon 7500; lower partial active & idle power via Intel Intelligent Power Technology [2]; support for Low Voltage DIMMs [3]; reduced power memory buffers [4] • More Security & RAS: Security: Intel® Advanced Encryption Standard-New Instructions; Intel® Trusted Execution Technology (TXT). Reliability, Availability, Serviceability: Enhanced DRAM Double Device Data Correction; Fine Grained Memory Mirroring. Delivers more Performance, Expandability and RAS while improving Energy Efficiency. [1] Up to 64 slots per standard 4-socket system x 32GB/DIMM = 2TB. [2] Uses similar core and package C6 power states enabled on Intel Xeon 5500/5600 series processors. Requires OS support. [3] Savings dependent on workload and configuration. [4] Memory buffer power savings of up to 1.3W active and 3W idle per buffer per Intel estimates. Slightly more savings when used with LV DIMMs.
  • 6. Advantages of the Xeon® E7 Platform. 4-socket systems can process the biggest workloads, maximize consolidation, increase system uptime, and handle highly variable workloads. Intel® Xeon® Processor E7-4800 Product Family vs. Xeon® Processor 5600 Series: • Large Workloads & Max. Consolidation: over 2X the compute performance across a range of benchmarks [1]; up to 7X memory capacity for greater performance, headroom and memory DIMM savings [2]; up to 2X higher consolidation [3] • Mission Critical Class System Availability: protects your data by preventing errors; increased availability via healing, redundancy and failover technologies; minimized downtime via failure prediction and proactive replacement of failing components • Highly Variable Workloads: more performance headroom to handle peak, unexpected, or underestimated workloads; compute, memory and I/O scalability extends useful server life in high-growth workloads; denser compute resources per server maximizes performance in constrained sites. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. [1] Results have been estimated based on internal Intel analysis and are provided for informational purposes only. Any difference in system hardware or software design or configuration may affect actual performance. For more information on performance tests and on the performance of Intel products, visit http://www.intel.com/performance/resources/limits.htm [2] 64 DIMM slots vs. 18 slots for the Xeon 5600 processor series platform. [3] 2X higher consolidation refresh ratio based on ROI tool comparing Xeon 7500 and Xeon 5600 vs. older generations.
  • 7. Advanced Reliability Starts With Silicon: Intel® Xeon® processor E7 family RAS Capabilities • Memory: Inter-socket Memory Mirroring; Intel® Scalable Memory Interconnect (Intel® SMI) Lane Failover; Intel® SMI Clock Fail Over; Intel® SMI Packet Retry; Memory Address Parity; Failed DIMM Isolation; Memory Board Hot Add/Remove; Dynamic Memory Migration*; OS Memory On-lining*; Recovery from Single DRAM Device Failure (SDDC) plus random bit error; Memory Thermal Throttling; Demand and Patrol scrubbing; Fail Over from Single DRAM Device Failure (SDDC); Enhanced DRAM Double Device Data Correction; Fine Grained Memory Mirroring; Memory DIMM and Rank Sparing; Intra-socket Memory Mirroring; Mirrored Memory Board Hot Add/Remove • I/O Hub: Physical IOH Hot Add; OS IOH On-lining*; PCI-E Hot Plug • CPU/Socket: Machine Check Architecture (MCA) recovery (MCA-R); Corrected Machine Check Interrupt (CMCI); Corrupt Data Containment Mode; Viral Mode; OS Assisted Processor Socket Migration*; OS CPU On-lining*; CPU Board Hot Add at QPI; Electronically Isolated (Static) Partitioning; Single Core Disable for Fault Resilient Boot • Intel® QuickPath Interconnect: Intel QPI Packet Retry; Intel QPI Protocol Protection via CRC (8-bit or 16-bit rolling); QPI Clock Fail Over; QPI Self-Healing. Advanced reliability features work to maintain data integrity.
  • 8. Intel® Xeon® processor E5-2600 product family (Sandy Bridge-EP): new micro-architecture on the 32nm process technology. Platform features: • Higher performance: up to 8 cores, 20 MB cache; new Intel® Advanced Vector Extensions; optimized Turbo Boost Technology • More Efficient: lower platform power [1]; optimized Turbo Boost • More Intelligent: Intel Node Manager enhancements • More Secure: Intel AES-NI improvements; more robust Intel TXT solutions • More Options: optimized platforms for performance, smaller form factors, best value. Diagram labels: Sandy Bridge-EP, up to 8 cores; up to 2 QPI links between CPUs; up to 4 channels of DDR3 1600 memory; integrated PCI Express* 3.0, up to 40 lanes per socket. [1] Lower platform power claim based on a Xeon® 5600 CPU and Sandy Bridge-EP CPU with the same TDP specification and comparable platform configurations. Platform power reduction is primarily attributed to TDP reduction from a two-chip solution based on the Intel 5520 chipset and ICH-10R, down to a one-chip south bridge solution (Patsburg chip) on the Sandy Bridge platform.
  • 9. INTEL: Breakthrough technologies for performance. 1. Large memory support: 64-bit computing; System x with MAX5 supports up to 6TB on a single SMP box; up to 640GB on each node of BladeCenter. 2. Large on-chip cache: L1 cache is 64KB per core, L2 cache is 256KB per core and L3 cache is about 24-30 MB. Additional translation lookaside buffer (TLB). 3. Frequency Partitioning: enabler for the effective parallel access of the compressed data for scanning. Horizontal and vertical partition elimination. 4. Virtualization Performance: lower overhead: Core micro-architecture enhancements, EPT, VPID, and end-to-end HW assist. 5. Hyperthreading: 2x logical processors; increases processor throughput and overall performance of threaded software. 6. Single Instruction Multiple Data: specialized instructions for manipulating 128-bit data simultaneously. 7. Multi-core, multi-node environment: Nehalem has 8 cores and Westmere 10 cores. This trend is expected to continue.
  • 10. Intel® Xeon® E7 Processor Architecture. Diagram: Cores 0-9, each with its own L1 and L2 cache, around a shared L3; 2 IMCs; QPI (4 links). Cache Architecture: • 64K L1 cache • 256K L2 cache • 30MB 10-slice shared last-level cache (L3) (compared to 24MB 8-slice L3 on Xeon® 7500) • 2 integrated memory controllers • Scalable Memory Interconnect (SMI) with support for up to 8 DDR channels • 4 QuickPath Interconnect (QPI) system interconnect links
  • 11. Intel QuickPath Architecture. Diagram: four 7500/E7 CPUs, each with memory buffers (MB), linked by Intel® QuickPath interconnects to two Boxboro IOHs. • Connectivity – Fully connected by 4 Intel® QuickPath interconnects per socket – 6.4, 5.86, or 4.8 GT/s on all links – With 2 IOHs: 82 PCIe lanes (72 Gen2 Boxboro lanes + 4 Gen1 lanes on unused ESI port + 6 Gen1 ICH10 lanes) – PCI-E Gen 2.0 • Memory – Registered DDR3 800/1066 MHz via on-board memory buffer – 64 DIMM support (4:1 DIMM to buffer ratio)
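
The lane and DIMM figures on this slide can be checked with a few lines of arithmetic. The sketch below only restates the slide's numbers; the variable names are illustrative, and the socket count of 4 is an assumption taken from the four CPUs shown in the slide's diagram.

```python
# Figures quoted on the slide (Xeon 7500/E7 platform with 2 IOHs).
GEN2_BOXBORO_LANES = 72   # PCIe Gen2 lanes from the two Boxboro IOHs
GEN1_ESI_LANES = 4        # Gen1 lanes on the unused ESI port
GEN1_ICH10_LANES = 6      # Gen1 lanes from the ICH10
DIMM_SLOTS = 64           # "64 DIMM support"
DIMMS_PER_BUFFER = 4      # the slide's 4:1 DIMM-to-buffer ratio
SOCKETS = 4               # assumption: the 4-socket layout in the diagram

total_pcie_lanes = GEN2_BOXBORO_LANES + GEN1_ESI_LANES + GEN1_ICH10_LANES
memory_buffers = DIMM_SLOTS // DIMMS_PER_BUFFER
buffers_per_socket = memory_buffers // SOCKETS

print(f"PCIe lanes: {total_pcie_lanes}")
print(f"Memory buffers: {memory_buffers} ({buffers_per_socket} per socket)")
```

Under these assumptions the three lane groups add up to the slide's 82 lanes, and the 4:1 ratio implies 16 memory buffers for 64 DIMMs.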
  • 12. Intel® Xeon® 7500/E7 8-Socket Configuration: 4+4 (8S) IBM® System x3850 X5 • Up to 10 cores and 2.4 GHz per CPU • Supports 8-socket mode by combining 2 systems via external QPI links • Memory configuration: 4TB in 8-socket server; 6TB in 8-socket + MAX5 • Continued 1066MHz support
  • 13. Intel®: SIMD – Single Instruction Multiple Data technology • The Intel Xeon® E7 processor supports up to SSE 4.2 • SIMD capabilities will be expanded to 256-bit registers with the new AVX instruction set in the upcoming Intel® Xeon® E5 series processors • Informix leverages SSE in the Warehouse Accelerator
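
To make the SIMD idea concrete, here is a minimal sketch that emulates, with plain Python integers, what a 128-bit packed byte compare does: one operation tests sixteen 8-bit column codes against a predicate value at once. This is not IWA's code. The 8-bit lane width and the helper names (`pack`, `broadcast`, `equal_lanes`) are assumptions for illustration, and a real implementation would use SSE instructions rather than integer arithmetic.

```python
LANES = 16                                        # sixteen 8-bit lanes = 128 bits
LOW7 = int.from_bytes(b"\x7f" * LANES, "little")  # 0x7f in every lane
MASK128 = (1 << 128) - 1

def pack(codes):
    """Pack sixteen 8-bit codes into one 128-bit integer (lane 0 is lowest)."""
    assert len(codes) == LANES and all(0 <= c < 256 for c in codes)
    return int.from_bytes(bytes(codes), "little")

def broadcast(code):
    """Replicate one 8-bit code into all sixteen lanes."""
    return int.from_bytes(bytes([code]) * LANES, "little")

def equal_lanes(packed, code):
    """Return a 16-bit mask with bit i set when lane i equals code."""
    x = packed ^ broadcast(code)          # matching lanes become 0x00
    y = (x & LOW7) + LOW7                 # per lane: bit 7 set if low 7 bits non-zero
    hits = ~(y | x | LOW7) & MASK128      # 0x80 exactly in the lanes that were zero
    mask = 0
    for i in range(LANES):                # gather one flag bit per lane
        if (hits >> (8 * i + 7)) & 1:
            mask |= 1 << i
    return mask

codes = [3, 7, 3, 0] + [9] * 12
mask = equal_lanes(pack(codes), 3)
print(f"{mask:016b}")
```

The per-lane addition never carries into a neighbouring lane (the largest lane sum is 0xFE), which is what lets a single wide operation stand in for sixteen independent comparisons.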
  • 14. Intel® Xeon® Processors: Virtualization Performance • Greater Virtualization Efficiency: Intel QPI; DDR3 memory bandwidth and capacity; Intel® VT (VT-x, VT-d, VT-c) • Virtualization Performance: VMmark* performance [chart]. [1] Best published VMmark results as of 20 October 2010. See legal information slide, speaker notes and backup foils (if needed) for notes and disclaimers.
  • 15. Third Generation of Database Technology According to IDC’s Article (Carl Olofson) – Feb. 2010 1st Generation: - Vendor proprietary databases of IMS, IDMS, Datacom 2nd Generation: - RDBMS for Open Systems, dependent on disk layout, limitations in scalability and disk I/O - Database tuning by updating stats, creating/dropping indexes, data partitioning, summary tables & cubes, forcing query plans, resource governing 3rd Generation: IDC predicts that within 5 years: • Most data warehouses will be stored in a columnar fashion • Most OLTP databases will either be augmented by an in-memory database (IMDB) or reside entirely in memory • Most large-scale database servers will achieve horizontal scalability through clustering
  • 16. Informix Warehouse Accelerator. Step 1: Install, configure, start Informix. Step 2: Install, configure, start Accelerator. Step 3: Connect Studio to Informix & add accelerator. Step 4: Design, validate, deploy data mart. Step 5: Load data to accelerator. Ready for queries. (Diagram components: BI Applications, IBM Smart Analytics Studio, Informix Database Server, Informix Warehouse Accelerator.)
  • 17. Informix Warehouse Accelerator: 3rd Generation Database Technology is Here. What is it? The Informix Warehouse Accelerator (IWA) is a workload-optimized, appliance-like add-on that enables the integration of business insights into operational processes to drive winning strategies. It accelerates select queries, with unprecedented response times. How is it different? • Performance: Unprecedented response times to enable 'train of thought' analysis frequently blocked by poor query performance. • Integration: Connects to IDS through deep integration providing transparency to all applications. • Self-managed workloads: queries are executed in the most efficient way • Transparency: applications connected to IDS are entirely unaware of IWA • Simplified administration: appliance-like hands-free operations, eliminating many database tuning tasks. Breakthrough Technology Enabling New Opportunities
  • 18.
  • 19. IWA Software Components • Linux on Intel x86_64 (RHEL 5 or SUSE SLES 11) • IDS 11.70 + IWA code modules including IDS Stored Procedures – Linux on Intel (64 bit) – AIX on Power (64 bit) – HP-UX on Itanium (64 bit) – Solaris on SPARC (64 bit) • ISAO Studio Plug-in – GUI for Mart definition • OnIWA – On Utilities for Monitoring IWA
  • 20. INTEL/IWA: Breakthrough technologies for performance
    1. Large memory support: 64-bit computing; System X with MAX5 supports up to 6TB on a single SMP box; up to 640GB on each node of a blade center. IWA: Compresses large datasets and keeps them in memory; totally avoids I/O.
    2. Large on-chip cache: L1 cache is 64KB per core, L2 cache is 256KB per core and L3 cache is about 4-12 MB. Additional translation lookaside buffer (TLB). IWA: New algorithms to avoid pipeline flushing and to cache hash tables in L2/L3 cache.
    3. Frequency partitioning: IWA: Enabler for the effective parallel access of the compressed data for scanning. Horizontal and vertical partition elimination.
    4. Virtualization performance: Lower overhead: Core micro-architecture enhancements, EPT, VPID, and end-to-end HW assist. IWA: Helps Informix and IWA to run and perform seamlessly in a virtualized environment.
    5. Hyperthreading: 2x logical processors; increases processor throughput and overall performance of threaded software. IWA: Does not exploit this since the software is written to avoid pipeline flushing.
    6. Single Instruction Multiple Data: Specialized instructions for manipulating 128-bit data simultaneously. IWA: Compresses the data into deep columnar fashion optimized to exploit SIMD. Used in parallel predicate evaluation in scans.
    7. Multi-core, multi-node environment: Nehalem has 8 cores and Westmere 10 cores. This trend is expected to continue. IWA: Parallelizes the scan, join, and group operations. Keeps copies of dimensions to avoid cross-node synchronization.
  • 21. IWA: Multi-core and Multi-node environment. Step 1: Applications and BI tools submit SQL to Informix (DB protocol: SQLI or DRDA; network: TCP/IP, SHM). Step 2: Query matching and redirection technology (the diagram also shows a Local Execution path). Step 3: Offload SQL (DRDA over TCP/IP). Step 4: Results (DRDA over TCP/IP). Step 5: Return results/describe/error (database protocol: SQLI or DRDA; network: TCP/IP, SHM). (Diagram: one Coordinator and four Workers; each Worker holds compressed data in memory, with a memory image on disk.)
  • 22. IWA: Multi-core and Multi-node environment. Step 1: SQL from Informix. Step 2: The Coordinator sends the queries to all the Workers. Step 3: Each Worker scans, filters, joins, and groups its compressed in-memory data. Step 4: The Coordinator merges intermediate results, ORDER BY, FIRST N. Step 5: Send the results back to the Informix server.
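Steps 2 to 4 on this slide can be sketched in a few lines of Python. This is only an illustration of the scatter/merge pattern the slide describes, not IWA code: the function names `worker_scan` and `coordinator`, the `(region, amount)` row shape, and the SUM aggregate are all assumptions made for the example.

```python
from collections import Counter

def worker_scan(rows, min_amount):
    """Step 3 on one worker: scan, filter, and group its own partition."""
    partial = Counter()
    for region, amount in rows:
        if amount >= min_amount:          # filter
            partial[region] += amount     # group + aggregate (SUM)
    return partial

def coordinator(partitions, min_amount, first_n):
    """Steps 2 and 4: fan the query out, merge partials, ORDER BY, FIRST n."""
    merged = Counter()
    for rows in partitions:               # each call stands in for one worker
        merged.update(worker_scan(rows, min_amount))
    # ORDER BY total DESC, region ASC, then keep the first n groups.
    ordered = sorted(merged.items(), key=lambda kv: (-kv[1], kv[0]))
    return ordered[:first_n]

# Hypothetical data: two partitions, one per worker.
partitions = [
    [("east", 10), ("west", 5), ("east", 1)],
    [("west", 20), ("north", 7)],
]
result = coordinator(partitions, min_amount=5, first_n=2)
print(result)
```

Only the small per-worker partial results cross to the coordinator, which is why the final merge and sort stay cheap relative to the scans.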
  • 23. IWA: Multi-core and Multi-node environment. (Diagram: a Query Executor with Dictionaries; compressed and partitioned data in Cells 1-3, each cell mapped to a core with its own cache ($) and hyperthread (HT).) • Cell is also the unit of processing, each cell… – Assigned to one core – Has its own hash table in cache (so no shared object that needs latching!) • Main operator: SCAN over compressed, main-memory table – Do selections, GROUP BY, and aggregation as part of this SCAN – Only need to decompress for aggregation • Response time ∝ (database size) / (# cores x # nodes) – Embarrassingly parallel: little data exchange across nodes
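The "one cell per core, private hash table, no latching" idea can be illustrated as follows. This is a structural sketch under stated assumptions: `scan_cell` and `scan_all` are hypothetical names, a cell is modeled as a list of `(group_code, value)` pairs, and Python threads are used only to show the shape of the computation (they do not give true multi-core parallelism for this kind of loop).

```python
from concurrent.futures import ThreadPoolExecutor

def scan_cell(cell):
    """Scan one cell with a private aggregation table (no shared state, no locks)."""
    table = {}
    for group_code, value in cell:
        table[group_code] = table.get(group_code, 0) + value
    return table

def scan_all(cells, workers):
    """Run one scan per cell, then merge the small per-cell tables once."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(scan_cell, cells))
    total = {}
    for table in partials:                # merge happens after the scans finish
        for group_code, value in table.items():
            total[group_code] = total.get(group_code, 0) + value
    return total

# Hypothetical cells of (group_code, value) pairs.
cells = [[(0, 5), (1, 2)], [(1, 3), (2, 4)], [(0, 1)]]
totals = scan_all(cells, workers=3)
print(totals)
```

Because no scan ever writes to another scan's table, adding cores or nodes adds scan capacity without adding contention, which is the intuition behind the slide's response-time proportionality.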
  • 24. Exploiting Larger Memory: Row Oriented Data Store. Each row stored sequentially • Optimized for record I/O • Fetch and decompress entire row, every time. Result: • Very efficient for transactional workloads • Not always efficient for analytical workloads. If only a few columns are required, the complete row is still fetched and uncompressed.
  • 25. Exploiting Larger Memory: Data is Processed in Compressed Format • Within a Register-Store, several columns are grouped together. • The sum of the widths of the compressed columns doesn't exceed a register-compatible width. This utilizes the full capabilities of a 64-bit system. It doesn't matter how many columns are placed within the register-wide data element. • It is beneficial to place commonly used columns within the same register-wide data element. But this requires dynamic knowledge about the executed workload (runtime statistics). • Having multiple columns within the same register-wide data element prevents ANDing of different results. Predicate evaluation is done against compressed data! The Register-Store is an optimization of the Column-Store approach where we try to make the best use of existing hardware. Reshuffling small data elements at runtime into a register is time consuming and can be avoided. The Register-Store also delivers good vectorization capabilities.
  • 26. Exploiting Large memory: Compression: Frequency Partitioning. (Diagram: a Trade Info table with columns volume, product, origin country. A histogram on Origin, showing number of occurrences from common to rare values, defines column partitions over the values China, GER, USA, FRA, … and Rest. A histogram on Product defines the partitions Top 64 traded goods (6 bit code) and Rest. The table is partitioned into Cells 1-6.) • Field lengths vary between cells • Higher Frequencies → Shorter Codes (Approximate Huffman) • Field lengths fixed within cells
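A rough sketch of the column-partitioning step follows, assuming a single split into a "common" and a "rest" partition per column. The function name `frequency_partitions`, the `top_k` parameter, and the sample data are hypothetical; the slide's cells would then be the cross-product of such partitions across columns.

```python
from collections import Counter
from math import ceil, log2

def frequency_partitions(values, top_k):
    """Split distinct values into a 'common' and a 'rest' partition by frequency.

    Each partition gets its own dense dictionary codes with a fixed bit width,
    so frequent values end up with shorter codes (approximate Huffman).
    """
    ranked = [v for v, _ in Counter(values).most_common()]
    partitions = []
    for part in (ranked[:top_k], ranked[top_k:]):
        if part:
            width = max(1, ceil(log2(len(part))))   # bits per code in this partition
            codes = {v: i for i, v in enumerate(part)}
            partitions.append({"width": width, "codes": codes})
    return partitions

# Hypothetical origin-country column.
origins = ["China"] * 6 + ["USA"] * 3 + ["GER", "FRA", "ITA", "JPN", "BRA"]
parts = frequency_partitions(origins, top_k=2)
for p in parts:
    print(p["width"], p["codes"])
```

With `top_k=64` the common partition would need 6-bit codes, matching the "Top 64 traded goods, 6 bit code" label on the slide.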
  • 27. IWA: SIMD: Register Stores Facilitate SIMD Parallelism • Access only the banks referenced in the query (like a column store): SELECT SUM(T.G) FROM T WHERE T.A > 5 GROUP BY T.D • Pack multiple rows from the same bank into the 128-bit register • Enables yet another layer of parallelism: SIMD (Single-Instruction, Multiple-Data)! (Diagram: a cell block of rows 1-4 split into Bank β1 (32 bits) with columns A, D, G; Bank β2 (32 bits) with columns B, E, F; and Bank β3 (16 bits) with columns C, H. Four 32-bit operands pass through one vector operation, giving Result1 to Result4 in 128 bits.)
  • 28. IWA: SIMD: Simultaneous Evaluation of Equality Predicates • CPU operates on 128-bit units • Lots of fields fit in 128 bits • These fields are at fixed offsets • Apply predicates to all columns simultaneously! Example: State=='CA' && Quarter=='Q4'. Translate the value query to a code query: State==01001 && Quarter==1110. Evaluation: (Row & Mask) == Selection gives the result, with Mask = 11111 0 1111 0 and Selection = 01001 0 1110 0 over the fields State, Quarter, …
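The mask-and-compare step can be shown with ordinary integers. This is a minimal sketch: the 12-bit layout below is an assumption made for the example (the slide does not give IWA's real field layout), and a Python integer stands in for one row's slice of a 128-bit register. The codes 01001 for 'CA' and 1110 for 'Q4' are taken from the slide.

```python
# Hypothetical 12-bit row layout (not IWA's real format):
# [state: 5 bits][other: 3 bits][quarter: 4 bits]
STATE_SHIFT, STATE_BITS = 7, 5
OTHER_SHIFT = 4
QUARTER_SHIFT, QUARTER_BITS = 0, 4

def pack(state, other, quarter):
    """Place each dictionary code at its fixed offset inside one integer."""
    return (state << STATE_SHIFT) | (other << OTHER_SHIFT) | (quarter << QUARTER_SHIFT)

def field_mask(shift, bits):
    """All-ones over one field, zeros elsewhere."""
    return ((1 << bits) - 1) << shift

# Mask keeps only the predicate columns; Selection holds the wanted codes.
MASK = field_mask(STATE_SHIFT, STATE_BITS) | field_mask(QUARTER_SHIFT, QUARTER_BITS)
SELECTION = pack(0b01001, 0, 0b1110)      # State == 'CA' and Quarter == 'Q4'

def matches(row):
    """Both equality predicates are checked with one AND and one compare."""
    return (row & MASK) == SELECTION

rows = [
    pack(0b01001, 5, 0b1110),   # CA, Q4: matches, 'other' is masked out
    pack(0b01001, 2, 0b0001),   # CA, different quarter
    pack(0b00011, 7, 0b1110),   # different state, Q4
]
hits = [matches(r) for r in rows]
print(hits)
```

On SIMD hardware the same AND and compare would be applied to several packed rows per instruction; the sketch shows only the per-row logic.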
  • 29. Exploiting Large on-chip Cache • Encoding makes grouping simple! – Coded values assigned densely (by construction) – Hence, in principle, grouping is simple: aggTable[group] += aggValue • Challenges: – Fitting hash table in L2 cache – Avoiding all branches in hash table lookup • IWA adaptively uses one of 2 techniques, depending on # of distinct groups: 1. Use dictionary code as a perfect hash (i.e. collision-free): aggTable[groupCode] += aggValue – No branches, no hash function computation – Works great if groupCode is dense, i.e., single column, or multiple columns with little correlation 2. Use usual linear probing – Involves branches, random access, …
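The two grouping techniques can be contrasted in a short sketch. The function names and sample data are hypothetical, and this is plain Python rather than cache-tuned code; it only shows why the dense case needs no hash function or probe loop while the general case does.

```python
def aggregate_dense(group_codes, values, num_groups):
    """Technique 1: the dictionary code is the array index (collision-free)."""
    agg = [0] * num_groups
    for code, value in zip(group_codes, values):
        agg[code] += value            # no hash function, no probe loop
    return agg

def aggregate_probing(keys, values, capacity):
    """Technique 2: open addressing with linear probing.

    capacity must exceed the number of distinct keys, or probing never ends.
    """
    slots = [None] * capacity         # each used slot is [key, running_sum]
    for key, value in zip(keys, values):
        i = hash(key) % capacity
        while slots[i] is not None and slots[i][0] != key:
            i = (i + 1) % capacity    # collision: try the next slot
        if slots[i] is None:
            slots[i] = [key, 0]
        slots[i][1] += value
    return {slot[0]: slot[1] for slot in slots if slot is not None}

# Dense single-column group codes 0..2.
dense = aggregate_dense([0, 2, 0, 1], [10, 5, 1, 7], num_groups=3)
# Sparse composite keys (two correlated columns), where a dense array would waste space.
sparse = aggregate_probing([(1, 3), (2, 4), (1, 3)], [10, 5, 1], capacity=8)
print(dense, sparse)
```

The dense table's size is bounded by the number of codes, which is what makes it a candidate for fitting in L2 cache; the probing table trades that guarantee for generality.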
  • 30. Case Study #1: U.S. Government Agency
  • 31. Case Study #2: Datamart at a Government Agency • A MicroStrategy report was run, which generates 667 SQL statements, of which 537 were SELECT statements • Datamart for this report has 250 tables and 30 GB data size • Original report on XPS and Sun SPARC M9000 took 90 mins • With IDS 11.7 on a Linux Intel box, it took 40 mins • With IWA, it took 67 seconds.
  • 32. Case Study #3: Skechers, USA. Shoe Retailer • Top 7 time-consuming queries in Retail BI and Warehouse (against 1 billion row fact tables). Table columns: Query, IDS 11.5, IDS 11.7, IWA. Query 1: 22 mins, 4 secs; Query 2: 1 min 3 secs 2 secs; Query 3: 3 mins 40 secs 2 secs; Query 4: 30 mins & up, 4 secs; Query 5: 2 mins, 2 secs; Query 6: 30 mins, 2 secs; Query 7: 45 mins & up, 2 secs. Query acceleration 30x to 1400x – average acceleration 450x
  • 33. Systems Tested • 4S Intel® Xeon® 7560 (whitebox) – 2.26 GHz 8C CPU • 4S Intel® Xeon® E7 4870 (whitebox) – 2.40 GHz 10C CPU – 256GB 1066MHz DDR3 memory • 8S Intel® Xeon® E7 7560 (IBM® System x3850 X5) – 2.26 GHz 8C CPU – 2TB 1066MHz DDR3 memory
  • 34. POPS schema. (Diagram: tables daily_sales, daily_forecast, Customer, Product, Store, Promotion, Period; row-count labels: 350 million rows and 1 billion rows.)
  • 35.
  • 36.
  • 37.
  • 38. Systems Tested • 8S Intel® Xeon® E7 7560 (IBM® System x3850 X5) – 2.26 GHz 8C CPU – 2TB 1066MHz DDR3 memory
  • 39. 500 GB SSED Store Sales ER-Diagram 73,049 402 204,000 4,594,771,672 86,400 1000 1,920,800 1,000,000 7200 20 2,000,000
  • 40. Results table (the first column header is missing in the source; IWA1 is inferred from the IWA AVG values)
       IWA1     IWA2     IWA3     IWA4     IWA AVG      IDS1      IDS2      IDS3      IDS AVG  Improvement
     109046   104246    92653    97666   100902.75   3294554   3338352   3341873  3324926.333  3295.179104
      31190    27175    26927    27417    28177.25   1538219   1538364   1538959      1538514  5460.128295
      93377    97192    95638    92691     94724.5   1910772   1884782   1899916      1898490  2004.222772
     119587   117053   117513   117902   118013.75   1765145   1722746   1690400      1726097  1462.623635
      37587    33551    35579    31651       34592   3167302   3173656   3150876  3163944.667  9146.463537
      28228    29301    24602    29846    27994.25   1525738   1526089   1528724  1526850.333  5454.156955
      27644    28075    30083    29362       28791   2201956   2211549   2517291  2310265.333  8024.262212
     119871   123030   123593   117572    121016.5   5963515   6044626   5947525      5985222  4945.790037
      38346    46412    44463    44918    43534.75   1578035   1557525   1544912  1560157.333  3583.705737
      48450    46470    50032    43668       47155   1526529   1547404   1563874  1545935.667  3278.413035
      43823    42441    45837    43215       43829  21990513  22354449  21903105     22082689  50383.73908
      47400    46582    46573    47031     46896.5   2251672   2278167   2281946      2270595  4841.715267
      56961    58315    56437    60119       57958   5295930   5310507   5325095  5310510.667  9162.687923
       9037     9132     8724     9083        8994   2523942   2529234   2522585  2525253.667  28077.09214
      47062    52354    51374    49932     50180.5   1546319   1570163   1568083  1561521.667    3111.8097
      47643    50415    55660    52788     51626.5   2274649   2264463   2269677  2269596.333  4396.184776
      85154    85711    83824    91692    86595.25   1620173   1656098   1606029  1627433.333  1879.356354
      59766    59341    55436    58522    58266.25   5311906   5307202   5266918      5295342   9088.18055
       8230     8207     8054     8115      8151.5   2159777   2179435   2181312      2173508  26663.90235
     152764   152408   149153   151100   151356.25   2050590   2065049   2060862  2058833.667  1360.256789
      30991    29582    27391    24197    28040.25   2025557   2037336   2040515  2034469.333  7255.532077
     141504   145702   142908   139664    142444.5   5363204   5165693   5393336      5307411  3725.950107
    Column totals:
    1383661  1392695  1372454  1368151  1379240.25            79262889              ArithMean   8936.42511
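The slide does not define the Improvement column, but the numbers are consistent with the ratio of the two averages expressed as a percentage. That reading is an inference from the data, not a statement from the presenters. For the first row of the table:

```latex
\[
\text{Improvement} = \frac{\text{IDS AVG}}{\text{IWA AVG}} \times 100
  = \frac{3324926.333}{100902.75} \times 100 \approx 3295.18
\]
```

Under this reading, an Improvement of about 3295 corresponds to IWA completing the query roughly 33 times faster than IDS.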
  • 41.
  • 42. Thank You! Session Number 2864