Tuning ETLs for Better BI
Datavail is the largest provider of remote
database administration in the U.S. with nearly
400 DBAs, 24/7 support and onsite/offsite,
onshore/offshore delivery.
Presented by Chuck Ezell
Performance, Tuning & Optimization
Services, Datavail
www.datavail.com
Agenda
• OLTP and OLAP: an approach for tuning
• More than just data: peeling back the layers
• Components & Layers of Common ETLs
• Component Points of Failure
• Source, Transformation & Target Tuning Points
• High-Level Tuning Examples
• Monitoring ETL Activity (tools to make it easy)
• Recap & Questions
OLTP & OLAP
• OLTP: Online Transaction Processing
  - Best for relational database transactions
  - Emphasis on fast queries & relational data integrity
  - Emphasis on highly normalized data
  - Business process data (operational, workflows, etc.)
  - Insert, Update & Delete activity
• OLAP: Online Analytical Processing
  - Best for structured, sometimes redundant data
  - Emphasis on the ability to aggregate & analyze
  - Emphasis on de-normalized data & fewer tables
  - Data warehouse (trending, historical, analytical, etc.)
  - Writes (loading) & Reads (complex selects)
Organization of Data
Most ETLs involve both OLTP and OLAP systems, but each is tuned differently.
The Essence of ETL
Extracting data from various sources, performing transformations
and loading transformed data ready for reporting.
Extract, Transform, Load
Workflow / Task / Procedure
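The extract, transform, load flow described above can be sketched in a few lines. This is a minimal illustration and not part of the original deck: Python's built-in sqlite3 module stands in for the real source and target systems, and the names `orders`, `rpt_daily_sales` and `run_etl` as well as the sample rows are hypothetical.

```python
import sqlite3

def run_etl(source: sqlite3.Connection, target: sqlite3.Connection) -> int:
    """Extract order rows, aggregate them per day, and load a reporting table."""
    # Extract: pull only the columns the report needs from the OLTP-style source.
    rows = source.execute("SELECT order_date, amount FROM orders").fetchall()

    # Transform: aggregate in memory into one total per day.
    totals = {}
    for order_date, amount in rows:
        totals[order_date] = totals.get(order_date, 0) + amount

    # Load: write the de-normalized result to the OLAP-style target in one batch.
    target.execute(
        "CREATE TABLE IF NOT EXISTS rpt_daily_sales "
        "(order_date TEXT PRIMARY KEY, total REAL)"
    )
    with target:  # commit once for the whole load
        target.executemany(
            "INSERT OR REPLACE INTO rpt_daily_sales VALUES (?, ?)",
            sorted(totals.items()),
        )
    return len(totals)

def demo():
    """Build a tiny in-memory source (made-up data) and run the ETL against it."""
    source = sqlite3.connect(":memory:")
    target = sqlite3.connect(":memory:")
    source.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, order_date TEXT, amount REAL)"
    )
    source.executemany(
        "INSERT INTO orders (order_date, amount) VALUES (?, ?)",
        [("2024-01-01", 10.0), ("2024-01-01", 5.0), ("2024-01-02", 7.5)],
    )
    source.commit()
    loaded = run_etl(source, target)
    report = target.execute(
        "SELECT order_date, total FROM rpt_daily_sales ORDER BY order_date"
    ).fetchall()
    return loaded, report
```

Each of the three steps maps to one of the tuning areas covered later: the extract query runs against the source, the aggregation uses memory on the ETL host, and the batched insert is bounded by the write speed of the target.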
ETL Stage Components
[Diagram: Source(s) (data warehouse, files, cloud data, EBS data, flat files) feed the Transform stage (temp tables, lookup files, lookup tables), which loads the Target(s) (reporting data).]
ETL Component Layers
[Diagram: the layers behind an ETL: Data; Data Structure; Code Base; Database Setup and Application Setup; Host Server (Disk/CPU/RAM; CPU/RAM); OS Architecture; Storage; Network Speeds.]
High-Level ETL Tuning Helpful Tips
ETL Stage Component Points of Failure
[Diagram: the ETL stage components (sources, transform, targets) annotated with points of failure:]
• Disk (I/O)
• Network latency
• Too much in-memory
• Limited RAM
• File system or cache fragmentation
• IOPS & CPU bottlenecks
• Limited space
• Poor code
Source Bottlenecks & Tuning Ideas
• Source is often OLTP structured data (but not always)
• A traditional tuning approach will apply
• Factor in DML causing Fragmentation & Stats problems
• Find poor plans and tune in traditional fashion
• SQL Code (better filtering, use of custom and vendor functions)
• Statistics
• Indexing & Table Fragmentation
• Conflicting Sessions or Processes during ETL
• Offload or replicate data for better isolation
Transformation Bottlenecks & Tuning Ideas
• Depending on your ETL, a high percentage of the work could be in-memory
• RAM & temp space are critical (the more the better)
• Filesystem lookups can be slow (lack of indexing)
• Filesystems can become fragmented (depending on the OS)
• SQL Code (in-memory merges and joins)
• Statistics on temp tables can hinder performance
• Indexing could slow a process down
• Lack of proper temp space will cause failures (watch logs & ASM)
• Filesystem lookups perform better if they're converted to DB table lookups
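The last point, converting a filesystem lookup into a database table lookup, could look roughly like the sketch below. It is an illustration under stated assumptions, not the deck's method: SQLite stands in for the ETL database, an in-memory string stands in for the lookup file, and `country_lookup`, `stage_customers`, `load_lookup` and `resolve_countries` are hypothetical names.

```python
import csv
import io
import sqlite3

def load_lookup(conn, csv_file):
    """Load a code,name lookup file into an indexed table (hypothetical schema)."""
    # The PRIMARY KEY gives the lookup an index, which a flat file lacks.
    conn.execute("CREATE TABLE country_lookup (code TEXT PRIMARY KEY, name TEXT)")
    reader = csv.reader(csv_file)
    next(reader)  # skip the header row
    with conn:  # one commit for the whole file
        conn.executemany("INSERT INTO country_lookup VALUES (?, ?)", reader)

def resolve_countries(conn):
    """Resolve staged rows with a single join instead of a per-row file scan."""
    return conn.execute(
        """
        SELECT s.customer_id, COALESCE(l.name, 'UNKNOWN')
        FROM stage_customers s
        LEFT JOIN country_lookup l ON l.code = s.country_code
        ORDER BY s.customer_id
        """
    ).fetchall()

def demo():
    conn = sqlite3.connect(":memory:")
    # Stand-in for a lookup file on disk; a real job would open() the file instead.
    lookup_file = io.StringIO("code,name\nUS,United States\nDE,Germany\n")
    load_lookup(conn, lookup_file)
    conn.execute("CREATE TABLE stage_customers (customer_id INTEGER, country_code TEXT)")
    conn.executemany(
        "INSERT INTO stage_customers VALUES (?, ?)",
        [(1, "US"), (2, "DE"), (3, "XX")],
    )
    return resolve_countries(conn)
```

The file is read once, and every later lookup is resolved by the database through the key index rather than by rescanning the file.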
Target Bottlenecks & Tuning Ideas
• OLAP write speeds and I/O are often overlooked
• Indexing and Stats can be problematic
• Loading could be single inserts in a loop
• SQL Code
• Inserts can benefit from the hints “APPEND” or “APPEND_VALUES”
• Inserts and Updates could benefit from PARALLEL hinting
• Stats and Indexing added after loads and performed in parallel (split out tasks)
• Confirm Async I/O settings in OS and DB
• Use Bulk Loading where possible
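The contrast between single inserts in a loop and a bulk load, with the index and statistics handled after the load, can be sketched as follows. This is a hedged illustration only: SQLite is a stand-in for the target database, so the Oracle-specific APPEND, APPEND_VALUES and PARALLEL hints are not shown, and `rpt_sales`, `load_row_by_row`, `load_bulk` and the generated rows are hypothetical.

```python
import sqlite3

def load_row_by_row(conn, rows):
    """Anti-pattern: one INSERT and one commit per row."""
    for row in rows:
        conn.execute("INSERT INTO rpt_sales VALUES (?, ?)", row)
        conn.commit()

def load_bulk(conn, rows):
    """Batch the inserts in one transaction, then build the index after the load."""
    conn.execute("DROP INDEX IF EXISTS rpt_sales_region_idx")
    with conn:  # a single commit for the whole batch
        conn.executemany("INSERT INTO rpt_sales VALUES (?, ?)", rows)
    conn.execute("CREATE INDEX rpt_sales_region_idx ON rpt_sales (region)")
    conn.execute("ANALYZE")  # refresh optimizer statistics after the load

def demo(loader, n=1000):
    """Load n made-up rows with the given loader and return (row count, sum)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE rpt_sales (region TEXT, amount REAL)")
    loader(conn, [("R%d" % (i % 10), float(i)) for i in range(n)])
    return conn.execute("SELECT COUNT(*), SUM(amount) FROM rpt_sales").fetchone()
```

Both loaders produce the same table contents. The difference is in how often the database has to commit and maintain the index, which tends to matter far more on disk-backed storage than in this in-memory demo.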
Common Problems & Fails
What do we want from our ETLs?
Setting goals will affect our approach; however, there are two main goals for any and all ETLs:
Speed & Consistency
Common Problems Seen
• Doing too much in-memory
• Doing too much from filesystem
• Not considering network speeds or drive speeds
• Not considering system or session conflicts
• Not taking advantage of ASYNC features
• Not Partitioning
• Not providing enough resources to database
• Not reviewing workflow logs
• Not knowing the business purpose of the data or each task
• Overusing or misusing HINTs (ORDERED, CARDINALITY, PARALLEL)
Using ORDERED /*+ HINTs */
• ORDERED forces the table join order
• Instructs the optimizer to join tables in the order they appear in the FROM clause
• Use LEADING() instead, but only for investigation
/*+ ORDERED */
/*+ LEADING(FA_BOOK_TYPE_D, FIN_BUSN_LOCATION_D) */
Using CARDINALITY /*+ HINTs */
• Cardinality has been deprecated from 10g on
• Use OPT_ESTIMATE() instead
select count(*) from tabname; Result=35,754,849
Wrong: CARDINALITY(5) or OPT_ESTIMATE(table tabname rows=5)
Right: CARDINALITY(35754849) or OPT_ESTIMATE(table tabname rows=35754849)
Using PARALLEL /*+ HINTs */
[Screenshot: original plan, showing full table scans]
PARALLEL(auto) or PARALLEL(32) could cause unpredictable runtimes.
[Screenshot: plan with parallelism introduced; time and cost are reduced]
Parallel hinting also consumed CPU and didn’t solve the plan problems.
Plan Improvement w/ Indexing
Full table scan due to the NVL() function on the filter condition, causing long operations.
Filtering against almost 1 million rows.
A function-based index immediately improved performance.
The index improved filtering performance by reducing read activity from 947k rows to 253 rows.
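The effect described on these slides, a function wrapped around a filter column forcing a full scan until the expression itself is indexed, can be reproduced in miniature. The sketch below is an assumption-laden stand-in rather than the presenter's case: SQLite's IFNULL() plays the role of Oracle's NVL(), a SQLite expression index plays the role of a function-based index, and the `tasks` table, the `tasks_status_fbi` index and the data are hypothetical. It does not reproduce the 947k and 253 row figures.

```python
import sqlite3

def plan(conn, sql):
    """Return the EXPLAIN QUERY PLAN detail text for a statement."""
    return " | ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tasks (id INTEGER PRIMARY KEY, status TEXT)")
    # Made-up data: every 100th row is 'DONE', the rest have a NULL status.
    conn.executemany(
        "INSERT INTO tasks (status) VALUES (?)",
        [(None if i % 100 else "DONE",) for i in range(10000)],
    )
    query = "SELECT id FROM tasks WHERE IFNULL(status, 'OPEN') = 'DONE'"

    # No index matches the filter expression yet, so the whole table is scanned.
    before = plan(conn, query)
    # Index the expression itself so the filter can be answered from the index.
    conn.execute("CREATE INDEX tasks_status_fbi ON tasks (IFNULL(status, 'OPEN'))")
    after = plan(conn, query)
    matches = len(conn.execute(query).fetchall())
    return before, after, matches
```

Calling `demo()` returns the plan text before and after the index is created, plus the number of matching rows. In Oracle the analogous index would be created on the NVL() expression, for example `CREATE INDEX ... ON tasks (NVL(status, 'OPEN'))`; the indexed expression has to match the one used in the filter.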
Parallel hints didn’t reduce long ops.
Parallel hinting could improve performance further on top of the index, but alone it would only be a band-aid.
Monitoring ETL Activity
Finding the Bottlenecks
Monitor Sessions
Long Operations = Potential Slow Reads (v$session_longops)
Monitoring Tools: PuTTY & top
Monitoring Tools: DB Time Monitor (dominicgiles.com)
Monitoring Tools: Monitor DB (dominicgiles.com)
Monitor Tasks in DAC
DAC (Oracle's Data Warehouse Administration Console) serves the following purposes:
- DAC is a metadata-driven administration and deployment tool
- Manages application configuration
- Manages the execution of warehouse loads
- Provides monitoring capabilities
In Closing
• OLTP and OLAP: an approach for tuning
• More than just data: peeling back the layers
• Components & Layers of Common ETLs
• Component Points of Failure
• Source, Transformation & Target Tuning Points
• High-Level Tuning Examples
• Monitoring ETL Activity (tools to make it easy)
• Recap & Questions
Questions?
Questions can also be sent to
kelley.Weir@Datavail.com
or chuck.Ezell@Datavail.com
Monitoring Large Datasets
Long Operations