Data Warehouse Optimization
Finding Business Pains
• Frequent or near-term EDW expansion/spend
• Short time windows for data
• SLA challenges with ELT
• Reports/analytics that are “Too big”
• Compliance issues requiring long-term storage AND query
• Resource restrictions/contention or disenfranchised/frustrated users
Common Challenges with the Data Warehouse
[Diagram: OLTP and Enterprise Applications feed the Data Warehouse through Extract, Transform, Load, with a further Transform step in the warehouse; Business Intelligence runs Query against the Data Warehouse. Numbered callouts follow.]
1. Slow data transformations, missed SLAs.
2. Slow queries, poor QoS and missed opportunities.
3. Wrong or incomplete, modified copies are made.
4. Must archive. Archived data can’t provide value.
5. Constant pressure to buy additional warehouse capacity, just to maintain current quality of service. NO room to expand use cases. NO room to innovate.
An EDH Complements the Data Warehouse
[Diagram: OLTP and Enterprise Applications are extracted and loaded into Cloudera, which handles Transform and Query alongside the Data Warehouse; Business Intelligence runs Query against both. Numbered callouts follow.]
1. Data loaded when & where it’s needed.
2. Empowered business analysts.
3. Avoid “spreadmarts” across departments.
4. Complete view of all your products, customers, etc.
5. Cost effective, infinitely scalable, production ready enterprise data hub for all your data. All data. All users.
Hadoop as a Data Warehouse???
2014 Gartner MQ for Data Warehouse DBMS
“A data warehouse DBMS is now expected to
coordinate data virtualization strategies, and
distributed file and/or processing
approaches, to address changes in data
management and access requirements.”
Thinking About Optimization
Understanding Benefits for Your Organization
• Help You Assess Your Enterprise Data Warehouse Ecosystem
• Identify Viable Migration Candidates and Target Reference Architecture
• Develop a Project Plan to Deliver the Full Scope of Benefits
• Understand the Business Case for Making the Investment
Working With You Through the EDW Assessment Process
Information
• Collect information about your EDW environment
Analysis
• Identify migration candidates
• Determine feasibility
Recommendations
• Develop a migration plan
• Establish a business case
Identifying Sources and Workloads
Key Hadoop Platform Requirements
• High availability
• Disaster recovery
• Downtime-less upgrades
• Auditability
• Low-latency SQL & BI support
• Deep SAS & R support
Customers Agree: Cloudera Delivers
• Customer: Leading Payments Company
  Workload: Analytics, ETL Processing, DR
  Results: Largest fraud discovery in firm history; time to report collapsed from 2 days to 2 hours; save $30M on DR
• Customer: Global Money Center Bank
  Workload: Data Processing (ELT)
  Results: Avoided tens of millions in expansion purchases; 42% faster processing
• Customer: Mobile Device Manufacturer
  Workload: Data Processing (ELT)
  Results: Offloaded 90% of data volume; keep all data
• Customer: Fortune 500 Retailer
  Workload: Analytics
  Results: More insights by supporting more exploration of more extensive & granular data
• Customer: Leading Financial Regulator
  Workload: Data Processing (ELT) and DR
  Results: Shrank EDW footprint by 4PB, 20X perf. boost
Assessing Workloads and Data
[Diagram: the data warehouse split into WORKLOADS (Operational Business Intelligence, Analytics, Self-Service BI, Data Processing (ELT)) and DATA (Staged Data, Operational Data, Archival Data)]
• Data Processing (ELT)
• Staged data, to be processed
• Temp tables, BLOB/CLOB types, etc.
• Analytics / Machine Learning
• Deep and broad data sets, within and beyond the warehouse
• Self-Service BI (Ad-Hoc Query)
• Operational data, actively used for BI
• Archival data, inactively used for BI
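The split between operational data (actively used for BI) and archival data (inactively used) can be approximated from query logs. The sketch below is illustrative only: the function name, the 90-day threshold, and the shape of the input (a table name mapped to the date it was last queried) are assumptions, not part of the deck or of any Cloudera tool.

```python
from datetime import date

def classify_by_activity(last_queried, today, active_days=90):
    """Label each table 'operational' or 'archival' by days since last query.

    last_queried: dict of table name -> date of the most recent query (assumed input).
    active_days:  hypothetical cutoff; choose one that matches your BI usage.
    """
    labels = {}
    for table, last in last_queried.items():
        idle_days = (today - last).days
        labels[table] = "operational" if idle_days <= active_days else "archival"
    return labels

# Example with made-up table names and dates
labels = classify_by_activity(
    {"orders": date(2014, 6, 1), "orders_2009": date(2013, 1, 15)},
    today=date(2014, 6, 30),
)
```

Tables labeled archival under such a rule would be natural first candidates to review for offload, since they consume warehouse storage while serving few queries.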
Offload Data Processing (ELT)
What to Migrate
• High-scale batch data processing
• Implemented as SQL + scripting or ETL running on expensive HW infrastructure
• Staging data stored across diverse, temp tables
Influencing Factors
• High fraction of overall EDW utilization (25 – 80%)
• Difficult to store, manage staging data in relational form
• Limited user adoption risk to migrate
• ETL tools to simplify migration
Better in Cloudera
• Over 2X the performance
• 1/10th the cost
Reliability for mission-critical workloads: high availability, disaster recovery, downtime-less upgrades
Low-latency SQL processing, ability to absorb short-cycle ELT
Broad support of leading data integration tools
Only Available with Cloudera Partners
Offload Self-Service Business Intelligence
Workload
• Self-Service BI, Exploratory BI, Data Discovery
• Uncertain business questions and uncertain data
Migration Priority
• Fastest growing workload for many warehouses
• Comparable support for end user tools between Cloudera and DBMS products
Better In Cloudera
• Schema flexibility
• End user self-service on full fidelity data
• 1/10th the cost
Open source parallel interactive SQL engine: Cloudera Impala
Integration and certification of every leading SSBI vendor
Only Available with Cloudera Partners
Offload Analytics / Machine Learning
Workload and Data
• Training & scoring predictive models
• Deep and broad data sets, within and beyond the warehouse
Influencing Factors
• Statisticians want unconstrained analysis; limited DW compute resources
• Paying top dollar for warehouse data storage only to load into ML tools
• Inability to analyze data beyond the warehouse
Better in Cloudera
• Greater user productivity (pre-packaged ML libraries, no more down-sampling)
• Support for 3rd party ML tools
• Greater flexibility (SQL + MR + SAS procs)
• 1/10th the cost
Ability to run SAS, R natively on the same cluster
Interactive search and SQL experience for data exploration
Built-in analytics libraries (Mahout, DataFu, ClouderaML)
Support from Cloudera’s Data Science team
Only Available with Cloudera Partners
Sample Cloudera Tools for Assisting Migration
• High-speed connector – Moves data between the two systems
• Data definition – Tool for mapping EDW tables & datatypes to Hive tables & datatypes
• Mainframe input / output format – Support direct feed of mainframe data into Cloudera
• Result validation – Verifies SQL applications in Cloudera produce the same results as the original applications
• Support for SQL-H (planned) – Remote queries from EDW to Cloudera
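The result validation idea in the list above can be illustrated with a small order-insensitive comparison of two result sets. This is a minimal sketch, not the Cloudera tool: the function names are hypothetical, and it assumes both query results have already been fetched into memory as sequences of rows with hashable values.

```python
from collections import Counter

def result_diff(edw_rows, hadoop_rows):
    """Compare two query results as multisets of rows, ignoring row order.

    Duplicate rows are counted, so a row that appears twice in one result
    and once in the other is reported as a difference.
    """
    edw = Counter(tuple(row) for row in edw_rows)
    hadoop = Counter(tuple(row) for row in hadoop_rows)
    return {
        "missing_in_hadoop": list((edw - hadoop).elements()),
        "extra_in_hadoop": list((hadoop - edw).elements()),
    }

def results_match(edw_rows, hadoop_rows):
    """True when both results contain exactly the same rows, any order."""
    diff = result_diff(edw_rows, hadoop_rows)
    return not diff["missing_in_hadoop"] and not diff["extra_in_hadoop"]

# Example with made-up rows
diff = result_diff([(1, "a"), (2, "b")], [(1, "a"), (3, "c")])
```

Exact comparison like this is strict about floating-point values and type differences between engines; in practice a validation step may need to round or normalize columns before comparing.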
Groundwork for Optimization
Is Your Data Architecture Aligned to Your Use Case?
Lay the Foundation for Data Migration and Ensure Success
• Install and configure CDH and Cloudera Manager
• Run standard and specialized performance tests
• Recommend tuning, compression and decompression, and scheduler configurations
• Document recommended cluster configuration
• Train and certify Hadoop administrators
How Quickly and Securely Can You Transition Your Data?
Migrate Disparate Data Sources to Boost Performance
• Collect low-efficiency data from various silos
• Redeploy latent data from EDWs, RDBMSs, and Hadoop environment
• Develop, test, and implement data processing jobs
• Integrate Hadoop with relevant external systems
• Document workload migration
Is Your Operational Environment Ready for Handover?
Maximize ROI by Rationalizing All Systems, Teams, and Workloads
• Review current and future requirements
• Review full ecosystem, all jobs, and regular processes
• Review application architecture, ingestion pipeline, data schema, and data partitioning system
• Review key management and monitoring processes and relevant production procedures
• Recommend additional training to assure Hadoop expertise on management and operations teams
• Document cluster configuration, solutions implementation, and production recommendations
How Much Additional Value Can You Capture Long-Term?
Ongoing Optimization Is Key to Deferring Additional Cost
• Expand framework without expanding footprint
• Rationalize beyond initial burn-in period
• Evolve cluster to support additional use cases
• Annually benchmark performance to diagnostic
• Balance business opportunity against technical risk
Building the Optimization Plan
Prioritizing Workloads and Data
1. Current EDW Constraints
• Focus on computation constraints
• Focus on disk space constraints
2. Workload Transferability
• Similar or same SQL functionality
• Similar or same tools support
• Opportunity for performance gains
3. User Communities
• Group related workloads by user community
• Migrate one community at a time
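The three prioritization criteria above can be combined into a simple ordering. The sketch below is one possible way to do that, under loudly stated assumptions: the scoring formula, the field names, and the 0-to-1 scales are all hypothetical and are not the deck's methodology. It scores each workload by the EDW compute and disk it would free, weighted by how transferable it is, then groups workloads by user community so that one community is migrated at a time.

```python
from dataclasses import dataclass

@dataclass
class Workload:
    name: str
    community: str          # user community that owns the workload
    edw_cpu_share: float    # fraction of EDW compute consumed, 0..1 (assumed metric)
    edw_disk_share: float   # fraction of EDW disk consumed, 0..1 (assumed metric)
    transferability: float  # 0..1, higher = closer SQL/tool parity (assumed metric)

def score(w):
    """Hypothetical score: constraint relief weighted by transferability."""
    return (w.edw_cpu_share + w.edw_disk_share) * w.transferability

def migration_order(workloads):
    """Return [(community, [workloads...])], highest total score first."""
    by_community = {}
    for w in workloads:
        by_community.setdefault(w.community, []).append(w)
    ranked = sorted(
        by_community.items(),
        key=lambda item: sum(score(w) for w in item[1]),
        reverse=True,
    )
    return [(c, sorted(ws, key=score, reverse=True)) for c, ws in ranked]

# Example with made-up workloads
order = migration_order([
    Workload("adhoc_reports", "marketing", 0.1, 0.1, 0.5),
    Workload("ledger_archive", "finance", 0.0, 0.3, 1.0),
    Workload("nightly_elt", "finance", 0.4, 0.2, 0.9),
])
```

In this made-up example the finance community ranks first because its two workloads together relieve the most warehouse capacity; a real assessment would replace the formula with whatever weighting the business case supports.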
The Optimization Process
Profile
• Analyze all of the workload in your data warehouse
• Queries
• Objects
• User communities
Prioritize
• Framework driven methodology for ordering workloads
• Balance financial opportunity with business risk
Migrate
• Set up data ingest paths to Cloudera
• Map EDW workload to Cloudera
Validate
• Verify results
• Evaluate performance differences & tune
• Side-by-side “burn in” period
• Cut-over
Repeat annually to defer additional expansion
Sample EDW Rationalization Process
[Timeline chart: months M1 to M12 across Initial, Second, Third, and Fourth Quarters, with rows grouped under People, Process, and Technology]
Teams
• Program Management: Responsible for overall program success, resource assignment, project management, and risk mitigation
• Cloudera Migration Teams: Expert resources delivering initial project framework and advanced implementation releases
• ${Customer} Migration Teams: Customer staff resources, taking on increasing responsibility for release implementation over time
Activities
• Management & Risk Mitigation
• Initial EDW Assessment
• Architecture Oversight
• Assessment and Stratification Process
• Detailed Workload Analysis
• Implement Reference Architecture
• Establish Repeatable Migration Approach
• Enhance SDLC, Release, and Configuration Management Processes
Releases
• Release 1, Release 2, Release 3, Release 4, Release 5, through Release N, staggered along the timeline
Migration SDLC
• Assignment/Kick-off
• Execution
• Testing
• User Acceptance
• Documentation
• Sign-off
Workload Classification
• Cloudera Architecture: Implementing Cloudera’s reference architecture(s) and building environment to fit unique customer requirements
• Data Ecosystem Integration: BI, ETL, and other applications that require integration with the big data platform, including existing EDW
• Data Processing: High-scale batch data processing, implemented as SQL + scripting or via ETL tools, staging data stored across diverse, temp tables
• Self-service BI: Exploratory BI, data discovery, uncertain business questions and uncertain data
• Analytics: Training & scoring, predictive models, deep and broad data sets (within and beyond the warehouse)
• Archival Processes: Traditional archive storage and processes
Workload Complexity
Basic
• Leverages pre-existing architecture and integrations
• Utilizes all off-the-shelf components
• Repeatable solutions from existing training/documentation
Moderate
• Requires minimal modifications to existing architecture, integrations, or other dependencies
• Some expertise required for new design decisions
Advanced
• Establishing new reference architectures
• Several new design decisions involved
• Unique skillsets required (e.g., machine learning)
Sample Complexity vs. Time for Various Project Types
[Chart: y-axis Complexity of Task (Low, Moderate, High); x-axis Estimated Phase (1 to 4)]
Project types plotted:
• Machine Learning Modeling
• Graph Analytics Modeling
• Hadoop cluster install/config
• One-off ingest/ETL processes
• Predictive Analytics Modeling
• Production Certification
• Hadoop storage schemas
• Decision tree/forest/ensemble
• Data Pipelining
• Generic ingest/ETL processes
Mapping Resources to Project Task Type
[Chart: y-axis Complexity of Task (Low, Moderate, High); x-axis Estimated Phase (1 to 4)]
Resources plotted:
• Data Scientist
• Senior Architect
• Consultant
• Architect
• Principal Architect
Typical Big Data COE Program Roles
Staff Centrally and Train to Scale
Management & Leadership
• Big Data Visionary
• Executive Sponsor
• Program Manager
Technology & Ops
• Developers
• Admin
• Data Warehouse Specialist
• Architects
Business & Data
• Lead Data Scientist
• Lead Business Analyst
• LOB Rep (three shown)
• Data Wranglers
Benefits Summary
1. Lower costs of data management, growth
2. Improve quality of service
• Meet critical data processing SLAs
• Faster BI queries
3. Extend existing warehouse capacity
• Increase ROI from current investments
• More operational data – volume and schemas
• More business intelligence and analytics workloads
4. Retain all data for analysis
5. Deliver a foundation for innovation
• Bring more applications to Hadoop data for low incremental cost
The Experts Agree
Questions?

Options for Data Prep - A Survey of the Current Market
 
Cloud and Analytics - From Platforms to an Ecosystem
Cloud and Analytics - From Platforms to an EcosystemCloud and Analytics - From Platforms to an Ecosystem
Cloud and Analytics - From Platforms to an Ecosystem
 

Data Warehouse Optimization

  • 3. Finding Business Pains
      • Frequent or near-term EDW expansion/spend
      • Short time windows for data
      • SLA challenges with ELT
      • Reports/analytics that are “too big”
      • Compliance issues requiring long-term storage AND query
      • Resource restrictions/contention or disenfranchised/frustrated users
  • 4. Common Challenges with the Data Warehouse
      • Diagram labels: OLTP, Enterprise Applications, Extract, Transform, Load, Data Warehouse, Transform, Query, Business Intelligence
      • 1: Slow data transformations, missed SLAs.
      • 2: Slow queries, poor QoS and missed opportunities.
      • 3: Wrong or incomplete, modified copies are made.
      • 4: Must archive. Archived data can’t provide value.
      • 5: Constant pressure to buy additional warehouse capacity, just to maintain current quality of service. NO room to expand use cases. NO room to innovate.
  • 5. An EDH Complements the Data Warehouse
      • Diagram labels: OLTP, Enterprise Applications, Extract, Load, Data Warehouse, Cloudera, Transform, Query, Business Intelligence
      • 1: Data loaded when & where it’s needed.
      • 2: Empowered business analysts.
      • 3: Avoid “spreadmarts” across departments.
      • 4: Complete view of all your products, customers, etc.
      • 5: Cost effective, infinitely scalable, production ready enterprise data hub for all your data. All data. All users.
  • 6. Hadoop as a Data Warehouse???
  • 7. 2014 Gartner MQ for Data Warehouse DBMS: “A data warehouse DBMS is now expected to coordinate data virtualization strategies, and distributed file and/or processing approaches, to address changes in data management and access requirements.”
  • 9. Understanding Benefits for Your Organization
      • Help You Assess Your Enterprise Data Warehouse Ecosystem
      • Identify Viable Migration Candidates and Target Reference Architecture
      • Develop a Project Plan to Deliver the Full Scope of Benefits
      • Understand the Business Case for Making the Investment
  • 10. Working With You Through the EDW Assessment Process
      • Information: Collect information about your EDW environment
      • Analysis: Identify migration candidates; determine feasibility
      • Recommendations: Develop a migration plan; establish a business case
  • 12. Key Hadoop Platform Requirements
      • High availability
      • Disaster recovery
      • Downtime-less upgrades
      • Auditability
      • Low-latency SQL & BI support
      • Deep SAS & R support
  • 13. Customers Agree: Cloudera Delivers (Customer / Workload / Results)
      • Leading Payments Company / Analytics, ETL Processing, DR / Largest fraud discovery in firm history; time to report collapsed from 2 days to 2 hours; save $30M on DR
      • Global Money Center Bank / Data Processing (ELT) / Avoided tens of millions in expansion purchases; 42% faster processing
      • Mobile Device Manufacturer / Data Processing (ELT) / Offloaded 90% of data volume; keep all data
      • Fortune 500 Retailer / Analytics / More insights by supporting more exploration of more extensive & granular data
      • Leading Financial Regulator / Data Processing (ELT) and DR / Shrank EDW footprint by 4PB; 20X perf. boost
  • 14. Assessing Workloads and Data
      • Diagram labels: DATA WAREHOUSE; WORKLOADS (Operational Business Intelligence, Analytics, Self-Service BI, Data Processing (ELT)); DATA (Staged Data, Operational Data, Archival Data)
      • Data Processing (ELT)
          • Staged data, to be processed
          • Temp tables, BLOB/CLOB types, etc.
      • Analytics / Machine Learning
          • Deep and broad data sets, within and beyond the warehouse
      • Self-Service BI (Ad-Hoc Query)
          • Operational data, actively used for BI
          • Archival data, inactively used for BI
  • 15. Offload Data Processing (ELT)
      • What to Migrate: High-scale batch data processing; implemented as SQL + scripting or ETL running on expensive HW infrastructure; staging data stored across diverse, temp tables
      • Influencing Factors: High fraction of overall EDW utilization (25 – 80%); difficult to store, manage staging data in relational form; limited user adoption risk to migrate; ETL tools to simplify migration
      • Better in Cloudera: Over 2X the performance; 1/10th the cost
      • Only Available with Cloudera: Reliability for mission-critical workloads (high availability, disaster recovery, downtime-less upgrades); low-latency SQL processing, ability to absorb short-cycle ELT; broad support of leading data integration tools
      • Partners
  • 16. Offload Self-Service Business Intelligence
      • Workload: Self-Service BI, Exploratory BI, Data Discovery; uncertain business questions and uncertain data
      • Migration Priority: Fastest growing workload for many warehouses; comparable support for end user tools between Cloudera and DBMS products
      • Better in Cloudera: Schema flexibility; end user self-service on full fidelity data; 1/10th the cost
      • Only Available with Cloudera: Open source parallel interactive SQL engine (Cloudera Impala); integration and certification of every leading SSBI vendor
      • Partners
  • 17. Offload Analytics / Machine Learning
      • Workload and Data: Training & scoring predictive models; deep and broad data sets, within and beyond the warehouse
      • Influencing Factors: Statisticians want unconstrained analysis; limited DW compute resources; paying top dollar for warehouse data storage only to load into ML tools; inability to analyze data beyond the warehouse
      • Better in Cloudera: Greater user productivity (pre-packaged ML libraries, no more down-sampling); support for 3rd party ML tools; greater flexibility (SQL + MR + SAS procs); 1/10th the cost
      • Only Available with Cloudera: Ability to run SAS, R natively on the same cluster; interactive search and SQL experience for data exploration; built-in analytics libraries (Mahout, DataFu, ClouderaML); support from Cloudera’s Data Science team
      • Partners
  • 18. Sample Cloudera Tools for Assisting Migration
      • High-speed connector: Moves data between the two systems
      • Data definition: Tool for mapping EDW tables & datatypes to Hive tables & datatypes
      • Mainframe input / output format: Supports direct feed of mainframe data into Cloudera
      • Result validation: Verifies SQL applications in Cloudera produce the same results as the original applications
      • Support for SQL-H (planned): Remote queries from EDW to Cloudera
  • 20. Is Your Data Architecture Aligned to Your Use Case? Lay the Foundation for Data Migration and Ensure Success
      • Install and configure CDH and Cloudera Manager
      • Run standard and specialized performance tests
      • Recommend tuning, compression and decompression, and scheduler configurations
      • Document recommended cluster configuration
      • Train and certify Hadoop administrators
  • 21. How Quickly and Securely Can You Transition Your Data? Migrate Disparate Data Sources to Boost Performance
      • Collect low-efficiency data from various silos
      • Redeploy latent data from EDWs, RDBMSs, and Hadoop environment
      • Develop, test, and implement data processing jobs
      • Integrate Hadoop with relevant external systems
      • Document workload migration
  • 22. Is Your Operational Environment Ready for Handover? Maximize ROI by Rationalizing All Systems, Teams, and Workloads
      • Review current and future requirements
      • Review full ecosystem, all jobs, and regular processes
      • Review application architecture, ingestion pipeline, data schema, and data partitioning system
      • Review key management and monitoring processes and relevant production procedures
      • Recommend additional training to assure Hadoop expertise on management and operations teams
      • Document cluster configuration, solutions implementation, and production recommendations
  • 23. How Much Additional Value Can You Capture Long-Term? Ongoing Optimization Is Key to Deferring Additional Cost
      • Expand framework without expanding footprint
      • Rationalize beyond initial burn-in period
      • Evolve cluster to support additional use cases
      • Annually benchmark performance to diagnostic
      • Balance business opportunity against technical risk
  • 25. Prioritizing Workloads and Data
      • 1. Current EDW Constraints
          • Focus on computation constraints
          • Focus on disk space constraints
      • 2. Workload Transferability
          • Similar or same SQL functionality
          • Similar or same tools support
          • Opportunity for performance gains
      • 3. User Communities
          • Group related workloads by user community
          • Migrate one community at a time
  • 26. The Optimization Process (repeat annually to defer additional expansion)
      • Profile
          • Analyze all of the workload in your data warehouse: queries, objects, user communities
      • Prioritize
          • Framework driven methodology for ordering workloads
          • Balance financial opportunity with business risk
      • Migrate
          • Set up data ingest paths to Cloudera
          • Map EDW workload to Cloudera
      • Validate
          • Verify results
          • Evaluate performance differences & tune
          • Side-by-side “burn in” period
          • Cut-over
  • 27. Sample EDW Rationalization Process
      • Timeline: Initial Quarter, Second Quarter, Third Quarter, Fourth Quarter (M1 to M12)
      • Program Management: Responsible for overall program success, resource assignment, project management, and risk mitigation
      • Cloudera Migration Teams: Expert resources delivering initial project framework and advanced implementation releases
      • ${Customer} Migration Teams: Customer staff resources, taking on increasing responsibility for release implementation over time
      • Chart rows (People, Process, Technology): Management & Risk Mitigation; Initial EDW Assessment; Architecture Oversight; Assessment and Stratification Process; Detailed Workload Analysis; Implement Reference Architecture; Establish Repeatable Migration Approach; Enhance SDLC, Release, and Configuration Management Processes
      • Releases: Release 1, Release 2, Release 3, Release 4, Release 5, Release N
      • Migration SDLC: Assignment/Kick-off, Execution, Testing, User Acceptance, Documentation, Sign-off
  • 28. Workload Classification
      • Cloudera Architecture: Implementing Cloudera’s reference architecture(s) and building environment to fit unique customer requirements
      • Data Ecosystem Integration: BI, ETL, and other applications that require integration with the big data platform, including existing EDW
      • Data Processing: High-scale batch data processing, implemented as SQL + scripting or via ETL tools; staging data stored across diverse, temp tables
      • Self-service BI: Exploratory BI, Data Discovery; uncertain business questions and uncertain data
      • Analytics: Training & scoring, predictive models, deep and broad data sets (within and beyond the warehouse)
      • Archival Processes: Traditional archive storage and processes
  • 29. Workload Complexity
      • Basic
          • Leverages pre-existing architecture and integrations
          • Utilizes all off-the-shelf components
          • Repeatable solutions from existing training/documentation
      • Moderate
          • Requires minimal modifications to existing architecture, integrations, or other dependencies
          • Some expertise required for new design decisions
      • Advanced
          • Establishing new reference architectures
          • Several new design decisions involved
          • Unique skillsets required (e.g., machine learning)
  • 30. Sample Complexity vs. Time for Various Project Types
      • Chart axes: Complexity of Task (Low, Moderate, High) vs. Estimated Phase (1 to 4)
      • Plotted project types: Machine Learning Modeling; Graph Analytics Modeling; Hadoop cluster install/config; One-off ingest/ETL processes; Predictive Analytics Modeling; Production Certification; Hadoop storage schemas; Decision tree/forest/ensemble; Data Pipelining; Generic ingest/ETL processes
  • 31. Mapping Resources to Project Task Type
      • Chart axes: Complexity of Task (Low, Moderate, High) vs. Estimated Phase (1 to 4)
      • Resource labels as extracted: Data Scientist Senior Architect Consultant Architect Principal Architect
  • 32. Typical Big Data COE Program Roles: Staff Centrally and Train to Scale
      • Org chart labels as extracted: Developers, Admin, Data Warehouse Specialist, Architects, Technology & Ops, Management & Leadership, Big Data Visionary, Executive Sponsor, Program Manager, Business & Data Lead Data Scientist Lead Business Analyst, LOB Rep, LOB Rep, LOB Rep, Data Wranglers
  • 33. Benefits Summary
      • 1. Lower costs of data management, growth
      • 2. Improve quality of service
          • Meet critical data processing SLAs
          • Faster BI queries
      • 3. Extend existing warehouse capacity
          • Increase ROI from current investments
          • More operational data: volume and schemas
          • More business intelligence and analytics workloads
      • 4. Retain all data for analysis
      • 5. Deliver a foundation for innovation
          • Bring more applications to Hadoop data for low incremental cost
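
The "Validate" step of the optimization process (slide 26) and the "Result validation" tool (slide 18) both come down to checking that a migrated SQL application returns the same rows as the original. The deck does not show how Cloudera's tool does this; the following is only a minimal sketch of the idea, assuming both result sets have already been fetched into Python as lists of tuples. The function names `normalize` and `compare_results` and the 6-decimal rounding tolerance are illustrative assumptions, not part of any Cloudera product.

```python
from collections import Counter


def normalize(row):
    """Round floats so tiny representation differences between engines
    are not reported as mismatches (6 decimals is an arbitrary choice)."""
    return tuple(round(v, 6) if isinstance(v, float) else v for v in row)


def compare_results(edw_rows, hadoop_rows):
    """Compare two query results as multisets of rows.

    Row order is ignored, but duplicate rows are counted, so a row that
    appears twice in one result and once in the other is flagged.
    """
    edw = Counter(normalize(r) for r in edw_rows)
    hadoop = Counter(normalize(r) for r in hadoop_rows)
    return {
        "match": edw == hadoop,
        "missing_in_hadoop": list((edw - hadoop).elements()),
        "extra_in_hadoop": list((hadoop - edw).elements()),
    }


if __name__ == "__main__":
    # Hypothetical rows standing in for the output of the same query
    # run on the warehouse and on the Hadoop cluster.
    edw_rows = [(1, "a", 2.5), (2, "b", 3.0)]
    hadoop_rows = [(2, "b", 3.0000000001), (1, "a", 2.5)]
    print(compare_results(edw_rows, hadoop_rows))
```

Comparing as multisets rather than sorted lists avoids depending on either engine's default ordering; for very large results a real check would more likely compare row counts and per-column aggregates instead of full row sets.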

Editor's notes

  1. In this session, we will explore using Hadoop to address questions and issues surrounding: cost of storage; value of accessibility; getting maximum return on your IT investments and all of your data.
  2. Tie workloads to data types