In the digital world, semi-structured data is as important as transactional, structured data. Both need to be analyzed to create a competitive advantage. Unfortunately, neither the data lake nor the data warehouse is adequate on its own for analyzing both data types.
These slides, based on the webinar from EMA Research and Vertica, examine the push toward the unified analytics warehouse (UAW), which merges the data lake and the data warehouse.
How to Merge the Data Lake and the Data Warehouse: The Power of a Unified Analytics Warehouse
1. IT & DATA MANAGEMENT RESEARCH,
INDUSTRY ANALYSIS & CONSULTING
How to Merge the Data Lake and the Data Warehouse
The Power of a Unified Analytics Warehouse
John Santaferraro
Research Director
EMA
Jeremiah Morrow
Senior Product Marketing Manager
Vertica
2. Watch the On-Demand Webinar
How to Merge the Data Lake and the Data Warehouse: The Power of a Unified Analytics Warehouse on-demand webinar:
https://info.enterprisemanagement.com/how-to-merge-the-data-lake-and-the-data-warehouse-webinar-ws
Check out upcoming webinars from EMA here:
https://www.enterprisemanagement.com/freeResearch
9. Data Warehouse
[Diagram: data warehouse architecture. Labels: transactional data (customer, operational, financial); application data (CRM, ERP, billing); message queues; files; batch ETL; data warehouse; analytics database; business intelligence; visualization.]
10. Data Lake
[Diagram: data lake architecture, cloud and/or on-premises.
Sources: contextual data (files, weather, geo); transactional data (application data, OLTP/ODS); streaming data (application data, web clicks, logs, sensors, operational metrics, user tracking, geo-location).
Ingestion and processing: low latency; batch; batch ETL, or EL with T done on mass storage; distributed pub/sub; stream processing; data prep / enrichment.
Storage and query: data lake mass storage; distributed object storage; prepped data; query engine / machine learning.
Consumers: visualization; applications; artificial intelligence.]
11. Cooperative Data Architecture
[Diagram: cooperative data architecture, cloud and/or on-premises.
Sources: contextual data (files, weather, geo); transactional data (application data, OLTP/ODS); streaming data (application data, web clicks, logs, sensors, operational metrics, user tracking, geo-location).
Ingestion and processing: low latency; batch; batch ETL, or EL with T done on mass storage; distributed pub/sub; stream processing; data prep / enrichment.
Storage: data lake mass storage; distributed object storage; columnar data.
Links between components: import; export; query.
Consumers: visualization; applications; artificial intelligence.]
25. Unified Analytics Warehouse
[Diagram: unified analytics warehouse architecture, cloud and/or on-premises.
Sources: contextual data (files, weather, geo); transactional data (application data, OLTP/ODS); streaming data (application data, web clicks, logs, sensors, operational metrics, user tracking, geo-location).
Ingestion and processing: low latency; batch; batch ETL, or EL with T done in warehouse; stream processing; ingestion / ELT / data prep / enrichment.
Storage: shared storage; HDFS.
Workloads: managed ML models; reporting / BI; data science / ML; departmental use.
Consumers: visualization; applications; artificial intelligence.]
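The "EL with T done in warehouse" label on this slide can be illustrated with a minimal sketch. It is not taken from the slides: Python's built-in sqlite3 module stands in for the warehouse, and the table names, column names, and sample rows are hypothetical. Raw records are loaded unchanged first, and the transform then runs as SQL inside the database.

```python
import sqlite3

# sqlite3 is only a stand-in for an analytics warehouse (assumption).
conn = sqlite3.connect(":memory:")

# Extract + Load: land the raw events unchanged in a staging table.
# Table, columns, and rows are hypothetical sample data.
conn.execute("CREATE TABLE raw_clicks (user_id TEXT, url TEXT, ts TEXT)")
raw_rows = [
    ("u1", "/home", "2021-03-01T10:00:00"),
    ("u1", "/pricing", "2021-03-01T10:05:00"),
    ("u2", "/home", "2021-03-01T11:00:00"),
]
conn.executemany("INSERT INTO raw_clicks VALUES (?, ?, ?)", raw_rows)

# Transform: performed after loading, with SQL, inside the database.
conn.execute(
    """
    CREATE TABLE clicks_per_user AS
    SELECT user_id, COUNT(*) AS clicks
    FROM raw_clicks
    GROUP BY user_id
    """
)

clicks_per_user = conn.execute(
    "SELECT user_id, clicks FROM clicks_per_user ORDER BY user_id"
).fetchall()
print(clicks_per_user)
```

In a batch ETL flow the aggregation would instead happen before the load, so only the summarized table would reach the warehouse and the raw rows would not be available there for later questions.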
38. What is Vertica?
Vertica is an advanced analytics platform built for the scale and complexity of today’s data-driven world. It combines the power of a high-performance, MPP query engine with advanced analytics and machine learning.
SQL Database: Load and store data in a data warehouse designed for blazing fast analytics
Query Engine: Ask complex analytical questions and get fast answers regardless of where the data resides
Analytics & ML: Create, train and deploy advanced analytics and machine learning models at massive scale
39. Remove scale, performance and capacity constraints
Vertica Advanced Analytics Platform: SQL Database + Query Engine + Analytics & ML
Scale Data Volumes, Scale Users
Get data quickly enough to act upon it, explore your data interactively, and enable everyone to make their own data-driven decisions
Fear of more users or growing data volumes is a thing of the past
40. 2020 McKnight Consulting Group Benchmark
Vertica in Eon Mode, Amazon Redshift, and an Unnamed Data Cloud Platform
“At 60 concurrent users, Vertica in Eon Mode was 1.8x less expensive than Redshift. The unnamed data cloud platform consistently had the highest price performance.”
“Vertica in Eon Mode had 1.14x the QPH of the next highest database (unnamed data cloud platform) for the 60 concurrent user workload.”
“Vertica in Eon Mode consistently had the shortest elapsed time for the longest running [query] thread across the concurrency profiles at 250 TB.”
42. Machine Learning can simplify business processes and improve the customer experience
Predictive Maintenance
Reactive Maintenance: System Problem → Customer Call → Dispatch → Onsite Troubleshooting → Parts Delivery → Repair or Replace → System Functional
Predictive Maintenance: Remote Monitoring Predicts Potential Failure → Service Scheduled → Problem Avoided
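The contrast between the reactive and predictive flows on this slide can be sketched in a few lines. This is an illustrative example only, not the method used in the case study: the temperature readings, the failure level, the alert threshold, and the helper name first_alert_index are all hypothetical. A rolling average stands in for "remote monitoring predicts potential failure", and the sketch compares when the alert fires with when the system would actually fail.

```python
from collections import deque
from typing import Optional, Sequence


def first_alert_index(
    readings: Sequence[float], window: int, threshold: float
) -> Optional[int]:
    """Return the index at which the rolling mean of the last `window`
    readings first exceeds `threshold`, or None if it never does."""
    recent = deque(maxlen=window)
    for i, value in enumerate(readings):
        recent.append(value)
        if len(recent) == window and sum(recent) / window > threshold:
            return i
    return None


# Hypothetical component temperatures, one reading per day.
temps = [60, 61, 60, 62, 66, 70, 75, 81, 95]
FAILURE_LEVEL = 90  # assumed level at which the system fails

# Predictive path: alert on a rising trend before the failure level.
alert_at = first_alert_index(temps, window=3, threshold=68)

# Reactive path: the problem is only noticed once the system has failed.
failure_at = next(i for i, t in enumerate(temps) if t >= FAILURE_LEVEL)

print(alert_at, failure_at)
```

With this sample data the alert fires before the failure reading, which is the window in which service could be scheduled. A production system would use a trained model on many signals instead of a single fixed threshold.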
43. High Level Architecture
[Diagram: high-level architecture. Labels: Installed Base; Philips Remote Service Network; Legacy Databases; CRM & Business Systems; Parallel Extract Transform Load; Data Lake (Raw Data Archive); DSP (HSI); Remote Monitoring Radar 4.0; Remote Service RSW & CHA; Reliability Dashboards; R&D Access.]
44. Outcomes
"Remote service provides us with an engineer online all the time. They tell us when we've got a fault before we know we've got a fault. And not only that, they can fix the fault before we knew we had a fault. And that's impressive." - Cobalt Imaging, Gloucester
"Now we will have more uptime on the scanner and potentially be able to see more patients…It's a new level of service for us, with a greater satisfaction." - Radiographer, New Stobhill Hospital in Glasgow
500 TB of data in more than 300 tables. 30 trillion data points
80 different data sources integrated
24/7 live data feeds. Millions of logs per week.
8 months from scratch to production
…and marching towards 0 unplanned downtime