SlideShare a Scribd company logo
1 of 21
Vertica
Why?
●
●

●
●

Postgres benchmarks (2014-01-13)
Remember, these queries are expected to
occur within a web request/repsonse cycle!
After 60 seconds connections time out
We are used to web pages loading in 1-2
seconds
Count
●

SELECT count(*) FROM transactions

●

(229527.0ms)

●

=> [{"count"=>78144197}]

●

SELECT count(*) FROM transactions WHERE client_id = 131

●

(85451.0ms)

●

=> [{"count"=>34406416}]
Yikes.
Don't panic!
(and carry a towel)
We have a few tricks.
●

What if we had a table that recorded 1 row per
client that tracked all the counts of transactions
for each client?
id

client_id

count_transactions

1

131

34406416

2

132

10587625

3

133

85095

What if we wired this table up to a SQL parser?
Mondrian!
●
●

●
●

Robust aggregate table interface
Auto recognizes aggregate tables via naming
convention
Queries are directed to the correct table
If aggregate tables are missed, fall back to fact
table

●

Can define multiple aggregate tables / fact table

●

Also has an intelligent segment cache
But theres a problem.
●

●

SELECT count(distinct(user_id) FROM transactions
Aggregate tables rely on properties of addition
operations

●

distinct(set_1) + distinct(set_2) != distinct(set_1 + set_2)

●

We have no choice but to query our fact table.
Ok, now we can panic.
Options?
●

●

●

NOSQL (map reduce) – Hbase/Hadoop,
Mongo, etc
Columnar – Lucid, Paraccell, Vertica
Bleeding Edge – Google BigQuery, Apache
Drill
Much Cluster, Many Computer
●

●

●

All of these solutions are using distributed
systems to query lots of data quickly
Querying 100 million rows on a single computer
is not fast on current hardware
And we are projecting to have a lot more than
100 million rows this year
Vertica
●

Columnar

●

Distributed

●

Speaks SQL

●

Compatible with Mondrian

●

Its fast!

●

“drop in” replacement for Postgres
Row based database
id

name

favorite_color

1

brian

blue

2

dennis

red

3

nelson

green

4

spencer

green

(1,brian,blue)(2,dennis,red)(3,nelson,green)(4,spencer,green)
Columnar database
id

name

favorite_color

1

brian

blue

2

dennis

red

3

nelson

green

4

spencer

green

(1,2,3,4)(brian,dennis,nelson,spencer)(blue,red,green,green)
Do you even index, bro?
Nope!
●
●

●

●

●

Vertica has no indexes
Vertica has “projections” which are similar to a
materialized view
Projections are transparent to the query (like an
index)
Projections are used to optimize JOIN, GROUP
BY, and other sorts of queries
Provides a tool to autobuild projections based
on query analysis
Tradeoffs
Columnar

Row Based

●

Slow single row read

●

Fast single row read

●

Slow single row write

●

Fast single row write

●

Fast aggreagtes

●

Slow aggregates

●

Compression (5-10x)

●

No compression
Distributed
●

Data split among servers

●

Horizontal scaling

●

Data is compressed, so its stored in memory

●

Node failure is tolerated

●

Network IO is important
Count All Transactions
Postgres – 230s

Vertica – 2.10s

Distinct User Count All Transactions
Postgres – 187s

Vertica – 0.63s
So you just drop it in, right?
●

6 or 7 gems needed updates

●

Had to roll an activerecord driver

●

AreL saved us from a lot of pain

●

●
●

Still some SQL problems (database drop,
multirow insert)
Lots of DevOps help needed
Currently deployed to sand and qa, hitting
production soon!
Thank You

More Related Content

What's hot

Understanding Query Optimization with ‘regular’ and ‘Exadata’ Oracle
Understanding Query Optimization with ‘regular’ and ‘Exadata’ OracleUnderstanding Query Optimization with ‘regular’ and ‘Exadata’ Oracle
Understanding Query Optimization with ‘regular’ and ‘Exadata’ Oracle
Guatemala User Group
 

What's hot (20)

Data pipelines from zero to solid
Data pipelines from zero to solidData pipelines from zero to solid
Data pipelines from zero to solid
 
Introduction SQL Analytics on Lakehouse Architecture
Introduction SQL Analytics on Lakehouse ArchitectureIntroduction SQL Analytics on Lakehouse Architecture
Introduction SQL Analytics on Lakehouse Architecture
 
Distributed SQL Databases Deconstructed
Distributed SQL Databases DeconstructedDistributed SQL Databases Deconstructed
Distributed SQL Databases Deconstructed
 
Understanding SQL Trace, TKPROF and Execution Plan for beginners
Understanding SQL Trace, TKPROF and Execution Plan for beginnersUnderstanding SQL Trace, TKPROF and Execution Plan for beginners
Understanding SQL Trace, TKPROF and Execution Plan for beginners
 
Understanding Query Plans and Spark UIs
Understanding Query Plans and Spark UIsUnderstanding Query Plans and Spark UIs
Understanding Query Plans and Spark UIs
 
Kafka Tutorial - Introduction to Apache Kafka (Part 1)
Kafka Tutorial - Introduction to Apache Kafka (Part 1)Kafka Tutorial - Introduction to Apache Kafka (Part 1)
Kafka Tutorial - Introduction to Apache Kafka (Part 1)
 
A Thorough Comparison of Delta Lake, Iceberg and Hudi
A Thorough Comparison of Delta Lake, Iceberg and HudiA Thorough Comparison of Delta Lake, Iceberg and Hudi
A Thorough Comparison of Delta Lake, Iceberg and Hudi
 
The columnar roadmap: Apache Parquet and Apache Arrow
The columnar roadmap: Apache Parquet and Apache ArrowThe columnar roadmap: Apache Parquet and Apache Arrow
The columnar roadmap: Apache Parquet and Apache Arrow
 
Simplify CDC Pipeline with Spark Streaming SQL and Delta Lake
Simplify CDC Pipeline with Spark Streaming SQL and Delta LakeSimplify CDC Pipeline with Spark Streaming SQL and Delta Lake
Simplify CDC Pipeline with Spark Streaming SQL and Delta Lake
 
Spark SQL Deep Dive @ Melbourne Spark Meetup
Spark SQL Deep Dive @ Melbourne Spark MeetupSpark SQL Deep Dive @ Melbourne Spark Meetup
Spark SQL Deep Dive @ Melbourne Spark Meetup
 
Catalogs - Turning a Set of Parquet Files into a Data Set
Catalogs - Turning a Set of Parquet Files into a Data SetCatalogs - Turning a Set of Parquet Files into a Data Set
Catalogs - Turning a Set of Parquet Files into a Data Set
 
Understanding Query Optimization with ‘regular’ and ‘Exadata’ Oracle
Understanding Query Optimization with ‘regular’ and ‘Exadata’ OracleUnderstanding Query Optimization with ‘regular’ and ‘Exadata’ Oracle
Understanding Query Optimization with ‘regular’ and ‘Exadata’ Oracle
 
Apache Hudi: The Path Forward
Apache Hudi: The Path ForwardApache Hudi: The Path Forward
Apache Hudi: The Path Forward
 
Apache storm vs. Spark Streaming
Apache storm vs. Spark StreamingApache storm vs. Spark Streaming
Apache storm vs. Spark Streaming
 
Cosco: An Efficient Facebook-Scale Shuffle Service
Cosco: An Efficient Facebook-Scale Shuffle ServiceCosco: An Efficient Facebook-Scale Shuffle Service
Cosco: An Efficient Facebook-Scale Shuffle Service
 
Run Apache Spark on Kubernetes in Large Scale_ Challenges and Solutions-2.pdf
Run Apache Spark on Kubernetes in Large Scale_ Challenges and Solutions-2.pdfRun Apache Spark on Kubernetes in Large Scale_ Challenges and Solutions-2.pdf
Run Apache Spark on Kubernetes in Large Scale_ Challenges and Solutions-2.pdf
 
Databricks Fundamentals
Databricks FundamentalsDatabricks Fundamentals
Databricks Fundamentals
 
Introduction to Vertica (Architecture & More)
Introduction to Vertica (Architecture & More)Introduction to Vertica (Architecture & More)
Introduction to Vertica (Architecture & More)
 
Data Source API in Spark
Data Source API in SparkData Source API in Spark
Data Source API in Spark
 
Change Data Feed in Delta
Change Data Feed in DeltaChange Data Feed in Delta
Change Data Feed in Delta
 

Viewers also liked

Bridging Structured and Unstructred Data with Apache Hadoop and Vertica
Bridging Structured and Unstructred Data with Apache Hadoop and VerticaBridging Structured and Unstructred Data with Apache Hadoop and Vertica
Bridging Structured and Unstructred Data with Apache Hadoop and Vertica
Steve Watt
 

Viewers also liked (20)

Vertica loading best practices
Vertica loading best practicesVertica loading best practices
Vertica loading best practices
 
HP Vertica basics
HP Vertica basicsHP Vertica basics
HP Vertica basics
 
Vertica
VerticaVertica
Vertica
 
Vertica architecture
Vertica architectureVertica architecture
Vertica architecture
 
How to install Vertica in a single node.
How to install Vertica in a single node.How to install Vertica in a single node.
How to install Vertica in a single node.
 
Hp vertica certification guide
Hp vertica certification guideHp vertica certification guide
Hp vertica certification guide
 
Apps, hybrider eller responsivt design
Apps, hybrider eller responsivt designApps, hybrider eller responsivt design
Apps, hybrider eller responsivt design
 
Mobil E-handel i Danmark 2011
Mobil E-handel i Danmark 2011Mobil E-handel i Danmark 2011
Mobil E-handel i Danmark 2011
 
Mobile business apps 2011
Mobile business apps 2011Mobile business apps 2011
Mobile business apps 2011
 
Mobile e handel januar 2012
Mobile e handel januar 2012Mobile e handel januar 2012
Mobile e handel januar 2012
 
Succes med Hybrid Mobil App - Phonegap BDmobil 2014
Succes med Hybrid Mobil App - Phonegap BDmobil 2014Succes med Hybrid Mobil App - Phonegap BDmobil 2014
Succes med Hybrid Mobil App - Phonegap BDmobil 2014
 
Mobile megatrends 2011
Mobile megatrends 2011Mobile megatrends 2011
Mobile megatrends 2011
 
Mobile trends e handel mode 2013, april
Mobile trends e handel mode 2013, aprilMobile trends e handel mode 2013, april
Mobile trends e handel mode 2013, april
 
Seminar Mobil ehandel livsstilstrends 2014
Seminar Mobil ehandel livsstilstrends 2014Seminar Mobil ehandel livsstilstrends 2014
Seminar Mobil ehandel livsstilstrends 2014
 
Vertica
VerticaVertica
Vertica
 
Vertica mpp columnar dbms
Vertica mpp columnar dbmsVertica mpp columnar dbms
Vertica mpp columnar dbms
 
Vertica finalist interview
Vertica finalist interviewVertica finalist interview
Vertica finalist interview
 
Optimize Your Vertica Data Management Infrastructure
Optimize Your Vertica Data Management InfrastructureOptimize Your Vertica Data Management Infrastructure
Optimize Your Vertica Data Management Infrastructure
 
Vertica the convertro way
Vertica   the convertro wayVertica   the convertro way
Vertica the convertro way
 
Bridging Structured and Unstructred Data with Apache Hadoop and Vertica
Bridging Structured and Unstructred Data with Apache Hadoop and VerticaBridging Structured and Unstructred Data with Apache Hadoop and Vertica
Bridging Structured and Unstructred Data with Apache Hadoop and Vertica
 

Similar to Vertica

Sql on hadoop the secret presentation.3pptx
Sql on hadoop  the secret presentation.3pptxSql on hadoop  the secret presentation.3pptx
Sql on hadoop the secret presentation.3pptx
Paulo Alonso
 
Plmce2012 scaling pinterest
Plmce2012 scaling pinterestPlmce2012 scaling pinterest
Plmce2012 scaling pinterest
Mohit Jain
 

Similar to Vertica (20)

Amazon Redshift
Amazon RedshiftAmazon Redshift
Amazon Redshift
 
Accelerating analytics on the Sensor and IoT Data.
Accelerating analytics on the Sensor and IoT Data. Accelerating analytics on the Sensor and IoT Data.
Accelerating analytics on the Sensor and IoT Data.
 
Sql on hadoop the secret presentation.3pptx
Sql on hadoop  the secret presentation.3pptxSql on hadoop  the secret presentation.3pptx
Sql on hadoop the secret presentation.3pptx
 
Social media analytics using Azure Technologies
Social media analytics using Azure TechnologiesSocial media analytics using Azure Technologies
Social media analytics using Azure Technologies
 
Using Graph Analysis and Fraud Detection in the Fintech Industry
Using Graph Analysis and Fraud Detection in the Fintech IndustryUsing Graph Analysis and Fraud Detection in the Fintech Industry
Using Graph Analysis and Fraud Detection in the Fintech Industry
 
Using Graph Analysis and Fraud Detection in the Fintech Industry
Using Graph Analysis and Fraud Detection in the Fintech IndustryUsing Graph Analysis and Fraud Detection in the Fintech Industry
Using Graph Analysis and Fraud Detection in the Fintech Industry
 
Why Gateways are Important in Your IoT Architecture
Why Gateways are Important in Your IoT ArchitectureWhy Gateways are Important in Your IoT Architecture
Why Gateways are Important in Your IoT Architecture
 
OSMC 2018 | Learnings, patterns and Uber’s metrics platform M3, open sourced ...
OSMC 2018 | Learnings, patterns and Uber’s metrics platform M3, open sourced ...OSMC 2018 | Learnings, patterns and Uber’s metrics platform M3, open sourced ...
OSMC 2018 | Learnings, patterns and Uber’s metrics platform M3, open sourced ...
 
Plmce2012 scaling pinterest
Plmce2012 scaling pinterestPlmce2012 scaling pinterest
Plmce2012 scaling pinterest
 
Beyond php - it's not (just) about the code
Beyond php - it's not (just) about the codeBeyond php - it's not (just) about the code
Beyond php - it's not (just) about the code
 
Apache Calcite: One Frontend to Rule Them All
Apache Calcite: One Frontend to Rule Them AllApache Calcite: One Frontend to Rule Them All
Apache Calcite: One Frontend to Rule Them All
 
Large volume data analysis on the Typesafe Reactive Platform - Big Data Scala...
Large volume data analysis on the Typesafe Reactive Platform - Big Data Scala...Large volume data analysis on the Typesafe Reactive Platform - Big Data Scala...
Large volume data analysis on the Typesafe Reactive Platform - Big Data Scala...
 
Open Source 1010 and Quest InSync presentations March 30th, 2021 on MySQL Ind...
Open Source 1010 and Quest InSync presentations March 30th, 2021 on MySQL Ind...Open Source 1010 and Quest InSync presentations March 30th, 2021 on MySQL Ind...
Open Source 1010 and Quest InSync presentations March 30th, 2021 on MySQL Ind...
 
Beyond PHP - it's not (just) about the code
Beyond PHP - it's not (just) about the codeBeyond PHP - it's not (just) about the code
Beyond PHP - it's not (just) about the code
 
Data Modeling, Normalization, and De-Normalization | PostgresOpen 2019 | Dimi...
Data Modeling, Normalization, and De-Normalization | PostgresOpen 2019 | Dimi...Data Modeling, Normalization, and De-Normalization | PostgresOpen 2019 | Dimi...
Data Modeling, Normalization, and De-Normalization | PostgresOpen 2019 | Dimi...
 
Building Your Data Warehouse with Amazon Redshift
Building Your Data Warehouse with Amazon RedshiftBuilding Your Data Warehouse with Amazon Redshift
Building Your Data Warehouse with Amazon Redshift
 
Use Your MySQL Knowledge to Become an Instant Cassandra Guru
Use Your MySQL Knowledge to Become an Instant Cassandra GuruUse Your MySQL Knowledge to Become an Instant Cassandra Guru
Use Your MySQL Knowledge to Become an Instant Cassandra Guru
 
Introducing Multi Valued Vectors Fields in Apache Lucene
Introducing Multi Valued Vectors Fields in Apache LuceneIntroducing Multi Valued Vectors Fields in Apache Lucene
Introducing Multi Valued Vectors Fields in Apache Lucene
 
Multi Valued Vectors Lucene
Multi Valued Vectors LuceneMulti Valued Vectors Lucene
Multi Valued Vectors Lucene
 
Data all over the place! How SQL and Apache Calcite bring sanity to streaming...
Data all over the place! How SQL and Apache Calcite bring sanity to streaming...Data all over the place! How SQL and Apache Calcite bring sanity to streaming...
Data all over the place! How SQL and Apache Calcite bring sanity to streaming...
 

Recently uploaded

Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
WSO2
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 

Recently uploaded (20)

WSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering Developers
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 

Vertica