TimescaleDB: Re-engineering PostgreSQL as a time-series database
David Kohn
R & D Engineer, Timescale
david@timescale.com · github.com/timescale · Apache 2 License
Open Source (Apache 2.0)
• github.com/timescale/timescaledb
Join the Community
• slack.timescale.com
Time-series Data is Everywhere
• Industrial Machines
• AI & ML Inferences
• Energy & Utilities
• Web/mobile Events
• Transportation & Logistics
• Financial
• Datacenter & DevOps
Of every type
• Regular:  Machines and sensors
• Irregular:  Web and machine events
• Forward looking:  Logistics and forecasting
• Derived data:  Inferences from AI/ML models
Time-series data is recording
the change of your world
Time-series data is recording
every datapoint as a new entry
Existing databases don't work for time series
• Relational databases: Hard to scale
• NoSQL databases: Underperform on complex queries, are hard to use, and lead to data silos
Every other time-series database today is NoSQL
1 million+ downloads in <18 months
Empower Organizations to Analyze the Past, Understand the Present, and Predict the Future
Postgres 9.6.2 on Azure standard DS4 v2 (8 cores), SSD (premium LRS storage)
Each row has 12 columns (1 timestamp, indexed 1 host ID, 10 metrics)
Hard to scale
B-tree Insert Pain
[Figure: B-tree insert animation. Labels: insert batch (5, 8, 17); memory capacity: 2 nodes; in memory; write to disk.]
Challenge in scaling up
• Indexes write to random parts of B-tree
• As table grows large
– Indexes no longer fit in memory
– Random writes cause swapping
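The effect described in the bullets above can be approximated with a toy model. The sketch below is not PostgreSQL's buffer manager; it assumes a hypothetical LRU cache of index leaf pages (`count_page_faults`, `keys_per_page`, and `cache_pages` are made-up names and sizes) and counts how often an insert touches a page that is not in memory.

```python
import random
from collections import OrderedDict


def count_page_faults(keys, keys_per_page=100, cache_pages=50):
    """Count inserts that touch an index page not currently cached.

    Assumption: key k lives on leaf page k // keys_per_page, and memory
    holds at most cache_pages pages with least-recently-used eviction.
    """
    cache = OrderedDict()
    faults = 0
    for key in keys:
        page = key // keys_per_page
        if page in cache:
            cache.move_to_end(page)        # mark page as most recently used
        else:
            faults += 1                    # page has to come from disk
            cache[page] = True
            if len(cache) > cache_pages:
                cache.popitem(last=False)  # evict the least recently used page
    return faults


random.seed(0)
n = 100_000
time_ordered = list(range(n))              # keys arrive in index order
random_order = random.sample(range(n), n)  # keys hit random parts of the index

sequential_faults = count_page_faults(time_ordered)
random_faults = count_page_faults(random_order)
print(sequential_faults, random_faults)
```

With ordered keys each page is loaded once; with random keys most inserts miss the cache once the index is much larger than memory.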
[Figure: index on (Device, Time DESC), with entries ranging from Device: A, Time: 01:01:01 to Device: Z, Time: 01:01:01]
Is there a better way?
Scale & Performance
• Ingest millions of datapoints per second
• Scale to 100s of billions of rows
• Elastically scale up and out
• Faster than Influx, Cassandra, Mongo, vanilla Postgres
Proven & Enterprise Ready
• Inherits 20+ years of PostgreSQL reliability
• Streaming replication, HA, backup/recovery
• Data lifecycle: continuous rollups, retention, archiving
• Enterprise-grade security
SQL for time series
• Zero learning curve
• Zero friction: Existing tools and connectors work
• Enrich understanding: JOIN against relational data
• Freedom for data model, no cardinality issues
TimescaleDB

Scalable time-series database, full SQL
Packaged as a PostgreSQL extension
TimescaleDB vs. PostgreSQL (batch inserts)
• >20x
• 1.11M metrics / s
TimescaleDB 0.5, Postgres 9.6.2 on Azure standard DS4 v2 (8 cores), SSD (LRS storage)
Each row has 12 columns (1 timestamp, indexed 1 host ID, 10 metrics)
TimescaleDB vs. PostgreSQL: speedup
• Table scans, simple column rollups: ~0-20%
• GROUPBYs: 20-200%
• Time-ordered GROUPBYs: 400-10000x
• DELETEs: 2000x
Enjoy the entire PostgreSQL ecosystem
NoSQL champion: Log-Structured Merge Trees
Key-value store with indexed key lookup at high-write rates
• Compressed data storage
• Common approach for time series: use key <name, tags, field, time>
NoSQL + LSMTs Come at a Cost
• Significant memory overhead
• Lack of secondary indexes / tag lock-in
• Less powerful queries
• Weaker consistency (no ACID)
• No JOINS
• Loss of SQL ecosystem
TimescaleDB vs. MongoDB
• 20% higher inserts
Query speedup
• Table scans, column rollups: ~0%
• GROUPBYs: 4-6x
• Time-ordered GROUPBYs: 1450x
• Lastpoint: 101x
TimescaleDB 0.9.2, MongoDB 3.6, Azure standard D8s v3 (8 vCPU), 4 1-TB disks in raid0
TimescaleDB vs. Cassandra
• 10x higher inserts
Query speedup
• Table scans, column rollups: 2-44x
• GROUPBYs: 1-3x
• Time-ordered GROUPBYs: 1900x
• Lastpoint: 1400x
TimescaleDB 0.5, Cassandra 3.11.0, Azure standard DS4 v2 (8 cores), SSD (LRS storage)
Each TimescaleDB row has 12 columns (1 timestamp, indexed 1 host ID, 10 metrics)
Each Cassandra row has 2 columns (1 key, combo of tags + host + timestamp)
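                                | TimescaleDB (3 nodes) | Cassandra (30 nodes) | Ratio
--------------------------------+-----------------------+----------------------+-------
 Write throughput (metrics/sec) | 956,910               | 695,294              | 138%
 Monthly cost (Azure)           | $3,325                | $33,251              | 10%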
How?
Time-series workloads are different
OLTP
• Primarily UPDATEs
• Writes randomly distributed
• Transactions to multiple primary keys
Time-series
• Primarily INSERTs
• Writes to recent time interval
• Writes primarily associated with a timestamp
How it works
Time-space partitioning (for both scaling up & out)
• Time: intervals are 1) manually specified or 2) automatically adjusted
• Space: hash partitioning
• Chunk (sub-table)
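A rough way to picture time-space partitioning is as a function from a row to a chunk. The sketch below is only an illustration, not TimescaleDB's actual routing code: `chunk_for`, the epoch-aligned one-week interval, and the CRC32 hash are all assumptions made for the example.

```python
import zlib
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def chunk_for(ts, device, interval=timedelta(weeks=1), space_partitions=4):
    """Return a (time_slot, space_slot) pair identifying a row's chunk.

    Assumptions for illustration: time slots are fixed-width intervals
    counted from the Unix epoch, and the space slot is a CRC32 hash of
    the device name modulo the number of space partitions.
    """
    time_slot = (ts - EPOCH) // interval
    space_slot = zlib.crc32(device.encode("utf-8")) % space_partitions
    return time_slot, space_slot


chunks = {}  # chunk key -> rows, standing in for the sub-tables


def insert(ts, device, temp):
    """Route a row to its chunk, creating the chunk on first use."""
    chunks.setdefault(chunk_for(ts, device), []).append((ts, device, temp))


insert(datetime(2017, 10, 3, 9, 23, 54, tzinfo=timezone.utc), "sensor3", 73.4)
insert(datetime(2017, 10, 4, 9, 23, 54, tzinfo=timezone.utc), "sensor3", 73.9)
insert(datetime(2017, 10, 17, 9, 23, 54, tzinfo=timezone.utc), "sensor3", 71.2)
print(len(chunks))
```

Rows from the same device in the same interval share a chunk, while a later interval lazily creates a new one, which mirrors the "create partitions automatically at runtime" idea.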
Automatic Space-time Partitioning
Chunks
But treat it like a single table
Hypertable
• Indexes
• Triggers
• Constraints
• Foreign keys
• UPSERTs
• Table mgmt
TimescaleDB: Easy to Get Started
CREATE TABLE conditions (
time timestamptz,
temp float,
humidity float,
device text
);
SELECT create_hypertable('conditions', 'time', 'device', 4,
chunk_time_interval => interval '1 week');
INSERT INTO conditions
VALUES ('2017-10-03 10:23:54+01', 73.4, 40.7, 'sensor3');
SELECT * FROM conditions;
time | temp | humidity | device
------------------------+------+----------+---------
2017-10-03 11:23:54+02 | 73.4 | 40.7 | sensor3
Create partitions automatically at runtime. Avoid a lot of manual work.
CREATE TABLE conditions (
time timestamptz,
temp float,
humidity float,
device text
) PARTITION BY RANGE (device);
CREATE TABLE conditions_p1 PARTITION OF conditions
FOR VALUES FROM (MINVALUE) TO ('g')
PARTITION BY RANGE (time);
CREATE TABLE conditions_p2 PARTITION OF conditions
FOR VALUES FROM ('g') TO ('n')
PARTITION BY RANGE (time);
CREATE TABLE conditions_p3 PARTITION OF conditions
FOR VALUES FROM ('n') TO ('t')
PARTITION BY RANGE (time);
CREATE TABLE conditions_p4 PARTITION OF conditions
FOR VALUES FROM ('t') TO (MAXVALUE)
PARTITION BY RANGE (time);
-- Create time partitions for the first week in each device partition
CREATE TABLE conditions_p1_y2017m10w01 PARTITION OF conditions_p1
FOR VALUES FROM ('2017-10-01') TO ('2017-10-07');
CREATE TABLE conditions_p2_y2017m10w01 PARTITION OF conditions_p2
FOR VALUES FROM ('2017-10-01') TO ('2017-10-07');
CREATE TABLE conditions_p3_y2017m10w01 PARTITION OF conditions_p3
FOR VALUES FROM ('2017-10-01') TO ('2017-10-07');
CREATE TABLE conditions_p4_y2017m10w01 PARTITION OF conditions_p4
FOR VALUES FROM ('2017-10-01') TO ('2017-10-07');
-- Create time index on each leaf partition
CREATE INDEX ON conditions_p1_y2017m10w01 (time);
CREATE INDEX ON conditions_p2_y2017m10w01 (time);
CREATE INDEX ON conditions_p3_y2017m10w01 (time);
CREATE INDEX ON conditions_p4_y2017m10w01 (time);
INSERT INTO conditions VALUES ('2017-10-03 10:23:54+01',
73.4, 40.7, 'sensor3');
Chunking benefits
Chunks are “right-sized”
Recent (hot) chunks fit in memory
Single node: Scaling up via adding disks
How: Chunks spread across many disks (elastically!), either RAIDed or via distinct tablespaces
Benefit:
• Faster inserts
• Parallelized queries
Multi-node: High availability and scaling read throughput
[Figure labels: Writes, Schema Changes, Reads]
Multi-node: Scaling out across sharded primaries (under development)
• Chunks spread across servers
• Insert/query to any server
• Distributed query optimizations
(push-down LIMITs and aggregates, etc.)
Chunk-aware query optimizations
Avoid querying chunks via constraint exclusion
SELECT time, temp FROM data
WHERE time > now() - interval '7 days'
AND device_id = '12345';

SELECT time, device_id, temp FROM data
WHERE time > '2017-08-22 18:18:00+00';

SELECT time, device_id, temp FROM data
WHERE time > now() - interval '24 hours';
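The idea behind constraint exclusion can be sketched in a few lines: each chunk carries a time range, and a `time > X` predicate lets the planner skip every chunk whose range ends at or before X. This is a simplified model, not the planner's real logic; the `Chunk` type and `chunks_to_scan` function are hypothetical names.

```python
from datetime import datetime, timedelta, timezone
from typing import NamedTuple


class Chunk(NamedTuple):
    name: str
    start: datetime  # inclusive lower bound of the chunk's time range
    end: datetime    # exclusive upper bound of the chunk's time range


def chunks_to_scan(chunks, after):
    """Keep only chunks that could hold rows with time > after."""
    return [c for c in chunks if c.end > after]


first = datetime(2017, 8, 1, tzinfo=timezone.utc)
week = timedelta(weeks=1)
all_chunks = [
    Chunk(f"chunk_{i}", first + i * week, first + (i + 1) * week)
    for i in range(4)
]

# Equivalent of: WHERE time > now() - interval '7 days',
# with "now" fixed to 2017-08-25 so the example is repeatable.
now = datetime(2017, 8, 25, tzinfo=timezone.utc)
selected = chunks_to_scan(all_chunks, now - timedelta(days=7))
print([c.name for c in selected])
```

Only the two most recent chunks survive the filter; the older ones are never opened.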
Additional time-based query optimizations
PG doesn't know to use the index
CREATE INDEX ON readings(time);
SELECT date_trunc('minute', time) AS bucket,
avg(cpu)
FROM readings
GROUP BY bucket
ORDER BY bucket DESC
LIMIT 10;
Timescale understands time
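One way to see why an index on `time` can serve this query: truncating a timestamp to the minute never changes the relative order of rows, so rows read newest-first already arrive grouped by bucket, and the scan can stop after ten buckets. The sketch below illustrates that property only; it is not how either planner is implemented, and `minute_bucket` and `latest_buckets` are hypothetical helpers.

```python
from datetime import datetime, timedelta
from itertools import groupby


def minute_bucket(ts):
    """Truncate a timestamp to the minute, like date_trunc('minute', ts)."""
    return ts.replace(second=0, microsecond=0)


def latest_buckets(rows_newest_first, limit):
    """Average cpu per minute bucket, reading rows in time DESC order.

    Because truncation preserves ordering, rows of one bucket are
    contiguous and the loop can stop once `limit` buckets are complete.
    """
    result = []
    buckets = groupby(rows_newest_first, key=lambda r: minute_bucket(r[0]))
    for bucket, group in buckets:
        values = [cpu for _, cpu in group]
        result.append((bucket, sum(values) / len(values)))
        if len(result) == limit:
            break
    return result


start = datetime(2017, 10, 3, 10, 0, 0)
rows = [(start + timedelta(seconds=20 * i), i % 7) for i in range(180)]
rows_newest_first = sorted(rows, key=lambda r: r[0], reverse=True)
top = latest_buckets(rows_newest_first, limit=10)
print(top[0])
```

The result matches a full GROUP BY followed by ORDER BY and LIMIT, but only the newest rows are read.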
Global queries but local indexes
• Constraint exclusion selects chunks globally
• Local indexes speed up queries on chunks
– B-tree, Hash, GiST, SP-GiST, GIN and BRIN
– Secondary and composite columns, UNIQUE* constraints
Optimized for many chunks
• Faster chunk exclusion
– Avoid opening / gathering stats on all chunks during constraint exclusion:
Decreased planning time on 4000 chunks from 600ms to 36ms
• Better LIMITs across chunks
– Avoid requiring one+ tuple per chunk during MergeAppend / LIMIT
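The LIMIT point can be illustrated with a small model. A generic merge of sorted inputs has to pull at least one row from every input before it can emit anything; when chunks cover non-overlapping time ranges, they can instead be visited newest-first and the scan can stop as soon as the limit is reached. The code below is a sketch under that assumption, not TimescaleDB's executor; `newest_rows` is a hypothetical name.

```python
def newest_rows(chunks, limit):
    """Return the `limit` newest rows and the number of chunks opened.

    Assumptions: `chunks` is a list of (range_start, rows) pairs whose
    time ranges do not overlap, and each chunk's rows are already
    sorted newest-first (as a per-chunk index on time would provide).
    """
    result = []
    opened = 0
    for _, rows in sorted(chunks, key=lambda c: c[0], reverse=True):
        opened += 1
        for row in rows:
            result.append(row)
            if len(result) == limit:
                return result, opened
    return result, opened


# Ten chunks of 100 integer "timestamps" each; chunk k covers [100k, 100k + 100).
chunks = [(k * 100, list(range(k * 100 + 99, k * 100 - 1, -1))) for k in range(10)]
rows, opened = newest_rows(chunks, limit=5)
print(rows, opened)
```

Here a `LIMIT 5` is satisfied from the newest chunk alone, so the other nine chunks are never touched.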
"We've been using TimescaleDB for over a year to store all kinds of sensor and telemetry data as part of our Power Management database. We've scaled to 500 billion rows and the performance we're seeing is monstrous, almost 70% faster queries."
- Sean Wallace, Software Engineer
• 500B rows
• 400K rows / sec
• 50K chunks
• 5min intervals
Efficient retention policies
SELECT time, device_id, temp FROM data
WHERE time > now() - interval '24 hours';
Drop chunks, don't delete rows: avoids vacuuming
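The difference between dropping chunks and deleting rows can be sketched with a dictionary of chunks keyed by time range. This is a toy model with made-up names (`drop_old_chunks` is hypothetical and unrelated to any real API signature); it only shows that retention works on whole chunks rather than on individual rows.

```python
def drop_old_chunks(chunks, older_than):
    """Remove every chunk whose time range ends at or before `older_than`.

    `chunks` maps (start, end) ranges to lists of rows. Whole chunks are
    discarded in one step, so no per-row delete work or later cleanup
    of dead rows is needed in this model.
    """
    expired = [rng for rng in chunks if rng[1] <= older_than]
    for rng in expired:
        del chunks[rng]
    return len(expired)


# Three one-week chunks, using day numbers as timestamps.
chunks = {
    (0, 7): ["row"] * 1000,
    (7, 14): ["row"] * 1000,
    (14, 21): ["row"] * 1000,
}
dropped = drop_old_chunks(chunks, older_than=14)
print(dropped, sorted(chunks))
```

Two chunks disappear as units and the newest chunk is left untouched.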
Is it just about performance?
Simplify your stack
[Figure: Application on RDBMS + NoSQL vs. Application on TimescaleDB (with JOINs)]
Rich Time Analytics
Geospatial Temporal Analysis (with PostGIS)
Data Retention + Aggregations
 Granularity | raw    | 15 min  | day
 Retention   | 1 week | 1 month | forever
Unlock the richness of your monitoring data
[Figure: Prometheus, Remote Storage Adapter + pg_prometheus, TimescaleDB + PostgreSQL, Grafana]
pg_prometheus
Prometheus Data Model in TimescaleDB / PostgreSQL
CREATE TABLE metrics (sample prom_sample);
INSERT INTO metrics
VALUES ('cpu_usage{service="nginx",host="machine1"} 34.6 1494595898000');
• Scrape metrics with curl:
curl http://myservice/metrics | grep -v "^#" | psql -c "COPY metrics FROM STDIN"
• New data type prom_sample: <time, name, value, labels>
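To make the `<time, name, value, labels>` decomposition concrete, here is a minimal parser for a single sample line in the text format shown above. It is a sketch, not pg_prometheus's parser: `parse_sample` is a hypothetical helper, the regular expressions cover only the simple cases, and escape sequences inside label values are left as-is.

```python
import re
from datetime import datetime, timezone

# name, optional {labels}, value, optional millisecond timestamp
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)'
    r'(?:\{(?P<labels>.*)\})?'
    r'\s+(?P<value>\S+)'
    r'(?:\s+(?P<ts>-?\d+))?$'
)
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_sample(line):
    """Split one sample line into time, name, value, and labels."""
    match = SAMPLE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a sample line: {line!r}")
    labels = dict(LABEL_RE.findall(match.group("labels") or ""))
    ts = match.group("ts")
    time = None
    if ts is not None:
        # The timestamp is in milliseconds since the Unix epoch.
        time = datetime.fromtimestamp(int(ts) / 1000, tz=timezone.utc)
    return {
        "time": time,
        "name": match.group("name"),
        "value": float(match.group("value")),
        "labels": labels,
    }


sample = parse_sample('cpu_usage{service="nginx",host="machine1"} 34.6 1494595898000')
print(sample["name"], sample["value"], sample["labels"])
```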
Automate normalized storage
SELECT create_prometheus_table('metrics');
 Time     | Name | Value | Label Id
----------+------+-------+----------
 01:02:00 | CPU  |    90 |        1
 01:03:00 | Mem  |  1024 |        1
 01:04:00 | CPU  |    70 |        2
 01:04:00 | Mem  |   900 |        2
 01:04:00 | IO   |    70 |        5

 Id | Label
----+-------------------
  1 | {host: "h001"}
  2 | {host: "h002"}
  3 | {host: "1984"}
  4 | {host: "super"}
  5 | {host: "marshal"}
Labels stored in separate host metadata table
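Normalized storage of this kind can be sketched as two in-memory tables: one row per sample, and one row per distinct label set. The class below is an illustration only and does not reflect the schema that `create_prometheus_table` actually generates; `NormalizedStore` and its fields are hypothetical.

```python
import json


class NormalizedStore:
    """Keep samples and label sets in separate tables, linked by id."""

    def __init__(self):
        self.label_ids = {}  # canonical JSON of a label set -> id
        self.labels = {}     # id -> label dict (the metadata table)
        self.values = []     # (time, name, value, label_id) rows

    def insert(self, time, name, value, labels):
        """Store a sample, reusing the id of an already-seen label set."""
        key = json.dumps(labels, sort_keys=True)
        label_id = self.label_ids.get(key)
        if label_id is None:
            label_id = len(self.label_ids) + 1
            self.label_ids[key] = label_id
            self.labels[label_id] = dict(labels)
        self.values.append((time, name, value, label_id))
        return label_id


store = NormalizedStore()
store.insert("01:02:00", "CPU", 90, {"host": "h001"})
store.insert("01:03:00", "Mem", 1024, {"host": "h001"})
store.insert("01:04:00", "CPU", 70, {"host": "h002"})
print(store.values)
print(store.labels)
```

Repeated label sets are stored once, so the samples table only carries a small integer per row.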
Easily query auto-created view
SELECT sample
FROM metrics
WHERE time > NOW() - interval '10 min' AND
name = 'cpu_usage' AND
labels @> '{"service": "nginx"}';
Columns: | sample | time | name | value | labels |
What’s Next?
2PC
Multi-node: Scaling out across sharded primaries (under development)
Writes Reads
Query planning +
constraint exclusion
Continuous aggregations and hierarchical views (under development)
Granularity: raw, minute, hour
Tiered data storage and automated archiving (under development)
SAN
Time

(older)
archive_chunks('3 months')
move_chunks('1 week', ssd, hdd)
• Scale: Full clustering
• Performance + ease-of-use: Continuous data aggregations and intelligent hierarchical views
• Performance: Lazy chunk management (index creation, reindex, CLUSTER)
• Ease-of-use: Analytical features (gap filling, LOCF, fuzzy joins, etc.)
• Total Cost-of-Ownership: Tiered data storage, automated data archiving
Re-Engineering PostgreSQL as a Time-Series Database

My INSURER PTE LTD - Insurtech Innovation Award 2024The Digital Insurer
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesZilliz
 

Dernier (20)

New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector Databases
 

Re-Engineering PostgreSQL as a Time-Series Database

  • 1. TimescaleDB: Re-engineering PostgreSQL as a time-series database David Kohn R & D Engineer, Timescale david@timescale.com · github.com/timescale · Apache 2 License
  • 2. Open Source (Apache 2.0) • github.com/timescale/timescaledb Join the Community • slack.timescale.com
  • 3. Industrial Machines AI & ML Inferences Energy & Utilities Time-series Data is Everywhere Web/mobile Events Transportation & Logistics Financial Datacenter & DevOps
  • 4. Of every type • Regular:  Machines and sensors • Irregular:  Web and machine events • Forward looking:  Logistics and forecasting • Derived data:  Inferences from AI/ML models
  • 5. Time-series data is recording the change of your world
  • 6. Time-series data is recording every datapoint as a new entry
  • 7. Existing databases don’t work for time series. Relational databases: hard to scale. NoSQL databases (every other time-series database today is NoSQL): underperform on complex queries, are hard to use, and lead to data silos.
  • 8.
  • 9. 1 million+ downloads in <18 months
  • 10.
  • 11. Empower Organizations to Analyze the Past, Understand the Present, and Predict the Future
  • 12. Postgres 9.6.2 on Azure standard DS4 v2 (8 cores), SSD (premium LRS storage) Each row has 12 columns (1 timestamp, indexed 1 host ID, 10 metrics) Hard to scale
  • 13. Postgres 9.6.2 on Azure standard DS4 v2 (8 cores), SSD (premium LRS storage) Each row has 12 columns (1 timestamp, indexed 1 host ID, 10 metrics) Hard to scale
  • 14-17. B-tree Insert Pain [diagram sequence: a batch of keys is inserted into a B-tree; labels: Insert batch, Memory capacity: 2 nodes, In memory, Write to disk]
  • 18. Challenge in scaling up • Indexes write to random parts of B-tree • As table grows large: – Indexes no longer fit in memory – Random writes cause swapping [diagram: index on (Device, Time DESC), with entries from Device: A, Time: 01:01:01 to Device: Z, Time: 01:01:01]
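A rough way to see the point of slide 18 is to count how many index leaf pages one batch of inserts lands on. The sketch below is only an illustration, not PostgreSQL's B-tree: it models an index as a sorted key list cut into fixed-size pages, and `PAGE_SIZE`, the device count, and the helper `pages_touched` are all assumptions made up for the example.

```python
import bisect

PAGE_SIZE = 100  # keys per leaf page (illustrative assumption)


def pages_touched(existing_keys, new_keys):
    """Count distinct leaf pages that a batch of new keys would land on.

    The index is modeled as one sorted list split into PAGE_SIZE-key pages;
    page splits and rebalancing are ignored.
    """
    existing = sorted(existing_keys)
    return len({bisect.bisect_left(existing, k) // PAGE_SIZE for k in new_keys})


devices = range(1000)
times = range(100)

# Same 100,000 rows, indexed two ways.
index_device_time = [(d, t) for d in devices for t in times]  # (device, time)
index_time_device = [(t, d) for t in times for d in devices]  # (time, device)

# New batch: every device reports once at time 100.
scattered = pages_touched(index_device_time, [(d, 100) for d in devices])
appended = pages_touched(index_time_device, [(100, d) for d in devices])

print(scattered, appended)
```

With a device-first key, each new reading falls next to that device's older readings, so the batch is spread across the whole index. With a time-first key, the whole batch lands at the right-hand edge.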
  • 19. Is there a better way?
  • 20. TimescaleDB: scalable time-series database, full SQL, packaged as a PostgreSQL extension
 Scale & Performance • Ingest millions of datapoints per second • Scale to 100s of billions of rows • Elastically scale up and out • Faster than Influx, Cassandra, Mongo, vanilla Postgres
 Proven & Enterprise Ready • Inherits 20+ years of PostgreSQL reliability • Streaming replication, HA, backup/recovery • Data lifecycle: continuous rollups, retention, archiving • Enterprise-grade security
 SQL for time series • Zero learning curve • Zero friction: existing tools and connectors work • Enrich understanding: JOIN against relational data • Freedom for data model, no cardinality issues
  • 21. >20x TimescaleDB vs. PostgreSQL (batch inserts) TimescaleDB 0.5, Postgres 9.6.2 on Azure standard DS4 v2 (8 cores), SSD (LRS storage) Each row has 12 columns (1 timestamp, indexed 1 host ID, 10 metrics) 1.11M METRICS / S
  • 22. TimescaleDB vs. PostgreSQL, speedup by query type: table scans, simple column rollups ~0-20%; GROUPBYs 20-200%; time-ordered GROUPBYs 400-10000x; DELETEs 2000x. TimescaleDB 0.5, Postgres 9.6.2 on Azure standard DS4 v2 (8 cores), SSD (LRS storage) Each row has 12 columns (1 timestamp, indexed 1 host ID, 10 metrics)
  • 23. Enjoy the entire PostgreSQL ecosystem
  • 24. NoSQL champion: Log-Structured Merge Trees • Key-value store with indexed key lookup at high write rates • Compressed data storage • Common approach for time series: use key <name, tags, field, time>
  • 25. NoSQL + LSMTs Come at a Cost • Significant memory overhead • Lack of secondary indexes / tag lock-in • Less powerful queries • Weaker consistency (no ACID) • No JOINs • Loss of SQL ecosystem
  • 26. TimescaleDB vs. MongoDB: 20% higher inserts. Query speedup: table scans, column rollups ~0%; GROUPBYs 4-6x; time-ordered GROUPBYs 1450x; lastpoint 101x. TimescaleDB 0.9.2, MongoDB 3.6, Azure standard D8s v3 (8 vCPU), 4 1-TB disks in raid0
  • 27. TimescaleDB vs. Cassandra: 10x higher inserts. Query speedup: table scans, column rollups 2-44x; GROUPBYs 1-3x; time-ordered GROUPBYs 1900x; lastpoint 1400x. TimescaleDB 0.5, Cassandra 3.11.0, Azure standard DS4 v2 (8 cores), SSD (LRS storage) Each TimescaleDB row has 12 columns (1 timestamp, indexed 1 host ID, 10 metrics) Each Cassandra row has 2 columns (1 key, combo of tags + host + timestamp)
  • 28. TimescaleDB (3 nodes) vs. Cassandra (30 nodes)
                                        TimescaleDB   Cassandra   Ratio
     Write throughput (metrics / sec)       956,910     695,294    138%
     Monthly cost (Azure)                    $3,325     $33,251     10%
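The two ratios on slide 28 can be rechecked from the four figures it reports. The sketch below uses only those numbers. The throughput-per-dollar figure at the end is derived here and does not appear in the deck.

```python
# Figures as reported on slide 28.
timescale_throughput = 956_910   # metrics / sec, 3 nodes
cassandra_throughput = 695_294   # metrics / sec, 30 nodes
timescale_cost = 3_325           # USD / month (Azure)
cassandra_cost = 33_251          # USD / month (Azure)

throughput_ratio = timescale_throughput / cassandra_throughput
cost_ratio = timescale_cost / cassandra_cost

# Derived here, not stated in the deck: throughput bought per dollar, relative.
throughput_per_dollar_ratio = throughput_ratio / cost_ratio

print(f"throughput ratio: {throughput_ratio:.0%}")
print(f"cost ratio: {cost_ratio:.0%}")
print(f"throughput per dollar, relative: {throughput_per_dollar_ratio:.1f}x")
```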
  • 29. How?
  • 31. OLTP: • Primarily UPDATEs • Writes randomly distributed • Transactions to multiple primary keys
 Time-series: • Primarily INSERTs • Writes to recent time interval • Writes primarily associated with a timestamp
  • 34. Time-space partitioning (for both scaling up & out) [diagram: Time axis (older); Intervals: 1) manually specified, 2) automatically adjusted]
  • 35. Time-space partitioning (for both scaling up & out) [diagram: Time axis (older), Space axis (hash partitioning); Intervals: 1) manually specified, 2) automatically adjusted]
  • 36. Time-space partitioning (for both scaling up & out) [diagram: Time axis (older), Space axis (hash partitioning), one cell labeled Chunk (sub-table); Intervals: 1) manually specified, 2) automatically adjusted]
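The time-space partitioning idea can be sketched as a function that maps a row to its chunk: a fixed time interval picks the time slot, and a hash of the space key picks the space partition. This is only an illustration under stated assumptions. The helper `chunk_key`, the epoch alignment, and the use of CRC32 are made up for the example and are not TimescaleDB's actual routing or hash function.

```python
import zlib
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def chunk_key(ts, device, interval=timedelta(weeks=1), space_partitions=4):
    """Return (chunk_start, space_partition) for one row.

    Time slots are aligned to the Unix epoch (an assumption of this sketch).
    CRC32 stands in for the real hash; it is used because it is stable
    across runs, unlike Python's built-in hash() for strings.
    """
    slot = (ts - EPOCH) // interval          # timedelta // timedelta -> int
    chunk_start = EPOCH + slot * interval
    partition = zlib.crc32(device.encode("utf-8")) % space_partitions
    return chunk_start, partition


a = chunk_key(datetime(2017, 10, 3, 9, 23, 54, tzinfo=timezone.utc), "sensor3")
b = chunk_key(datetime(2017, 10, 4, 0, 0, 0, tzinfo=timezone.utc), "sensor3")
c = chunk_key(datetime(2017, 10, 11, 0, 0, 0, tzinfo=timezone.utc), "sensor3")

print(a, b, c)
```

Rows from the same device in the same week share a chunk (`a == b`). A row one week later goes to a new time slot but stays in the same space partition.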
  • 39. Hypertable (made up of chunks): But treat it like a single table • Indexes • Triggers • Constraints • Foreign keys • UPSERTs • Table mgmt
  • 40. TimescaleDB: Easy to Get Started
     CREATE TABLE conditions (time timestamptz, temp float, humidity float, device text);
     SELECT create_hypertable('conditions', 'time', 'device', 4, chunk_time_interval => interval '1 week');
     INSERT INTO conditions VALUES ('2017-10-03 10:23:54+01', 73.4, 40.7, 'sensor3');
     SELECT * FROM conditions;
               time          | temp | humidity | device
     ------------------------+------+----------+---------
      2017-10-03 11:23:54+02 | 73.4 |     40.7 | sensor3
  • 41. Create partitions automatically at runtime. Avoid a lot of manual work. Without TimescaleDB (plain PostgreSQL declarative partitioning):
     CREATE TABLE conditions (time timestamptz, temp float, humidity float, device text) PARTITION BY RANGE (device);
     CREATE TABLE conditions_p1 PARTITION OF conditions FOR VALUES FROM (MINVALUE) TO ('g') PARTITION BY RANGE (time);
     CREATE TABLE conditions_p2 PARTITION OF conditions FOR VALUES FROM ('g') TO ('n') PARTITION BY RANGE (time);
     CREATE TABLE conditions_p3 PARTITION OF conditions FOR VALUES FROM ('n') TO ('t') PARTITION BY RANGE (time);
     CREATE TABLE conditions_p4 PARTITION OF conditions FOR VALUES FROM ('t') TO (MAXVALUE) PARTITION BY RANGE (time);
     -- Create time partitions for the first week in each device partition
     CREATE TABLE conditions_p1_y2017m10w01 PARTITION OF conditions_p1 FOR VALUES FROM ('2017-10-01') TO ('2017-10-07');
     CREATE TABLE conditions_p2_y2017m10w01 PARTITION OF conditions_p2 FOR VALUES FROM ('2017-10-01') TO ('2017-10-07');
     CREATE TABLE conditions_p3_y2017m10w01 PARTITION OF conditions_p3 FOR VALUES FROM ('2017-10-01') TO ('2017-10-07');
     CREATE TABLE conditions_p4_y2017m10w01 PARTITION OF conditions_p4 FOR VALUES FROM ('2017-10-01') TO ('2017-10-07');
     -- Create time index on each leaf partition
     CREATE INDEX ON conditions_p1_y2017m10w01 (time);
     CREATE INDEX ON conditions_p2_y2017m10w01 (time);
     CREATE INDEX ON conditions_p3_y2017m10w01 (time);
     CREATE INDEX ON conditions_p4_y2017m10w01 (time);
     INSERT INTO conditions VALUES ('2017-10-03 10:23:54+01', 73.4, 40.7, 'sensor3');
  • 43. Chunks are “right-sized” Recent (hot) chunks fit in memory
  • 44. Single node: Scaling up via adding disks. How: chunks spread across many disks (elastically!), either RAIDed or via distinct tablespaces. Benefit: • Faster inserts • Parallelized queries
  • 46. Multi-node: Scaling out across sharded primaries (under development) • Chunks spread across servers • Insert/query to any server • Distributed query optimizations (push-down LIMITs and aggregates, etc.)
  • 48. Avoid querying chunks via constraint exclusion
     SELECT time, temp FROM data
     WHERE time > now() - interval '7 days'
     AND device_id = '12345'
  • 49. Avoid querying chunks via constraint exclusion
     SELECT time, device_id, temp FROM data
     WHERE time > '2017-08-22 18:18:00+00'
  • 50. Avoid querying chunks via constraint exclusion
     SELECT time, device_id, temp FROM data
     WHERE time > now() - interval '24 hours'
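Chunk exclusion comes down to comparing the query's time predicate with each chunk's time range before touching any rows. The sketch below illustrates that idea for a `time > cutoff` predicate. The `Chunk` type, the weekly ranges, and the helper `chunks_to_scan` are hypothetical and not the planner's actual code.

```python
from datetime import datetime, timedelta, timezone
from typing import List, NamedTuple


class Chunk(NamedTuple):
    name: str
    start: datetime  # inclusive
    end: datetime    # exclusive


def chunks_to_scan(chunks: List[Chunk], newer_than: datetime) -> List[Chunk]:
    """Keep only chunks that may hold rows with time > newer_than.

    A chunk whose exclusive upper bound is at or before the cutoff cannot
    contain a matching row, so it is skipped without being opened.
    """
    return [c for c in chunks if c.end > newer_than]


# Eight hypothetical one-week chunks starting 2017-07-01.
first = datetime(2017, 7, 1, tzinfo=timezone.utc)
week = timedelta(weeks=1)
chunks = [Chunk(f"chunk_{i}", first + i * week, first + (i + 1) * week)
          for i in range(8)]

# Cutoff taken from the literal timestamp in the slide 49 query.
cutoff = datetime(2017, 8, 22, 18, 18, tzinfo=timezone.utc)
selected = [c.name for c in chunks_to_scan(chunks, cutoff)]
print(selected)
```

Only the last chunk (2017-08-19 to 2017-08-26) overlaps the predicate, so the other seven are never read.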
  • 51. Additional time-based query optimizations: PG doesn’t know to use the index; Timescale understands time
     CREATE INDEX ON readings(time);
     SELECT date_trunc('minute', time) AS bucket, avg(cpu) FROM readings GROUP BY bucket ORDER BY bucket DESC LIMIT 10;
  • 52. Global queries but local indexes • Constraint exclusion selects chunks globally • Local indexes speed up queries on chunks – B-tree, Hash, GiST, SP-GiST, GIN and BRIN – Secondary and composite columns, UNIQUE* constraints
  • 53. Optimized for many chunks • Faster chunk exclusion – Avoid opening / gathering stats on all chunks during constraint exclusion: decreased planning time on 4000 chunks from 600ms to 36ms • Better LIMITs across chunks – Avoid requiring one+ tuple per chunk during MergeAppend / LIMIT
  • 54. “We've been using TimescaleDB for over a year to store all kinds of sensor and telemetry data as part of our Power Management database. We've scaled to 500 billion rows and the performance we're seeing is monstrous, almost 70% faster queries.” - Sean Wallace, Software Engineer. 500B rows, 400K rows / sec, 50K chunks, 5 min intervals
  • 55. Efficient retention policies: drop chunks, don’t delete rows; this avoids vacuuming
     SELECT time, device_id, temp FROM data
     WHERE time > now() - interval '24 hours'
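Retention by dropping chunks amounts to discarding whole sub-tables whose time range is entirely older than the cutoff, rather than deleting rows one by one. The sketch below illustrates this with a plain dictionary standing in for the chunk catalog. The helper `drop_old_chunks` and the hour-based ranges are made up for the example and are not TimescaleDB's API.

```python
def drop_old_chunks(chunks, older_than):
    """Split chunks into (kept, dropped) using each chunk's exclusive end.

    chunks maps (start, end) time ranges to their rows. A chunk is dropped
    as a unit only if every row in it is older than the cutoff, i.e. its
    end bound is at or before the cutoff. No individual row is visited.
    """
    kept = {}
    dropped = []
    for (start, end), rows in chunks.items():
        if end <= older_than:
            dropped.append((start, end))
        else:
            kept[(start, end)] = rows
    return kept, dropped


# Three hypothetical one-day chunks, keyed by (start_hour, end_hour).
chunks = {
    (0, 24): ["r1", "r2"],
    (24, 48): ["r3"],
    (48, 72): ["r4", "r5"],
}
now = 72
kept, dropped = drop_old_chunks(chunks, older_than=now - 24)
print(kept, dropped)
```

Dropping a chunk removes a whole sub-table at once and leaves no dead rows behind, which is why the slide notes that it avoids vacuuming.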
  • 56. Is it just about performance?
  • 57. Simplify your stack: Application + RDBMS + NoSQL vs. Application + TimescaleDB (with JOINs)
  • 60. Data Retention + Aggregations
     Granularity:  raw      15 min    day
     Retention:    1 week   1 month   forever
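Slide 60 pairs coarser granularities with longer retention. A minimal sketch of the rollup step is shown below, averaging raw readings into 15-minute buckets. The helper `rollup`, the epoch-aligned buckets, and the sample readings are assumptions made up for the example.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def rollup(rows, bucket=timedelta(minutes=15)):
    """Average (timestamp, value) rows per fixed-width time bucket.

    Buckets are aligned to the Unix epoch (an assumption of this sketch).
    Returns a dict mapping bucket start to mean value, in time order.
    """
    acc = defaultdict(lambda: [0.0, 0])  # bucket start -> [sum, count]
    for ts, value in rows:
        start = EPOCH + ((ts - EPOCH) // bucket) * bucket
        acc[start][0] += value
        acc[start][1] += 1
    return {start: total / n for start, (total, n) in sorted(acc.items())}


def at(hour, minute):
    return datetime(2017, 10, 3, hour, minute, tzinfo=timezone.utc)


raw = [(at(10, 0), 70.0), (at(10, 10), 74.0), (at(10, 20), 80.0)]
fifteen_min = rollup(raw)
print(fifteen_min)
```

Once the 15-minute averages are stored, the raw rows can be aged out on the shorter schedule while the rollup is kept longer.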
  • 61. Unlock the richness of your monitoring data TimescaleDB + PostgreSQL Prometheus Remote Storage Adapter + pg_prometheus Prometheus Grafana
  • 62. pg_prometheus: Prometheus Data Model in TimescaleDB / PostgreSQL
     • New data type prom_sample: <time, name, value, labels>
     CREATE TABLE metrics (sample prom_sample);
     INSERT INTO metrics VALUES ('cpu_usage{service="nginx",host="machine1"} 34.6 1494595898000');
     • Scrape metrics with curl:
     curl http://myservice/metrics | grep -v "^#" | psql -c "COPY metrics FROM STDIN"
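To make the <time, name, value, labels> decomposition concrete, the sketch below splits the sample line from slide 62 into those four parts. It is a simplified illustration, not pg_prometheus's parser. The regular expressions and the helper `parse_sample` are assumptions, and the sketch requires the trailing millisecond timestamp to be present.

```python
import re
from datetime import datetime, timezone

# Simplified patterns (assumptions of this sketch): metric name, optional
# {label="value",...} block, numeric value, millisecond timestamp.
SAMPLE_RE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)'
    r'(?:\{(?P<labels>.*)\})?'
    r'\s+(?P<value>\S+)\s+(?P<ts>\d+)$'
)
LABEL_RE = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_sample(line):
    """Split one sample line into (time, name, value, labels)."""
    m = SAMPLE_RE.match(line.strip())
    if m is None:
        raise ValueError(f"not a sample line: {line!r}")
    labels = dict(LABEL_RE.findall(m.group("labels") or ""))
    ts = datetime.fromtimestamp(int(m.group("ts")) / 1000, tz=timezone.utc)
    return ts, m.group("name"), float(m.group("value")), labels


ts, name, value, labels = parse_sample(
    'cpu_usage{service="nginx",host="machine1"} 34.6 1494595898000'
)
print(ts, name, value, labels)
```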
  • 63. Automate normalized storage: labels stored in separate host metadata table
     SELECT create_prometheus_table('metrics');
     Values table:
       Time      Value  Label Id  Name
       01:02:00     90         1  CPU
       01:03:00   1024         1  Mem
       01:04:00     70         2  CPU
       01:04:00    900         2  Mem
       01:04:00     70         5  IO
     Labels table:
       Id  Label
        1  {host: "h001"}
        2  {host: "h002"}
        3  {host: "1984"}
        4  {host: "super"}
        5  {host: "marshal"}
  • 64. Easily query auto-created view. Columns: | sample | time | name | value | labels |
     SELECT sample FROM metrics WHERE time > NOW() - interval '10 min' AND name = 'cpu_usage' AND labels @> '{"service": "nginx"}';
  • 65. +
  • 66. +
  • 68. Multi-node: Scaling out across sharded primaries (under development) [diagram labels: Writes, Reads, 2PC, Query planning + constraint exclusion]
  • 69. Continuous aggregations and hierarchical views (under development). Granularity: raw, minute, hour
  • 70. Continuous aggregations and hierarchical views (under development). Granularity: raw, minute, hour
  • 71. Tiered data storage and automated archiving (under development) [diagram labels: Time axis (older), SAN, move_chunks('1 week', ssd, hdd), archive_chunks('3 months')]
  • 72. • Scale: Full clustering • Performance + ease-of-use: Continuous data aggregations and intelligent hierarchical views • Performance: Lazy chunk management (index creation, reindex, CLUSTER) • Ease-of-use: Analytical features (gap filling, LOCF, fuzzy joins, etc.) • Total Cost-of-Ownership: Tiered data storage, automated data archiving
  • 73. Open Source (Apache 2.0) • github.com/timescale/timescaledb Join the Community • slack.timescale.com