OpenTSDB 2.x: Distributed, Scalable Time Series Database

Benoit Sigoure tsunanet@gmail.com
Chris Larsen clarsen@yahoo-inc.com

Who We Are
Benoit Sigoure
● Created OpenTSDB at StumbleUpon
● Software Engineer @ Arista Networks
Chris Larsen
● Release manager for OpenTSDB 2.x
● Software Engineer @ Yahoo

What Is OpenTSDB?
● Open Source Time Series Database
● Store trillions of data points
● Sucks up all data and keeps going
● Never lose precision
● Scales using HBase

What good is it?
● Systems Monitoring & Measurement
o Servers
o Networks
● Sensor Data
o The Internet of Things
o SCADA
● Financial Data
● Scientific Experiment Results

Use Cases
● > 100 Region Servers ~ 30TB
● 60 TSDs
● 600,000 writes per second
● Primary data store for operational data
● Telemetry from StatsD, Ostrich and derived data from Storm

Use Cases
● Monitoring application performance and statistics
● 50 region servers, 2.4M writes/s ~ 200TB
● Multi-tenant and Kerberos secure HBase
● ~200k writes per second per TSD
● Central monitoring for all Yahoo properties
● Over a billion time series served

Some Other Users
● Box: 23 servers, 90K writes per second (wps): system, app network, business metrics
● Limelight Networks: 8 servers, 30K wps, 24TB of data
● Ticketmaster: 13 servers, 90K wps, ~40GB a day

What Are Time Series?
● Time Series: data points for an identity over time
● Typical Identity:
o Dotted string: web01.sys.cpu.user.0
● OpenTSDB Identity:
o Metric: sys.cpu.user
o Tags (name/value pairs): host=web01 cpu=0

What Are Time Series?
Data Point:
● Metric + Tags
● + Value: 42
● + Timestamp: 1234567890
sys.cpu.user 1234567890 42 host=web01 cpu=0
^ a data point ^
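
As a rough illustration of how the pieces of a data point fit together, the sketch below splits the line above into metric, timestamp, value and tags. The `DataPoint` class and `parse_data_point` function are hypothetical names made up for this example; they are not part of OpenTSDB.

```python
from dataclasses import dataclass


@dataclass
class DataPoint:
    """Hypothetical container for one data point (not an OpenTSDB class)."""
    metric: str
    timestamp: int
    value: float
    tags: dict


def parse_data_point(line: str) -> DataPoint:
    """Split 'metric timestamp value tagk=tagv ...' into its parts."""
    metric, timestamp, raw_value, *tag_parts = line.split()
    # Keep integers as integers so no precision is lost; fall back to float.
    value = int(raw_value) if raw_value.lstrip("-").isdigit() else float(raw_value)
    tags = dict(part.split("=", 1) for part in tag_parts)
    return DataPoint(metric, int(timestamp), value, tags)


point = parse_data_point("sys.cpu.user 1234567890 42 host=web01 cpu=0")
print(point)
```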

Writing Data
1) Open a Telnet-style socket and write:
put sys.cpu.user 1234567890 42 host=web01 cpu=0
2) ...or POST JSON to:
http://<host>:<port>/api/put
3) ...or import big files with the CLI
● No schema definition
● No RRD file creation
● Just write!
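
The first two write paths can be sketched as below. This only builds the payloads and sends nothing over the network; the helper names `telnet_put_line` and `http_put_body` are made up for this example, and the JSON field names (`metric`, `timestamp`, `value`, `tags`) are the ones the `/api/put` endpoint is expected to accept.

```python
import json


def telnet_put_line(metric, timestamp, value, tags):
    """Build one 'put' line for the Telnet-style socket."""
    tag_text = " ".join(f"{k}={v}" for k, v in tags.items())
    return f"put {metric} {timestamp} {value} {tag_text}\n"


def http_put_body(metric, timestamp, value, tags):
    """Build the JSON body for a POST to http://<host>:<port>/api/put."""
    return json.dumps(
        {"metric": metric, "timestamp": timestamp, "value": value, "tags": tags}
    )


tags = {"host": "web01", "cpu": "0"}
line = telnet_put_line("sys.cpu.user", 1234567890, 42, tags)
body = http_put_body("sys.cpu.user", 1234567890, 42, tags)
print(line, end="")
print(body)
```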

Querying Data
● Graph with the GUI
● CLI tools
● HTTP API
● Aggregate multiple series
● Simple query language
To average all CPUs on host:
start=1h-ago
avg sys.cpu.user
host=web01
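
Over the HTTP API the same query could look roughly like the sketch below, which only builds the URL. The `query_url` helper is hypothetical, and the base URL `http://localhost:4242` is an assumption for illustration; substitute your own TSD host and port.

```python
from urllib.parse import urlencode


def query_url(base_url, start, aggregator, metric, tags):
    """Build an /api/query URL of the form m=<aggregator>:<metric>{tagk=tagv}."""
    tag_filter = ",".join(f"{k}={v}" for k, v in sorted(tags.items()))
    m = f"{aggregator}:{metric}"
    if tag_filter:
        m += "{" + tag_filter + "}"
    return f"{base_url}/api/query?" + urlencode({"start": start, "m": m})


url = query_url("http://localhost:4242", "1h-ago", "avg", "sys.cpu.user",
                {"host": "web01"})
print(url)
```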

HBase Data Tables
● tsdb - Data point table. Massive
● tsdb-uid - Name to UID and UID to name mappings
● tsdb-meta - Time series index and meta-data
● tsdb-tree - Config and index for hierarchical naming schema

Data Table Schema
● Row key is a concatenation of UIDs and time:
o metric + timestamp + tagk1 + tagv1… + tagkN + tagvN
● sys.cpu.user 1234567890 42 host=web01 cpu=0
x00x00x01x49x95xFBx70x00x00x01x00x00x01x00x00x02x00x00x02
● Timestamp normalized on 1 hour boundaries
● All data points for an hour are stored in one row
● Enables fast scans of all time series for a metric
● …or pass a row key filter for specific time series with particular tags
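
The row key in the example above can be reproduced with a short sketch. It assumes 3-byte UIDs and that the UIDs were assigned as metric `sys.cpu.user` = 1, `host` = 1, `web01` = 1, `cpu` = 2, `0` = 2, which is what the bytes on the slide imply; `row_key` is a name made up for this example.

```python
import struct

UID_WIDTH = 3    # bytes per UID (assumed from the example key)
ROW_SPAN = 3600  # seconds of data stored per row


def uid(number: int) -> bytes:
    """Encode a UID as a fixed-width big-endian byte string."""
    return number.to_bytes(UID_WIDTH, "big")


def row_key(metric_uid: int, timestamp: int, tag_uids) -> bytes:
    """metric + hour-aligned timestamp + tagk1 + tagv1 ... + tagkN + tagvN."""
    base_time = timestamp - (timestamp % ROW_SPAN)  # snap to the hour
    key = uid(metric_uid) + struct.pack(">I", base_time)
    for tagk, tagv in sorted(tag_uids):
        key += uid(tagk) + uid(tagv)
    return key


key = row_key(1, 1234567890, [(1, 1), (2, 2)])
print(key.hex())  # 0000014995fb70000001000001000002000002
```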

Salting
1 Metric + Tag Cardinality > 1 Million + 1 Write per Series, per Second = 1 Sad Region

Salting
● Hash on the metric and tags
● Modulo into one of 20 buckets
● Prepend bucket ID to key
● Writes now hit 20 regions/servers
sys.cpu.user 1234567890 host=web01
x00x00x00x01x49x95xFBx70x00x00x01x00x00x01
sys.cpu.user 1234567890 host=web02
x01x00x00x01x49x95xFBx70x00x00x01x00x00x02
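
A minimal sketch of the salting steps above. The hash function here is CRC32, chosen only because it is stable and in the standard library; it stands in for whatever hash OpenTSDB actually uses, so the bucket numbers will not match the example keys. The function names are hypothetical.

```python
import zlib

SALT_BUCKETS = 20  # number of buckets from the slide


def salt_for(metric_uid: bytes, tag_uids: bytes) -> bytes:
    """Hash the metric and tags (not the timestamp) into a 1-byte bucket ID."""
    bucket = zlib.crc32(metric_uid + tag_uids) % SALT_BUCKETS
    return bytes([bucket])


def salted_row_key(metric_uid: bytes, base_time: bytes, tag_uids: bytes) -> bytes:
    """Prepend the bucket ID to the usual metric + time + tags key."""
    return salt_for(metric_uid, tag_uids) + metric_uid + base_time + tag_uids


metric = bytes.fromhex("000001")
web01 = bytes.fromhex("000001000001")
hour_1 = bytes.fromhex("4995fb70")
hour_2 = bytes.fromhex("49960980")  # one hour later

key_1 = salted_row_key(metric, hour_1, web01)
key_2 = salted_row_key(metric, hour_2, web01)
print(key_1.hex(), key_2.hex())
```

Because the timestamp is left out of the hash, every row of a given series lands in the same bucket, while different series spread across all buckets.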

Salting
● 1M writes per second
to 50K per second =
But wait, there’s more!
● Query with 20
asynchronous
scanners...
20 Happy Regions

Appends
● Row holds one hour of data
o 60 columns @ 1 minute resolution
o 3,600 columns @ 1 second resolution
o 3,600,000 columns @ millisecond resolution
● 3.6M columns, each with its row key + timestamp = network overhead

Appends
● Qualifiers encode
○ offset from row base time
○ data type (float or integer)
○ value length (1 to 8 bytes)
● Timestamp = 4 bytes
● Row key = 13 to 55 bytes
● 1 serialized column = 20 to 69 bytes (row key + timestamp + 2-byte qualifier + 1 to 8 byte value)
[ 0b00001111, 0b11000111 ]
first 12 bits: delta (seconds), next bit: type, last 3 bits: value length
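
The qualifier layout can be sketched as follows, assuming a 2-byte qualifier with a 12-bit offset in seconds, a 1-bit float flag and 3 bits holding the value length minus one. The function names are made up for this example.

```python
def encode_qualifier(delta_seconds: int, is_float: bool, value_length: int) -> bytes:
    """Pack offset, type and value length into a 2-byte qualifier."""
    if not 0 <= delta_seconds < 3600:
        raise ValueError("offset must fall inside the one-hour row")
    if not 1 <= value_length <= 8:
        raise ValueError("value length must be 1 to 8 bytes")
    flags = (0x8 if is_float else 0x0) | (value_length - 1)
    return ((delta_seconds << 4) | flags).to_bytes(2, "big")


def decode_qualifier(qualifier: bytes):
    """Return (delta_seconds, is_float, value_length)."""
    raw = int.from_bytes(qualifier, "big")
    return raw >> 4, bool(raw & 0x8), (raw & 0x7) + 1


qualifier = encode_qualifier(252, False, 8)
print([format(b, "#010b") for b in qualifier])  # ['0b00001111', '0b11000111']
print(decode_qualifier(qualifier))              # (252, False, 8)
```

The example qualifier on the slide therefore decodes to an 8-byte integer written 252 seconds after the row's base time.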

Appends
Compact it into one column!
● Concatenate qualifiers
● Concatenate values
● 1 row key, 1 timestamp
● 1 row from 69K to 14K, 83% savings
Qualifier: [ 0b00000000, 0b00000000, ... 0b00000000, 0b00010000 ]
Value: [ 0b00000001, ... 0b00000002 ]
Key: [ x00x00x00x01x49x95xFBx70x00x00x01x00x00x01 ]
Timestamp: [ x49x95xFBx70 ]
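
The concatenation step can be sketched as below. This is a simplified illustration, not the TSD's compaction code: the real stored format may carry extra bytes that this sketch omits, and `compact` is a hypothetical name.

```python
def compact(columns: dict):
    """Merge {qualifier: value} columns of one row into a single column.

    Sorting the 2-byte qualifiers as bytes orders them by time offset,
    because the offset occupies the high bits.
    """
    ordered = sorted(columns.items())
    qualifier = b"".join(q for q, _ in ordered)
    value = b"".join(v for _, v in ordered)
    return qualifier, value


# Two 1-byte integer points: value 1 at offset 0 and value 2 at offset 1.
row = {bytes([0x00, 0x10]): b"\x02", bytes([0x00, 0x00]): b"\x01"}
qualifier, value = compact(row)
print(qualifier.hex(), value.hex())  # 00000010 0102
```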

Appends
After each hour TSDs iterate over rows written:
● Read row from HBase (via a Get)
● Compact in memory
● Write new column to HBase (via a Put)
● Delete all old columns (via a MultiAction)
Drawbacks:
● Network traversal
● HBase RPC queue
● Cache busting

Appends
Try HBase Appends
● Qualifier special prefix
● Concatenate offsets and values in value array
● Still one key, one timestamp, one column
Qualifier: [ 0b00000004 ]
Value: [ 0b00000000, 0b00000000, 0b00000001,
<-Offset/type/length-> <-value ->
...0b00000000, 0b00010000, 0b00000002 ]
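
A rough sketch of the append layout, assuming each data point is written as its 2-byte offset/type/length qualifier followed by the value bytes, all concatenated in one cell. `append_point` imitates in memory what the HBase append does on the server; both function names are hypothetical.

```python
def append_point(cell: bytes, qualifier: bytes, value: bytes) -> bytes:
    """Append one data point (qualifier + value) to the end of the cell."""
    return cell + qualifier + value


def read_points(cell: bytes):
    """Walk the cell and return (delta_seconds, is_float, value_bytes) tuples."""
    points = []
    i = 0
    while i < len(cell):
        raw = int.from_bytes(cell[i:i + 2], "big")
        length = (raw & 0x7) + 1  # low 3 bits hold value length minus one
        points.append((raw >> 4, bool(raw & 0x8), cell[i + 2:i + 2 + length]))
        i += 2 + length
    return points


cell = b""
cell = append_point(cell, bytes([0x00, 0x00]), b"\x01")  # offset 0, value 1
cell = append_point(cell, bytes([0x00, 0x10]), b"\x02")  # offset 1, value 2
print(list(cell))          # [0, 0, 1, 0, 16, 2]
print(read_points(cell))
```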

Appends
Reduces RPC count and network traffic but
increases region server CPU usage
[Chart: Appends period followed by Puts period]

Storage Exception Plugin
What to do during...
● Region Splits
● Region Moves
● Server Crashes
● AsyncHBase queues RPCs and retries on NSREs
● Requeue into message bus: Kafka
● Spool on disk: RocksDB

Downsampling
● Previously, timestamps were based on the first value
● Now they snap to proper modulo buckets
● Fill missing values with NaN, Null or zeros during emission
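
A minimal sketch of modulo-bucket downsampling with a fill policy. It averages the values in each bucket as an example aggregator; `downsample_avg` and its parameters are hypothetical and only illustrate the idea, not the TSD's implementation.

```python
import math


def downsample_avg(points, interval, start, end, fill=math.nan):
    """Average (timestamp, value) points into interval-aligned buckets.

    Bucket timestamps are snapped with ts - (ts % interval), so they do not
    depend on when the first value arrived. Empty buckets emit `fill`.
    """
    buckets = {}
    for ts, value in points:
        buckets.setdefault(ts - (ts % interval), []).append(value)

    result = []
    first = start - (start % interval)
    for bucket in range(first, end, interval):
        values = buckets.get(bucket)
        result.append((bucket, sum(values) / len(values) if values else fill))
    return result


points = [(1234567890, 40.0), (1234567899, 44.0), (1234568010, 10.0)]
for bucket, value in downsample_avg(points, 60, 1234567890, 1234568040):
    print(bucket, value)
```

Passing `fill=0.0` instead of the default gives the zero fill policy; the middle bucket here has no data and is the one that gets filled.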

New for OpenTSDB 2.1
● Improved compaction code path
● Last value query
● Meta table based lookup filtering
● Read/write TSD modes
● Preload UIDs
● fsck utility update
● CORS support

New for OpenTSDB 2.2
● Salting
● Appends
● Random UID Assignment
● NaN, Null or Zero fill policy during downsampling
● Fully async query path
● Query tracking and statistics
● Additional thread, JVM and AsyncHBase stats

OpenTSDB Community
Bosun - Monitoring system based on OpenTSDB from the
folks at Stack Exchange

OpenTSDB Community
Grafana - Kibana and Elasticsearch based front-end for
OpenTSDB, Graphite and InfluxDB

AsyncHBase 1.7
● AsyncHBase is a fully asynchronous, multi-threaded HBase client
● Supports HBase 0.90 to 1.0
● Remains 2x faster than HTable in PerformanceEvaluation
● Support for scanner filters, META prefetch, “fail-fast” RPCs

New for 1.7
● Secure RPC support
● Per region client statistics
● RPC timeouts
● Atomic AppendRequest
● Unit tests!

GoHBase
● Prototype for a new pure-Go HBase client
● Apache 2.0 License
● https://github.com/tsuna/gohbase
● Still early on but feel free to help

The Future
● New query language with support for
o Complex filters
o Cross metric expressions
o Data manipulation / Analysis
● Distributed queries
● Quality/Confidence measures

More Information
Thank you to everyone who has helped test, debug and add to OpenTSDB
2.1 and 2.2 including, but not limited to:
John Tamblin, Jesse Chang, Rajesh Gopalakrishna Pillai, Siddartha Guthikonda, Sean Miller, Aveek Misra,
Ashwin Ramachandrain, Francis Liu, Slawek Ligus, Gabriel Avellaneda, Nick Whitehead
● Contribute at github.com/OpenTSDB/opentsdb
● Website: opentsdb.net
● Documentation: opentsdb.net/docs/build/html
● Mailing List: groups.google.com/group/opentsdb
Images
● http://photos.jdhancock.com/photo/2013-06-04-212438-the-lonely-vacuum-of-space.html
● http://en.wikipedia.org/wiki/File:Semi-automated-external-monitor-defibrillator.jpg
● http://upload.wikimedia.org/wikipedia/commons/1/17/Dining_table_for_two.jpg
● http://upload.wikimedia.org/wikipedia/commons/9/92/Easy_button.JPG
● http://pixabay.com/en/compression-archiver-compress-149782/
● https://openclipart.org/detail/201862/kerberos-icon
● http://lego.cuusoo.com/ideas/view/96
