OpenTSDB is an open source, distributed time series database for storing large volumes of metrics data and querying them quickly. It scales to trillions of data points across multiple servers. Data is stored in HBase and queried with a simple query language. OpenTSDB 2.x adds salting to distribute writes across servers, a compact storage format based on column appends, and fill policies that substitute NaN, null or zero for missing values during downsampling.
2. Who We Are
Benoit Sigoure
● Created OpenTSDB at StumbleUpon
● Software Engineer @ Arista Networks
Chris Larsen
● Release manager for OpenTSDB 2.x
● Software Engineer @ Yahoo
3. What Is OpenTSDB?
● Open Source Time Series Database
● Store trillions of data points
● Sucks up all data and keeps going
● Never lose precision
● Scales using HBase
4. What good is it?
● Systems Monitoring & Measurement
o Servers
o Networks
● Sensor Data
o The Internet of Things
o SCADA
● Financial Data
● Scientific Experiment Results
5. Use Cases
● > 100 Region Servers ~ 30TB
● 60 TSDs
● 600,000 writes per second
● Primary data store for operational data
● Telemetry from StatsD, Ostrich and derived data from Storm
6. Use Cases
● Monitoring application performance and statistics
● 50 region servers, 2.4M writes/s ~ 200TB
● Multi-tenant and Kerberos secure HBase
● ~200k writes per second per TSD
● Central monitoring for all Yahoo properties
● Over a billion time series served
7. Some Other Users
● Box: 23 servers, 90K wps, system, app, network and business metrics
● Limelight Networks: 8 servers, 30k wps, 24TB of data
● Ticketmaster: 13 servers, 90K wps, ~40GB a day
8. What Are Time Series?
● Time Series: data points for an identity over time
● Typical Identity:
o Dotted string: web01.sys.cpu.user.0
● OpenTSDB Identity:
o Metric: sys.cpu.user
o Tags (name/value pairs):
host=web01 cpu=0
9. What Are Time Series?
Data Point:
● Metric + Tags
● + Value: 42
● + Timestamp: 1234567890
sys.cpu.user 1234567890 42 host=web01 cpu=0
^ a data point ^
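A minimal sketch of how the line above breaks down into metric, timestamp, value and tags. The `DataPoint` type and `parse_data_point` helper are hypothetical names used only for illustration; they are not part of OpenTSDB.

```python
from typing import NamedTuple


class DataPoint(NamedTuple):
    metric: str
    timestamp: int
    value: float  # holds an int when the text is an integer
    tags: dict


def parse_data_point(line: str) -> DataPoint:
    """Split 'metric timestamp value tagk=tagv ...' into its parts."""
    metric, timestamp, value, *tag_pairs = line.split()
    # Keep integers as integers so no precision is lost.
    number = int(value) if value.lstrip("-").isdigit() else float(value)
    tags = dict(pair.split("=", 1) for pair in tag_pairs)
    return DataPoint(metric, int(timestamp), number, tags)


point = parse_data_point("sys.cpu.user 1234567890 42 host=web01 cpu=0")
print(point)
```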
11. Writing Data
1) Open a Telnet-style socket and write:
put sys.cpu.user 1234567890 42 host=web01 cpu=0
2) ...or post JSON to:
http://<host>:<port>/api/put
3) ...or import big files with the CLI
● No schema definition
● No RRD file creation
● Just write!
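The first two write paths can be sketched as follows. This only builds the payloads and does not open a connection. The JSON field names (`metric`, `timestamp`, `value`, `tags`) follow the 2.x HTTP API as commonly documented; treat them as an assumption and check the documentation for your version. The helper names are hypothetical.

```python
import json


def put_line(metric, timestamp, value, tags):
    """Format one data point as a Telnet-style `put` command."""
    tag_text = " ".join(f"{k}={v}" for k, v in tags.items())
    return f"put {metric} {timestamp} {value} {tag_text}\n"


def put_json(metric, timestamp, value, tags):
    """Format the same data point as a JSON body for /api/put."""
    return json.dumps(
        {"metric": metric, "timestamp": timestamp, "value": value, "tags": tags}
    )


tags = {"host": "web01", "cpu": "0"}
line = put_line("sys.cpu.user", 1234567890, 42, tags)
body = put_json("sys.cpu.user", 1234567890, 42, tags)
print(line, end="")
print(body)
```

The line would be written to the TSD socket as-is; the JSON body would be sent in an HTTP POST to the `/api/put` URL shown above.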
12. Querying Data
● Graph with the GUI
● CLI tools
● HTTP API
● Aggregate multiple series
● Simple query language
To average all CPUs on host:
start=1h-ago
avg sys.cpu.user
host=web01
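Over the HTTP API the same query is usually expressed as a `start` parameter plus an `m` parameter of the form `aggregator:metric{tags}`. The sketch below only builds the URL. The host name `tsd.example.com`, the port 4242 and the `query_url` helper are assumptions for illustration.

```python
from urllib.parse import urlencode


def query_url(base, start, aggregator, metric, tags):
    """Build an /api/query URL that aggregates one metric filtered by tags."""
    tag_filter = ",".join(f"{k}={v}" for k, v in tags.items())
    m = f"{aggregator}:{metric}{{{tag_filter}}}"
    # Leave the query syntax characters readable instead of percent-encoding them.
    params = urlencode({"start": start, "m": m}, safe=":{}=,")
    return f"{base}/api/query?{params}"


url = query_url(
    "http://tsd.example.com:4242", "1h-ago", "avg", "sys.cpu.user", {"host": "web01"}
)
print(url)
```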
13. HBase Data Tables
● tsdb - Data point table. Massive
● tsdb-uid - Name to UID and UID to name mappings
● tsdb-meta - Time series index and meta-data
● tsdb-tree - Config and index for hierarchical naming schema
14. Data Table Schema
● Row key is a concatenation of UIDs and time:
o metric + timestamp + tagk1 + tagv1… + tagkN + tagvN
● sys.cpu.user 1234567890 42 host=web01 cpu=0
\x00\x00\x01\x49\x95\xFB\x70\x00\x00\x01\x00\x00\x01\x00\x00\x02\x00\x00\x02
● Timestamp normalized on 1 hour boundaries
● All data points for an hour are stored in one row
● Enables fast scans of all time series for a metric
● …or pass a row key filter for specific time series with particular tags
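The row key layout can be reproduced in a few lines. This is a sketch under stated assumptions: 3-byte UIDs, a 4-byte base time, and the UID assignments implied by the example key (metric sys.cpu.user = 1, host = 1, web01 = 1, cpu = 2, 0 = 2). The helper names are hypothetical.

```python
import struct

UID_WIDTH = 3    # bytes per metric, tag name and tag value UID (assumed)
ROW_SPAN = 3600  # seconds of data held by one row


def uid(number):
    """Encode a UID as a fixed-width big-endian byte string."""
    return number.to_bytes(UID_WIDTH, "big")


def row_key(metric_uid, timestamp, tag_uids):
    """metric UID + hour-aligned base time + sorted tagk/tagv UID pairs."""
    base_time = timestamp - (timestamp % ROW_SPAN)
    key = uid(metric_uid) + struct.pack(">I", base_time)
    for tagk, tagv in sorted(tag_uids):
        key += uid(tagk) + uid(tagv)
    return key


key = row_key(1, 1234567890, [(1, 1), (2, 2)])
print(key.hex())
```

The timestamp 1234567890 normalizes to 1234566000, which is 0x4995FB70, the four bytes that follow the metric UID in the key above.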
16. Salting
● Hash on the metric and tags
● Modulo into one of 20 buckets
● Prepend bucket ID to key
● Writes now hit 20 regions/servers
sys.cpu.user 1234567890 host=web01
\x00\x00\x00\x01\x49\x95\xFB\x70\x00\x00\x01\x00\x00\x01
sys.cpu.user 1234567890 host=web02
\x01\x00\x00\x01\x49\x95\xFB\x70\x00\x00\x01\x00\x00\x02
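A minimal sketch of the salting steps above. CRC32 stands in for the hash function, which is an assumption; OpenTSDB's own hash will assign different buckets than this one. The key layout (3-byte UIDs, 4-byte base time) and the `salted_key` helper are also assumptions.

```python
import zlib

SALT_BUCKETS = 20


def salted_key(row_key: bytes) -> bytes:
    """Prepend a one-byte bucket ID derived from the metric and tag UIDs."""
    # Skip the 4-byte base time (bytes 3..6) so that a series always
    # lands in the same bucket, whatever hour the row covers.
    series_id = row_key[:3] + row_key[7:]
    bucket = zlib.crc32(series_id) % SALT_BUCKETS  # stand-in hash
    return bytes([bucket]) + row_key


hour_one = bytes.fromhex("0000014995fb70000001000001")
hour_two = bytes.fromhex("00000149960980000001000001")  # same series, next hour
print(salted_key(hour_one).hex())
print(salted_key(hour_two).hex())
```

Hashing only the metric and tags keeps each series in one bucket, so a query needs one scanner per bucket rather than one per row.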
17. Salting
● 1M writes per second across 20 buckets = 50K writes per second per region
● 20 Happy Regions
But wait, there’s more!
● Query with 20 asynchronous scanners...
19. Appends
● Row holds one hour of data
o 60 columns @ 1 minute resolution
o 3,600 columns @ 1 second resolution
o 3,600,000 columns @ millisecond resolution
● 3.6M columns + row key + timestamp = network overhead
20. Appends
● Qualifiers encode
○ offset from row base time
○ data type (float or integer)
○ value length (1 to 8 bytes)
● Timestamp = 4 bytes
● Row key >= 13 bytes to 55 bytes
● 1 serialized column >= 20 to 69 bytes
[ 0b00001111, 0b11000111 ]
  first 12 bits: delta (seconds) | next bit: type | last 3 bits: value length
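The example qualifier can be unpacked with a few bit operations. This sketch assumes the 2-byte, second-resolution layout: 12 bits of offset, one float flag, and three bits holding the value length minus one (which is how 3 bits cover 1 to 8 bytes). The function names are hypothetical.

```python
def decode_qualifier(qualifier: bytes):
    """Return (delta_seconds, is_float, value_length) for a 2-byte qualifier."""
    word = int.from_bytes(qualifier, "big")
    delta = word >> 4              # top 12 bits: offset from row base time
    is_float = bool(word & 0x8)    # type flag
    length = (word & 0x7) + 1      # low 3 bits store length - 1 (assumed)
    return delta, is_float, length


def encode_qualifier(delta: int, is_float: bool, length: int) -> bytes:
    """Inverse of decode_qualifier."""
    word = (delta << 4) | (0x8 if is_float else 0) | (length - 1)
    return word.to_bytes(2, "big")


print(decode_qualifier(bytes([0b00001111, 0b11000111])))
```

Under these assumptions the qualifier on the slide is an 8-byte integer written 252 seconds after the row's base time.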
21. Appends
Compact it into one column!
● Concatenate qualifiers
● Concatenate values
● 1 row key, 1 timestamp
● 1 row from 69K to 14K, 83% savings
Qualifier: [ 0b00000000, 0b00000000, ... 0b00000000, 0b00010000 ]
Value: [ 0b00000001, ... 0b00000002 ]
Key: [ \x00\x00\x00\x01\x49\x95\xFB\x70\x00\x00\x01\x00\x00\x01 ]
Timestamp: [ \x49\x95\xFB\x70 ]
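The concatenation step can be sketched as below. This is not OpenTSDB's compaction code: it assumes 2-byte second-resolution qualifiers and ignores details such as millisecond qualifiers and duplicate handling. `compact` is a hypothetical helper.

```python
def compact(columns):
    """Merge one row's columns into a single (qualifier, value) pair.

    columns maps each 2-byte qualifier to its value bytes.
    """
    # Big-endian qualifiers sort bytewise in time-offset order.
    ordered = sorted(columns.items())
    qualifier = b"".join(q for q, _ in ordered)
    value = b"".join(v for _, v in ordered)
    return qualifier, value


columns = {
    bytes([0b00000000, 0b00010000]): b"\x02",  # offset 1 s, 1-byte integer
    bytes([0b00000000, 0b00000000]): b"\x01",  # offset 0 s, 1-byte integer
}
qualifier, value = compact(columns)
print(qualifier.hex(), value.hex())
```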
22. Appends
After each hour TSDs iterate over rows written:
● Read row from HBase (via a Get)
● Compact in memory
● Write new column to HBase (via a Put)
● Delete all old columns (via a MultiAction)
Drawbacks:
● Network traversal
● HBase RPC queue
● Cache busting
23. Appends
Try HBase Appends
● Qualifier special prefix
● Concatenate offsets and values in value array
● Still one key, one timestamp, one column
Qualifier: [ 0b00000004 ]
Value: [ 0b00000000, 0b00000000, 0b00000001,
<-Offset/type/length-> <-value ->
...0b00000000, 0b00010000, 0b00000002 ]
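Reading an appended column back means walking the value array piece by piece. The sketch below assumes each piece is a 2-byte offset/type/length qualifier followed by the value, with the length stored as length minus one. It returns raw value bytes and does not handle out-of-order or duplicate pieces. `parse_appended` is a hypothetical name.

```python
def parse_appended(blob: bytes):
    """Split an appended value array into (delta, is_float, raw_value) tuples."""
    points = []
    i = 0
    while i < len(blob):
        word = int.from_bytes(blob[i:i + 2], "big")
        delta = word >> 4            # offset from row base time, in seconds
        is_float = bool(word & 0x8)  # type flag
        length = (word & 0x7) + 1    # value length in bytes (assumed encoding)
        raw = blob[i + 2:i + 2 + length]
        points.append((delta, is_float, raw))
        i += 2 + length
    return points


# Each write appends one piece to the same column.
blob = b""
blob += bytes([0x00, 0x00, 0x01])  # offset 0 s, 1-byte integer, value 1
blob += bytes([0x00, 0x10, 0x02])  # offset 1 s, 1-byte integer, value 2
print(parse_appended(blob))
```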
24. Appends
Reduces RPC count and network traffic but
increases region server CPU usage
[Chart comparing a period of Appends with a period of Puts]
25. Storage Exception Plugin
What to do during...
● Region Splits
● Region Moves
● Server Crashes
● AsyncHBase queues and retries RPCs on NSREs (NotServingRegionException)
● Requeue into a message bus: Kafka
● Spool on disk: RocksDB
26. Downsampling
● Previously, timestamps were based on the first value
● Now they snap to proper modulo buckets
● Fill missing values with NaN, Null or zeros during emission
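A minimal sketch of bucket snapping with a fill policy, using an average as the downsampling function. The function name, its parameters and the sample data are made up for illustration; this is not OpenTSDB's implementation.

```python
import math


def downsample_avg(points, interval, start, end, fill=math.nan):
    """Average (timestamp, value) pairs into fixed buckets of `interval` seconds.

    Bucket timestamps are multiples of `interval`; empty buckets emit `fill`.
    """
    buckets = {}
    for timestamp, value in points:
        bucket = timestamp - (timestamp % interval)  # snap to modulo bucket
        buckets.setdefault(bucket, []).append(value)

    result = []
    first = start - (start % interval)
    for bucket in range(first, end, interval):
        values = buckets.get(bucket)
        result.append((bucket, sum(values) / len(values) if values else fill))
    return result


points = [(1234566003, 1.0), (1234566051, 3.0), (1234566185, 5.0)]
zero_filled = downsample_avg(points, 60, 1234566000, 1234566240, fill=0)
nan_filled = downsample_avg(points, 60, 1234566000, 1234566240)
print(zero_filled)
```

Passing `fill=None` would correspond to the Null policy. Because every series snaps to the same bucket boundaries, downsampled series line up with each other before aggregation.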
27. New for OpenTSDB 2.1
● Improved compaction code path
● Last value query
● Meta table based lookup filtering
● Read/write TSD modes
● Preload UIDs
● fsck utility update
● CORS support
28. New for OpenTSDB 2.2
● Salting
● Appends
● Random UID Assignment
● NaN, Null or Zero fill policy during downsampling
● Fully async query path
● Query tracking and statistics
● Additional thread, JVM and AsyncHBase stats
31. AsyncHBase 1.7
● AsyncHBase is a fully asynchronous, multi-threaded HBase client
● Supports HBase 0.90 to 1.0
● Remains 2x faster than HTable in PerformanceEvaluation
● Support for scanner filters, META prefetch, “fail-fast” RPCs
32. New for 1.7
● Secure RPC support
● Per region client statistics
● RPC timeouts
● Atomic AppendRequest
● Unit tests!
36. The Future
● New query language with support for
o Complex filters
o Cross metric expressions
o Data manipulation / Analysis
● Distributed queries
● Quality/Confidence measures
37. More Information
Thank you to everyone who has helped test, debug and add to OpenTSDB
2.1 and 2.2 including, but not limited to:
John Tamblin, Jesse Chang, Rajesh Gopalakrishna Pillai, Siddartha Guthikonda, Sean Miller, Aveek Misra,
Ashwin Ramachandrain, Francis Liu, Slawek Ligus, Gabriel Avellaneda, Nick Whitehead
● Contribute at github.com/OpenTSDB/opentsdb
● Website: opentsdb.net
● Documentation: opentsdb.net/docs/build/html
● Mailing List: groups.google.com/group/opentsdb
Images
● http://photos.jdhancock.com/photo/2013-06-04-212438-the-lonely-vacuum-of-space.html
● http://en.wikipedia.org/wiki/File:Semi-automated-external-monitor-defibrillator.jpg
● http://upload.wikimedia.org/wikipedia/commons/1/17/Dining_table_for_two.jpg
● http://upload.wikimedia.org/wikipedia/commons/9/92/Easy_button.JPG
● http://pixabay.com/en/compression-archiver-compress-149782/
● https://openclipart.org/detail/201862/kerberos-icon
● http://lego.cuusoo.com/ideas/view/96