Contenu connexe Similaire à InfluxDB 101 - Concepts and Architecture | Michael DeSa | InfluxData (20) InfluxDB 101 - Concepts and Architecture | Michael DeSa | InfluxData2. © InfluxData. All rights reserved.
Agenda
By the end of this session, participants should be able to…
● Describe time-series data
● Explain what InfluxDB is in relation to InfluxData
● Understand the InfluxDB Data Model
● Differentiate the Data Model for .x and .x
● Reason about schema decisions
10. © InfluxData. All rights reserved.
Is it time-series?
© InfluxData. All rights reserved.
Is it time-series?
11. © InfluxData. All rights reserved.
Is it time-series?
© InfluxData. All rights reserved.
Is it time-series?
12. © InfluxData. All rights reserved.
What is a time-series database
(tsdb)
© InfluxData. All rights reserved.
What is a time-series database
(tsdb)
13. © InfluxData. All rights reserved.
Time-Series Database
● A database where you manage and store
time-series data
● Efficiently handles time-series data
● Supports common time-series data patterns
○ Data retention enforcement
○ Time-based queries
© InfluxData. All rights reserved.
Time-Series Database
● A database where you manage and store
time-series data
● Efficiently handles time-series data
● Supports common time-series data patterns
○ Data retention enforcement
○ Time-based queries
15. © InfluxData. All rights reserved.
Why couldn’t I just use [...]
© InfluxData. All rights reserved.
Why couldn’t I just use [...]
16. © InfluxData. All rights reserved.
Time-Series is not just a
database problem
© InfluxData. All rights reserved.
Time-Series is not just a
database problem
17. © InfluxData. All rights reserved.
Time-Series Problems
● Visualize you time-series
● Monitor & Alert on your time-series
● Process your time-series
© InfluxData. All rights reserved.
Time-Series Problems
● Visualize you time-series
● Monitor & Alert on your time-series
● Process your time-series
18. © InfluxData. All rights reserved.
What is InfluxData/InfluxDB
© InfluxData. All rights reserved.
What is InfluxData/InfluxDB
19. © InfluxData. All rights reserved.
InfluxData History
● - Errplane
● - InfluxDB is born
● - Transition to InfluxData
○ A platform for time-series data
● - Development of InfluxDB . begins
● - Delivered InfluxDB Cloud .
© InfluxData. All rights reserved.
InfluxData History
● - Errplane
● - InfluxDB is born
● - Transition to InfluxData
○ A platform for time-series data
● - Development of InfluxDB . begins
● - Delivered InfluxDB Cloud .
20. © InfluxData. All rights reserved.
InfluxData Values
● Easy to get started with
● Vertically integrated platform for time-series
data
● Aims to solve the time-series problem from
start to finish
● Performant at all scales
© InfluxData. All rights reserved.
InfluxData Values
● Easy to get started with
● Vertically integrated platform for time-series
data
● Aims to solve the time-series problem from
start to finish
● Performant at all scales
22. © InfluxData. All rights reserved.
Canonical Time-Series Line Graph
© InfluxData. All rights reserved.
Canonical Time-Series Line Graph
23. © InfluxData. All rights reserved.
The Measurement (Label)
© InfluxData. All rights reserved.
The Measurement (Label)
24. © InfluxData. All rights reserved.
Tags (Legend)
© InfluxData. All rights reserved.
Tags (Legend)
25. © InfluxData. All rights reserved.
Fields (Y-Axis Values)
© InfluxData. All rights reserved.
Fields (Y-Axis Values)
26. © InfluxData. All rights reserved.
Time (X-Axis)
© InfluxData. All rights reserved.
Time (X-Axis)
27. © InfluxData. All rights reserved.
Series
stock_price,ticker=A,market=NASDAQ
© InfluxData. All rights reserved.
Series
stock_price,ticker=A,market=NASDAQ
28. © InfluxData. All rights reserved.
Data Model
● Measurement
○ High level grouping of data (cpu, memory)
● Tags
○ Indexed key-value pairs (host, version)
● Fields
○ Sequenced key-value pairs (value, price)
● Time
● Series
○ Unique combination of measurement,
tags, fields
© InfluxData. All rights reserved.
Data Model
● Measurement
○ High level grouping of data (cpu, memory)
● Tags
○ Indexed key-value pairs (host, version)
● Fields
○ Sequenced key-value pairs (value, price)
● Time
● Series
○ Unique combination of measurement,
tags, fields
29. © InfluxData. All rights reserved.
How do we express the data?
© InfluxData. All rights reserved.
How do we express the data?
30. © InfluxData. All rights reserved.
Line Protocol
cpu,host=serverA,num=1,region=west idle=1.667,system=2342.2 1492214400000000000
Measurement
31. © InfluxData. All rights reserved.
Line Protocol
cpu,host=serverA,num=1,region=west idle=1.667,system=2342.2 1492214400000000000
Tags
32. © InfluxData. All rights reserved.
Line Protocol
cpu,host=serverA,num=1,region=west idle=1.667,system=2342.2 1492214400000000000
Fields
33. © InfluxData. All rights reserved.
Line Protocol
cpu,host=serverA,num=1,region=west idle=1.667,system=2342.2 1492214400000000000
timestamp
34. © InfluxData. All rights reserved.
Line Protocol
cpu,host=serverA,num=1,region=west idle=1.667,system=2342.2 1492214400000000000
Series
© InfluxData. All rights reserved.
Line Protocol
cpu,host=serverA,num=1,region=west idle=1.667,system=2342.2 1492214400000000000
Series
35. © InfluxData. All rights reserved.
Where is the time-series data stored?
© InfluxData. All rights reserved.
Where is the time-series data stored?
36. © InfluxData. All rights reserved.
InfluxDB 1.x Data Hierarchy
● Database
○ Contains many Retention Policies
○ Are containers for access control
● Retention Policies
○ Define retention rules for time-series data
○ Contain the actual time-series
© InfluxData. All rights reserved.
InfluxDB 1.x Data Hierarchy
● Database
○ Contains many Retention Policies
○ Are containers for access control
● Retention Policies
○ Define retention rules for time-series data
○ Contain the actual time-series
38. © InfluxData. All rights reserved.
InfluxDB 2.0 Data Hierarchy
● Buckets
○ Are containers for access control
○ Define retention rules for time-series data
○ Contain the actual time-series
40. © InfluxData. All rights reserved.
InfluxDB 2.0 Instance Hierarchy
● Organizations
○ Logical groupings of resources
○ Contain Buckets, Dashboards, Tasks, etc...
● Buckets
○ Are containers for access control
○ Define retention rules for time-series data
○ Contain the actual time-series
43. © InfluxData. All rights reserved.
InfluxData Query Languages
● InfluxQL
○ SQL-like query language
● TICKscript
○ Time-series processing language used in Kapacitor
● Flux
○ Next generation functional data-scripting language
44. © InfluxData. All rights reserved.
InfluxQL
> SELECT index, id FROM h2o_quality WHERE time > now() - 1w GROUP BY location
name: h2o_quality
tags: location = coyote_creek
time index id
---- ----- ---
2015-08-18T00:00:00Z 41 1
2015-08-18T00:00:00Z 41 1
name: h2o_quality
tags: location = santa_monica
time index id
---- ----- ---
2015-08-18T00:00:00Z 99 2
2015-08-18T00:06:00Z 56 2
45. © InfluxData. All rights reserved.
TICKscript
var measurement = 'requests'
var data = stream
|from()
.measurement(measurement)
|where(lambda: "is_up" == TRUE)
|where(lambda: "my_field" > 10)
|window()
.period(5m)
.every(5m)
// Count number of points in window
data
|count('value')
.as('the_count')
// Compute mean of data window
data
|mean('value')
.as('the_average')
46. © InfluxData. All rights reserved.
Flux
// get all data from the telegraf db
from(bucket:”telegraf/autogen”)
// filter that by the last hour
|> range(start:-1h)
// filter further by series with a specific measurement and field
|> filter(fn: (r) => r._measurement == "cpu")
|> filter(fn: (r) => r._field == "usage_user")
47. © InfluxData. All rights reserved.
Why Flux?
● Composable
○ Users should be able to take pieces of different scripts and
combine them into a single one to solve their own problem
● Extensible
○ Adding new functions and capabilities to flux should be easy
● Shareable
○ Users should be able to create libraries and packages to
solve specific problems
● Flexible
○ Users should be able to solve their entire processing
workload in flux
© InfluxData. All rights reserved.
Why Flux?
● Composable
○ Users should be able to take pieces of different scripts and
combine them into a single one to solve their own problem
● Extensible
○ Adding new functions and capabilities to flux should be easy
● Shareable
○ Users should be able to create libraries and packages to
solve specific problems
● Flexible
○ Users should be able to solve their entire processing
workload in flux
48. What not to do
Schema Design
What not to do
Schema Design
49. © InfluxData. All rights reserved.
Don’t Encode Data into the Measurement/Tags
Bad:
cpu.server-5.us-west value=2 1444234982000000000
cpu.server-6.us-west value=4 1444234982000000000
mem-free.server-6.us-west value=2500 1444234982000000000
Good:
cpu,host=server-5,region=us-west value=2 1444234982000000000
cpu,host=server-6,region=us-west value=4 1444234982000000000
mem-free,host=server-6,region=us-west value=2500 1444234982000000
50. © InfluxData. All rights reserved.
Don’t Use Tags with VERY High Variability
Bad:
response_time,session_id=33254331,request_id=3424347 value=340 14442349820000
Good-ish:
response_time session_id=33254331,request_id=3424347,value=340 14442349820000
51. © InfluxData. All rights reserved.
Don’t Use too Few Tags
Bad:
cpu,region=us-west host="server1",value=0.5 1444234986000
cpu,region=us-west host="server2",value=4 1444234982000
cpu,region=us-west host="server2",value=1 1444234982000
Good-ish:
cpu,region=us-west,host=server1 value=0.5 1444234986000
cpu,region=us-west,host=server2 value=4 1444234982000
cpu,region=us-west,host=server2 value=1 1444234982000
52. © InfluxData. All rights reserved.
What should I do?
© InfluxData. All rights reserved.
What should I do?
53. © InfluxData. All rights reserved.
Designing a Schema
● What dashboards do I need?
● What alerts do I need?
● What data to I want to use to make reports?
● What data do I need readily available if there is an incident?
● In short, what queries do I need to execute?
If you don’t think about these things first, you can easily end up with
dashboards that take quite a bit of time to load.
© InfluxData. All rights reserved.
Designing a Schema
● What dashboards do I need?
● What alerts do I need?
● What data to I want to use to make reports?
● What data do I need readily available if there is an incident?
● In short, what queries do I need to execute?
If you don’t think about these things first, you can easily end up with
dashboards that take quite a bit of time to load.
54. © InfluxData. All rights reserved.
Schema Example
● I operate a SaaS application
● There are ~ different services I operate
● I want to know the request and error rates for each service
● I want to trigger an alert if the error rate is beyond a theshold for
a specific service
○ Each service will have its own threshold
● I want to see the request rate of the services with the top error
rates
© InfluxData. All rights reserved.
Schema Example
● I operate a SaaS application
● There are ~ different services I operate
● I want to know the request and error rates for each service
● I want to trigger an alert if the error rate is beyond a theshold for
a specific service
○ Each service will have its own threshold
● I want to see the request rate of the services with the top error
rates
55. © InfluxData. All rights reserved.
Data Available
● app Service name, e.g. user_service, auth_service…
● container_id Container ID of the container running the service
● path HTTP request path
● method HTTP method, e.g. GET, POST, DELETE…
● src Hostname of client making request
● dest Hostname of server being contacted
● status HTTP status code associated with the request
● request_id Unique request identifier
● duration Duration of request
● bytes_tx Number of bytes transmitted
● bytes_rx Number of bytes received
© InfluxData. All rights reserved.
Data Available
● app Service name, e.g. user_service, auth_service…
● container_id Container ID of the container running the service
● path HTTP request path
● method HTTP method, e.g. GET, POST, DELETE…
● src Hostname of client making request
● dest Hostname of server being contacted
● status HTTP status code associated with the request
● request_id Unique request identifier
● duration Duration of request
● bytes_tx Number of bytes transmitted
● bytes_rx Number of bytes received
56. © InfluxData. All rights reserved.
Question
Why would it be a bad idea to make request_id a tag?
© InfluxData. All rights reserved.
Question
Why would it be a bad idea to make request_id a tag?
57. © InfluxData. All rights reserved.
Answer
Why would it be a bad idea to make request_id a tag?
request_id has a very high cardinality and would result in the creation of an extremely large number of
series.
59. © InfluxData. All rights reserved.
Data Available
● app Service name, e.g. user_service, auth_service…
● container_id Container ID of the container running the service
● path HTTP request path
● method HTTP method, e.g. GET, POST, DELETE…
● src Hostname of client making request
● dest Hostname of server being contacted
● status HTTP status code associated with the request
● request_id Unique request identifier
● duration Duration of request
● bytes_tx Number of bytes transmitted
● bytes_rx Number of bytes received
60. © InfluxData. All rights reserved.
Answer
measurement:
latency
tags:
app container_id path method src dst status
fields:
request_id duration bytes_tx bytes_rx