Devoxx France 2015: InfluxDB

@zepouet#InfluxDB
:: InfluxDB ::
@zepouet
http://www.treeptik.fr
http://www.cloudunit.fr
http://www.labaixbidouille.com

:: InfluxDB :: Time Series ::
• About Me
• What is a time series?
• State of the Art in 2015
• Why yet another product for time series?
• Live Demo
• Q/A

About Me
• Treeptik
• MarsJUG
• LabAixBidouille

What is a time series?
Things happening in time…

Events, events… events
• Measurements (physical sensors…)
• Exceptions (applications)
• Page views
• User actions
• Git commits
• Webapp deployments
• Things happening in time

What do we have for storage?
• At the moment, we have:
• Graphite
• OpenTSDB (events, Hadoop, HBase…)
• KairosDB (events, a rewrite of OpenTSDB)
• Ganglia (more common in Big Data/Hadoop environments)
• And others…

What do we have for collection?
• At the moment, we have:
• CollectD
• Sensu
• DropWizard/Metrics
• JMXTrans
• Jolokia

Something missing…

Because in 2015, we need
• A product that is simple to install and manage
• Storage for millions of points (IoT is here)
• Native HTTP support (JSON)
• A product built around an API
• Automatic clearing of old data
• Easy scalability: cloud is a buzzword

Use case: Fablab

wiki.labaixbidouille.com/index.php?title=Domotique

Feedback
• Data volume:
• 1 event / sensor / minute
• 1 * 60 * 24 = 1,440 events per day
• 43,200 events per month (30 days)
• 518,400 events per year (12 months of 30 days)
• First mistake: using MySQL
• Second mistake: a bad pattern with InfluxDB
1.21 GIGAWATTS

About InfluxDB
• An open-source distributed time series database
• Developed by Errplane
• MIT License
• Written in Go
• A young but awesome project

InfluxDB :: design goals
• Simple to install and manage, thanks to Go.
• No external dependencies such as ZooKeeper or Hadoop.
• HTTP(S) interface for reading and writing data.
• Horizontally scalable.
• On disk and in memory. Most data is cold.
• Compute percentiles and other functions on the fly.
• Downsample data over different windows of time.

InfluxDB :: installing
• MacOS: $ brew install influxdb
• Debian: $ sudo dpkg -i influxdb_latest_amd64.deb
• CentOS: $ sudo rpm -ivh influxdb-latest-1.x86_64.rpm
• Docker: $ docker run tutum/influxdb
• Coming soon: ARM and Windows

InfluxDB :: running
• $ influxdb -config=/usr/local/etc/influxdb.conf
• Ports
• 8083: UI
• 8086: API
• 8090: Cluster management (Raft)
• 8099: Cluster management (Protobuf)

InfluxDB :: design
• Database (like in MySQL, PostgreSQL…)
• Time series (kind of like tables, with time, sequence number, and columns)
• A time series is composed of points or events (kind of like rows)
• Primary index is always time
• Null values are not stored
• You can have millions of series

InfluxDB :: security
• Cluster admins
• Database admins
• Database users
• Read permissions
• only certain series
• only queries with a column having a specific value (e.g. customer_id = 32)
• Write permissions
• only certain series
• only columns having a specific value

InfluxDB :: create points
# Write one point to the "temp" series
curl -X POST -d '[{"name":"temp","columns":["celsius"],"points":[[23]]}]' 'http://localhost:8086/db/mydb/series?u=root&p=root'
# Read it back
curl -G 'http://localhost:8086/db/mydb/series?u=root&p=root' --data-urlencode "q=select * from temp"
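The same write can be prepared from Python. This is a minimal sketch that only builds the URL and JSON body matching the curl call above and sends nothing. The helper name `build_write_request` is hypothetical, and the endpoint shape is taken from the slide rather than from any client library.

```python
import json
from urllib.parse import urlencode


def build_write_request(base_url, database, user, password, series, columns, points):
    """Build the URL and JSON body for a write shaped like the curl example."""
    query = urlencode({"u": user, "p": password})
    url = f"{base_url}/db/{database}/series?{query}"
    body = json.dumps([{"name": series, "columns": columns, "points": points}])
    return url, body.encode("utf-8")


url, body = build_write_request(
    "http://localhost:8086", "mydb", "root", "root",
    series="temp", columns=["celsius"], points=[[23]],
)
print(url)
print(body.decode("utf-8"))
```

To actually send it, the pair could be passed to `urllib.request.Request(url, data=body, method="POST")` and opened with `urllib.request.urlopen`, assuming a server is listening on that port.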

InfluxDB :: Pitfalls
• Warning: it is schemaless
• Data partitioning with a single series
Time     Name     Host   Metrics
3236765  cpu      web0   78
3236765  disk_io  web0   98344
3236765  load     db1    5
3236765  eth_0    ldap0  8755

Time     Name     Host   Metrics
3236765  disk_io  web0   98344
3236766  disk_io  web0   98354
3236767  disk_io  web0   98224
3236768  disk_io  web0   98994
3236765  eth_0    ldap0  8755
3236766  eth_0    ldap0  8721
3236767  eth_0    ldap0  8734
3236768  eth_0    ldap0  8723
3236765  cpu      web0   78
3236766  cpu      web0   77
3236767  cpu      web0   79
3236768  cpu      web0   76
3236765  load     db1    5
3236766  load     db1    6
3236767  load     db1    5
3236768  load     db1    7

InfluxDB :: Why so many series?
• To take advantage of the storage engines
• Points are indexed by time, not by any other column
• Tip: it makes working with Grafana easy
InfluxDB works best with a large number of series, each with few columns.
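To illustrate the "many small series" advice, here is a minimal sketch that splits rows of one wide series into one series per host and metric. The `<host>.<name>` naming and the helper `split_into_series` are assumptions made for the example, not a convention from the talk.

```python
from collections import defaultdict

# Rows shaped like the single-series table: (time, name, host, value).
rows = [
    (3236765, "cpu", "web0", 78),
    (3236765, "disk_io", "web0", 98344),
    (3236765, "load", "db1", 5),
    (3236765, "eth_0", "ldap0", 8755),
    (3236766, "cpu", "web0", 77),
]


def split_into_series(rows):
    """Group rows into one series per host and metric, keeping only (time, value)."""
    series = defaultdict(list)
    for time, name, host, value in rows:
        series[f"{host}.{name}"].append((time, value))
    return dict(series)


series = split_into_series(rows)
for name, points in sorted(series.items()):
    print(name, points)
```

Each resulting series has a single value column, so a lookup by time alone is enough to read one metric for one host.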

:: Query Language
• select * from /.*/ limit 1
• select val1, val2 from serverA
• select cpu from /server.*/
• select * from /.*/ where time > now() - 1h
• select * from /.*/ where time > '2013-08-12 23:32:00'
• select * from /.*/ group by time(10m)
• select count(val) from /.*/ group by time(10m)
• select percentile(val, 95) from /.*/ group by time(10m)
• select count(distinct(val)) from /.*/
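What `group by time(10m)` computes can be sketched in plain Python: points are bucketed into 10-minute windows, then an aggregate is applied per window. This is only an illustration of the idea. The nearest-rank percentile used here is an assumption and may differ from InfluxDB's own method, and all function names are hypothetical.

```python
import math
from collections import defaultdict

WINDOW_SECONDS = 10 * 60  # corresponds to time(10m)


def group_by_time(points, window=WINDOW_SECONDS):
    """Bucket (timestamp, value) points by the start of their time window."""
    buckets = defaultdict(list)
    for timestamp, value in points:
        buckets[timestamp - timestamp % window].append(value)
    return dict(buckets)


def percentile(values, p):
    """Nearest-rank percentile (assumed method, for illustration only)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]


# Timestamps are seconds; values are arbitrary sample data.
points = [(0, 10), (60, 20), (120, 30), (600, 40), (660, 50)]
buckets = group_by_time(points)
counts = {start: len(values) for start, values in buckets.items()}
p95 = {start: percentile(values, 95) for start, values in buckets.items()}
print(counts)
print(p95)
```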

:: Query Language
• DELETE
• delete from response_times where time < now() - 1h
• delete from /^stats.*/ where time < now() - 7d
• drop series response_times
• GROUP BY
• select count(type) from events group by time(10m);
• select count(type),type from events group by time(10m), type;
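The effect of a regex-scoped delete such as `delete from /^stats.*/ where time < now() - 7d` can be sketched on in-memory data. This is an illustration under assumptions: series are held in a dict of time-stamped points, timestamps are seconds, and `delete_older_than` is a hypothetical helper, not an InfluxDB API.

```python
import re

SEVEN_DAYS = 7 * 24 * 3600  # seconds


def delete_older_than(series, name_pattern, now, max_age_seconds):
    """Drop points older than now - max_age from series whose name matches the pattern."""
    pattern = re.compile(name_pattern)
    cutoff = now - max_age_seconds
    result = {}
    for name, points in series.items():
        if pattern.search(name):
            result[name] = [(t, v) for t, v in points if t >= cutoff]
        else:
            result[name] = list(points)  # untouched: name does not match
    return result


series = {
    "stats.cpu": [(0, 1), (100000, 2)],
    "response_times": [(0, 3)],
}
cleaned = delete_older_than(series, r"^stats.*", now=700000, max_age_seconds=SEVEN_DAYS)
print(cleaned)
```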

:: Visualize and summarize
• Graphs
• Last 10 minutes
• Last 4 hours
• Last 24 hours
• Past week
• Past month
• All time

:: Merging :: Series
• select count(type)
from user_events merge admin_events
group by time(10m)
• select mean(value)
from merge(/.*az.1.*.cpu/)
group by time(1h)
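Conceptually, a merge interleaves several series into one time-ordered stream before aggregating. Here is a minimal sketch of that idea with sample data. The helper names are hypothetical and each input list is assumed to be sorted by time already.

```python
import heapq
from collections import Counter


def merge_series(*series):
    """Merge time-sorted lists of (timestamp, value) into one time-sorted list."""
    return list(heapq.merge(*series, key=lambda point: point[0]))


def count_per_window(points, window_seconds):
    """Count points per time window, keyed by window start."""
    counts = Counter()
    for timestamp, _ in points:
        counts[timestamp - timestamp % window_seconds] += 1
    return dict(counts)


user_events = [(0, "login"), (650, "click")]
admin_events = [(30, "deploy"), (700, "restart")]

merged = merge_series(user_events, admin_events)
counts = count_per_window(merged, window_seconds=600)  # like group by time(10m)
print(merged)
print(counts)
```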

:: Joining :: Series
• select hosta.value + hostb.value
from cpu_load as hosta inner join cpu_load as hostb
where hosta.host = 'hosta.influxdb.org'
and hostb.host = 'hostb.influxdb.org';
• select errors_per_minute.value / page_views_per_minute.value
from errors_per_minute inner join page_views_per_minute
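The second query divides two series point by point. A minimal sketch of that idea follows, joining on identical timestamps only. How InfluxDB aligns points whose timestamps differ is not covered here, and `inner_join` is a hypothetical helper with made-up sample data.

```python
def inner_join(left, right):
    """Join two series of (timestamp, value) on identical timestamps."""
    right_by_time = dict(right)
    return [
        (timestamp, value, right_by_time[timestamp])
        for timestamp, value in left
        if timestamp in right_by_time
    ]


errors_per_minute = [(0, 5), (60, 2), (120, 1)]
page_views_per_minute = [(0, 100), (60, 50), (180, 80)]

# errors_per_minute.value / page_views_per_minute.value, per matching timestamp
error_rate = [
    (timestamp, errors / views)
    for timestamp, errors, views in inner_join(errors_per_minute, page_views_per_minute)
]
print(error_rate)
```

Points present in only one series (timestamps 120 and 180 here) are dropped, which is what makes the join "inner".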

:: Naming Strategy :: 0.8
• Tag versus Value
• Rule:
<tagName>.<tagValue>.seriesName (the tag pair repeats for each tag)
• Examples:
arduino.uno.shield.ethernet.sensor.dht11.temperature
arduino.uno.shield.wifi.sensor.dht22.humidity
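The naming rule can be applied mechanically. A minimal sketch follows, assuming tags are given as ordered (name, value) pairs. `build_series_name` is a hypothetical helper, not part of any InfluxDB client.

```python
def build_series_name(tags, series_name):
    """Join ordered (tag_name, tag_value) pairs and the series name with dots."""
    parts = [part for pair in tags for part in pair]
    return ".".join(parts + [series_name])


name = build_series_name(
    [("arduino", "uno"), ("shield", "ethernet"), ("sensor", "dht11")],
    "temperature",
)
print(name)
```

Keeping the tag order fixed across all series matters here, since regex queries such as `/.*sensor\.dht11.*/` rely on a predictable layout.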

:: Naming Strategy :: 0.9+
• Migration process
• Rule: seriesName = seriesName (tags no longer go in the name)
• Tags are defined in the JSON and indexed

{
  "database": "domotic",
  "points": [
    {
      "name": "temperature_x",
      "tags": {
        "arduino": "uno",
        "shield": "wifi",
        "position": "indoor",
        "sensor": "dht22"
      },
      "timestamp": "2015-03-28T14:50:00Z",
      "fields": {
        "celsius": 23.2,
        "fahrenheit": 73.76
      }
    }
  ]
}
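One possible migration step is to turn a 0.8-style dotted name into a 0.9-style point with explicit tags. This is a sketch under assumptions: names strictly alternate tag name and tag value before the final series name, the JSON keys follow the payload shown on the slide, and the `percent` field is a made-up example value. `name_to_point` is a hypothetical helper.

```python
import json


def name_to_point(dotted_name, fields, timestamp):
    """Convert '<tagName>.<tagValue>....<seriesName>' into a point with a tags dict."""
    *tag_parts, series_name = dotted_name.split(".")
    if len(tag_parts) % 2 != 0:
        raise ValueError(f"tag names and values do not pair up in {dotted_name!r}")
    tags = dict(zip(tag_parts[0::2], tag_parts[1::2]))
    return {"name": series_name, "tags": tags, "timestamp": timestamp, "fields": fields}


point = name_to_point(
    "arduino.uno.shield.wifi.sensor.dht22.humidity",
    fields={"percent": 41.5},  # hypothetical field and value
    timestamp="2015-03-28T14:50:00Z",
)
payload = json.dumps({"database": "domotic", "points": [point]}, indent=2)
print(payload)
```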

:: Continuous Queries
• select count(type) from events
group by time(10m), type
into events.count_per_type.10m
DOWNSAMPLING
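The downsampling performed by this continuous query can be sketched as a one-off computation: count events per type in each 10-minute window and store the result under the target series name. A real continuous query keeps running as new data arrives, which this sketch does not attempt. The in-memory `store` dict and the helper name are assumptions.

```python
from collections import Counter


def downsample_counts(events, window_seconds=600):
    """Count (timestamp, type) events per (window start, type), sorted by window."""
    counts = Counter()
    for timestamp, event_type in events:
        window_start = timestamp - timestamp % window_seconds
        counts[(window_start, event_type)] += 1
    return [(start, event_type, n) for (start, event_type), n in sorted(counts.items())]


events = [(0, "click"), (10, "click"), (20, "login"), (610, "click")]

# Hypothetical in-memory store: series name -> list of points.
store = {"events": events}
store["events.count_per_type.10m"] = downsample_counts(events)
print(store["events.count_per_type.10m"])
```

The downsampled series holds far fewer points than the raw one, so dashboards over long time ranges can query it instead.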

Coming soon, in April 2015
• New clustering model
• Influx shell
• Indexed tags
• Backup

Libraries
• https://github.com/influxdb/influxdb-java
Official Java client
• https://github.com/davidB/metrics-influxdb
A reporter for metrics which announces measurements to an InfluxDB server.
• https://github.com/vietj/vertx-influxdb-metrics
Proof of concept of reporting to InfluxDB

davidb/metrics-influxdb
Unofficial plugin for https://github.com/dropwizard/metrics

Carbon-influxdb
https://github.com/dropwizard/metrics

Demo

Q & A
