Fluentd meetup at Slideshare
2. Self-introduction
> Sadayuki Furuhashi
twitter/github: @frsyuki
> Treasure Data, Inc.
Founder & Software Architect
> Open source projects
MessagePack - “It’s like JSON. but fast and small”
Fluentd - “Log everything in JSON”
5. Collect Store Process Visualize
Reporting & Monitoring
6. easier & shorter time
Collect Store Process Visualize
Hadoop / Hive Excel
MongoDB Tableau
Treasure Data R
7. How to shorten here? easier & shorter time
Collect Store Process Visualize
Hadoop / Hive Excel
MongoDB Tableau
Treasure Data R
12. Fluentd = syslogd + many
✓ Plugins
✓ JSON
13. Sources -> filter / buffer / routing -> Destinations
Sources:
Access logs: Apache
App logs: Frontend, Backend
System logs: syslogd
Databases
Destinations:
Alerting: Nagios
Analysis: MongoDB, MySQL, Hadoop
Archiving: Amazon S3
17. log
Input Plugins -> Output Plugins
time: 2012-02-04 01:33:51
tag: myapp.buylog
record (JSON):
{
  "user": "me",
  "path": "/buyItem",
  "price": 150,
  "referer": "/landing"
}
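The event on this slide has three parts: a time, a tag, and a JSON record. The following is a minimal Python sketch of that structure, not Fluentd's own API (Fluentd is written in Ruby). The `Event` tuple and `make_event` helper are hypothetical names, and treating the slide's timestamp as UTC is an assumption.

```python
import json
from collections import namedtuple
from datetime import datetime, timezone

# Hypothetical container for illustration only.
Event = namedtuple("Event", ["tag", "time", "record"])

def make_event(tag, time_text, record):
    """Build an event from a tag, a 'YYYY-MM-DD HH:MM:SS' time (assumed UTC), and a dict."""
    dt = datetime.strptime(time_text, "%Y-%m-%d %H:%M:%S").replace(tzinfo=timezone.utc)
    return Event(tag=tag, time=int(dt.timestamp()), record=record)

event = make_event(
    "myapp.buylog",
    "2012-02-04 01:33:51",
    {"user": "me", "path": "/buyItem", "price": 150, "referer": "/landing"},
)

# Serialize the three parts as one JSON array: [tag, time, record].
print(json.dumps([event.tag, event.time, event.record]))
```

Keeping the tag and time outside the record lets routing and time slicing work without inspecting the record itself.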
18. in_tail: reads file and parses lines
apache -> access.log -> in_tail -> fluentd
✓ read a log file
✓ custom regexp
✓ custom parser in Ruby
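As a rough illustration of the "custom regexp" idea, the sketch below turns one access-log line into a record using named groups. It is written in Python rather than as a Fluentd plugin, and `LINE_RE` is a simplified pattern assumed for this example, not the expression behind Fluentd's built-in Apache format. The sample line is invented.

```python
import re

# Simplified pattern for an Apache common-log line (an assumption for
# illustration; real formats have more fields and edge cases).
LINE_RE = re.compile(
    r'^(?P<host>\S+) \S+ (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<code>\d{3}) (?P<size>\d+|-)$'
)

def parse_line(line):
    """Return a record dict for one log line, or None if it does not match."""
    m = LINE_RE.match(line.rstrip("\n"))
    if m is None:
        return None
    record = m.groupdict()
    record["code"] = int(record["code"])
    record["size"] = 0 if record["size"] == "-" else int(record["size"])
    return record

sample = '192.0.2.1 - - [04/Feb/2012:01:33:51 +0000] "GET /buyItem HTTP/1.1" 200 777'
record = parse_line(sample)
print(record)
```

Each named group becomes a key in the record, which is how a single regular expression can describe both the line format and the resulting JSON fields.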
19. failure handling & retrying
apache -> access.log -> in_tail -> fluentd (buffer)
✓ retry automatically
✓ exponential retry wait
✓ persistent on a file
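The "retry automatically" and "exponential retry wait" bullets can be sketched as a loop that doubles its wait after each failed flush. This is a Python illustration under stated assumptions, not Fluentd's buffer implementation. `flush_with_retry`, `retry_wait`, and `retry_limit` are names chosen for this example, and the file-backed persistence from the third bullet is not modeled.

```python
import time

def flush_with_retry(flush, chunk, retry_wait=1.0, retry_limit=5, sleep=time.sleep):
    """Call flush(chunk); after each failure wait, then double the wait.

    Returns the list of waits that were applied. Re-raises the last
    error once retry_limit retries have been used up.
    """
    waits = []
    wait = retry_wait
    for attempt in range(retry_limit + 1):
        try:
            flush(chunk)
            return waits
        except Exception:
            if attempt == retry_limit:
                raise
            sleep(wait)
            waits.append(wait)
            wait *= 2  # exponential back-off

# Usage: a destination that fails three times, then succeeds.
failures = {"left": 3}

def flaky_flush(chunk):
    if failures["left"] > 0:
        failures["left"] -= 1
        raise IOError("destination unavailable")

# sleep is replaced with a no-op so the example runs instantly.
applied = flush_with_retry(flaky_flush, b"chunk", sleep=lambda s: None)
print(applied)
```

Doubling the wait keeps a struggling destination from being hammered while still recovering quickly from short outages.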
20. routing / copying
apache -> access.log -> in_tail -> fluentd (buffer) -> Hadoop, Amazon S3
✓ routing based on tags
✓ copy to multiple storages
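Tag-based routing can be pictured as matching a dot-separated tag against patterns and sending the event to every output listed for the first pattern that matches. The Python sketch below is a simplified model under that assumption: `*` stands for exactly one tag part and `**` for zero or more. The rule table and output names are hypothetical.

```python
def tag_matches(pattern, tag):
    """Return True if a dot-separated tag matches a pattern.

    '*' matches exactly one tag part; '**' matches zero or more parts.
    """
    def match(p, t):
        if not p:
            return not t
        if p[0] == "**":
            # Try letting '**' consume 0, 1, 2, ... tag parts.
            return any(match(p[1:], t[i:]) for i in range(len(t) + 1))
        if not t:
            return False
        if p[0] == "*" or p[0] == t[0]:
            return match(p[1:], t[1:])
        return False
    return match(pattern.split("."), tag.split("."))

def route(rules, tag):
    """Return the outputs of the first rule whose pattern matches the tag."""
    for pattern, outputs in rules:
        if tag_matches(pattern, tag):
            return outputs
    return []

rules = [
    ("web.*", ["mongo", "s3"]),  # copy web logs to two storages
    ("**", ["s3"]),              # everything else is archived only
]
print(route(rules, "web.access"))
print(route(rules, "myapp.buylog"))
```

Listing several outputs for one pattern models "copy to multiple storages", while the order of the rules models routing.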
21. Configuration example

    # logs from a file
    <source>
      type tail
      path /var/log/httpd.log
      format apache2
      tag web.access
    </source>

    # logs from client libraries
    <source>
      type forward
      port 24224
    </source>

    # store logs to MongoDB and S3
    <match **>
      type copy

      <store>
        type mongo
        host mongo.example.com
        capped
        capped_size 200m
      </store>

      <store>
        type s3
        path archive/
      </store>
    </match>
Fluentd
22. forwarding
[Diagram: fluentd nodes forward events to other fluentd nodes; each transfer is send / ack]
25. Fluentd - plugin distribution platform
$ fluent-gem search -rd fluent-plugin
$ fluent-gem install fluent-plugin-mongo
117 plugins!
26. Treasure Data?
Collect Store Process Visualize
Hadoop / Hive Excel
MongoDB Tableau
Treasure Data R
our company provides
29. Fluentd and Flume NG - configuration

Fluentd:

    <source>
      type forward
      port 24224
    </source>

    <match **>
      type file
      path /var/log/logs
    </match>

Flume NG:

    # source
    host1.sources = avro-source1
    host1.sources.avro-source1.type = avro
    host1.sources.avro-source1.bind = 0.0.0.0
    host1.sources.avro-source1.port = 41414
    host1.sources.avro-source1.channels = ch1

    # channel
    host1.channels = ch_avro_log
    host1.channels.ch_avro_log.type = memory

    # sink
    host1.sinks = log-sink1
    host1.sinks.log-sink1.type = logger
    host1.sinks.log-sink1.channel = ch1
30. Fluentd and Flume NG - topology
Fluentd: fluentd nodes send to other fluentd nodes (send / ack)
Flume NG: Agents send to Collectors (send / ack)
31. out_hdfs
apache -> access.log -> in_tail -> fluentd (buffer) -> fluentd, fluentd, fluentd
✓ automatic fail-over
✓ load balancing
✓ slice files based on time
✓ retry automatically
✓ exponential retry wait
✓ persistent on a file
2013-01-01/01/access.log.gz
2013-01-01/02/access.log.gz
2013-01-01/03/access.log.gz
...
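The "automatic fail-over" and "load balancing" bullets can be illustrated with a round-robin picker that skips nodes marked as down. This is a Python sketch of the general idea only. The `Forwarder` class and the node names are hypothetical, and how a real forwarder detects a failed node is outside what the slide shows.

```python
import itertools

class Forwarder:
    """Round-robin over downstream nodes, skipping ones marked unavailable."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.down = set()
        self._cycle = itertools.cycle(self.nodes)

    def pick(self):
        # Look at each node at most once per call.
        for _ in range(len(self.nodes)):
            node = next(self._cycle)
            if node not in self.down:
                return node
        raise RuntimeError("no available node")

fwd = Forwarder(["node-a", "node-b", "node-c"])
first_round = [fwd.pick() for _ in range(3)]
fwd.down.add("node-b")  # simulate a failed node
second_round = [fwd.pick() for _ in range(4)]
print(first_round, second_round)
```

While all nodes are healthy the load is spread evenly; once a node is marked down, traffic shifts to the remaining nodes without any change on the sending side.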
32. out_s3
apache -> access.log -> in_tail -> fluentd (buffer) -> Amazon S3
✓ slice files based on time
✓ retry automatically
✓ exponential retry wait
✓ persistent on a file
2013-01-01/01/access.log.gz
2013-01-01/02/access.log.gz
2013-01-01/03/access.log.gz
...
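The file names above suggest one object per hour, keyed by date and hour. A minimal Python sketch of that "slice files based on time" mapping follows. The `sliced_path` helper is a hypothetical name, and using UTC for the slice boundaries is an assumption.

```python
from datetime import datetime, timezone

def sliced_path(epoch_seconds, name="access.log.gz"):
    """Map an event time to an hourly 'YYYY-MM-DD/HH/<name>' path (UTC assumed)."""
    dt = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return dt.strftime("%Y-%m-%d/%H/") + name

t = int(datetime(2013, 1, 1, 1, 30, tzinfo=timezone.utc).timestamp())
print(sliced_path(t))          # 2013-01-01/01/access.log.gz
print(sliced_path(t + 3600))   # 2013-01-01/02/access.log.gz
```

Because the path depends only on the event time, every event from the same hour lands in the same slice regardless of when it is flushed.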
33. out_hdfs
apache -> access.log -> in_tail -> fluentd (buffer) -> HDFS
✓ custom text formatter
✓ slice files based on time
✓ retry automatically
✓ exponential retry wait
✓ persistent on a file
2013-01-01/01/access.log.gz
2013-01-01/02/access.log.gz
2013-01-01/03/access.log.gz
...