Masahiro Nakagawa from Treasure Data gave a presentation on Fluentd, an open source log collector. Fluentd allows for reliable and structured logging, forwarding, and processing of data through its pluggable architecture. It can collect logs from various sources and output to different destinations using plugins. Common uses of Fluentd include log aggregation, monitoring, and analysis on large-scale architectures.
7. Related Products
Collect (easier & shorter time): ???
Store / Process: Cloudera, Hortonworks, Treasure Data
Visualize: Excel, Tableau, R
Thursday, October 31, 13
12. In short
> Open source log collector written in Ruby
> Using the RubyGems ecosystem for plugins
It’s like syslogd, but uses JSON for log messages
14. Event structure(log message)
✓ Time
  > second precision by default
  > taken from the data source, or added as parsed time
✓ Tag
  > for message routing
✓ Record
  > JSON format
  > MessagePack internally
  > non-unstructured (i.e. structured)
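The three fields above can be sketched in plain Ruby. This is only an illustration, not Fluentd's internal representation: the `Event` struct, `build_event`, and `to_json_line` are hypothetical names, and the sketch serializes with the standard-library JSON module rather than MessagePack so that it stays self-contained.

```ruby
require 'json'

# Hypothetical container for one event: tag for routing,
# time as integer seconds, record as a hash of structured fields.
Event = Struct.new(:tag, :time, :record) do
  # Serialize as a [tag, time, record] JSON array (layout assumed for illustration).
  def to_json_line
    JSON.generate([tag, time, record])
  end
end

# Build an event, truncating the timestamp to whole seconds.
def build_event(tag, record, time: Time.now)
  Event.new(tag, time.to_i, record)
end

event = build_event(
  'apache.access',
  { 'method' => 'GET', 'path' => '/index.html', 'code' => 200 },
  time: Time.utc(2013, 10, 31, 12, 0, 0)
)
puts event.to_json_line
```

Keeping the tag outside the record is what lets routing work on the tag alone, without inspecting the payload.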
17. Configuration and operation
● No central / master node
  > HTTP include helps conf sharing
● Operation depends on your environment
  > Use your daemon management
  > Use Chef in Treasure Data
● Apache like syntax and Ruby DSL
18. Configuration example
# receive events via HTTP
<source>
  type http
  port 8888
</source>

# read logs from a file
<source>
  type tail
  path /var/log/httpd.log
  format apache
  tag apache.access
</source>

# save alerts to a file
<match alert.**>
  type file
  path /var/log/fluent/alerts
</match>

# save access logs to MongoDB
<match apache.access>
  type mongo
  database apache
  collection log
</match>

# forward other logs to servers
# (kept last: match blocks are tried in order, and ** matches every tag)
<match **>
  type forward
  <server>
    host 192.168.0.11
    weight 20
  </server>
  <server>
    host 192.168.0.12
    weight 60
  </server>
</match>

include http://example.com/conf
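Tag-based routing with match patterns can be sketched as follows. This is a simplified model, not Fluentd's own matcher: it assumes `*` matches exactly one tag part, `**` matches zero or more tag parts, and the first matching rule wins. The names `tag_match?`, `route`, and `RULES` are hypothetical, and other pattern forms are not handled.

```ruby
# Match an array of pattern parts against an array of tag parts.
def parts_match?(pattern_parts, tag_parts)
  return tag_parts.empty? if pattern_parts.empty?

  head, *rest = pattern_parts
  case head
  when '**'
    # Try consuming zero, one, ... up to all remaining tag parts.
    (0..tag_parts.length).any? { |n| parts_match?(rest, tag_parts.drop(n)) }
  when '*'
    # Consume exactly one tag part, whatever it is.
    !tag_parts.empty? && parts_match?(rest, tag_parts.drop(1))
  else
    # Literal part: must equal the next tag part.
    !tag_parts.empty? && tag_parts.first == head && parts_match?(rest, tag_parts.drop(1))
  end
end

def tag_match?(pattern, tag)
  parts_match?(pattern.split('.'), tag.split('.'))
end

# Rules are checked top to bottom; the first match decides the output.
RULES = [
  ['alert.**', :file],
  ['apache.access', :mongo],
  ['**', :forward]
].freeze

def route(rules, tag)
  rule = rules.find { |pattern, _output| tag_match?(pattern, tag) }
  rule && rule[1]
end

%w[alert.disk.full apache.access app.login].each do |tag|
  puts "#{tag} -> #{route(RULES, tag)}"
end
```

Because the first match wins in this model, a catch-all `**` rule placed ahead of `apache.access` would capture those events before the more specific rule is ever consulted.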
19. Reliability (core + plugin)
● Buffering
  > Use file buffer for persistent data
  > Each buffer chunk has an ID for idempotency
● Retrying
● Error handling
  > transaction, failover, etc. on forward plugin
  > secondary
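How chunk IDs and retrying fit together can be sketched like this. It is a toy model under stated assumptions, not Fluentd's buffer or forward plugin code: `Receiver` and `flush_with_retry` are hypothetical names, the receiver is assumed to remember chunk IDs it has already stored, and the doubling wait between attempts is an assumed backoff policy.

```ruby
# Hypothetical receiver that ignores a chunk it has already stored.
class Receiver
  attr_reader :stored

  def initialize
    @seen = {}
    @stored = []
  end

  def receive(chunk_id, records)
    return :duplicate if @seen.key?(chunk_id)

    @seen[chunk_id] = true
    @stored.concat(records)
    :stored
  end
end

# Send one chunk, retrying on IOError with a doubling wait.
# Returns the number of attempts that were needed.
def flush_with_retry(chunk_id, records, max_retries: 5, base_wait: 0.0)
  attempts = 0
  begin
    attempts += 1
    yield chunk_id, records
  rescue IOError
    raise if attempts > max_retries

    sleep(base_wait * (2**(attempts - 1)))
    retry
  end
  attempts
end

receiver = Receiver.new
calls = 0
attempts = flush_with_retry('chunk-1', [{ 'msg' => 'hello' }]) do |id, recs|
  calls += 1
  receiver.receive(id, recs)
  # Simulate a lost acknowledgement on the first attempt only.
  raise IOError, 'ack lost' if calls == 1
end
puts "attempts=#{attempts} stored=#{receiver.stored.length}"
```

The first attempt stores the records but the sender never learns of it; the retry resends the same chunk ID, so the receiver can discard it instead of storing the data twice.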
30. Other status
● Localizing docs into Japanese
  > https://github.com/fluent/fluentd-docs/tree/master/docs/ja
● Windows support
  > Started by JBAT
  > https://github.com/fluent/fluentd/tree/windows
Feedback and patches are welcome!
31. v11
● Spec is not fixed yet
● Breaking source code compatibility
● Several improvements
  > routing label, filter, error stream, etc.
  > ServerEngine based: multi-process, signal, etc.
● http://magazine.rubyist.net/?0044FluentdV11NewFeatures
32. td-agent
● Open source distribution package of Fluentd
  > ETL part of Treasure Data
  > deb, rpm, homebrew
● Including useful components
  > ruby, jemalloc, fluentd
  > 3rd party gems: td, mongo, webhdfs, etc.
● http://packages.treasure-data.com/
36. Pros and Cons
● Pros
  > Using central master to manage all nodes
● Cons
  > Java culture (Pros for Java-er?)
  > Difficult configuration and setup
  > Difficult topology
  > Mainly for Hadoop
  > fewer plugins?
38. Pros and Cons
● Pros
  > Built-in ElasticSearch and Kibana
  > Bundled 140 plugins (input/filter/codec/output)
  > Works on Windows but unstable...
● Cons
  > mainly for JRuby
  > Need external daemon for centralized env (Redis, RabbitMQ, etc.)
40. Treasure Data
[Diagram: Frontend, Job Queue, Worker, and Hadoop nodes]
Applications push metrics to Fluentd (via local Fluentd)
Fluentd sums up data by minute (partial aggregation) and sends it to:
  > Treasure Data, for historical analysis
  > Librato Metrics, for realtime analysis
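The partial aggregation step can be sketched as a per-minute sum. This is an illustration only, not the code behind the slide: `aggregate_per_minute` and the `[time, name, value]` event layout are assumptions, with `time` taken to be integer seconds.

```ruby
# Sum metric values into one-minute buckets keyed by [minute_start, name].
def aggregate_per_minute(events)
  sums = Hash.new(0)
  events.each do |time, name, value|
    # Round the timestamp down to the start of its minute.
    minute_start = time - (time % 60)
    sums[[minute_start, name]] += value
  end
  sums
end

events = [
  [1_383_220_800, 'requests', 3],
  [1_383_220_830, 'requests', 2],
  [1_383_220_860, 'requests', 5],
  [1_383_220_815, 'errors', 1]
]

aggregate_per_minute(events).each do |(minute_start, name), total|
  puts "#{minute_start} #{name} #{total}"
end
```

Summing before forwarding means the downstream stores receive one value per metric per minute instead of every raw data point.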
41. Cookpad
[Diagram: hundreds of app servers, each a Rails app with td-agent, send event logs to Treasure Data. Daily/hourly batch jobs, MySQL, and Google Spreadsheet appear downstream, for feedback rankings and KPI visualization.]
Logs are available after several mins.
  > Unlimited scalability
  > Flexible schema
  > Realtime
  > Less performance impact
✓ Over 100 RoR servers (2012/2/4)
45. Conclusion
● Fluentd is now a widely-used project
  > There are many use cases
  > Many contributors and plugins
● Keep it simple
  > Easy to use and integrate into your environment