2. Who are you?
• Masahiro Nakagawa
• github: @repeatedly
• Treasure Data Inc.
• Fluentd / td-agent developer
• Fluentd Enterprise support
• I love OSS :)
• D language, MessagePack, organizer of several meetups, etc…
3. Beats
• Agents for specific purposes, by Elastic
• https://www.elastic.co/products/beats
• official: topbeat, filebeat, packetbeat
• 3rd party: dockerbeat, nginxbeat, etc…
• Beats support several outputs: elasticsearch,
logstash, stdout, etc.
• The logstash output uses the lumberjack protocol, so
we can use it to communicate with Beats.
4. Fluentd
• Pluggable streaming event collector
• Lightweight, robust and flexible
• Lots of plugins on rubygems
• Used by AWS, GCP, MS and more companies
• Resources
• http://www.fluentd.org/
• Webinar: https://www.youtube.com/watch?v=6uPB_M7cbYk
5. fluent-plugin-beats
• Input plugin for Elastic Beats
• https://github.com/repeatedly/fluent-plugin-beats
• Uses the lumberjack protocol to handle events
• Tested with topbeat, filebeat, packetbeat
• Beats use the same event format, so it should work
with 3rd party Beats.
8. Note: Performance
• Tested on a single MacBook Pro, not on two separate machines.
2.6 GHz Intel Core i7, 16 GB 1600 MHz DDR3
• Read 100,000 nginx logs and count with flowcounter_simple

Agent                   Throughput
fluentd with in_tail    80,000 events/sec
fluent-agent-hydra      100,000+ events/sec
filebeat                18,000 events/sec
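To put the figures above in relative terms, here is a minimal sketch that converts each reported rate into the time needed for the 100,000-line benchmark and into a speedup ratio. The function and variable names are hypothetical, a constant ingest rate is assumed, and "100,000+" is treated as a lower bound of 100,000.

```python
# Throughput figures reported on the slide (events/sec).
# "100,000+" is treated as a lower bound of 100,000 (assumption).
REPORTED_EVENTS_PER_SEC = {
    "fluentd with in_tail": 80_000,
    "fluent-agent-hydra": 100_000,
    "filebeat": 18_000,
}

TOTAL_EVENTS = 100_000  # nginx log lines read in the benchmark


def seconds_to_ingest(events_per_sec: int, total_events: int = TOTAL_EVENTS) -> float:
    """Time needed to ingest total_events at a constant rate."""
    return total_events / events_per_sec


def speedup_over(baseline: str, other: str) -> float:
    """How many times faster `other` is than `baseline`."""
    return REPORTED_EVENTS_PER_SEC[other] / REPORTED_EVENTS_PER_SEC[baseline]


if __name__ == "__main__":
    for agent, rate in REPORTED_EVENTS_PER_SEC.items():
        print(f"{agent}: {seconds_to_ingest(rate):.2f} s for {TOTAL_EVENTS:,} events")
    ratio = speedup_over("filebeat", "fluentd with in_tail")
    print(f"fluentd with in_tail vs filebeat: {ratio:.1f}x")
```

Under these assumptions, fluentd with in_tail needs about 1.25 s for the whole file and filebeat about 5.56 s, a gap of roughly 4.4x.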
9. Why is filebeat slow?
1. Lumberjack protocol doesn’t focus on throughput
• lumberjack sends/receives an ack on each record
2. Beats framework is slow? [Issue #587]
• filebeat is slower than logstash-forwarder
[Diagram: Lumberjack protocol (publish events: data frame, ack)]
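Why frequent acks limit throughput can be illustrated with a back-of-the-envelope model. This is only a sketch: the function name, the cost model, and every number in it are hypothetical and are not measurements of filebeat or of any lumberjack implementation. It assumes each event costs a fixed time to send and that the sender blocks for a fixed time once per ack.

```python
import math


def modeled_throughput(total_events: int, window: int,
                       send_cost_s: float, ack_wait_s: float) -> float:
    """Events/sec when the sender waits for one ack per `window` events.

    Hypothetical model: every event costs send_cost_s to transmit, and the
    sender blocks for ack_wait_s once per window before sending more.
    """
    if total_events <= 0 or window <= 0:
        raise ValueError("total_events and window must be positive")
    if send_cost_s <= 0 or ack_wait_s < 0:
        raise ValueError("send_cost_s must be > 0 and ack_wait_s must be >= 0")
    ack_rounds = math.ceil(total_events / window)
    total_time = total_events * send_cost_s + ack_rounds * ack_wait_s
    return total_events / total_time


if __name__ == "__main__":
    # Illustrative values only, not measurements.
    SEND_COST_S = 10e-6   # assumed time to send one event
    ACK_WAIT_S = 40e-6    # assumed time blocked waiting for one ack
    for window in (1, 10, 100, 1000):
        rate = modeled_throughput(100_000, window, SEND_COST_S, ACK_WAIT_S)
        print(f"window={window:>4}: {rate:,.0f} events/sec")
```

In this model an ack on every record (window of 1) makes the ack wait dominate the total time, while larger windows amortize it across many events.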
10. Conclusion
• Beats are useful for collecting various metrics
• fluent-plugin-beats can handle Beats events
and route them to elasticsearch properly
• Thanks to the fluent-plugin-elasticsearch plugin ;)
• Note that filebeat is slow, so it is not a good fit
for high-volume environments
• Use fluentd or fluent-agent-hydra instead