Masahiro Nakagawa
Feb 21, 2015
RubyKansai #65
Fluentd
Unified logging layer
Who are you?
> Masahiro Nakagawa
> github/twitter: @repeatedly
> Treasure Data, Inc.
> Senior Software Engineer
> Fluentd / td-agent developer
> Living at OSS :)
> D language - Phobos committer
> Fluentd - Main maintainer
> MessagePack / RPC - D and Python (only RPC)
> The organizer of several meetups (Presto, DTM, etc…)
> etc…
Structured logging
Reliable forwarding
Pluggable architecture
http://fluentd.org/
What’s Fluentd?
> Data collector for unified logging layer
> Streaming data transfer based on JSON
> Written in Ruby
> Various gem-based plugins
> http://www.fluentd.org/plugins
> Working in production
> http://www.fluentd.org/testimonials
Background
Data Analytics Flow
Collect Store Process Visualize
Data source
Reporting
Monitoring
Data Analytics Flow
Store / Process: Cloudera, Hortonworks, Treasure Data
Visualize: Tableau, Excel, R
easier & shorter time
Collect: ???
TD Service Architecture
Time to Value: Acquire → Store → Analyze → Result Push (send query result)
> Data sources: Web Log, App Log, Sensor, CRM, ERP, RDBMS, POS
> Acquire: Treasure Agent (Server), SDK (JS, Android, iOS, Unity), Streaming Collector, Bulk Uploader (Embulk, TD Toolbelt)
> Store: Plazma DB (Flexible, Scalable, Columnar Storage)
> Analyze: SQL-based query (SQL, Pig) @AWS or @IDCF; Batch / Reliability; Ad-hoc / Low latency
> Interfaces: REST API, ODBC / JDBC
> Result Push: KPI Dashboard, BI Tools (Metric Insights, Tableau, Motion Board etc.), Other Products (RDBMS, Google Docs, AWS S3, FTP Server, etc.)
Connectivity, Economy & Flexibility, Simple & Supported
Dive into…
Divide & Conquer & Retry
[Figure: Batch, Stream, and Other stream pipelines, each showing error and retry steps]
Before…
> Applications on Server1, Server2, Server3 → Log Server
> High Latency! must wait for a day...
After…
> Applications on Server1, Server2, Server3, with Fluentd on each server
> Fluentd on each server forwards to downstream Fluentd instances
> In streaming!
Core
> Divide & Conquer
> Buffering & Retrying
> Error handling
> Message routing
> Parallelism
Plugins
> read / receive data
> from API, database, command, etc…
> write / send data
> to API, database, alert, graph, etc…
Apache to Mongo
tail
insert
event
buffering
127.0.0.1 - - [11/Dec/2012:07:26:27] "GET / ...
127.0.0.1 - - [11/Dec/2012:07:26:30] "GET / ...
127.0.0.1 - - [11/Dec/2012:07:26:32] "GET / ...
127.0.0.1 - - [11/Dec/2012:07:26:40] "GET / ...
127.0.0.1 - - [11/Dec/2012:07:27:01] "GET / ...
...
Fluentd
Web Server
2012-02-04 01:33:51
apache.log
{
  "host": "127.0.0.1",
  "method": "GET",
  ...
}
Event structure (log message)
✓ Time
> default second unit
> from data source
✓ Tag
> for message routing
> where is it from?
✓ Record
> JSON format
> MessagePack internally
> schema-free
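The three parts of an event can be sketched in plain Ruby. This is only an illustration: the `Event` struct is a hypothetical name and not a Fluentd class, and the values are taken from the Apache example above.

```ruby
require 'json'

# Hypothetical container for illustration; not part of Fluentd itself.
Event = Struct.new(:tag, :time, :record)

event = Event.new(
  'apache.log',                                  # tag: used for message routing
  Time.utc(2012, 2, 4, 1, 33, 51).to_i,          # time: integer seconds by default
  { 'host' => '127.0.0.1', 'method' => 'GET' }   # record: schema-free key/value data
)

# JSON is the external view of the event.
puts JSON.generate([event.tag, event.time, event.record])
```

Because the record is a plain hash, adding a field needs no schema change.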
Architecture (v0.12 or later)
Engine (not pluggable)
Input
> Forward
> File tail
> ...
Parser
Filter
> grep
> record_transformer
> …
Buffer
> File
> Memory
Formatter
Output
> Forward
> File
> ...
Configuration and operation
> No central / master node
> include helps configuration sharing
> Operation depends on your environment
> Use your daemon management
> Use Chef in Treasure Data
> Apache like syntax and Ruby DSL
# receive events via HTTP
<source>
  type http
  port 8888
</source>

# read logs from a file
<source>
  type tail
  path /var/log/httpd.log
  format apache
  tag apache.access
</source>

# save access logs to MongoDB
<match apache.access>
  type mongo
  database apache
  collection log
</match>

# save alerts to a file
<match alert.**>
  type file
  path /var/log/fluent/alerts
</match>

# forward other logs to servers
<match **>
  type forward
  <server>
    host 192.168.0.11
    weight 20
  </server>
  <server>
    host 192.168.0.12
    weight 60
  </server>
</match>

include http://example.com/conf
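How a tag finds its `<match>` section can be sketched in a few lines of Ruby. This is a simplified model, not Fluentd's actual matcher: `*` matches exactly one tag part, `**` matches zero or more parts, and the first matching pattern in configuration order wins. The helper names `tag_match?` and `route` are made up for this sketch.

```ruby
# Returns true when the pattern parts match the tag parts.
def tag_match?(pattern_parts, tag_parts)
  return tag_parts.empty? if pattern_parts.empty?

  head, *rest = pattern_parts
  case head
  when '**'
    # Try consuming zero, one, two, ... tag parts.
    (0..tag_parts.size).any? { |n| tag_match?(rest, tag_parts.drop(n)) }
  when '*'
    !tag_parts.empty? && tag_match?(rest, tag_parts.drop(1))
  else
    tag_parts.first == head && tag_match?(rest, tag_parts.drop(1))
  end
end

# Patterns in the same order as the <match> sections above.
PATTERNS = ['apache.access', 'alert.**', '**'].freeze

# First matching pattern wins.
def route(tag, patterns = PATTERNS)
  patterns.find { |pattern| tag_match?(pattern.split('.'), tag.split('.')) }
end

%w[apache.access alert.disk.full app.login].each do |tag|
  puts "#{tag} -> #{route(tag)}"
end
```

This ordering is why the catch-all `<match **>` sits last in the configuration: placed first, it would take every event.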
Plugins - use rubygems
$ fluent-gem search -rd fluent-plugin
$ fluent-gem search -rd fluent-mixin
$ fluent-gem install fluent-plugin-mongo
in_tail
Apache (access.log) → Fluentd
✓ read a log file
✓ custom regexp
✓ custom parser in Ruby
Supported format:
> apache
> apache_error
> apache2
> nginx
> syslog
> none
> json
> csv
> tsv
> ltsv
out_webhdfs
Apache (access.log) → Fluentd (buffer) → HDFS
✓ retry automatically
✓ exponential retry wait
✓ persistent on a file
✓ slice files based on time
2013-01-01/01/access.log.gz
2013-01-01/02/access.log.gz
2013-01-01/03/access.log.gz
...
✓ custom text formatter
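"Exponential retry wait" means the pause between attempts doubles after each failure. A minimal sketch of that schedule follows; the method and parameter names are illustrative and are not Fluentd configuration keys.

```ruby
# Wait times for successive retries: start at `initial` seconds and double
# each time, optionally capped at `max`.
def retry_waits(initial:, retries:, max: nil)
  (0...retries).map do |n|
    wait = initial * (2**n)
    max ? [wait, max].min : wait
  end
end

p retry_waits(initial: 1, retries: 6)            # => [1, 2, 4, 8, 16, 32]
p retry_waits(initial: 1, retries: 6, max: 10)   # => [1, 2, 4, 8, 10, 10]
```

Doubling keeps a struggling destination from being hammered, while the file-backed buffer holds the data until a retry succeeds.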
out_copy
Apache (access.log) → Fluentd (buffer) → Amazon S3
✓ routing based on tags
✓ copy to multiple storages
out_forward
Apache (access.log) → Fluentd (buffer) → (apache) → Fluentd, Fluentd, Fluentd
✓ automatic fail-over
✓ load balancing
✓ retry automatically
✓ exponential retry wait
✓ persistent on a file
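Load balancing across forward destinations can be pictured as weighted selection: a server with weight 60 should receive about three times the traffic of one with weight 20. The sketch below is one simple way to get that ratio and is not necessarily how out_forward implements it; `pick_server` is a made-up helper, and the hosts and weights are taken from the configuration example.

```ruby
# Server list mirroring the <server> sections of the forward example.
SERVERS = [
  { host: '192.168.0.11', weight: 20 },
  { host: '192.168.0.12', weight: 60 }
].freeze

# Pick a server with probability proportional to its weight.
def pick_server(servers, rng = Random.new)
  total = servers.sum { |s| s[:weight] }
  point = rng.rand(total)              # integer in 0...total
  servers.each do |s|
    return s if point < s[:weight]
    point -= s[:weight]
  end
end

counts = Hash.new(0)
rng = Random.new(42)
8000.times { counts[pick_server(SERVERS, rng)[:host]] += 1 }
puts counts
```

Fail-over fits the same picture: drop an unreachable server from the list and the remaining weights decide the split.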
Before
> Sources: Apache (Frontend, Access logs), App logs (Backend), syslogd (System logs), MySQL (Databases)
> Destinations: Nagios (Alerting), MongoDB / Hadoop (Analysis), Amazon S3 (Archiving)
After
> Fluentd (or Embulk) in the middle: buffering / processing / routing
M x N → M + N
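The arithmetic behind "M x N → M + N", with made-up counts: without a common layer, each of M sources needs its own integration with each of N destinations, while with Fluentd in the middle each side integrates once.

```ruby
# Made-up counts for illustration.
sources = 4        # M: kinds of log producers
destinations = 4   # N: storage / analysis backends

point_to_point = sources * destinations   # every source talks to every destination
via_fluentd    = sources + destinations   # each side only talks to Fluentd

puts "M x N = #{point_to_point}, M + N = #{via_fluentd}"
```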
Use-cases
Treasure Data
Frontend, Job Queue, Worker, Hadoop, Presto → Fluentd
> Applications push metrics to Fluentd (via local Fluentd)
> Librato Metrics for realtime analysis
> Treasure Data for historical analysis
> Fluentd sums up data per minute (partial aggregation)
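Partial aggregation can be sketched as bucketing metric events by minute and summing each bucket before forwarding. The event data and the `sum_per_minute` helper are made up for this illustration.

```ruby
# Sum metric values per minute before forwarding them.
# Each event is a [unix_time, value] pair.
def sum_per_minute(events)
  events.each_with_object(Hash.new(0)) do |(time, value), sums|
    sums[time - (time % 60)] += value   # bucket key: start of the minute
  end
end

events = [[0, 1], [59, 2], [60, 5], [61, 1], [130, 4]]
sum_per_minute(events).each do |minute, total|
  puts "minute starting at #{minute}: #{total}"
end
```

Five events become three sums here; with hundreds of application servers the reduction in downstream traffic is much larger.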
Cookpad
✓ Over 100 RoR servers (2012/2/4)
> hundreds of app servers: Rails app + td-agent, each sends event logs
> Treasure Data: logs are available after several mins.
> Daily/Hourly Batch → MySQL, Google Spreadsheet
> KPI visualization, Feedback rankings
> Unlimited scalability
> Flexible schema
> Realtime
> Less performance impact
Slideshare
http://engineering.slideshare.net/2014/04/skynet-project-monitor-scale-and-auto-heal-a-system-in-the-cloud/
Log Analysis System And its designs in LINE Corp. 2014 early
Roadmap
v0.10 (old stable)
> Mainly for log forwarding
> with good performance
> working in production
> almost all users use td-agent
> Various plugins
> http://www.fluentd.org/plugins
v0.12 (current stable)
> Event handling improvement
> Filter
> Label
> Error Stream
> At-least-once semantics in forwarding
> require_ack_response parameter
> http://ogibayashi.github.io/blog/2014/12/16/try-fluentd-v0-dot-12-at-least-once/
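At-least-once delivery can be illustrated with a toy sender that resends a chunk until it sees an acknowledgement. Everything here (`FlakyReceiver`, `send_with_ack`, the chunk id) is hypothetical and only models the idea behind `require_ack_response`; it is not the forward protocol itself.

```ruby
# Simulated receiver whose first `lost_acks` acknowledgements get lost.
class FlakyReceiver
  attr_reader :received

  def initialize(lost_acks)
    @lost_acks = lost_acks
    @received = []
  end

  # Stores the chunk and returns its id as the ack, or nil when the ack is lost.
  def deliver(chunk_id, data)
    @received << [chunk_id, data]
    if @lost_acks > 0
      @lost_acks -= 1
      nil
    else
      chunk_id
    end
  end
end

# Resend until the ack comes back; returns the number of attempts.
def send_with_ack(receiver, chunk_id, data, max_tries: 5)
  max_tries.times do |i|
    return i + 1 if receiver.deliver(chunk_id, data) == chunk_id
  end
  raise 'no ack received'
end

receiver = FlakyReceiver.new(2)
tries = send_with_ack(receiver, 'chunk1', '{"event":1}')
puts "acked after #{tries} tries, receiver holds #{receiver.received.size} copies"
```

The receiver ends up with duplicate copies. That is the trade-off of at-least-once: nothing is lost, but downstream consumers may need to tolerate or remove duplicates.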
Filter
> Apply filtering routine to event stream
> No more tag tricks!

v0.10:
<match access.**>
  type record_reformer
  tag reformed.${tag}
</match>

<match reformed.**>
  type growthforecast
</match>

v0.12:
<filter access.**>
  type record_transformer
  …
</filter>

<match access.**>
  type growthforecast
</match>
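Why filters remove the need for tag rewriting can be sketched in plain Ruby: a filter receives (tag, time, record) and returns the modified record, or nil to drop the event, so the tag stays the same and the output keeps matching `access.**`. The lambdas, the `hostname` field, and `apply_filters` are made up for this illustration.

```ruby
# Each filter returns the (possibly modified) record, or nil to drop the event.
add_hostname = lambda do |_tag, _time, record|
  record.merge('hostname' => 'web01')   # hypothetical extra field
end

drop_health_checks = lambda do |_tag, _time, record|
  record['path'] == '/health' ? nil : record
end

# Run every event through the filter chain; tag and time pass through unchanged.
def apply_filters(events, filters)
  events.each_with_object([]) do |(tag, time, record), out|
    filtered = filters.reduce(record) { |rec, f| rec && f.call(tag, time, rec) }
    out << [tag, time, filtered] if filtered
  end
end

events = [
  ['access.web', 1, { 'path' => '/' }],
  ['access.web', 2, { 'path' => '/health' }]
]
result = apply_filters(events, [drop_health_checks, add_hostname])
p result
```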
Label
> Internal event routing
> Redirect events to another group
> much easier to group and share plugins

v0.10:
<source>
  type forward
</source>

<match app1.**>
  type record_reformer
</match>

…

v0.12:
<source>
  type forward
  @label @APP1
</source>

<label @APP1>
  <match access.**>
    type s3
  </match>
</label>
Error stream with Label
> Can handle an error at each record level
> It is still a prototype

Input: chunk1 ({"event":1, ...}, {"event":2, ...}, {"event":3, ...}), chunk2 ({"event":4, ...}, {"event":5, ...}, {"event":6, ...}), …
Output: OK, ERROR!, OK, OK, OK
Records marked ERROR! go to the error stream

<label @ERROR>
  <match **>
    type file
    ...
  </match>
</label>

Built-in @ERROR is used when an error occurs in “emit”
v0.14 (next stable)
> New plugin APIs
> Actor
> New base classes (#309)
> ServerEngine based core engine
> Robust supervisor 
> Sub-second time support (#461)
> Zero downtime restart
Actor
> Easy to write popular routines
> Hide implementation details

v0.10:
class TimerWatcher < Coolio::TimerWatcher
  ...
end

def start
  @loop = Coolio::Loop.new
  @timer = ...
  @loop.attach(@timer)
  @thread = ...
end

v0.14:
def configure(conf)
  actor.every(@interval) {
    router.emit(...)
  }
end

def start
  actor.start
end
Zero downtime restart
> Socket manager shares resources with workers
1. Supervisor listens to the TCP socket
2. Supervisor passes its socket to the worker (heartbeat between Supervisor and Worker)
3. Do the same action when a worker restarts, keeping the TCP socket
TODO: How to implement on JRuby?
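The idea behind zero downtime restart can be shown with a toy model: the listening socket is opened once by the "supervisor" and handed to each new "worker", so clients can keep connecting to the same port while workers come and go. In this sketch threads stand in for worker processes; a real supervisor would pass the socket to separate processes, which is not shown here. `run_worker` is a made-up helper.

```ruby
require 'socket'

# "Supervisor": open the listening socket once and keep it for the whole run.
server = TCPServer.new('127.0.0.1', 0)   # port 0 lets the OS pick a free port
port = server.addr[1]

# "Worker": serves exactly one connection on the socket it was given, then exits.
def run_worker(listener, name)
  Thread.new do
    client = listener.accept
    client.puts("handled by #{name}")
    client.close
  end
end

replies = []
%w[worker-1 worker-2].each do |name|
  worker = run_worker(server, name)      # the same socket goes to each new worker
  conn = TCPSocket.new('127.0.0.1', port)
  replies << conn.gets.chomp
  conn.close
  worker.join                            # worker exits; the listening socket stays open
end
server.close
p replies
```

Both connections go to the same port even though a different worker answers each one. This is the property the supervisor preserves across restarts.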
v1 (future stable)
> Fix new features / APIs
> Plugin APIs
> Default configurations
> Clear versioning and stability
> No breaking API compatibility!
> Breaking compatibility by Fluentd v2 ?
Roadmap summary
> v0.10 (old stable)
> v0.12 (current stable)
> Filter / Label / At-least-once
> v0.14 (spring, 2015)
> New plugin APIs, ServerEngine, Time…
> v1 (early summer, 2015)
> Fix new features / APIs
https://github.com/fluent/fluentd/wiki/V1-Roadmap
Other TODO
> Windows support
> Need feedback!
> https://github.com/fluent/fluentd/tree/windows
> Also check: http://qiita.com/okahashi117
> JRuby support
> msgpack / cool.io now work on JRuby
> https://github.com/fluent/fluentd/issues/317
Ecosystem
Treasure Agent (td-agent)
> Treasure Data distribution of Fluentd
> Treasure Agent 2 is current stable
> Update core components
> We recommend using v2, not v1
> Next version, 2.2.0, uses fluentd v0.12
> This week or next week
fluentd-forwarder
> Forwarding agent written in Go
> Focuses on log forwarding to Fluentd
> Works on Windows
> Bundles TCP input/output and TD output
> No flexible plugin mechanism
> We have a plan to add some input/output
> Similar product
> fluent-agent-lite, fluent-agent-hydra, ik
fluentd-ui
> Manage Fluentd instance via Web UI
> https://github.com/fluent/fluentd-ui
Embulk
> Bulk Loader version of Fluentd
> Pluggable architecture
> JRuby, JVM languages
> High performance parallel processing
> Share your script as a plugin
> https://github.com/embulk
http://www.slideshare.net/frsyuki/embuk-making-data-integration-works-relaxed
[Figure: Embulk with Plugins on both sides, bulk load between HDFS, MySQL, Amazon S3, CSV Files, SequenceFile, Salesforce.com, Elasticsearch, Cassandra, Hive, Redis]
✓ Parallel execution
✓ Data validation
✓ Error recovery
✓ Deterministic behaviour
✓ Idempotent retrying
Check: treasuredata.com
Cloud service for the entire data pipeline

Contenu connexe

Tendances

FBTFTP: an opensource framework to build dynamic tftp servers
FBTFTP: an opensource framework to build dynamic tftp serversFBTFTP: an opensource framework to build dynamic tftp servers
FBTFTP: an opensource framework to build dynamic tftp servers
Angelo Failla
 
Exactly-Once Made Easy: Transactional Messaging in Apache Pulsar - Pulsar Sum...
Exactly-Once Made Easy: Transactional Messaging in Apache Pulsar - Pulsar Sum...Exactly-Once Made Easy: Transactional Messaging in Apache Pulsar - Pulsar Sum...
Exactly-Once Made Easy: Transactional Messaging in Apache Pulsar - Pulsar Sum...
StreamNative
 

Tendances (20)

Apache Flink Hands On
Apache Flink Hands OnApache Flink Hands On
Apache Flink Hands On
 
Pulsar Functions Deep Dive_Sanjeev kulkarni
Pulsar Functions Deep Dive_Sanjeev kulkarniPulsar Functions Deep Dive_Sanjeev kulkarni
Pulsar Functions Deep Dive_Sanjeev kulkarni
 
Linux HTTPS/TCP/IP Stack for the Fast and Secure Web
Linux HTTPS/TCP/IP Stack for the Fast and Secure WebLinux HTTPS/TCP/IP Stack for the Fast and Secure Web
Linux HTTPS/TCP/IP Stack for the Fast and Secure Web
 
Flink 0.10 - Upcoming Features
Flink 0.10 - Upcoming FeaturesFlink 0.10 - Upcoming Features
Flink 0.10 - Upcoming Features
 
Splunk Conf 2014 - Getting the message
Splunk Conf 2014 - Getting the messageSplunk Conf 2014 - Getting the message
Splunk Conf 2014 - Getting the message
 
Going FaaSter, Functions as a Service at Netflix
Going FaaSter, Functions as a Service at NetflixGoing FaaSter, Functions as a Service at Netflix
Going FaaSter, Functions as a Service at Netflix
 
Oracle API Gateway Installation
Oracle API Gateway InstallationOracle API Gateway Installation
Oracle API Gateway Installation
 
The Integration of Laravel with Swoole
The Integration of Laravel with SwooleThe Integration of Laravel with Swoole
The Integration of Laravel with Swoole
 
FBTFTP: an opensource framework to build dynamic tftp servers
FBTFTP: an opensource framework to build dynamic tftp serversFBTFTP: an opensource framework to build dynamic tftp servers
FBTFTP: an opensource framework to build dynamic tftp servers
 
Multitenancy: Kafka clusters for everyone at LINE
Multitenancy: Kafka clusters for everyone at LINEMultitenancy: Kafka clusters for everyone at LINE
Multitenancy: Kafka clusters for everyone at LINE
 
FluentD for end to end monitoring
FluentD for end to end monitoringFluentD for end to end monitoring
FluentD for end to end monitoring
 
"Wie passen Serverless & Autonomous zusammen?"
"Wie passen Serverless & Autonomous zusammen?""Wie passen Serverless & Autonomous zusammen?"
"Wie passen Serverless & Autonomous zusammen?"
 
Experiences with Microservices at Tuenti
Experiences with Microservices at TuentiExperiences with Microservices at Tuenti
Experiences with Microservices at Tuenti
 
OSMC 2021 | Icinga-Installer – the easy way to your Icinga
OSMC 2021 | Icinga-Installer – the easy way to your IcingaOSMC 2021 | Icinga-Installer – the easy way to your Icinga
OSMC 2021 | Icinga-Installer – the easy way to your Icinga
 
Exactly-Once Made Easy: Transactional Messaging in Apache Pulsar - Pulsar Sum...
Exactly-Once Made Easy: Transactional Messaging in Apache Pulsar - Pulsar Sum...Exactly-Once Made Easy: Transactional Messaging in Apache Pulsar - Pulsar Sum...
Exactly-Once Made Easy: Transactional Messaging in Apache Pulsar - Pulsar Sum...
 
Fluentd meetup logging infrastructure in paa s
Fluentd meetup   logging infrastructure in paa sFluentd meetup   logging infrastructure in paa s
Fluentd meetup logging infrastructure in paa s
 
KubeCon EU 2016: Creating an Advanced Load Balancing Solution for Kubernetes ...
KubeCon EU 2016: Creating an Advanced Load Balancing Solution for Kubernetes ...KubeCon EU 2016: Creating an Advanced Load Balancing Solution for Kubernetes ...
KubeCon EU 2016: Creating an Advanced Load Balancing Solution for Kubernetes ...
 
Failsafe Mechanism for Yahoo Homepage
Failsafe Mechanism for Yahoo HomepageFailsafe Mechanism for Yahoo Homepage
Failsafe Mechanism for Yahoo Homepage
 
Dockerizing the Hard Services: Neutron and Nova
Dockerizing the Hard Services: Neutron and NovaDockerizing the Hard Services: Neutron and Nova
Dockerizing the Hard Services: Neutron and Nova
 
14th Athens Big Data Meetup - Landoop Workshop - Apache Kafka Entering The St...
14th Athens Big Data Meetup - Landoop Workshop - Apache Kafka Entering The St...14th Athens Big Data Meetup - Landoop Workshop - Apache Kafka Entering The St...
14th Athens Big Data Meetup - Landoop Workshop - Apache Kafka Entering The St...
 

Similaire à Fluentd - RubyKansai 65

Understanding Presto - Presto meetup @ Tokyo #1
Understanding Presto - Presto meetup @ Tokyo #1Understanding Presto - Presto meetup @ Tokyo #1
Understanding Presto - Presto meetup @ Tokyo #1
Sadayuki Furuhashi
 

Similaire à Fluentd - RubyKansai 65 (20)

Fluentd and Embulk Game Server 4
Fluentd and Embulk Game Server 4Fluentd and Embulk Game Server 4
Fluentd and Embulk Game Server 4
 
The basics of fluentd
The basics of fluentdThe basics of fluentd
The basics of fluentd
 
Fluentd - road to v1 -
Fluentd - road to v1 -Fluentd - road to v1 -
Fluentd - road to v1 -
 
Treasure Data and OSS
Treasure Data and OSSTreasure Data and OSS
Treasure Data and OSS
 
Fluentd Unified Logging Layer At Fossasia
Fluentd Unified Logging Layer At FossasiaFluentd Unified Logging Layer At Fossasia
Fluentd Unified Logging Layer At Fossasia
 
Fluentd: Unified Logging Layer at CWT2014
Fluentd: Unified Logging Layer at CWT2014Fluentd: Unified Logging Layer at CWT2014
Fluentd: Unified Logging Layer at CWT2014
 
SQL for Everything at CWT2014
SQL for Everything at CWT2014SQL for Everything at CWT2014
SQL for Everything at CWT2014
 
Logging for Production Systems in The Container Era
Logging for Production Systems in The Container EraLogging for Production Systems in The Container Era
Logging for Production Systems in The Container Era
 
Fluentd at HKOScon
Fluentd at HKOSconFluentd at HKOScon
Fluentd at HKOScon
 
SQL on Hadoop in Taiwan
SQL on Hadoop in TaiwanSQL on Hadoop in Taiwan
SQL on Hadoop in Taiwan
 
Treasure Data and AWS - Developers.io 2015
Treasure Data and AWS - Developers.io 2015Treasure Data and AWS - Developers.io 2015
Treasure Data and AWS - Developers.io 2015
 
From nothing to Prometheus : one year after
From nothing to Prometheus : one year afterFrom nothing to Prometheus : one year after
From nothing to Prometheus : one year after
 
apidays LIVE Jakarta - REST the events: REST APIs for Event-Driven Architectu...
apidays LIVE Jakarta - REST the events: REST APIs for Event-Driven Architectu...apidays LIVE Jakarta - REST the events: REST APIs for Event-Driven Architectu...
apidays LIVE Jakarta - REST the events: REST APIs for Event-Driven Architectu...
 
Understanding Presto - Presto meetup @ Tokyo #1
Understanding Presto - Presto meetup @ Tokyo #1Understanding Presto - Presto meetup @ Tokyo #1
Understanding Presto - Presto meetup @ Tokyo #1
 
Extreme replication at IOUG Collaborate 15
Extreme replication at IOUG Collaborate 15Extreme replication at IOUG Collaborate 15
Extreme replication at IOUG Collaborate 15
 
Fluentd Project Intro at Kubecon 2019 EU
Fluentd Project Intro at Kubecon 2019 EUFluentd Project Intro at Kubecon 2019 EU
Fluentd Project Intro at Kubecon 2019 EU
 
Microservices Technology Stack
Microservices Technology StackMicroservices Technology Stack
Microservices Technology Stack
 
Prometheus and Docker (Docker Galway, November 2015)
Prometheus and Docker (Docker Galway, November 2015)Prometheus and Docker (Docker Galway, November 2015)
Prometheus and Docker (Docker Galway, November 2015)
 
BigQuery case study in Groovenauts & Dive into the DataflowJavaSDK
BigQuery case study in Groovenauts & Dive into the DataflowJavaSDKBigQuery case study in Groovenauts & Dive into the DataflowJavaSDK
BigQuery case study in Groovenauts & Dive into the DataflowJavaSDK
 
Monitoring Kubernetes with Prometheus (Kubernetes Ireland, 2016)
Monitoring Kubernetes with Prometheus (Kubernetes Ireland, 2016)Monitoring Kubernetes with Prometheus (Kubernetes Ireland, 2016)
Monitoring Kubernetes with Prometheus (Kubernetes Ireland, 2016)
 

Plus de N Masahiro

Plus de N Masahiro (20)

Fluentd v1 and future at techtalk
Fluentd v1 and future at techtalkFluentd v1 and future at techtalk
Fluentd v1 and future at techtalk
 
Fluentd and Distributed Logging at Kubecon
Fluentd and Distributed Logging at KubeconFluentd and Distributed Logging at Kubecon
Fluentd and Distributed Logging at Kubecon
 
Fluentd v1.0 in a nutshell
Fluentd v1.0 in a nutshellFluentd v1.0 in a nutshell
Fluentd v1.0 in a nutshell
 
Presto changes
Presto changesPresto changes
Presto changes
 
Fluentd v0.14 Overview
Fluentd v0.14 OverviewFluentd v0.14 Overview
Fluentd v0.14 Overview
 
Fluentd and Kafka
Fluentd and KafkaFluentd and Kafka
Fluentd and Kafka
 
fluent-plugin-beats at Elasticsearch meetup #14
fluent-plugin-beats at Elasticsearch meetup #14fluent-plugin-beats at Elasticsearch meetup #14
fluent-plugin-beats at Elasticsearch meetup #14
 
Dive into Fluentd plugin v0.12
Dive into Fluentd plugin v0.12Dive into Fluentd plugin v0.12
Dive into Fluentd plugin v0.12
 
Technologies for Data Analytics Platform
Technologies for Data Analytics PlatformTechnologies for Data Analytics Platform
Technologies for Data Analytics Platform
 
Docker and Fluentd
Docker and FluentdDocker and Fluentd
Docker and Fluentd
 
How to create Treasure Data #dotsbigdata
How to create Treasure Data #dotsbigdataHow to create Treasure Data #dotsbigdata
How to create Treasure Data #dotsbigdata
 
Fluentd v0.12 master guide
Fluentd v0.12 master guideFluentd v0.12 master guide
Fluentd v0.12 master guide
 
Can you say the same words even in oss
Can you say the same words even in ossCan you say the same words even in oss
Can you say the same words even in oss
 
I am learing the programming
I am learing the programmingI am learing the programming
I am learing the programming
 
Fluentd meetup dive into fluent plugin (outdated)
Fluentd meetup dive into fluent plugin (outdated)Fluentd meetup dive into fluent plugin (outdated)
Fluentd meetup dive into fluent plugin (outdated)
 
D vs OWKN Language at LLnagoya
D vs OWKN Language at LLnagoyaD vs OWKN Language at LLnagoya
D vs OWKN Language at LLnagoya
 
Goodbye Doost
Goodbye DoostGoodbye Doost
Goodbye Doost
 
Final presentation at pfintern
Final presentation at pfinternFinal presentation at pfintern
Final presentation at pfintern
 
Kernel VM 5 LT
Kernel VM 5 LTKernel VM 5 LT
Kernel VM 5 LT
 
D言語のコミッタになる一つの方法
D言語のコミッタになる一つの方法D言語のコミッタになる一つの方法
D言語のコミッタになる一つの方法
 

Dernier

Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Victor Rentea
 

Dernier (20)

Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptx
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital Adaptability
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
WSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering Developers
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 

Fluentd - RubyKansai 65

  • 1. Masahiro Nakagawa Feb 21, 2015 RubyKansai #65 Fluentd Unified logging layer
  • 2. Who are you? > Masahiro Nakagawa > github/twitter: @repeatedly > Treasure Data, Inc. > Senior Software Engineer > Fluentd / td-agent developer > Living at OSS :) > D language - Phobos committer > Fluentd - Main maintainer > MessagePack / RPC - D and Python (only RPC) > The organizer of several meetups (Presto, DTM, etc…) > etc…
  • 3. Structured logging ! Reliable forwarding ! Pluggable architecture http://fluentd.org/
  • 4. What’s Fluentd? > Data collector for unified logging layer > Streaming data transfer based on JSON > Written in Ruby > Gem based various plugins > http://www.fluentd.org/plugins > Working in production > http://www.fluentd.org/testimonials
  • 6. Data Analytics Flow Collect Store Process Visualize Data source Reporting Monitoring
  • 7. Data Analytics Flow Store Process Cloudera Horton Works Treasure Data Collect Visualize Tableau Excel R easier & shorter time ???
  • 8. TD Service Architecture Time to Value Send query result Result Push Acquire Analyze Store Plazma DB Flexible, Scalable, Columnar Storage Web Log App Log Censor CRM ERP RDBMS Treasure Agent(Server) SDK(JS, Android, iOS, Unity) Streaming Collector Batch / Reliability Ad-hoc /
 Low latency KPI$ KPI Dashboard BI Tools Other Products RDBMS, Google Docs, AWS S3, FTP Server, etc. Metric Insights Tableau, Motion Board etc. POS REST API ODBC / JDBC SQL, Pig Bulk Uploader Embulk,
 TD Toolbelt SQL-based query @AWS or @IDCF Connectivity Economy & Flexibility Simple & Supported
  • 10. Divide & Conquer & Retry error retry error retry retry retry Batch Stream Other stream
  • 13. Core Plugins > Divide & Conquer
 > Buffering & Retrying
 > Error handling
 > Message routing
 > Parallelism > read / receive data > from API, database,
 command, etc… > write / send data > to API, database, alert, graph, etc…
  • 14. Apache to Mongo tail insert event buffering 127.0.0.1 - - [11/Dec/2012:07:26:27] "GET / ... 127.0.0.1 - - [11/Dec/2012:07:26:30] "GET / ... 127.0.0.1 - - [11/Dec/2012:07:26:32] "GET / ... 127.0.0.1 - - [11/Dec/2012:07:26:40] "GET / ... 127.0.0.1 - - [11/Dec/2012:07:27:01] "GET / ... ... Fluentd Web Server 2012-02-04 01:33:51 apache.log { "host": "127.0.0.1", "method": "GET", ... }
  • 15. > default second unit > from data source Event structure(log message) ✓ Time > for message routing > where is from? ✓ Tag > JSON format > MessagePack
 internally > schema-free ✓ Record
  • 16. Architecture (v0.12 or later) EngineInput Filter Output Buffer > grep > record_transfomer > … > Forward > File tail > ... > Forward > File > ... Output > File > Memory not pluggable FormatterParser
  • 17. Configuration and operation > No central / master node > include helps configuration sharing > Operation depends on your environment > Use your deamon management > Use Chef in Treasure Data > Apache like syntax and Ruby DSL
  • 18. # receive events via HTTP <source> type http port 8888 </source> ! # read logs from a file <source> type tail path /var/log/httpd.log format apache tag apache.access </source> ! # save access logs to MongoDB <match apache.access> type mongo database apache collection log </match> # save alerts to a file <match alert.**> type file path /var/log/fluent/alerts </match> ! # forward other logs to servers <match **> type forward <server> host 192.168.0.11 weight 20 </server> <server> host 192.168.0.12 weight 60 </server> </match> ! include http://example.com/conf
  • 19. Plugins - use rubygems $ fluent-gem search -rd fluent-plugin! ! $ fluent-gem search -rd fluent-mixin! ! $ fluent-gem install fluent-plugin-mongo
  • 20. in_tail ✓ read a log file! ✓ custom regexp! ✓ custom parser in Ruby FluentdApache access.log > json > csv > tsv > ltsv Supported format: > apache > apache_error > apache2 > nginx > syslog > none
 

  • 21. out_webhdf Fluentd buffer ✓ retry automatically! ✓ exponential retry wait! ✓ persistent on a file ✓ slice files based on time 2013-01-01/01/access.log.gz! 2013-01-01/02/access.log.gz! 2013-01-01/03/access.log.gz! ... HDFS ✓ custom text formatter Apache access.log
  • 22. out_copy ✓ routing based on tags! ✓ copy to multiple storages Amazon S3 Fluentd buffer Apache access.log
  • 23. out_forward apache ✓ automatic fail-over! ✓ load balancing FluentdApache bufferaccess.log ✓ retry automatically! ✓ exponential retry wait! ✓ persistent on a file Fluentd Fluentd Fluentd
  • 26. Nagios MongoDB Hadoop Alerting Amazon S3 Analysis Archiving MySQL Apache Frontend Access logs syslogd App logs System logs Backend Databases buffering / processing / routing M x N → M + N
  • 28. Treasure Data Frontend Job Queue Worker Hadoop Presto Fluentd Applications push metrics to Fluentd
 (via local Fluentd) Librato Metrics for realtime analysis Treasure Data for historical analysis Fluentd sums up data minutes
 (partial aggregation)
  • 29. hundreds of app servers sends event logs sends event logs sends event logs Rails app td-agent td-agent td-agent Google Spreadsheet Treasure Data MySQL Logs are available after several mins. Daily/Hourly Batch KPI visualizationFeedback rankings Rails app Rails app Unlimited scalability Flexible schema Realtime Less performance impact Cookpad ✓ Over 100 RoR servers (2012/2/4)
  • 31. Log Analysis System And its designs in LINE Corp. 2014 early
  • 33. v0.10 (old stable) > Mainly for log forwarding > with good performance > working in production > almost users use td-agent > Various plugins > http://www.fluentd.org/plugins
  • 34. v0.12 (current stable)
    > Event handling improvement
    > Filter
    > Label
    > Error Stream
    > At-least-once semantics in forwarding
    > require_ack_response parameter
    > http://ogibayashi.github.io/blog/2014/12/16/try-fluentd-v0-dot-12-at-least-once/
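A toy model of at-least-once forwarding, assuming the sender resends a chunk until it receives an ack carrying the same chunk id. `Receiver`, `send_with_ack`, and the lost-ack counter are hypothetical and only mimic the idea behind require_ack_response, not its wire protocol.

```ruby
# Hypothetical in-memory receiver. It stores every chunk it gets and can be
# told to "lose" some acks, as if the network failed after the write succeeded.
class Receiver
  attr_reader :stored

  def initialize(lost_acks)
    @lost_acks = lost_acks
    @stored = []
  end

  # Returns the chunk id as an ack, or nil when the ack is lost in transit.
  def receive(chunk_id, records)
    @stored << [chunk_id, records]
    if @lost_acks > 0
      @lost_acks -= 1
      nil
    else
      chunk_id
    end
  end
end

# Resend the chunk until a matching ack comes back; returns the attempt count.
def send_with_ack(receiver, chunk_id, records, max_attempts: 3)
  max_attempts.times do |i|
    ack = receiver.receive(chunk_id, records)
    return i + 1 if ack == chunk_id
  end
  raise "no ack for #{chunk_id} after #{max_attempts} attempts"
end

receiver = Receiver.new(1) # the first ack gets lost
attempts = send_with_ack(receiver, "chunk-1", [{ "event" => 1 }])
puts "attempts: #{attempts}, copies stored: #{receiver.stored.size}"
```

The chunk is never lost, but it is stored twice: at-least-once delivery trades possible duplicates for no data loss.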
  • 35. Filter
    > Apply filtering routine to event stream
    > No more tag tricks!

    v0.10:
        <match access.**>
          type record_reformer
          tag reformed.${tag}
        </match>

        <match reformed.**>
          type growthforecast
        </match>

    v0.12:
        <filter access.**>
          type record_transformer
          …
        </filter>

        <match access.**>
          type growthforecast
        </match>
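A sketch of the filter idea: records pass through a chain of filters before reaching the output, and the tag never changes, so no re-tagging is needed. Here a filter is assumed to be a callable that returns the (possibly modified) record, or nil to drop it; `apply_filters` and both sample filters are hypothetical.

```ruby
# Run the record through each filter in order; stop early if one drops it.
def apply_filters(filters, tag, time, record)
  filters.reduce(record) do |current, filter|
    break nil if current.nil?
    filter.call(tag, time, current)
  end
end

drop_health  = ->(_tag, _time, record) { record["path"] == "/health" ? nil : record }
add_hostname = ->(_tag, _time, record) { record.merge("hostname" => "web01") }
filters = [drop_health, add_hostname]

output = []
[
  ["access.web", 0, { "path" => "/index.html" }],
  ["access.web", 1, { "path" => "/health" }],
].each do |tag, time, record|
  filtered = apply_filters(filters, tag, time, record)
  output << [tag, filtered] if filtered # same tag in and out
end
```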
  • 36. Label
    > Internal event routing
    > Redirect events to another group
    > much easier to group and share plugins

    v0.10:
        <source>
          type forward
        </source>

        <match app1.**>
          type record_reformer
        </match>

        …

    v0.12:
        <source>
          type forward
          @label @APP1
        </source>

        <label @APP1>
          <match access.**>
            type s3
          </match>
        </label>
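A tiny model of label routing, assuming each label owns its own ordered list of rules and that an event from a labeled source is only matched against that label's rules. The `route` helper, the prefix-based matching, and the in-memory stores are hypothetical simplifications.

```ruby
# groups maps a label name (nil = top level) to ordered [tag prefix, sink] rules.
def route(groups, label, tag, record)
  rules = groups.fetch(label)
  _prefix, sink = rules.find do |prefix, _sink|
    tag == prefix || tag.start_with?(prefix + ".")
  end
  sink << record if sink
end

s3 = []
default_store = []
groups = {
  nil     => [["access", default_store]],
  "@APP1" => [["access", s3]],
}

route(groups, "@APP1", "access.web", { "path" => "/" })  # source with @label @APP1
route(groups, nil,     "access.web", { "path" => "/x" }) # source without a label
```

The same tag lands in different stores depending on the label, which is what makes tag rewriting unnecessary for grouping.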
  • 37. Error stream with Label
    > Can handle an error at each record level
    > It is still a prototype
    > Built-in @ERROR is used when an error occurs in “emit”

    Diagram: Input → chunk1 ({"event":1, ...} {"event":2, ...} {"event":3, ...}), chunk2 ({"event":4, ...} {"event":5, ...} {"event":6, ...}), … → Output. Each record is marked OK or ERROR!; ERROR! records go to the error stream.

        <label @ERROR>
          <match **>
            type file
            ...
          </match>
        </label>
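A sketch of record-level error handling: a record that fails during emit is diverted to an error stream while the rest of the chunk continues. `emit_all`, the failing condition, and the array-backed streams are hypothetical.

```ruby
# Emit each record; on failure, send only that record to the error stream.
def emit_all(records, output, error_stream)
  records.each do |record|
    begin
      output.call(record)
    rescue StandardError => e
      error_stream << [record, e.message]
    end
  end
end

stored = []
errors = []
output = lambda do |record|
  raise ArgumentError, "missing event id" unless record.key?("event")
  stored << record
end

records = [{ "event" => 1 }, { "oops" => true }, { "event" => 3 }]
emit_all(records, output, errors)
```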
  • 38. v0.14 (next stable)
    > New plugin APIs
    > Actor
    > New base classes (#309)
    > ServerEngine based core engine
    > Robust supervisor
    > Sub-second time support (#461)
    > Zero downtime restart
  • 39. Actor
    > Easy to write popular routines
    > Hide implementation details

    v0.10:
        class TimerWatcher < Coolio::TimerWatcher
          ...
        end

        def start
          @loop = Coolio::Loop.new
          @timer = ...
          @loop.attach(@timer)
          @thread = ...
        end

    v0.14:
        def configure(conf)
          actor.every(@interval) { router.emit(...) }
        end

        def start
          actor.start
        end
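A thread-based sketch of what an `every`/`start` helper could hide from plugin authors. This `Actor` class is a hypothetical stand-in written for illustration, not the proposed Fluentd API or its implementation.

```ruby
require "thread"

# Hypothetical helper: register periodic blocks, then run each on its own thread.
class Actor
  def initialize
    @jobs = []
    @threads = []
    @running = false
  end

  # Register a block to run every `interval` seconds once started.
  def every(interval, &block)
    @jobs << [interval, block]
  end

  def start
    @running = true
    @threads = @jobs.map do |interval, block|
      Thread.new do
        while @running
          block.call
          sleep(interval)
        end
      end
    end
  end

  def stop
    @running = false
    @threads.each(&:join)
  end
end

emitted = Queue.new # thread-safe collector standing in for router.emit
actor = Actor.new
actor.every(0.01) { emitted << Time.now }
actor.start
sleep(0.1)
actor.stop
puts "timer fired #{emitted.size} times"
```

The plugin code only declares what should run and how often; the event loop, timer, and thread management stay inside the helper.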
  • 40. Zero downtime restart
    > Socket manager shares resources with workers
    Diagram: Supervisor (TCP)
    1. Listen to TCP socket
  • 41. Zero downtime restart
    > Socket manager shares resources with workers
    Diagram: Supervisor (TCP), heartbeat, Worker (TCP)
    1. Listen to TCP socket
    2. Pass its socket to worker
 
 
 
 
 

  • 42. Zero downtime restart
    > Socket manager shares resources with workers
    Diagram: Supervisor (TCP), heartbeat, Worker (TCP), Worker
    1. Listen to TCP socket
    2. Pass its socket to worker
    3. Do the same when restarting a worker, keeping the TCP socket open
    TODO: How to implement on JRuby?
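A single-process sketch of the three steps above. Here the "supervisor" is the main thread that owns the listening socket, and a "worker" is a thread that borrows it; a real supervisor would pass the socket to separate worker processes. `start_worker` and `request` are hypothetical helpers.

```ruby
require "socket"

# 1. The supervisor listens on the TCP socket and keeps it for its whole life.
server = TCPServer.new("127.0.0.1", 0) # port 0: let the OS pick a free port
port = server.addr[1]

# 2. The socket is handed to a worker, which accepts connections on it.
def start_worker(server, name)
  Thread.new do
    loop do
      client = server.accept
      client.puts("handled by #{name}")
      client.close
    end
  end
end

def request(port)
  TCPSocket.open("127.0.0.1", port) { |socket| socket.gets.chomp }
end

worker = start_worker(server, "worker-1")
first = request(port)

# 3. Restart the worker. The listening socket stays open, so a client that
#    connects in between waits in the accept queue instead of being refused.
worker.kill
worker.join
pending = TCPSocket.new("127.0.0.1", port)
worker = start_worker(server, "worker-2")
second = pending.gets.chomp
pending.close

worker.kill
worker.join
server.close
puts first, second
```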
  • 43. v1 (future stable)
    > Fix new features / APIs
    > Plugin APIs
    > Default configurations
    > Clear versioning and stability
    > No breaking API compatibility!
    > Breaking compatibility by Fluentd v2 ?
  • 44. Roadmap summary
    > v0.10 (old stable)
    > v0.12 (current stable): Filter / Label / At-least-once
    > v0.14 (spring, 2015): New plugin APIs, ServerEngine, Time…
    > v1 (early summer, 2015): Fix new features / APIs
    https://github.com/fluent/fluentd/wiki/V1-Roadmap
  • 45. Other TODO
    > Windows support
    > Need feedback!
    > https://github.com/fluent/fluentd/tree/windows
    > Also check: http://qiita.com/okahashi117
    > JRuby support
    > msgpack / cool.io now work on JRuby
    > https://github.com/fluent/fluentd/issues/317
  • 47. Treasure Agent (td-agent)
    > Treasure Data distribution of Fluentd
    > Treasure Agent 2 is current stable
    > Update core components
    > We recommend using v2, not v1
    > Next version, 2.2.0, uses fluentd v0.12
    > Coming this week or next week
  • 48. fluentd-forwarder
    > Forwarding agent written in Go
    > Focused on log forwarding to Fluentd
    > Works on Windows
    > Bundles TCP input/output and TD output
    > No flexible plugin mechanism
    > We have a plan to add some input/output
    > Similar products
    > fluent-agent-lite, fluent-agent-hydra, ik
  • 49. fluentd-ui > Manage Fluentd instance via Web UI > https://github.com/fluent/fluentd-ui
 
 
 
 
 

  • 50. Embulk
    > Bulk Loader version of Fluentd
    > Pluggable architecture
    > JRuby, JVM languages
    > High performance parallel processing
    > Share your script as a plugin
    > https://github.com/embulk
    > http://www.slideshare.net/frsyuki/embuk-making-data-integration-works-relaxed
  • 51. Diagram: Plugins → Embulk (bulk load) → Plugins
    Systems shown: HDFS, MySQL, Amazon S3, CSV Files, SequenceFile, Salesforce.com, Elasticsearch, Cassandra, Hive, Redis
    ✓ Parallel execution
    ✓ Data validation
    ✓ Error recovery
    ✓ Deterministic behaviour
    ✓ Idempotent retrying
  • 52. Check: treasuredata.com Cloud service for the entire data pipeline