SlideShare une entreprise Scribd logo
1  sur  34
Télécharger pour lire hors ligne
Fluentd and Distributed Logging
Masahiro Nakagawa

Senior Software Engineer   
CNCon / KubeCon at North America
Logging and Containers
Logging on production
• Service Logs
• Web access logs
• Ad logs
• Transcation logs (Game, EC, etc…)
• System Logs
• Syslog, systemd and other logs
• Audit logs
• Metrics (CPU, memory, etc…)
Logs for Bussiness
Logs for Service
KPI
Machine Learning
…
System monitoring
Root cause check
…
Distributed tracing
The Container Era
Server Era Container Era
Service Architecture Monolithic Microservices
System Image Mutable Immutable
Local Data Persistent Ephemeral
Network Physical addresses No fixed addresses
Log Collection syslogd / rsync ?
• No permanent storage



• No fixed physical addresses



• No fixed mappings between servers and roles



• Lots of application types
Logging challenges with Containers
Transfer logs to anywhere ASAP
Push logs from containers
Label logs with Service/Tags
Need to handle various logs
Fluentd overview
Simple core

+ Variety of plugins
Buffering, HA (failover),
Secondary output, etc.
Like syslogd in streaming manner
AN EXTENSIBLE & RELIABLE DATA COLLECTION TOOL
What’s Fluentd?
Streaming way with Fluentd
Log Server
Application
Server A
File FileFile
Application
Server C
File FileFile
Application
Server B
File FileFile
Low latency!

Seconds or minutes
Easy to analyze!!

Parsed and formatted
M x N problem for data integration
LOG
script to
parse data
cron job for
loading
filtering
script
syslog
script
Tweet-
fetching
script
aggregation
script
aggregation
script
script to
parse data
rsync
server
LOG
A solution: unified logging layer
M + N
Fluentd Architecture
Internal Architecture (simplified)
Plugin
Input Filter Buffer Output
Plugin Plugin Plugin
2017-12-06 15:15:15

myapp.buy
Time
Tag
Record
{

“user”:”me”,

“path”: “/buyItem”,

“price”: 150,

“referer”: “/landing”

}
Retry
Error
Retry
Batch
Stream Error
Retry
Retry
Divide & Conquer for retry
Divide & Conquer for recovery
Buffer
(on-disk or in-memory)
Error
Overloaded!!
recovery
recovery + flow control
queued chunks
3rd party input plugins
dstat
df
AMQL
munin
SQL
3rd party output plugins
Graphite
Distributed Logging
Architecture
Source (Container + Agent)
Transferring / Aggregation Layer
Destination (Storage / Database / Service)
Logging Workflow
Source
Aggregator
Destination
• Retrieve logs: File / Network / API …
• Parse payload for structured logging
• Get logs from multiple sources
• Split/Merge logs into streams
• Receive logs from Aggreagtors
• Store formatted logs
How to collect logs from containers

using Fluentd in source layer?
Text logging with --log-driver=fluentd
Server
Container
App
FluentdSTDOUT / STDERR
docker run 
--log-driver=fluentd 

--log-opt 
fluentd-address=localhost:24224
{

“container_id”: “ad6d5d32576a”,

“container_name”: “myapp”,

“source”: stdout

}
Metrics collection with fluent-logger
Server
Container
App
Fluentd
from fluent import sender
from fluent import event
sender.setup('app.events', host='localhost')
event.Event('purchase', {
'user_id': 21, 'item_id': 321, 'value': '1'
})
tag = app.events.purchase

{

“user_id”: 21,

“item_id”: 321

“value”: 1,

}
fluent-logger library
Shared data volume and tailing
Server
Container
App
Fluentd
<source>
@type tail
path /mnt/*/access.log
pos_file /var/log/fluentd/access.log.pos
<format>
@type nginx
</format>
tag nginx.access
</source>
/mnt/nginx/logs
Logging methods for each purpose
• Collecting log messages
• --log-driver=fluentd
• Application metrics
• fluent-logger
• Access logs, logs from middleware
• Shared data volume
• System metrics (CPU usage, Disk capacity, etc.)
• Fluentd’s input plugins (Fluentd pulls data periodically)
• Prometheus or other monitoring agent
Deployment Patterns
Server 1
Container A
Application
Container B
Application
Server 2
Container C
Application
Container D
Application
Kafka
elasticsearch
HDFS
Container
Container
Container
Container
Primitive deployment…
Too many connections
from many containers!
Embedding destination
IPsin ALL Docker images

makes management hard
Server 1
Container A
Application
Container B
Application
Fluentd
Server 2
Container C
Application
Container D
Application
Fluentd Kafka
elasticsearch
HDFS
Container
Container
Container
Container
Destination is always
localhost from app’s
point of view
Source aggregation decouples config
from apps
Server 1
Container A
Application
Container B
Application
Fluentd
Server 2
Container C
Application
Container D
Application
Fluentd
active / standby /
load balancing
Destination aggregation makes storages scalable
for high traffic
Aggregation server(s)
Aggregation servers
• Logging directly from microservices makes log storages
overloaded.
• Too many connections
• Too frequent import API calls
• Aggregation servers make the logging infrastracture
more reliable and scalable.
• Connection aggregation
• Buffering for less frequent import API calls
• Data persistency during downtime
• Automatic retry at recovery from downtime
Source side Aggregation
Destination

side

Aggregation
No Yes
No
Yes
Should use these patterns?
• Source-side aggregation: Yes
• Fluentd frees logging pain from applications
• Buffering, Retry, HA, etc…
• Application don’t need to care destination changes
• Destination-side aggregation: It depends
• good for high traffic
• maybe, no need for cloud logging services
• may need for self-hosted distributed systems or

cloud services which charges per request
Scalable Distributed Logging
• Network
• Split heavy traffic into traffics to nodes
• Merge connections
• CPU / Memory
• Distribute processing to nodes about heavy processing
• High Availability
• Switch / fallback from a node to another for failure
• Agility
• Avoid reconfiguring whole logging layer to modify
destinations
Fluentd ♡ Container
• Fluentd model fits container based systems
• Pluggable and Robust pipelines
• Support typical deployment patterns
• Smart CNCF products for scalable system
• k8s: Container orchestration
• Prometheus: Monitoring
• Fluentd: Logging
• JAEGER: Distributed Tracing
• etc…
Let’s make scalable and stable system!
Enjoy logging!

Contenu connexe

Tendances

Taming the Tiger: Tips and Tricks for Using Telegraf
Taming the Tiger: Tips and Tricks for Using TelegrafTaming the Tiger: Tips and Tricks for Using Telegraf
Taming the Tiger: Tips and Tricks for Using TelegrafInfluxData
 
The evolution of Netflix's S3 data warehouse (Strata NY 2018)
The evolution of Netflix's S3 data warehouse (Strata NY 2018)The evolution of Netflix's S3 data warehouse (Strata NY 2018)
The evolution of Netflix's S3 data warehouse (Strata NY 2018)Ryan Blue
 
Sizing MongoDB Clusters
Sizing MongoDB Clusters Sizing MongoDB Clusters
Sizing MongoDB Clusters MongoDB
 
[KubeConEU2023] Lima pavilion
[KubeConEU2023] Lima pavilion[KubeConEU2023] Lima pavilion
[KubeConEU2023] Lima pavilionAkihiro Suda
 
Shipping Data from Postgres to Clickhouse, by Murat Kabilov, Adjust
Shipping Data from Postgres to Clickhouse, by Murat Kabilov, AdjustShipping Data from Postgres to Clickhouse, by Murat Kabilov, Adjust
Shipping Data from Postgres to Clickhouse, by Murat Kabilov, AdjustAltinity Ltd
 
Linux Networking Explained
Linux Networking ExplainedLinux Networking Explained
Linux Networking ExplainedThomas Graf
 
Airflow presentation
Airflow presentationAirflow presentation
Airflow presentationIlias Okacha
 
(BDT318) How Netflix Handles Up To 8 Million Events Per Second
(BDT318) How Netflix Handles Up To 8 Million Events Per Second(BDT318) How Netflix Handles Up To 8 Million Events Per Second
(BDT318) How Netflix Handles Up To 8 Million Events Per SecondAmazon Web Services
 
High Availability PostgreSQL with Zalando Patroni
High Availability PostgreSQL with Zalando PatroniHigh Availability PostgreSQL with Zalando Patroni
High Availability PostgreSQL with Zalando PatroniZalando Technology
 
ClickHouse in Real Life. Case Studies and Best Practices, by Alexander Zaitsev
ClickHouse in Real Life. Case Studies and Best Practices, by Alexander ZaitsevClickHouse in Real Life. Case Studies and Best Practices, by Alexander Zaitsev
ClickHouse in Real Life. Case Studies and Best Practices, by Alexander ZaitsevAltinity Ltd
 
SeaweedFS introduction
SeaweedFS introductionSeaweedFS introduction
SeaweedFS introductionchrislusf
 
Federated Engine 실무적용사례
Federated Engine 실무적용사례Federated Engine 실무적용사례
Federated Engine 실무적용사례I Goo Lee
 
Deep dive into Kubernetes Networking
Deep dive into Kubernetes NetworkingDeep dive into Kubernetes Networking
Deep dive into Kubernetes NetworkingSreenivas Makam
 
Scaling Microservices with Kubernetes
Scaling Microservices with KubernetesScaling Microservices with Kubernetes
Scaling Microservices with KubernetesDeivid Hahn Fração
 
Handle Large Messages In Apache Kafka
Handle Large Messages In Apache KafkaHandle Large Messages In Apache Kafka
Handle Large Messages In Apache KafkaJiangjie Qin
 
Real-time Analytics with Upsert Using Apache Kafka and Apache Pinot | Yupeng ...
Real-time Analytics with Upsert Using Apache Kafka and Apache Pinot | Yupeng ...Real-time Analytics with Upsert Using Apache Kafka and Apache Pinot | Yupeng ...
Real-time Analytics with Upsert Using Apache Kafka and Apache Pinot | Yupeng ...HostedbyConfluent
 
Implementing Microservices with NATS
Implementing Microservices with NATSImplementing Microservices with NATS
Implementing Microservices with NATSApcera
 

Tendances (20)

Taming the Tiger: Tips and Tricks for Using Telegraf
Taming the Tiger: Tips and Tricks for Using TelegrafTaming the Tiger: Tips and Tricks for Using Telegraf
Taming the Tiger: Tips and Tricks for Using Telegraf
 
The evolution of Netflix's S3 data warehouse (Strata NY 2018)
The evolution of Netflix's S3 data warehouse (Strata NY 2018)The evolution of Netflix's S3 data warehouse (Strata NY 2018)
The evolution of Netflix's S3 data warehouse (Strata NY 2018)
 
Sizing MongoDB Clusters
Sizing MongoDB Clusters Sizing MongoDB Clusters
Sizing MongoDB Clusters
 
[KubeConEU2023] Lima pavilion
[KubeConEU2023] Lima pavilion[KubeConEU2023] Lima pavilion
[KubeConEU2023] Lima pavilion
 
Apache Airflow overview
Apache Airflow overviewApache Airflow overview
Apache Airflow overview
 
Shipping Data from Postgres to Clickhouse, by Murat Kabilov, Adjust
Shipping Data from Postgres to Clickhouse, by Murat Kabilov, AdjustShipping Data from Postgres to Clickhouse, by Murat Kabilov, Adjust
Shipping Data from Postgres to Clickhouse, by Murat Kabilov, Adjust
 
Linux Networking Explained
Linux Networking ExplainedLinux Networking Explained
Linux Networking Explained
 
Airflow presentation
Airflow presentationAirflow presentation
Airflow presentation
 
(BDT318) How Netflix Handles Up To 8 Million Events Per Second
(BDT318) How Netflix Handles Up To 8 Million Events Per Second(BDT318) How Netflix Handles Up To 8 Million Events Per Second
(BDT318) How Netflix Handles Up To 8 Million Events Per Second
 
Apache airflow
Apache airflowApache airflow
Apache airflow
 
High Availability PostgreSQL with Zalando Patroni
High Availability PostgreSQL with Zalando PatroniHigh Availability PostgreSQL with Zalando Patroni
High Availability PostgreSQL with Zalando Patroni
 
ClickHouse in Real Life. Case Studies and Best Practices, by Alexander Zaitsev
ClickHouse in Real Life. Case Studies and Best Practices, by Alexander ZaitsevClickHouse in Real Life. Case Studies and Best Practices, by Alexander Zaitsev
ClickHouse in Real Life. Case Studies and Best Practices, by Alexander Zaitsev
 
SeaweedFS introduction
SeaweedFS introductionSeaweedFS introduction
SeaweedFS introduction
 
Federated Engine 실무적용사례
Federated Engine 실무적용사례Federated Engine 실무적용사례
Federated Engine 실무적용사례
 
Deep dive into Kubernetes Networking
Deep dive into Kubernetes NetworkingDeep dive into Kubernetes Networking
Deep dive into Kubernetes Networking
 
Scaling Microservices with Kubernetes
Scaling Microservices with KubernetesScaling Microservices with Kubernetes
Scaling Microservices with Kubernetes
 
Unified Stream and Batch Processing with Apache Flink
Unified Stream and Batch Processing with Apache FlinkUnified Stream and Batch Processing with Apache Flink
Unified Stream and Batch Processing with Apache Flink
 
Handle Large Messages In Apache Kafka
Handle Large Messages In Apache KafkaHandle Large Messages In Apache Kafka
Handle Large Messages In Apache Kafka
 
Real-time Analytics with Upsert Using Apache Kafka and Apache Pinot | Yupeng ...
Real-time Analytics with Upsert Using Apache Kafka and Apache Pinot | Yupeng ...Real-time Analytics with Upsert Using Apache Kafka and Apache Pinot | Yupeng ...
Real-time Analytics with Upsert Using Apache Kafka and Apache Pinot | Yupeng ...
 
Implementing Microservices with NATS
Implementing Microservices with NATSImplementing Microservices with NATS
Implementing Microservices with NATS
 

Similaire à Fluentd and Distributed Logging at Kubecon

Intro to Telegraf
Intro to TelegrafIntro to Telegraf
Intro to TelegrafInfluxData
 
Distributed Logging Architecture in Container Era
Distributed Logging Architecture in Container EraDistributed Logging Architecture in Container Era
Distributed Logging Architecture in Container EraSATOSHI TAGOMORI
 
Distributed Logging Architecture in the Container Era
Distributed Logging Architecture in the Container EraDistributed Logging Architecture in the Container Era
Distributed Logging Architecture in the Container EraGlenn Davis
 
Flink Forward SF 2017: Srikanth Satya & Tom Kaitchuck - Pravega: Storage Rei...
Flink Forward SF 2017: Srikanth Satya & Tom Kaitchuck -  Pravega: Storage Rei...Flink Forward SF 2017: Srikanth Satya & Tom Kaitchuck -  Pravega: Storage Rei...
Flink Forward SF 2017: Srikanth Satya & Tom Kaitchuck - Pravega: Storage Rei...Flink Forward
 
Building an Observability Platform in 389 Difficult Steps
Building an Observability Platform in 389 Difficult StepsBuilding an Observability Platform in 389 Difficult Steps
Building an Observability Platform in 389 Difficult StepsDigitalOcean
 
Azure Event Hubs - Behind the Scenes With Kasun Indrasiri | Current 2022
Azure Event Hubs - Behind the Scenes With Kasun Indrasiri | Current 2022Azure Event Hubs - Behind the Scenes With Kasun Indrasiri | Current 2022
Azure Event Hubs - Behind the Scenes With Kasun Indrasiri | Current 2022HostedbyConfluent
 
Intro to Apache Apex - Next Gen Platform for Ingest and Transform
Intro to Apache Apex - Next Gen Platform for Ingest and TransformIntro to Apache Apex - Next Gen Platform for Ingest and Transform
Intro to Apache Apex - Next Gen Platform for Ingest and TransformApache Apex
 
Apache Big Data 2016: Next Gen Big Data Analytics with Apache Apex
Apache Big Data 2016: Next Gen Big Data Analytics with Apache ApexApache Big Data 2016: Next Gen Big Data Analytics with Apache Apex
Apache Big Data 2016: Next Gen Big Data Analytics with Apache ApexApache Apex
 
Hadoop Summit SJ 2016: Next Gen Big Data Analytics with Apache Apex
Hadoop Summit SJ 2016: Next Gen Big Data Analytics with Apache ApexHadoop Summit SJ 2016: Next Gen Big Data Analytics with Apache Apex
Hadoop Summit SJ 2016: Next Gen Big Data Analytics with Apache ApexApache Apex
 
Using AWS to Build a Scalable Big Data Management & Processing Service (BDT40...
Using AWS to Build a Scalable Big Data Management & Processing Service (BDT40...Using AWS to Build a Scalable Big Data Management & Processing Service (BDT40...
Using AWS to Build a Scalable Big Data Management & Processing Service (BDT40...Amazon Web Services
 
Introduction to Apache Apex by Thomas Weise
Introduction to Apache Apex by Thomas WeiseIntroduction to Apache Apex by Thomas Weise
Introduction to Apache Apex by Thomas WeiseBig Data Spain
 
Logging for Production Systems in The Container Era
Logging for Production Systems in The Container EraLogging for Production Systems in The Container Era
Logging for Production Systems in The Container EraSadayuki Furuhashi
 
Apache Big Data EU 2016: Next Gen Big Data Analytics with Apache Apex
Apache Big Data EU 2016: Next Gen Big Data Analytics with Apache ApexApache Big Data EU 2016: Next Gen Big Data Analytics with Apache Apex
Apache Big Data EU 2016: Next Gen Big Data Analytics with Apache ApexApache Apex
 
A Unified Platform for Real-time Storage and Processing
A Unified Platform for Real-time Storage and ProcessingA Unified Platform for Real-time Storage and Processing
A Unified Platform for Real-time Storage and ProcessingStreamNative
 
Architectures, Frameworks and Infrastructure
Architectures, Frameworks and InfrastructureArchitectures, Frameworks and Infrastructure
Architectures, Frameworks and Infrastructureharendra_pathak
 
Intro to Apache Apex @ Women in Big Data
Intro to Apache Apex @ Women in Big DataIntro to Apache Apex @ Women in Big Data
Intro to Apache Apex @ Women in Big DataApache Apex
 
Aerospike Hybrid Memory Architecture
Aerospike Hybrid Memory ArchitectureAerospike Hybrid Memory Architecture
Aerospike Hybrid Memory ArchitectureAerospike, Inc.
 
From the Trenches: Effectively Scaling Your Cloud Infrastructure and Optimizi...
From the Trenches: Effectively Scaling Your Cloud Infrastructure and Optimizi...From the Trenches: Effectively Scaling Your Cloud Infrastructure and Optimizi...
From the Trenches: Effectively Scaling Your Cloud Infrastructure and Optimizi...Allan Mangune
 

Similaire à Fluentd and Distributed Logging at Kubecon (20)

Intro to Telegraf
Intro to TelegrafIntro to Telegraf
Intro to Telegraf
 
Distributed Logging Architecture in Container Era
Distributed Logging Architecture in Container EraDistributed Logging Architecture in Container Era
Distributed Logging Architecture in Container Era
 
Distributed Logging Architecture in the Container Era
Distributed Logging Architecture in the Container EraDistributed Logging Architecture in the Container Era
Distributed Logging Architecture in the Container Era
 
Flink Forward SF 2017: Srikanth Satya & Tom Kaitchuck - Pravega: Storage Rei...
Flink Forward SF 2017: Srikanth Satya & Tom Kaitchuck -  Pravega: Storage Rei...Flink Forward SF 2017: Srikanth Satya & Tom Kaitchuck -  Pravega: Storage Rei...
Flink Forward SF 2017: Srikanth Satya & Tom Kaitchuck - Pravega: Storage Rei...
 
Building an Observability Platform in 389 Difficult Steps
Building an Observability Platform in 389 Difficult StepsBuilding an Observability Platform in 389 Difficult Steps
Building an Observability Platform in 389 Difficult Steps
 
Azure Event Hubs - Behind the Scenes With Kasun Indrasiri | Current 2022
Azure Event Hubs - Behind the Scenes With Kasun Indrasiri | Current 2022Azure Event Hubs - Behind the Scenes With Kasun Indrasiri | Current 2022
Azure Event Hubs - Behind the Scenes With Kasun Indrasiri | Current 2022
 
Intro to Apache Apex - Next Gen Platform for Ingest and Transform
Intro to Apache Apex - Next Gen Platform for Ingest and TransformIntro to Apache Apex - Next Gen Platform for Ingest and Transform
Intro to Apache Apex - Next Gen Platform for Ingest and Transform
 
Apache Big Data 2016: Next Gen Big Data Analytics with Apache Apex
Apache Big Data 2016: Next Gen Big Data Analytics with Apache ApexApache Big Data 2016: Next Gen Big Data Analytics with Apache Apex
Apache Big Data 2016: Next Gen Big Data Analytics with Apache Apex
 
Hadoop Summit SJ 2016: Next Gen Big Data Analytics with Apache Apex
Hadoop Summit SJ 2016: Next Gen Big Data Analytics with Apache ApexHadoop Summit SJ 2016: Next Gen Big Data Analytics with Apache Apex
Hadoop Summit SJ 2016: Next Gen Big Data Analytics with Apache Apex
 
Using AWS to Build a Scalable Big Data Management & Processing Service (BDT40...
Using AWS to Build a Scalable Big Data Management & Processing Service (BDT40...Using AWS to Build a Scalable Big Data Management & Processing Service (BDT40...
Using AWS to Build a Scalable Big Data Management & Processing Service (BDT40...
 
Next Gen Big Data Analytics with Apache Apex
Next Gen Big Data Analytics with Apache Apex Next Gen Big Data Analytics with Apache Apex
Next Gen Big Data Analytics with Apache Apex
 
Introduction to Apache Apex by Thomas Weise
Introduction to Apache Apex by Thomas WeiseIntroduction to Apache Apex by Thomas Weise
Introduction to Apache Apex by Thomas Weise
 
Logging for Production Systems in The Container Era
Logging for Production Systems in The Container EraLogging for Production Systems in The Container Era
Logging for Production Systems in The Container Era
 
Apache Big Data EU 2016: Next Gen Big Data Analytics with Apache Apex
Apache Big Data EU 2016: Next Gen Big Data Analytics with Apache ApexApache Big Data EU 2016: Next Gen Big Data Analytics with Apache Apex
Apache Big Data EU 2016: Next Gen Big Data Analytics with Apache Apex
 
A Unified Platform for Real-time Storage and Processing
A Unified Platform for Real-time Storage and ProcessingA Unified Platform for Real-time Storage and Processing
A Unified Platform for Real-time Storage and Processing
 
Architectures, Frameworks and Infrastructure
Architectures, Frameworks and InfrastructureArchitectures, Frameworks and Infrastructure
Architectures, Frameworks and Infrastructure
 
Intro to Apache Apex @ Women in Big Data
Intro to Apache Apex @ Women in Big DataIntro to Apache Apex @ Women in Big Data
Intro to Apache Apex @ Women in Big Data
 
Cloud Migration
Cloud MigrationCloud Migration
Cloud Migration
 
Aerospike Hybrid Memory Architecture
Aerospike Hybrid Memory ArchitectureAerospike Hybrid Memory Architecture
Aerospike Hybrid Memory Architecture
 
From the Trenches: Effectively Scaling Your Cloud Infrastructure and Optimizi...
From the Trenches: Effectively Scaling Your Cloud Infrastructure and Optimizi...From the Trenches: Effectively Scaling Your Cloud Infrastructure and Optimizi...
From the Trenches: Effectively Scaling Your Cloud Infrastructure and Optimizi...
 

Plus de N Masahiro

Fluentd Project Intro at Kubecon 2019 EU
Fluentd Project Intro at Kubecon 2019 EUFluentd Project Intro at Kubecon 2019 EU
Fluentd Project Intro at Kubecon 2019 EUN Masahiro
 
Fluentd v1 and future at techtalk
Fluentd v1 and future at techtalkFluentd v1 and future at techtalk
Fluentd v1 and future at techtalkN Masahiro
 
Fluentd v1.0 in a nutshell
Fluentd v1.0 in a nutshellFluentd v1.0 in a nutshell
Fluentd v1.0 in a nutshellN Masahiro
 
Presto changes
Presto changesPresto changes
Presto changesN Masahiro
 
Fluentd at HKOScon
Fluentd at HKOSconFluentd at HKOScon
Fluentd at HKOSconN Masahiro
 
Fluentd v0.14 Overview
Fluentd v0.14 OverviewFluentd v0.14 Overview
Fluentd v0.14 OverviewN Masahiro
 
Fluentd and Kafka
Fluentd and KafkaFluentd and Kafka
Fluentd and KafkaN Masahiro
 
fluent-plugin-beats at Elasticsearch meetup #14
fluent-plugin-beats at Elasticsearch meetup #14fluent-plugin-beats at Elasticsearch meetup #14
fluent-plugin-beats at Elasticsearch meetup #14N Masahiro
 
Technologies for Data Analytics Platform
Technologies for Data Analytics PlatformTechnologies for Data Analytics Platform
Technologies for Data Analytics PlatformN Masahiro
 
Docker and Fluentd
Docker and FluentdDocker and Fluentd
Docker and FluentdN Masahiro
 
How to create Treasure Data #dotsbigdata
How to create Treasure Data #dotsbigdataHow to create Treasure Data #dotsbigdata
How to create Treasure Data #dotsbigdataN Masahiro
 
Fluentd v0.12 master guide
Fluentd v0.12 master guideFluentd v0.12 master guide
Fluentd v0.12 master guideN Masahiro
 
Fluentd and Embulk Game Server 4
Fluentd and Embulk Game Server 4Fluentd and Embulk Game Server 4
Fluentd and Embulk Game Server 4N Masahiro
 
Treasure Data and AWS - Developers.io 2015
Treasure Data and AWS - Developers.io 2015Treasure Data and AWS - Developers.io 2015
Treasure Data and AWS - Developers.io 2015N Masahiro
 
Fluentd Unified Logging Layer At Fossasia
Fluentd Unified Logging Layer At FossasiaFluentd Unified Logging Layer At Fossasia
Fluentd Unified Logging Layer At FossasiaN Masahiro
 
Treasure Data and OSS
Treasure Data and OSSTreasure Data and OSS
Treasure Data and OSSN Masahiro
 
Fluentd - RubyKansai 65
Fluentd - RubyKansai 65Fluentd - RubyKansai 65
Fluentd - RubyKansai 65N Masahiro
 
Fluentd - road to v1 -
Fluentd - road to v1 -Fluentd - road to v1 -
Fluentd - road to v1 -N Masahiro
 
Fluentd: Unified Logging Layer at CWT2014
Fluentd: Unified Logging Layer at CWT2014Fluentd: Unified Logging Layer at CWT2014
Fluentd: Unified Logging Layer at CWT2014N Masahiro
 
SQL for Everything at CWT2014
SQL for Everything at CWT2014SQL for Everything at CWT2014
SQL for Everything at CWT2014N Masahiro
 

Plus de N Masahiro (20)

Fluentd Project Intro at Kubecon 2019 EU
Fluentd Project Intro at Kubecon 2019 EUFluentd Project Intro at Kubecon 2019 EU
Fluentd Project Intro at Kubecon 2019 EU
 
Fluentd v1 and future at techtalk
Fluentd v1 and future at techtalkFluentd v1 and future at techtalk
Fluentd v1 and future at techtalk
 
Fluentd v1.0 in a nutshell
Fluentd v1.0 in a nutshellFluentd v1.0 in a nutshell
Fluentd v1.0 in a nutshell
 
Presto changes
Presto changesPresto changes
Presto changes
 
Fluentd at HKOScon
Fluentd at HKOSconFluentd at HKOScon
Fluentd at HKOScon
 
Fluentd v0.14 Overview
Fluentd v0.14 OverviewFluentd v0.14 Overview
Fluentd v0.14 Overview
 
Fluentd and Kafka
Fluentd and KafkaFluentd and Kafka
Fluentd and Kafka
 
fluent-plugin-beats at Elasticsearch meetup #14
fluent-plugin-beats at Elasticsearch meetup #14fluent-plugin-beats at Elasticsearch meetup #14
fluent-plugin-beats at Elasticsearch meetup #14
 
Technologies for Data Analytics Platform
Technologies for Data Analytics PlatformTechnologies for Data Analytics Platform
Technologies for Data Analytics Platform
 
Docker and Fluentd
Docker and FluentdDocker and Fluentd
Docker and Fluentd
 
How to create Treasure Data #dotsbigdata
How to create Treasure Data #dotsbigdataHow to create Treasure Data #dotsbigdata
How to create Treasure Data #dotsbigdata
 
Fluentd v0.12 master guide
Fluentd v0.12 master guideFluentd v0.12 master guide
Fluentd v0.12 master guide
 
Fluentd and Embulk Game Server 4
Fluentd and Embulk Game Server 4Fluentd and Embulk Game Server 4
Fluentd and Embulk Game Server 4
 
Treasure Data and AWS - Developers.io 2015
Treasure Data and AWS - Developers.io 2015Treasure Data and AWS - Developers.io 2015
Treasure Data and AWS - Developers.io 2015
 
Fluentd Unified Logging Layer At Fossasia
Fluentd Unified Logging Layer At FossasiaFluentd Unified Logging Layer At Fossasia
Fluentd Unified Logging Layer At Fossasia
 
Treasure Data and OSS
Treasure Data and OSSTreasure Data and OSS
Treasure Data and OSS
 
Fluentd - RubyKansai 65
Fluentd - RubyKansai 65Fluentd - RubyKansai 65
Fluentd - RubyKansai 65
 
Fluentd - road to v1 -
Fluentd - road to v1 -Fluentd - road to v1 -
Fluentd - road to v1 -
 
Fluentd: Unified Logging Layer at CWT2014
Fluentd: Unified Logging Layer at CWT2014Fluentd: Unified Logging Layer at CWT2014
Fluentd: Unified Logging Layer at CWT2014
 
SQL for Everything at CWT2014
SQL for Everything at CWT2014SQL for Everything at CWT2014
SQL for Everything at CWT2014
 

Dernier

Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 

Dernier (20)

Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 

Fluentd and Distributed Logging at Kubecon

  • 1. Fluentd and Distributed Logging Masahiro Nakagawa
 Senior Software Engineer    CNCon / KubeCon at North America
  • 3. Logging on production • Service Logs • Web access logs • Ad logs • Transcation logs (Game, EC, etc…) • System Logs • Syslog, systemd and other logs • Audit logs • Metrics (CPU, memory, etc…) Logs for Bussiness Logs for Service KPI Machine Learning … System monitoring Root cause check … Distributed tracing
  • 4. The Container Era Server Era Container Era Service Architecture Monolithic Microservices System Image Mutable Immutable Local Data Persistent Ephemeral Network Physical addresses No fixed addresses Log Collection syslogd / rsync ?
  • 5. • No permanent storage
 
 • No fixed physical addresses
 
 • No fixed mappings between servers and roles
 
 • Lots of application types Logging challenges with Containers Transfer logs to anywhere ASAP Push logs from containers Label logs with Service/Tags Need to handle various logs
  • 7. Simple core
 + Variety of plugins Buffering, HA (failover), Secondary output, etc. Like syslogd in streaming manner AN EXTENSIBLE & RELIABLE DATA COLLECTION TOOL What’s Fluentd?
  • 8. Streaming way with Fluentd Log Server Application Server A File FileFile Application Server C File FileFile Application Server B File FileFile Low latency! Seconds or minutes Easy to analyze!! Parsed and formatted
  • 9. M x N problem for data integration LOG script to parse data cron job for loading filtering script syslog script Tweet- fetching script aggregation script aggregation script script to parse data rsync server
  • 10. LOG A solution: unified logging layer M + N
  • 12. Internal Architecture (simplified) Plugin Input Filter Buffer Output Plugin Plugin Plugin 2017-12-06 15:15:15 myapp.buy Time Tag Record { “user”:”me”, “path”: “/buyItem”, “price”: 150, “referer”: “/landing”
 }
  • 14. Divide & Conquer for recovery Buffer (on-disk or in-memory) Error Overloaded!! recovery recovery + flow control queued chunks
  • 15. 3rd party input plugins dstat df AMQL munin SQL
  • 16. 3rd party output plugins Graphite
  • 18. Architecture Source (Container + Agent) Transferring / Aggregation Layer Destination (Storage / Database / Service)
  • 19. Logging Workflow Source Aggregator Destination • Retrieve logs: File / Network / API … • Parse payload for structured logging • Get logs from multiple sources • Split/Merge logs into streams • Receive logs from Aggreagtors • Store formatted logs
  • 20. How to collect logs from containers
 using Fluentd in source layer?
  • 21. Text logging with --log-driver=fluentd Server Container App FluentdSTDOUT / STDERR docker run --log-driver=fluentd 
 --log-opt fluentd-address=localhost:24224 { “container_id”: “ad6d5d32576a”, “container_name”: “myapp”, “source”: stdout }
  • 22. Metrics collection with fluent-logger Server Container App Fluentd from fluent import sender from fluent import event sender.setup('app.events', host='localhost') event.Event('purchase', { 'user_id': 21, 'item_id': 321, 'value': '1' }) tag = app.events.purchase { “user_id”: 21, “item_id”: 321 “value”: 1, } fluent-logger library
  • 23. Shared data volume and tailing Server Container App Fluentd <source> @type tail path /mnt/*/access.log pos_file /var/log/fluentd/access.log.pos <format> @type nginx </format> tag nginx.access </source> /mnt/nginx/logs
  • 24. Logging methods for each purpose • Collecting log messages • --log-driver=fluentd • Application metrics • fluent-logger • Access logs, logs from middleware • Shared data volume • System metrics (CPU usage, Disk capacity, etc.) • Fluentd’s input plugins (Fluentd pulls data periodically) • Prometheus or other monitoring agent
  • 26. Server 1 Container A Application Container B Application Server 2 Container C Application Container D Application Kafka elasticsearch HDFS Container Container Container Container Primitive deployment… Too many connections from many containers! Embedding destination IPsin ALL Docker images
 makes management hard
  • 27. Server 1 Container A Application Container B Application Fluentd Server 2 Container C Application Container D Application Fluentd Kafka elasticsearch HDFS Container Container Container Container Destination is always localhost from app’s point of view Source aggregation decouples config from apps
  • 28. Server 1 Container A Application Container B Application Fluentd Server 2 Container C Application Container D Application Fluentd active / standby / load balancing Destination aggregation makes storages scalable for high traffic Aggregation server(s)
  • 29. Aggregation servers • Logging directly from microservices makes log storages overloaded. • Too many connections • Too frequent import API calls • Aggregation servers make the logging infrastracture more reliable and scalable. • Connection aggregation • Buffering for less frequent import API calls • Data persistency during downtime • Automatic retry at recovery from downtime
  • 31. Should use these patterns? • Source-side aggregation: Yes • Fluentd frees logging pain from applications • Buffering, Retry, HA, etc… • Application don’t need to care destination changes • Destination-side aggregation: It depends • good for high traffic • maybe, no need for cloud logging services • may need for self-hosted distributed systems or
 cloud services which charges per request
  • 32. Scalable Distributed Logging • Network • Split heavy traffic into traffics to nodes • Merge connections • CPU / Memory • Distribute processing to nodes about heavy processing • High Availability • Switch / fallback from a node to another for failure • Agility • Avoid reconfiguring whole logging layer to modify destinations
  • 33. Fluentd ♡ Container • Fluentd model fits container based systems • Pluggable and Robust pipelines • Support typical deployment patterns • Smart CNCF products for scalable system • k8s: Container orchestration • Prometheus: Monitoring • Fluentd: Logging • JAEGER: Distributed Tracing • etc… Let’s make scalable and stable system!