SlideShare une entreprise Scribd logo
1  sur  19
Apache Flume: Data Collection System for 
HADOOP
Outline 
Overview of Flume 
Flume Sources 
Channels & Sinks 
Flume Topology 
Production Architecture 
Monitoring & Performance
Overview of Flume 
Collection, Aggregation of streaming Event Data 
Typically used for log data 
Significant advantages over ad-hoc solutions 
Reliable, Scalable, Manageable, Customizable 
and High Performance 
Declarative, Dynamic Configuration 
Contextual Routing 
Feature rich 
Fully extensible
Core Concepts: Event 
An Event is the fundamental unit of data transported by 
Flume from its point of origination to its final destination. 
Event is a byte array payload accompanied by optional 
headers. 
Payload is opaque to Flume 
Headers are specified as an unordered collection of string key-value 
pairs, with keys being unique across the collection 
Headers can be used for contextual routing
Core Concepts: Client 
An entity that generates events and sends them to 
one or more Agents. 
Example 
Flume log4j Appender 
Custom Client using Client SDK (org.apache.flume.api) 
Decouples Flume from the system where event data is consumed from 
Not needed in all cases
Core Concepts: Agent 
A container for hosting Sources, Channels, Sinks and other 
components that enable the transportation of events from one 
place to another. 
Fundamental part of a Flume flow 
Provides Configuration, Life-Cycle Management, and Monitoring 
Support for hosted components
Typical Aggregation Flow 
[Client]+  Agent [ Agent]*  Destination
Core Concepts: Source 
An active component that receives events from a 
specialized location or mechanism and places it on one 
or Channels. 
Different Source types: 
Specialized sources for integrating with well-known 
systems. Example: Syslog, Netcat 
Auto-Generating Sources: Exec, SEQ 
IPC sources for Agent-to-Agent communication: Avro 
Require at least one channel to function
Source 
Reads data from the source system and passes onto the next hop or to the 
final destination. 
Flume Sources: 
Avro Source 
Exec Source 
JMS Source 
Spooling Directory Source
Core Concepts: Channel 
A passive component that buffers the incoming 
events until they are drained by Sinks. 
Different Channels offer different levels of persistence: 
Memory Channel: volatile 
File Channel: backed by WAL implementation 
JDBC Channel: backed by embedded Database 
Channels are fully transactional 
Provide weak ordering 
guarantees 
Can work with any number of Sources and Sinks.
Core Concepts: Sink 
An active component that removes events from 
a Channel and transmits them to their next hop 
destination. 
Different types of Sinks: 
Terminal sinks that deposit events to their final 
destination. For example: HDFS, HBase 
IPC sink for Agent-to-Agent communication: Avro 
Require exactly one channel to function
Sinks 
Writes data to the next hop or to the final destination. 
Flume Sinks: 
Avro Sink 
HDFS Sink 
HBASE Sink 
File Sink 
Null Sink 
Logger Sink
What is the source in Flume
Fanout
Flume Channels 
Memory Channel 
Recommended if data loss due to crashes are 
ok 
File Channel 
Recommended channel. 
JDBC Channel 
Persistent store of data but introduces 
bottleneck and single point of failure.
Memory Channel 
Events stored on heap 
Limited capacity 
No persistence after a system/process crash 
Very fast 
3 config parameters: 
capacity: Maximum # of events that can be in the channel 
transactionCapacity: Maximum # of events in one txn. 
keepAlive: how long to wait to put/take an event
File Channel
Current Flume Flow
Monitoring: protocol support 
Several monitoring protocols supported out of the box 
JMX 
Ganglia 
HTTP (JSON) 
Java opts must be set in flume-env.sh to configure monitoring 
Ganglia and HTTP monitoring are mutually exclusive

Contenu connexe

Tendances

Chicago Hadoop User Group (CHUG) Presentation on Apache Flume - April 9, 2014
Chicago Hadoop User Group (CHUG) Presentation on Apache Flume - April 9, 2014Chicago Hadoop User Group (CHUG) Presentation on Apache Flume - April 9, 2014
Chicago Hadoop User Group (CHUG) Presentation on Apache Flume - April 9, 2014Steve Hoffman
 
Apache Flume - DataDayTexas
Apache Flume - DataDayTexasApache Flume - DataDayTexas
Apache Flume - DataDayTexasArvind Prabhakar
 
Centralized logging with Flume
Centralized logging with FlumeCentralized logging with Flume
Centralized logging with FlumeRatnakar Pawar
 
Flume and Hadoop performance insights
Flume and Hadoop performance insightsFlume and Hadoop performance insights
Flume and Hadoop performance insightsOmid Vahdaty
 
Extracting twitter data using apache flume
Extracting twitter data using apache flumeExtracting twitter data using apache flume
Extracting twitter data using apache flumeBharat Khanna
 
Apache Flume - Streaming data easily to Hadoop from any source for Telco oper...
Apache Flume - Streaming data easily to Hadoop from any source for Telco oper...Apache Flume - Streaming data easily to Hadoop from any source for Telco oper...
Apache Flume - Streaming data easily to Hadoop from any source for Telco oper...DataWorks Summit
 
Apache flume by Swapnil Dubey
Apache flume by Swapnil DubeyApache flume by Swapnil Dubey
Apache flume by Swapnil DubeySwapnil Dubey
 
Apache Flume and its use case in Manufacturing
Apache Flume and its use case in ManufacturingApache Flume and its use case in Manufacturing
Apache Flume and its use case in ManufacturingRapheephan Thongkham-Uan
 
Flume @ Austin HUG 2/17/11
Flume @ Austin HUG 2/17/11Flume @ Austin HUG 2/17/11
Flume @ Austin HUG 2/17/11Cloudera, Inc.
 
ApacheCon-Flume-Kafka-2016
ApacheCon-Flume-Kafka-2016ApacheCon-Flume-Kafka-2016
ApacheCon-Flume-Kafka-2016Jayesh Thakrar
 
Apache Flume
Apache FlumeApache Flume
Apache FlumeGetInData
 
Filesystems, RPC and HDFS
Filesystems, RPC and HDFSFilesystems, RPC and HDFS
Filesystems, RPC and HDFSAlexander Alten
 
Introduction to streaming and messaging flume,kafka,SQS,kinesis
Introduction to streaming and messaging  flume,kafka,SQS,kinesis Introduction to streaming and messaging  flume,kafka,SQS,kinesis
Introduction to streaming and messaging flume,kafka,SQS,kinesis Omid Vahdaty
 
Flume with Twitter Integration
Flume with Twitter IntegrationFlume with Twitter Integration
Flume with Twitter IntegrationRockyCIce
 
Data Aggregation At Scale Using Apache Flume
Data Aggregation At Scale Using Apache FlumeData Aggregation At Scale Using Apache Flume
Data Aggregation At Scale Using Apache FlumeArvind Prabhakar
 
Large scale near real-time log indexing with Flume and SolrCloud
Large scale near real-time log indexing with Flume and SolrCloudLarge scale near real-time log indexing with Flume and SolrCloud
Large scale near real-time log indexing with Flume and SolrCloudDataWorks Summit
 
Session 23 - Kafka and Zookeeper
Session 23 - Kafka and ZookeeperSession 23 - Kafka and Zookeeper
Session 23 - Kafka and ZookeeperAnandMHadoop
 
Big data: Loading your data with flume and sqoop
Big data:  Loading your data with flume and sqoopBig data:  Loading your data with flume and sqoop
Big data: Loading your data with flume and sqoopChristophe Marchal
 

Tendances (20)

Chicago Hadoop User Group (CHUG) Presentation on Apache Flume - April 9, 2014
Chicago Hadoop User Group (CHUG) Presentation on Apache Flume - April 9, 2014Chicago Hadoop User Group (CHUG) Presentation on Apache Flume - April 9, 2014
Chicago Hadoop User Group (CHUG) Presentation on Apache Flume - April 9, 2014
 
Apache Flume - DataDayTexas
Apache Flume - DataDayTexasApache Flume - DataDayTexas
Apache Flume - DataDayTexas
 
Centralized logging with Flume
Centralized logging with FlumeCentralized logging with Flume
Centralized logging with Flume
 
Flume and Hadoop performance insights
Flume and Hadoop performance insightsFlume and Hadoop performance insights
Flume and Hadoop performance insights
 
Extracting twitter data using apache flume
Extracting twitter data using apache flumeExtracting twitter data using apache flume
Extracting twitter data using apache flume
 
Cloudera's Flume
Cloudera's FlumeCloudera's Flume
Cloudera's Flume
 
Apache Flume - Streaming data easily to Hadoop from any source for Telco oper...
Apache Flume - Streaming data easily to Hadoop from any source for Telco oper...Apache Flume - Streaming data easily to Hadoop from any source for Telco oper...
Apache Flume - Streaming data easily to Hadoop from any source for Telco oper...
 
Apache flume by Swapnil Dubey
Apache flume by Swapnil DubeyApache flume by Swapnil Dubey
Apache flume by Swapnil Dubey
 
Apache Flume and its use case in Manufacturing
Apache Flume and its use case in ManufacturingApache Flume and its use case in Manufacturing
Apache Flume and its use case in Manufacturing
 
Flume @ Austin HUG 2/17/11
Flume @ Austin HUG 2/17/11Flume @ Austin HUG 2/17/11
Flume @ Austin HUG 2/17/11
 
ApacheCon-Flume-Kafka-2016
ApacheCon-Flume-Kafka-2016ApacheCon-Flume-Kafka-2016
ApacheCon-Flume-Kafka-2016
 
Flume intro-100715
Flume intro-100715Flume intro-100715
Flume intro-100715
 
Apache Flume
Apache FlumeApache Flume
Apache Flume
 
Filesystems, RPC and HDFS
Filesystems, RPC and HDFSFilesystems, RPC and HDFS
Filesystems, RPC and HDFS
 
Introduction to streaming and messaging flume,kafka,SQS,kinesis
Introduction to streaming and messaging  flume,kafka,SQS,kinesis Introduction to streaming and messaging  flume,kafka,SQS,kinesis
Introduction to streaming and messaging flume,kafka,SQS,kinesis
 
Flume with Twitter Integration
Flume with Twitter IntegrationFlume with Twitter Integration
Flume with Twitter Integration
 
Data Aggregation At Scale Using Apache Flume
Data Aggregation At Scale Using Apache FlumeData Aggregation At Scale Using Apache Flume
Data Aggregation At Scale Using Apache Flume
 
Large scale near real-time log indexing with Flume and SolrCloud
Large scale near real-time log indexing with Flume and SolrCloudLarge scale near real-time log indexing with Flume and SolrCloud
Large scale near real-time log indexing with Flume and SolrCloud
 
Session 23 - Kafka and Zookeeper
Session 23 - Kafka and ZookeeperSession 23 - Kafka and Zookeeper
Session 23 - Kafka and Zookeeper
 
Big data: Loading your data with flume and sqoop
Big data:  Loading your data with flume and sqoopBig data:  Loading your data with flume and sqoop
Big data: Loading your data with flume and sqoop
 

En vedette

Secrets St.James/Wild Orchid Pictures
Secrets St.James/Wild Orchid PicturesSecrets St.James/Wild Orchid Pictures
Secrets St.James/Wild Orchid Pictureschglat
 
Prsentation
PrsentationPrsentation
Prsentationdarja18
 
Secrets The Vine
Secrets The VineSecrets The Vine
Secrets The Vinechglat
 
Amanda Hawaii
Amanda HawaiiAmanda Hawaii
Amanda Hawaiichglat
 
Kidology Experience - Resume
Kidology Experience - ResumeKidology Experience - Resume
Kidology Experience - ResumeBrandon Maddux
 
All American Marketing Experience - Resume
All American Marketing Experience - ResumeAll American Marketing Experience - Resume
All American Marketing Experience - ResumeBrandon Maddux
 
Jill Cozumel
Jill CozumelJill Cozumel
Jill Cozumelchglat
 
Wedding Options
Wedding OptionsWedding Options
Wedding Optionschglat
 
Kelley Antigua
Kelley AntiguaKelley Antigua
Kelley Antiguachglat
 
I........................you
I........................youI........................you
I........................youTanhatairn
 
Kriminologia
KriminologiaKriminologia
KriminologiaINA33
 
Hard Rock Riviera Maya
Hard Rock Riviera MayaHard Rock Riviera Maya
Hard Rock Riviera Mayachglat
 
Paulina Jamaica Options
Paulina Jamaica OptionsPaulina Jamaica Options
Paulina Jamaica Optionschglat
 
Pam Curacao
Pam CuracaoPam Curacao
Pam Curacaochglat
 
Vanessasaggiorogagliazzo.let'stalkaboutenglish
Vanessasaggiorogagliazzo.let'stalkaboutenglishVanessasaggiorogagliazzo.let'stalkaboutenglish
Vanessasaggiorogagliazzo.let'stalkaboutenglishvanessasagli
 
Secrets Akumal
Secrets AkumalSecrets Akumal
Secrets Akumalchglat
 
Liz Puerto Vallarta
Liz Puerto VallartaLiz Puerto Vallarta
Liz Puerto Vallartachglat
 

En vedette (20)

AdminCMS
AdminCMSAdminCMS
AdminCMS
 
Secrets St.James/Wild Orchid Pictures
Secrets St.James/Wild Orchid PicturesSecrets St.James/Wild Orchid Pictures
Secrets St.James/Wild Orchid Pictures
 
Prsentation
PrsentationPrsentation
Prsentation
 
Secrets The Vine
Secrets The VineSecrets The Vine
Secrets The Vine
 
Jana
Jana Jana
Jana
 
Amanda Hawaii
Amanda HawaiiAmanda Hawaii
Amanda Hawaii
 
Kidology Experience - Resume
Kidology Experience - ResumeKidology Experience - Resume
Kidology Experience - Resume
 
All American Marketing Experience - Resume
All American Marketing Experience - ResumeAll American Marketing Experience - Resume
All American Marketing Experience - Resume
 
Jill Cozumel
Jill CozumelJill Cozumel
Jill Cozumel
 
Wedding Options
Wedding OptionsWedding Options
Wedding Options
 
Kelley Antigua
Kelley AntiguaKelley Antigua
Kelley Antigua
 
I........................you
I........................youI........................you
I........................you
 
Kriminologia
KriminologiaKriminologia
Kriminologia
 
Hard Rock Riviera Maya
Hard Rock Riviera MayaHard Rock Riviera Maya
Hard Rock Riviera Maya
 
Media pitch
Media pitchMedia pitch
Media pitch
 
Paulina Jamaica Options
Paulina Jamaica OptionsPaulina Jamaica Options
Paulina Jamaica Options
 
Pam Curacao
Pam CuracaoPam Curacao
Pam Curacao
 
Vanessasaggiorogagliazzo.let'stalkaboutenglish
Vanessasaggiorogagliazzo.let'stalkaboutenglishVanessasaggiorogagliazzo.let'stalkaboutenglish
Vanessasaggiorogagliazzo.let'stalkaboutenglish
 
Secrets Akumal
Secrets AkumalSecrets Akumal
Secrets Akumal
 
Liz Puerto Vallarta
Liz Puerto VallartaLiz Puerto Vallarta
Liz Puerto Vallarta
 

Similaire à Apache Flume Data Collection System for Streaming Log Data

Flume lspe-110325145754-phpapp01
Flume lspe-110325145754-phpapp01Flume lspe-110325145754-phpapp01
Flume lspe-110325145754-phpapp01joahp
 
Session 09 - Flume
Session 09 - FlumeSession 09 - Flume
Session 09 - FlumeAnandMHadoop
 
Flume DS -JSP.pptx
Flume DS -JSP.pptxFlume DS -JSP.pptx
Flume DS -JSP.pptxJayesh Patil
 
Introduction to Flume
Introduction to FlumeIntroduction to Flume
Introduction to FlumeRupak Roy
 
Data persistency (draco, cygnus, sth comet, quantum leap)
Data persistency (draco, cygnus, sth comet, quantum leap)Data persistency (draco, cygnus, sth comet, quantum leap)
Data persistency (draco, cygnus, sth comet, quantum leap)Fernando Lopez Aguilar
 
Open Source Big Data Ingestion - Without the Heartburn!
Open Source Big Data Ingestion - Without the Heartburn!Open Source Big Data Ingestion - Without the Heartburn!
Open Source Big Data Ingestion - Without the Heartburn!Pat Patterson
 
Apache NiFi: latest developments for flow management at scale
Apache NiFi: latest developments for flow management at scaleApache NiFi: latest developments for flow management at scale
Apache NiFi: latest developments for flow management at scaleAbdelkrim Hadjidj
 
GOTO Night Amsterdam - Stream processing with Apache Flink
GOTO Night Amsterdam - Stream processing with Apache FlinkGOTO Night Amsterdam - Stream processing with Apache Flink
GOTO Night Amsterdam - Stream processing with Apache FlinkRobert Metzger
 
Deploying Apache Flume to enable low-latency analytics
Deploying Apache Flume to enable low-latency analyticsDeploying Apache Flume to enable low-latency analytics
Deploying Apache Flume to enable low-latency analyticsDataWorks Summit
 
FIWARE Tech Summit - FIWARE Cygnus and STH-Comet
FIWARE Tech Summit - FIWARE Cygnus and STH-CometFIWARE Tech Summit - FIWARE Cygnus and STH-Comet
FIWARE Tech Summit - FIWARE Cygnus and STH-CometFIWARE
 
QCon London - Stream Processing with Apache Flink
QCon London - Stream Processing with Apache FlinkQCon London - Stream Processing with Apache Flink
QCon London - Stream Processing with Apache FlinkRobert Metzger
 
Palo Alto Networks PAN-OS 4.0 New Features
Palo Alto Networks PAN-OS 4.0 New FeaturesPalo Alto Networks PAN-OS 4.0 New Features
Palo Alto Networks PAN-OS 4.0 New Featureslukky753
 
Large scale, distributed and reliable messaging with Kafka
Large scale, distributed and reliable messaging with KafkaLarge scale, distributed and reliable messaging with Kafka
Large scale, distributed and reliable messaging with KafkaRafał Hryniewski
 
Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)
Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)
Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)Apache Flink Taiwan User Group
 
INTERNET OF THINGS & AZURE
INTERNET OF THINGS & AZUREINTERNET OF THINGS & AZURE
INTERNET OF THINGS & AZUREDotNetCampus
 
Building Scalable Data Pipelines - 2016 DataPalooza Seattle
Building Scalable Data Pipelines - 2016 DataPalooza SeattleBuilding Scalable Data Pipelines - 2016 DataPalooza Seattle
Building Scalable Data Pipelines - 2016 DataPalooza SeattleEvan Chan
 

Similaire à Apache Flume Data Collection System for Streaming Log Data (20)

Avvo fkafka
Avvo fkafkaAvvo fkafka
Avvo fkafka
 
Flume lspe-110325145754-phpapp01
Flume lspe-110325145754-phpapp01Flume lspe-110325145754-phpapp01
Flume lspe-110325145754-phpapp01
 
Session 09 - Flume
Session 09 - FlumeSession 09 - Flume
Session 09 - Flume
 
Flume DS -JSP.pptx
Flume DS -JSP.pptxFlume DS -JSP.pptx
Flume DS -JSP.pptx
 
Introduction to Flume
Introduction to FlumeIntroduction to Flume
Introduction to Flume
 
Data persistency (draco, cygnus, sth comet, quantum leap)
Data persistency (draco, cygnus, sth comet, quantum leap)Data persistency (draco, cygnus, sth comet, quantum leap)
Data persistency (draco, cygnus, sth comet, quantum leap)
 
Open Source Big Data Ingestion - Without the Heartburn!
Open Source Big Data Ingestion - Without the Heartburn!Open Source Big Data Ingestion - Without the Heartburn!
Open Source Big Data Ingestion - Without the Heartburn!
 
Apache NiFi: latest developments for flow management at scale
Apache NiFi: latest developments for flow management at scaleApache NiFi: latest developments for flow management at scale
Apache NiFi: latest developments for flow management at scale
 
GOTO Night Amsterdam - Stream processing with Apache Flink
GOTO Night Amsterdam - Stream processing with Apache FlinkGOTO Night Amsterdam - Stream processing with Apache Flink
GOTO Night Amsterdam - Stream processing with Apache Flink
 
Deploying Apache Flume to enable low-latency analytics
Deploying Apache Flume to enable low-latency analyticsDeploying Apache Flume to enable low-latency analytics
Deploying Apache Flume to enable low-latency analytics
 
FIWARE Tech Summit - FIWARE Cygnus and STH-Comet
FIWARE Tech Summit - FIWARE Cygnus and STH-CometFIWARE Tech Summit - FIWARE Cygnus and STH-Comet
FIWARE Tech Summit - FIWARE Cygnus and STH-Comet
 
Flume vs. kafka
Flume vs. kafkaFlume vs. kafka
Flume vs. kafka
 
QCon London - Stream Processing with Apache Flink
QCon London - Stream Processing with Apache FlinkQCon London - Stream Processing with Apache Flink
QCon London - Stream Processing with Apache Flink
 
Web Service
Web ServiceWeb Service
Web Service
 
Palo Alto Networks PAN-OS 4.0 New Features
Palo Alto Networks PAN-OS 4.0 New FeaturesPalo Alto Networks PAN-OS 4.0 New Features
Palo Alto Networks PAN-OS 4.0 New Features
 
Large scale, distributed and reliable messaging with Kafka
Large scale, distributed and reliable messaging with KafkaLarge scale, distributed and reliable messaging with Kafka
Large scale, distributed and reliable messaging with Kafka
 
Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)
Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)
Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)
 
INTERNET OF THINGS & AZURE
INTERNET OF THINGS & AZUREINTERNET OF THINGS & AZURE
INTERNET OF THINGS & AZURE
 
Building Scalable Data Pipelines - 2016 DataPalooza Seattle
Building Scalable Data Pipelines - 2016 DataPalooza SeattleBuilding Scalable Data Pipelines - 2016 DataPalooza Seattle
Building Scalable Data Pipelines - 2016 DataPalooza Seattle
 
Spark+flume seattle
Spark+flume seattleSpark+flume seattle
Spark+flume seattle
 

Plus de Chirag Ahuja

Deploy hadoop cluster
Deploy hadoop clusterDeploy hadoop cluster
Deploy hadoop clusterChirag Ahuja
 
Word count example in hadoop mapreduce using java
Word count example in hadoop mapreduce using javaWord count example in hadoop mapreduce using java
Word count example in hadoop mapreduce using javaChirag Ahuja
 
Big data introduction
Big data introductionBig data introduction
Big data introductionChirag Ahuja
 
Hive : WareHousing Over hadoop
Hive :  WareHousing Over hadoopHive :  WareHousing Over hadoop
Hive : WareHousing Over hadoopChirag Ahuja
 
Mapreduce advanced
Mapreduce advancedMapreduce advanced
Mapreduce advancedChirag Ahuja
 
Hadoop introduction
Hadoop introductionHadoop introduction
Hadoop introductionChirag Ahuja
 

Plus de Chirag Ahuja (10)

Deploy hadoop cluster
Deploy hadoop clusterDeploy hadoop cluster
Deploy hadoop cluster
 
Word count example in hadoop mapreduce using java
Word count example in hadoop mapreduce using javaWord count example in hadoop mapreduce using java
Word count example in hadoop mapreduce using java
 
Big data introduction
Big data introductionBig data introduction
Big data introduction
 
Hbase
HbaseHbase
Hbase
 
Pig
PigPig
Pig
 
Hive : WareHousing Over hadoop
Hive :  WareHousing Over hadoopHive :  WareHousing Over hadoop
Hive : WareHousing Over hadoop
 
Mapreduce advanced
Mapreduce advancedMapreduce advanced
Mapreduce advanced
 
MapReduce basic
MapReduce basicMapReduce basic
MapReduce basic
 
Hdfs
HdfsHdfs
Hdfs
 
Hadoop introduction
Hadoop introductionHadoop introduction
Hadoop introduction
 

Dernier

Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxolyaivanovalion
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxJohnnyPlasten
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Callshivangimorya083
 
Ukraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSUkraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSAishani27
 
Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023ymrp368
 
B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxStephen266013
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% SecurePooja Nehwal
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxolyaivanovalion
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfLars Albertsson
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptxAnupama Kate
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionfulawalesam
 
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service BhilaiLow Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service BhilaiSuhani Kapoor
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Delhi Call girls
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz1
 
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiVIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiSuhani Kapoor
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998YohFuh
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfRachmat Ramadhan H
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxolyaivanovalion
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAroojKhan71
 

Dernier (20)

Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptx
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptx
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
 
Ukraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSUkraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICS
 
Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023
 
B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docx
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptx
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdf
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interaction
 
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service BhilaiLow Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signals
 
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiVIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptx
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
 

Apache Flume Data Collection System for Streaming Log Data

  • 1. Apache Flume: Data Collection System for HADOOP
  • 2. Outline Overview of Flume Flume Sources Channels & Sinks Flume Topology Production Architecture Monitoring & Performance
  • 3. Overview of Flume Collection, Aggregation of streaming Event Data Typically used for log data Significant advantages over ad-hoc solutions Reliable, Scalable, Manageable, Customizable and High Performance Declarative, Dynamic Configuration Contextual Routing Feature rich Fully extensible
  • 4. Core Concepts: Event An Event is the fundamental unit of data transported by Flume from its point of origination to its final destination. Event is a byte array payload accompanied by optional headers. Payload is opaque to Flume Headers are specified as an unordered collection of string key-value pairs, with keys being unique across the collection Headers can be used for contextual routing
  • 5. Core Concepts: Client An entity that generates events and sends them to one or more Agents. Example Flume log4j Appender Custom Client using Client SDK (org.apache.flume.api) Decouples Flume from the system where event data is consumed from Not needed in all cases
  • 6. Core Concepts: Agent A container for hosting Sources, Channels, Sinks and other components that enable the transportation of events from one place to another. Fundamental part of a Flume flow Provides Configuration, Life-Cycle Management, and Monitoring Support for hosted components
  • 7. Typical Aggregation Flow [Client]+  Agent [ Agent]*  Destination
  • 8. Core Concepts: Source An active component that receives events from a specialized location or mechanism and places it on one or Channels. Different Source types: Specialized sources for integrating with well-known systems. Example: Syslog, Netcat Auto-Generating Sources: Exec, SEQ IPC sources for Agent-to-Agent communication: Avro Require at least one channel to function
  • 9. Source Reads data from the source system and passes onto the next hop or to the final destination. Flume Sources: Avro Source Exec Source JMS Source Spooling Directory Source
  • 10. Core Concepts: Channel A passive component that buffers the incoming events until they are drained by Sinks. Different Channels offer different levels of persistence: Memory Channel: volatile File Channel: backed by WAL implementation JDBC Channel: backed by embedded Database Channels are fully transactional Provide weak ordering guarantees Can work with any number of Sources and Sinks.
  • 11. Core Concepts: Sink An active component that removes events from a Channel and transmits them to their next hop destination. Different types of Sinks: Terminal sinks that deposit events to their final destination. For example: HDFS, HBase IPC sink for Agent-to-Agent communication: Avro Require exactly one channel to function
  • 12. Sinks Writes data to the next hop or to the final destination. Flume Sinks: Avro Sink HDFS Sink HBASE Sink File Sink Null Sink Logger Sink
  • 13. What is the source in Flume
  • 15. Flume Channels Memory Channel Recommended if data loss due to crashes are ok File Channel Recommended channel. JDBC Channel Persistent store of data but introduces bottleneck and single point of failure.
  • 16. Memory Channel Events stored on heap Limited capacity No persistence after a system/process crash Very fast 3 config parameters: capacity: Maximum # of events that can be in the channel transactionCapacity: Maximum # of events in one txn. keepAlive: how long to wait to put/take an event
  • 19. Monitoring: protocol support Several monitoring protocols supported out of the box JMX Ganglia HTTP (JSON) Java opts must be set in flume-env.sh to configure monitoring Ganglia and HTTP monitoring are mutually exclusive