2. Outline
● Current problem
● What is Apache Flume?
● The Flume Model
○ Flows and Nodes
○ Agent, Processor and Collector Nodes
○ Data and Control Path
● Flume goals
○ Reliability
○ Scalability
○ Extensibility
○ Manageability
● Use case: Near Realtime Aggregator
3. Current Problem
● Situation:
You have hundreds of services running on different servers
that produce lots of large logs, which should be analyzed
together. You have Hadoop to process them.
● Problem:
How do I send all my logs to a place that has Hadoop? I
need a reliable, scalable, extensible and manageable way
to do it!
4. What is Apache Flume?
● It is a distributed data collection service that collects
flows of data (such as logs) from their sources and
aggregates them where they need to be processed.
● Goals: reliability, scalability, extensibility,
manageability.
Exactly what I needed!
5. The Flume Model: Flows and Nodes
● A flow corresponds to a type of data source (server
logs, machine monitoring metrics...).
● Flows are composed of nodes chained together (see
slide 7).
6. The Flume Model: Flows and Nodes
● In a Node, data come in through a source...
Source examples: Console, Exec, Syslog, IRC,
Twitter, other nodes...
...are optionally processed by one or more decorators...
Decorator examples: wire batching, compression,
sampling, projection, extraction...
...and then are transmitted out via a sink.
Sink examples: Console, local files, HDFS, S3,
other nodes...
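The source, decorator and sink stages of a node can be pictured as a small pipeline. The following is a minimal Python sketch of that idea only, not Flume's actual API: `list_source`, `sample_every`, `upper_case` and `run_node` are hypothetical names, and an event is assumed to be a plain string.

```python
from typing import Callable, Iterable, Iterator, List

Event = str  # assumption: an event is modelled as a plain string
Decorator = Callable[[Iterator[Event]], Iterator[Event]]


def list_source(lines: Iterable[Event]) -> Iterator[Event]:
    """Hypothetical source: emits events from an in-memory iterable."""
    for line in lines:
        yield line


def sample_every(n: int) -> Decorator:
    """Hypothetical sampling decorator: keeps every n-th event, starting with the first."""
    def decorate(events: Iterator[Event]) -> Iterator[Event]:
        for index, event in enumerate(events):
            if index % n == 0:
                yield event
    return decorate


def upper_case(events: Iterator[Event]) -> Iterator[Event]:
    """Hypothetical transforming decorator: upper-cases each event."""
    for event in events:
        yield event.upper()


def run_node(source: Iterator[Event],
             decorators: List[Decorator],
             sink: Callable[[Event], None]) -> None:
    """Pull events from the source, pass them through each decorator in order, push to the sink."""
    stream = source
    for decorate in decorators:
        stream = decorate(stream)
    for event in stream:
        sink(event)


collected: List[Event] = []  # a list stands in for a real sink such as HDFS
run_node(list_source(["a", "b", "c", "d"]), [sample_every(2), upper_case], collected.append)
print(collected)  # ['A', 'C']
```

Because every stage has the same shape, decorators can be added, removed or reordered without touching the source or the sink.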
7. The Flume Model: Agent, Processor and
Collector Nodes
● Agent:
receives data from an
application.
● Processor (optional):
intermediate processing.
● Collector:
writes data to permanent
storage.
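The three roles can be sketched as chained stages. This is an illustrative Python model under stated assumptions, not Flume code: the function names, the dictionary event layout, and the choice of dropping `DEBUG` lines in the processor are all made up for the example.

```python
from typing import Dict, Iterable, Iterator, List

Event = Dict[str, str]  # assumption: an event carries a host name and a body


def agent(host: str, app_lines: Iterable[str]) -> Iterator[Event]:
    """Agent tier: receives raw lines from an application and wraps them as events."""
    for line in app_lines:
        yield {"host": host, "body": line}


def processor(events: Iterable[Event]) -> Iterator[Event]:
    """Optional processor tier: here it drops DEBUG lines (an arbitrary example)."""
    for event in events:
        if not event["body"].startswith("DEBUG"):
            yield event


def collector(events: Iterable[Event], storage: List[Event]) -> int:
    """Collector tier: writes events to storage and returns how many were written."""
    count = 0
    for event in events:
        storage.append(event)  # a list stands in for permanent storage
        count += 1
    return count


storage: List[Event] = []
lines = ["INFO start", "DEBUG tick", "ERROR boom"]
written = collector(processor(agent("web-1", lines)), storage)
print(written)  # 2
```

Since the processor is optional, `collector(agent("web-1", lines), storage)` is an equally valid chain in this sketch.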
8. The Flume Model: Data and Control
Path (1/2)
Nodes are in the data path.
9. The Flume Model: Data and Control
Path (2/2)
Masters are in the control path.
● Centralized point of configuration. Multiple masters are coordinated through ZooKeeper (ZK).
● Specify sources, sinks and control data flows.
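The split between the two paths can be sketched as follows: nodes ask the master which source and sink to use, while events never pass through the master. This is a hypothetical Python model; `Master`, `NodeConfig` and the spec strings are invented for illustration and are not Flume's configuration syntax.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class NodeConfig:
    """What the master tells a node: where to read from and where to write to."""
    source: str
    sink: str


class Master:
    """Hypothetical control-path component: stores configuration, never handles events."""

    def __init__(self) -> None:
        self._configs: Dict[str, NodeConfig] = {}

    def configure(self, node: str, source: str, sink: str) -> None:
        """Set or replace the configuration of a node from one central place."""
        self._configs[node] = NodeConfig(source, sink)

    def config_for(self, node: str) -> NodeConfig:
        """Called by a node to learn its current source and sink."""
        return self._configs[node]


master = Master()
# The spec strings below are made-up placeholders, not real Flume syntax.
master.configure("agent-1", source="tail:/var/log/app.log", sink="forward:collector-1")
master.configure("collector-1", source="listen:agent-1", sink="store:hdfs")

print(master.config_for("agent-1").sink)  # forward:collector-1
```

Changing a data flow then means calling `configure` again on the master rather than editing each node by hand.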
13. Flume Goals: Extensibility
● Simple Source and Sink API
○ Event streaming and composition of simple
operations
● Plug-in Architecture
○ Add your own sources, sinks, decorators
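One common way to build a plug-in architecture is a registry that maps names to factories, so new components can be added without changing the core. The sketch below shows that general pattern in Python; it is an assumption about how such a mechanism could look, not Flume's plug-in API, and `register_sink`, `SINKS` and the two sample sinks are hypothetical.

```python
from typing import Callable, Dict, List

Sink = Callable[[str], None]          # a sink consumes one event at a time
SINKS: Dict[str, Callable[..., Sink]] = {}  # registry: name -> sink factory


def register_sink(name: str) -> Callable[[Callable[..., Sink]], Callable[..., Sink]]:
    """Decorator that registers a sink factory under the given name."""
    def wrap(factory: Callable[..., Sink]) -> Callable[..., Sink]:
        SINKS[name] = factory
        return factory
    return wrap


@register_sink("memory")
def memory_sink(buffer: List[str]) -> Sink:
    """Sample plug-in: appends every event to a caller-supplied list."""
    return buffer.append


@register_sink("prefixed")
def prefixed_sink(buffer: List[str], prefix: str) -> Sink:
    """Sample plug-in: stores each event with a prefix in front of it."""
    def write(event: str) -> None:
        buffer.append(prefix + event)
    return write


buffer: List[str] = []
SINKS["memory"](buffer)("hello")
SINKS["prefixed"](buffer, "web-1: ")("world")
print(buffer)  # ['hello', 'web-1: world']
```

The same registry shape would work for sources and decorators; only the factory signature differs.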