Apache Flume
● What is it?
● How does it work?
● Architecture
● Reliability
www.semtech-solutions.co.nz info@semtech-solutions.co.nz
Flume – What is it?
● A data collection service for Hadoop
● For distributed systems
● Open source
● Scalable
● Reliable
● Manageable
● Fault tolerant
Flume – How does it work?
● Flume uses agents which have
– A source
● Listen for events
● Write events to channel
– A channel
● Queue event data as transactions
– A sink
● Write event data to a target, e.g. HDFS
● Remove event from queue
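
The source, channel, and sink roles listed above can be sketched as a toy model in plain Python. This is not the Flume API: the names `Channel`, `source_put`, and `sink_drain` are hypothetical, an in-memory queue stands in for a real channel, and a list stands in for HDFS. The point it illustrates is the ordering on the sink side: the event is written to the target first and only then removed from the queue.

```python
from collections import deque


class Channel:
    """Toy stand-in for a Flume channel: a FIFO queue of events."""

    def __init__(self):
        self._queue = deque()

    def put(self, event):
        self._queue.append(event)

    def peek(self):
        return self._queue[0] if self._queue else None

    def commit_take(self):
        # Remove the event only once the sink has written it.
        self._queue.popleft()

    def __len__(self):
        return len(self._queue)


def source_put(channel, events):
    """Source role: receive events and write them to the channel."""
    for event in events:
        channel.put(event)


def sink_drain(channel, target):
    """Sink role: write each event to the target, then remove it from the queue."""
    while (event := channel.peek()) is not None:
        target.append(event)  # a write to HDFS in a real deployment
        channel.commit_take()


channel = Channel()
hdfs_stand_in = []  # hypothetical target: a list instead of HDFS
source_put(channel, ["event-1", "event-2"])
sink_drain(channel, hdfs_stand_in)
print(hdfs_stand_in, len(channel))
```

In a real agent these roles are wired together through configuration rather than code; the sketch only shows the order of operations.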
Flume – Architecture
● A single agent showing its parts
● Generally one agent for a given data type
Flume – Architecture
● Agents can be chained into flows
● Avro can be used for data serialization
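
Chaining can be pictured as one agent's sink handing serialized events to the next agent's source. The sketch below is a toy model under stated assumptions, not Flume code: the `Agent` class and the agent names `edge` and `collector` are hypothetical, and JSON is used as a stand-in for Avro so that the example needs only the standard library.

```python
import json
from collections import deque


class Agent:
    """Toy agent: a source feeds a channel, and a sink drains it."""

    def __init__(self, name, deliver):
        self.name = name
        self.channel = deque()
        self.deliver = deliver  # callable the sink invokes for each event

    def source(self, payload):
        # Source role: decode the incoming bytes and queue the event.
        self.channel.append(json.loads(payload))

    def run_sink(self):
        # Sink role: serialize each queued event, hand it to the next hop,
        # and only then remove it from the queue.
        while self.channel:
            event = self.channel[0]
            self.deliver(json.dumps(event).encode("utf-8"))
            self.channel.popleft()


stored = []  # hypothetical final target, standing in for HDFS
collector = Agent("collector", deliver=stored.append)
edge = Agent("edge", deliver=collector.source)  # edge sink -> collector source

edge.source(b'{"body": "log line 1"}')
edge.run_sink()       # moves the event from the edge agent to the collector
collector.run_sink()  # moves the event from the collector to the target
print(stored)
```

Each hop has its own queue, so a slow or unavailable downstream agent leaves events waiting in the upstream channel instead of blocking the original producer.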
Flume – Architecture
In complicated flows it may be necessary to think about
● Event Data Reliability
● Should we have
– Complete end-to-end reliability
– Send and forget
– Or something in between?
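
The trade-off between these options can be illustrated at a single hop. The sketch below is a toy model, not Flume code: `FlakyTarget`, `send_and_forget`, and `deliver_with_retry` are hypothetical names, and the target is contrived to fail exactly once. Send and forget removes the event before the write is known to have succeeded, so a failure loses it; the retrying variant removes the event only after a successful write.

```python
from collections import deque


class FlakyTarget:
    """Hypothetical target that rejects the first write and accepts the rest."""

    def __init__(self):
        self.stored = []
        self._failed_once = False

    def write(self, event):
        if not self._failed_once:
            self._failed_once = True
            raise IOError("target unavailable")
        self.stored.append(event)


def send_and_forget(queue, target):
    # Remove the event first; a failed write loses it.
    while queue:
        event = queue.popleft()
        try:
            target.write(event)
        except IOError:
            pass


def deliver_with_retry(queue, target, max_attempts=3):
    # Remove the event only after a successful write; otherwise retry.
    while queue:
        for _ in range(max_attempts):
            try:
                target.write(queue[0])
                queue.popleft()
                break
            except IOError:
                continue
        else:
            return  # give up for now; undelivered events stay queued


lossy, safe = FlakyTarget(), FlakyTarget()
send_and_forget(deque(["a", "b"]), lossy)
deliver_with_retry(deque(["a", "b"]), safe)
print(lossy.stored, safe.stored)  # ['b'] ['a', 'b']
```

The stronger guarantee costs extra latency and queue space while a target is down, which is why a flow may mix the two approaches depending on how valuable the event data is.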
Flume – Architecture
● Complex flows may have many links
Contact Us
● Feel free to contact us at
– www.semtech-solutions.co.nz
– info@semtech-solutions.co.nz
● We offer IT project consultancy
● We are happy to hear about your problems
● You pay only for the hours you need to solve your problems

An Introduction to Apache Flume
