Apache NiFi – MiNiFi
Taking Dataflow Management
to the Edge
Joe Percivall - @JPercivall
2 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
About Me
• Software Engineer at Hortonworks
• Apache NiFi committer and PMC member
• GitHub: github.com/JPercivall
Agenda
• Why create MiNiFi?
• MiNiFi 0.0.1-Java
• Demo
• Prospective plans
Apache NiFi
Dataflow
• Web-based user interface for creating, monitoring, and controlling data flows
• Directed graphs of data routing and transformation
• Highly configurable: modify data flow at runtime, dynamically prioritize data
• Easily extensible through development of custom components
• Data provenance tracks data through the entire system
[1] https://nifi.apache.org/
Apache NiFi
Key Features
• Guaranteed delivery
• Data buffering
- Backpressure
- Pressure release
• Prioritized queuing
• Flow specific QoS
- Latency vs. throughput
- Loss tolerance
• Data provenance
• Supports push and pull models
• Recovery/recording a rolling log of fine-grained history
• Visual command and control
• Flow templates
• Pluggable/multi-role security
• Designed for extension
• Clustering
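To make the data buffering and prioritized queuing features above more concrete, here is a minimal Python sketch of a bounded queue that refuses new items once it is full (backpressure) and releases the most urgent item first. It is an illustration only, not NiFi's implementation: `PrioritizedQueue`, `offer`, `poll`, and the priority numbers are hypothetical names chosen for this example.

```python
import heapq
import itertools


class PrioritizedQueue:
    """Bounded queue: refuses new items when full (backpressure) and
    hands out the most urgent item first (prioritized queuing).
    Hypothetical class for illustration, not part of NiFi."""

    def __init__(self, max_items):
        self.max_items = max_items
        self._heap = []
        # Tie-breaker so items with equal priority leave in arrival order.
        self._order = itertools.count()

    def offer(self, item, priority):
        """Return False instead of accepting the item once the limit is reached."""
        if len(self._heap) >= self.max_items:
            return False  # the producer is expected to slow down or retry later
        heapq.heappush(self._heap, (priority, next(self._order), item))
        return True

    def poll(self):
        """Remove and return the item with the lowest priority number, or None."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]
```

A producer that receives `False` from `offer` knows the downstream queue is full and can hold or drop its data, which is the general idea behind backpressure and pressure release.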
Simplified Example
Let’s consider the needs of a courier service
• Physical Store: Registers, Mobile Devices, Gateway Server
• Distribution Center: Server Cluster
• Core Data Center at HQ: Server Cluster
• On Delivery Routes: Trucks, Deliverers
Icon credits:
Delivery Truck: Creative Stall, https://thenounproject.com/creativestall/
Deliverer: Rigo Peter, https://thenounproject.com/rigo/
Cash Register: Sergey Patutin, https://thenounproject.com/bdesign.by/
Hand Scanner: Eric Pearson, https://thenounproject.com/epearson001/
Courier service from the perspective of NiFi & MiNiFi
[Diagram: the courier service layout from the Simplified Example slide, with Client Libraries, MiNiFi, and NiFi labels placed on its devices and servers]
Apache NiFi MiNiFi
Key Features
• Guaranteed delivery
• Data buffering
- Backpressure
- Pressure release
• Prioritized queuing
• Flow specific QoS
- Latency vs. throughput
- Loss tolerance
• Data provenance
• Recovery/recording a rolling log of fine-grained history
• Designed for extension
• Design and Deploy
• Warm re-deploys
Visual Command and Control
vs.
Design and Deploy
Created to more effectively collect
data at the edge
Agenda
• Why create MiNiFi?
• MiNiFi 0.0.1-Java
• Demo
• Prospective plans
NiFi vs MiNiFi Java Processes
• MiNiFi: NiFi Framework, Components
• NiFi: NiFi Framework, User Interface, Components
NiFi Java Processes
• Bootstrap reads bootstrap.conf and starts NiFi
• NiFi reads nifi.properties
• NiFi, which includes the UI, reads and modifies flow.xml.gz
MiNiFi Java Processes
• Bootstrap, with its Configuration Change Notifier(s), reads bootstrap.conf
• Bootstrap transforms config.yml into nifi.properties and flow.xml.gz
• Bootstrap starts MiNiFi
• MiNiFi reads nifi.properties and flow.xml.gz
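The transform step on this slide (one config.yml becoming both nifi.properties and flow.xml.gz) can be sketched roughly as follows. This is only an illustration under loud assumptions: the input is a plain Python dict standing in for the parsed YAML, and the property keys, XML element names, and the `transform` function are all hypothetical, not the real MiNiFi schema or code.

```python
import gzip
import xml.etree.ElementTree as ET


def transform(config):
    """Turn one parsed config (a plain dict here) into the two artifacts the
    slide names: properties text and a gzip-compressed XML flow."""
    # Hypothetical property keys; real nifi.properties keys are not reproduced here.
    properties_text = "".join(
        f"{key}={value}\n" for key, value in sorted(config["properties"].items())
    )

    # Hypothetical XML layout, not the real flow.xml schema.
    root = ET.Element("flow", name=config["name"])
    for processor in config["processors"]:
        ET.SubElement(root, "processor", name=processor["name"])
    flow_xml_gz = gzip.compress(ET.tostring(root))
    return properties_text, flow_xml_gz
```

The point of the design is that an operator edits a single declarative file, while the framework underneath still receives the two files it already knows how to read.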
Same Extensible framework (NARs)
• In minifi-0.0.1, the nifi-0.6.1 standard processors are bundled (~20 MB)
– Tailing a log
– UpdateAttribute
– Routing by content or attributes
– PutEmail
Allows MiNiFi to use NiFi processors
MiNiFi 0.0.1-Java
• Declarative configuration of processing flows through a YAML configuration file
• Exporting of provenance events to another NiFi instance via a Reporting Task over Site to Site
• Flow change configuration watcher implementations that reload a NiFi instance when they receive an updated flow over REST or detect changes on a file system
• A mechanism to query an instance's status
• <40 MB binary distribution
Release Notes
Simple Config.yml
Tail a rolling file -> Site to Site
But what about the nifi.properties values?
Can be omitted to use default values
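One way to picture "omit a value to get the default" is a simple overlay of supplied values on top of a defaults table. The sketch below is an assumption-laden illustration: `DEFAULT_PROPERTIES`, its keys and values, and `effective_properties` are hypothetical and do not reproduce real nifi.properties entries.

```python
# Hypothetical defaults; the real nifi.properties keys and values are not reproduced here.
DEFAULT_PROPERTIES = {
    "example.repository.directory": "./repository",
    "example.shutdown.period": "10 sec",
}


def effective_properties(overrides):
    """Start from the defaults and overlay only the values the config supplies."""
    merged = dict(DEFAULT_PROPERTIES)
    merged.update(overrides)
    return merged
```

Calling `effective_properties({})` yields the defaults unchanged, while passing a single key replaces only that entry.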
Provenance Reporting
• Site to Site Reporting Task
• JSON formatted provenance events
• Configured via config.yml
• Optional
“SiteToSiteProvenanceReportingTask” in NiFi 0.7.0
A bit more complex Config.yml
Tail a rolling File -> Secure Site to Site with Provenance
MiNiFi Toolkit
• Convert NiFi templates to config.yml
• Validate config.yml files
CLI to facilitate config.yml building
Config Change Notifiers
• Two implementations
– RestChangeNotifier: HTTP(S)
– FileChangeNotifier
• Configured in bootstrap.conf
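As a rough picture of what a file-based change notifier has to do, the Python sketch below compares a digest of the watched file against the digest seen on the previous check. It is not the FileChangeNotifier source: the `FileChangeWatcher` class, the polling approach, and the sample file contents are assumptions made for illustration.

```python
import hashlib
import os
import tempfile


class FileChangeWatcher:
    """Reports a change when the watched file's content digest differs
    from the digest seen on the previous check. Hypothetical class."""

    def __init__(self, path):
        self.path = path
        self._last_digest = self._digest()

    def _digest(self):
        with open(self.path, "rb") as handle:
            return hashlib.sha256(handle.read()).hexdigest()

    def changed(self):
        """Return True once per content change, False otherwise."""
        current = self._digest()
        if current == self._last_digest:
            return False
        self._last_digest = current
        return True


# Small demonstration against a temporary file with made-up contents.
with tempfile.TemporaryDirectory() as directory:
    path = os.path.join(directory, "config.yml")
    with open(path, "w") as handle:
        handle.write("version: 1\n")
    watcher = FileChangeWatcher(path)
    observed = [watcher.changed()]        # nothing has changed yet
    with open(path, "w") as handle:
        handle.write("version: 2\n")
    observed.append(watcher.changed())    # content differs now
    observed.append(watcher.changed())    # change was already reported
```

Comparing content digests rather than timestamps avoids reporting a change when a file is rewritten with identical contents.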
Change notifier update
[Diagram, built up over six slides: MiNiFi; Bootstrap with its Configuration Change Notifiers; config.yml, nifi.properties, and flow.xml.gz; arrows labeled "user creates new configuration", "validate new configuration", "saves new", "transforms into", "attempt restart", and "reads"]
1. Initial state
– Both running
2. User sends update through notifier
– HTTP(S) POST request
– Change watched file
3. Bootstrap validation
– Basic validation
– REST notifier will respond accordingly
– Results logged
4. Bootstrap saves and transforms
– Copy old config.yml to a swap file
– Save the new config.yml and transform it into nifi.properties and flow.xml.gz
5. Bootstrap attempts restart
– MiNiFi reads in the new nifi.properties and flow.xml.gz
6. Success or fail
– Successful restart: continue processing
– Failure: roll back to old config
– Existing data is mapped or orphaned
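The validate, swap, restart, and rollback sequence in steps 3 to 6 can be sketched as a single function. This is a simplified model under stated assumptions, not MiNiFi's bootstrap code: `apply_update` is a hypothetical name, configurations are plain dicts, and `validate` and `restart` are caller-supplied callables that return True on success.

```python
def apply_update(current_config, new_config, validate, restart):
    """Return (active_config, outcome) after attempting a configuration update.

    validate(config) -> bool and restart(config) -> bool are supplied by the
    caller; both are stand-ins for the real checks and process restart.
    """
    # Step 3: basic validation; a rejected update leaves the old config active.
    if not validate(new_config):
        return current_config, "rejected"

    # Step 4: keep the old configuration as a swap copy before saving the new one.
    swap_copy = current_config

    # Step 5: attempt a restart with the new configuration.
    if restart(new_config):
        return new_config, "restarted"   # step 6, success: continue processing

    # Step 6, failure: roll back by restarting with the swap copy.
    restart(swap_copy)
    return swap_copy, "rolled back"
```

Keeping the previous configuration until the restart is known to have succeeded is what makes the rollback in step 6 possible.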
minifi.sh flowStatus
• Components
• Instance
• System Diagnostics
Get flow status at the command line
Agenda
• Why create MiNiFi?
• MiNiFi 0.0.1-Java
• Demo
• Prospective plans
Courier service from the perspective of NiFi
[Diagram: the courier service layout from the Simplified Example slide, with Client Libraries, MiNiFi, and NiFi labels placed on its devices and servers]
Agenda
• Why create MiNiFi?
• MiNiFi 0.0.1-Java
• Demo
• Prospective plans
Prospective Plans
• MiNiFi 0.0.1-Cpp
– Close to a vote to release
– Code itself is 1.2 MB without optimization
– Data size: ~20 MB of dynamic RAM for heap, ~50 KB static
• Configurable Status Reporters
– minifi.sh flowStatus -> regular status update (MQTT? HTTP? S2S?)
• Handle component Annotation Data
– UpdateAttribute advanced rules
Major Feature - Centralized Command and Control
• Design at a centralized place, deploy on the edge
– Flow deployment
– NAR deployment
– Agent deployment
• Version control of flows
• Agent status monitoring
• Bi-directional command and control
Centralized management console with a UI
Questions?
Thank you!
Learn more and join us!
Apache NiFi site
http://nifi.apache.org
Subproject MiNiFi site
http://nifi.apache.org/minifi/
Subscribe to and collaborate at
dev@nifi.apache.org
users@nifi.apache.org
Submit Ideas or Issues
https://issues.apache.org/jira/browse/NIFI
Follow us on Twitter
@apachenifi
Brief history of the Apache NiFi Community
Matured at NSA 2006-2014
• Contributors from Government and several commercial industries
• Releases on a 6-8 week schedule
• Apache NiFi 1.0.0 release on the horizon
• Zero-Master Clustering
Timeline: 2006 (code developed at NSA), November 2014 (code available open source, ASL v2), July 2015 (achieved TLP status in just 7 months), Today
MiNiFi differentiation
• Let me get the key parts of NiFi close to where data begins and provide bidirectional communication
• NiFi lives in the data center. Give it an enterprise server or a cluster of them.
• MiNiFi lives close to where data is born and may be a guest on that device or system
Contenu connexe

Tendances

Tendances (20)

Integrating NiFi and Flink
Integrating NiFi and FlinkIntegrating NiFi and Flink
Integrating NiFi and Flink
 
Apache NiFi Record Processing
Apache NiFi Record ProcessingApache NiFi Record Processing
Apache NiFi Record Processing
 
Running Apache NiFi with Apache Spark : Integration Options
Running Apache NiFi with Apache Spark : Integration OptionsRunning Apache NiFi with Apache Spark : Integration Options
Running Apache NiFi with Apache Spark : Integration Options
 
Apache NiFi Crash Course Intro
Apache NiFi Crash Course IntroApache NiFi Crash Course Intro
Apache NiFi Crash Course Intro
 
Securing Hadoop with Apache Ranger
Securing Hadoop with Apache RangerSecuring Hadoop with Apache Ranger
Securing Hadoop with Apache Ranger
 
Introduction to Apache NiFi dws19 DWS - DC 2019
Introduction to Apache NiFi   dws19 DWS - DC 2019Introduction to Apache NiFi   dws19 DWS - DC 2019
Introduction to Apache NiFi dws19 DWS - DC 2019
 
Apache NiFi Meetup - Princeton NJ 2016
Apache NiFi Meetup - Princeton NJ 2016Apache NiFi Meetup - Princeton NJ 2016
Apache NiFi Meetup - Princeton NJ 2016
 
NiFi Best Practices for the Enterprise
NiFi Best Practices for the EnterpriseNiFi Best Practices for the Enterprise
NiFi Best Practices for the Enterprise
 
Dataflow with Apache NiFi - Apache NiFi Meetup - 2016 Hadoop Summit - San Jose
Dataflow with Apache NiFi - Apache NiFi Meetup - 2016 Hadoop Summit - San JoseDataflow with Apache NiFi - Apache NiFi Meetup - 2016 Hadoop Summit - San Jose
Dataflow with Apache NiFi - Apache NiFi Meetup - 2016 Hadoop Summit - San Jose
 
Apache NiFi Meetup - Introduction to NiFi Registry
Apache NiFi Meetup - Introduction to NiFi RegistryApache NiFi Meetup - Introduction to NiFi Registry
Apache NiFi Meetup - Introduction to NiFi Registry
 
Apache Nifi Crash Course
Apache Nifi Crash CourseApache Nifi Crash Course
Apache Nifi Crash Course
 
Introduction to data flow management using apache nifi
Introduction to data flow management using apache nifiIntroduction to data flow management using apache nifi
Introduction to data flow management using apache nifi
 
How to boost your datamanagement with Dremio ?
How to boost your datamanagement with Dremio ?How to boost your datamanagement with Dremio ?
How to boost your datamanagement with Dremio ?
 
ORC File - Optimizing Your Big Data
ORC File - Optimizing Your Big DataORC File - Optimizing Your Big Data
ORC File - Optimizing Your Big Data
 
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...
 
Time-Series Apache HBase
Time-Series Apache HBaseTime-Series Apache HBase
Time-Series Apache HBase
 
The columnar roadmap: Apache Parquet and Apache Arrow
The columnar roadmap: Apache Parquet and Apache ArrowThe columnar roadmap: Apache Parquet and Apache Arrow
The columnar roadmap: Apache Parquet and Apache Arrow
 
LLAP: Sub-Second Analytical Queries in Hive
LLAP: Sub-Second Analytical Queries in HiveLLAP: Sub-Second Analytical Queries in Hive
LLAP: Sub-Second Analytical Queries in Hive
 
Apache Hadoop Security - Ranger
Apache Hadoop Security - RangerApache Hadoop Security - Ranger
Apache Hadoop Security - Ranger
 
Building Data Pipelines for Solr with Apache NiFi
Building Data Pipelines for Solr with Apache NiFiBuilding Data Pipelines for Solr with Apache NiFi
Building Data Pipelines for Solr with Apache NiFi
 

En vedette

Enterprise Data Science at Scale Meetup - IBM and Hortonworks - Oct 2017
Enterprise Data Science at Scale Meetup - IBM and Hortonworks - Oct 2017 Enterprise Data Science at Scale Meetup - IBM and Hortonworks - Oct 2017
Enterprise Data Science at Scale Meetup - IBM and Hortonworks - Oct 2017
Hortonworks
 

En vedette (12)

Taking DataFlow Management to the Edge with Apache NiFi/MiNiFi
Taking DataFlow Management to the Edge with Apache NiFi/MiNiFiTaking DataFlow Management to the Edge with Apache NiFi/MiNiFi
Taking DataFlow Management to the Edge with Apache NiFi/MiNiFi
 
Apache Flume (NG)
Apache Flume (NG)Apache Flume (NG)
Apache Flume (NG)
 
Hortonworks Data in Motion Webinar Series - Part 1
Hortonworks Data in Motion Webinar Series - Part 1Hortonworks Data in Motion Webinar Series - Part 1
Hortonworks Data in Motion Webinar Series - Part 1
 
Apache Flume
Apache FlumeApache Flume
Apache Flume
 
Hortonworks Data In Motion Series Part 4
Hortonworks Data In Motion Series Part 4Hortonworks Data In Motion Series Part 4
Hortonworks Data In Motion Series Part 4
 
Hortonworks Data in Motion Webinar Series Part 7 Apache Kafka Nifi Better Tog...
Hortonworks Data in Motion Webinar Series Part 7 Apache Kafka Nifi Better Tog...Hortonworks Data in Motion Webinar Series Part 7 Apache Kafka Nifi Better Tog...
Hortonworks Data in Motion Webinar Series Part 7 Apache Kafka Nifi Better Tog...
 
Real-Time Data Flows with Apache NiFi
Real-Time Data Flows with Apache NiFiReal-Time Data Flows with Apache NiFi
Real-Time Data Flows with Apache NiFi
 
Streamline Apache Hadoop Operations with Apache Ambari and SmartSense
Streamline Apache Hadoop Operations with Apache Ambari and SmartSenseStreamline Apache Hadoop Operations with Apache Ambari and SmartSense
Streamline Apache Hadoop Operations with Apache Ambari and SmartSense
 
Real time Analytics with Apache Kafka and Apache Spark
Real time Analytics with Apache Kafka and Apache SparkReal time Analytics with Apache Kafka and Apache Spark
Real time Analytics with Apache Kafka and Apache Spark
 
Verizon Centralizes Data into a Data Lake in Real Time for Analytics
Verizon Centralizes Data into a Data Lake in Real Time for AnalyticsVerizon Centralizes Data into a Data Lake in Real Time for Analytics
Verizon Centralizes Data into a Data Lake in Real Time for Analytics
 
Enterprise Data Science at Scale Meetup - IBM and Hortonworks - Oct 2017
Enterprise Data Science at Scale Meetup - IBM and Hortonworks - Oct 2017 Enterprise Data Science at Scale Meetup - IBM and Hortonworks - Oct 2017
Enterprise Data Science at Scale Meetup - IBM and Hortonworks - Oct 2017
 
Benefits of Transferring Real-Time Data to Hadoop at Scale
Benefits of Transferring Real-Time Data to Hadoop at ScaleBenefits of Transferring Real-Time Data to Hadoop at Scale
Benefits of Transferring Real-Time Data to Hadoop at Scale
 

Similaire à Apache NiFi- MiNiFi meetup Slides

The First Mile - Edge and IoT Data Collection With Apache Nifi and MiniFi
The First Mile - Edge and IoT Data Collection With Apache Nifi and MiniFiThe First Mile - Edge and IoT Data Collection With Apache Nifi and MiniFi
The First Mile - Edge and IoT Data Collection With Apache Nifi and MiniFi
DataWorks Summit
 
Dataflow Management From Edge to Core with Apache NiFi
Dataflow Management From Edge to Core with Apache NiFiDataflow Management From Edge to Core with Apache NiFi
Dataflow Management From Edge to Core with Apache NiFi
DataWorks Summit
 
Intelligently Collecting Data at the Edge – Intro to Apache MiNiFi
Intelligently Collecting Data at the Edge – Intro to Apache MiNiFiIntelligently Collecting Data at the Edge – Intro to Apache MiNiFi
Intelligently Collecting Data at the Edge – Intro to Apache MiNiFi
DataWorks Summit
 

Similaire à Apache NiFi- MiNiFi meetup Slides (20)

The Avant-garde of Apache NiFi
The Avant-garde of Apache NiFiThe Avant-garde of Apache NiFi
The Avant-garde of Apache NiFi
 
The Avant-garde of Apache NiFi
The Avant-garde of Apache NiFiThe Avant-garde of Apache NiFi
The Avant-garde of Apache NiFi
 
Integrating Apache NiFi and Apache Flink
Integrating Apache NiFi and Apache FlinkIntegrating Apache NiFi and Apache Flink
Integrating Apache NiFi and Apache Flink
 
Integrating Apache NiFi and Apache Flink
Integrating Apache NiFi and Apache FlinkIntegrating Apache NiFi and Apache Flink
Integrating Apache NiFi and Apache Flink
 
Integrating Apache NiFi and Apache Flink
Integrating Apache NiFi and Apache FlinkIntegrating Apache NiFi and Apache Flink
Integrating Apache NiFi and Apache Flink
 
Integrating Apache NiFi and Apache Flink
Integrating Apache NiFi and Apache FlinkIntegrating Apache NiFi and Apache Flink
Integrating Apache NiFi and Apache Flink
 
Integrating NiFi and Apex
Integrating NiFi and ApexIntegrating NiFi and Apex
Integrating NiFi and Apex
 
Integrating Apache NiFi and Apache Apex
Integrating Apache NiFi and Apache Apex Integrating Apache NiFi and Apache Apex
Integrating Apache NiFi and Apache Apex
 
State of the Apache NiFi Ecosystem & Community
State of the Apache NiFi Ecosystem & CommunityState of the Apache NiFi Ecosystem & Community
State of the Apache NiFi Ecosystem & Community
 
Data at Scales and the Values of Starting Small with Apache NiFi & MiNiFi
Data at Scales and the Values of Starting Small with Apache NiFi & MiNiFiData at Scales and the Values of Starting Small with Apache NiFi & MiNiFi
Data at Scales and the Values of Starting Small with Apache NiFi & MiNiFi
 
The First Mile -- Edge and IoT Data Collection with Apache NiFi and MiNiFi
The First Mile -- Edge and IoT Data Collection with Apache NiFi and MiNiFiThe First Mile -- Edge and IoT Data Collection with Apache NiFi and MiNiFi
The First Mile -- Edge and IoT Data Collection with Apache NiFi and MiNiFi
 
The First Mile - Edge and IoT Data Collection With Apache Nifi and MiniFi
The First Mile - Edge and IoT Data Collection With Apache Nifi and MiniFiThe First Mile - Edge and IoT Data Collection With Apache Nifi and MiniFi
The First Mile - Edge and IoT Data Collection With Apache Nifi and MiniFi
 
Hadoop Summit Tokyo Apache NiFi Crash Course
Hadoop Summit Tokyo Apache NiFi Crash CourseHadoop Summit Tokyo Apache NiFi Crash Course
Hadoop Summit Tokyo Apache NiFi Crash Course
 
Dataflow Management From Edge to Core with Apache NiFi
Dataflow Management From Edge to Core with Apache NiFiDataflow Management From Edge to Core with Apache NiFi
Dataflow Management From Edge to Core with Apache NiFi
 
Apache NiFi 1.0 in Nutshell
Apache NiFi 1.0 in NutshellApache NiFi 1.0 in Nutshell
Apache NiFi 1.0 in Nutshell
 
Connecting the Drops with Apache NiFi & Apache MiNiFi
Connecting the Drops with Apache NiFi & Apache MiNiFiConnecting the Drops with Apache NiFi & Apache MiNiFi
Connecting the Drops with Apache NiFi & Apache MiNiFi
 
Apache NiFi Crash Course - San Jose Hadoop Summit
Apache NiFi Crash Course - San Jose Hadoop SummitApache NiFi Crash Course - San Jose Hadoop Summit
Apache NiFi Crash Course - San Jose Hadoop Summit
 
NJ Hadoop Meetup - Apache NiFi Deep Dive
NJ Hadoop Meetup - Apache NiFi Deep DiveNJ Hadoop Meetup - Apache NiFi Deep Dive
NJ Hadoop Meetup - Apache NiFi Deep Dive
 
Big Data Day LA 2016/ Big Data Track - Building scalable enterprise data flow...
Big Data Day LA 2016/ Big Data Track - Building scalable enterprise data flow...Big Data Day LA 2016/ Big Data Track - Building scalable enterprise data flow...
Big Data Day LA 2016/ Big Data Track - Building scalable enterprise data flow...
 
Intelligently Collecting Data at the Edge – Intro to Apache MiNiFi
Intelligently Collecting Data at the Edge – Intro to Apache MiNiFiIntelligently Collecting Data at the Edge – Intro to Apache MiNiFi
Intelligently Collecting Data at the Edge – Intro to Apache MiNiFi
 

Dernier

Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Victor Rentea
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
WSO2
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 

Dernier (20)

Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUKSpring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with Milvus
 

Apache NiFi- MiNiFi meetup Slides

  • 1. Apache NiFi – MiNiFi Taking Dataflow Management to the Edge Joe Percivall - @JPercivall
  • 2. 2 © Hortonworks Inc. 2011 – 2016. All Rights Reserved About Me • Software Engineer at Hortonworks • Apache NiFi committer and PMC member • Github: github.com/JPercivall
  • 3. 3 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Agenda • Why create MiNiFi? • MiNiFi 0.0.1-Java • Demo • Prospective plans
  • 4. 4 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Agenda • Why create MiNiFi? • MiNiFi 0.0.1-Java • Demo • Prospective plans
  • 5. 5 © Hortonworks Inc. 2011 – 2016. All Rights Reserved • Web-based User Interface for creating, monitoring, & controlling data flows • Directed graphs of data routing and transformation • Highly configurable - modify data flow at runtime, dynamically prioritize data • Easily extensible through development of custom components • Data Provenance tracks data through entire system [1] https://nifi.apache.org/ Apache NiFi Dataflow
  • 6. 6 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Apache NiFi Key Features • Guaranteed delivery • Data buffering - Backpressure - Pressure release • Prioritized queuing • Flow specific QoS - Latency vs. throughput - Loss tolerance • Data provenance • Supports push and pull models • Recovery/recording a rolling log of fine- grained history • Visual command and control • Flow templates • Pluggable/multi-role security • Designed for extension • Clustering
  • 7. 7 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Simplified Example Let’s consider the needs of a courier service Physical Store Gateway Server Mobile Devices Registers Server Cluster Distribution Center Core Data Center at HQ Server Cluster On Delivery Routes Trucks Deliverers Delivery Truck: Creative Stall, https://thenounproject.com/creativestall/ Deliverer: Rigo Peter, https://thenounproject.com/rigo/ Cash Register: Sergey Patutin, https://thenounproject.com/bdesign.by/ Hand Scanner: Eric Pearson, https://thenounproject.com/epearson001/
  • 8. 8 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Courier service from the perspective of NiFi & MiNiFi Physical Store Gateway Server Mobile Devices Registers Server Cluster Distribution Center Core Data Center at HQ Server Cluster On Delivery Routes Trucks Deliverers Delivery Truck: Creative Stall, https://thenounproject.com/creativestall/ Deliverer: Rigo Peter, https://thenounproject.com/rigo/ Cash Register: Sergey Patutin, https://thenounproject.com/bdesign.by/ Hand Scanner: Eric Pearson, https://thenounproject.com/epearson001/ Client Libraries Client Libraries MiNiFi MiNiFi NiFi NiFi NiFi NiFi NiFi NiFi Client Libraries
  • 9. 9 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Apache NiFi MiNiFi Key Features • Guaranteed delivery • Data buffering - Backpressure - Pressure release • Prioritized queuing • Flow specific QoS - Latency vs. throughput - Loss tolerance • Data provenance • Recovery/recording a rolling log of fine- grained history • Designed for extension • Design and Deploy • Warm re-deploys
  • 10. 10 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Apache NiFi MiNiFi Key Features • Guaranteed delivery • Data buffering - Backpressure - Pressure release • Prioritized queuing • Flow specific QoS - Latency vs. throughput - Loss tolerance • Data provenance • Recovery/recording a rolling log of fine- grained history • Designed for extension • Design and Deploy • Warm re-deploys
  • 11. 11 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Visual Command and Control vs. Design and Deploy
  • 12. 12 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Created to more effectively collect data at the edge
  • 13. 13 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Agenda • Why create MiNiFi? • MiNiFi 0.0.1-Java • Demo • Prospective plans
  • 14. NiFi vs MiNiFi Java Processes. Diagram: MiNiFi = NiFi Framework + Components; NiFi = NiFi Framework + User Interface + Components
  • 15. NiFi Java Processes. Diagram: the Bootstrap reads bootstrap.conf and starts NiFi; NiFi (with its UI) reads nifi.properties, and reads & modifies flow.xml.gz
  • 16. MiNiFi Java Processes. Diagram: the Bootstrap (with its Configuration Change Notifier(s)) reads bootstrap.conf and config.yml, transforms config.yml into nifi.properties and flow.xml.gz, and starts MiNiFi; MiNiFi reads nifi.properties and flow.xml.gz
  • 17. Same extensible framework (NARs): allows MiNiFi to use NiFi processors • In minifi-0.0.1, the nifi-0.6.1 standard processors are bundled (~20 MB) – Tailing a log – UpdateAttribute – Routing by content or attributes – PutEmail
  • 18. MiNiFi 0.0.1-Java (Release Notes) • Declarative configuration of processing flows through a YAML configuration file • Export of provenance events to another NiFi instance via a Reporting Task over Site to Site • Flow change configuration watcher implementations that reload the instance when an updated flow arrives over REST or a watched file changes on the file system • A mechanism to query an instance's status • <40 MB binary distribution
  • 19. Simple config.yml: tail a rolling file -> Site to Site
  • 20. But what about the nifi.properties values? They can be omitted to use the default values
  • 21. Provenance Reporting • Site to Site Reporting Task • JSON formatted provenance events • Configured via config.yml • Optional “SiteToSiteProvenanceReportingTask” in NiFi 0.7.0
  • 22. A bit more complex config.yml: tail a rolling file -> secure Site to Site with provenance
  • 23. MiNiFi Toolkit: CLI to facilitate config.yml building • Convert NiFi templates to config.yml • Validate config.yml files
  • 24. Config Change Notifiers • Two implementations – RestChangeNotifier (HTTP(S)) – FileChangeNotifier • Configured in bootstrap.conf
  • 25. Change notifier update, step 1: Initial state – Both running (diagram: MiNiFi; Bootstrap with Configuration Change Notifiers)
  • 26. Change notifier update, step 2: User sends update through notifier – HTTP(S) POST request – Change watched file (diagram: user creates new configuration)
  • 27. Change notifier update, step 3: Bootstrap validation – Basic validation – REST notifier will respond accordingly – Results logged (diagram: validate new configuration)
  • 28. Change notifier update, step 4: Bootstrap saves and transforms – Copy old config.yml to a swap file (diagram: saves new config.yml; transforms into nifi.properties and flow.xml.gz)
  • 29. Change notifier update, step 5: Bootstrap attempts restart – MiNiFi reads in the new nifi.properties and flow.xml.gz (diagram: attempt restart)
  • 30. Change notifier update, step 6: Success or fail – Successful restart: continue processing – Failure: roll back to old config – Existing data is mapped or orphaned
  • 31. minifi.sh flowStatus: get flow status at the command line • Components • Instance • System Diagnostics
  • 32. Agenda • Why create MiNiFi? • MiNiFi 0.0.1-Java • Demo • Prospective plans
  • 33. Courier service from the perspective of NiFi. Same diagram and icon credits as slide 7, with overlay labels: Client Libraries (x3), MiNiFi (x2), NiFi (x6)
  • 34. Agenda • Why create MiNiFi? • MiNiFi 0.0.1-Java • Demo • Prospective plans
  • 35. Prospective Plans • MiNiFi 0.0.1-Cpp – Close to a vote to release – Code itself is 1.2 MB without optimization – Data size: ~20 MB of dynamic RAM for heap, ~50 KB static • Configurable Status Reporters – minifi.sh flowStatus -> regular status update (MQTT? HTTP? S2S?) • Handle component Annotation Data – UpdateAttribute advanced rules
  • 36. Major Feature - Centralized Command and Control: centralized management console with a UI • Design at a centralized place, deploy on the edge – Flow deployment – NAR deployment – Agent deployment • Version control of flows • Agent status monitoring • Bi-directional command and control
  • 37. Questions?
  • 38. Thank you!
  • 39. Learn more and join us! • Apache NiFi site: http://nifi.apache.org • Subproject MiNiFi site: http://nifi.apache.org/minifi/ • Subscribe to and collaborate at dev@nifi.apache.org and users@nifi.apache.org • Submit ideas or issues: https://issues.apache.org/jira/browse/NIFI • Follow us on Twitter: @apachenifi
  • 40. Brief history of the Apache NiFi Community. Timeline: 2006, code developed at NSA (matured at NSA 2006-2014); November 2014, code available open source under ASL v2; July 2015, achieved TLP status in just 7 months; today • Contributors from government and several commercial industries • Releases on a 6-8 week schedule • Apache NiFi 1.0.0 release on the horizon • Zero-Master Clustering
  • 41. MiNiFi differentiation • Let me get the key parts of NiFi close to where data begins and provide bidirectional communication • NiFi lives in the data center. Give it an enterprise server or a cluster of them. • MiNiFi lives close to where data is born and may be a guest on that device or system
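
The change notifier update walked through in slides 25 to 30 (validate, save the old config to a swap copy, restart, roll back on failure) can be sketched roughly as below. This is only an illustration in Python, not MiNiFi's actual Java implementation: the `Bootstrap` class, its `restart` callback, and the "must mention Processors" validation rule are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Bootstrap:
    """Toy stand-in for the MiNiFi bootstrap process (hypothetical, not the real class)."""
    active_config: str
    # Hypothetical callback: returns True if the agent came up with the given config.
    restart: Callable[[str], bool]
    swap_config: str = ""
    log: List[str] = field(default_factory=list)

    def validate(self, new_config: str) -> bool:
        # Step 3: basic validation. The rule used here (non-empty text that
        # mentions a "Processors" section) is an assumption for illustration.
        return bool(new_config.strip()) and "Processors" in new_config

    def apply_update(self, new_config: str) -> bool:
        if not self.validate(new_config):
            self.log.append("rejected: validation failed")
            return False
        # Step 4: keep the old config in a swap copy, then save the new one.
        self.swap_config = self.active_config
        self.active_config = new_config
        # Step 5: attempt a restart with the new configuration.
        if self.restart(self.active_config):
            self.log.append("restarted with new config")
            return True
        # Step 6: the restart failed, so roll back to the swap copy and restart.
        self.active_config = self.swap_config
        self.restart(self.active_config)
        self.log.append("rolled back to old config")
        return False


if __name__ == "__main__":
    # Pretend any config containing the word "broken" fails to start.
    agent = Bootstrap("Processors: [tail]", restart=lambda cfg: "broken" not in cfg)
    print(agent.apply_update("Processors: [tail, route]"), agent.log[-1])
    print(agent.apply_update("Processors: [broken]"), agent.log[-1])
```

Keeping the previous configuration in a swap copy before restarting is what makes the rollback in step 6 possible without any outside help, which matters for an agent running unattended at the edge.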

Editor's notes

  1. Understand what MiNiFi is and the basics. Go over the features of MiNiFi 0.0.1. Basic IoT demo. Plans.
  2. The Apache NiFi project as a whole (including MiNiFi) is all about routing: getting the right data to the right place.
  3. In order to know why MiNiFi was started as a sub-project of NiFi, you first need to know what NiFi is. Quick refresher on Apache NiFi.
  4. Let me get the key parts of NiFi close to where data begins and provide bidirectional communication. NiFi lives in the data center. Give it an enterprise server or a cluster of them. MiNiFi lives close to where data is born and may be a guest on that device or system.
  5. Aggregator vs. Agent
  6. The Apache NiFi project as a whole (including MiNiFi) is all about routing: getting the right data to the right place.
  7. NiFi 0.7.0 is ~600 MB, but most of that is UI and components. Framework: we put a new wrapper on the framework; in Maven terms, we kept the underlying modules and wrote minifi-framework-core to replace nifi-framework-core. MiNiFi packaged components: ~20 MB.
  8. Initiates with ./bin/nifi.sh start
  9. As a user, you only need the bootstrap and config.yml; nifi.properties and flow.xml are implementation details.
  10. Since it uses the same underlying framework, MiNiFi is extensible exactly like NiFi. NiFi 0.7.0 has 155 different processors to choose from: pub/sub communication (e.g. Kafka, MQTT), endpoint delivery (e.g. HDFS, HBase), format validation/transformation (e.g. JSON, XML), Yandex language translation.
  11. Now that we’ve covered the basic architecture of MiNiFi we can talk about 0.0.1
  12. Validate limitations
  13. configured
  14. With the removal of the UI, we needed a way to get the status of the flow. Full usage is in the System Admin Guide. Admittedly not user-friendly at the moment. This sets up for a full interface that gets the bootstrap to report on the flow.
  15. The Apache NiFi project as a whole (including MiNiFi) is all about routing: getting the right data to the right place.
  16. Truck: needs to be notified of high temperature or humidity.
  17. The Apache NiFi project as a whole (including MiNiFi) is all about routing: getting the right data to the right place.
  18. Think NiFi Process Groups but MiNiFi Agents
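
Note 16 describes the demo's truck scenario, where someone needs to be notified of high temperature or humidity. In terms of the "routing by content or attributes" mentioned on slide 17, the decision can be sketched as below. This is a plain Python illustration rather than a NiFi processor; the thresholds, the field names (`temperature_c`, `humidity_pct`), and the `alert`/`normal` route names are assumptions, since the talk gives no concrete values.

```python
from typing import Dict

# Assumed thresholds for illustration; the talk does not state actual limits.
MAX_TEMP_C = 30.0
MAX_HUMIDITY_PCT = 70.0


def route_reading(reading: Dict[str, float]) -> str:
    """Return the (hypothetical) route name for one sensor reading from a truck."""
    too_hot = reading.get("temperature_c", 0.0) > MAX_TEMP_C
    too_humid = reading.get("humidity_pct", 0.0) > MAX_HUMIDITY_PCT
    # Either condition alone is enough to raise an alert.
    return "alert" if (too_hot or too_humid) else "normal"


if __name__ == "__main__":
    print(route_reading({"temperature_c": 35.0, "humidity_pct": 40.0}))
    print(route_reading({"temperature_c": 21.0, "humidity_pct": 45.0}))
```

Making this decision on the truck itself means only the readings that matter need to be sent upstream urgently, while routine readings can be batched or deprioritized.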