SlideShare a Scribd company logo
1 of 29
Download to read offline
Hortonworks DataFlow
Real-time Data Flow powered by Apache NiFi
© Hortonworks Inc. 2011 – 2015. All Rights Reserved
Page 2 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Simplistic View of Enterprise Data Flow
The Data Flow Thing
Process and
Analyze Data
Acquire Data
Store Data
Page 3 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Realty of Point to Point Connections
For every connection, these must agree:
1.  Protocol
2.  Format
3.  Schema
4.  Priority
5.  Size of event
6.  Frequency of event
7.  Authorization access
8.  Relevance
9.  Security
P1
Producer
C1
Consumer
Page 4 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Different organizations/business units across different geographic locations…
Realistic View of Enterprise Data Flow
Page 5 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Conducting business in different legal and network domains…
Realistic View of Enterprise Data Flow
Page 6 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Operating on very different infrastructure (power, space, cooling)…
Realistic View of Enterprise Data Flow
Page 7 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Capable of different volume, velocity, bandwidth, and latency…
Realistic View of Enterprise Data Flow
Page 8 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Interacting with different business partners and customers
Realistic View of Enterprise Data Flow
Page 9 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
The Need for Data Provenance
For Operators
•  Traceability, lineage
•  Recovery and replay
For Compliance
•  Audit trail
•  Remediation
For Business
•  Value sources
•  Value IT investment
BEGIN
END
LINEAGE
Page 10 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
The Need for Fine-grained Security and Compliance
It’s not enough to say you have
encrypted communications
•  Enterprise authorization
services –entitlements
change often
•  People and systems with
different roles require
difference access levels
•  Tagged/classified data
Page 11 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Challenges at the Jagged Edge
•  Small footprint
•  Low power
•  Expensive bandwidth
•  High latency
•  Access to data exceeds
bandwidth (if you're doing
it right)
•  Needs recoverability
•  Distributed Agent Model
•  Needs to be secured for
both the data plane and
control plane
GATHER
DELIVER
PRIORITIZE
Track from the edge Through the datacenter
Page 12 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Real-time Data Flow
It’s not just how quickly you
move data – it’s about how
quickly you can change behavior
and seize new opportunities
Page 13 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Do you think organizations struggle with dataflow management?
Realistic View of Enterprise Data Flow
?
?
?
?
?
?
?
Page 14 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
HDF Powered by Apache NiFi Addresses Modern Data
Flow Challenges
Aggregate all IoAT data from sensors, geo-location devices,
machines, logs, files, and feeds via a highly secure lightweight agent
Collect: Bring Together•  Logs
•  Files
•  Feeds
•  Sensors
Mediate point-to-point and bi-directional data flows, delivering data
reliably to real-time applications and storage platforms such as HDP
Conduct: Mediate the Data Flow•  Deliver
•  Secure
•  Govern
•  Audit
Parse, filter, join, transform, fork, and clone data in motion to
empower analytics and perishable insights
Curate: Gain Insights•  Parse
•  Filter
•  Transform
•  Fork
•  Clone
Page 15 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Developed by the NSA
over the last 8 years.
"NSA's innovators work
on some of the most
challenging national
security problems
imaginable,"
"Commercial enterprises
could use it to quickly
control, manage, and
analyze the flow of
information from
geographically dispersed
sites – creating
comprehensive situational
awareness"
-- Linda L. Burger,
Director of the NSA
NiFi Developed by the National Security Agency
Page 16 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Proven Technology, through Onyara Acquisition
Hortonworks with Onyara’s technology
delivers a state-of-the-art solution for
the Internet of Anything
•  Easier, reliable and secure data collection
•  From anywhere and anything
Availability
•  Hortonworks DataFlow subscription available now
•  Hortonworks Data Platform subscription since 2012
Page 17 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Designed In Response to Real World Demands
Visual User Interface
Drag and drop for efficient, agile operations
Immediate Feedback
Start, stop, tune, replay dataflows in real-time
Adaptive to Volume and Bandwidth
Any data, big or small
Provenance Metadata
Governance, compliance & data evaluation
Secure Data Acquisition & Transport
Fine grained encryption for controlled data
sharing
Powered by
Apache NiFi
Page 18 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Apache NiFi – Key Features
•  Guaranteed delivery
•  Data buffering
-  Backpressure
-  Pressure release
•  Prioritized queuing
•  Flow specific QoS
-  Latency vs. throughput
-  Loss tolerance
•  Data provenance
•  Recovery/recording
a rolling log of fine-
grained history
•  Visual command and
control
•  Flow templates
•  Pluggable/multi-role
security
•  Designed for extension
•  Clustering
Page 19 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Key Concepts
NiFi Term FBP Term Description
FlowFile Information Packet Each object moving through the system.
FlowFile Processor Black Box Performs the work, doing some combination of
data acquisition, routing, transformation, or
mediation between systems.
Connection Bounded Buffer The linkage between processors, acting as queues
and allowing various processes to interact at
differing rates.
Flow Controller Scheduler Maintains the knowledge of how processes are
connected, and manages the threads and
allocations thereof which all processes use.
Process Group Subnet A set of processes and their connections, which
can receive and send data via ports. A process
group allows creation of entirely new component
simply by composition of its components.
Page 20 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
OS/Host
JVM
Flow Controller
Web Server
Processor 1 Extension N
FlowFile
Repository
Content
Repository
Provenance
Repository
Local Storage
OS/Host
JVM
Flow Controller
Web Server
Processor 1 Extension N
FlowFile
Repository
Content
Repository
Provenance
Repository
Local Storage
Architecture – Supports Single and Clustered
Implementation OS/Host
JVM
NiFi Cluster Manager – Request Replicator
Web Server
Master
NiFi Cluster
Manager (NCM)
OS/Host
JVM
Flow Controller
Web Server
Processor 1 Extension N
FlowFile
Repository
Content
Repository
Provenance
Repository
Local Storage
Slaves
NiFi Nodes
Page 21 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Cluster Architecture
§  Master
§  Tracks of which Nodes are in the
cluster, their status, and to replicate
requests to modify or observe the
flow
§  Slaves
§  Do the flow processing
§  Manage Local Repositories
§  FlowFile Repository – Metadata of
FlowFile currently active
§  Content Repository – Contents of
FlowFile
§  Provenance Repository – Archive of
provenance events
OS/Host
JVM
Flow Controller
Web Server
Processor 1 Extension N
FlowFile
Repository
Content
Repository
Provenance
Repository
Local Storage
OS/Host
JVM
Flow Controller
Web Server
Processor 1 Extension N
FlowFile
Repository
Content
Repository
Provenance
Repository
Local Storage
OS/Host
JVM
NiFi Cluster Manager – Request Replicator
Web Server
Master
NiFi Cluster
Manager (NCM)
OS/Host
JVM
Flow Controller
Web Server
Processor 1 Extension N
FlowFile
Repository
Content
Repository
Provenance
Repository
Local Storage
Slaves
NiFi Nodes
Page 22 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Security
Administration
Central management and
consistent security
•  NiFi Cluster Manager
Authentication
Authenticate users and systems
•  2-Way SSL support out of the box; additional types coming
Authorization
Provision access to data
•  Pluggable authorization designed to fit any Identity and Access Management (IAM) scheme
•  File-based authority provider out of the box
•  Multi-role
Audit
Maintain a record of data access
•  Detailed logging of all user actions
•  Detailed logging of key system behaviors
•  Data Provenance enables unparalleled tracking from the edge through the Lake
Data Protection
Protect data at rest and in motion
•  Support a variety of SSL/encrypted protocols
•  Tag and utilize tags on data for fine grained access controls
•  Encrypt/decrypt content using pre-shared key mechanisms
Administrator Configure system threads, user
accounts, and flow audit history
Data Flow Manager Manipulate the dataflow
Read Only View the dataflow only
+NiFi Configure system threads, user
accounts, and flow audit history
Proxy Manipulate the dataflow
Provenance Query the provenance
repository and
download content
Page 23 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Operations
•  Dynamic flow changes
-  Push new business rules via REST API
(closed loop)
-  Pull updates periodically from web
services
•  Site-to-site
-  Stay at the ‘flow level’ not suddenly
doing file transfer protocols
•  Reporting tasks (push)
•  Statistics / status (pull)
•  Extensible
•  Optimized user
experience – log hunts
should be the exception
Scale down, up, and out – in
containers and on virtual machines
Page 24 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Typical HDF Sizing Scenarios: Sustained Throughput
For Sustained
Throughput of 50MB/sec
and thousands of events
per second
•  1-2 nodes
•  8+ cores per node
(more is better)
•  6+ disks per node
(SSD or Spinning)
•  2 GB of mem per node
•  1GB bonded NICs
ideally
For Sustained
Throughput of 100MB/
sec and tens of
thousands of events per
second
•  3-4 nodes
•  8+ cores per node
(more is better)
•  6+ disks per node
(SSD or Spinning)
•  2 GB of mem per node
•  1GB bonded NICs
ideally
For Sustained
Throughput of 200MB/
sec and hundreds of
thousands of events per
second
•  5-7 nodes
•  24+ cores per node
(effective cpus)
•  12+ disks per node
(SSD or spinning)
•  4GB of mem per node
•  10GB bonded NICs
For Sustained
Throughput of
400-500MB/sec and
hundreds of thousands
of events per second
•  7-10 nodes
•  24+ cores per node
(effective cpus)
•  12+ disks per node
(SSD or spinning)
•  6GB of mem per node
•  10GB bonded NICs
Page 25 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
FAQs
§  Can I use NiFi for single site deployment
§  Yes. Data movement between machines and services in a site face many of the same challenges as
data movement across sites
§  How does NiFi compare with Apache Flume
§  NiFi supersedes Flume capabilities providing visual interface, dynamic flow changes, information
provenance and advanced transformation capabilities
§  Will Flume be supported by Hortonworks
§  Currently there are no plans to remove Flume from HDP
§  Existing Flume users are encouraged to evaluate NiFi for their usage case
Page 26 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
FAQs
§  Are NiFi and Kafka overlapping in functionality
§  No. NiFi and Kafka are complimentary
§  NiFi focus is data acquisition, data transformation, data routing and data delivery with data
provenance and replay capabilities. Kafka is a pub-sub message bus
§  NiFi can publish and consume data from Kafka
§  NiFi supports data delivery to applications using it’s supported protocols, format and schema. No
application change is required
§  Can I score and act on incoming events using NiFi without leveraging Storm/Spark-
Streaming
§  Yes. NiFi is a simple event processing system. It is ideal for scoring and triggering actions on events
that can be analyzed using the event’s data and metadata.
§  Storm/Spark-Streaming are complex event processing systems. Usage of Storm/Spark-Streaming is
recommended when analysis is required over a window of events.
Page 27 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
FAQs
§  Does NiFi overlap with ETL tools
§  No. ETL focus is bulk extract, bulk data joins, batch data quality from traditional data sources. ETL
tools operate in the hadoop cluster.
§  NiFi focus is streaming and batch data ingest from new (machine data) and traditional data sources.
NiFi operates outside of the hadoop cluster and is often deployed at remote sites.
Page 28 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Labs
https://gist.github.com/abajwa-hw/e884bcc423cfbadf4a1d
Lab 1: https://github.com/abajwa-hw/ambari-nifi-service
Lab 2: https://github.com/abajwa-hw/nifi-network-processor
Page 29 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Thank you

More Related Content

What's hot

Apache NiFi Toronto Meetup
Apache NiFi Toronto MeetupApache NiFi Toronto Meetup
Apache NiFi Toronto MeetupHortonworks
 
Using Spark Streaming and NiFi for the next generation of ETL in the enterprise
Using Spark Streaming and NiFi for the next generation of ETL in the enterpriseUsing Spark Streaming and NiFi for the next generation of ETL in the enterprise
Using Spark Streaming and NiFi for the next generation of ETL in the enterpriseDataWorks Summit
 
Hortonworks Hadoop summit 2011 keynote - eric14
Hortonworks Hadoop summit 2011 keynote - eric14Hortonworks Hadoop summit 2011 keynote - eric14
Hortonworks Hadoop summit 2011 keynote - eric14Hortonworks
 
Attunity Hortonworks Webinar- Sept 22, 2016
Attunity Hortonworks Webinar- Sept 22, 2016Attunity Hortonworks Webinar- Sept 22, 2016
Attunity Hortonworks Webinar- Sept 22, 2016Hortonworks
 
Scaling real time streaming architectures with HDF and Dell EMC Isilon
Scaling real time streaming architectures with HDF and Dell EMC IsilonScaling real time streaming architectures with HDF and Dell EMC Isilon
Scaling real time streaming architectures with HDF and Dell EMC IsilonHortonworks
 
Double Your Hadoop Hardware Performance with SmartSense
Double Your Hadoop Hardware Performance with SmartSenseDouble Your Hadoop Hardware Performance with SmartSense
Double Your Hadoop Hardware Performance with SmartSenseHortonworks
 
Hortonworks technical workshop operations with ambari
Hortonworks technical workshop   operations with ambariHortonworks technical workshop   operations with ambari
Hortonworks technical workshop operations with ambariHortonworks
 
Hortonworks Technical Workshop: HBase For Mission Critical Applications
Hortonworks Technical Workshop: HBase For Mission Critical ApplicationsHortonworks Technical Workshop: HBase For Mission Critical Applications
Hortonworks Technical Workshop: HBase For Mission Critical ApplicationsHortonworks
 
Apache Hive 2.0; SQL, Speed, Scale
Apache Hive 2.0; SQL, Speed, ScaleApache Hive 2.0; SQL, Speed, Scale
Apache Hive 2.0; SQL, Speed, ScaleHortonworks
 
Hortonworks Data In Motion Series Part 3 - HDF Ambari
Hortonworks Data In Motion Series Part 3 - HDF Ambari Hortonworks Data In Motion Series Part 3 - HDF Ambari
Hortonworks Data In Motion Series Part 3 - HDF Ambari Hortonworks
 
What is New in Apache Hive 3.0?
What is New in Apache Hive 3.0?What is New in Apache Hive 3.0?
What is New in Apache Hive 3.0?DataWorks Summit
 
Using Spark Streaming and NiFi for the Next Generation of ETL in the Enterprise
Using Spark Streaming and NiFi for the Next Generation of ETL in the EnterpriseUsing Spark Streaming and NiFi for the Next Generation of ETL in the Enterprise
Using Spark Streaming and NiFi for the Next Generation of ETL in the EnterpriseDataWorks Summit
 
What is new in Apache Hive 3.0?
What is new in Apache Hive 3.0?What is new in Apache Hive 3.0?
What is new in Apache Hive 3.0?DataWorks Summit
 
Meet HBase 2.0 and Phoenix-5.0
Meet HBase 2.0 and Phoenix-5.0Meet HBase 2.0 and Phoenix-5.0
Meet HBase 2.0 and Phoenix-5.0DataWorks Summit
 
Hive ACID Apache BigData 2016
Hive ACID Apache BigData 2016Hive ACID Apache BigData 2016
Hive ACID Apache BigData 2016alanfgates
 
Hortonworks tech workshop in-memory processing with spark
Hortonworks tech workshop   in-memory processing with sparkHortonworks tech workshop   in-memory processing with spark
Hortonworks tech workshop in-memory processing with sparkHortonworks
 
Integrating NiFi and Flink
Integrating NiFi and FlinkIntegrating NiFi and Flink
Integrating NiFi and FlinkBryan Bende
 

What's hot (20)

Apache NiFi in the Hadoop Ecosystem
Apache NiFi in the Hadoop Ecosystem Apache NiFi in the Hadoop Ecosystem
Apache NiFi in the Hadoop Ecosystem
 
Apache NiFi Toronto Meetup
Apache NiFi Toronto MeetupApache NiFi Toronto Meetup
Apache NiFi Toronto Meetup
 
Using Spark Streaming and NiFi for the next generation of ETL in the enterprise
Using Spark Streaming and NiFi for the next generation of ETL in the enterpriseUsing Spark Streaming and NiFi for the next generation of ETL in the enterprise
Using Spark Streaming and NiFi for the next generation of ETL in the enterprise
 
Apache NiFi 1.0 in Nutshell
Apache NiFi 1.0 in NutshellApache NiFi 1.0 in Nutshell
Apache NiFi 1.0 in Nutshell
 
Hortonworks Hadoop summit 2011 keynote - eric14
Hortonworks Hadoop summit 2011 keynote - eric14Hortonworks Hadoop summit 2011 keynote - eric14
Hortonworks Hadoop summit 2011 keynote - eric14
 
Attunity Hortonworks Webinar- Sept 22, 2016
Attunity Hortonworks Webinar- Sept 22, 2016Attunity Hortonworks Webinar- Sept 22, 2016
Attunity Hortonworks Webinar- Sept 22, 2016
 
Scaling real time streaming architectures with HDF and Dell EMC Isilon
Scaling real time streaming architectures with HDF and Dell EMC IsilonScaling real time streaming architectures with HDF and Dell EMC Isilon
Scaling real time streaming architectures with HDF and Dell EMC Isilon
 
Double Your Hadoop Hardware Performance with SmartSense
Double Your Hadoop Hardware Performance with SmartSenseDouble Your Hadoop Hardware Performance with SmartSense
Double Your Hadoop Hardware Performance with SmartSense
 
Hortonworks technical workshop operations with ambari
Hortonworks technical workshop   operations with ambariHortonworks technical workshop   operations with ambari
Hortonworks technical workshop operations with ambari
 
Hortonworks Technical Workshop: HBase For Mission Critical Applications
Hortonworks Technical Workshop: HBase For Mission Critical ApplicationsHortonworks Technical Workshop: HBase For Mission Critical Applications
Hortonworks Technical Workshop: HBase For Mission Critical Applications
 
Apache Hive 2.0; SQL, Speed, Scale
Apache Hive 2.0; SQL, Speed, ScaleApache Hive 2.0; SQL, Speed, Scale
Apache Hive 2.0; SQL, Speed, Scale
 
The Avant-garde of Apache NiFi
The Avant-garde of Apache NiFiThe Avant-garde of Apache NiFi
The Avant-garde of Apache NiFi
 
Hortonworks Data In Motion Series Part 3 - HDF Ambari
Hortonworks Data In Motion Series Part 3 - HDF Ambari Hortonworks Data In Motion Series Part 3 - HDF Ambari
Hortonworks Data In Motion Series Part 3 - HDF Ambari
 
What is New in Apache Hive 3.0?
What is New in Apache Hive 3.0?What is New in Apache Hive 3.0?
What is New in Apache Hive 3.0?
 
Using Spark Streaming and NiFi for the Next Generation of ETL in the Enterprise
Using Spark Streaming and NiFi for the Next Generation of ETL in the EnterpriseUsing Spark Streaming and NiFi for the Next Generation of ETL in the Enterprise
Using Spark Streaming and NiFi for the Next Generation of ETL in the Enterprise
 
What is new in Apache Hive 3.0?
What is new in Apache Hive 3.0?What is new in Apache Hive 3.0?
What is new in Apache Hive 3.0?
 
Meet HBase 2.0 and Phoenix-5.0
Meet HBase 2.0 and Phoenix-5.0Meet HBase 2.0 and Phoenix-5.0
Meet HBase 2.0 and Phoenix-5.0
 
Hive ACID Apache BigData 2016
Hive ACID Apache BigData 2016Hive ACID Apache BigData 2016
Hive ACID Apache BigData 2016
 
Hortonworks tech workshop in-memory processing with spark
Hortonworks tech workshop   in-memory processing with sparkHortonworks tech workshop   in-memory processing with spark
Hortonworks tech workshop in-memory processing with spark
 
Integrating NiFi and Flink
Integrating NiFi and FlinkIntegrating NiFi and Flink
Integrating NiFi and Flink
 

Viewers also liked

Apache NiFi- MiNiFi meetup Slides
Apache NiFi- MiNiFi meetup SlidesApache NiFi- MiNiFi meetup Slides
Apache NiFi- MiNiFi meetup SlidesIsheeta Sanghi
 
Design a Dataflow in 7 minutes with Apache NiFi/HDF
Design a Dataflow in 7 minutes with Apache NiFi/HDFDesign a Dataflow in 7 minutes with Apache NiFi/HDF
Design a Dataflow in 7 minutes with Apache NiFi/HDFHortonworks
 
Real-Time Data Flows with Apache NiFi
Real-Time Data Flows with Apache NiFiReal-Time Data Flows with Apache NiFi
Real-Time Data Flows with Apache NiFiManish Gupta
 
Hortonworks Data in Motion Webinar Series Part 7 Apache Kafka Nifi Better Tog...
Hortonworks Data in Motion Webinar Series Part 7 Apache Kafka Nifi Better Tog...Hortonworks Data in Motion Webinar Series Part 7 Apache Kafka Nifi Better Tog...
Hortonworks Data in Motion Webinar Series Part 7 Apache Kafka Nifi Better Tog...Hortonworks
 
Hortonworks DataFlow & Apache Nifi @Oslo Hadoop Big Data
Hortonworks DataFlow & Apache Nifi @Oslo Hadoop Big DataHortonworks DataFlow & Apache Nifi @Oslo Hadoop Big Data
Hortonworks DataFlow & Apache Nifi @Oslo Hadoop Big DataMats Johansson
 
Integrating Apache NiFi and Apache Flink
Integrating Apache NiFi and Apache FlinkIntegrating Apache NiFi and Apache Flink
Integrating Apache NiFi and Apache FlinkHortonworks
 
Dear NSA, let me take care of your slides.
Dear NSA, let me take care of your slides.Dear NSA, let me take care of your slides.
Dear NSA, let me take care of your slides.Emiland
 
Beyond Messaging Enterprise Dataflow powered by Apache NiFi
Beyond Messaging Enterprise Dataflow powered by Apache NiFiBeyond Messaging Enterprise Dataflow powered by Apache NiFi
Beyond Messaging Enterprise Dataflow powered by Apache NiFiIsheeta Sanghi
 
HDF Powered by Apache NiFi Introduction
HDF Powered by Apache NiFi IntroductionHDF Powered by Apache NiFi Introduction
HDF Powered by Apache NiFi IntroductionMilind Pandit
 
Apache NiFi in the Hadoop Ecosystem
Apache NiFi in the Hadoop EcosystemApache NiFi in the Hadoop Ecosystem
Apache NiFi in the Hadoop EcosystemBryan Bende
 
Building a data pipeline to ingest data into Hadoop in minutes using Streamse...
Building a data pipeline to ingest data into Hadoop in minutes using Streamse...Building a data pipeline to ingest data into Hadoop in minutes using Streamse...
Building a data pipeline to ingest data into Hadoop in minutes using Streamse...Guglielmo Iozzia
 
Hadoop Powers Modern Enterprise Data Architectures
Hadoop Powers Modern Enterprise Data ArchitecturesHadoop Powers Modern Enterprise Data Architectures
Hadoop Powers Modern Enterprise Data ArchitecturesDataWorks Summit
 
Hadoop and Hive in Enterprises
Hadoop and Hive in EnterprisesHadoop and Hive in Enterprises
Hadoop and Hive in Enterprisesmarkgrover
 
AWS User Group Meetup Berlin - Kay Lerch on Apache NiFi (2016-04-19)
AWS User Group Meetup Berlin - Kay Lerch on Apache NiFi (2016-04-19)AWS User Group Meetup Berlin - Kay Lerch on Apache NiFi (2016-04-19)
AWS User Group Meetup Berlin - Kay Lerch on Apache NiFi (2016-04-19)Kay Lerch
 
Introduction to Apache NiFi - Seattle Scalability Meetup
Introduction to Apache NiFi - Seattle Scalability MeetupIntroduction to Apache NiFi - Seattle Scalability Meetup
Introduction to Apache NiFi - Seattle Scalability MeetupSaptak Sen
 
NJ Hadoop Meetup - Apache NiFi Deep Dive
NJ Hadoop Meetup - Apache NiFi Deep DiveNJ Hadoop Meetup - Apache NiFi Deep Dive
NJ Hadoop Meetup - Apache NiFi Deep DiveBryan Bende
 

Viewers also liked (20)

Apache NiFi- MiNiFi meetup Slides
Apache NiFi- MiNiFi meetup SlidesApache NiFi- MiNiFi meetup Slides
Apache NiFi- MiNiFi meetup Slides
 
Design a Dataflow in 7 minutes with Apache NiFi/HDF
Design a Dataflow in 7 minutes with Apache NiFi/HDFDesign a Dataflow in 7 minutes with Apache NiFi/HDF
Design a Dataflow in 7 minutes with Apache NiFi/HDF
 
Real-Time Data Flows with Apache NiFi
Real-Time Data Flows with Apache NiFiReal-Time Data Flows with Apache NiFi
Real-Time Data Flows with Apache NiFi
 
Hortonworks Data in Motion Webinar Series Part 7 Apache Kafka Nifi Better Tog...
Hortonworks Data in Motion Webinar Series Part 7 Apache Kafka Nifi Better Tog...Hortonworks Data in Motion Webinar Series Part 7 Apache Kafka Nifi Better Tog...
Hortonworks Data in Motion Webinar Series Part 7 Apache Kafka Nifi Better Tog...
 
Admiral Group
Admiral GroupAdmiral Group
Admiral Group
 
Hortonworks DataFlow & Apache Nifi @Oslo Hadoop Big Data
Hortonworks DataFlow & Apache Nifi @Oslo Hadoop Big DataHortonworks DataFlow & Apache Nifi @Oslo Hadoop Big Data
Hortonworks DataFlow & Apache Nifi @Oslo Hadoop Big Data
 
Dataflow with Apache NiFi - Crash Course - HS16SJ
Dataflow with Apache NiFi - Crash Course - HS16SJDataflow with Apache NiFi - Crash Course - HS16SJ
Dataflow with Apache NiFi - Crash Course - HS16SJ
 
Integrating Apache NiFi and Apache Flink
Integrating Apache NiFi and Apache FlinkIntegrating Apache NiFi and Apache Flink
Integrating Apache NiFi and Apache Flink
 
From Zero to Data Flow in Hours with Apache NiFi
From Zero to Data Flow in Hours with Apache NiFiFrom Zero to Data Flow in Hours with Apache NiFi
From Zero to Data Flow in Hours with Apache NiFi
 
Dear NSA, let me take care of your slides.
Dear NSA, let me take care of your slides.Dear NSA, let me take care of your slides.
Dear NSA, let me take care of your slides.
 
Beyond Messaging Enterprise Dataflow powered by Apache NiFi
Beyond Messaging Enterprise Dataflow powered by Apache NiFiBeyond Messaging Enterprise Dataflow powered by Apache NiFi
Beyond Messaging Enterprise Dataflow powered by Apache NiFi
 
Hemodiafiltration
Hemodiafiltration Hemodiafiltration
Hemodiafiltration
 
HDF Powered by Apache NiFi Introduction
HDF Powered by Apache NiFi IntroductionHDF Powered by Apache NiFi Introduction
HDF Powered by Apache NiFi Introduction
 
Apache NiFi in the Hadoop Ecosystem
Apache NiFi in the Hadoop EcosystemApache NiFi in the Hadoop Ecosystem
Apache NiFi in the Hadoop Ecosystem
 
Building a data pipeline to ingest data into Hadoop in minutes using Streamse...
Building a data pipeline to ingest data into Hadoop in minutes using Streamse...Building a data pipeline to ingest data into Hadoop in minutes using Streamse...
Building a data pipeline to ingest data into Hadoop in minutes using Streamse...
 
Hadoop Powers Modern Enterprise Data Architectures
Hadoop Powers Modern Enterprise Data ArchitecturesHadoop Powers Modern Enterprise Data Architectures
Hadoop Powers Modern Enterprise Data Architectures
 
Hadoop and Hive in Enterprises
Hadoop and Hive in EnterprisesHadoop and Hive in Enterprises
Hadoop and Hive in Enterprises
 
AWS User Group Meetup Berlin - Kay Lerch on Apache NiFi (2016-04-19)
AWS User Group Meetup Berlin - Kay Lerch on Apache NiFi (2016-04-19)AWS User Group Meetup Berlin - Kay Lerch on Apache NiFi (2016-04-19)
AWS User Group Meetup Berlin - Kay Lerch on Apache NiFi (2016-04-19)
 
Introduction to Apache NiFi - Seattle Scalability Meetup
Introduction to Apache NiFi - Seattle Scalability MeetupIntroduction to Apache NiFi - Seattle Scalability Meetup
Introduction to Apache NiFi - Seattle Scalability Meetup
 
NJ Hadoop Meetup - Apache NiFi Deep Dive
NJ Hadoop Meetup - Apache NiFi Deep DiveNJ Hadoop Meetup - Apache NiFi Deep Dive
NJ Hadoop Meetup - Apache NiFi Deep Dive
 

Similar to HDF: Hortonworks DataFlow: Technical Workshop

BigData Techcon - Beyond Messaging with Apache NiFi
BigData Techcon - Beyond Messaging with Apache NiFiBigData Techcon - Beyond Messaging with Apache NiFi
BigData Techcon - Beyond Messaging with Apache NiFiAldrin Piri
 
Big Data Day LA 2016/ Big Data Track - Building scalable enterprise data flow...
Big Data Day LA 2016/ Big Data Track - Building scalable enterprise data flow...Big Data Day LA 2016/ Big Data Track - Building scalable enterprise data flow...
Big Data Day LA 2016/ Big Data Track - Building scalable enterprise data flow...Data Con LA
 
Apache NiFi - Flow Based Programming Meetup
Apache NiFi - Flow Based Programming MeetupApache NiFi - Flow Based Programming Meetup
Apache NiFi - Flow Based Programming MeetupJoseph Witt
 
Harnessing Data-in-Motion with HDF 2.0, introduction to Apache NIFI/MINIFI
Harnessing Data-in-Motion with HDF 2.0, introduction to Apache NIFI/MINIFIHarnessing Data-in-Motion with HDF 2.0, introduction to Apache NIFI/MINIFI
Harnessing Data-in-Motion with HDF 2.0, introduction to Apache NIFI/MINIFIHaimo Liu
 
[253] apache ni fi
[253] apache ni fi[253] apache ni fi
[253] apache ni fiNAVER D2
 
Integração de Dados com Apache NIFI - Marco Garcia Cetax
Integração de Dados com Apache NIFI - Marco Garcia CetaxIntegração de Dados com Apache NIFI - Marco Garcia Cetax
Integração de Dados com Apache NIFI - Marco Garcia CetaxMarco Garcia
 
Curing the Kafka blindness—Streams Messaging Manager
Curing the Kafka blindness—Streams Messaging ManagerCuring the Kafka blindness—Streams Messaging Manager
Curing the Kafka blindness—Streams Messaging ManagerDataWorks Summit
 
Data on the Move - DataCon DC
Data on the Move - DataCon DCData on the Move - DataCon DC
Data on the Move - DataCon DCJoseph Witt
 
Future of Data New Jersey - HDF 3.0 Deep Dive
Future of Data New Jersey - HDF 3.0 Deep DiveFuture of Data New Jersey - HDF 3.0 Deep Dive
Future of Data New Jersey - HDF 3.0 Deep DiveAldrin Piri
 
Data Con LA 2018 - Streaming and IoT by Pat Alwell
Data Con LA 2018 - Streaming and IoT by Pat AlwellData Con LA 2018 - Streaming and IoT by Pat Alwell
Data Con LA 2018 - Streaming and IoT by Pat AlwellData Con LA
 
Make Streaming Analytics work for you: The Devil is in the Details
Make Streaming Analytics work for you: The Devil is in the DetailsMake Streaming Analytics work for you: The Devil is in the Details
Make Streaming Analytics work for you: The Devil is in the DetailsDataWorks Summit/Hadoop Summit
 
Make Streaming IoT Analytics Work for You
Make Streaming IoT Analytics Work for YouMake Streaming IoT Analytics Work for You
Make Streaming IoT Analytics Work for YouHortonworks
 
Data at Scales and the Values of Starting Small with Apache NiFi & MiNiFi
Data at Scales and the Values of Starting Small with Apache NiFi & MiNiFiData at Scales and the Values of Starting Small with Apache NiFi & MiNiFi
Data at Scales and the Values of Starting Small with Apache NiFi & MiNiFiAldrin Piri
 
Design Patterns For Real Time Streaming Data Analytics
Design Patterns For Real Time Streaming Data AnalyticsDesign Patterns For Real Time Streaming Data Analytics
Design Patterns For Real Time Streaming Data AnalyticsDataWorks Summit
 
Design Patterns For Real Time Streaming Data Analytics
Design Patterns For Real Time Streaming Data AnalyticsDesign Patterns For Real Time Streaming Data Analytics
Design Patterns For Real Time Streaming Data AnalyticsDataWorks Summit
 
State of the Apache NiFi Ecosystem & Community
State of the Apache NiFi Ecosystem & CommunityState of the Apache NiFi Ecosystem & Community
State of the Apache NiFi Ecosystem & CommunityAccumulo Summit
 
Data Governance in Apache Falcon - Hadoop Summit Brussels 2015
Data Governance in Apache Falcon - Hadoop Summit Brussels 2015 Data Governance in Apache Falcon - Hadoop Summit Brussels 2015
Data Governance in Apache Falcon - Hadoop Summit Brussels 2015 Seetharam Venkatesh
 
Driving Enterprise Data Governance for Big Data Systems through Apache Falcon
Driving Enterprise Data Governance for Big Data Systems through Apache FalconDriving Enterprise Data Governance for Big Data Systems through Apache Falcon
Driving Enterprise Data Governance for Big Data Systems through Apache FalconDataWorks Summit
 

Similar to HDF: Hortonworks DataFlow: Technical Workshop (20)

BigData Techcon - Beyond Messaging with Apache NiFi
BigData Techcon - Beyond Messaging with Apache NiFiBigData Techcon - Beyond Messaging with Apache NiFi
BigData Techcon - Beyond Messaging with Apache NiFi
 
Big Data Day LA 2016/ Big Data Track - Building scalable enterprise data flow...
Big Data Day LA 2016/ Big Data Track - Building scalable enterprise data flow...Big Data Day LA 2016/ Big Data Track - Building scalable enterprise data flow...
Big Data Day LA 2016/ Big Data Track - Building scalable enterprise data flow...
 
Apache NiFi - Flow Based Programming Meetup
Apache NiFi - Flow Based Programming MeetupApache NiFi - Flow Based Programming Meetup
Apache NiFi - Flow Based Programming Meetup
 
Harnessing Data-in-Motion with HDF 2.0, introduction to Apache NIFI/MINIFI
Harnessing Data-in-Motion with HDF 2.0, introduction to Apache NIFI/MINIFIHarnessing Data-in-Motion with HDF 2.0, introduction to Apache NIFI/MINIFI
Harnessing Data-in-Motion with HDF 2.0, introduction to Apache NIFI/MINIFI
 
[253] apache ni fi
[253] apache ni fi[253] apache ni fi
[253] apache ni fi
 
Integração de Dados com Apache NIFI - Marco Garcia Cetax
Integração de Dados com Apache NIFI - Marco Garcia CetaxIntegração de Dados com Apache NIFI - Marco Garcia Cetax
Integração de Dados com Apache NIFI - Marco Garcia Cetax
 
Curing the Kafka blindness—Streams Messaging Manager
Curing the Kafka blindness—Streams Messaging ManagerCuring the Kafka blindness—Streams Messaging Manager
Curing the Kafka blindness—Streams Messaging Manager
 
Data on the Move - DataCon DC
Data on the Move - DataCon DCData on the Move - DataCon DC
Data on the Move - DataCon DC
 
Future of Data New Jersey - HDF 3.0 Deep Dive
Future of Data New Jersey - HDF 3.0 Deep DiveFuture of Data New Jersey - HDF 3.0 Deep Dive
Future of Data New Jersey - HDF 3.0 Deep Dive
 
Joseph Witt
Joseph WittJoseph Witt
Joseph Witt
 
Data Con LA 2018 - Streaming and IoT by Pat Alwell
Data Con LA 2018 - Streaming and IoT by Pat AlwellData Con LA 2018 - Streaming and IoT by Pat Alwell
Data Con LA 2018 - Streaming and IoT by Pat Alwell
 
Make Streaming Analytics work for you: The Devil is in the Details
Make Streaming Analytics work for you: The Devil is in the DetailsMake Streaming Analytics work for you: The Devil is in the Details
Make Streaming Analytics work for you: The Devil is in the Details
 
Make Streaming IoT Analytics Work for You
Make Streaming IoT Analytics Work for YouMake Streaming IoT Analytics Work for You
Make Streaming IoT Analytics Work for You
 
Data at Scales and the Values of Starting Small with Apache NiFi & MiNiFi
Data at Scales and the Values of Starting Small with Apache NiFi & MiNiFiData at Scales and the Values of Starting Small with Apache NiFi & MiNiFi
Data at Scales and the Values of Starting Small with Apache NiFi & MiNiFi
 
Design Patterns For Real Time Streaming Data Analytics
Design Patterns For Real Time Streaming Data AnalyticsDesign Patterns For Real Time Streaming Data Analytics
Design Patterns For Real Time Streaming Data Analytics
 
Design Patterns For Real Time Streaming Data Analytics
Design Patterns For Real Time Streaming Data AnalyticsDesign Patterns For Real Time Streaming Data Analytics
Design Patterns For Real Time Streaming Data Analytics
 
Hadoop Summit Tokyo Apache NiFi Crash Course
Hadoop Summit Tokyo Apache NiFi Crash CourseHadoop Summit Tokyo Apache NiFi Crash Course
Hadoop Summit Tokyo Apache NiFi Crash Course
 
State of the Apache NiFi Ecosystem & Community
State of the Apache NiFi Ecosystem & CommunityState of the Apache NiFi Ecosystem & Community
State of the Apache NiFi Ecosystem & Community
 
Data Governance in Apache Falcon - Hadoop Summit Brussels 2015
Data Governance in Apache Falcon - Hadoop Summit Brussels 2015 Data Governance in Apache Falcon - Hadoop Summit Brussels 2015
Data Governance in Apache Falcon - Hadoop Summit Brussels 2015
 
Driving Enterprise Data Governance for Big Data Systems through Apache Falcon
Driving Enterprise Data Governance for Big Data Systems through Apache FalconDriving Enterprise Data Governance for Big Data Systems through Apache Falcon
Driving Enterprise Data Governance for Big Data Systems through Apache Falcon
 

More from Hortonworks

Hortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next Level
Hortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next LevelHortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next Level
Hortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next LevelHortonworks
 
IoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT Strategy
IoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT StrategyIoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT Strategy
IoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT StrategyHortonworks
 
Getting the Most Out of Your Data in the Cloud with Cloudbreak
Getting the Most Out of Your Data in the Cloud with CloudbreakGetting the Most Out of Your Data in the Cloud with Cloudbreak
Getting the Most Out of Your Data in the Cloud with CloudbreakHortonworks
 
Johns Hopkins - Using Hadoop to Secure Access Log Events
Johns Hopkins - Using Hadoop to Secure Access Log EventsJohns Hopkins - Using Hadoop to Secure Access Log Events
Johns Hopkins - Using Hadoop to Secure Access Log EventsHortonworks
 
Catch a Hacker in Real-Time: Live Visuals of Bots and Bad Guys
Catch a Hacker in Real-Time: Live Visuals of Bots and Bad GuysCatch a Hacker in Real-Time: Live Visuals of Bots and Bad Guys
Catch a Hacker in Real-Time: Live Visuals of Bots and Bad GuysHortonworks
 
HDF 3.2 - What's New
HDF 3.2 - What's NewHDF 3.2 - What's New
HDF 3.2 - What's NewHortonworks
 
Curing Kafka Blindness with Hortonworks Streams Messaging Manager
Curing Kafka Blindness with Hortonworks Streams Messaging ManagerCuring Kafka Blindness with Hortonworks Streams Messaging Manager
Curing Kafka Blindness with Hortonworks Streams Messaging ManagerHortonworks
 
Interpretation Tool for Genomic Sequencing Data in Clinical Environments
Interpretation Tool for Genomic Sequencing Data in Clinical EnvironmentsInterpretation Tool for Genomic Sequencing Data in Clinical Environments
Interpretation Tool for Genomic Sequencing Data in Clinical EnvironmentsHortonworks
 
IBM+Hortonworks = Transformation of the Big Data Landscape
IBM+Hortonworks = Transformation of the Big Data LandscapeIBM+Hortonworks = Transformation of the Big Data Landscape
IBM+Hortonworks = Transformation of the Big Data LandscapeHortonworks
 
Premier Inside-Out: Apache Druid
Premier Inside-Out: Apache DruidPremier Inside-Out: Apache Druid
Premier Inside-Out: Apache DruidHortonworks
 
Accelerating Data Science and Real Time Analytics at Scale
Accelerating Data Science and Real Time Analytics at ScaleAccelerating Data Science and Real Time Analytics at Scale
Accelerating Data Science and Real Time Analytics at ScaleHortonworks
 
TIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATA
TIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATATIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATA
TIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATAHortonworks
 
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...Hortonworks
 
Delivering Real-Time Streaming Data for Healthcare Customers: Clearsense
Delivering Real-Time Streaming Data for Healthcare Customers: ClearsenseDelivering Real-Time Streaming Data for Healthcare Customers: Clearsense
Delivering Real-Time Streaming Data for Healthcare Customers: ClearsenseHortonworks
 
Making Enterprise Big Data Small with Ease
Making Enterprise Big Data Small with EaseMaking Enterprise Big Data Small with Ease
Making Enterprise Big Data Small with EaseHortonworks
 
Webinewbie to Webinerd in 30 Days - Webinar World Presentation
Webinewbie to Webinerd in 30 Days - Webinar World PresentationWebinewbie to Webinerd in 30 Days - Webinar World Presentation
Webinewbie to Webinerd in 30 Days - Webinar World PresentationHortonworks
 
Driving Digital Transformation Through Global Data Management
Driving Digital Transformation Through Global Data ManagementDriving Digital Transformation Through Global Data Management
Driving Digital Transformation Through Global Data ManagementHortonworks
 
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming Features
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming FeaturesHDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming Features
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming FeaturesHortonworks
 
Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...
Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...
Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...Hortonworks
 
Unlock Value from Big Data with Apache NiFi and Streaming CDC
Unlock Value from Big Data with Apache NiFi and Streaming CDCUnlock Value from Big Data with Apache NiFi and Streaming CDC
Unlock Value from Big Data with Apache NiFi and Streaming CDCHortonworks
 

More from Hortonworks (20)

Hortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next Level
Hortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next LevelHortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next Level
Hortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next Level
 
IoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT Strategy
IoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT StrategyIoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT Strategy
IoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT Strategy
 
Getting the Most Out of Your Data in the Cloud with Cloudbreak
Getting the Most Out of Your Data in the Cloud with CloudbreakGetting the Most Out of Your Data in the Cloud with Cloudbreak
Getting the Most Out of Your Data in the Cloud with Cloudbreak
 
Johns Hopkins - Using Hadoop to Secure Access Log Events
Johns Hopkins - Using Hadoop to Secure Access Log EventsJohns Hopkins - Using Hadoop to Secure Access Log Events
Johns Hopkins - Using Hadoop to Secure Access Log Events
 
Catch a Hacker in Real-Time: Live Visuals of Bots and Bad Guys
Catch a Hacker in Real-Time: Live Visuals of Bots and Bad GuysCatch a Hacker in Real-Time: Live Visuals of Bots and Bad Guys
Catch a Hacker in Real-Time: Live Visuals of Bots and Bad Guys
 
HDF 3.2 - What's New
HDF 3.2 - What's NewHDF 3.2 - What's New
HDF 3.2 - What's New
 
Curing Kafka Blindness with Hortonworks Streams Messaging Manager
Curing Kafka Blindness with Hortonworks Streams Messaging ManagerCuring Kafka Blindness with Hortonworks Streams Messaging Manager
Curing Kafka Blindness with Hortonworks Streams Messaging Manager
 
Interpretation Tool for Genomic Sequencing Data in Clinical Environments
Interpretation Tool for Genomic Sequencing Data in Clinical EnvironmentsInterpretation Tool for Genomic Sequencing Data in Clinical Environments
Interpretation Tool for Genomic Sequencing Data in Clinical Environments
 
IBM+Hortonworks = Transformation of the Big Data Landscape
IBM+Hortonworks = Transformation of the Big Data LandscapeIBM+Hortonworks = Transformation of the Big Data Landscape
IBM+Hortonworks = Transformation of the Big Data Landscape
 
Premier Inside-Out: Apache Druid
Premier Inside-Out: Apache DruidPremier Inside-Out: Apache Druid
Premier Inside-Out: Apache Druid
 
Accelerating Data Science and Real Time Analytics at Scale
Accelerating Data Science and Real Time Analytics at ScaleAccelerating Data Science and Real Time Analytics at Scale
Accelerating Data Science and Real Time Analytics at Scale
 
TIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATA
TIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATATIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATA
TIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATA
 
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...
 
Delivering Real-Time Streaming Data for Healthcare Customers: Clearsense
Delivering Real-Time Streaming Data for Healthcare Customers: ClearsenseDelivering Real-Time Streaming Data for Healthcare Customers: Clearsense
Delivering Real-Time Streaming Data for Healthcare Customers: Clearsense
 
Making Enterprise Big Data Small with Ease
Making Enterprise Big Data Small with EaseMaking Enterprise Big Data Small with Ease
Making Enterprise Big Data Small with Ease
 
Webinewbie to Webinerd in 30 Days - Webinar World Presentation
Webinewbie to Webinerd in 30 Days - Webinar World PresentationWebinewbie to Webinerd in 30 Days - Webinar World Presentation
Webinewbie to Webinerd in 30 Days - Webinar World Presentation
 
Driving Digital Transformation Through Global Data Management
Driving Digital Transformation Through Global Data ManagementDriving Digital Transformation Through Global Data Management
Driving Digital Transformation Through Global Data Management
 
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming Features
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming FeaturesHDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming Features
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming Features
 
Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...
Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...
Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...
 
Unlock Value from Big Data with Apache NiFi and Streaming CDC
Unlock Value from Big Data with Apache NiFi and Streaming CDCUnlock Value from Big Data with Apache NiFi and Streaming CDC
Unlock Value from Big Data with Apache NiFi and Streaming CDC
 

Recently uploaded

KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCostKubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCostMatt Ray
 
PicPay - GenAI Finance Assistant - ChatGPT for Customer Service
PicPay - GenAI Finance Assistant - ChatGPT for Customer ServicePicPay - GenAI Finance Assistant - ChatGPT for Customer Service
PicPay - GenAI Finance Assistant - ChatGPT for Customer ServiceRenan Moreira de Oliveira
 
Introduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxIntroduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxMatsuo Lab
 
NIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 WorkshopNIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 WorkshopBachir Benyammi
 
Videogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdfVideogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdfinfogdgmi
 
Salesforce Miami User Group Event - 1st Quarter 2024
Salesforce Miami User Group Event - 1st Quarter 2024Salesforce Miami User Group Event - 1st Quarter 2024
Salesforce Miami User Group Event - 1st Quarter 2024SkyPlanner
 
Introduction to Quantum Computing
Introduction to Quantum ComputingIntroduction to Quantum Computing
Introduction to Quantum ComputingGDSC PJATK
 
9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding Team9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding TeamAdam Moalla
 
UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8DianaGray10
 
OpenShift Commons Paris - Choose Your Own Observability Adventure
OpenShift Commons Paris - Choose Your Own Observability AdventureOpenShift Commons Paris - Choose Your Own Observability Adventure
OpenShift Commons Paris - Choose Your Own Observability AdventureEric D. Schabell
 
UiPath Studio Web workshop series - Day 7
UiPath Studio Web workshop series - Day 7UiPath Studio Web workshop series - Day 7
UiPath Studio Web workshop series - Day 7DianaGray10
 
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfUiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfDianaGray10
 
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online CollaborationCOMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online Collaborationbruanjhuli
 
Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024D Cloud Solutions
 
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...Will Schroeder
 
Meet the new FSP 3000 M-Flex800™
Meet the new FSP 3000 M-Flex800™Meet the new FSP 3000 M-Flex800™
Meet the new FSP 3000 M-Flex800™Adtran
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesThousandEyes
 
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...UbiTrack UK
 
Computer 10: Lesson 10 - Online Crimes and Hazards
Computer 10: Lesson 10 - Online Crimes and HazardsComputer 10: Lesson 10 - Online Crimes and Hazards
Computer 10: Lesson 10 - Online Crimes and HazardsSeth Reyes
 
AI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just MinutesAI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just MinutesMd Hossain Ali
 

Recently uploaded (20)

KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCostKubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
 
PicPay - GenAI Finance Assistant - ChatGPT for Customer Service
PicPay - GenAI Finance Assistant - ChatGPT for Customer ServicePicPay - GenAI Finance Assistant - ChatGPT for Customer Service
PicPay - GenAI Finance Assistant - ChatGPT for Customer Service
 
Introduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxIntroduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptx
 
NIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 WorkshopNIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 Workshop
 
Videogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdfVideogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdf
 
Salesforce Miami User Group Event - 1st Quarter 2024
Salesforce Miami User Group Event - 1st Quarter 2024Salesforce Miami User Group Event - 1st Quarter 2024
Salesforce Miami User Group Event - 1st Quarter 2024
 
Introduction to Quantum Computing
Introduction to Quantum ComputingIntroduction to Quantum Computing
Introduction to Quantum Computing
 
9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding Team9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding Team
 
UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8
 
OpenShift Commons Paris - Choose Your Own Observability Adventure
OpenShift Commons Paris - Choose Your Own Observability AdventureOpenShift Commons Paris - Choose Your Own Observability Adventure
OpenShift Commons Paris - Choose Your Own Observability Adventure
 
UiPath Studio Web workshop series - Day 7
UiPath Studio Web workshop series - Day 7UiPath Studio Web workshop series - Day 7
UiPath Studio Web workshop series - Day 7
 
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfUiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
 
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online CollaborationCOMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
 
Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024
 
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
 
Meet the new FSP 3000 M-Flex800™
Meet the new FSP 3000 M-Flex800™Meet the new FSP 3000 M-Flex800™
Meet the new FSP 3000 M-Flex800™
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
 
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
 
Computer 10: Lesson 10 - Online Crimes and Hazards
Computer 10: Lesson 10 - Online Crimes and HazardsComputer 10: Lesson 10 - Online Crimes and Hazards
Computer 10: Lesson 10 - Online Crimes and Hazards
 
AI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just MinutesAI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just Minutes
 

HDF: Hortonworks DataFlow: Technical Workshop

  • 1. Hortonworks DataFlow Real-time Data Flow powered by Apache NiFi © Hortonworks Inc. 2011 – 2015. All Rights Reserved
  • 2. Page 2 © Hortonworks Inc. 2011 – 2015. All Rights Reserved Simplistic View of Enterprise Data Flow The Data Flow Thing Process and Analyze Data Acquire Data Store Data
  • 3. Page 3 © Hortonworks Inc. 2011 – 2015. All Rights Reserved Realty of Point to Point Connections For every connection, these must agree: 1.  Protocol 2.  Format 3.  Schema 4.  Priority 5.  Size of event 6.  Frequency of event 7.  Authorization access 8.  Relevance 9.  Security P1 Producer C1 Consumer
  • 4. Page 4 © Hortonworks Inc. 2011 – 2015. All Rights Reserved Different organizations/business units across different geographic locations… Realistic View of Enterprise Data Flow
  • 5. Page 5 © Hortonworks Inc. 2011 – 2015. All Rights Reserved Conducting business in different legal and network domains… Realistic View of Enterprise Data Flow
  • 6. Page 6 © Hortonworks Inc. 2011 – 2015. All Rights Reserved Operating on very different infrastructure (power, space, cooling)… Realistic View of Enterprise Data Flow
  • 7. Page 7 © Hortonworks Inc. 2011 – 2015. All Rights Reserved Capable of different volume, velocity, bandwidth, and latency… Realistic View of Enterprise Data Flow
  • 8. Page 8 © Hortonworks Inc. 2011 – 2015. All Rights Reserved Interacting with different business partners and customers Realistic View of Enterprise Data Flow
  • 9. Page 9 © Hortonworks Inc. 2011 – 2015. All Rights Reserved The Need for Data Provenance For Operators •  Traceability, lineage •  Recovery and replay For Compliance •  Audit trail •  Remediation For Business •  Value sources •  Value IT investment BEGIN END LINEAGE
  • 10. Page 10 © Hortonworks Inc. 2011 – 2015. All Rights Reserved The Need for Fine-grained Security and Compliance It’s not enough to say you have encrypted communications •  Enterprise authorization services –entitlements change often •  People and systems with different roles require difference access levels •  Tagged/classified data
  • 11. Page 11 © Hortonworks Inc. 2011 – 2015. All Rights Reserved Challenges at the Jagged Edge •  Small footprint •  Low power •  Expensive bandwidth •  High latency •  Access to data exceeds bandwidth (if you're doing it right) •  Needs recoverability •  Distributed Agent Model •  Needs to be secured for both the data plane and control plane GATHER DELIVER PRIORITIZE Track from the edge Through the datacenter
  • 12. Page 12 © Hortonworks Inc. 2011 – 2015. All Rights Reserved Real-time Data Flow It’s not just how quickly you move data – it’s about how quickly you can change behavior and seize new opportunities
  • 13. Page 13 © Hortonworks Inc. 2011 – 2015. All Rights Reserved Do you think organizations struggle with dataflow management? Realistic View of Enterprise Data Flow ? ? ? ? ? ? ?
  • 14. Page 14 © Hortonworks Inc. 2011 – 2015. All Rights Reserved HDF Powered by Apache NiFi Addresses Modern Data Flow Challenges Aggregate all IoAT data from sensors, geo-location devices, machines, logs, files, and feeds via a highly secure lightweight agent Collect: Bring Together•  Logs •  Files •  Feeds •  Sensors Mediate point-to-point and bi-directional data flows, delivering data reliably to real-time applications and storage platforms such as HDP Conduct: Mediate the Data Flow•  Deliver •  Secure •  Govern •  Audit Parse, filter, join, transform, fork, and clone data in motion to empower analytics and perishable insights Curate: Gain Insights•  Parse •  Filter •  Transform •  Fork •  Clone
  • 15. Page 15 © Hortonworks Inc. 2011 – 2015. All Rights Reserved Developed by the NSA over the last 8 years. "NSA's innovators work on some of the most challenging national security problems imaginable," "Commercial enterprises could use it to quickly control, manage, and analyze the flow of information from geographically dispersed sites – creating comprehensive situational awareness" -- Linda L. Burger, Director of the NSA NiFi Developed by the National Security Agency
  • 16. Page 16 © Hortonworks Inc. 2011 – 2015. All Rights Reserved Proven Technology, through Onyara Acquisition Hortonworks with Onyara’s technology delivers a state-of-the-art solution for the Internet of Anything •  Easier, reliable and secure data collection •  From anywhere and anything Availability •  Hortonworks DataFlow subscription available now •  Hortonworks Data Platform subscription since 2012
  • 17. Page 17 © Hortonworks Inc. 2011 – 2015. All Rights Reserved Designed In Response to Real World Demands Visual User Interface Drag and drop for efficient, agile operations Immediate Feedback Start, stop, tune, replay dataflows in real-time Adaptive to Volume and Bandwidth Any data, big or small Provenance Metadata Governance, compliance & data evaluation Secure Data Acquisition & Transport Fine grained encryption for controlled data sharing Powered by Apache NiFi
  • 18. Page 18 © Hortonworks Inc. 2011 – 2015. All Rights Reserved Apache NiFi – Key Features •  Guaranteed delivery •  Data buffering -  Backpressure -  Pressure release •  Prioritized queuing •  Flow specific QoS -  Latency vs. throughput -  Loss tolerance •  Data provenance •  Recovery/recording a rolling log of fine- grained history •  Visual command and control •  Flow templates •  Pluggable/multi-role security •  Designed for extension •  Clustering
  • 19. Page 19 © Hortonworks Inc. 2011 – 2015. All Rights Reserved Key Concepts NiFi Term FBP Term Description FlowFile Information Packet Each object moving through the system. FlowFile Processor Black Box Performs the work, doing some combination of data acquisition, routing, transformation, or mediation between systems. Connection Bounded Buffer The linkage between processors, acting as queues and allowing various processes to interact at differing rates. Flow Controller Scheduler Maintains the knowledge of how processes are connected, and manages the threads and allocations thereof which all processes use. Process Group Subnet A set of processes and their connections, which can receive and send data via ports. A process group allows creation of entirely new component simply by composition of its components.
  • 20. Page 20 © Hortonworks Inc. 2011 – 2015. All Rights Reserved OS/Host JVM Flow Controller Web Server Processor 1 Extension N FlowFile Repository Content Repository Provenance Repository Local Storage OS/Host JVM Flow Controller Web Server Processor 1 Extension N FlowFile Repository Content Repository Provenance Repository Local Storage Architecture – Supports Single and Clustered Implementation OS/Host JVM NiFi Cluster Manager – Request Replicator Web Server Master NiFi Cluster Manager (NCM) OS/Host JVM Flow Controller Web Server Processor 1 Extension N FlowFile Repository Content Repository Provenance Repository Local Storage Slaves NiFi Nodes
  • 21. Page 21 © Hortonworks Inc. 2011 – 2015. All Rights Reserved Cluster Architecture §  Master §  Tracks of which Nodes are in the cluster, their status, and to replicate requests to modify or observe the flow §  Slaves §  Do the flow processing §  Manage Local Repositories §  FlowFile Repository – Metadata of FlowFile currently active §  Content Repository – Contents of FlowFile §  Provenance Repository – Archive of provenance events OS/Host JVM Flow Controller Web Server Processor 1 Extension N FlowFile Repository Content Repository Provenance Repository Local Storage OS/Host JVM Flow Controller Web Server Processor 1 Extension N FlowFile Repository Content Repository Provenance Repository Local Storage OS/Host JVM NiFi Cluster Manager – Request Replicator Web Server Master NiFi Cluster Manager (NCM) OS/Host JVM Flow Controller Web Server Processor 1 Extension N FlowFile Repository Content Repository Provenance Repository Local Storage Slaves NiFi Nodes
  • 22. Page 22 © Hortonworks Inc. 2011 – 2015. All Rights Reserved Security Administration Central management and consistent security •  NiFi Cluster Manager Authentication Authenticate users and systems •  2-Way SSL support out of the box; additional types coming Authorization Provision access to data •  Pluggable authorization designed to fit any Identity and Access Management (IAM) scheme •  File-based authority provider out of the box •  Multi-role Audit Maintain a record of data access •  Detailed logging of all user actions •  Detailed logging of key system behaviors •  Data Provenance enables unparalleled tracking from the edge through the Lake Data Protection Protect data at rest and in motion •  Support a variety of SSL/encrypted protocols •  Tag and utilize tags on data for fine grained access controls •  Encrypt/decrypt content using pre-shared key mechanisms Administrator Configure system threads, user accounts, and flow audit history Data Flow Manager Manipulate the dataflow Read Only View the dataflow only +NiFi Configure system threads, user accounts, and flow audit history Proxy Manipulate the dataflow Provenance Query the provenance repository and download content
  • 23. Page 23 © Hortonworks Inc. 2011 – 2015. All Rights Reserved Operations •  Dynamic flow changes -  Push new business rules via REST API (closed loop) -  Pull updates periodically from web services •  Site-to-site -  Stay at the ‘flow level’ not suddenly doing file transfer protocols •  Reporting tasks (push) •  Statistics / status (pull) •  Extensible •  Optimized user experience – log hunts should be the exception Scale down, up, and out – in containers and on virtual machines
  • 24. Page 24 © Hortonworks Inc. 2011 – 2015. All Rights Reserved Typical HDF Sizing Scenarios: Sustained Throughput For Sustained Throughput of 50MB/sec and thousands of events per second •  1-2 nodes •  8+ cores per node (more is better) •  6+ disks per node (SSD or Spinning) •  2 GB of mem per node •  1GB bonded NICs ideally For Sustained Throughput of 100MB/ sec and tens of thousands of events per second •  3-4 nodes •  8+ cores per node (more is better) •  6+ disks per node (SSD or Spinning) •  2 GB of mem per node •  1GB bonded NICs ideally For Sustained Throughput of 200MB/ sec and hundreds of thousands of events per second •  5-7 nodes •  24+ cores per node (effective cpus) •  12+ disks per node (SSD or spinning) •  4GB of mem per node •  10GB bonded NICs For Sustained Throughput of 400-500MB/sec and hundreds of thousands of events per second •  7-10 nodes •  24+ cores per node (effective cpus) •  12+ disks per node (SSD or spinning) •  6GB of mem per node •  10GB bonded NICs
  • 25. Page 25 © Hortonworks Inc. 2011 – 2015. All Rights Reserved FAQs §  Can I use NiFi for single site deployment §  Yes. Data movement between machines and services in a site face many of the same challenges as data movement across sites §  How does NiFi compare with Apache Flume §  NiFi supersedes Flume capabilities providing visual interface, dynamic flow changes, information provenance and advanced transformation capabilities §  Will Flume be supported by Hortonworks §  Currently there are no plans to remove Flume from HDP §  Existing Flume users are encouraged to evaluate NiFi for their usage case
  • 26. Page 26 © Hortonworks Inc. 2011 – 2015. All Rights Reserved FAQs §  Are NiFi and Kafka overlapping in functionality §  No. NiFi and Kafka are complimentary §  NiFi focus is data acquisition, data transformation, data routing and data delivery with data provenance and replay capabilities. Kafka is a pub-sub message bus §  NiFi can publish and consume data from Kafka §  NiFi supports data delivery to applications using it’s supported protocols, format and schema. No application change is required §  Can I score and act on incoming events using NiFi without leveraging Storm/Spark- Streaming §  Yes. NiFi is a simple event processing system. It is ideal for scoring and triggering actions on events that can be analyzed using the event’s data and metadata. §  Storm/Spark-Streaming are complex event processing systems. Usage of Storm/Spark-Streaming is recommended when analysis is required over a window of events.
  • 27. Page 27 © Hortonworks Inc. 2011 – 2015. All Rights Reserved FAQs §  Does NiFi overlap with ETL tools §  No. ETL focus is bulk extract, bulk data joins, batch data quality from traditional data sources. ETL tools operate in the hadoop cluster. §  NiFi focus is streaming and batch data ingest from new (machine data) and traditional data sources. NiFi operates outside of the hadoop cluster and is often deployed at remote sites.
  • 28. Page 28 © Hortonworks Inc. 2011 – 2015. All Rights Reserved Labs https://gist.github.com/abajwa-hw/e884bcc423cfbadf4a1d Lab 1: https://github.com/abajwa-hw/ambari-nifi-service Lab 2: https://github.com/abajwa-hw/nifi-network-processor
  • 29. Page 29 © Hortonworks Inc. 2011 – 2015. All Rights Reserved Thank you