SlideShare a Scribd company logo
1 of 56
Download to read offline
1 © Hortonworks Inc. 2011–2018. All rights reserved
Dataflow Management From Edge to
Core with Apache NiFi
Andy LoPresto | @yolopey
Sr. Member of Technical Staff at Hortonworks, Apache NiFi PMC & Committer
11 October 2018 Dataworks Summit Singapore
2 © Hortonworks Inc. 2011–2018. All rights reserved
Gauging Audience Familiarity with NiFi
“What’s a NeeFee?”
No experience with dataflow
No experience with NiFi
“I can pick this up pretty quickly”
Some experience with dataflow
Some experience with NiFi
“I refactored the Ambari
integration endpoint to
allow for mutual
authentication TLS during
my coffee break”
Forgotten more about NiFi
than most of us will ever
know
3 © Hortonworks Inc. 2011–2018. All rights reserved
Agenda
• What is dataflow and what are the challenges?
• Apache NiFi
• Apache MiNiFi
• Apache NiFi Registry
• Complementary Tools
• Community
• All slides provided online, so no need to transcribe
4 © Hortonworks Inc. 2011–2018. All rights reserved
What Is Dataflow?
5 © Hortonworks Inc. 2011–2018. All rights reserved
What Is Dataflow?
• Moving some content from A to B
• Content could be any bytes
• Logs
• HTTP
• XML
• CSV
• Images
• Video
• Telemetry
Producers A.K.A
Things
Anything
AND
Everything
Internet!
Consumers
• User
• Storage
• System
• …More Things
© Hortonworks Inc. 2011–2018. All rights reserved
Moving Data Effectively Is Hard
“Data Pipeline” https://xkcd.com/2054/
© Hortonworks Inc. 2011–2018. All rights reserved
• Standards
• Formats
• Protocols
• Veracity
• Validity
• Schemas
• Partitioning/Bun
dling
Data
Dataflow Challenges in 3 Categories
Infrastructure
• “Exactly Once”
Delivery
• Ensuring
Security
• Overcoming
Security
• Credential
Management
• Network
People
• Compliance
• “That
[person|team|g
roup]”
• Consumers
Change
• Requirements
Change
• “Exactly Once”
Delivery
© Hortonworks Inc. 2011–2018. All rights reserved
Raise your hand if you want to maintain Python scripts for the rest of your life
Let’s Connect Lots of As to Bs to As to Cs to Bs to Δs to Cs to ϕs
9 © Hortonworks Inc. 2011–2018. All rights reserved
Apache NiFi
© Hortonworks Inc. 2011–2018. All rights reserved
• Guaranteed delivery
• Data buffering
• Backpressure
• Pressure release
• Prioritized queuing
• Flow specific QoS
• Latency vs. throughput
• Loss tolerance
Key Features
Apache NiFi
• Data provenance
• Supports push and pull models
• Recovery/recording
a rolling log of fine-grained history
• Visual command and control
• Flow templates
• Pluggable, multi-tenant security
• Designed for extension
• Clustering
© Hortonworks Inc. 2011–2018. All rights reserved
Flowfiles Are Like HTTP Data
HTTP Data FlowFile
HTTP/1.1 200 OK
Date: Sun, 10 Oct 2010 23:26:07 GMT
Server: Apache/2.2.8 (CentOS) OpenSSL/0.9.8g
Last-Modified: Sun, 26 Sep 2010 22:04:35 GMT
ETag: "45b6-834-49130cc1182c0"
Accept-Ranges: bytes
Content-Length: 13
Connection: close
Content-Type: text/html
Hello world!
Standard FlowFile Attributes
Key: 'entryDate’ Value: 'Fri Jun 17 17:15:04 EDT 2016'
Key: 'lineageStartDate’ Value: 'Fri Jun 17 17:15:04 EDT 2016'
Key: 'fileSize’ Value: '23609'
FlowFile Attribute Map Content
Key: 'filename’ Value: '15650246997242'
Key: 'path’ Value: './’
Binary Content *
Header
Content
© Hortonworks Inc. 2011–2018. All rights reserved
User Interface
Less of this… … more of this
© Hortonworks Inc. 2011–2018. All rights reserved
Deeper Ecosystem Integration: 274+ Processors,
57 Controller Services
Hash
Extract
Merge
Duplicate
Scan
GeoEnrich
Replace
ConvertSplit
Translate
Route Content
Route Context
Route Text
Control Rate
Distribute Load
Generate Table Fetch
Jolt Transform JSON
Prioritized Delivery
Encrypt
Tail
Evaluate
Execute
All Apache project logos are trademarks of the ASF and the respective projects.
Fetch
HTTP
Syslog
Email
HTML
Image
HL7
FTP
UDP
XML
SFTP
AMQP
WebSocket
Parse Records Convert Records
22 © Hortonworks Inc. 2011–2018. All rights reserved
Apache MiNiFi
23 © Hortonworks Inc. 2011–2018. All rights reserved
IoT Challenges
• Limited computing capability
• Limited power/network
• Restricted software library/platform
availability
• No UI
• Physically inaccessible
• Not frequently updated
• Competing standards/protocols
• Scalability
• Privacy & Security
@_lennart
© Hortonworks Inc. 2011–2018. All rights reserved27
• NiFi is designed to “own the box”
• NiFi 0.7.x started up in about 10-15 minutes on RP3 (593 MB)
• NiFi 1.x started up in about 30 minutes on RP3 (760 MB)
• 33 new processors
• Rewrite for multi tenant authorization
• Complete UI overhaul
So Why Do We Need a Different Solution?
© Hortonworks Inc. 2011–2018. All rights reserved28
• Get the key parts of NiFi close to where data begins and provide bidirectional
communication
• NiFi lives in the data center — give it an enterprise server or a cluster of them
• MiNiFi lives as close to where data is born and is a guest on that device or system
• IoT
• Connected car
• Legacy hardware
Apache NiFi Subproject: MiNiFi
© Hortonworks Inc. 2011–2018. All rights reserved30
• MiNiFi Java (v0.5.0)
• Modified version of NiFi
• No UI
• YAML configuration
• Reduced processor count
• 63+ by default, more
available with
additional NARs
• MiNiFi C++ (v0.5.0)
• Written from scratch
• 33 processors by default
• Bi-directional site-to-site & provenance data
Flavors of MiNiFi
© Hortonworks Inc. 2011–2018. All rights reserved32
• NiFi
• Design flows
• Aggregate data from many
sources
• Perform routing/analysis/SEP
• MiNiFi
• Receive flows
• Collect data
• Send for processing
How Does MiNiFi Interact with NiFi?
© Hortonworks Inc. 2011–2018. All rights reserved33
• We’ve been imagining EDGE to CORE as a bi-directional linear system
• Let’s expand
that to the real
world
Let’s Add Dimensionality
© Hortonworks Inc. 2011–2018. All rights reserved34
• Data tagging/provenance
• Governance from edge (geopolitical
restrictions)
• Security (encryption, certificate-based
authentication)
• Low latency (immediate reactions &
decision-making)
What Does MiNiFi provide? Connected Car Reference Platform Box
Tuner + DSRC CardConnectivity Card
© Hortonworks Inc. 2011–2018. All rights reserved37
• Site-to-Site
• NiFi protocol
• Two implementations
• Raw socket
• HTTP(S)
• Secured with mutual authentication TLS
• HTTP(S), (S)FTP, JMS, Syslog, File, Email, Process
MiNiFi Exfil
38 © Hortonworks Inc. 2011–2018. All rights reserved
Apache NiFi Registry
39 © Hortonworks Inc. 2011–2018. All rights reserved
Flow Development Lifecycle (FDLC)
• Origins of NiFi
• Operator Experience
• MC data, don’t drop, mitigate temporarily
• Version Control
• Environment Promotion
40 © Hortonworks Inc. 2011–2018. All rights reserved
Operator Experience
© Hortonworks Inc. 2011–2018. All rights reserved41
• Shows previous values (user,
time changed)
• Sensitive values are always
encrypted at rest and never
returned via the API
Component Property History
42 © Hortonworks Inc. 2011–2018. All rights reserved
Exporting Flows
• XML templates
• Copying flow.xml.gz
between systems
43 © Hortonworks Inc. 2011–2018. All rights reserved
Challenges
• Templates
• Updates/replacement
• Sensitive property replacement
• Flow.xml.gz migration
• Key synchronization
• Environment promotion
• Approval processes
• Verifiability
44 © Hortonworks Inc. 2011–2018. All rights reserved
Template Replacement
• Export a new version of template
• Transfer (somehow)
• Verify?
• Import onto canvas side-by-side existing
flow
• Stop processors
• Empty queues
• Reconnect queues
• Start
• Pray?
45 © Hortonworks Inc. 2011–2018. All rights reserved
Template Replacement
© Hortonworks Inc. 2011–2018. All rights reserved46
• Previously, flows were exported via XML
templates
• Didn’t contain sensitive values
• Couldn’t be updated in-place
• No tracking system
• NiFi Registry brings asset management
as first-class citizen to NiFi
• Flows can be versioned
Introducing Apache NiFi Registry 0.3.0
NiFi Registry for Dataflows
© Hortonworks Inc. 2011–2018. All rights reserved47
• Connect multiple NiFi instances
to a NiFi Registry instance
• Communicate between multiple
NiFi Registry instances
• via multiple Registry Clients
• via NiFi CLI
Flows Can Be Promoted Between Environments
© Hortonworks Inc. 2011–2018. All rights reserved48
• Git-backed persistence
• Share flows via GitHub, etc.
• Commit hooks
• Register a hook & action
• “When a new version of the
flow is committed to QA
Registry, email the QA team
and post in the QA Deploy
Slack channel”
• Pluggable DB implementations
Extensibility
49 © Hortonworks Inc. 2011–2018. All rights reserved
Demo
© Hortonworks Inc. 2011–2018. All rights reserved50
• Install nifi-registry
• $ mvn clean install
• $ ./bin/nifi-registry.sh
start
• Browse to http://localhost:18080
Create Registry
51 © Hortonworks Inc. 2011–2018. All rights reserved
Create Bucket
© Hortonworks Inc. 2011–2018. All rights reserved52
Connect to NiFi
© Hortonworks Inc. 2011–2018. All rights reserved53
Create Process Group
© Hortonworks Inc. 2011–2018. All rights reserved54
Commit Version
© Hortonworks Inc. 2011–2018. All rights reserved55
View Flow in Registry
© Hortonworks Inc. 2011–2018. All rights reserved56
Import New Instance into NiFi
© Hortonworks Inc. 2011–2018. All rights reserved57
Modify the Original Flow
© Hortonworks Inc. 2011–2018. All rights reserved58
See Local Changes Before Committing
© Hortonworks Inc. 2011–2018. All rights reserved59
Commit
© Hortonworks Inc. 2011–2018. All rights reserved60
Update New Instance from Registry
61 © Hortonworks Inc. 2011–2018. All rights reserved
Complementary Tools
© Hortonworks Inc. 2011–2018. All rights reserved62
• NiFi Toolkit
• NiPyAPI
• MiNiFi Converter Toolkit
Complementary Tools
63 © Hortonworks Inc. 2011–2018. All rights reserved
NiFi Toolkit
• TLS Toolkit
• Generates, signs, and packages
keys and certificates for NiFi
services (node/cluster, clients)
• Encrypt Config
• Protects sensitive configuration
values like passwords
• CLI
• Interacts with NiFi & NiFi
Registry to operate on flows
64 © Hortonworks Inc. 2011–2018. All rights reserved
NiPyAPI
• Python wrapper around NiFi REST API
• Community-provided by Daniel Chaffelson
• Exposes common operations for automation, batch processing, recursion, etc.
dev_bucket = nipyapi.versioning.get_registry_bucket(dev_bucket_name)
dev_ver_flow = nipyapi.versioning.get_flow_in_bucket(
dev_bucket.identifier,
identifier=dev_ver_flow_name
)
dev_export = nipyapi.versioning.export_flow_version(
bucket_id=dev_bucket.identifier,
flow_id=dev_ver_flow.identifier,
mode='yaml'
)
65 © Hortonworks Inc. 2011–2018. All rights reserved
MiNiFi Converter Toolkit
• Save as template from NiFi
• Run $ ./bin/config.sh transform
template.xml config.yml
• MiNiFi flow ready to run
66 © Hortonworks Inc. 2011–2018. All rights reserved
Community
© Hortonworks Inc. 2011–2018. All rights reserved67
• FDLC with Apache NiFi, Kevin
Doran
• NiPyAPI Docs, Daniel
Chaffelson
• DevOps Tips, Tim Spann
• Automate Workflow, Pierre
Villard
More Resources
© Hortonworks Inc. 2011–2018. All rights reserved68
• NiFi 1.8.0 — … Oct 2018 (170+ Jiras)
• Jetty, DB improvements
• Auto load-balancing queues
• TLS Toolkit w/ external CA
• Record processor improvements
• MiNiFi C++ 0.5.0 — 6 June 2018
• MiNiFi Java 0.5.0 — 7 July 2018
• NiFi Registry 0.3.0 — 25 Sept 2018
New Announcements
© Hortonworks Inc. 2011–2018. All rights reserved69
Community Health
© Hortonworks Inc. 2011–2018. All rights reserved70
Apache NiFi site
https://nifi.apache.org
Subproject MiNiFi site
https://nifi.apache.org/minifi/
Subscribe to and collaborate at
dev@nifi.apache.org
users@nifi.apache.org
Submit Ideas or Issues
https://issues.apache.org/jira/browse/NIFI
Follow us on Twitter
@apachenifi
Learn More and Join Us
72 © Hortonworks Inc. 2011–2018. All rights reserved
Thank you
alopresto@hortonworks.com | alopresto@apache.org | @yolopey
github.com/alopresto/slides

More Related Content

What's hot

Using Spark Streaming and NiFi for the Next Generation of ETL in the Enterprise
Using Spark Streaming and NiFi for the Next Generation of ETL in the EnterpriseUsing Spark Streaming and NiFi for the Next Generation of ETL in the Enterprise
Using Spark Streaming and NiFi for the Next Generation of ETL in the EnterpriseDataWorks Summit
 
A Kafka journey and why migrate to Confluent Cloud?
A Kafka journey and why migrate to Confluent Cloud?A Kafka journey and why migrate to Confluent Cloud?
A Kafka journey and why migrate to Confluent Cloud?confluent
 
Best practices and lessons learnt from Running Apache NiFi at Renault
Best practices and lessons learnt from Running Apache NiFi at RenaultBest practices and lessons learnt from Running Apache NiFi at Renault
Best practices and lessons learnt from Running Apache NiFi at RenaultDataWorks Summit
 
Real-Time Data Flows with Apache NiFi
Real-Time Data Flows with Apache NiFiReal-Time Data Flows with Apache NiFi
Real-Time Data Flows with Apache NiFiManish Gupta
 
Running Apache NiFi with Apache Spark : Integration Options
Running Apache NiFi with Apache Spark : Integration OptionsRunning Apache NiFi with Apache Spark : Integration Options
Running Apache NiFi with Apache Spark : Integration OptionsTimothy Spann
 
Intelligently collecting data at the edge—intro to Apache MiNiFi
Intelligently collecting data at the edge—intro to Apache MiNiFiIntelligently collecting data at the edge—intro to Apache MiNiFi
Intelligently collecting data at the edge—intro to Apache MiNiFiDataWorks Summit
 
NiFi Developer Guide
NiFi Developer GuideNiFi Developer Guide
NiFi Developer GuideDeon Huang
 
Hadoop Security Today & Tomorrow with Apache Knox
Hadoop Security Today & Tomorrow with Apache KnoxHadoop Security Today & Tomorrow with Apache Knox
Hadoop Security Today & Tomorrow with Apache KnoxVinay Shukla
 
Kafka as Message Broker
Kafka as Message BrokerKafka as Message Broker
Kafka as Message BrokerHaluan Irsad
 
Integrating Apache NiFi and Apache Flink
Integrating Apache NiFi and Apache FlinkIntegrating Apache NiFi and Apache Flink
Integrating Apache NiFi and Apache FlinkHortonworks
 
Securing Hadoop with Apache Ranger
Securing Hadoop with Apache RangerSecuring Hadoop with Apache Ranger
Securing Hadoop with Apache RangerDataWorks Summit
 
Apache NiFi SDLC Improvements
Apache NiFi SDLC ImprovementsApache NiFi SDLC Improvements
Apache NiFi SDLC ImprovementsBryan Bende
 
Introduction to Apache NiFi 1.11.4
Introduction to Apache NiFi 1.11.4Introduction to Apache NiFi 1.11.4
Introduction to Apache NiFi 1.11.4Timothy Spann
 
Hadoop Backup and Disaster Recovery
Hadoop Backup and Disaster RecoveryHadoop Backup and Disaster Recovery
Hadoop Backup and Disaster RecoveryCloudera, Inc.
 
Introduction and Overview of Apache Kafka, TriHUG July 23, 2013
Introduction and Overview of Apache Kafka, TriHUG July 23, 2013Introduction and Overview of Apache Kafka, TriHUG July 23, 2013
Introduction and Overview of Apache Kafka, TriHUG July 23, 2013mumrah
 

What's hot (20)

Using Spark Streaming and NiFi for the Next Generation of ETL in the Enterprise
Using Spark Streaming and NiFi for the Next Generation of ETL in the EnterpriseUsing Spark Streaming and NiFi for the Next Generation of ETL in the Enterprise
Using Spark Streaming and NiFi for the Next Generation of ETL in the Enterprise
 
A Kafka journey and why migrate to Confluent Cloud?
A Kafka journey and why migrate to Confluent Cloud?A Kafka journey and why migrate to Confluent Cloud?
A Kafka journey and why migrate to Confluent Cloud?
 
Best practices and lessons learnt from Running Apache NiFi at Renault
Best practices and lessons learnt from Running Apache NiFi at RenaultBest practices and lessons learnt from Running Apache NiFi at Renault
Best practices and lessons learnt from Running Apache NiFi at Renault
 
Real-Time Data Flows with Apache NiFi
Real-Time Data Flows with Apache NiFiReal-Time Data Flows with Apache NiFi
Real-Time Data Flows with Apache NiFi
 
Netflix Data Pipeline With Kafka
Netflix Data Pipeline With KafkaNetflix Data Pipeline With Kafka
Netflix Data Pipeline With Kafka
 
Running Apache NiFi with Apache Spark : Integration Options
Running Apache NiFi with Apache Spark : Integration OptionsRunning Apache NiFi with Apache Spark : Integration Options
Running Apache NiFi with Apache Spark : Integration Options
 
Intelligently collecting data at the edge—intro to Apache MiNiFi
Intelligently collecting data at the edge—intro to Apache MiNiFiIntelligently collecting data at the edge—intro to Apache MiNiFi
Intelligently collecting data at the edge—intro to Apache MiNiFi
 
NiFi Developer Guide
NiFi Developer GuideNiFi Developer Guide
NiFi Developer Guide
 
Apache NiFi Crash Course Intro
Apache NiFi Crash Course IntroApache NiFi Crash Course Intro
Apache NiFi Crash Course Intro
 
Apache Ranger
Apache RangerApache Ranger
Apache Ranger
 
Hadoop Security Today & Tomorrow with Apache Knox
Hadoop Security Today & Tomorrow with Apache KnoxHadoop Security Today & Tomorrow with Apache Knox
Hadoop Security Today & Tomorrow with Apache Knox
 
Nifi workshop
Nifi workshopNifi workshop
Nifi workshop
 
Kafka as Message Broker
Kafka as Message BrokerKafka as Message Broker
Kafka as Message Broker
 
Integrating Apache NiFi and Apache Flink
Integrating Apache NiFi and Apache FlinkIntegrating Apache NiFi and Apache Flink
Integrating Apache NiFi and Apache Flink
 
Hadoop Summit Tokyo Apache NiFi Crash Course
Hadoop Summit Tokyo Apache NiFi Crash CourseHadoop Summit Tokyo Apache NiFi Crash Course
Hadoop Summit Tokyo Apache NiFi Crash Course
 
Securing Hadoop with Apache Ranger
Securing Hadoop with Apache RangerSecuring Hadoop with Apache Ranger
Securing Hadoop with Apache Ranger
 
Apache NiFi SDLC Improvements
Apache NiFi SDLC ImprovementsApache NiFi SDLC Improvements
Apache NiFi SDLC Improvements
 
Introduction to Apache NiFi 1.11.4
Introduction to Apache NiFi 1.11.4Introduction to Apache NiFi 1.11.4
Introduction to Apache NiFi 1.11.4
 
Hadoop Backup and Disaster Recovery
Hadoop Backup and Disaster RecoveryHadoop Backup and Disaster Recovery
Hadoop Backup and Disaster Recovery
 
Introduction and Overview of Apache Kafka, TriHUG July 23, 2013
Introduction and Overview of Apache Kafka, TriHUG July 23, 2013Introduction and Overview of Apache Kafka, TriHUG July 23, 2013
Introduction and Overview of Apache Kafka, TriHUG July 23, 2013
 

Similar to Dataflow Management From Edge to Core with Apache NiFi

The First Mile -- Edge and IoT Data Collection with Apache NiFi and MiNiFi
The First Mile -- Edge and IoT Data Collection with Apache NiFi and MiNiFiThe First Mile -- Edge and IoT Data Collection with Apache NiFi and MiNiFi
The First Mile -- Edge and IoT Data Collection with Apache NiFi and MiNiFiDataWorks Summit
 
The First Mile - Edge and IoT Data Collection With Apache Nifi and MiniFi
The First Mile - Edge and IoT Data Collection With Apache Nifi and MiniFiThe First Mile - Edge and IoT Data Collection With Apache Nifi and MiniFi
The First Mile - Edge and IoT Data Collection With Apache Nifi and MiniFiDataWorks Summit
 
Dataflow Management From Edge to Core with Apache NiFi
Dataflow Management From Edge to Core with Apache NiFiDataflow Management From Edge to Core with Apache NiFi
Dataflow Management From Edge to Core with Apache NiFiDataWorks Summit
 
State of the Apache NiFi Ecosystem & Community
State of the Apache NiFi Ecosystem & CommunityState of the Apache NiFi Ecosystem & Community
State of the Apache NiFi Ecosystem & CommunityAccumulo Summit
 
The First Mile – Edge and IoT Data Collection with Apache NiFi and MiNiFi
The First Mile – Edge and IoT Data Collection with Apache NiFi and MiNiFiThe First Mile – Edge and IoT Data Collection with Apache NiFi and MiNiFi
The First Mile – Edge and IoT Data Collection with Apache NiFi and MiNiFiDataWorks Summit
 
The Avant-garde of Apache NiFi
The Avant-garde of Apache NiFiThe Avant-garde of Apache NiFi
The Avant-garde of Apache NiFiJoe Percivall
 
Intelligently Collecting Data at the Edge – Intro to Apache MiNiFi
Intelligently Collecting Data at the Edge – Intro to Apache MiNiFiIntelligently Collecting Data at the Edge – Intro to Apache MiNiFi
Intelligently Collecting Data at the Edge – Intro to Apache MiNiFiDataWorks Summit
 
Data at Scales and the Values of Starting Small with Apache NiFi & MiNiFi
Data at Scales and the Values of Starting Small with Apache NiFi & MiNiFiData at Scales and the Values of Starting Small with Apache NiFi & MiNiFi
Data at Scales and the Values of Starting Small with Apache NiFi & MiNiFiAldrin Piri
 
Integrating Apache NiFi and Apache Flink
Integrating Apache NiFi and Apache FlinkIntegrating Apache NiFi and Apache Flink
Integrating Apache NiFi and Apache FlinkIsheeta Sanghi
 
Integrating Apache NiFi and Apache Flink
Integrating Apache NiFi and Apache FlinkIntegrating Apache NiFi and Apache Flink
Integrating Apache NiFi and Apache FlinkIsheeta Sanghi
 
Integrating Apache NiFi and Apache Flink
Integrating Apache NiFi and Apache FlinkIntegrating Apache NiFi and Apache Flink
Integrating Apache NiFi and Apache FlinkIsheeta Sanghi
 
Integrating NiFi and Flink
Integrating NiFi and FlinkIntegrating NiFi and Flink
Integrating NiFi and FlinkBryan Bende
 
Integrating NiFi and Apex
Integrating NiFi and ApexIntegrating NiFi and Apex
Integrating NiFi and ApexBryan Bende
 
Integrating Apache NiFi and Apache Apex
Integrating Apache NiFi and Apache Apex Integrating Apache NiFi and Apache Apex
Integrating Apache NiFi and Apache Apex Apache Apex
 
Apache Deep Learning 101 - DWS Berlin 2018
Apache Deep Learning 101 - DWS Berlin 2018Apache Deep Learning 101 - DWS Berlin 2018
Apache Deep Learning 101 - DWS Berlin 2018Timothy Spann
 
Big Data Day LA 2016/ Big Data Track - Building scalable enterprise data flow...
Big Data Day LA 2016/ Big Data Track - Building scalable enterprise data flow...Big Data Day LA 2016/ Big Data Track - Building scalable enterprise data flow...
Big Data Day LA 2016/ Big Data Track - Building scalable enterprise data flow...Data Con LA
 
NJ Hadoop Meetup - Apache NiFi Deep Dive
NJ Hadoop Meetup - Apache NiFi Deep DiveNJ Hadoop Meetup - Apache NiFi Deep Dive
NJ Hadoop Meetup - Apache NiFi Deep DiveBryan Bende
 

Similar to Dataflow Management From Edge to Core with Apache NiFi (20)

The First Mile -- Edge and IoT Data Collection with Apache NiFi and MiNiFi
The First Mile -- Edge and IoT Data Collection with Apache NiFi and MiNiFiThe First Mile -- Edge and IoT Data Collection with Apache NiFi and MiNiFi
The First Mile -- Edge and IoT Data Collection with Apache NiFi and MiNiFi
 
The First Mile - Edge and IoT Data Collection With Apache Nifi and MiniFi
The First Mile - Edge and IoT Data Collection With Apache Nifi and MiniFiThe First Mile - Edge and IoT Data Collection With Apache Nifi and MiniFi
The First Mile - Edge and IoT Data Collection With Apache Nifi and MiniFi
 
Dataflow Management From Edge to Core with Apache NiFi
Dataflow Management From Edge to Core with Apache NiFiDataflow Management From Edge to Core with Apache NiFi
Dataflow Management From Edge to Core with Apache NiFi
 
State of the Apache NiFi Ecosystem & Community
State of the Apache NiFi Ecosystem & CommunityState of the Apache NiFi Ecosystem & Community
State of the Apache NiFi Ecosystem & Community
 
Apache Nifi Crash Course
Apache Nifi Crash CourseApache Nifi Crash Course
Apache Nifi Crash Course
 
Apache Nifi Crash Course
Apache Nifi Crash CourseApache Nifi Crash Course
Apache Nifi Crash Course
 
The First Mile – Edge and IoT Data Collection with Apache NiFi and MiNiFi
The First Mile – Edge and IoT Data Collection with Apache NiFi and MiNiFiThe First Mile – Edge and IoT Data Collection with Apache NiFi and MiNiFi
The First Mile – Edge and IoT Data Collection with Apache NiFi and MiNiFi
 
The Avant-garde of Apache NiFi
The Avant-garde of Apache NiFiThe Avant-garde of Apache NiFi
The Avant-garde of Apache NiFi
 
The Avant-garde of Apache NiFi
The Avant-garde of Apache NiFiThe Avant-garde of Apache NiFi
The Avant-garde of Apache NiFi
 
Intelligently Collecting Data at the Edge – Intro to Apache MiNiFi
Intelligently Collecting Data at the Edge – Intro to Apache MiNiFiIntelligently Collecting Data at the Edge – Intro to Apache MiNiFi
Intelligently Collecting Data at the Edge – Intro to Apache MiNiFi
 
Data at Scales and the Values of Starting Small with Apache NiFi & MiNiFi
Data at Scales and the Values of Starting Small with Apache NiFi & MiNiFiData at Scales and the Values of Starting Small with Apache NiFi & MiNiFi
Data at Scales and the Values of Starting Small with Apache NiFi & MiNiFi
 
Integrating Apache NiFi and Apache Flink
Integrating Apache NiFi and Apache FlinkIntegrating Apache NiFi and Apache Flink
Integrating Apache NiFi and Apache Flink
 
Integrating Apache NiFi and Apache Flink
Integrating Apache NiFi and Apache FlinkIntegrating Apache NiFi and Apache Flink
Integrating Apache NiFi and Apache Flink
 
Integrating Apache NiFi and Apache Flink
Integrating Apache NiFi and Apache FlinkIntegrating Apache NiFi and Apache Flink
Integrating Apache NiFi and Apache Flink
 
Integrating NiFi and Flink
Integrating NiFi and FlinkIntegrating NiFi and Flink
Integrating NiFi and Flink
 
Integrating NiFi and Apex
Integrating NiFi and ApexIntegrating NiFi and Apex
Integrating NiFi and Apex
 
Integrating Apache NiFi and Apache Apex
Integrating Apache NiFi and Apache Apex Integrating Apache NiFi and Apache Apex
Integrating Apache NiFi and Apache Apex
 
Apache Deep Learning 101 - DWS Berlin 2018
Apache Deep Learning 101 - DWS Berlin 2018Apache Deep Learning 101 - DWS Berlin 2018
Apache Deep Learning 101 - DWS Berlin 2018
 
Big Data Day LA 2016/ Big Data Track - Building scalable enterprise data flow...
Big Data Day LA 2016/ Big Data Track - Building scalable enterprise data flow...Big Data Day LA 2016/ Big Data Track - Building scalable enterprise data flow...
Big Data Day LA 2016/ Big Data Track - Building scalable enterprise data flow...
 
NJ Hadoop Meetup - Apache NiFi Deep Dive
NJ Hadoop Meetup - Apache NiFi Deep DiveNJ Hadoop Meetup - Apache NiFi Deep Dive
NJ Hadoop Meetup - Apache NiFi Deep Dive
 

More from DataWorks Summit

Floating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisFloating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisDataWorks Summit
 
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiTracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiDataWorks Summit
 
HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...DataWorks Summit
 
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...DataWorks Summit
 
Managing the Dewey Decimal System
Managing the Dewey Decimal SystemManaging the Dewey Decimal System
Managing the Dewey Decimal SystemDataWorks Summit
 
Practical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExamplePractical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExampleDataWorks Summit
 
HBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberHBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberDataWorks Summit
 
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixScaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixDataWorks Summit
 
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiBuilding the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiDataWorks Summit
 
Supporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsSupporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsDataWorks Summit
 
Security Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureSecurity Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureDataWorks Summit
 
Presto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EnginePresto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EngineDataWorks Summit
 
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...DataWorks Summit
 
Extending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudExtending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudDataWorks Summit
 
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiEvent-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiDataWorks Summit
 
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerSecuring Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerDataWorks Summit
 
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...DataWorks Summit
 
Computer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouComputer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouDataWorks Summit
 
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkBig Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkDataWorks Summit
 

More from DataWorks Summit (20)

Data Science Crash Course
Data Science Crash CourseData Science Crash Course
Data Science Crash Course
 
Floating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisFloating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache Ratis
 
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiTracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
 
HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...
 
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
 
Managing the Dewey Decimal System
Managing the Dewey Decimal SystemManaging the Dewey Decimal System
Managing the Dewey Decimal System
 
Practical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExamplePractical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist Example
 
HBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberHBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at Uber
 
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixScaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
 
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiBuilding the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
 
Supporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsSupporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability Improvements
 
Security Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureSecurity Framework for Multitenant Architecture
Security Framework for Multitenant Architecture
 
Presto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EnginePresto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything Engine
 
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
 
Extending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudExtending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google Cloud
 
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiEvent-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
 
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerSecuring Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
 
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
 
Computer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouComputer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near You
 
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkBig Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
 

Recently uploaded

Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelNavi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelDeepika Singh
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native ApplicationsWSO2
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...apidays
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxRustici Software
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024The Digital Insurer
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsNanddeep Nachan
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 

Recently uploaded (20)

Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelNavi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 

Dataflow Management From Edge to Core with Apache NiFi

  • 1. 1 © Hortonworks Inc. 2011–2018. All rights reserved Dataflow Management From Edge to Core with Apache NiFi Andy LoPresto | @yolopey Sr. Member of Technical Staff at Hortonworks, Apache NiFi PMC & Committer 11 October 2018 Dataworks Summit Singapore
  • 2. 2 © Hortonworks Inc. 2011–2018. All rights reserved Gauging Audience Familiarity with NiFi “What’s a NeeFee?” No experience with dataflow No experience with NiFi “I can pick this up pretty quickly” Some experience with dataflow Some experience with NiFi “I refactored the Ambari integration endpoint to allow for mutual authentication TLS during my coffee break” Forgotten more about NiFi than most of us will ever know
  • 3. 3 © Hortonworks Inc. 2011–2018. All rights reserved Agenda • What is dataflow and what are the challenges? • Apache NiFi • Apache MiNiFi • Apache NiFi Registry • Complementary Tools • Community • All slides provided online, so no need to transcribe
  • 4. 4 © Hortonworks Inc. 2011–2018. All rights reserved What Is Dataflow?
  • 5. 5 © Hortonworks Inc. 2011–2018. All rights reserved What Is Dataflow? • Moving some content from A to B • Content could be any bytes • Logs • HTTP • XML • CSV • Images • Video • Telemetry Producers A.K.A Things Anything AND Everything Internet! Consumers • User • Storage • System • …More Things
  • 6. © Hortonworks Inc. 2011–2018. All rights reserved Moving Data Effectively Is Hard “Data Pipeline” https://xkcd.com/2054/
  • 7. © Hortonworks Inc. 2011–2018. All rights reserved • Standards • Formats • Protocols • Veracity • Validity • Schemas • Partitioning/Bun dling Data Dataflow Challenges in 3 Categories Infrastructure • “Exactly Once” Delivery • Ensuring Security • Overcoming Security • Credential Management • Network People • Compliance • “That [person|team|g roup]” • Consumers Change • Requirements Change • “Exactly Once” Delivery
  • 8. © Hortonworks Inc. 2011–2018. All rights reserved Raise your hand if you want to maintain Python scripts for the rest of your life Let’s Connect Lots of As to Bs to As to Cs to Bs to Δs to Cs to ϕs
  • 9. 9 © Hortonworks Inc. 2011–2018. All rights reserved Apache NiFi
  • 10. © Hortonworks Inc. 2011–2018. All rights reserved • Guaranteed delivery • Data buffering • Backpressure • Pressure release • Prioritized queuing • Flow specific QoS • Latency vs. throughput • Loss tolerance Key Features Apache NiFi • Data provenance • Supports push and pull models • Recovery/recording a rolling log of fine-grained history • Visual command and control • Flow templates • Pluggable, multi-tenant security • Designed for extension • Clustering
  • 11. © Hortonworks Inc. 2011–2018. All rights reserved Flowfiles Are Like HTTP Data HTTP Data FlowFile HTTP/1.1 200 OK Date: Sun, 10 Oct 2010 23:26:07 GMT Server: Apache/2.2.8 (CentOS) OpenSSL/0.9.8g Last-Modified: Sun, 26 Sep 2010 22:04:35 GMT ETag: "45b6-834-49130cc1182c0" Accept-Ranges: bytes Content-Length: 13 Connection: close Content-Type: text/html Hello world! Standard FlowFile Attributes Key: 'entryDate’ Value: 'Fri Jun 17 17:15:04 EDT 2016' Key: 'lineageStartDate’ Value: 'Fri Jun 17 17:15:04 EDT 2016' Key: 'fileSize’ Value: '23609' FlowFile Attribute Map Content Key: 'filename’ Value: '15650246997242' Key: 'path’ Value: './’ Binary Content * Header Content
  • 12. © Hortonworks Inc. 2011–2018. All rights reserved User Interface Less of this… … more of this
  • 13. © Hortonworks Inc. 2011–2018. All rights reserved Deeper Ecosystem Integration: 274+ Processors, 57 Controller Services Hash Extract Merge Duplicate Scan GeoEnrich Replace ConvertSplit Translate Route Content Route Context Route Text Control Rate Distribute Load Generate Table Fetch Jolt Transform JSON Prioritized Delivery Encrypt Tail Evaluate Execute All Apache project logos are trademarks of the ASF and the respective projects. Fetch HTTP Syslog Email HTML Image HL7 FTP UDP XML SFTP AMQP WebSocket Parse Records Convert Records
  • 14. 22 © Hortonworks Inc. 2011–2018. All rights reserved Apache MiNiFi
  • 15. 23 © Hortonworks Inc. 2011–2018. All rights reserved IoT Challenges • Limited computing capability • Limited power/network • Restricted software library/platform availability • No UI • Physically inaccessible • Not frequently updated • Competing standards/protocols • Scalability • Privacy & Security @_lennart
  • 16. © Hortonworks Inc. 2011–2018. All rights reserved27 • NiFi is designed to “own the box” • NiFi 0.7.x started up in about 10-15 minutes on RP3 (593 MB) • NiFi 1.x started up in about 30 minutes on RP3 (760 MB) • 33 new processors • Rewrite for multi tenant authorization • Complete UI overhaul So Why Do We Need a Different Solution?
  • 17. © Hortonworks Inc. 2011–2018. All rights reserved28 • Get the key parts of NiFi close to where data begins and provide bidirectional communication • NiFi lives in the data center — give it an enterprise server or a cluster of them • MiNiFi lives as close to where data is born and is a guest on that device or system • IoT • Connected car • Legacy hardware Apache NiFi Subproject: MiNiFi
  • 18. © Hortonworks Inc. 2011–2018. All rights reserved30 • MiNiFi Java (v0.5.0) • Modified version of NiFi • No UI • YAML configuration • Reduced processor count • 63+ by default, more available with additional NARs • MiNiFi C++ (v0.5.0) • Written from scratch • 33 processors by default • Bi-directional site-to-site & provenance data Flavors of MiNiFi
  • 19. © Hortonworks Inc. 2011–2018. All rights reserved32 • NiFi • Design flows • Aggregate data from many sources • Perform routing/analysis/SEP • MiNiFi • Receive flows • Collect data • Send for processing How Does MiNiFi Interact with NiFi?
  • 20. © Hortonworks Inc. 2011–2018. All rights reserved33 • We’ve been imagining EDGE to CORE as a bi-directional linear system • Let’s expand that to the real world Let’s Add Dimensionality
  • 21. © Hortonworks Inc. 2011–2018. All rights reserved34 • Data tagging/provenance • Governance from edge (geopolitical restrictions) • Security (encryption, certificate-based authentication) • Low latency (immediate reactions & decision-making) What Does MiNiFi provide? Connected Car Reference Platform Box Tuner + DSRC CardConnectivity Card
  • 22. © Hortonworks Inc. 2011–2018. All rights reserved37 • Site-to-Site • NiFi protocol • Two implementations • Raw socket • HTTP(S) • Secured with mutual authentication TLS • HTTP(S), (S)FTP, JMS, Syslog, File, Email, Process MiNiFi Exfil
  • 23. 38 © Hortonworks Inc. 2011–2018. All rights reserved Apache NiFi Registry
  • 24. 39 © Hortonworks Inc. 2011–2018. All rights reserved Flow Development Lifecycle (FDLC) • Origins of NiFi • Operator Experience • MC data, don’t drop, mitigate temporarily • Version Control • Environment Promotion
  • 25. 40 © Hortonworks Inc. 2011–2018. All rights reserved Operator Experience
  • 26. © Hortonworks Inc. 2011–2018. All rights reserved41 • Shows previous values (user, time changed) • Sensitive values are always encrypted at rest and never returned via the API Component Property History
  • 27. 42 © Hortonworks Inc. 2011–2018. All rights reserved Exporting Flows • XML templates • Copying flow.xml.gz between systems
  • 28. 43 © Hortonworks Inc. 2011–2018. All rights reserved Challenges • Templates • Updates/replacement • Sensitive property replacement • Flow.xml.gz migration • Key synchronization • Environment promotion • Approval processes • Verifiability
  • 29. 44 © Hortonworks Inc. 2011–2018. All rights reserved Template Replacement • Export a new version of template • Transfer (somehow) • Verify? • Import onto canvas side-by-side existing flow • Stop processors • Empty queues • Reconnect queues • Start • Pray?
  • 30. 45 © Hortonworks Inc. 2011–2018. All rights reserved Template Replacement
  • 31. © Hortonworks Inc. 2011–2018. All rights reserved46 • Previously, flows were exported via XML templates • Didn’t contain sensitive values • Couldn’t be updated in-place • No tracking system • NiFi Registry brings asset management as first-class citizen to NiFi • Flows can be versioned Introducing Apache NiFi Registry 0.3.0 NiFi Registry for Dataflows
  • 32. © Hortonworks Inc. 2011–2018. All rights reserved47 • Connect multiple NiFi instances to a NiFi Registry instance • Communicate between multiple NiFi Registry instances • via multiple Registry Clients • via NiFi CLI Flows Can Be Promoted Between Environments
  • 33. © Hortonworks Inc. 2011–2018. All rights reserved48 • Git-backed persistence • Share flows via GitHub, etc. • Commit hooks • Register a hook & action • “When a new version of the flow is committed to QA Registry, email the QA team and post in the QA Deploy Slack channel” • Pluggable DB implementations Extensibility
  • 34. 49 © Hortonworks Inc. 2011–2018. All rights reserved Demo
  • 35. © Hortonworks Inc. 2011–2018. All rights reserved50 • Install nifi-registry • $ mvn clean install • $ ./bin/nifi-registry.sh start • Browse to http://localhost:18080 Create Registry
  • 36. 51 © Hortonworks Inc. 2011–2018. All rights reserved Create Bucket
  • 37. © Hortonworks Inc. 2011–2018. All rights reserved52 Connect to NiFi
  • 38. © Hortonworks Inc. 2011–2018. All rights reserved53 Create Process Group
  • 39. © Hortonworks Inc. 2011–2018. All rights reserved54 Commit Version
  • 40. © Hortonworks Inc. 2011–2018. All rights reserved55 View Flow in Registry
  • 41. © Hortonworks Inc. 2011–2018. All rights reserved56 Import New Instance into NiFi
  • 42. © Hortonworks Inc. 2011–2018. All rights reserved57 Modify the Original Flow
  • 43. © Hortonworks Inc. 2011–2018. All rights reserved58 See Local Changes Before Committing
  • 44. © Hortonworks Inc. 2011–2018. All rights reserved59 Commit
  • 45. © Hortonworks Inc. 2011–2018. All rights reserved60 Update New Instance from Registry
  • 46. 61 © Hortonworks Inc. 2011–2018. All rights reserved Complementary Tools
  • 47. © Hortonworks Inc. 2011–2018. All rights reserved62 • NiFi Toolkit • NiPyAPI • MiNiFi Converter Toolkit Complementary Tools
  • 48. 63 © Hortonworks Inc. 2011–2018. All rights reserved NiFi Toolkit • TLS Toolkit • Generates, signs, and packages keys and certificates for NiFi services (node/cluster, clients) • Encrypt Config • Protects sensitive configuration values like passwords • CLI • Interacts with NiFi & NiFi Registry to operate on flows
  • 49. 64 © Hortonworks Inc. 2011–2018. All rights reserved NiPyAPI • Python wrapper around NiFi REST API • Community-provided by Daniel Chaffelson • Exposes common operations for automation, batch processing, recursion, etc. dev_bucket = nipyapi.versioning.get_registry_bucket(dev_bucket_name) dev_ver_flow = nipyapi.versioning.get_flow_in_bucket( dev_bucket.identifier, identifier=dev_ver_flow_name ) dev_export = nipyapi.versioning.export_flow_version( bucket_id=dev_bucket.identifier, flow_id=dev_ver_flow.identifier, mode='yaml' )
  • 50. 65 © Hortonworks Inc. 2011–2018. All rights reserved MiNiFi Converter Toolkit • Save as template from NiFi • Run $ ./bin/config.sh transform template.xml config.yml • MiNiFi flow ready to run
  • 51. 66 © Hortonworks Inc. 2011–2018. All rights reserved Community
  • 52. © Hortonworks Inc. 2011–2018. All rights reserved67 • FDLC with Apache NiFi, Kevin Doran • NiPyAPI Docs, Daniel Chaffelson • DevOps Tips, Tim Spann • Automate Workflow, Pierre Villard More Resources
  • 53. © Hortonworks Inc. 2011–2018. All rights reserved68 • NiFi 1.8.0 — … Oct 2018 (170+ Jiras) • Jetty, DB improvements • Auto load-balancing queues • TLS Toolkit w/ external CA • Record processor improvements • MiNiFi C++ 0.5.0 — 6 June 2018 • MiNiFi Java 0.5.0 — 7 July 2018 • NiFi Registry 0.3.0 — 25 Sept 2018 New Announcements
  • 54. © Hortonworks Inc. 2011–2018. All rights reserved69 Community Health
  • 55. © Hortonworks Inc. 2011–2018. All rights reserved70 Apache NiFi site https://nifi.apache.org Subproject MiNiFi site https://nifi.apache.org/minifi/ Subscribe to and collaborate at dev@nifi.apache.org users@nifi.apache.org Submit Ideas or Issues https://issues.apache.org/jira/browse/NIFI Follow us on Twitter @apachenifi Learn More and Join Us
  • 56. 72 © Hortonworks Inc. 2011–2018. All rights reserved Thank you alopresto@hortonworks.com | alopresto@apache.org | @yolopey github.com/alopresto/slides