SlideShare une entreprise Scribd logo
1  sur  30
Télécharger pour lire hors ligne
1 © Hortonworks Inc. 2011–2018. All rights reserved
Hortonworks DataFlow (HDF) 3.3
Taking Stream Processing to the next level
Dinesh Chandrasekhar
Director, Product Marketing, HDF & IoT
@AppInt4All
2 © Hortonworks Inc. 2011–2018. All rights reserved
What Is Hortonworks DataFlow (HDF)?
Hortonworks DataFlow (HDF) is a scalable,
real-time streaming data platform that
collects, curates, and analyzes data so
customers gain key insights for immediate
actionable intelligence.
3 © Hortonworks Inc. 2011–2018. All rights reserved
4 © Hortonworks Inc. 2011–2018. All rights reserved
Data Movement
Optimize resource utilization by moving data
between data centers or between on-
premises infrastructure and cloud
infrastructure
Optimize Log Collection & Analysis
Optimize log analytics solutions such as Splunk
by using HDF as a single platform to collect
and deliver multiple data sources and using
HDP for lower cost storage options
Gain key insights with Streaming Analytics
Accelerate big data ROI by analyzing streaming
data for patterns, comparing with ML models
and delivering actionable intelligence
Single view / 360 view of customer
Ingest, transform and combine customer
data from multiple sources into a single
data view / lake
Stream Processing
Combine multiple streams of data in
real-time, enrich the data and route it to
different end points based on rules
Capture IoT Data
Ingest sensor data from IoT devices and
stream it for further processing and
comprehensive analysis
Common HDF Use Cases
5 © Hortonworks Inc. 2011–2018. All rights reserved
Improving Healthcare with SMART data
Combine multi-format data
streams, with hundreds of
sources, into one platform
• Needed a platform that could
combine multi-format data
streaming
• Data scarcity & latency problems
• Machine learning & data science
• First to deliver SMART real-time
streaming data
• Clearsense’s Inception™ product
enables fast decisions for clinicians
• Customers have access to all data
sources with HDP & HDF
Cloud-based systems
architected to deliver SMART
data, using HDP and HDF
• Mission critical data is now available
for doctors to make critical decisions
• Cost efficiencies led to access for
2,000 rural providers
• Real-time data helps prevent “Code
Blue”
Mission-critical data and
relevant insight for 2,000 rural
providers
R E S U L TS O L U T I O NC H A L L E N G E
Photo by rawpixel on Unsplash
6 © Hortonworks Inc. 2011–2018. All rights reserved
Positioning technology products & services empower companies worldwide
Provide accurate data for
small carriers to improve
business results
• 95% of small carriers (less than 50
trucks) have a deficit of data available
• Estimated data, price points and
revenue base opportunity for
controlling fuel cost
• Understanding of freight and lane
movement
• Leveraging big data powering
Blockchain, with machine learning,
to revolutionize Transportation and
Logistics industries
• Analyzed fuel data; can consolidate
data set for small carriers to
generate community data lake
Big Data in the Cloud with
HDP, HDF, and Microsoft
Azure
• Managing for 4 million trucks daily
• $31 billion dollars in freight
movement guides customers to
profitability
• Blockchain driven architecture
Double digit revenue
increase, year over year
R E S U L TS O L U T I O NC H A L L E N G E
Photo by rawpixel.com on Unsplash
7 © Hortonworks Inc. 2011–2018. All rights reserved
What’s new in HDF 3.3?
8 © Hortonworks Inc. 2011–2018. All rights reserved
Version
3.3
Version
2.0
Version
1.8
New
9 © Hortonworks Inc. 2011–2018. All rights reserved
HDF 3.3 Release Themes
Core HDF
Enhancements
Cross-Platform
Integrations
Enhanced
Operations /
Administrative
Features
Enhanced
Streaming
Support
• Support for Kafka 2.0
• Kafka 2.0 NiFi processors
• NiFi connection load balancing
• MQTT performance improvements
• Kafka Streams Support
• Integration with SMM
• With HDP 3.1
• New Hive Kafka Storage Handler
• New Druid Kafka Indexing Service
• Ambari support for Express and Rolling
upgrades from HDF 3.2
• Smartsense use without needing HDP
• Ranger integration with NiFi Registry
• Decommission NiFi nodes in a cluster
• Kafka 2.0
• Ambari and Ranger
• Kafka Streams
• Schema Registry and Ranger
• Knox SSO Support
• Schema Registry and SAM
10 © Hortonworks Inc. 2011–2018. All rights reserved
Core HDF Enhancements
11 © Hortonworks Inc. 2011–2018. All rights reserved
Core HDF Enhancements
Support for NiFi 1.8.0 and Kafka 2.0
• NiFi serves as the centralized hub for managing dataflows in an enterprise environment.
• Kafka serves as the central component in every major streaming architecture.
Key New Features
• Kafka 2.0 support
• Hive 3.1.0 support
• Connection load balancing
• MQTT Performance improvements
Updated to 1.8.0 Updated to 2.0
12 © Hortonworks Inc. 2011–2018. All rights reserved
Enhanced Streaming
Support
13 © Hortonworks Inc. 2011–2018. All rights reserved
Streaming Analytics Reference Architecture
Data Flow Apps
Powered by NiFi
Subcribing
Streaming Analytics
Apps
Kafka is Everywhere. Critical Component of Streaming Architectures
Kafka Producers Kafka Topics Kafka TopicsKafka Consumers & Producers Kafka Consumers
Data Syndication
Services Powered by
Kafka
syndicate-
speed
Kafka Topic
syndicate-
geo
Kafka Topic
syndicate-
transmission
Kafka Topic
syndicate-
temp
Kafka Topic
syndicate-
oil
Kafka Topic
syndicate-
breaks
Kafka Topic
syndicate-
battery
Kafka Topic
syndicate-
start/stop
Kafka Topic
syndicate-
acceleration
Kafka Topic
syndicate-
idle
Kafka Topic
Data Collection
at the Edge
US West Fleet
Truck	Sensors	 C++
Agent
US Central Fleet
Truck	Sensors	 C++
Agent
US East Fleet
Truck	Sensors	 C++
Agent
Analytics App 1
Analytics App 2
Analytics App 5
Analytics App 3
Analytics App 4
gateway-west-
raw-sensors
Kafka Topic
IOT Ingest Gateway
Powered by Kafka
gateway-central-
raw-sensors
Kafka Topic
gateway-east-
raw-sensors
Kafka Topic
14 © Hortonworks Inc. 2011–2018. All rights reserved
Cure is Here: Hortonworks Streams Messaging Manager (SMM)
What is SMM?
à Kafka Management and Monitoring tool
à Cure the “Kafka Blindness”
à Single Monitoring Dashboard for all your
Kafka Clusters across 4 entities
– Broker
– Producer
– Topic
– Consumer
à Supports multiple HDP and/or HDF Kafka
Clusters
à REST as a First Class Citizen
à Delivered as a DataPlane Service
15 © Hortonworks Inc. 2011–2018. All rights reserved
Different Kafka Analytics Engine / Access Patterns
for Different Use Cases
Streaming Event Storage Substrate
Topic A
Kafka Topic Kafka Topic
Topic B
Kafka Topic
Topic C
Kafka Topic
Topic D
Kafka Topic
Topic X
3 New Kafka Analytics Access Patterns
Stream Processing
Streaming Access Pattern
N
ew
16 © Hortonworks Inc. 2011–2018. All rights reserved
When is Kafka Streams an ideal choice for Stream Processing?
• Your Application consists of Kafka to Kafka Pipeline
• You don’t need/want another cluster for stream processing
• You want to perform common stream processing functions like filtering, joins,
aggregations, enrichments on the stream for simpler stream processing apps
• Your target user are developers with java dev backgrounds
• Your use cases are building lightweight microservices, simple ETL and stream analytics
apps
17 © Hortonworks Inc. 2011–2018. All rights reserved
Common Microservice / Streaming Analytics Requirements
Requirement # Requirement Description
Req. #1 Create streams consuming from the two Kafka topics.
Req. #2 Join the streams of the Geo and Speed sensors over a time based aggregation window.
Req. #3 Apply rules on the stream to filter on events of interest.
Req. #4 Calculate the average speed of driver over 3 minute window and create alert for speeding driver
Req. #5 Find all the drivers who have be speeding (> 80) over that 3 minute window
Req. #6 Send alerts for speeding drivers to downstream alert topic
Req. #7 Apply access control (ACL) to the the source kafka topics, the alert topic and intermediate topics that are created by
Kafka Streams apps
Req. #8 Monitor each MicroService providing a view into producers, consumers, brokers, and key metrics like consumer
group lag, etc.
18 © Hortonworks Inc. 2011–2018. All rights reserved
Kafka Streams Microservices Architecture
syndicate-
geo-event-
avro
Kafka Topic
syndicate-
speed-event-
avro
Kafka Topic
driver-
violation-
events
Kafka Topic
Req #1: Create streams consuming from
2 Kafka topics
Req #2: Join the Geo & Speed sensors over
a time based aggregation window.
Req #3: Apply rules on the stream to filter
on events of interest.
MicroService 1
JoinFilterGeoSpeed
MicroService
MicroService 1
JoinFilterGeoSpeed
MicroService
Req #4: Calculate the average
speed of driver over 3 minute
window
MicroService 2
CalclateDriverAvgSpeed
MicroService
MicroService 2
CalclateDriverAvgSpeed
MicroService
driver-
average-
speed
Kafka Topic
Req #5: Find all the drivers who have be
speeding (> 80) over that 3 minute
window.
Req #6: Send alerts for speeding drivers to
downstream alert topic
MicroService 3
AlertSpeedingDrivers
MicroService
MicroService 3
AlertSpeedingDrivers
MicroService
alerts-
speeding-
drivers
Kafka Topic
19 © Hortonworks Inc. 2011–2018. All rights reserved
Different Kafka Analytics Engine / Access Patterns
for Different Use Cases
SQL Analytics
SQL Access Pattern
Streaming Event Storage Substrate
Topic A
Kafka Topic Kafka Topic
Topic B
Kafka Topic
Topic C
Kafka Topic
Topic D
Kafka Topic
Topic X
3 New Kafka Analytics Access Patterns
Stream Processing
Streaming Access Pattern
N
ew
KAFKA SQL
New
20 © Hortonworks Inc. 2011–2018. All rights reserved
Kafka SQL – Interactive SQL Analytics
à Customers are starting to use Kafka for
longer term storage. E.g.: Retention Periods
for Kafka topics are getting longer
à Hence, customers want ability to query and
perform interactive analytics on the data
stored in Kafka topics
Key Requirements
à SQL is a must. Users don’t want to learn a new
engine/DSL to do analytics.
à Kafka Topics becomes tables that customers can execute
SQL analytical queries
à Joins across multiple streaming kafka topics and
traditional tables (reference data) for enrichment
à Aggregations over windows (group by, order by, lag, lead)
à UDF support for extensibility
à JDBC/ODBC support. Ablity to connect enterprise BI tools
to do analytics on Kafka topics
à Rich ACL support including column level security
à Governance with robust Audit and Lineage
BI User
SQL
BI Tools
Data Analytics Studio
(DAS)
Tableau
Excel, MSI,
SAS/Access
SQL
Access
Pattern
Kafka Topic
syndicate-
speed-event-
json
Kafka Topic
syndicate-
geo-event-
json
Kafka Cluster
JDBC
ODBC
21 © Hortonworks Inc. 2011–2018. All rights reserved
Hive Kafka SQL – “Real SQL on Real Time Stream”
What is Hive Kafka SQL?
à New Hive Storage Handler for Kafka that
allows users to view Kafka topics as Hive
tables.
à Takes full advantage of all the Hive
analytical operators/capabilities
supporting joins, aggregations, UDFs, push
down predicate filtering, windowing etc.
à Support full Hive/Ranger integration
enabling capabilities such as column level
ACLs for events in Kafka topics
à Supports secure/Kerberized Hive and Kafka
deployments
BI User
Kafka Topic
syndicate-
speed-event-
json
Kafka Topic
syndicate-
geo-event-
json
Kafka Cluster
SQL
BI Tools
Data Analytics Studio
(DAS)
Tableau
Excel, MSI,
SAS/Access
KAFKA SQL
22 © Hortonworks Inc. 2011–2018. All rights reserved
Different Kafka Analytics Engine / Access Patterns
for Different Use Cases
OLAP Analytics
OLAP Access Pattern
SQL Analytics
SQL Access Pattern
Streaming Event Storage Substrate
Topic A
Kafka Topic Kafka Topic
Topic B
Kafka Topic
Topic C
Kafka Topic
Topic D
Kafka Topic
Topic X
3 New Kafka Analytics Access Patterns
Stream Processing
Streaming Access Pattern
N
ew
KAFKA SQL
New
KAFKA OLAP
New
23 © Hortonworks Inc. 2011–2018. All rights reserved
Kafka + Druid + Hive = Powerful New Access Pattern for Streaming Data in Kafka
à Apache Druid (incubating) is a high
performance analytics data store for event-
driven data. Druid combines ideas from
OLAP/timeseries databases, and search
systems to create a unified system for
operational analytics.
à The new Druid Kafka Indexing Service indexes
the streaming data in a Kafka topic into a
Druid cube.
à The Indexing Service can be managed by Hive
as an external table providing SQL interfaces
to the Druid cube.
Topic A
Kafka Topic Kafka Topic
Topic B
Kafka Topic
Topic C
Kafka Topic
Topic D
Kafka Topic
Topic X
Druid Cube
Cube A
Druid Cube
Cube X
Druid Cube
Cube M
Druid Cube
Cube N
Druid Cube
Cube O
Indexing Services
Managed By Hive
Kafka Index Service Kafka Index Service
BI User
Query Kafka
Druid Cube
via Hive SQL
Superset
Query Kafka Druid
Cube Directly
with BI tools like
Superset
24 © Hortonworks Inc. 2011–2018. All rights reserved
Layer on OLAP Analytics on top of Microservices architecture
syndicate-
geo-event-
avro
Kafka Topic
syndicate-
speed-event-
avro
Kafka Topic
driver-
violation-
events
Kafka Topic
MicroService 1
JoinFilterGeoSpeed
MicroService
MicroService 1
JoinFilterGeoSpeed
MicroService
MicroService 2
CalclateDriverAvgSpeed
MicroService
MicroService 2
CalclateDriverAvgSpeed
MicroService
driver-
average-
speed
Kafka Topic
MicroService 3
AlertSpeedingDrivers
MicroService
MicroService 3
AlertSpeedingDrivers
MicroService
alerts-
speeding-
drivers
Kafka Topic
Which Drivers have the
most violations in the
last 30 minutes?
Rollup violations by
driver, violation type
and the route?
Are there specific routes that tend to
cause more violations/incidents than
others?
How many violations in
the last hour?
Analytics on streaming data in
driver-violation-events topic
How many speeding alerts
in the last 5 minutes?
Which drivers have the most
speeding alerts in last 2 hours?
Analytics on streaming data in
driver-violation-events topic
25 © Hortonworks Inc. 2011–2018. All rights reserved
Cross-Platform
Enhancements
26 © Hortonworks Inc. 2011–2018. All rights reserved
3 Streaming Engines with the same set of shared platform services
Kafka Cluster
Topic A
Kafka Topic Kafka Topic
Topic B
Kafka Topic
Topic C
Kafka Topic
Topic D
Shared Platform Services
*
HDF/HDP Now Supports
3 Streaming Engines to Choose From
Analytics App 2 Analytics App 3
Analytics App 1
27 © Hortonworks Inc. 2011–2018. All rights reserved
Version
3.3
Version
2.0
Version
1.8
New
28 © Hortonworks Inc. 2011–2018. All rights reserved
Comprehensive streaming platform – Only big data vendor to offer a comprehensive
streaming platform from real-time data ingestion, transformation, routing to
descriptive, prescriptive and predictive analytics.
100% open source technology – Only vendor with this strategy; prevents vendor lock-in
260+ pre-built processors – Only product to offer such comprehensive connectivity
from edge to enterprise
Built-in data provenance – Only product in the market to offer out-of-the-box data
provenance on data-in-motion
Key Differentiators
3 Stream processing engines – Only vendor to offer a choice of three stream processing
engines to customers for all their streaming architecture needs
29 © Hortonworks Inc. 2011–2018. All rights reserved
Where can I find more information?
• HDF product page – www.hortonworks.com/hdf
• HDF 3.3 product documentation
• HDF 3.3 Release notes
• Blog posts –
• What’s new in HDF 3.3?
• Democratizing Analytics within Kafka with Three New Access Patterns
• Kafka Streams – Is it the right Stream Processing engine for you?
• Building Secure and Governed Microservices with Kafka Streams
• More to come
• Monitoring Kafka Streams Microservices with Streams Messaging Manager
• Real SQL on Real-time Streams in Kafka: Introducing the new Kafka Hive Integration
30 © Hortonworks Inc. 2011–2018. All rights reserved
Happy Holidays!
Happy New Year!!

Contenu connexe

Tendances

Fast SQL on Hadoop, really?
Fast SQL on Hadoop, really?Fast SQL on Hadoop, really?
Fast SQL on Hadoop, really?DataWorks Summit
 
Reaching scale limits on a Hadoop platform: issues and errors created by spee...
Reaching scale limits on a Hadoop platform: issues and errors created by spee...Reaching scale limits on a Hadoop platform: issues and errors created by spee...
Reaching scale limits on a Hadoop platform: issues and errors created by spee...DataWorks Summit
 
Multi-tenant Hadoop - the challenge of maintaining high SLAS
Multi-tenant Hadoop - the challenge of maintaining high SLASMulti-tenant Hadoop - the challenge of maintaining high SLAS
Multi-tenant Hadoop - the challenge of maintaining high SLASDataWorks Summit
 
Synchronicity of a distributed financial system
Synchronicity of a distributed financial systemSynchronicity of a distributed financial system
Synchronicity of a distributed financial systemDataWorks Summit
 
Lessons learned running a container cloud on YARN
Lessons learned running a container cloud on YARNLessons learned running a container cloud on YARN
Lessons learned running a container cloud on YARNDataWorks Summit
 
Achieving a 360-degree view of manufacturing via open source industrial data ...
Achieving a 360-degree view of manufacturing via open source industrial data ...Achieving a 360-degree view of manufacturing via open source industrial data ...
Achieving a 360-degree view of manufacturing via open source industrial data ...DataWorks Summit
 
Compute-based sizing and system dashboard
Compute-based sizing and system dashboardCompute-based sizing and system dashboard
Compute-based sizing and system dashboardDataWorks Summit
 
Big Data Platform Processes Daily Healthcare Data for Clinic Use at Mayo Clinic
Big Data Platform Processes Daily Healthcare Data for Clinic Use at Mayo ClinicBig Data Platform Processes Daily Healthcare Data for Clinic Use at Mayo Clinic
Big Data Platform Processes Daily Healthcare Data for Clinic Use at Mayo ClinicDataWorks Summit
 
Hadoop Operations – Past, Present, and Future
Hadoop Operations – Past, Present, and FutureHadoop Operations – Past, Present, and Future
Hadoop Operations – Past, Present, and FutureDataWorks Summit
 
Next gen tooling for building streaming analytics apps: code-less development...
Next gen tooling for building streaming analytics apps: code-less development...Next gen tooling for building streaming analytics apps: code-less development...
Next gen tooling for building streaming analytics apps: code-less development...DataWorks Summit
 
IoT Story: From Edge to HDP
IoT Story: From Edge to HDPIoT Story: From Edge to HDP
IoT Story: From Edge to HDPDataWorks Summit
 
HDFS tiered storage: mounting object stores in HDFS
HDFS tiered storage: mounting object stores in HDFSHDFS tiered storage: mounting object stores in HDFS
HDFS tiered storage: mounting object stores in HDFSDataWorks Summit
 
Real Time Streaming Architecture at Ford
Real Time Streaming Architecture at FordReal Time Streaming Architecture at Ford
Real Time Streaming Architecture at FordDataWorks Summit
 
Making Enterprise Big Data Small with Ease
Making Enterprise Big Data Small with EaseMaking Enterprise Big Data Small with Ease
Making Enterprise Big Data Small with EaseHortonworks
 
Using Spark Streaming and NiFi for the next generation of ETL in the enterprise
Using Spark Streaming and NiFi for the next generation of ETL in the enterpriseUsing Spark Streaming and NiFi for the next generation of ETL in the enterprise
Using Spark Streaming and NiFi for the next generation of ETL in the enterpriseDataWorks Summit
 
Powering Fast Data and the Hadoop Ecosystem with VoltDB and Hortonworks
Powering Fast Data and the Hadoop Ecosystem with VoltDB and HortonworksPowering Fast Data and the Hadoop Ecosystem with VoltDB and Hortonworks
Powering Fast Data and the Hadoop Ecosystem with VoltDB and HortonworksHortonworks
 
Running Enterprise Workloads with an open source Hybrid Cloud Data Architecture
Running Enterprise Workloads with an open source Hybrid Cloud Data ArchitectureRunning Enterprise Workloads with an open source Hybrid Cloud Data Architecture
Running Enterprise Workloads with an open source Hybrid Cloud Data ArchitectureDataWorks Summit
 
Building a data-driven authorization framework
Building a data-driven authorization frameworkBuilding a data-driven authorization framework
Building a data-driven authorization frameworkDataWorks Summit
 

Tendances (20)

Fast SQL on Hadoop, really?
Fast SQL on Hadoop, really?Fast SQL on Hadoop, really?
Fast SQL on Hadoop, really?
 
Reaching scale limits on a Hadoop platform: issues and errors created by spee...
Reaching scale limits on a Hadoop platform: issues and errors created by spee...Reaching scale limits on a Hadoop platform: issues and errors created by spee...
Reaching scale limits on a Hadoop platform: issues and errors created by spee...
 
Multi-tenant Hadoop - the challenge of maintaining high SLAS
Multi-tenant Hadoop - the challenge of maintaining high SLASMulti-tenant Hadoop - the challenge of maintaining high SLAS
Multi-tenant Hadoop - the challenge of maintaining high SLAS
 
Synchronicity of a distributed financial system
Synchronicity of a distributed financial systemSynchronicity of a distributed financial system
Synchronicity of a distributed financial system
 
Lessons learned running a container cloud on YARN
Lessons learned running a container cloud on YARNLessons learned running a container cloud on YARN
Lessons learned running a container cloud on YARN
 
Achieving a 360-degree view of manufacturing via open source industrial data ...
Achieving a 360-degree view of manufacturing via open source industrial data ...Achieving a 360-degree view of manufacturing via open source industrial data ...
Achieving a 360-degree view of manufacturing via open source industrial data ...
 
Compute-based sizing and system dashboard
Compute-based sizing and system dashboardCompute-based sizing and system dashboard
Compute-based sizing and system dashboard
 
Big Data Platform Processes Daily Healthcare Data for Clinic Use at Mayo Clinic
Big Data Platform Processes Daily Healthcare Data for Clinic Use at Mayo ClinicBig Data Platform Processes Daily Healthcare Data for Clinic Use at Mayo Clinic
Big Data Platform Processes Daily Healthcare Data for Clinic Use at Mayo Clinic
 
Containers and Big Data
Containers and Big DataContainers and Big Data
Containers and Big Data
 
Hadoop Operations – Past, Present, and Future
Hadoop Operations – Past, Present, and FutureHadoop Operations – Past, Present, and Future
Hadoop Operations – Past, Present, and Future
 
Next gen tooling for building streaming analytics apps: code-less development...
Next gen tooling for building streaming analytics apps: code-less development...Next gen tooling for building streaming analytics apps: code-less development...
Next gen tooling for building streaming analytics apps: code-less development...
 
IoT Story: From Edge to HDP
IoT Story: From Edge to HDPIoT Story: From Edge to HDP
IoT Story: From Edge to HDP
 
HDFS tiered storage: mounting object stores in HDFS
HDFS tiered storage: mounting object stores in HDFSHDFS tiered storage: mounting object stores in HDFS
HDFS tiered storage: mounting object stores in HDFS
 
Deep learning 101
Deep learning 101Deep learning 101
Deep learning 101
 
Real Time Streaming Architecture at Ford
Real Time Streaming Architecture at FordReal Time Streaming Architecture at Ford
Real Time Streaming Architecture at Ford
 
Making Enterprise Big Data Small with Ease
Making Enterprise Big Data Small with EaseMaking Enterprise Big Data Small with Ease
Making Enterprise Big Data Small with Ease
 
Using Spark Streaming and NiFi for the next generation of ETL in the enterprise
Using Spark Streaming and NiFi for the next generation of ETL in the enterpriseUsing Spark Streaming and NiFi for the next generation of ETL in the enterprise
Using Spark Streaming and NiFi for the next generation of ETL in the enterprise
 
Powering Fast Data and the Hadoop Ecosystem with VoltDB and Hortonworks
Powering Fast Data and the Hadoop Ecosystem with VoltDB and HortonworksPowering Fast Data and the Hadoop Ecosystem with VoltDB and Hortonworks
Powering Fast Data and the Hadoop Ecosystem with VoltDB and Hortonworks
 
Running Enterprise Workloads with an open source Hybrid Cloud Data Architecture
Running Enterprise Workloads with an open source Hybrid Cloud Data ArchitectureRunning Enterprise Workloads with an open source Hybrid Cloud Data Architecture
Running Enterprise Workloads with an open source Hybrid Cloud Data Architecture
 
Building a data-driven authorization framework
Building a data-driven authorization frameworkBuilding a data-driven authorization framework
Building a data-driven authorization framework
 

Similaire à Hortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next Level

Curing Kafka Blindness with Hortonworks Streams Messaging Manager
Curing Kafka Blindness with Hortonworks Streams Messaging ManagerCuring Kafka Blindness with Hortonworks Streams Messaging Manager
Curing Kafka Blindness with Hortonworks Streams Messaging ManagerHortonworks
 
Future of Data New Jersey - HDF 3.0 Deep Dive
Future of Data New Jersey - HDF 3.0 Deep DiveFuture of Data New Jersey - HDF 3.0 Deep Dive
Future of Data New Jersey - HDF 3.0 Deep DiveAldrin Piri
 
HDF 3.1 : An Introduction to New Features
HDF 3.1 : An Introduction to New FeaturesHDF 3.1 : An Introduction to New Features
HDF 3.1 : An Introduction to New FeaturesTimothy Spann
 
Data Con LA 2018 - Streaming and IoT by Pat Alwell
Data Con LA 2018 - Streaming and IoT by Pat AlwellData Con LA 2018 - Streaming and IoT by Pat Alwell
Data Con LA 2018 - Streaming and IoT by Pat AlwellData Con LA
 
Make Streaming IoT Analytics Work for You
Make Streaming IoT Analytics Work for YouMake Streaming IoT Analytics Work for You
Make Streaming IoT Analytics Work for YouHortonworks
 
Schema Registry & Stream Analytics Manager
Schema Registry  & Stream Analytics ManagerSchema Registry  & Stream Analytics Manager
Schema Registry & Stream Analytics ManagerSriharsha Chintalapani
 
Streaming Data and Stream Processing with Apache Kafka
Streaming Data and Stream Processing with Apache KafkaStreaming Data and Stream Processing with Apache Kafka
Streaming Data and Stream Processing with Apache Kafkaconfluent
 
Unlocking insights in streaming data
Unlocking insights in streaming dataUnlocking insights in streaming data
Unlocking insights in streaming dataCarolyn Duby
 
Smart Enterprise Big Data Bus for the Modern Responsive Enterprise- StreamAna...
Smart Enterprise Big Data Bus for the Modern Responsive Enterprise- StreamAna...Smart Enterprise Big Data Bus for the Modern Responsive Enterprise- StreamAna...
Smart Enterprise Big Data Bus for the Modern Responsive Enterprise- StreamAna...Impetus Technologies
 
Curing the Kafka blindness—Streams Messaging Manager
Curing the Kafka blindness—Streams Messaging ManagerCuring the Kafka blindness—Streams Messaging Manager
Curing the Kafka blindness—Streams Messaging ManagerDataWorks Summit
 
Big Data Day LA 2016/ Big Data Track - Building scalable enterprise data flow...
Big Data Day LA 2016/ Big Data Track - Building scalable enterprise data flow...Big Data Day LA 2016/ Big Data Track - Building scalable enterprise data flow...
Big Data Day LA 2016/ Big Data Track - Building scalable enterprise data flow...Data Con LA
 
Kinesis vs-kafka-and-kafka-deep-dive
Kinesis vs-kafka-and-kafka-deep-diveKinesis vs-kafka-and-kafka-deep-dive
Kinesis vs-kafka-and-kafka-deep-diveYifeng Jiang
 
Santander Stream Processing with Apache Flink
Santander Stream Processing with Apache FlinkSantander Stream Processing with Apache Flink
Santander Stream Processing with Apache Flinkconfluent
 
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming Features
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming FeaturesHDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming Features
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming FeaturesHortonworks
 
HDF Powered by Apache NiFi Introduction
HDF Powered by Apache NiFi IntroductionHDF Powered by Apache NiFi Introduction
HDF Powered by Apache NiFi IntroductionMilind Pandit
 
Hortonworks Data in Motion Webinar Series - Part 1
Hortonworks Data in Motion Webinar Series - Part 1Hortonworks Data in Motion Webinar Series - Part 1
Hortonworks Data in Motion Webinar Series - Part 1Hortonworks
 
Harnessing Data-in-Motion with HDF 2.0, introduction to Apache NIFI/MINIFI
Harnessing Data-in-Motion with HDF 2.0, introduction to Apache NIFI/MINIFIHarnessing Data-in-Motion with HDF 2.0, introduction to Apache NIFI/MINIFI
Harnessing Data-in-Motion with HDF 2.0, introduction to Apache NIFI/MINIFIHaimo Liu
 
Social Media Monitoring with NiFi, Druid and Superset
Social Media Monitoring with NiFi, Druid and SupersetSocial Media Monitoring with NiFi, Druid and Superset
Social Media Monitoring with NiFi, Druid and SupersetThiago Santiago
 
ADV Slides: Trends in Streaming Analytics and Message-oriented Middleware
ADV Slides: Trends in Streaming Analytics and Message-oriented MiddlewareADV Slides: Trends in Streaming Analytics and Message-oriented Middleware
ADV Slides: Trends in Streaming Analytics and Message-oriented MiddlewareDATAVERSITY
 
Attunity Hortonworks Webinar- Sept 22, 2016
Attunity Hortonworks Webinar- Sept 22, 2016Attunity Hortonworks Webinar- Sept 22, 2016
Attunity Hortonworks Webinar- Sept 22, 2016Hortonworks
 

Similaire à Hortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next Level (20)

Curing Kafka Blindness with Hortonworks Streams Messaging Manager
Curing Kafka Blindness with Hortonworks Streams Messaging ManagerCuring Kafka Blindness with Hortonworks Streams Messaging Manager
Curing Kafka Blindness with Hortonworks Streams Messaging Manager
 
Future of Data New Jersey - HDF 3.0 Deep Dive
Future of Data New Jersey - HDF 3.0 Deep DiveFuture of Data New Jersey - HDF 3.0 Deep Dive
Future of Data New Jersey - HDF 3.0 Deep Dive
 
HDF 3.1 : An Introduction to New Features
HDF 3.1 : An Introduction to New FeaturesHDF 3.1 : An Introduction to New Features
HDF 3.1 : An Introduction to New Features
 
Data Con LA 2018 - Streaming and IoT by Pat Alwell
Data Con LA 2018 - Streaming and IoT by Pat AlwellData Con LA 2018 - Streaming and IoT by Pat Alwell
Data Con LA 2018 - Streaming and IoT by Pat Alwell
 
Make Streaming IoT Analytics Work for You
Make Streaming IoT Analytics Work for YouMake Streaming IoT Analytics Work for You
Make Streaming IoT Analytics Work for You
 
Schema Registry & Stream Analytics Manager
Schema Registry  & Stream Analytics ManagerSchema Registry  & Stream Analytics Manager
Schema Registry & Stream Analytics Manager
 
Streaming Data and Stream Processing with Apache Kafka
Streaming Data and Stream Processing with Apache KafkaStreaming Data and Stream Processing with Apache Kafka
Streaming Data and Stream Processing with Apache Kafka
 
Unlocking insights in streaming data
Unlocking insights in streaming dataUnlocking insights in streaming data
Unlocking insights in streaming data
 
Smart Enterprise Big Data Bus for the Modern Responsive Enterprise- StreamAna...
Smart Enterprise Big Data Bus for the Modern Responsive Enterprise- StreamAna...Smart Enterprise Big Data Bus for the Modern Responsive Enterprise- StreamAna...
Smart Enterprise Big Data Bus for the Modern Responsive Enterprise- StreamAna...
 
Curing the Kafka blindness—Streams Messaging Manager
Curing the Kafka blindness—Streams Messaging ManagerCuring the Kafka blindness—Streams Messaging Manager
Curing the Kafka blindness—Streams Messaging Manager
 
Big Data Day LA 2016/ Big Data Track - Building scalable enterprise data flow...
Big Data Day LA 2016/ Big Data Track - Building scalable enterprise data flow...Big Data Day LA 2016/ Big Data Track - Building scalable enterprise data flow...
Big Data Day LA 2016/ Big Data Track - Building scalable enterprise data flow...
 
Kinesis vs-kafka-and-kafka-deep-dive
Kinesis vs-kafka-and-kafka-deep-diveKinesis vs-kafka-and-kafka-deep-dive
Kinesis vs-kafka-and-kafka-deep-dive
 
Santander Stream Processing with Apache Flink
Santander Stream Processing with Apache FlinkSantander Stream Processing with Apache Flink
Santander Stream Processing with Apache Flink
 
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming Features
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming FeaturesHDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming Features
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming Features
 
HDF Powered by Apache NiFi Introduction
HDF Powered by Apache NiFi IntroductionHDF Powered by Apache NiFi Introduction
HDF Powered by Apache NiFi Introduction
 
Hortonworks Data in Motion Webinar Series - Part 1
Hortonworks Data in Motion Webinar Series - Part 1Hortonworks Data in Motion Webinar Series - Part 1
Hortonworks Data in Motion Webinar Series - Part 1
 
Harnessing Data-in-Motion with HDF 2.0, introduction to Apache NIFI/MINIFI
Harnessing Data-in-Motion with HDF 2.0, introduction to Apache NIFI/MINIFIHarnessing Data-in-Motion with HDF 2.0, introduction to Apache NIFI/MINIFI
Harnessing Data-in-Motion with HDF 2.0, introduction to Apache NIFI/MINIFI
 
Social Media Monitoring with NiFi, Druid and Superset
Social Media Monitoring with NiFi, Druid and SupersetSocial Media Monitoring with NiFi, Druid and Superset
Social Media Monitoring with NiFi, Druid and Superset
 
ADV Slides: Trends in Streaming Analytics and Message-oriented Middleware
ADV Slides: Trends in Streaming Analytics and Message-oriented MiddlewareADV Slides: Trends in Streaming Analytics and Message-oriented Middleware
ADV Slides: Trends in Streaming Analytics and Message-oriented Middleware
 
Attunity Hortonworks Webinar- Sept 22, 2016
Attunity Hortonworks Webinar- Sept 22, 2016Attunity Hortonworks Webinar- Sept 22, 2016
Attunity Hortonworks Webinar- Sept 22, 2016
 

Plus de Hortonworks

IoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT Strategy
IoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT StrategyIoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT Strategy
IoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT StrategyHortonworks
 
Johns Hopkins - Using Hadoop to Secure Access Log Events
Johns Hopkins - Using Hadoop to Secure Access Log EventsJohns Hopkins - Using Hadoop to Secure Access Log Events
Johns Hopkins - Using Hadoop to Secure Access Log EventsHortonworks
 
Catch a Hacker in Real-Time: Live Visuals of Bots and Bad Guys
Catch a Hacker in Real-Time: Live Visuals of Bots and Bad GuysCatch a Hacker in Real-Time: Live Visuals of Bots and Bad Guys
Catch a Hacker in Real-Time: Live Visuals of Bots and Bad GuysHortonworks
 
HDF 3.2 - What's New
HDF 3.2 - What's NewHDF 3.2 - What's New
HDF 3.2 - What's NewHortonworks
 
Interpretation Tool for Genomic Sequencing Data in Clinical Environments
Interpretation Tool for Genomic Sequencing Data in Clinical EnvironmentsInterpretation Tool for Genomic Sequencing Data in Clinical Environments
Interpretation Tool for Genomic Sequencing Data in Clinical EnvironmentsHortonworks
 
Premier Inside-Out: Apache Druid
Premier Inside-Out: Apache DruidPremier Inside-Out: Apache Druid
Premier Inside-Out: Apache DruidHortonworks
 
Accelerating Data Science and Real Time Analytics at Scale
Accelerating Data Science and Real Time Analytics at ScaleAccelerating Data Science and Real Time Analytics at Scale
Accelerating Data Science and Real Time Analytics at ScaleHortonworks
 
TIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATA
TIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATATIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATA
TIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATAHortonworks
 
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...Hortonworks
 
Delivering Real-Time Streaming Data for Healthcare Customers: Clearsense
Delivering Real-Time Streaming Data for Healthcare Customers: ClearsenseDelivering Real-Time Streaming Data for Healthcare Customers: Clearsense
Delivering Real-Time Streaming Data for Healthcare Customers: ClearsenseHortonworks
 
Webinewbie to Webinerd in 30 Days - Webinar World Presentation
Webinewbie to Webinerd in 30 Days - Webinar World PresentationWebinewbie to Webinerd in 30 Days - Webinar World Presentation
Webinewbie to Webinerd in 30 Days - Webinar World PresentationHortonworks
 
Driving Digital Transformation Through Global Data Management
Driving Digital Transformation Through Global Data ManagementDriving Digital Transformation Through Global Data Management
Driving Digital Transformation Through Global Data ManagementHortonworks
 
Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...
Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...
Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...Hortonworks
 
Unlock Value from Big Data with Apache NiFi and Streaming CDC
Unlock Value from Big Data with Apache NiFi and Streaming CDCUnlock Value from Big Data with Apache NiFi and Streaming CDC
Unlock Value from Big Data with Apache NiFi and Streaming CDCHortonworks
 
4 Essential Steps for Managing Sensitive Data
4 Essential Steps for Managing Sensitive Data4 Essential Steps for Managing Sensitive Data
4 Essential Steps for Managing Sensitive DataHortonworks
 
5 Steps to Create a Company Culture that Embraces the Power of Data
5 Steps to Create a Company Culture that Embraces the Power of Data5 Steps to Create a Company Culture that Embraces the Power of Data
5 Steps to Create a Company Culture that Embraces the Power of DataHortonworks
 
Exploring the Heated-and Completely Unnecessary- Data Lake Debate
Exploring the Heated-and Completely Unnecessary- Data Lake DebateExploring the Heated-and Completely Unnecessary- Data Lake Debate
Exploring the Heated-and Completely Unnecessary- Data Lake DebateHortonworks
 
Sprint's Data Modernization Journey
Sprint's Data Modernization JourneySprint's Data Modernization Journey
Sprint's Data Modernization JourneyHortonworks
 
Modernize Your Existing EDW with IBM Big SQL & Hortonworks Data Platform
Modernize Your Existing EDW with IBM Big SQL & Hortonworks Data PlatformModernize Your Existing EDW with IBM Big SQL & Hortonworks Data Platform
Modernize Your Existing EDW with IBM Big SQL & Hortonworks Data PlatformHortonworks
 
Enterprise Data Science at Scale Meetup - IBM and Hortonworks - Oct 2017
Enterprise Data Science at Scale Meetup - IBM and Hortonworks - Oct 2017 Enterprise Data Science at Scale Meetup - IBM and Hortonworks - Oct 2017
Enterprise Data Science at Scale Meetup - IBM and Hortonworks - Oct 2017 Hortonworks
 

Plus de Hortonworks (20)

IoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT Strategy
IoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT StrategyIoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT Strategy
IoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT Strategy
 
Johns Hopkins - Using Hadoop to Secure Access Log Events
Johns Hopkins - Using Hadoop to Secure Access Log EventsJohns Hopkins - Using Hadoop to Secure Access Log Events
Johns Hopkins - Using Hadoop to Secure Access Log Events
 
Catch a Hacker in Real-Time: Live Visuals of Bots and Bad Guys
Catch a Hacker in Real-Time: Live Visuals of Bots and Bad GuysCatch a Hacker in Real-Time: Live Visuals of Bots and Bad Guys
Catch a Hacker in Real-Time: Live Visuals of Bots and Bad Guys
 
HDF 3.2 - What's New
HDF 3.2 - What's NewHDF 3.2 - What's New
HDF 3.2 - What's New
 
Interpretation Tool for Genomic Sequencing Data in Clinical Environments
Interpretation Tool for Genomic Sequencing Data in Clinical EnvironmentsInterpretation Tool for Genomic Sequencing Data in Clinical Environments
Interpretation Tool for Genomic Sequencing Data in Clinical Environments
 
Premier Inside-Out: Apache Druid
Premier Inside-Out: Apache DruidPremier Inside-Out: Apache Druid
Premier Inside-Out: Apache Druid
 
Accelerating Data Science and Real Time Analytics at Scale
Accelerating Data Science and Real Time Analytics at ScaleAccelerating Data Science and Real Time Analytics at Scale
Accelerating Data Science and Real Time Analytics at Scale
 
TIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATA
TIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATATIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATA
TIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATA
 
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...
 
Delivering Real-Time Streaming Data for Healthcare Customers: Clearsense
Delivering Real-Time Streaming Data for Healthcare Customers: ClearsenseDelivering Real-Time Streaming Data for Healthcare Customers: Clearsense
Delivering Real-Time Streaming Data for Healthcare Customers: Clearsense
 
Webinewbie to Webinerd in 30 Days - Webinar World Presentation
Webinewbie to Webinerd in 30 Days - Webinar World PresentationWebinewbie to Webinerd in 30 Days - Webinar World Presentation
Webinewbie to Webinerd in 30 Days - Webinar World Presentation
 
Driving Digital Transformation Through Global Data Management
Driving Digital Transformation Through Global Data ManagementDriving Digital Transformation Through Global Data Management
Driving Digital Transformation Through Global Data Management
 
Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...
Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...
Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...
 
Unlock Value from Big Data with Apache NiFi and Streaming CDC
Unlock Value from Big Data with Apache NiFi and Streaming CDCUnlock Value from Big Data with Apache NiFi and Streaming CDC
Unlock Value from Big Data with Apache NiFi and Streaming CDC
 
4 Essential Steps for Managing Sensitive Data
4 Essential Steps for Managing Sensitive Data4 Essential Steps for Managing Sensitive Data
4 Essential Steps for Managing Sensitive Data
 
5 Steps to Create a Company Culture that Embraces the Power of Data
5 Steps to Create a Company Culture that Embraces the Power of Data5 Steps to Create a Company Culture that Embraces the Power of Data
5 Steps to Create a Company Culture that Embraces the Power of Data
 
Exploring the Heated-and Completely Unnecessary- Data Lake Debate
Exploring the Heated-and Completely Unnecessary- Data Lake DebateExploring the Heated-and Completely Unnecessary- Data Lake Debate
Exploring the Heated-and Completely Unnecessary- Data Lake Debate
 
Sprint's Data Modernization Journey
Sprint's Data Modernization JourneySprint's Data Modernization Journey
Sprint's Data Modernization Journey
 
Modernize Your Existing EDW with IBM Big SQL & Hortonworks Data Platform
Modernize Your Existing EDW with IBM Big SQL & Hortonworks Data PlatformModernize Your Existing EDW with IBM Big SQL & Hortonworks Data Platform
Modernize Your Existing EDW with IBM Big SQL & Hortonworks Data Platform
 
Enterprise Data Science at Scale Meetup - IBM and Hortonworks - Oct 2017
Enterprise Data Science at Scale Meetup - IBM and Hortonworks - Oct 2017 Enterprise Data Science at Scale Meetup - IBM and Hortonworks - Oct 2017
Enterprise Data Science at Scale Meetup - IBM and Hortonworks - Oct 2017
 

Dernier

Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
Evaluating the top large language models.pdf
Evaluating the top large language models.pdfEvaluating the top large language models.pdf
Evaluating the top large language models.pdfChristopherTHyatt
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 

Dernier (20)

Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Evaluating the top large language models.pdf
Evaluating the top large language models.pdfEvaluating the top large language models.pdf
Evaluating the top large language models.pdf
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 

Hortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next Level

  • 1. 1 © Hortonworks Inc. 2011–2018. All rights reserved Hortonworks DataFlow (HDF) 3.3 Taking Stream Processing to the next level Dinesh Chandrasekhar Director, Product Marketing, HDF & IoT @AppInt4All
  • 2. 2 © Hortonworks Inc. 2011–2018. All rights reserved What Is Hortonworks DataFlow (HDF)? Hortonworks DataFlow (HDF) is a scalable, real-time streaming data platform that collects, curates, and analyzes data so customers gain key insights for immediate actionable intelligence.
  • 3. 3 © Hortonworks Inc. 2011–2018. All rights reserved
  • 4. 4 © Hortonworks Inc. 2011–2018. All rights reserved Data Movement Optimize resource utilization by moving data between data centers or between on- premises infrastructure and cloud infrastructure Optimize Log Collection & Analysis Optimize log analytics solutions such as Splunk by using HDF as a single platform to collect and deliver multiple data sources and using HDP for lower cost storage options Gain key insights with Streaming Analytics Accelerate big data ROI by analyzing streaming data for patterns, comparing with ML models and delivering actionable intelligence Single view / 360 view of customer Ingest, transform and combine customer data from multiple sources into a single data view / lake Stream Processing Combine multiple streams of data in real-time, enrich the data and route it to different end points based on rules Capture IoT Data Ingest sensor data from IoT devices and stream it for further processing and comprehensive analysis Common HDF Use Cases
  • 5. 5 © Hortonworks Inc. 2011–2018. All rights reserved Improving Healthcare with SMART data Combine multi-format data streams, with hundreds of sources, into one platform • Needed a platform that could combine multi-format data streaming • Data scarcity & latency problems • Machine learning & data science • First to deliver SMART real-time streaming data • Clearsense’s Inception™ product enables fast decisions for clinicians • Customers have access to all data sources with HDP & HDF Cloud-based systems architected to deliver SMART data, using HDP and HDF • Mission critical data is now available for doctors to make critical decisions • Cost efficiencies led to access for 2,000 rural providers • Real-time data helps prevent “Code Blue” Mission-critical data and relevant insight for 2,000 rural providers R E S U L TS O L U T I O NC H A L L E N G E Photo by rawpixel on Unsplash
  • 6. 6 © Hortonworks Inc. 2011–2018. All rights reserved Positioning technology products & services empower companies worldwide Provide accurate data for small carriers to improve business results • 95% of small carriers (less than 50 trucks) have a deficit of data available • Estimated data, price points and revenue base opportunity for controlling fuel cost • Understanding of freight and lane movement • Leveraging big data powering Blockchain, with machine learning, to revolutionize Transportation and Logistics industries • Analyzed fuel data; can consolidate data set for small carriers to generate community data lake Big Data in the Cloud with HDP, HDF, and Microsoft Azure • Managing for 4 million trucks daily • $31 billion dollars in freight movement guides customers to profitability • Blockchain driven architecture Double digit revenue increase, year over year R E S U L TS O L U T I O NC H A L L E N G E Photo by rawpixel.com on Unsplash
  • 7. 7 © Hortonworks Inc. 2011–2018. All rights reserved What’s new in HDF 3.3?
  • 8. 8 © Hortonworks Inc. 2011–2018. All rights reserved Version 3.3 Version 2.0 Version 1.8 New
  • 9. 9 © Hortonworks Inc. 2011–2018. All rights reserved HDF 3.3 Release Themes Core HDF Enhancements Cross-Platform Integrations Enhanced Operations / Administrative Features Enhanced Streaming Support • Support for Kafka 2.0 • Kafka 2.0 NiFi processors • NiFi connection load balancing • MQTT performance improvements • Kafka Streams Support • Integration with SMM • With HDP 3.1 • New Hive Kafka Storage Handler • New Druid Kafka Indexing Service • Ambari support for Express and Rolling upgrades from HDF 3.2 • Smartsense use without needing HDP • Ranger integration with NiFi Registry • Decommission NiFi nodes in a cluster • Kafka 2.0 • Ambari and Ranger • Kafka Streams • Schema Registry and Ranger • Knox SSO Support • Schema Registry and SAM
  • 10. 10 © Hortonworks Inc. 2011–2018. All rights reserved Core HDF Enhancements
  • 11. 11 © Hortonworks Inc. 2011–2018. All rights reserved Core HDF Enhancements Support for NiFi 1.8.0 and Kafka 2.0 • NiFi serves as the centralized hub for managing dataflows in an enterprise environment. • Kafka serves as the central component in every major streaming architecture. Key New Features • Kafka 2.0 support • Hive 3.1.0 support • Connection load balancing • MQTT Performance improvements Updated to 1.8.0 Updated to 2.0
  • 12. 12 © Hortonworks Inc. 2011–2018. All rights reserved Enhanced Streaming Support
  • 13. 13 © Hortonworks Inc. 2011–2018. All rights reserved Streaming Analytics Reference Architecture Data Flow Apps Powered by NiFi Subcribing Streaming Analytics Apps Kafka is Everywhere. Critical Component of Streaming Architectures Kafka Producers Kafka Topics Kafka TopicsKafka Consumers & Producers Kafka Consumers Data Syndication Services Powered by Kafka syndicate- speed Kafka Topic syndicate- geo Kafka Topic syndicate- transmission Kafka Topic syndicate- temp Kafka Topic syndicate- oil Kafka Topic syndicate- breaks Kafka Topic syndicate- battery Kafka Topic syndicate- start/stop Kafka Topic syndicate- acceleration Kafka Topic syndicate- idle Kafka Topic Data Collection at the Edge US West Fleet Truck Sensors C++ Agent US Central Fleet Truck Sensors C++ Agent US East Fleet Truck Sensors C++ Agent Analytics App 1 Analytics App 2 Analytics App 5 Analytics App 3 Analytics App 4 gateway-west- raw-sensors Kafka Topic IOT Ingest Gateway Powered by Kafka gateway-central- raw-sensors Kafka Topic gateway-east- raw-sensors Kafka Topic
  • 14. 14 © Hortonworks Inc. 2011–2018. All rights reserved Cure is Here: Hortonworks Streams Messaging Manager (SMM) What is SMM? à Kafka Management and Monitoring tool à Cure the “Kafka Blindness” à Single Monitoring Dashboard for all your Kafka Clusters across 4 entities – Broker – Producer – Topic – Consumer à Supports multiple HDP and/or HDF Kafka Clusters à REST as a First Class Citizen à Delivered as a DataPlane Service
  • 15. 15 © Hortonworks Inc. 2011–2018. All rights reserved Different Kafka Analytics Engine / Access Patterns for Different Use Cases Streaming Event Storage Substrate Topic A Kafka Topic Kafka Topic Topic B Kafka Topic Topic C Kafka Topic Topic D Kafka Topic Topic X 3 New Kafka Analytics Access Patterns Stream Processing Streaming Access Pattern N ew
  • 16. 16 © Hortonworks Inc. 2011–2018. All rights reserved When is Kafka Streams an ideal choice for Stream Processing? • Your Application consists of Kafka to Kafka Pipeline • You don’t need/want another cluster for stream processing • You want to perform common stream processing functions like filtering, joins, aggregations, enrichments on the stream for simpler stream processing apps • Your target user are developers with java dev backgrounds • Your use cases are building lightweight microservices, simple ETL and stream analytics apps
  • 17. 17 © Hortonworks Inc. 2011–2018. All rights reserved Common Microservice / Streaming Analytics Requirements Requirement # Requirement Description Req. #1 Create streams consuming from the two Kafka topics. Req. #2 Join the streams of the Geo and Speed sensors over a time based aggregation window. Req. #3 Apply rules on the stream to filter on events of interest. Req. #4 Calculate the average speed of driver over 3 minute window and create alert for speeding driver Req. #5 Find all the drivers who have be speeding (> 80) over that 3 minute window Req. #6 Send alerts for speeding drivers to downstream alert topic Req. #7 Apply access control (ACL) to the the source kafka topics, the alert topic and intermediate topics that are created by Kafka Streams apps Req. #8 Monitor each MicroService providing a view into producers, consumers, brokers, and key metrics like consumer group lag, etc.
  • 18. 18 © Hortonworks Inc. 2011–2018. All rights reserved Kafka Streams Microservices Architecture syndicate- geo-event- avro Kafka Topic syndicate- speed-event- avro Kafka Topic driver- violation- events Kafka Topic Req #1: Create streams consuming from 2 Kafka topics Req #2: Join the Geo & Speed sensors over a time based aggregation window. Req #3: Apply rules on the stream to filter on events of interest. MicroService 1 JoinFilterGeoSpeed MicroService MicroService 1 JoinFilterGeoSpeed MicroService Req #4: Calculate the average speed of driver over 3 minute window MicroService 2 CalclateDriverAvgSpeed MicroService MicroService 2 CalclateDriverAvgSpeed MicroService driver- average- speed Kafka Topic Req #5: Find all the drivers who have be speeding (> 80) over that 3 minute window. Req #6: Send alerts for speeding drivers to downstream alert topic MicroService 3 AlertSpeedingDrivers MicroService MicroService 3 AlertSpeedingDrivers MicroService alerts- speeding- drivers Kafka Topic
  • 19. 19 © Hortonworks Inc. 2011–2018. All rights reserved Different Kafka Analytics Engine / Access Patterns for Different Use Cases SQL Analytics SQL Access Pattern Streaming Event Storage Substrate Topic A Kafka Topic Kafka Topic Topic B Kafka Topic Topic C Kafka Topic Topic D Kafka Topic Topic X 3 New Kafka Analytics Access Patterns Stream Processing Streaming Access Pattern N ew KAFKA SQL New
  • 20. 20 © Hortonworks Inc. 2011–2018. All rights reserved Kafka SQL – Interactive SQL Analytics à Customers are starting to use Kafka for longer term storage. E.g.: Retention Periods for Kafka topics are getting longer à Hence, customers want ability to query and perform interactive analytics on the data stored in Kafka topics Key Requirements à SQL is a must. Users don’t want to learn a new engine/DSL to do analytics. à Kafka Topics becomes tables that customers can execute SQL analytical queries à Joins across multiple streaming kafka topics and traditional tables (reference data) for enrichment à Aggregations over windows (group by, order by, lag, lead) à UDF support for extensibility à JDBC/ODBC support. Ablity to connect enterprise BI tools to do analytics on Kafka topics à Rich ACL support including column level security à Governance with robust Audit and Lineage BI User SQL BI Tools Data Analytics Studio (DAS) Tableau Excel, MSI, SAS/Access SQL Access Pattern Kafka Topic syndicate- speed-event- json Kafka Topic syndicate- geo-event- json Kafka Cluster JDBC ODBC
  • 21. 21 © Hortonworks Inc. 2011–2018. All rights reserved Hive Kafka SQL – “Real SQL on Real Time Stream” What is Hive Kafka SQL? à New Hive Storage Handler for Kafka that allows users to view Kafka topics as Hive tables. à Takes full advantage of all the Hive analytical operators/capabilities supporting joins, aggregations, UDFs, push down predicate filtering, windowing etc. à Support full Hive/Ranger integration enabling capabilities such as column level ACLs for events in Kafka topics à Supports secure/Kerberized Hive and Kafka deployments BI User Kafka Topic syndicate- speed-event- json Kafka Topic syndicate- geo-event- json Kafka Cluster SQL BI Tools Data Analytics Studio (DAS) Tableau Excel, MSI, SAS/Access KAFKA SQL
  • 22. 22 © Hortonworks Inc. 2011–2018. All rights reserved Different Kafka Analytics Engine / Access Patterns for Different Use Cases OLAP Analytics OLAP Access Pattern SQL Analytics SQL Access Pattern Streaming Event Storage Substrate Topic A Kafka Topic Kafka Topic Topic B Kafka Topic Topic C Kafka Topic Topic D Kafka Topic Topic X 3 New Kafka Analytics Access Patterns Stream Processing Streaming Access Pattern N ew KAFKA SQL New KAFKA OLAP New
  • 23. 23 © Hortonworks Inc. 2011–2018. All rights reserved Kafka + Druid + Hive = Powerful New Access Pattern for Streaming Data in Kafka à Apache Druid (incubating) is a high performance analytics data store for event- driven data. Druid combines ideas from OLAP/timeseries databases, and search systems to create a unified system for operational analytics. à The new Druid Kafka Indexing Service indexes the streaming data in a Kafka topic into a Druid cube. à The Indexing Service can be managed by Hive as an external table providing SQL interfaces to the Druid cube. Topic A Kafka Topic Kafka Topic Topic B Kafka Topic Topic C Kafka Topic Topic D Kafka Topic Topic X Druid Cube Cube A Druid Cube Cube X Druid Cube Cube M Druid Cube Cube N Druid Cube Cube O Indexing Services Managed By Hive Kafka Index Service Kafka Index Service BI User Query Kafka Druid Cube via Hive SQL Superset Query Kafka Druid Cube Directly with BI tools like Superset
  • 24. 24 © Hortonworks Inc. 2011–2018. All rights reserved Layer on OLAP Analytics on top of Microservices architecture syndicate- geo-event- avro Kafka Topic syndicate- speed-event- avro Kafka Topic driver- violation- events Kafka Topic MicroService 1 JoinFilterGeoSpeed MicroService MicroService 1 JoinFilterGeoSpeed MicroService MicroService 2 CalclateDriverAvgSpeed MicroService MicroService 2 CalclateDriverAvgSpeed MicroService driver- average- speed Kafka Topic MicroService 3 AlertSpeedingDrivers MicroService MicroService 3 AlertSpeedingDrivers MicroService alerts- speeding- drivers Kafka Topic Which Drivers have the most violations in the last 30 minutes? Rollup violations by driver, violation type and the route? Are there specific routes that tend to cause more violations/incidents than others? How many violations in the last hour? Analytics on streaming data in driver-violation-events topic How many speeding alerts in the last 5 minutes? Which drivers have the most speeding alerts in last 2 hours? Analytics on streaming data in driver-violation-events topic
  • 25. 25 © Hortonworks Inc. 2011–2018. All rights reserved Cross-Platform Enhancements
  • 26. 26 © Hortonworks Inc. 2011–2018. All rights reserved 3 Streaming Engines with the same set of shared platform services Kafka Cluster Topic A Kafka Topic Kafka Topic Topic B Kafka Topic Topic C Kafka Topic Topic D Shared Platform Services * HDF/HDP Now Supports 3 Streaming Engines to Choose From Analytics App 2 Analytics App 3 Analytics App 1
  • 27. 27 © Hortonworks Inc. 2011–2018. All rights reserved Version 3.3 Version 2.0 Version 1.8 New
  • 28. 28 © Hortonworks Inc. 2011–2018. All rights reserved Comprehensive streaming platform – Only big data vendor to offer a comprehensive streaming platform from real-time data ingestion, transformation, routing to descriptive, prescriptive and predictive analytics. 100% open source technology – Only vendor with this strategy; prevents vendor lock-in 260+ pre-built processors – Only product to offer such comprehensive connectivity from edge to enterprise Built-in data provenance – Only product in the market to offer out-of-the-box data provenance on data-in-motion Key Differentiators 3 Stream processing engines – Only vendor to offer a choice of three stream processing engines to customers for all their streaming architecture needs
  • 29. 29 © Hortonworks Inc. 2011–2018. All rights reserved Where can I find more information? • HDF product page – www.hortonworks.com/hdf • HDF 3.3 product documentation • HDF 3.3 Release notes • Blog posts – • What’s new in HDF 3.3? • Democratizing Analytics within Kafka with Three New Access Patterns • Kafka Streams – Is it the right Stream Processing engine for you? • Building Secure and Governed Microservices with Kafka Streams • More to come • Monitoring Kafka Streams Microservices with Streams Messaging Manager • Real SQL on Real-time Streams in Kafka: Introducing the new Kafka Hive Integration
  • 30. 30 © Hortonworks Inc. 2011–2018. All rights reserved Happy Holidays! Happy New Year!!