What can Apache Pulsar do for FinTech?
streamnative.io
Tim Spann
Developer Advocate
StreamNative
● FLiP(N) Stack = Flink, Pulsar and NiFi Stack
● Streaming Systems & Data Architecture Expert
● Experience:
○ 15+ years of experience with streaming technologies including Apache
Pulsar, Apache Flink, Apache Spark, Apache NiFi, Big Data, Cloud,
Trino, Aerospike, IoT and more.
John Kinson
Head of Sales, EMEA
StreamNative
● Startup, Scale-up and Large Enterprise expert
● Building the StreamNative Sales function in EMEA
● Experience:
○ 25+ years of building and selling distributed and embedded systems in
the telecoms, digital media and cloud enterprise software industries
Agenda
01 Welcome
02 Introduction to Messaging + Data Streaming
03 Introduction to Apache Pulsar
04 Why Open Source
05 Resources
06 Q&A
➔ Asynchronous messages triggered by events
➔ Consuming messages regardless of language, system, or sender
➔ Queueing
➔ Routing
➔ Work Queues
➔ JPMorgan Chase AMQP
MESSAGING
➔ Perform in Real-Time
➔ Process Events as They Happen
➔ Joining Streams with SQL
➔ Find Anomalies Immediately
➔ Ordering and Arrival Semantics
➔ Continuous Streams of Data
DATA STREAMING
Accessing historical as well as
real-time data
Pub/sub model enables event streams
to be sent from multiple producers,
and consumed by multiple consumers
To process large amounts of data in a
highly scalable way
When are Messaging and
Streaming used?
Industry trends
Banking: transforming from siloed systems to combined data streams
Insurance: provide faster claim processing, fraud detection and system integration
IoT: handle huge volumes of data from sensors
Apache Pulsar is a Cloud-Native Messaging
and Event-Streaming Platform.
Messaging
Ideal for work queues that do not
require tasks to be performed in a
particular order—for example,
sending one email message to many
recipients.
RabbitMQ and Amazon SQS are
examples of popular queue-based
message systems.
Pulsar: Unified Messaging + Data Streaming
.. and Streaming
Works best in situations where the
order of messages is important—for
example, data ingestion.
Kafka and Amazon Kinesis are
examples of messaging systems that
use streaming semantics for
consuming messages.
Unified Messaging and Streaming
[Diagram: StreamNative Hub and StreamNative Cloud. Unified batch and stream computing (batch + stream) sits on top of unified batch and stream storage (queuing + streaming), with offload to tiered storage. Protocols: Pulsar, KoP, MoP, WebSocket. Other labels: Pulsar Sink, Streaming Edge Gateway, CDC, Apps.]
Building Microservices
Asynchronous Communication
Building Real-Time Applications
Highly Resilient
Tiered storage
Pulsar Benefits
Pulsar Global Adoption
Using Pulsar with Fintech
Low latency
Geo-replication
Data integrity
High availability
Durability
Multi-tenancy
Multiple data consumers:
Transactions, payment
processing, alerts,
analytics, KYC, fraud
detection with ML & AI
Large data volumes,
high scalability
Financial event
messaging
Many topics, producers,
consumers
Why Open
Source Pulsar?
Sijie Guo
ASF Member
Pulsar/BookKeeper PMC
Founder and CEO
Jia Zhai
Pulsar/BookKeeper PMC
Co-Founder
Matteo Merli
ASF Member
Pulsar/BookKeeper PMC
CTO
● We would get many benefits from an
open source model
○ Other companies would help
develop the product
○ Better security, code escrow,
longevity
● We would keep the core features in the
OSS version
● We could build commercial offerings,
services around the core product
OUR BETS AND EARLY DECISIONS
Why Open
Source Pulsar?
C/OSS Model
Benefits: Many developers; Security, Longevity, Escrow
Challenges: Why pay?; Multiple roadmaps
RESOURCES
Here are resources to continue your journey
with Apache Pulsar
Now Available
On-Demand Pulsar
Training
Academy.StreamNative.io
[On-Demand Video]
Introduction to Pulsar
Watch Now!
FREE ebook
Apache Pulsar
in Action
Access Now!
Q&A
John Kinson
Head of Sales, EMEA
john@streamnative.io
linkedin.com/in/johnkinson
+44 207 072 1095
Tim Spann
Developer Advocate
@PaaSDev
linkedin.com/in/timothyspann
github.com/tspannhw
Thank you
streamnative.io
Industry trends
Notable industries and sectors using data streaming:
Banking - transforming from siloed systems to combined data streams
○ Typical applications of event streaming include banking sector processing of
financial transactions, with multiple customer touchpoints, notifications, and
support for mobile devices
○ Banking data (transactions and metadata) can be streamed in parallel for
fraud detection using ML and AI in near real-time
Insurance - building a single view from multiple data sources to provide faster claim
processing, fraud detection and system integration
IoT - handling huge volumes of data from sensors
Adopted Pulsar to replace
Kafka in their DSP (Data
Streaming Platform).
● 1.5-2x lower in capex
cost
● 5-50x improvement in
latency
● 2-3x lower in opex due to layered architecture
● Process 10
petabytes/day
Adopted Pulsar to power
their billing platform,
Midas, which processes
hundreds of billions of
financial transactions daily.
Adoption then expanded to
Tencent’s Federated
Learning Platform and
Tencent Gaming.
Applied Materials is one of
the biggest semiconductor
hardware and software
suppliers in the industry.
They adopted Pulsar to
enable them to build a
message bus to tie all of
their data together. They
previously used Tibco.
Pulsar Adoption Use Cases
Agenda
Welcome
Introduction to Messaging + Data Streaming
● What is messaging and data streaming?
● When is it used?
● What are the industry trends?
Introduction to Apache Pulsar
● What it is
● What it enables
● Who uses it today?
● Using Apache Pulsar in FinTech applications
Why Open Source
● Why open source Apache Pulsar?
● What have been the benefits and challenges?
Resources
Q&A
Pulsar Adoption Spreads
Tencent serves billions of users and over a million merchants.
Use Case #1: Payments
Early 2019, Tencent
adopts Pulsar to power
their billing platform,
Midas, processing
hundreds of billions of
financial transactions
daily.
Use Case #2: ML/AI
Pulsar adoption
spreads to Tencent’s
Federated Learning
Platform where it
supports trillions of
concurrent federated
learnings every day.
Use Case #3: Gaming
Tencent’s Gaming
Department replaces
Kafka with Pulsar for
its logging pipeline.
Founded By The
Creators Of Apache Pulsar
Sijie Guo
ASF Member
Pulsar/BookKeeper PMC
Founder and CEO
Jia Zhai
Pulsar/BookKeeper PMC
Co-Founder
Matteo Merli
ASF Member
Pulsar/BookKeeper PMC
CTO
Data veterans with extensive industry experience
Messages - the basic unit of Pulsar
Component: Description
Value / data payload: The data carried by the message. All Pulsar messages contain raw bytes, although message data can also conform to data schemas.
Key: Messages are optionally tagged with keys, which are used in partitioning and are also useful for features such as topic compaction.
Properties: An optional key/value map of user-defined properties.
Producer name: The name of the producer that produced the message. If you do not specify a producer name, a default name is used. Used for message de-duplication.
Sequence ID: Each Pulsar message belongs to an ordered sequence on its topic. The sequence ID of the message is its position in that sequence. Used for message de-duplication.
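As a rough illustration of how the producer name and sequence ID can support de-duplication, here is a minimal Python sketch. It is a toy model, not the Pulsar client API or the broker's actual implementation: the `Message` and `Deduplicator` classes, the producer name `payments-producer`, and the "drop anything not newer than the last sequence ID seen" rule are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Message:
    """Toy model of the message components listed above (hypothetical class)."""
    value: bytes                      # raw payload bytes
    producer_name: str                # name of the producing client
    sequence_id: int                  # position in the producer's ordered sequence
    key: Optional[str] = None         # optional key (partitioning, compaction)
    properties: Dict[str, str] = field(default_factory=dict)


class Deduplicator:
    """Assumed rule: reject a message whose sequence ID is not newer than
    the last one accepted from the same producer."""

    def __init__(self) -> None:
        self._last_seen: Dict[str, int] = {}

    def accept(self, msg: Message) -> bool:
        last = self._last_seen.get(msg.producer_name, -1)
        if msg.sequence_id <= last:
            return False              # duplicate or replayed message
        self._last_seen[msg.producer_name] = msg.sequence_id
        return True


dedup = Deduplicator()
first = Message(b"pay:100", "payments-producer", 0, key="acct-1")
retry = Message(b"pay:100", "payments-producer", 0, key="acct-1")  # resend after a timeout
accepted = [dedup.accept(first), dedup.accept(retry)]
```

In this sketch the retried payment is rejected because it reuses sequence ID 0 from the same producer, which is the property a payment pipeline wants when a producer resends after a network error.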
Producer-Consumer
Producer: The publisher sends data and does not know about the subscribers or their status. All interactions go through Pulsar, which handles all communication.
Consumer: The subscriber receives data from the publisher and never interacts with it directly.
Topic
Pulsar’s Publish-Subscribe model
[Diagram labels: Broker, Topic, Subscription, Producer 1, Producer 2, Consumer 1, Consumer 2, Consumer 3]
● Producers send messages.
● Topics are ordered, named channels that producers use to transmit messages to subscribed consumers.
● Messages belong to a topic and contain an arbitrary payload.
● Brokers handle connections and route messages between producers and consumers.
● Subscriptions are named configuration rules that determine how messages are delivered to consumers.
● Consumers receive messages.
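The roles above can be sketched with a small in-memory model. This is only an illustration of the decoupling idea, assuming a hypothetical `ToyBroker` class and made-up topic and subscription names; it does not use the Pulsar client library and ignores persistence, acknowledgements and networking.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Handler = Callable[[bytes], None]


class ToyBroker:
    """Hypothetical in-memory broker: producers and consumers only ever
    talk to the broker, never to each other."""

    def __init__(self) -> None:
        # topic -> subscription name -> consumer handlers
        self._subscriptions: Dict[str, Dict[str, List[Handler]]] = defaultdict(dict)

    def subscribe(self, topic: str, subscription: str, handler: Handler) -> None:
        self._subscriptions[topic].setdefault(subscription, []).append(handler)

    def publish(self, topic: str, payload: bytes) -> None:
        # Every subscription receives every message. Within a subscription,
        # this toy delivers to the first consumer only (exclusive-like).
        for handlers in self._subscriptions[topic].values():
            handlers[0](payload)


broker = ToyBroker()
fraud_inbox: List[bytes] = []
analytics_inbox: List[bytes] = []
broker.subscribe("payments", "fraud-detection", fraud_inbox.append)
broker.subscribe("payments", "analytics", analytics_inbox.append)

for payload in (b"txn-1", b"txn-2"):
    broker.publish("payments", payload)   # the producer knows only the topic name
```

Both subscriptions end up with their own full copy of the stream, which is the behaviour that lets fraud detection and analytics consume the same payment events independently.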
Pulsar Subscription Modes
Different subscription modes
have different semantics:
Exclusive/Failover - guaranteed
order, single active consumer
Shared - multiple active
consumers, no order
Key_Shared - multiple active
consumers, order for given key
[Diagram: Producer 1 and Producer 2 publish keyed messages <K1,V10>, <K1,V11>, <K1,V12>, <K2,V20>, <K2,V21>, <K2,V22> to a Pulsar topic.
Subscription A (Exclusive): Consumer A.
Subscription B (Failover): Consumer B-1, with Consumer B-2 taking over in case of failure in Consumer B-1.
Subscription C (Shared): Consumer C-1 and Consumer C-2; messages for both keys are spread across the consumers.
Subscription D (Key_Shared): Consumer D-1 and Consumer D-2; messages are grouped by key.]
Messaging
Ordering Guarantees
Topic Ordering Guarantees:
● Messages sent to a single topic or
partition DO have an ordering
guarantee.
● Messages sent to different partitions
DO NOT have an ordering guarantee.
Subscription Mode Guarantees:
● A single consumer can receive
messages from the same partition in
order using an exclusive or failover
subscription mode.
● Multiple consumers can receive
messages from the same key in order
using the key_shared subscription
mode.
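To make the difference between the shared and key_shared guarantees concrete, here is a small simulation in plain Python. It is a simplified model under stated assumptions: round-robin dispatch for shared and a hash-modulo rule for key_shared are stand-ins chosen for the example, and the function and consumer names are hypothetical. It is not how the Pulsar broker is implemented.

```python
import zlib
from collections import defaultdict
from typing import Dict, List, Tuple

Msg = Tuple[str, str]  # (key, value)

# Messages in the order they arrive on a single partition.
messages: List[Msg] = [
    ("K1", "V10"), ("K1", "V11"), ("K2", "V20"),
    ("K1", "V12"), ("K2", "V21"), ("K2", "V22"),
]


def dispatch_shared(msgs: List[Msg], consumers: List[str]) -> Dict[str, List[Msg]]:
    """Assumed round-robin: any consumer may get any message, so a key's
    messages can end up split across consumers."""
    out: Dict[str, List[Msg]] = defaultdict(list)
    for i, msg in enumerate(msgs):
        out[consumers[i % len(consumers)]].append(msg)
    return dict(out)


def dispatch_key_shared(msgs: List[Msg], consumers: List[str]) -> Dict[str, List[Msg]]:
    """Assumed hash-modulo routing: every message with the same key goes
    to the same consumer, preserving per-key order."""
    out: Dict[str, List[Msg]] = defaultdict(list)
    for key, value in msgs:
        index = zlib.crc32(key.encode("utf-8")) % len(consumers)
        out[consumers[index]].append((key, value))
    return dict(out)


shared = dispatch_shared(messages, ["C-1", "C-2"])
key_shared = dispatch_key_shared(messages, ["D-1", "D-2"])
```

In the shared result, K1's messages land on both consumers, so neither sees the full K1 sequence. In the key_shared result, each key has exactly one owning consumer, which is what a per-account or per-card processing step usually needs.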
Unified Messaging Model
[Diagram: Producer 1 and Producer 2 publish messages m0 to m4 to a Pulsar topic/partition. Labels: Streaming, Messaging.
Subscription A (Exclusive): Consumer A-0 and Consumer A-1, one of them marked with an X.
Subscription B (Failover): Consumer B-0, with Consumer B-1 taking over in case of failure in Consumer B-0.
Subscription C (Shared): Consumer C-1, Consumer C-2 and Consumer C-3 share messages m0 to m4.
Subscription D (Key_Shared): Consumer D-1, Consumer D-2 and Consumer D-3 receive keyed messages <k2,v1>, <k2,v3>, <k3,v2>, <k1,v0>, <k1,v4>.]
Connectivity
• Libraries - (Java, Python, Go, NodeJS, WebSockets, C++, C#, Scala, Rust,...)
• Functions - Lightweight Stream Processing (Java, Python, Go)
• Connectors - Sources & Sinks (Cassandra, Kafka, …)
• Protocol Handlers - AoP (AMQP), KoP (Kafka), MoP (MQTT)
• Processing Engines - Flink, Spark, Presto/Trino via Pulsar SQL
• Data Offloaders - Tiered Storage - (S3)
hub.streamnative.io
Use Cases
Multi-Tenant Data
Infrastructure
AdTech
Fraud Detection
FinTech
IoT Analytics
Microservices Development
Schema Registry
[Diagram: the Schema Registry holds schema-1, schema-2 and schema-3 (value = Avro/Protobuf/JSON). Producers and consumers each keep a local cache for schemas, mapping schema data to a schema ID.
Producers: send (register) the schema if it is not in the local cache, then send schema-1 data serialized per schema ID.
Consumers: get the schema by ID if it is not in the local cache, then read schema-1 data deserialized per schema ID.]
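The register-once, look-up-once flow in the schema registry diagram can be sketched as follows. This is a toy model only: `ToyRegistry`, `ToyProducer` and `ToyConsumer` are hypothetical classes, JSON stands in for Avro/Protobuf, and the integer schema IDs are an assumption of the example rather than Pulsar's actual wire format.

```python
import json
from typing import Dict, Tuple

Envelope = Tuple[int, bytes]  # (schema ID, serialized data)


class ToyRegistry:
    """Hypothetical registry that hands out one ID per distinct schema."""

    def __init__(self) -> None:
        self._ids: Dict[str, int] = {}
        self._schemas: Dict[int, str] = {}
        self.lookups = 0              # counts consumer-side fetches

    def register(self, schema: str) -> int:
        if schema not in self._ids:
            schema_id = len(self._ids) + 1
            self._ids[schema] = schema_id
            self._schemas[schema_id] = schema
        return self._ids[schema]

    def get(self, schema_id: int) -> str:
        self.lookups += 1
        return self._schemas[schema_id]


class ToyProducer:
    def __init__(self, registry: ToyRegistry) -> None:
        self._registry = registry
        self._cache: Dict[str, int] = {}     # local cache: schema -> ID

    def send(self, schema: str, record: dict) -> Envelope:
        if schema not in self._cache:        # register only on a cache miss
            self._cache[schema] = self._registry.register(schema)
        return self._cache[schema], json.dumps(record).encode("utf-8")


class ToyConsumer:
    def __init__(self, registry: ToyRegistry) -> None:
        self._registry = registry
        self._cache: Dict[int, str] = {}     # local cache: ID -> schema

    def read(self, envelope: Envelope) -> Tuple[str, dict]:
        schema_id, data = envelope
        if schema_id not in self._cache:     # fetch only on a cache miss
            self._cache[schema_id] = self._registry.get(schema_id)
        return self._cache[schema_id], json.loads(data)


registry = ToyRegistry()
producer = ToyProducer(registry)
consumer = ToyConsumer(registry)
payment_schema = '{"type": "record", "name": "Payment"}'

envelopes = [producer.send(payment_schema, {"amount": n}) for n in (10, 20)]
decoded = [consumer.read(e) for e in envelopes]
```

After two messages the consumer has contacted the registry only once, which is the point of the local cache shown on both sides of the diagram.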
Pulsar Functions
● Lightweight computation
similar to AWS Lambda.
● Specifically designed to use
Apache Pulsar as a message
bus.
● Function runtime can be
located within Pulsar Broker.
A serverless event streaming
framework
● Consume messages from one
or more Pulsar topics.
● Apply user-supplied
processing logic to each
message.
● Publish the results of the
computation to another topic.
● Support multiple
programming languages (Java,
Python, Go)
● Can leverage 3rd-party
libraries to support the
execution of ML models on
the edge.
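The consume, apply, publish cycle described above can be sketched in plain Python. This is a stand-alone illustration, not code for the Pulsar Functions runtime: the `run_function` helper, the list-based "topics", the `flag_large_payment` logic and the 10,000 threshold are all assumptions made up for the example.

```python
import json
from typing import Callable, List


def flag_large_payment(raw: str) -> str:
    """User-supplied logic applied to each message (hypothetical rule:
    flag any payment above an assumed threshold of 10,000)."""
    event = json.loads(raw)
    event["flagged"] = event["amount"] > 10_000
    return json.dumps(event, sort_keys=True)


def run_function(fn: Callable[[str], str],
                 input_topic: List[str],
                 output_topic: List[str]) -> None:
    """Toy runtime: consume each input message, apply fn, publish the result."""
    for message in input_topic:
        output_topic.append(fn(message))


payments_in = [
    '{"account": "acct-1", "amount": 250}',
    '{"account": "acct-2", "amount": 25000}',
]
payments_out: List[str] = []
run_function(flag_large_payment, payments_in, payments_out)
flags = [json.loads(m)["flagged"] for m in payments_out]
```

The per-message function holds no state and knows nothing about topics, which is what keeps this style of processing lightweight compared with a full stream-processing job.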
Pulsar Functions
Moving Data In and Out of Pulsar
IO/Connectors are a simple way to integrate with external systems and move
data in and out of Pulsar. https://pulsar.apache.org/docs/en/io-jdbc-sink/
● Built on top of Pulsar Functions
● Built-in connectors - hub.streamnative.io
Source Sink
Kafka-on-Pulsar (KoP)
Pulsar SQL
Presto/Trino workers can read
segments directly from
bookies (or offloaded storage)
in parallel.
[Diagram: a producer and a consumer connect to Broker 1 (Topic1-Part1), Broker 2 (Topic1-Part2) and Broker 3 (Topic1-Part3). Segments 1 through X are each stored on Bookie 1, Bookie 2 and Bookie 3. A query goes to the query coordinator, which uses topic metadata and hands work to several SQL workers.]
Streaming FLiPS Apps
[Diagram: StreamNative Hub and StreamNative Cloud. Unified batch and stream computing (batch + stream) sits on top of unified batch and stream storage (queuing + streaming), with offload to tiered storage. Protocols: Pulsar, KoP, MoP, WebSocket. Other labels: Pulsar Sink, Streaming Edge Gateway, CDC, Apps, "<-> Events <->".]
Review: Key Pulsar Terminology
● Producer is a process that publishes messages to a topic.
● Consumer is a process that establishes a subscription to a topic
and processes messages published to that topic.
● Subscription: A subscription is a named configuration rule that
determines how messages are delivered to consumers. Four
subscription modes are available in Pulsar: exclusive, shared,
failover, and key-shared.
● Brokers handle connections and route messages.
● Topics are named channels for transmitting messages from
producers to consumers. Partitioned Topics are “virtual” topics
composed of multiple topics.
● Messages belong to a topic and contain an arbitrary payload.
● Instance is a group of clusters that
act together as a single unit.
● Cluster is a set of Pulsar brokers,
ZooKeeper quorum, and an
ensemble of BookKeeper bookies.
● Tenants are the administrative unit
for allocating capacity and enforcing
an authentication/ authorization
scheme.
● Namespaces are a grouping
mechanism for related topics.
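Tenants, namespaces and topics come together in the fully qualified topic name, which in Pulsar takes the form `persistent://tenant/namespace/topic` (or `non-persistent://...`). The helper below is a minimal sketch of splitting such a name into its parts; `parse_topic`, `TopicName` and the example tenant, namespace and topic names are hypothetical and not part of any Pulsar library.

```python
from typing import NamedTuple


class TopicName(NamedTuple):
    domain: str      # "persistent" or "non-persistent"
    tenant: str      # administrative unit
    namespace: str   # grouping of related topics
    topic: str       # the named channel itself


def parse_topic(name: str) -> TopicName:
    """Split a fully qualified topic name into its four parts."""
    domain, separator, rest = name.partition("://")
    if not separator or domain not in ("persistent", "non-persistent"):
        raise ValueError(f"unsupported topic domain in {name!r}")
    parts = rest.split("/", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected tenant/namespace/topic in {name!r}")
    return TopicName(domain, *parts)


parsed = parse_topic("persistent://bank/payments/card-transactions")
```

Here `parsed.tenant` is `bank` and `parsed.namespace` is `payments`, so policies set at the tenant or namespace level would apply to every topic that shares that prefix.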
The Need For Real-Time Data
Hybrid and multi-cloud
strategies with native
geo-replication
Seamlessly build
microservice architectures
with support for streaming
and messaging workloads
Built for Kubernetes
CloudNative
migrations with tools
360 degree customer data
multi-tenancy, infinite
retention, and extensive
connector ecosystem
Background
● Provides a data platform
for the cloud
● Customers include 92 of
the Fortune 100
● Core use cases include
real-time monitoring,
interactive applications,
log processing & analytics,
IoT analytics, streaming
data transformation,
real-time analytics &
event-driven workflows
Why Pulsar
● Scalability
● Durability
● Fault Tolerance
● High Availability
● Sharing & Isolation
● Messaging Models
● Persistence
● Client Languages
● Deployment in k8s
● Operability
● Disaster Recovery
● TCO
● Community & Adoption
Benefits
● 1.5-2x lower in capex
cost
● 5-50x improvement in
latency
● 2-3x lower in opex due to
layered architecture
● Processes billions of
messages/day in
production
Background
● The third-largest payment
provider in China behind
Alipay and WeChat
Payment
● 500 million registered users
and 41.9 million active users
● Need to improve the
efficiency of fraud detection
for mobile payments
● Current lambda architecture
of Kafka + Hive is complex
and difficult to maintain
Benefits
● Reduce complexity by 33%
(clusters reduced from six to
four)
● Improve production
efficiency by 11 times
● Higher stability due to the
unified architecture
Why Pulsar
● Cloud-native architecture
and segment-centric
storage
● Pulsar is able to do both
streaming and batch
processing
● Able to build a unified
data processing stack
with Pulsar and Spark,
streamlining messy
operations problems
StreamNative Customer Spotlight:
Background
● Flipkart is the largest
e-commerce company
in India with $6B+ in
annual revenue
● Company-wide
messaging platform,
supporting different
types of streaming use
cases, including:
payment processing,
order tracking,
warehouse, logistics, etc.
Why StreamNative
● Work with the original
developers of Pulsar and
top Pulsar engineers
● Experience operating
large scale,
geo-replicated
messaging systems
● 24 x 7 support to
support mission-critical
business applications
Benefits
● Able to handle spikes in
traffic without manual
rebalancing or system failure
● Reduced operational
complexity and total cost of
ownership
● Support the move to cloud
StreamNative Customer Spotlight:
Background
● Narvar provides
e-commerce supply chain
management software,
powering 300 retailers and
650 brands
● Core use case:
asynchronous processing
to distribute tasks between
the various systems,
including individual
retailers’ ordering and
warehouse management
applications
Why StreamNative
● Work with the original
developers of Pulsar and
top Pulsar engineers
● “Before we began working
with StreamNative, Sijie
Guo and his team helped us
work out some production
issues. We were very
impressed by how quickly
they solved our problems
and their willingness to
help.” - Ankush Goyal
Benefits
● Accelerate application
development
● Able to handle spikes in
traffic without manual
rebalancing or system failure
● Reduced customer issues
streamnative.io
Passionate and dedicated team.
Founded by the original developers of
Apache Pulsar.
StreamNative helps teams to capture,
manage, and leverage data using Pulsar’s
unified messaging and streaming
platform.
Building An App
Code Along With Tim
<<DEMO>>
Geo-Replication
Pulsar has built-in cross
data center replication
that is used in production
already.

28March2024-Codeless-Generative-AI-Pipelines
 
TCFPro24 Building Real-Time Generative AI Pipelines
TCFPro24 Building Real-Time Generative AI PipelinesTCFPro24 Building Real-Time Generative AI Pipelines
TCFPro24 Building Real-Time Generative AI Pipelines
 
2024 Build Generative AI for Non-Profits
2024 Build Generative AI for Non-Profits2024 Build Generative AI for Non-Profits
2024 Build Generative AI for Non-Profits
 
2024 February 28 - NYC - Meetup Unlocking Financial Data with Real-Time Pipel...
2024 February 28 - NYC - Meetup Unlocking Financial Data with Real-Time Pipel...2024 February 28 - NYC - Meetup Unlocking Financial Data with Real-Time Pipel...
2024 February 28 - NYC - Meetup Unlocking Financial Data with Real-Time Pipel...
 
Conf42-Python-Building Apache NiFi 2.0 Python Processors
Conf42-Python-Building Apache NiFi 2.0 Python ProcessorsConf42-Python-Building Apache NiFi 2.0 Python Processors
Conf42-Python-Building Apache NiFi 2.0 Python Processors
 
Conf42Python -Using Apache NiFi, Apache Kafka, RisingWave, and Apache Iceberg...
Conf42Python -Using Apache NiFi, Apache Kafka, RisingWave, and Apache Iceberg...Conf42Python -Using Apache NiFi, Apache Kafka, RisingWave, and Apache Iceberg...
Conf42Python -Using Apache NiFi, Apache Kafka, RisingWave, and Apache Iceberg...
 
2024 Feb AI Meetup NYC GenAI_LLMs_ML_Data Codeless Generative AI Pipelines
2024 Feb AI Meetup NYC GenAI_LLMs_ML_Data Codeless Generative AI Pipelines2024 Feb AI Meetup NYC GenAI_LLMs_ML_Data Codeless Generative AI Pipelines
2024 Feb AI Meetup NYC GenAI_LLMs_ML_Data Codeless Generative AI Pipelines
 
DBA Fundamentals Group: Continuous SQL with Kafka and Flink
DBA Fundamentals Group: Continuous SQL with Kafka and FlinkDBA Fundamentals Group: Continuous SQL with Kafka and Flink
DBA Fundamentals Group: Continuous SQL with Kafka and Flink
 
NY Open Source Data Meetup Feb 8 2024 Building Real-time Pipelines with FLaNK...
NY Open Source Data Meetup Feb 8 2024 Building Real-time Pipelines with FLaNK...NY Open Source Data Meetup Feb 8 2024 Building Real-time Pipelines with FLaNK...
NY Open Source Data Meetup Feb 8 2024 Building Real-time Pipelines with FLaNK...
 
OSACon 2023_ Unlocking Financial Data with Real-Time Pipelines
OSACon 2023_ Unlocking Financial Data with Real-Time PipelinesOSACon 2023_ Unlocking Financial Data with Real-Time Pipelines
OSACon 2023_ Unlocking Financial Data with Real-Time Pipelines
 
Building Real-Time Travel Alerts
Building Real-Time Travel AlertsBuilding Real-Time Travel Alerts
Building Real-Time Travel Alerts
 
JConWorld_ Continuous SQL with Kafka and Flink
JConWorld_ Continuous SQL with Kafka and FlinkJConWorld_ Continuous SQL with Kafka and Flink
JConWorld_ Continuous SQL with Kafka and Flink
 
[EN]DSS23_tspann_Integrating LLM with Streaming Data Pipelines
[EN]DSS23_tspann_Integrating LLM with Streaming Data Pipelines[EN]DSS23_tspann_Integrating LLM with Streaming Data Pipelines
[EN]DSS23_tspann_Integrating LLM with Streaming Data Pipelines
 
Evolve 2023 NYC - Integrating AI Into Realtime Data Pipelines Demo
Evolve 2023 NYC - Integrating AI Into Realtime Data Pipelines DemoEvolve 2023 NYC - Integrating AI Into Realtime Data Pipelines Demo
Evolve 2023 NYC - Integrating AI Into Realtime Data Pipelines Demo
 

Dernier

AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM TechniquesAI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
VictorSzoltysek
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service provider
mohitmore19
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
Health
 

Dernier (20)

How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Models
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 
Microsoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdfMicrosoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdf
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
Direct Style Effect Systems -The Print[A] Example- A Comprehension AidDirect Style Effect Systems -The Print[A] Example- A Comprehension Aid
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
 
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM TechniquesAI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service provider
 
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS LiveVip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docx
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with Precision
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
 
Define the academic and professional writing..pdf
Define the academic and professional writing..pdfDefine the academic and professional writing..pdf
Define the academic and professional writing..pdf
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial Goals
 
8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students
 
Exploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdfExploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdf
 

Open Source Bristol 30 March 2022

  • 2. streamnative.io Tim Spann Developer Advocate StreamNative ● FLiP(N) Stack = Flink, Pulsar and NiFi Stack ● Streaming Systems & Data Architecture Expert ● Experience: ○ 15+ years of experience with streaming technologies including Apache Pulsar, Apache Flink, Apache Spark, Apache NiFi, Big Data, Cloud, Trino, Aerospike, IoT and more. John Kinson Head of Sales, EMEA StreamNative ● Startup, Scale-up and Large Enterprise expert ● Building the StreamNative Sales function in EMEA ● Experience: ○ 25+ years of building and selling distributed and embedded systems in the telecoms, digital media and cloud enterprise software industries
  • 3. Agenda 01 Welcome 02 Introduction to Messaging + Data Streaming 03 Introduction to Apache Pulsar 04 Why Open Source 05 Resources 06 Q&A
  • 4. MESSAGING ➔ Asynchronous messages triggered by events ➔ Consuming messages regardless of language, system, or sender ➔ Queueing ➔ Routing ➔ Work queues ➔ JPMorgan Chase AMQP
  • 5. DATA STREAMING ➔ Perform in real time ➔ Process events as they happen ➔ Join streams with SQL ➔ Find anomalies immediately ➔ Ordering and arrival semantics ➔ Continuous streams of data
  • 6. When are Messaging and Streaming used? ● To access historical as well as real-time data ● When a pub/sub model is needed so that event streams can be sent from multiple producers and consumed by multiple consumers ● To process large amounts of data in a highly scalable way
  • 7. Industry trends ● Banking: transforming from siloed systems to combined data streams ● Insurance: providing faster claim processing, fraud detection and system integration ● IoT: handling huge volumes of data from sensors
  • 8. Apache Pulsar is a Cloud-Native Messaging and Event-Streaming Platform.
  • 10. Pulsar: Unified Messaging + Data Streaming ● Messaging: ideal for work queues that do not require tasks to be performed in a particular order, for example sending one email message to many recipients. RabbitMQ and Amazon SQS are examples of popular queue-based messaging systems. ● Streaming: works best in situations where the order of messages is important, for example data ingestion. Kafka and Amazon Kinesis are examples of messaging systems that use streaming semantics for consuming messages.
  • 11. [Diagram: Unified Messaging and Streaming] Labels: Unified Batch and Stream Computing; Unified Batch and Stream Storage; Batch + Stream; Queuing + Streaming; Offload; Tiered Storage; protocols Pulsar, KoP, MoP, WebSocket; Pulsar Sink; Streaming Edge Gateway; CDC; Apps; StreamNative Hub; StreamNative Cloud
  • 14. Using Pulsar with FinTech ● Low latency ● Geo-replication ● Data integrity ● High availability ● Durability ● Multi-tenancy ● Multiple data consumers: transactions, payment processing, alerts, analytics, KYC, fraud detection with ML & AI ● Large data volumes, high scalability ● Financial event messaging ● Many topics, producers, consumers
  • 15. Why Open Source Pulsar? ● Sijie Guo: ASF Member, Pulsar/BookKeeper PMC, Founder and CEO ● Jia Zhai: Pulsar/BookKeeper PMC, Co-Founder ● Matteo Merli: ASF Member, Pulsar/BookKeeper PMC, CTO
  • 16. Why Open Source Pulsar? Our bets and early decisions: ● We would get many benefits from an open source model ○ Other companies would help develop the product ○ Better security, code escrow, longevity ● We would keep the core features in the OSS version ● We could build commercial offerings and services around the core product
  • 17. C/OSS Model ● Benefits: many developers; security, longevity, escrow ● Challenges: why pay?; multiple roadmaps
  • 18. RESOURCES Here are resources to continue your journey with Apache Pulsar
  • 20. Free ebook: Apache Pulsar in Action
  • 21. Q&A ● Tim Spann, Developer Advocate: @PaaSDev, linkedin.com/in/timothyspann, github.com/tspannhw ● John Kinson, Head of Sales EMEA: john@streamnative.io, linkedin.com/in/johnkinson, +44 207 072 1095
  • 23. Industry trends Notable industries and sectors using data streaming: ● Banking: transforming from siloed systems to combined data streams ○ Typical applications of event streaming include processing financial transactions across multiple customer touchpoints, notifications, and support for mobile devices ○ Banking data (transactions and metadata) can be streamed in parallel for fraud detection using ML and AI in near real time ● Insurance: building a single view from multiple data sources to provide faster claim processing, fraud detection and system integration ● IoT: handling huge volumes of data from sensors
  • 24. Pulsar Adoption Use Cases ● Adopted Pulsar to replace Kafka in their DSP (Data Streaming Platform): 1.5-2x lower capex cost; 5-50x improvement in latency; 2-3x lower opex due to layered architecture; processes 10 petabytes/day. ● Tencent adopted Pulsar to power its billing platform, Midas, which processes hundreds of billions of financial transactions daily. Adoption then expanded to Tencent’s Federated Learning Platform and Tencent Gaming. ● Applied Materials is one of the biggest semiconductor hardware and software suppliers in the industry. They adopted Pulsar to build a message bus that ties all of their data together. They previously used Tibco.
  • 25. Agenda Welcome Introduction to Messaging + Data Streaming ● What is messaging and data streaming? ● When is it used? ● What are the industry trends? Introduction to Apache Pulsar ● What it is ● What it enables ● Who uses it today? ● Using Apache Pulsar in FinTech applications Why Open Source ● Why open source Apache Pulsar? ● What have been the benefits and challenges? Resources Q&A
  • 27. Pulsar Adoption Spreads Tencent serves billions of users and over a million merchants. Use Case #1: Payments Early 2019, Tencent adopts Pulsar to power their billing platform, Midas, processing hundreds of billions of financial transactions daily. Use Case #2: ML/AI Pulsar adoption spreads to Tencent’s Federated Learning Platform where it supports trillions of concurrent federated learnings every day. Use Case #3: Gaming Tencent’s Gaming Department replaces Kafka with Pulsar for its logging pipeline.
  • 28. Founded By The Creators Of Apache Pulsar Sijie Guo ASF Member Pulsar/BookKeeper PMC Founder and CEO Jia Zhai Pulsar/BookKeeper PMC Co-Founder Matteo Merli ASF Member Pulsar/BookKeeper PMC CTO Data veterans with extensive industry experience
  • 29. Messages - the basic unit of Pulsar (component: description) ● Value / data payload: The data carried by the message. All Pulsar messages contain raw bytes, although message data can also conform to data schemas. ● Key: Messages are optionally tagged with keys, which are used in partitioning and are also useful for things like topic compaction. ● Properties: An optional key/value map of user-defined properties. ● Producer name: The name of the producer that produces the message. If you do not specify a producer name, the default name is used. Used for message de-duplication. ● Sequence ID: Each Pulsar message belongs to an ordered sequence on its topic. The sequence ID of the message is its order in that sequence. Used for message de-duplication.
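
The message components listed above can be pictured with a small Python model. This is only a sketch for illustration: `Message` and `DedupTopic` are hypothetical names, not classes from the Pulsar client, and the de-duplication rule shown (drop anything whose sequence ID is not higher than the last one accepted from the same producer) is a simplified assumption about how producer name and sequence ID work together.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass(frozen=True)
class Message:
    """Toy model of the message fields on the slide (not the Pulsar client class)."""
    value: bytes                      # raw payload bytes
    producer_name: str                # identifies the producer
    sequence_id: int                  # position in the producer's sequence
    key: Optional[str] = None         # optional routing / compaction key
    properties: Dict[str, str] = field(default_factory=dict)  # user-defined metadata


class DedupTopic:
    """Hypothetical topic that ignores messages it has already accepted."""

    def __init__(self) -> None:
        self.log = []          # accepted messages, in arrival order
        self._last_seq = {}    # producer name -> highest sequence ID accepted

    def publish(self, msg: Message) -> bool:
        last = self._last_seq.get(msg.producer_name, -1)
        if msg.sequence_id <= last:
            return False       # duplicate (for example a retry): drop it
        self._last_seq[msg.producer_name] = msg.sequence_id
        self.log.append(msg)
        return True
```

Publishing the same `Message` twice returns `True` and then `False`, so a producer that retries after a timeout would not create a second copy in this toy log.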
  • 30. Producer-Consumer ● Producer: the publisher sends data to a topic and doesn't know about the subscribers or their status. ● Consumer: the subscriber receives data from the publisher and never directly interacts with it. ● All interactions go through Pulsar, which handles all communication.
  • 31. Pulsar’s Publish-Subscribe model (diagram labels: Producers 1-2, Topic, Broker, Subscription, Consumers 1-3) ● Producers send messages. ● Topics are ordered, named channels that producers use to transmit messages to subscribed consumers. ● Messages belong to a topic and contain an arbitrary payload. ● Brokers handle connections and route messages between producers and consumers. ● Subscriptions are named configuration rules that determine how messages are delivered to consumers. ● Consumers receive messages.
  • 32. Pulsar Subscription Modes Different subscription modes have different semantics: ● Exclusive/Failover: guaranteed order, single active consumer ● Shared: multiple active consumers, no order ● Key_Shared: multiple active consumers, order for a given key. [Diagram: Producers 1 and 2 publish to a Pulsar topic with four subscriptions. Subscription A (Exclusive): Consumer A. Subscription B (Failover): Consumers B-1 and B-2, with failover in case of failure in Consumer B-1. Subscription C (Shared): Consumers C-1 and C-2; one receives <K1,V10> <K2,V21> <K1,V12>, the other <K2,V20> <K1,V11> <K2,V22>. Subscription D (Key_Shared): Consumers D-1 and D-2; one receives <K1,V10> <K1,V11> <K1,V12>, the other <K2,V20> <K2,V21> <K2,V22>.]
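
To make the difference between the modes concrete, here is a toy dispatcher in plain Python. It is a sketch under stated assumptions, not how a Pulsar broker is implemented: `dispatch` is a hypothetical helper, shared delivery is modeled as simple round-robin, and key-based routing uses a CRC32 hash modulo the consumer count purely for illustration.

```python
from collections import defaultdict
from itertools import cycle
from zlib import crc32


def dispatch(messages, consumers, mode):
    """Toy dispatcher: return {consumer: [messages]} for one subscription.

    messages  -- list of (key, value) tuples in topic order
    consumers -- list of consumer names; the first is the active one
    mode      -- "exclusive", "failover", "shared" or "key_shared"
    """
    out = defaultdict(list)
    if mode in ("exclusive", "failover"):
        # A single active consumer receives everything, in topic order.
        for msg in messages:
            out[consumers[0]].append(msg)
    elif mode == "shared":
        # Round-robin across consumers: order is not preserved per key in general.
        for consumer, msg in zip(cycle(consumers), messages):
            out[consumer].append(msg)
    elif mode == "key_shared":
        # The same key always maps to the same consumer, keeping per-key order.
        for key, value in messages:
            idx = crc32(key.encode()) % len(consumers)
            out[consumers[idx]].append((key, value))
    else:
        raise ValueError(f"unknown mode: {mode}")
    return dict(out)
```

Calling `dispatch(msgs, ["D-1", "D-2"], "key_shared")` on a list of `(key, value)` pairs keeps every value for a given key on one consumer, while `"shared"` simply spreads the load.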
  • 33. Message Ordering Guarantees Topic ordering guarantees: ● Messages sent to a single topic or partition DO have an ordering guarantee. ● Messages sent to different partitions DO NOT have an ordering guarantee. Subscription mode guarantees: ● A single consumer can receive messages from the same partition in order using an exclusive or failover subscription mode. ● Multiple consumers can receive messages from the same key in order using the key_shared subscription mode.
  • 35. Unified Messaging Model (Streaming and Messaging) [Diagram: Producers 1 and 2 publish messages m0-m4 to a Pulsar topic/partition with four subscriptions. Subscription A (Exclusive): Consumers A-0 and A-1, messages m0-m4. Subscription B (Failover): Consumers B-0 and B-1, messages m0-m4, with failover in case of failure in Consumer B-0. Subscription C (Shared): messages m0-m4 spread across Consumers C-1, C-2 and C-3. Subscription D (Key_Shared): <k1,v0> <k1,v4>, <k2,v1> <k2,v3> and <k3,v2> grouped by key across Consumers D-1, D-2 and D-3.]
  • 36. Connectivity • Libraries - (Java, Python, Go, NodeJS, WebSockets, C++, C#, Scala, Rust,...) • Functions - Lightweight Stream Processing (Java, Python, Go) • Connectors - Sources & Sinks (Cassandra, Kafka, …) • Protocol Handlers - AoP (AMQP), KoP (Kafka), MoP (MQTT) • Processing Engines - Flink, Spark, Presto/Trino via Pulsar SQL • Data Offloaders - Tiered Storage - (S3) hub.streamnative.io
  • 37. Use Cases Multi-Tenant Data Infrastructure AdTech Fraud Detection FinTech IoT Analytics Microservices Development
  • 38. Schema Registry ● The registry stores schema-1, schema-2 and schema-3 (value = Avro/Protobuf/JSON), each with its schema data and ID. ● Producers and consumers each keep a local cache of schemas. ● Producers: send (register) the schema if it is not in the local cache, then send data serialized per schema ID. ● Consumers: get the schema by ID if it is not in the local cache, then read data deserialized per schema ID.
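
The register-on-miss and fetch-on-miss flow on this slide can be sketched with a few small classes. Everything here is a hypothetical in-memory model (`SchemaRegistry`, `Producer`, `Consumer` and the `lookups` counter are illustration names, not the Pulsar schema API); it only shows why a local cache keeps clients from contacting the registry for every message.

```python
class SchemaRegistry:
    """Toy in-memory registry (hypothetical; not the Pulsar schema registry API)."""

    def __init__(self):
        self._ids = {}      # schema definition -> schema ID
        self._by_id = {}    # schema ID -> schema definition
        self.lookups = 0    # counts round trips made by clients

    def register(self, definition):
        """Return the ID for a schema, assigning a new one on first sight."""
        self.lookups += 1
        if definition not in self._ids:
            schema_id = len(self._ids) + 1
            self._ids[definition] = schema_id
            self._by_id[schema_id] = definition
        return self._ids[definition]

    def get(self, schema_id):
        self.lookups += 1
        return self._by_id[schema_id]


class Producer:
    def __init__(self, registry):
        self.registry = registry
        self._cache = {}  # local cache: definition -> schema ID

    def send(self, definition, payload):
        if definition not in self._cache:          # register only on a cache miss
            self._cache[definition] = self.registry.register(definition)
        return (self._cache[definition], payload)  # data tagged with schema ID


class Consumer:
    def __init__(self, registry):
        self.registry = registry
        self._cache = {}  # local cache: schema ID -> definition

    def read(self, message):
        schema_id, payload = message
        if schema_id not in self._cache:           # fetch only on a cache miss
            self._cache[schema_id] = self.registry.get(schema_id)
        return self._cache[schema_id], payload
```

Sending two messages with the same schema and reading both costs two registry round trips in this model: one registration by the producer and one fetch by the consumer.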
  • 39. Pulsar Functions: a serverless event streaming framework ● Lightweight computation similar to AWS Lambda. ● Specifically designed to use Apache Pulsar as a message bus. ● The function runtime can be located within the Pulsar broker.
  • 40. Pulsar Functions ● Consume messages from one or more Pulsar topics. ● Apply user-supplied processing logic to each message. ● Publish the results of the computation to another topic. ● Support multiple programming languages (Java, Python, Go). ● Can leverage 3rd-party libraries to support the execution of ML models on the edge.
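
The consume, process, publish pattern above can be illustrated with a plain Python function in the style Pulsar Functions accepts for simple cases (a module-level `process(input)` whose return value is published to the output topic). This is a minimal sketch, not code from the talk: the JSON payload, the `amount` and `flagged` field names, and the `LARGE_AMOUNT` threshold are all assumptions made for a FinTech-flavored example.

```python
import json

# Hypothetical threshold, chosen for illustration only.
LARGE_AMOUNT = 10_000.0


def process(input):
    """Flag large payments; the returned string is what would be published downstream."""
    payment = json.loads(input)                      # assumes a JSON-encoded payment
    payment["flagged"] = float(payment["amount"]) >= LARGE_AMOUNT
    return json.dumps(payment, sort_keys=True)
```

Because the logic is an ordinary function, it can be exercised locally, for example `process('{"account": "a1", "amount": 25000}')`, before being deployed against real input and output topics.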
  • 41. Moving Data In and Out of Pulsar Pulsar IO connectors (sources and sinks) are a simple way to integrate with external systems and move data in and out of Pulsar. https://pulsar.apache.org/docs/en/io-jdbc-sink/ ● Built on top of Pulsar Functions ● Built-in connectors: hub.streamnative.io
  • 43. Pulsar SQL Presto/Trino workers can read segments directly from bookies (or offloaded storage) in parallel. [Diagram labels: Producer, Consumer; Broker 1 (Topic1-Part1), Broker 2 (Topic1-Part2), Broker 3 (Topic1-Part3); Segments 1, 2, 3, 4 ... X shown across Bookies 1-3; Query Coordinator, Topic Metadata, Query, SQL Workers.]
  • 44. [Diagram: FLiPS Apps, Events and Streaming] Labels: Unified Batch and Stream Computing; Unified Batch and Stream Storage; Batch + Stream; Queuing + Streaming; Offload; Tiered Storage; protocols Pulsar, KoP, MoP, WebSocket; Pulsar Sink; Streaming Edge Gateway; Events; CDC; Apps; StreamNative Hub; StreamNative Cloud
  • 45. Review: Key Pulsar Terminology ● Producer is a process that publishes messages to a topic. ● Consumer is a process that establishes a subscription to a topic and processes messages published to that topic. ● Subscription: A subscription is a named configuration rule that determines how messages are delivered to consumers. Four subscription modes are available in Pulsar: exclusive, shared, failover, and key-shared. ● Brokers handle the connections and route messages. ● Topics are named channels for transmitting messages from producers to consumers. Partitioned Topics are “virtual” topics composed of multiple topics. ● Messages belong to a topic and contain an arbitrary payload. ● Instance is a group of clusters that act together as a single unit. ● Cluster is a set of Pulsar brokers, a ZooKeeper quorum, and an ensemble of BookKeeper bookies. ● Tenants are the administrative unit for allocating capacity and enforcing an authentication/authorization scheme. ● Namespaces are a grouping mechanism for related topics.
  • 46. The Need For Real-Time Data Hybrid and multi-cloud strategies with native geo-replication Seamlessly build microservice architectures with support for streaming and messaging workloads Built for Kubernetes CloudNative migrations with tools 360 degree customer data multi-tenancy, infinite retention, and extensive connector ecosystem
  • 48. Background ● Provides a data platform for the cloud ● Customers include 92 of the Fortune 100 ● Core use cases include real-time monitoring, interactive applications, log processing & analytics, IoT analytics, streaming data transformation, real-time analytics & event-driven workflows Why Pulsar ● Scalability ● Durability ● Fault Tolerance ● High Availability ● Sharing & Isolation ● Messaging Models ● Persistence ● Client Languages ● Deployment in k8s ● Operability ● Disaster Recovery ● TCO ● Community & Adoption Benefits ● 1.5-2x lower in capex cost ● 5-50x improvement in latency ● 2-3x lower in opex due to layered architecture ● Processes billions of messages/day in production
  • 49. Background ● The third-largest payment provider in China behind Alipay and WeChat Payment ● 500 million registered users and 41.9 million active users ● Need to improve the efficiency of fraud detection for mobile payments ● Current lambda architecture of Kafka + Hive is complex and difficult to maintain Benefits ● Reduce complexity by 33% (clusters reduced from six to four) ● Improve production efficiency by 11 times ● Higher stability due to the unified architecture Why Pulsar ● Cloud-native architecture and segment-centric storage ● Pulsar is able to do both streaming and batch processing ● Able to build a unified data processing stack with Pulsar and Spark, streamlining messy operations problems
  • 50. StreamNative Customer Spotlight: Flipkart Background ● Flipkart is the largest e-commerce company in India with $6B+ in annual revenue ● Company-wide messaging platform, supporting different types of streaming use cases, including: payment processing, order tracking, warehouse, logistics, etc. Why StreamNative ● Work with the original developers of Pulsar and top Pulsar engineers ● Experience operating large scale, geo-replicated messaging systems ● 24x7 support for mission-critical business applications Benefits ● Able to handle spikes in traffic without manual rebalancing or system failure ● Reduced operational complexity and total cost of ownership ● Support the move to cloud
  • 51. StreamNative Customer Spotlight: Narvar Background ● Narvar provides e-commerce supply chain management software, powering 300 retailers and 650 brands ● Core use case: asynchronous processing to distribute tasks between the various systems, including individual retailers’ ordering and warehouse management applications Why StreamNative ● Work with the original developers of Pulsar and top Pulsar engineers ● “Before we began working with StreamNative, Sijie Guo and his team helped us work out some production issues. We were very impressed by how quickly they solved our problems and their willingness to help.” - Ankush Goyal Benefits ● Accelerate application development ● Able to handle spikes in traffic without manual rebalancing or system failure ● Reduced customer issues
  • 52. streamnative.io Passionate and dedicated team. Founded by the original developers of Apache Pulsar. StreamNative helps teams to capture, manage, and leverage data using Pulsar’s unified messaging and streaming platform.
  • 53. Building An App Code Along With Tim <<DEMO>>
  • 54. Geo-Replication Pulsar has built-in cross data center replication that is used in production already.
  • 55. Why Open Source Pulsar? Sijie Guo ASF Member Pulsar/BookKeeper PMC Founder and CEO Jia Zhai Pulsar/BookKeeper PMC Co-Founder Matteo Merli ASF Member Pulsar/BookKeeper PMC CTO ● Other companies would help develop the product ● We could build commercial offerings, services around the core product ● We would get many benefits from an open source model