Apache Samza*
Reliable Stream Processing atop
Apache Kafka and Yarn
Sriram Subramanian
Me on LinkedIn
Me on Twitter: @sriramsub1
* Incubating
Agenda
• Why Stream Processing?
• What is Samza’s Design?
• How is Samza’s Design Implemented?
• How can you use Samza?
• Example usage at LinkedIn
Why Stream Processing?
Response latency
[Figure: a latency axis starting at 0 ms. "Synchronous" sits at 0 ms, "Later. Possibly much later." sits at the far end, and "Milliseconds to minutes" marks the range in between.]
Newsfeed
Ad Relevance
Search Index
Metrics and Monitoring
What is Samza’s Design?
[Figure: a single JOB connected to Stream A, Stream B, and Stream C.]
[Figure: three jobs (JOB 1, JOB 2, JOB 3) linked by Streams A through G.]
Streams
[Figure: a stream split into Partition 0, Partition 1, and Partition 2. Each partition is an ordered sequence of numbered messages (1-6, 1-5, and 1-7 in the figure), and the "next append" goes at the end of a partition.]
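The partition figure can be read as a set of append-only lists. The following is a minimal in-memory sketch of that idea, not Samza or Kafka code: the class name `PartitionedStream`, its methods, and the hash-based choice of partition are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** A toy in-memory stream: a fixed set of partitions, each an append-only list. */
class PartitionedStream {
    private final List<List<String>> partitions = new ArrayList<>();

    PartitionedStream(int partitionCount) {
        for (int i = 0; i < partitionCount; i++) {
            partitions.add(new ArrayList<>());
        }
    }

    /** Picks a partition from the key's hash (an assumed policy, not taken from the slides). */
    int partitionFor(String key) {
        return Math.floorMod(key.hashCode(), partitions.size());
    }

    /** Appends at the end of the key's partition and returns the message's 1-based position. */
    int append(String key, String message) {
        List<String> partition = partitions.get(partitionFor(key));
        partition.add(message);
        return partition.size();
    }

    /** Reads the message at a 1-based position within one partition. */
    String read(int partition, int position) {
        return partitions.get(partition).get(position - 1);
    }

    /** Position that the next append to this partition will receive. */
    int nextAppendPosition(int partition) {
        return partitions.get(partition).size() + 1;
    }
}
```

Messages with the same key always land in the same partition, so their relative order is preserved; no ordering is implied across partitions.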
Jobs
Stream A Stream B
Task 1 Task 2 Task 3
Stream C
Jobs
AdViews AdClicks
Task 1 Task 2 Task 3
AdClickThroughRate
Tasks
[Figure: a CounterTask consumes the AdViews stream, which has Partition 0 and Partition 1. The task reads messages 1-4 of "Ad Views - Partition 0" in order and writes to the Output Count Stream.]
Tasks
[Figure: the same setup with a Checkpoint Stream added alongside, shown with the labels "2" and "Partition 1".]
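The checkpoint figure suggests that a task records how far it has read so that it can resume after a restart. A minimal sketch of that behavior follows, assuming a plain map stands in for the checkpoint stream; `CheckpointingCounter` and the `interval` parameter are hypothetical names, not part of Samza.

```java
import java.util.List;
import java.util.Map;

/** Toy model of a task that checkpoints its read position per partition. */
class CheckpointingCounter {
    // Stands in for the checkpoint stream: partition -> last position checkpointed.
    private final Map<Integer, Integer> checkpoints;
    private int count = 0;

    CheckpointingCounter(Map<Integer, Integer> checkpoints) {
        this.checkpoints = checkpoints;
    }

    /** Processes messages after the last checkpoint, checkpointing every `interval` messages. */
    int run(int partition, List<String> messages, int interval) {
        int start = checkpoints.getOrDefault(partition, 0);
        for (int position = start + 1; position <= messages.size(); position++) {
            count++; // "process" the message at this 1-based position
            if (position % interval == 0) {
                checkpoints.put(partition, position);
            }
        }
        return count;
    }
}
```

In this sketch a message handled after the most recent checkpoint is handled again after a restart, so a counter built this way can over-count unless the processing is idempotent.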
Dataflow
Stream A Stream B Stream C
Stream E
Stream B
Job 1 Job 2
Stream D
Job 3
Stateful Processing
• Windowed Aggregation
– Counting the number of page views for each user per hour
• Stream-Stream Join
– Join a stream of ad clicks to a stream of ad views to identify the view that led to the click
• Stream-Table Join
– Join user region info to a stream of page views to create an augmented stream
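As an illustration of the first case above, windowed aggregation, here is a minimal sketch that counts page views per user per hour. It assumes event timestamps in epoch milliseconds and fixed, non-overlapping hourly windows; `HourlyPageViewCounter` is a hypothetical name, and the in-memory map stands in for whatever state a real task would keep.

```java
import java.util.HashMap;
import java.util.Map;

/** Counts page views per user within fixed hourly windows. */
class HourlyPageViewCounter {
    private static final long HOUR_MILLIS = 60L * 60L * 1000L;
    // State: "userId@windowStart" -> count so far in that window.
    private final Map<String, Integer> counts = new HashMap<>();

    /** Records one page view and returns the user's count for that hour so far. */
    int record(String userId, long timestampMillis) {
        // Round the timestamp down to the start of its hour.
        long windowStart = (timestampMillis / HOUR_MILLIS) * HOUR_MILLIS;
        return counts.merge(userId + "@" + windowStart, 1, Integer::sum);
    }
}
```

The map is exactly the kind of state that grows with the number of users and windows, which is why the question of where to keep it matters.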
How do people do this?
• In-memory state with checkpointing
– Periodically save out the task’s in-memory data
– Becomes very expensive as state grows
– Some implementations checkpoint diffs, but this adds complexity
How do people do this?
• Using an external store
– Push state to an external store
– Performance suffers because of remote queries
– Lack of isolation
– Limited query capabilities
Stateful Tasks
[Figure: Task 1, Task 2, and Task 3 read Stream A and write to Stream B; later builds add a Changelog Stream next to Stream B.]
Key-Value Store
• put(table_name, key, value)
• get(table_name, key)
• delete(table_name, key)
• range(table_name, key1, key2)
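The four operations above, combined with the changelog stream from the preceding slides, can be sketched as a local store that logs every write and rebuilds itself by replaying the log. This is an assumption-laden illustration: `ChangelogBackedStore` and `Change` are hypothetical names, the `table_name` argument is dropped (one instance is one table), and a plain list stands in for the changelog stream.

```java
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

/** Toy key-value store whose writes are also appended to a changelog. */
class ChangelogBackedStore {
    /** One changelog entry; a null value marks a delete. */
    static final class Change {
        final String key;
        final String value;
        Change(String key, String value) { this.key = key; this.value = value; }
    }

    private final TreeMap<String, String> data = new TreeMap<>();
    private final List<Change> changelog;

    ChangelogBackedStore(List<Change> changelog) {
        this.changelog = changelog;
        // Rebuild local state by replaying earlier changes in order.
        for (Change change : changelog) {
            apply(change);
        }
    }

    private void apply(Change change) {
        if (change.value == null) {
            data.remove(change.key);
        } else {
            data.put(change.key, change.value);
        }
    }

    private void write(Change change) {
        apply(change);
        changelog.add(change);
    }

    void put(String key, String value) { write(new Change(key, value)); }

    void delete(String key) { write(new Change(key, null)); }

    String get(String key) { return data.get(key); }

    /** Entries with keys from key1 (inclusive) to key2 (exclusive), in key order. */
    SortedMap<String, String> range(String key1, String key2) {
        return data.subMap(key1, key2);
    }
}
```

Reads stay local, and a replacement task handed the same changelog ends up with the same contents.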
How is Samza’s Design Implemented?
Apache Kafka
• Persistent, reliable, distributed message queue
At LinkedIn
• 10+ billion writes per day
• 172k messages per second (average)
• 60+ billion messages per day to real-time consumers
Apache Kafka
• Models streams as topics
• Each topic is partitioned and each partition is replicated
• Producer sends messages to a topic
• Messages are stored in brokers
• Consumers consume from a topic (pull from broker)
YARN: Yet Another Resource Negotiator
• Framework to run your code on a grid of machines
• Distributes our tasks across multiple machines
• Notifies our framework when a task has died
• Isolates our tasks from each other
Jobs
[Figure: Task 1, Task 2, and Task 3 between Stream A and Stream B.]
Containers
[Figure: the same tasks and streams, with the tasks grouped into Samza Container 1 and Samza Container 2.]
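The grouping of three tasks into two containers can be sketched as a simple assignment function. The slides do not say how tasks are distributed, so the round-robin policy here is purely an assumption, and `ContainerAssignment` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;

/** Spreads tasks over a fixed number of containers (round-robin, as an assumed policy). */
class ContainerAssignment {
    static List<List<String>> assign(List<String> tasks, int containerCount) {
        List<List<String>> containers = new ArrayList<>();
        for (int i = 0; i < containerCount; i++) {
            containers.add(new ArrayList<>());
        }
        for (int i = 0; i < tasks.size(); i++) {
            // Task i goes to container i mod containerCount.
            containers.get(i % containerCount).add(tasks.get(i));
        }
        return containers;
    }
}
```

With three tasks and two containers, one container runs two tasks and the other runs one.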
YARN
[Figure: Samza Container 1 on Host 1 and Samza Container 2 on Host 2. Later builds add a NodeManager on each host, then the Samza YARN AM, then a Kafka Broker on each host.]
YARN
[Figure: the same layout for MapReduce: MapReduce Containers, a NodeManager and HDFS on each host, and a MapReduce YARN AM.]
YARN
[Figure: Host 1 with its NodeManager, Kafka Broker, and Samza Container 1, shown together with Stream A, Stream C, Samza Container 1, and Samza Container 2.]
CGroups
[Figure: the two-host Samza layout again (containers, NodeManagers, Kafka Brokers, Samza YARN AM), this time under the heading CGroups.]
How can you use Samza?
Tasks
Partition 0
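import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import org.apache.avro.generic.GenericRecord;
import org.apache.samza.system.*;
import org.apache.samza.task.*;
class PageKeyViewsCounterTask implements StreamTask {
  // Illustrative output stream; the slide leaves countStream undefined.
  private final SystemStream countStream = new SystemStream("kafka", "page-key-counts");
  private final ConcurrentHashMap<String, AtomicInteger> pageKeyViews = new ConcurrentHashMap<>();
  public void process(IncomingMessageEnvelope envelope,
                      MessageCollector collector,
                      TaskCoordinator coordinator) {
    GenericRecord record = (GenericRecord) envelope.getMessage();
    String pageKey = record.get("page-key").toString();
    // Create the counter the first time a page key is seen, then increment it.
    int newCount = pageKeyViews.computeIfAbsent(pageKey, k -> new AtomicInteger()).incrementAndGet();
    collector.send(new OutgoingMessageEnvelope(countStream, pageKey, newCount));
  }
}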
Stateful Stream Task
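import org.apache.avro.generic.GenericRecord;
import org.apache.samza.config.Config;
import org.apache.samza.storage.kv.KeyValueStore;
import org.apache.samza.system.IncomingMessageEnvelope;
import org.apache.samza.task.*;

public class SimpleStatefulTask implements StreamTask, InitableTask {
  private KeyValueStore<String, String> store;

  @SuppressWarnings("unchecked")
  public void init(Config config, TaskContext context) {
    // getStore returns an untyped object, so cast it to the configured store type.
    this.store = (KeyValueStore<String, String>) context.getStore("mystore");
  }

  public void process(
      IncomingMessageEnvelope envelope,
      MessageCollector collector,
      TaskCoordinator coordinator) {
    GenericRecord record = (GenericRecord) envelope.getMessage();
    // GenericRecord.get returns Object, so convert the fields to String.
    String memberId = record.get("member_id").toString();
    String name = record.get("name").toString();
    System.out.println("old name: " + store.get(memberId));
    store.put(memberId, name);
  }
}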
Example usage at LinkedIn
Call graph assembly
get_unread_msg_count()
get_PYMK()
get_Pulse_news()
get_relevant_ads()
get_news_updates()
Lots of calls == lots of machines, logs
get_unread_msg_count()
get_PYMK()
get_Pulse_news()
get_relevant_ads()
get_news_updates()
unread_msg_service_call
get_PYMK_service_call
pulse_news_service_call
add_relevance_service_call
news_update_service_call
TreeID: Unique identifier
page_view_event (123456)
unread_msg_service_call (123456)
another_service_call (123456)
silly_service_call (123456)
get_PYMK_service_call (123456) counter_service_call (123456)
unread_msg_service_call (123456)
count_invites_service_call (123
count_msgs_service_call (1234
OK, now lots of streams with TreeIDs…
[Figure: pipeline. The *_service_call streams feed the Samza job Repartition-By-TreeID, which writes all_service_calls (partitioned by TreeID); the Samza job Assemble Call Graph reads that stream and writes service_call_graphs.]
• Near real-time holistic view of how we’re actually serving data
• Compare day-over-day, cost, changes, outages
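The assembly step can be sketched as grouping service-call events by TreeID. Because the input is partitioned by TreeID, all events for one request reach the same task, so a per-task map is enough. This is an illustrative sketch only: `CallGraphAssembler` is a hypothetical name, events are reduced to plain strings, and the map stands in for the task's local state.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Collects the service calls that share a TreeID. */
class CallGraphAssembler {
    // State: TreeID -> service calls seen so far for that request.
    private final Map<String, List<String>> callsByTreeId = new LinkedHashMap<>();

    /** Adds one service-call event and returns all calls collected for its TreeID. */
    List<String> add(String treeId, String serviceCall) {
        List<String> calls = callsByTreeId.computeIfAbsent(treeId, id -> new ArrayList<>());
        calls.add(serviceCall);
        return calls;
    }
}
```

A fuller version would also track which call triggered which, in order to build a tree rather than a flat list, and would decide when a request's graph is complete enough to emit.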
Thank you
• Quick start: bit.ly/hello-samza
• Project homepage: samza.incubator.apache.org
• Newbie issues: bit.ly/samza_newbie_issues
• Detailed Samza and YARN talk: bit.ly/samza_and_yarn
• A must-read: http://bit.ly/jay_on_logs
• Twitter: @samzastream
• Me on Twitter: @sriramsub1

Contenu connexe

Tendances

Span Conference: Why your company needs a unified log
Span Conference: Why your company needs a unified logSpan Conference: Why your company needs a unified log
Span Conference: Why your company needs a unified logAlexander Dean
 
From Batch to Streaming ET(L) with Apache Apex at Berlin Buzzwords 2017
From Batch to Streaming ET(L) with Apache Apex at Berlin Buzzwords 2017From Batch to Streaming ET(L) with Apache Apex at Berlin Buzzwords 2017
From Batch to Streaming ET(L) with Apache Apex at Berlin Buzzwords 2017Thomas Weise
 
Amsterdam meetup at ING June 18, 2019
Amsterdam meetup at ING June 18, 2019Amsterdam meetup at ING June 18, 2019
Amsterdam meetup at ING June 18, 2019confluent
 
Using Apache Kafka to Analyze Session Windows
Using Apache Kafka to Analyze Session WindowsUsing Apache Kafka to Analyze Session Windows
Using Apache Kafka to Analyze Session Windowsconfluent
 
Neha Narkhede | Kafka Summit London 2019 Keynote | Event Streaming: Our Cloud...
Neha Narkhede | Kafka Summit London 2019 Keynote | Event Streaming: Our Cloud...Neha Narkhede | Kafka Summit London 2019 Keynote | Event Streaming: Our Cloud...
Neha Narkhede | Kafka Summit London 2019 Keynote | Event Streaming: Our Cloud...confluent
 
Crossing the streams viktor gamov
Crossing the streams viktor gamovCrossing the streams viktor gamov
Crossing the streams viktor gamovconfluent
 
ApacheCon2019 Talk: Improving the Observability of Cassandra, Kafka and Kuber...
ApacheCon2019 Talk: Improving the Observability of Cassandra, Kafka and Kuber...ApacheCon2019 Talk: Improving the Observability of Cassandra, Kafka and Kuber...
ApacheCon2019 Talk: Improving the Observability of Cassandra, Kafka and Kuber...Paul Brebner
 
Introducing Tupilak, Snowplow's unified log fabric
Introducing Tupilak, Snowplow's unified log fabricIntroducing Tupilak, Snowplow's unified log fabric
Introducing Tupilak, Snowplow's unified log fabricAlexander Dean
 
Beaming flink to the cloud @ netflix ff 2016-monal-daxini
Beaming flink to the cloud @ netflix   ff 2016-monal-daxiniBeaming flink to the cloud @ netflix   ff 2016-monal-daxini
Beaming flink to the cloud @ netflix ff 2016-monal-daxiniMonal Daxini
 
Kafka as an Event Store (Guido Schmutz, Trivadis) Kafka Summit NYC 2019
Kafka as an Event Store (Guido Schmutz, Trivadis) Kafka Summit NYC 2019Kafka as an Event Store (Guido Schmutz, Trivadis) Kafka Summit NYC 2019
Kafka as an Event Store (Guido Schmutz, Trivadis) Kafka Summit NYC 2019confluent
 
Harvesting the Power of Samza in LinkedIn's Feed
Harvesting the Power of Samza in LinkedIn's FeedHarvesting the Power of Samza in LinkedIn's Feed
Harvesting the Power of Samza in LinkedIn's FeedMohamed El-Geish
 
Streams and Tables: Two Sides of the Same Coin (BIRTE 2018)
Streams and Tables: Two Sides of the Same Coin (BIRTE 2018)Streams and Tables: Two Sides of the Same Coin (BIRTE 2018)
Streams and Tables: Two Sides of the Same Coin (BIRTE 2018)confluent
 
What Crimean War gunboats teach us about the need for schema registries
What Crimean War gunboats teach us about the need for schema registriesWhat Crimean War gunboats teach us about the need for schema registries
What Crimean War gunboats teach us about the need for schema registriesAlexander Dean
 
Production Ready Kafka on Kubernetes (Devandra Tagare, Lyft) Kafka Summit SF ...
Production Ready Kafka on Kubernetes (Devandra Tagare, Lyft) Kafka Summit SF ...Production Ready Kafka on Kubernetes (Devandra Tagare, Lyft) Kafka Summit SF ...
Production Ready Kafka on Kubernetes (Devandra Tagare, Lyft) Kafka Summit SF ...confluent
 
A guide through the Azure Messaging services - Update Conference
A guide through the Azure Messaging services - Update ConferenceA guide through the Azure Messaging services - Update Conference
A guide through the Azure Messaging services - Update ConferenceEldert Grootenboer
 
Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...
Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...
Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...confluent
 
Keynote: Jay Kreps, Confluent | Kafka ♥ Cloud | Kafka Summit 2020
Keynote: Jay Kreps, Confluent | Kafka ♥ Cloud | Kafka Summit 2020Keynote: Jay Kreps, Confluent | Kafka ♥ Cloud | Kafka Summit 2020
Keynote: Jay Kreps, Confluent | Kafka ♥ Cloud | Kafka Summit 2020confluent
 
Unbounded bounded-data-strangeloop-2016-monal-daxini
Unbounded bounded-data-strangeloop-2016-monal-daxiniUnbounded bounded-data-strangeloop-2016-monal-daxini
Unbounded bounded-data-strangeloop-2016-monal-daxiniMonal Daxini
 
Simplify Governance of Streaming Data
Simplify Governance of Streaming Data Simplify Governance of Streaming Data
Simplify Governance of Streaming Data confluent
 
AWS Re-Invent 2017 Netflix Keystone SPaaS - Monal Daxini - Abd320 2017
AWS Re-Invent 2017 Netflix Keystone SPaaS - Monal Daxini - Abd320 2017AWS Re-Invent 2017 Netflix Keystone SPaaS - Monal Daxini - Abd320 2017
AWS Re-Invent 2017 Netflix Keystone SPaaS - Monal Daxini - Abd320 2017Monal Daxini
 

Tendances (20)

Span Conference: Why your company needs a unified log
Span Conference: Why your company needs a unified logSpan Conference: Why your company needs a unified log
Span Conference: Why your company needs a unified log
 
From Batch to Streaming ET(L) with Apache Apex at Berlin Buzzwords 2017
From Batch to Streaming ET(L) with Apache Apex at Berlin Buzzwords 2017From Batch to Streaming ET(L) with Apache Apex at Berlin Buzzwords 2017
From Batch to Streaming ET(L) with Apache Apex at Berlin Buzzwords 2017
 
Amsterdam meetup at ING June 18, 2019
Amsterdam meetup at ING June 18, 2019Amsterdam meetup at ING June 18, 2019
Amsterdam meetup at ING June 18, 2019
 
Using Apache Kafka to Analyze Session Windows
Using Apache Kafka to Analyze Session WindowsUsing Apache Kafka to Analyze Session Windows
Using Apache Kafka to Analyze Session Windows
 
Neha Narkhede | Kafka Summit London 2019 Keynote | Event Streaming: Our Cloud...
Neha Narkhede | Kafka Summit London 2019 Keynote | Event Streaming: Our Cloud...Neha Narkhede | Kafka Summit London 2019 Keynote | Event Streaming: Our Cloud...
Neha Narkhede | Kafka Summit London 2019 Keynote | Event Streaming: Our Cloud...
 
Crossing the streams viktor gamov
Crossing the streams viktor gamovCrossing the streams viktor gamov
Crossing the streams viktor gamov
 
ApacheCon2019 Talk: Improving the Observability of Cassandra, Kafka and Kuber...
ApacheCon2019 Talk: Improving the Observability of Cassandra, Kafka and Kuber...ApacheCon2019 Talk: Improving the Observability of Cassandra, Kafka and Kuber...
ApacheCon2019 Talk: Improving the Observability of Cassandra, Kafka and Kuber...
 
Introducing Tupilak, Snowplow's unified log fabric
Introducing Tupilak, Snowplow's unified log fabricIntroducing Tupilak, Snowplow's unified log fabric
Introducing Tupilak, Snowplow's unified log fabric
 
Beaming flink to the cloud @ netflix ff 2016-monal-daxini
Beaming flink to the cloud @ netflix   ff 2016-monal-daxiniBeaming flink to the cloud @ netflix   ff 2016-monal-daxini
Beaming flink to the cloud @ netflix ff 2016-monal-daxini
 
Kafka as an Event Store (Guido Schmutz, Trivadis) Kafka Summit NYC 2019
Kafka as an Event Store (Guido Schmutz, Trivadis) Kafka Summit NYC 2019Kafka as an Event Store (Guido Schmutz, Trivadis) Kafka Summit NYC 2019
Kafka as an Event Store (Guido Schmutz, Trivadis) Kafka Summit NYC 2019
 
Harvesting the Power of Samza in LinkedIn's Feed
Harvesting the Power of Samza in LinkedIn's FeedHarvesting the Power of Samza in LinkedIn's Feed
Harvesting the Power of Samza in LinkedIn's Feed
 
Streams and Tables: Two Sides of the Same Coin (BIRTE 2018)
Streams and Tables: Two Sides of the Same Coin (BIRTE 2018)Streams and Tables: Two Sides of the Same Coin (BIRTE 2018)
Streams and Tables: Two Sides of the Same Coin (BIRTE 2018)
 
What Crimean War gunboats teach us about the need for schema registries
What Crimean War gunboats teach us about the need for schema registriesWhat Crimean War gunboats teach us about the need for schema registries
What Crimean War gunboats teach us about the need for schema registries
 
Production Ready Kafka on Kubernetes (Devandra Tagare, Lyft) Kafka Summit SF ...
Production Ready Kafka on Kubernetes (Devandra Tagare, Lyft) Kafka Summit SF ...Production Ready Kafka on Kubernetes (Devandra Tagare, Lyft) Kafka Summit SF ...
Production Ready Kafka on Kubernetes (Devandra Tagare, Lyft) Kafka Summit SF ...
 
A guide through the Azure Messaging services - Update Conference
A guide through the Azure Messaging services - Update ConferenceA guide through the Azure Messaging services - Update Conference
A guide through the Azure Messaging services - Update Conference
 
Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...
Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...
Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...
 
Keynote: Jay Kreps, Confluent | Kafka ♥ Cloud | Kafka Summit 2020
Keynote: Jay Kreps, Confluent | Kafka ♥ Cloud | Kafka Summit 2020Keynote: Jay Kreps, Confluent | Kafka ♥ Cloud | Kafka Summit 2020
Keynote: Jay Kreps, Confluent | Kafka ♥ Cloud | Kafka Summit 2020
 
Unbounded bounded-data-strangeloop-2016-monal-daxini
Unbounded bounded-data-strangeloop-2016-monal-daxiniUnbounded bounded-data-strangeloop-2016-monal-daxini
Unbounded bounded-data-strangeloop-2016-monal-daxini
 
Simplify Governance of Streaming Data
Simplify Governance of Streaming Data Simplify Governance of Streaming Data
Simplify Governance of Streaming Data
 
AWS Re-Invent 2017 Netflix Keystone SPaaS - Monal Daxini - Abd320 2017
AWS Re-Invent 2017 Netflix Keystone SPaaS - Monal Daxini - Abd320 2017AWS Re-Invent 2017 Netflix Keystone SPaaS - Monal Daxini - Abd320 2017
AWS Re-Invent 2017 Netflix Keystone SPaaS - Monal Daxini - Abd320 2017
 

En vedette

Apache Samza: Reliable Stream Processing Atop Apache Kafka and Hadoop YARN
Apache Samza: Reliable Stream Processing Atop Apache Kafka and Hadoop YARNApache Samza: Reliable Stream Processing Atop Apache Kafka and Hadoop YARN
Apache Samza: Reliable Stream Processing Atop Apache Kafka and Hadoop YARNblueboxtraveler
 
Building Real-time Data Products at LinkedIn with Apache Samza
Building Real-time Data Products at LinkedIn with Apache SamzaBuilding Real-time Data Products at LinkedIn with Apache Samza
Building Real-time Data Products at LinkedIn with Apache SamzaTrieu Nguyen
 
Samza at LinkedIn: Taking Stream Processing to the Next Level
Samza at LinkedIn: Taking Stream Processing to the Next LevelSamza at LinkedIn: Taking Stream Processing to the Next Level
Samza at LinkedIn: Taking Stream Processing to the Next LevelMartin Kleppmann
 
Apache Incubator Samza: Stream Processing at LinkedIn
Apache Incubator Samza: Stream Processing at LinkedInApache Incubator Samza: Stream Processing at LinkedIn
Apache Incubator Samza: Stream Processing at LinkedInChris Riccomini
 
Benchmarking Apache Samza: 1.2 million messages per sec per node
Benchmarking Apache Samza: 1.2 million messages per sec per nodeBenchmarking Apache Samza: 1.2 million messages per sec per node
Benchmarking Apache Samza: 1.2 million messages per sec per nodeTao Feng
 
Graph database super star
Graph database super starGraph database super star
Graph database super starandres_taylor
 

En vedette (6)

Apache Samza: Reliable Stream Processing Atop Apache Kafka and Hadoop YARN
Apache Samza: Reliable Stream Processing Atop Apache Kafka and Hadoop YARNApache Samza: Reliable Stream Processing Atop Apache Kafka and Hadoop YARN
Apache Samza: Reliable Stream Processing Atop Apache Kafka and Hadoop YARN
 
Building Real-time Data Products at LinkedIn with Apache Samza
Building Real-time Data Products at LinkedIn with Apache SamzaBuilding Real-time Data Products at LinkedIn with Apache Samza
Building Real-time Data Products at LinkedIn with Apache Samza
 
Samza at LinkedIn: Taking Stream Processing to the Next Level
Samza at LinkedIn: Taking Stream Processing to the Next LevelSamza at LinkedIn: Taking Stream Processing to the Next Level
Samza at LinkedIn: Taking Stream Processing to the Next Level
 
Apache Incubator Samza: Stream Processing at LinkedIn
Apache Incubator Samza: Stream Processing at LinkedInApache Incubator Samza: Stream Processing at LinkedIn
Apache Incubator Samza: Stream Processing at LinkedIn
 
Benchmarking Apache Samza: 1.2 million messages per sec per node
Benchmarking Apache Samza: 1.2 million messages per sec per nodeBenchmarking Apache Samza: 1.2 million messages per sec per node
Benchmarking Apache Samza: 1.2 million messages per sec per node
 
Graph database super star
Graph database super starGraph database super star
Graph database super star
 

Similaire à Apache Samza* Reliable Stream Processing atop Apache Kafka and Yarn

Apache Incubator Samza: Stream Processing at LinkedIn
Apache Incubator Samza: Stream Processing at LinkedInApache Incubator Samza: Stream Processing at LinkedIn
Apache Incubator Samza: Stream Processing at LinkedInChris Riccomini
 
Essential Ingredients of Realtime Stream Processing @ Scale
Essential Ingredients of Realtime Stream Processing @ ScaleEssential Ingredients of Realtime Stream Processing @ Scale
Essential Ingredients of Realtime Stream Processing @ ScaleKartik Paramasivam
 
Samza tech talk_2015 - huawei
Samza tech talk_2015 - huaweiSamza tech talk_2015 - huawei
Samza tech talk_2015 - huaweiYi Pan
 
Samza at LinkedIn
Samza at LinkedInSamza at LinkedIn
Samza at LinkedInVenu Ryali
 
stream-processing-at-linkedin-with-apache-samza
stream-processing-at-linkedin-with-apache-samzastream-processing-at-linkedin-with-apache-samza
stream-processing-at-linkedin-with-apache-samzaAbhishek Shivanna
 
Essential ingredients for real time stream processing @Scale by Kartik pParam...
Essential ingredients for real time stream processing @Scale by Kartik pParam...Essential ingredients for real time stream processing @Scale by Kartik pParam...
Essential ingredients for real time stream processing @Scale by Kartik pParam...Big Data Spain
 
ApacheCon BigData - What it takes to process a trillion events a day?
ApacheCon BigData - What it takes to process a trillion events a day?ApacheCon BigData - What it takes to process a trillion events a day?
ApacheCon BigData - What it takes to process a trillion events a day?Jagadish Venkatraman
 
Netflix Keystone Pipeline at Samza Meetup 10-13-2015
Netflix Keystone Pipeline at Samza Meetup 10-13-2015Netflix Keystone Pipeline at Samza Meetup 10-13-2015
Netflix Keystone Pipeline at Samza Meetup 10-13-2015Monal Daxini
 
Real Time analytics with Druid, Apache Spark and Kafka
Real Time analytics with Druid, Apache Spark and KafkaReal Time analytics with Druid, Apache Spark and Kafka
Apache Samza* Reliable Stream Processing atop Apache Kafka and Yarn

Editor's Notes

  1. Stream processing, for us, means anything asynchronous that is not batch computed. About 25% of code is async, 50% is RPC/online, and 25% is batch. Stream processing is the worst supported of the three.
  2. Provide timely, relevant updates to your newsfeed.
  3. Update search results with new information as it appears.
  4. Open area of research. Has been around for 20 years.
  5. Example: Stream 1 -> Ad Views.
  6. Partitioned.
  7. Replayable, ordered, fault tolerant, infinite. A very heavyweight definition of a stream (vs. S4, Storm, etc.).
  8. At-least-once messaging. Duplicates are possible. Future: exactly-once semantics. Transparent to the user. No ack'ing API.
  9. Connected by stream name only. Fully buffered.
  10. Other jobs can also consume these streams.
  11. Can't keep messages forever. Log compaction: delete overwritten keys over time.
  12. Store API is pluggable: Lucene, buffered sort, external sort, bitmap index, bloom filters and sketches.
  13. Very much a production system, critical to LinkedIn.
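
The log compaction note above (delete overwritten keys over time) can be illustrated with a minimal sketch. It assumes a log is an in-memory list of (key, value) records. The function name `compact` and the sample ad keys are hypothetical, and this is not how Kafka or Samza implement compaction; it only shows the idea of keeping the most recent record per key.

```python
def compact(log):
    """Keep only the most recent record for each key, preserving log order."""
    latest_offset = {}
    for offset, (key, _value) in enumerate(log):
        # A later record for the same key overwrites the earlier offset.
        latest_offset[key] = offset
    return [
        record
        for offset, record in enumerate(log)
        if latest_offset[record[0]] == offset
    ]


# Hypothetical changelog of per-ad counters (key, count).
log = [("ad-1", 1), ("ad-2", 1), ("ad-1", 2), ("ad-3", 1), ("ad-1", 3)]
compacted = compact(log)
print(compacted)
```

Replaying the compacted log rebuilds the same final key-to-value state as replaying the full log, which is why overwritten records can be dropped without losing the current state.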