SlideShare une entreprise Scribd logo
1  sur  32
Télécharger pour lire hors ligne
Concord: Simple & Flexible
Stream Processing on Apache Mesos
Shinji Kim
Co-founder, Concord Systems
@concord
@databythebay #datagrid
Overview
•  What is Stream Processing?
•  Today’s Stream Processing
•  Introducing Concord
1. Concepts & API
2. Job Topology Management
3. Operations, Toolings, Performance
4. Message Delivery Guarantees
•  Future Development Plans
Page 2
What is stream processing?
Page 3
•  Processing Data in motion
•  Sits between message queues and databases
•  Used for faster:
–  Data enrichment
–  Aggregation
–  Filtering / deduplication
Today’s Stream Processing
•  Faster MapReduce jobs à ends up running core
business logic on top
–  Fradulent click detection
–  Real-time budget updates
–  Trigger-based trading
•  Your stream processing jobs are more like microservices
•  Need support for services / application management:
Cluster mgmt, Monitoring, Debuggability
Page 4
Introducing Concord
Concord is a distributed stream processing framework
built in C++ on top of Apache Mesos, designed for
high-performance, real-time applications that require
flexibility & control.
Page 5
Introducing Concord
Page 6
Data	
  Sources	
   Data	
  Sinks	
  
Pub / Sub Operator Model
•  Composable jobs by Metadata
A	
   B	
  
words	
  Metadata(
Name=‘A’,
istreams=[],
ostreams=[‘words’])
Metadata(
Name=‘B’,
istreams=[‘words’,
StreamGrouping.GROUP_BY],
ostreams=[])
Page 7
Pub / Sub Operator Model
•  Composable jobs by Metadata
A	
   B	
  
words	
  Metadata(
Name=‘A’,
istreams=[],
ostreams=[‘words’])
Metadata(
Name=‘B’,
istreams=[‘words’,
StreamGrouping.GROUP_BY],
ostreams=[])
Page 8
C	
   Metadata(
Name=‘C’,
istreams=[‘words’,
StreamGrouping.SHUFFLE],
ostreams=[])
Simple API in Multiple Languages
•  ProcessRecord, ProduceRecord, ProcessTimer
•  GetState, SetState backed by Rocksdb
•  API available in Python, Ruby, Go, Java/Scala, C++
B	
  Metadata(
Name=‘C’,
istreams=[‘words’,
StreamGrouping.GROUP_BY],
ostreams=[‘wordcount’])
Page 9
words	
   wordcount	
  
Key	
   Value	
  
Corgi	
   2	
  
Chiwawa	
   4	
  
Dashhound	
   5	
  
Useful for multiple teams to consume the same
streaming data in real-time
Page 10
Native Integration with Apache Mesos
Page 11
•  Dynamic resource
scheduling
•  Task Isolation
•  Task supervision
•  High Availability
Containerized Execution Environment
•  Horizontal scaling
•  Multi-tenancy
•  Hot code deployment &
dynamic topology
Page 12
Mesos	
  Agent	
  
RocksDB	
  
Concord is Flexible: Run-time deployment
Page 13
Concord is Flexible: Run-time deployment
Page 14
Concord is Flexible: Run-time deployment
Page 15
Concord is Flexible: Run-time deployment
Page 16
Concord supports Distributed Tracing
Page 17
Monitor all operator instances at glance
Page 18
Concord supports Transparent Debugging
[2015-11-02 15:36:44.770] [dispatcher_latencies] [info] 127.0.0.1:31000:
traceId: -8816532120874703981,
parentId: 0, id: -6816766813334129096,
p50: 388179us, p95: 519668us, p99: 524812us, p999: 526425us
[2015-11-02 15:37:13.929] [principal_latencies] [info] 127.0.0.1:31001:
traceId: -4811311467074699790,
parentId: -7681059555040553620,
id: -1899872683843643522,
p50: 73355us, p95: 145626us, p99: 210345us, p999: 272018us
[2015-11-02 15:36:43.323] [incoming_throughput] [info] 12288 req in 1045515us. total: 367616 req
[2015-11-02 15:36:30.240] [outgoing_throughput] [info] 100000 req in 4804526us. total: 600000 req
Page 19
Concord performs well at scale
•  Word count benchmark (1.13B msgs)
–  Concord: 500K QPS/node at 10ms/event
–  Storm: 16K QPS/node at 100ms/event
–  Spark Streaming: 100K QPS/node at 1s batch window
•  Server log processing (29G server log, ~260M msgs)
–  4 nodes, 8 vCPU, 32GB RAM each
–  Concord: 1M – 1.8M QPS
–  Spark Streaming: 72K – 2M QPS
•  Consistent performance
Page 20
Concord is designed for Predictability
•  As you scale, JVM reconfiguration and GC pauses are
inevitable (Framework GC vs. Application GC)
•  Cluster abstracted as CPU, Memory, Disk numbers à
cluster optimization & overall runtime
•  Fast Compile à Test à Deploy cycle without downtime
Page 21
Message Delivery Guarantees
Today: Fast > Complete or Perfect
•  Best-effort / at-most-once processing
–  When operator or node crashes, the local cache goes away
–  Automatically retries the failed operator (number of retries is
configurable)
–  Recommends implementing check mechanisms in operators
(e.g., Concord Kafka consumer)
Page 22
Message Delivery Guarantees
Soon: Fast + Complete > Perfect
•  In development for at-least-once with Kafka
–  Kafka acts as a message bus between operators
–  Kafka replays data from checked offset (data duplication)
Eventually: Fast + Complete + Perfect
•  Transactional datastore in design phase
Page 23
Future plans
•  “At least once” guarantee support with Kafka
•  DC/OS integration
•  More data source / data sink connector support
•  Higher level DSL
Page 24
Concord: Simple & Flexible streaming application
framework on Apache Mesos
Page 25
•  Operator model that you can use multiple languages
Concord: Simple & Flexible streaming application
framework on Apache Mesos
Page 26
•  Operator model that you can use multiple languages
à Fast development and iteration time for multiple
teams using the same data
Concord: Simple & Flexible streaming application
framework on Apache Mesos
Page 27
•  Operator model that you can use multiple languages
à Fast development and iteration time for multiple
teams using the same data
•  Dynamic topology, run-time deployment and scaling
Concord: Simple & Flexible streaming application
framework on Apache Mesos
Page 28
•  Operator model that you can use multiple languages
à Fast development and iteration time for multiple
teams using the same data
•  Dynamic topology, run-time deployment and scaling
à Decoupled development & dev ops work
Concord: Simple & Flexible streaming application
framework on Apache Mesos
Page 29
•  Operator model that you can use multiple languages
à Fast development and iteration time for multiple
teams using the same data
•  Dynamic topology, run-time deployment and scaling
à Decoupled development & dev ops work
•  High performance at scale
Concord: Simple & Flexible streaming application
framework on Apache Mesos
Page 30
•  Operator model that you can use multiple languages
à Fast development and iteration time for multiple
teams using the same data
•  Dynamic topology, run-time deployment and scaling
à Decoupled development & dev ops work
•  High performance at scale
à Predictable system for real-time applications
Concord: Simple & Flexible streaming application
framework on Apache Mesos
Page 31
•  Low-latency / Real-time applications:
–  Real-time fraud detection
–  Financial market data processing for real-time risks and triggers
–  Real-time campaign management for real-time bidding (RTB)
Thank You!
Get Started: http://concord.io
shinji@concord.io / @shinjikim
@concord
@databythebay #datagrid

Contenu connexe

Tendances

Spark-on-YARN: Empower Spark Applications on Hadoop Cluster
Spark-on-YARN: Empower Spark Applications on Hadoop ClusterSpark-on-YARN: Empower Spark Applications on Hadoop Cluster
Spark-on-YARN: Empower Spark Applications on Hadoop ClusterDataWorks Summit
 
Iceberg + Alluxio for Fast Data Analytics
Iceberg + Alluxio for Fast Data AnalyticsIceberg + Alluxio for Fast Data Analytics
Iceberg + Alluxio for Fast Data AnalyticsAlluxio, Inc.
 
Changelog Stream Processing with Apache Flink
Changelog Stream Processing with Apache FlinkChangelog Stream Processing with Apache Flink
Changelog Stream Processing with Apache FlinkFlink Forward
 
Let's Build an Inverted Index: Introduction to Apache Lucene/Solr
Let's Build an Inverted Index: Introduction to Apache Lucene/SolrLet's Build an Inverted Index: Introduction to Apache Lucene/Solr
Let's Build an Inverted Index: Introduction to Apache Lucene/SolrSease
 
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...GetInData
 
How We Reduced Performance Tuning Time by Orders of Magnitude with Database O...
How We Reduced Performance Tuning Time by Orders of Magnitude with Database O...How We Reduced Performance Tuning Time by Orders of Magnitude with Database O...
How We Reduced Performance Tuning Time by Orders of Magnitude with Database O...ScyllaDB
 
Trino: A Ludicrously Fast Query Engine - Pulsar Summit NA 2021
Trino: A Ludicrously Fast Query Engine - Pulsar Summit NA 2021Trino: A Ludicrously Fast Query Engine - Pulsar Summit NA 2021
Trino: A Ludicrously Fast Query Engine - Pulsar Summit NA 2021StreamNative
 
Top 5 Mistakes to Avoid When Writing Apache Spark Applications
Top 5 Mistakes to Avoid When Writing Apache Spark ApplicationsTop 5 Mistakes to Avoid When Writing Apache Spark Applications
Top 5 Mistakes to Avoid When Writing Apache Spark ApplicationsCloudera, Inc.
 
Building an open data platform with apache iceberg
Building an open data platform with apache icebergBuilding an open data platform with apache iceberg
Building an open data platform with apache icebergAlluxio, Inc.
 
Optimizing S3 Write-heavy Spark workloads
Optimizing S3 Write-heavy Spark workloadsOptimizing S3 Write-heavy Spark workloads
Optimizing S3 Write-heavy Spark workloadsdatamantra
 
A Practical Introduction to Handling Log Data in ClickHouse, by Robert Hodges...
A Practical Introduction to Handling Log Data in ClickHouse, by Robert Hodges...A Practical Introduction to Handling Log Data in ClickHouse, by Robert Hodges...
A Practical Introduction to Handling Log Data in ClickHouse, by Robert Hodges...Altinity Ltd
 
Hive, Presto, and Spark on TPC-DS benchmark
Hive, Presto, and Spark on TPC-DS benchmarkHive, Presto, and Spark on TPC-DS benchmark
Hive, Presto, and Spark on TPC-DS benchmarkDongwon Kim
 
[232] 성능어디까지쥐어짜봤니 송태웅
[232] 성능어디까지쥐어짜봤니 송태웅[232] 성능어디까지쥐어짜봤니 송태웅
[232] 성능어디까지쥐어짜봤니 송태웅NAVER D2
 
Scalable Filesystem Metadata Services with RocksDB
Scalable Filesystem Metadata Services with RocksDBScalable Filesystem Metadata Services with RocksDB
Scalable Filesystem Metadata Services with RocksDBAlluxio, Inc.
 
Tame the small files problem and optimize data layout for streaming ingestion...
Tame the small files problem and optimize data layout for streaming ingestion...Tame the small files problem and optimize data layout for streaming ingestion...
Tame the small files problem and optimize data layout for streaming ingestion...Flink Forward
 
Data ingestion and distribution with apache NiFi
Data ingestion and distribution with apache NiFiData ingestion and distribution with apache NiFi
Data ingestion and distribution with apache NiFiLev Brailovskiy
 
Performance Optimizations in Apache Impala
Performance Optimizations in Apache ImpalaPerformance Optimizations in Apache Impala
Performance Optimizations in Apache ImpalaCloudera, Inc.
 

Tendances (20)

Spark-on-YARN: Empower Spark Applications on Hadoop Cluster
Spark-on-YARN: Empower Spark Applications on Hadoop ClusterSpark-on-YARN: Empower Spark Applications on Hadoop Cluster
Spark-on-YARN: Empower Spark Applications on Hadoop Cluster
 
InnoDB Locking Explained with Stick Figures
InnoDB Locking Explained with Stick FiguresInnoDB Locking Explained with Stick Figures
InnoDB Locking Explained with Stick Figures
 
Iceberg + Alluxio for Fast Data Analytics
Iceberg + Alluxio for Fast Data AnalyticsIceberg + Alluxio for Fast Data Analytics
Iceberg + Alluxio for Fast Data Analytics
 
Changelog Stream Processing with Apache Flink
Changelog Stream Processing with Apache FlinkChangelog Stream Processing with Apache Flink
Changelog Stream Processing with Apache Flink
 
Let's Build an Inverted Index: Introduction to Apache Lucene/Solr
Let's Build an Inverted Index: Introduction to Apache Lucene/SolrLet's Build an Inverted Index: Introduction to Apache Lucene/Solr
Let's Build an Inverted Index: Introduction to Apache Lucene/Solr
 
Introduction to HDFS
Introduction to HDFSIntroduction to HDFS
Introduction to HDFS
 
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...
 
How We Reduced Performance Tuning Time by Orders of Magnitude with Database O...
How We Reduced Performance Tuning Time by Orders of Magnitude with Database O...How We Reduced Performance Tuning Time by Orders of Magnitude with Database O...
How We Reduced Performance Tuning Time by Orders of Magnitude with Database O...
 
Trino: A Ludicrously Fast Query Engine - Pulsar Summit NA 2021
Trino: A Ludicrously Fast Query Engine - Pulsar Summit NA 2021Trino: A Ludicrously Fast Query Engine - Pulsar Summit NA 2021
Trino: A Ludicrously Fast Query Engine - Pulsar Summit NA 2021
 
Top 5 Mistakes to Avoid When Writing Apache Spark Applications
Top 5 Mistakes to Avoid When Writing Apache Spark ApplicationsTop 5 Mistakes to Avoid When Writing Apache Spark Applications
Top 5 Mistakes to Avoid When Writing Apache Spark Applications
 
Building an open data platform with apache iceberg
Building an open data platform with apache icebergBuilding an open data platform with apache iceberg
Building an open data platform with apache iceberg
 
Optimizing S3 Write-heavy Spark workloads
Optimizing S3 Write-heavy Spark workloadsOptimizing S3 Write-heavy Spark workloads
Optimizing S3 Write-heavy Spark workloads
 
A Practical Introduction to Handling Log Data in ClickHouse, by Robert Hodges...
A Practical Introduction to Handling Log Data in ClickHouse, by Robert Hodges...A Practical Introduction to Handling Log Data in ClickHouse, by Robert Hodges...
A Practical Introduction to Handling Log Data in ClickHouse, by Robert Hodges...
 
Hive, Presto, and Spark on TPC-DS benchmark
Hive, Presto, and Spark on TPC-DS benchmarkHive, Presto, and Spark on TPC-DS benchmark
Hive, Presto, and Spark on TPC-DS benchmark
 
[232] 성능어디까지쥐어짜봤니 송태웅
[232] 성능어디까지쥐어짜봤니 송태웅[232] 성능어디까지쥐어짜봤니 송태웅
[232] 성능어디까지쥐어짜봤니 송태웅
 
Internal Hive
Internal HiveInternal Hive
Internal Hive
 
Scalable Filesystem Metadata Services with RocksDB
Scalable Filesystem Metadata Services with RocksDBScalable Filesystem Metadata Services with RocksDB
Scalable Filesystem Metadata Services with RocksDB
 
Tame the small files problem and optimize data layout for streaming ingestion...
Tame the small files problem and optimize data layout for streaming ingestion...Tame the small files problem and optimize data layout for streaming ingestion...
Tame the small files problem and optimize data layout for streaming ingestion...
 
Data ingestion and distribution with apache NiFi
Data ingestion and distribution with apache NiFiData ingestion and distribution with apache NiFi
Data ingestion and distribution with apache NiFi
 
Performance Optimizations in Apache Impala
Performance Optimizations in Apache ImpalaPerformance Optimizations in Apache Impala
Performance Optimizations in Apache Impala
 

Similaire à Concord: Flexible Stream Processing on Apache Mesos

Migrate to platform of your choice
Migrate to platform of your choiceMigrate to platform of your choice
Migrate to platform of your choiceAshnikbiz
 
Shaping the Future of Travel with MongoDB
Shaping the Future of Travel with MongoDBShaping the Future of Travel with MongoDB
Shaping the Future of Travel with MongoDBMongoDB
 
Move fast and make things with microservices
Move fast and make things with microservicesMove fast and make things with microservices
Move fast and make things with microservicesMithun Arunan
 
Faster, Simpler, Better - MongoDB to the rescue
Faster, Simpler, Better - MongoDB to the rescue Faster, Simpler, Better - MongoDB to the rescue
Faster, Simpler, Better - MongoDB to the rescue MongoDB
 
Webinar: Enterprise Trends for Database-as-a-Service
Webinar: Enterprise Trends for Database-as-a-ServiceWebinar: Enterprise Trends for Database-as-a-Service
Webinar: Enterprise Trends for Database-as-a-ServiceMongoDB
 
Unconference Round Table Notes
Unconference Round Table NotesUnconference Round Table Notes
Unconference Round Table NotesTimothy Spann
 
Docker:- Application Delivery Platform Towards Edge Computing
Docker:- Application Delivery Platform Towards Edge ComputingDocker:- Application Delivery Platform Towards Edge Computing
Docker:- Application Delivery Platform Towards Edge ComputingBukhary Ikhwan Ismail
 
Event-Driven Applications Done Right - Pulsar Summit SF 2022
Event-Driven Applications Done Right - Pulsar Summit SF 2022Event-Driven Applications Done Right - Pulsar Summit SF 2022
Event-Driven Applications Done Right - Pulsar Summit SF 2022StreamNative
 
Webinar: Enterprise Trends for Database-as-a-Service
Webinar: Enterprise Trends for Database-as-a-ServiceWebinar: Enterprise Trends for Database-as-a-Service
Webinar: Enterprise Trends for Database-as-a-ServiceMongoDB
 
Dubbo and Weidian's practice on micro-service architecture
Dubbo and Weidian's practice on micro-service architectureDubbo and Weidian's practice on micro-service architecture
Dubbo and Weidian's practice on micro-service architectureHuxing Zhang
 
Accelerating a Path to Digital With a Cloud Data Strategy
Accelerating a Path to Digital With a Cloud Data StrategyAccelerating a Path to Digital With a Cloud Data Strategy
Accelerating a Path to Digital With a Cloud Data StrategyMongoDB
 
DevOps LA Meetup Intro to Habitat
DevOps LA Meetup Intro to HabitatDevOps LA Meetup Intro to Habitat
DevOps LA Meetup Intro to HabitatJessica DeVita
 
130815 - Content Delviery Networks for the IEEE Singapore Broadcast group
130815 - Content Delviery Networks for the IEEE Singapore Broadcast group130815 - Content Delviery Networks for the IEEE Singapore Broadcast group
130815 - Content Delviery Networks for the IEEE Singapore Broadcast groupPasocoPteLtd
 
Marketing Automation at Scale: How Marketo Solved Key Data Management Challen...
Marketing Automation at Scale: How Marketo Solved Key Data Management Challen...Marketing Automation at Scale: How Marketo Solved Key Data Management Challen...
Marketing Automation at Scale: How Marketo Solved Key Data Management Challen...Continuent
 
Network Automation Journey, A systems engineer NetOps perspective
Network Automation Journey, A systems engineer NetOps perspectiveNetwork Automation Journey, A systems engineer NetOps perspective
Network Automation Journey, A systems engineer NetOps perspectiveWalid Shaari
 
.NET Cloud-Native Bootcamp- Los Angeles
.NET Cloud-Native Bootcamp- Los Angeles.NET Cloud-Native Bootcamp- Los Angeles
.NET Cloud-Native Bootcamp- Los AngelesVMware Tanzu
 
AdminCamp 2018 - ApplicationInsights für Administratoren
AdminCamp 2018 - ApplicationInsights für AdministratorenAdminCamp 2018 - ApplicationInsights für Administratoren
AdminCamp 2018 - ApplicationInsights für AdministratorenChristoph Adler
 
MuleSoft Manchester Meetup #4 slides 11th February 2021
MuleSoft Manchester Meetup #4 slides 11th February 2021MuleSoft Manchester Meetup #4 slides 11th February 2021
MuleSoft Manchester Meetup #4 slides 11th February 2021Ieva Navickaite
 

Similaire à Concord: Flexible Stream Processing on Apache Mesos (20)

Migrate to platform of your choice
Migrate to platform of your choiceMigrate to platform of your choice
Migrate to platform of your choice
 
Shaping the Future of Travel with MongoDB
Shaping the Future of Travel with MongoDBShaping the Future of Travel with MongoDB
Shaping the Future of Travel with MongoDB
 
Move fast and make things with microservices
Move fast and make things with microservicesMove fast and make things with microservices
Move fast and make things with microservices
 
Faster, Simpler, Better - MongoDB to the rescue
Faster, Simpler, Better - MongoDB to the rescue Faster, Simpler, Better - MongoDB to the rescue
Faster, Simpler, Better - MongoDB to the rescue
 
Webinar: Enterprise Trends for Database-as-a-Service
Webinar: Enterprise Trends for Database-as-a-ServiceWebinar: Enterprise Trends for Database-as-a-Service
Webinar: Enterprise Trends for Database-as-a-Service
 
Unconference Round Table Notes
Unconference Round Table NotesUnconference Round Table Notes
Unconference Round Table Notes
 
Docker:- Application Delivery Platform Towards Edge Computing
Docker:- Application Delivery Platform Towards Edge ComputingDocker:- Application Delivery Platform Towards Edge Computing
Docker:- Application Delivery Platform Towards Edge Computing
 
Event-Driven Applications Done Right - Pulsar Summit SF 2022
Event-Driven Applications Done Right - Pulsar Summit SF 2022Event-Driven Applications Done Right - Pulsar Summit SF 2022
Event-Driven Applications Done Right - Pulsar Summit SF 2022
 
Webinar: Enterprise Trends for Database-as-a-Service
Webinar: Enterprise Trends for Database-as-a-ServiceWebinar: Enterprise Trends for Database-as-a-Service
Webinar: Enterprise Trends for Database-as-a-Service
 
Dubbo and Weidian's practice on micro-service architecture
Dubbo and Weidian's practice on micro-service architectureDubbo and Weidian's practice on micro-service architecture
Dubbo and Weidian's practice on micro-service architecture
 
Accelerating a Path to Digital With a Cloud Data Strategy
Accelerating a Path to Digital With a Cloud Data StrategyAccelerating a Path to Digital With a Cloud Data Strategy
Accelerating a Path to Digital With a Cloud Data Strategy
 
DevOps LA Meetup Intro to Habitat
DevOps LA Meetup Intro to HabitatDevOps LA Meetup Intro to Habitat
DevOps LA Meetup Intro to Habitat
 
130815 - Content Delviery Networks for the IEEE Singapore Broadcast group
130815 - Content Delviery Networks for the IEEE Singapore Broadcast group130815 - Content Delviery Networks for the IEEE Singapore Broadcast group
130815 - Content Delviery Networks for the IEEE Singapore Broadcast group
 
Marketing Automation at Scale: How Marketo Solved Key Data Management Challen...
Marketing Automation at Scale: How Marketo Solved Key Data Management Challen...Marketing Automation at Scale: How Marketo Solved Key Data Management Challen...
Marketing Automation at Scale: How Marketo Solved Key Data Management Challen...
 
Network Automation Journey, A systems engineer NetOps perspective
Network Automation Journey, A systems engineer NetOps perspectiveNetwork Automation Journey, A systems engineer NetOps perspective
Network Automation Journey, A systems engineer NetOps perspective
 
Streaming in the Wild with Apache Flink
Streaming in the Wild with Apache FlinkStreaming in the Wild with Apache Flink
Streaming in the Wild with Apache Flink
 
.NET Cloud-Native Bootcamp- Los Angeles
.NET Cloud-Native Bootcamp- Los Angeles.NET Cloud-Native Bootcamp- Los Angeles
.NET Cloud-Native Bootcamp- Los Angeles
 
AdminCamp 2018 - ApplicationInsights für Administratoren
AdminCamp 2018 - ApplicationInsights für AdministratorenAdminCamp 2018 - ApplicationInsights für Administratoren
AdminCamp 2018 - ApplicationInsights für Administratoren
 
Robotics technical Presentation
Robotics technical PresentationRobotics technical Presentation
Robotics technical Presentation
 
MuleSoft Manchester Meetup #4 slides 11th February 2021
MuleSoft Manchester Meetup #4 slides 11th February 2021MuleSoft Manchester Meetup #4 slides 11th February 2021
MuleSoft Manchester Meetup #4 slides 11th February 2021
 

Dernier

🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 

Dernier (20)

🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 

Concord: Flexible Stream Processing on Apache Mesos

  • 1. Concord: Simple & Flexible Stream Processing on Apache Mesos Shinji Kim Co-founder, Concord Systems @concord @databythebay #datagrid
  • 2. Overview •  What is Stream Processing? •  Today’s Stream Processing •  Introducing Concord 1. Concepts & API 2. Job Topology Management 3. Operations, Toolings, Performance 4. Message Delivery Guarantees •  Future Development Plans Page 2
  • 3. What is stream processing? Page 3 •  Processing Data in motion •  Sits between message queues and databases •  Used for faster: –  Data enrichment –  Aggregation –  Filtering / deduplication
  • 4. Today’s Stream Processing •  Faster MapReduce jobs à ends up running core business logic on top –  Fradulent click detection –  Real-time budget updates –  Trigger-based trading •  Your stream processing jobs are more like microservices •  Need support for services / application management: Cluster mgmt, Monitoring, Debuggability Page 4
  • 5. Introducing Concord Concord is a distributed stream processing framework built in C++ on top of Apache Mesos, designed for high-performance, real-time applications that require flexibility & control. Page 5
  • 6. Introducing Concord Page 6 Data  Sources   Data  Sinks  
  • 7. Pub / Sub Operator Model •  Composable jobs by Metadata A   B   words  Metadata( Name=‘A’, istreams=[], ostreams=[‘words’]) Metadata( Name=‘B’, istreams=[‘words’, StreamGrouping.GROUP_BY], ostreams=[]) Page 7
  • 8. Pub / Sub Operator Model •  Composable jobs by Metadata A   B   words  Metadata( Name=‘A’, istreams=[], ostreams=[‘words’]) Metadata( Name=‘B’, istreams=[‘words’, StreamGrouping.GROUP_BY], ostreams=[]) Page 8 C   Metadata( Name=‘C’, istreams=[‘words’, StreamGrouping.SHUFFLE], ostreams=[])
  • 9. Simple API in Multiple Languages •  ProcessRecord, ProduceRecord, ProcessTimer •  GetState, SetState backed by Rocksdb •  API available in Python, Ruby, Go, Java/Scala, C++ B  Metadata( Name=‘C’, istreams=[‘words’, StreamGrouping.GROUP_BY], ostreams=[‘wordcount’]) Page 9 words   wordcount   Key   Value   Corgi   2   Chiwawa   4   Dashhound   5  
  • 10. Useful for multiple teams to consume the same streaming data in real-time Page 10
  • 11. Native Integration with Apache Mesos Page 11 •  Dynamic resource scheduling •  Task Isolation •  Task supervision •  High Availability
  • 12. Containerized Execution Environment •  Horizontal scaling •  Multi-tenancy •  Hot code deployment & dynamic topology Page 12 Mesos  Agent   RocksDB  
  • 13. Concord is Flexible: Run-time deployment Page 13
  • 14. Concord is Flexible: Run-time deployment Page 14
  • 15. Concord is Flexible: Run-time deployment Page 15
  • 16. Concord is Flexible: Run-time deployment Page 16
  • 17. Concord supports Distributed Tracing Page 17
  • 18. Monitor all operator instances at glance Page 18
  • 19. Concord supports Transparent Debugging [2015-11-02 15:36:44.770] [dispatcher_latencies] [info] 127.0.0.1:31000: traceId: -8816532120874703981, parentId: 0, id: -6816766813334129096, p50: 388179us, p95: 519668us, p99: 524812us, p999: 526425us [2015-11-02 15:37:13.929] [principal_latencies] [info] 127.0.0.1:31001: traceId: -4811311467074699790, parentId: -7681059555040553620, id: -1899872683843643522, p50: 73355us, p95: 145626us, p99: 210345us, p999: 272018us [2015-11-02 15:36:43.323] [incoming_throughput] [info] 12288 req in 1045515us. total: 367616 req [2015-11-02 15:36:30.240] [outgoing_throughput] [info] 100000 req in 4804526us. total: 600000 req Page 19
  • 20. Concord performs well at scale •  Word count benchmark (1.13B msgs) –  Concord: 500K QPS/node at 10ms/event –  Storm: 16K QPS/node at 100ms/event –  Spark Streaming: 100K QPS/node at 1s batch window •  Server log processing (29G server log, ~260M msgs) –  4 nodes, 8 vCPU, 32GB RAM each –  Concord: 1M – 1.8M QPS –  Spark Streaming: 72K – 2M QPS •  Consistent performance Page 20
  • 21. Concord is designed for Predictability •  As you scale, JVM reconfiguration and GC pauses are inevitable (Framework GC vs. Application GC) •  Cluster abstracted as CPU, Memory, Disk numbers à cluster optimization & overall runtime •  Fast Compile à Test à Deploy cycle without downtime Page 21
  • 22. Message Delivery Guarantees Today: Fast > Complete or Perfect •  Best-effort / at-most-once processing –  When operator or node crashes, the local cache goes away –  Automatically retries the failed operator (number of retries is configurable) –  Recommends implementing check mechanisms in operators (e.g., Concord Kafka consumer) Page 22
  • 23. Message Delivery Guarantees Soon: Fast + Complete > Perfect •  In development for at-least-once with Kafka –  Kafka acts as a message bus between operators –  Kafka replays data from checked offset (data duplication) Eventually: Fast + Complete + Perfect •  Transactional datastore in design phase Page 23
  • 24. Future plans •  “At least once” guarantee support with Kafka •  DC/OS integration •  More data source / data sink connector support •  Higher level DSL Page 24
  • 25. Concord: Simple & Flexible streaming application framework on Apache Mesos Page 25 •  Operator model that you can use multiple languages
  • 26. Concord: Simple & Flexible streaming application framework on Apache Mesos Page 26 •  Operator model that you can use multiple languages à Fast development and iteration time for multiple teams using the same data
  • 27. Concord: Simple & Flexible streaming application framework on Apache Mesos Page 27 •  Operator model that you can use multiple languages à Fast development and iteration time for multiple teams using the same data •  Dynamic topology, run-time deployment and scaling
  • 28. Concord: Simple & Flexible streaming application framework on Apache Mesos Page 28 •  Operator model that you can use multiple languages à Fast development and iteration time for multiple teams using the same data •  Dynamic topology, run-time deployment and scaling à Decoupled development & dev ops work
  • 29. Concord: Simple & Flexible streaming application framework on Apache Mesos Page 29 •  Operator model that you can use multiple languages à Fast development and iteration time for multiple teams using the same data •  Dynamic topology, run-time deployment and scaling à Decoupled development & dev ops work •  High performance at scale
  • 30. Concord: Simple & Flexible streaming application framework on Apache Mesos Page 30 •  Operator model that you can use multiple languages à Fast development and iteration time for multiple teams using the same data •  Dynamic topology, run-time deployment and scaling à Decoupled development & dev ops work •  High performance at scale à Predictable system for real-time applications
  • 31. Concord: Simple & Flexible streaming application framework on Apache Mesos Page 31 •  Low-latency / Real-time applications: –  Real-time fraud detection –  Financial market data processing for real-time risks and triggers –  Real-time campaign management for real-time bidding (RTB)
  • 32. Thank You! Get Started: http://concord.io shinji@concord.io / @shinjikim @concord @databythebay #datagrid