Rhino: Efficient Management of
Very Large Distributed State
for Stream Processing Engines
Bonaventura Del Monte, Steffen Zeuch, Tilmann Rabl, Volker Markl
ACM SIGMOD 2020
What is this talk about?
Enabling Continuous Stateful Stream Processing
in the presence of TB-sized operator state,
regardless of failures and data rate fluctuations
Use case: a real-time bidding platform
High-cardinality data stream + long-running stateful queries + large temporal aggregations or joins =
Very Large Distributed State
[Figure: users (0-th, 1M-th, ..., 100M-th) produce 1B+ events at M events/s into a distributed stateful SPE running Query 1 ... Query N, which feeds real-time dashboards, real-time user suggestions, and historical data]
What could go wrong?
Very Large Distributed State + anomalous operational events =
slow reconfiguration = high latency + downtime + data loss = DISASTER
[Figure: the bidding pipeline (users → distributed stateful SPE running Query 1 ... Query N → outputs) hit by a failure and a data rate fluctuation; the outputs shown are real-time dashboards, real-time System X, and historical data]
What about current SPEs?
Production-ready SPEs
• Spark/Flink/Storm
• Reconfiguration via restart
• Support TB-sized state
Research prototypes
• Megaphone/Chi/SDG/SEEP
• Fine-grained reconfiguration
• Small state size
We want to efficiently support TB-sized state and provide fine-grained reconfiguration of running queries
Do we really need yet another system?
NeXMark Query 8 (Large Windowed Join) on 8+1 cloud instances
State-of-the-art SPEs are not ready to handle reconfiguration with TB-sized state
Reconfiguration time (s) by state size:
State size    Flink    Megaphone
250 GB        72       46
500 GB        121      75
750 GB        209      Out-of-Memory
1000 GB       257      Out-of-Memory
Research Goal
Efficient State Management and on-the-fly Query Reconfiguration
in the presence of TB-sized Operator State to support:
Fault-tolerance
Resource Elasticity
Runtime Optimizations
Research Challenge
1. Processing overhead: minimal impact on query processing performance
2. Consistency: do not break exactly-once processing semantics
3. Network overhead: state migration
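To make the third challenge concrete, here is a back-of-envelope sketch of how long it takes just to move the state over the network. Everything in it is an assumption for illustration and not a measurement from the talk: the state is split evenly over 8 workers, each worker has its own 10 Gbit/s link, transfers run fully in parallel, and there is no protocol or disk overhead. The function name `transfer_seconds` is hypothetical.

```python
def transfer_seconds(state_gb, workers, link_gbit_per_s):
    """Seconds to ship state_gb (decimal GB) split evenly over parallel links.

    Assumes ideal conditions: even partitioning, one link per worker,
    no protocol or storage overhead.
    """
    gbit_per_worker = state_gb * 8 / workers  # 1 byte = 8 bits
    return gbit_per_worker / link_gbit_per_s


# Assumed setup: 8 workers, 10 Gbit/s per worker (illustrative values).
for state_gb in (250, 500, 750, 1000):
    t = transfer_seconds(state_gb, workers=8, link_gbit_per_s=10)
    print(f"{state_gb:4d} GB -> {t:.0f} s")
```

Even under these ideal assumptions, 1000 GB needs about 100 s on the wire, so any approach that ships the full state at reconfiguration time pays at least that much.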
Our Solution: Rhino
1. Handover Protocol
• Consistent reconfiguration without halting query execution
Reconfiguration time (s) by state size:
State size    Flink    Megaphone        Rhino
250 GB        72       46               15
500 GB        121      75               23
750 GB        209      Out-of-Memory    41
1000 GB       257      Out-of-Memory    67
2. Proactive, Incremental State Migration Protocol
• Tailored to efficiently transfer large state for future reconfigurations
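The idea behind the two protocols can be illustrated with a toy sketch. It is not Rhino's implementation: the classes `KeyedState` and `Replica`, the function `run`, and the parameters `checkpoint_every` and `handover_at` are all hypothetical names, and the "network" is just a method call. The sketch tracks which keys changed, copies only those changes to a target at every checkpoint (proactive, incremental migration), and at the handover point ships only the changes made since the last checkpoint.

```python
from dataclasses import dataclass, field


@dataclass
class KeyedState:
    """Toy keyed operator state that remembers which keys changed."""
    data: dict = field(default_factory=dict)
    dirty: set = field(default_factory=set)

    def update(self, key, value):
        self.data[key] = value
        self.dirty.add(key)

    def take_delta(self):
        """Return entries changed since the last call and reset tracking."""
        delta = {k: self.data[k] for k in self.dirty}
        self.dirty.clear()
        return delta


@dataclass
class Replica:
    """Copy of the state kept on the (hypothetical) target worker."""
    data: dict = field(default_factory=dict)
    entries_received: int = 0

    def apply(self, delta):
        self.data.update(delta)
        self.entries_received += len(delta)


def run(events, checkpoint_every, handover_at):
    """Process (key, value) events, replicate deltas, hand over at an index.

    Returns the replica and the size of the final delta shipped at handover.
    The event at index handover_at is left for the target to process.
    """
    origin, replica = KeyedState(), Replica()
    for i, (key, value) in enumerate(events):
        if i == handover_at:
            # Handover point: only changes since the last checkpoint move.
            final_delta = origin.take_delta()
            replica.apply(final_delta)
            return replica, len(final_delta)
        origin.update(key, value)
        if (i + 1) % checkpoint_every == 0:
            replica.apply(origin.take_delta())  # proactive, incremental copy
    raise ValueError("handover_at must be smaller than the number of events")


events = [(f"user{i % 5}", i) for i in range(23)]
replica, final_size = run(events, checkpoint_every=10, handover_at=22)
print(final_size, replica.data)
```

In this run the state holds five keys, but only the two keys touched after the last checkpoint have to move at handover time. That is the property that keeps reconfiguration short when most of the state was already copied in advance.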
Impact of Handover and State Migration
Reconfiguring a query with large operator state is feasible with minimal impact
Reconfiguration time (s) by state size:
State size    Flink    Megaphone        Rhino    Rhino+
250 GB        72       46               15       4
500 GB        121      75               23       5
750 GB        209      Out-of-Memory    41       5
1000 GB       257      Out-of-Memory    67       5
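The reconfiguration times on this slide can be turned into speedup factors with a few lines of arithmetic. The sketch below assumes a particular mapping of chart values to systems (Flink: 72/121/209/257 s, Megaphone: 46/75 s then Out-of-Memory, Rhino: 15/23/41/67 s, Rhino+: 4/5/5/5 s), inferred from the order in which the bars were introduced across slides; the helper `speedup_over` is a hypothetical name.

```python
# Reconfiguration times in seconds, as read from the chart on this slide.
# The mapping of values to systems is an assumption (see the text above).
state_sizes_gb = [250, 500, 750, 1000]
times_s = {
    "Flink": [72, 121, 209, 257],
    "Megaphone": [46, 75, None, None],  # None = Out-of-Memory
    "Rhino": [15, 23, 41, 67],
    "Rhino+": [4, 5, 5, 5],
}


def speedup_over(baseline, system):
    """Ratio baseline_time / system_time per state size (None if missing)."""
    return [
        round(b / s, 1) if b is not None and s is not None else None
        for b, s in zip(times_s[baseline], times_s[system])
    ]


for name in ("Rhino", "Rhino+"):
    for size, ratio in zip(state_sizes_gb, speedup_over("Flink", name)):
        print(f"{name:7s} vs Flink at {size:4d} GB: {ratio}x faster")
```

Under that mapping, Rhino+ stays at roughly 5 s regardless of state size, so its advantage over Flink grows from 18x at 250 GB to about 51x at 1000 GB.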
Our contribution
• Enable on-the-fly reconfiguration of running queries with large stateful operators
• Support for fault tolerance, resource elasticity, and runtime optimizations for running queries with large stateful operators
• Validation of our system design at TB scale
Experiments
[Figure: event-time data generators (×3) → reliable queues (×3) → system under test; metric: end-to-end processing latency]
NeXMark Benchmark Suite (Q5-Q8-QX)
Distributed setting (16 VMs on GCP)
SUT running just below its saturation point on NBQ8 (NeXMark Query 8)
Rhino keeps latency in check, whereas the baselines show a latency increase of up to 3 orders of magnitude
[Figure: processing latency (ms, log scale from 10^0 to 10^6) over time (minutes 14 to 24) for three scenarios: load balancing, scaling out, and fault tolerance, each with Avg, Min, and P99 panels. Systems compared: Flink, Rhino, and Rhino+ in all three plots, plus Megaphone in one of them]
Conclusion
Rhino removes the bottleneck due to large state transfer upon a query reconfiguration
Up to 3 orders of magnitude latency reduction upon a reconfiguration
Enables fault tolerance, resource elasticity, and runtime optimizations for running stateful queries
