Massively-Parallel
Stream Processing
Under QoS Constraints
with Nephele
Björn Lohrmann, Daniel Warneke,
and Odej Kao
Technische Universität Berlin
Background
Nephele is part of the Stratosphere platform for
massively-parallel data processing
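[Figure: Stratosphere stack. A PACT program (in, map, red, match, out) is passed through the PACTs compiler to the Nephele runtime, which runs on a cluster or in the cloud.]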
Open Source, downloadable at stratosphere.eu
22.06.2012

Massively-Parallel Stream Processing Under QoS Constraints with Nephele

2
Background
Nephele and PACTs currently focus on batch-job workloads

What about streaming workloads?
Possible with Nephele, but (as of now) not with PACTs
May have different goals
Meet pipeline latency and throughput requirements
Maximize/minimize other custom metrics

Motivation
Live Processing of streamed data is an important issue
Proliferation of mobile devices capable of producing
streamed data (video, audio, other sensors)
Large Scale Deployments of Sensors in Science and
Industry
Examples: Smart Grids, Traffic Monitoring, Astronomy

Why not adapt today's massively-parallel frameworks?

Goals
Identify major aspects of massively-parallel
frameworks that affect QoS goals
Find general strategies to deal with QoS goals
Implement & Evaluate them using the Nephele
Execution Engine

Agenda
1. Highlight common massively-parallel framework design principles
2. Explain implications for streamed workloads
3. Meeting latency requirements in Nephele
4. Experimental Results

Framework Design
Principles
[Figure: compute nodes X, Y and Z, each running instances of task n and task n+1 as separate threads/processes. Data items are collected in a task's output buffer and handed to the input buffer queue of the next task.]
Implications for Streaming
Applications
Large buffer = high throughput, high latency
Small buffer = low throughput, low latency
Trade-off needs to be found to meet latency goals
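
The trade-off can be illustrated with a little arithmetic. The sketch below assumes that an output buffer is shipped only once it is full and that a channel produces data at a constant rate; both are simplifying assumptions, and the 64 KiB/s rate and the function names are hypothetical, not taken from the slides.

```python
def fill_latency_ms(buffer_bytes: int, rate_bytes_per_s: float) -> float:
    """Worst-case wait of the first item in a buffer that ships only when full."""
    return buffer_bytes / rate_bytes_per_s * 1000.0


def buffers_per_second(buffer_bytes: int, rate_bytes_per_s: float) -> float:
    """Number of buffer hand-overs per second; each one carries fixed overhead."""
    return rate_bytes_per_s / buffer_bytes


RATE = 64 * 1024  # hypothetical channel rate: 64 KiB/s

# Compare a large and a small output buffer at the same data rate.
for size in (32 * 1024, 200):
    print(size, fill_latency_ms(size, RATE), buffers_per_second(size, RATE))
```

Under these assumptions the large buffer adds far more waiting time per item, while the small buffer causes far more hand-overs per second, which is where throughput is lost.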

Thread/Process Model
1 Task = 1 Thread model is flexible, but has overhead
Thread scheduling, synchronization, communication
Serialization may be necessary (bad for throughput & latency)

N Tasks = 1 Thread model can sometimes provide better throughput and latency
Meeting Latency
Requirements
QoS goal:
Meet latency constraint X, then maximize throughput

Based on observations we designed two strategies:
1. Adaptive Output Buffer Sizing
2. Dynamic Task Chaining

Both strategies
work autonomously (only latency constraint is required)
are applied on-demand at runtime
are applicable in systems with similar design principles
Adaptive Output Buffer
Sizing
Only applied when latency constraint violated

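For each channel
Determine output buffer latency (obl)
If obl > threshold, decrease buffer size:
$\text{size} := \max(\varepsilon,\ \text{size} \cdot r^{\text{obl}})$, with $r = 0.98$, $\varepsilon = 200$
If obl < threshold, increase buffer size again:
$\text{size} := \min(\omega,\ \text{size} \cdot r^{\text{obl}})$, with $r = 1.1$, $\omega = 500 \cdot 10^{3}$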
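
A minimal sketch of the per-channel update follows. It assumes the update multiplies the current size by r raised to the measured obl, that obl and the threshold are given in milliseconds, and that 200 and 500 · 10^3 act as lower and upper bounds on the size; the function and parameter names are hypothetical.

```python
def adapt_buffer_size(size: int, obl_ms: float, threshold_ms: float,
                      r_down: float = 0.98, r_up: float = 1.1,
                      min_size: int = 200, max_size: int = 500_000) -> int:
    """Return the new output buffer size for one channel.

    Assumption: the size is scaled by r ** obl, so a larger measured
    output buffer latency leads to a stronger adjustment.
    """
    if obl_ms > threshold_ms:
        # Buffer fills too slowly: shrink it, but never below min_size.
        return int(max(min_size, size * r_down ** obl_ms))
    if obl_ms < threshold_ms:
        # Latency budget is not used up: grow again, capped at max_size.
        return int(min(max_size, size * r_up ** obl_ms))
    return size


# Hypothetical measurement: a 32 KB buffer with 100 ms output buffer latency.
print(adapt_buffer_size(32 * 1024, obl_ms=100, threshold_ms=50))
```

The two bounds keep the size from collapsing to zero or growing without limit when the same adjustment is applied repeatedly.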

Task Chaining
Again, only applied when overall latency constraint is
violated
[Figure: task n and task n+1 on one compute node, shown before and after chaining.]

Conditions:
Pipeline of unchained tasks
Sum of CPU utilizations is < 90% of capacity of one core
Only apply to longest chainable pipeline of tasks
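
The conditions above can be sketched as a search for the longest run of consecutive, unchained tasks whose summed CPU utilization stays below 90% of one core. This is an illustrative sketch, not Nephele's implementation: the `Task` type, the task names and the utilization values are hypothetical, and the tasks are assumed to run on the same compute node.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Task:
    name: str
    cpu_utilization: float  # fraction of one core, e.g. 0.3 = 30%
    chained: bool = False


def longest_chainable(pipeline: List[Task], capacity: float = 0.9) -> List[Task]:
    """Longest run of consecutive unchained tasks with summed utilization < capacity."""
    best: List[Task] = []
    start, total = 0, 0.0
    for end, task in enumerate(pipeline):
        if task.chained:
            # Already chained tasks break the run.
            start, total = end + 1, 0.0
            continue
        total += task.cpu_utilization
        # Shrink the window from the left until it fits on one core again.
        while start <= end and total >= capacity:
            total -= pipeline[start].cpu_utilization
            start += 1
        if start > end:
            total = 0.0  # window is empty; discard rounding residue
        if end - start + 1 > len(best):
            best = pipeline[start:end + 1]
    # Chaining needs at least two tasks.
    return best if len(best) >= 2 else []


pipeline = [Task("t1", 0.2), Task("t2", 0.3), Task("t3", 0.3), Task("t4", 0.5)]
print([t.name for t in longest_chainable(pipeline)])
```

A sliding window is enough here because utilizations are non-negative, so dropping tasks from the left can only lower the sum.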
Complete System Overview
[Figure: the job manager (JM), annotated with 300ms, receives periodical measurements (latency, throughput) from the task managers (TM) and sends them buffer size updates and chain commands.]
Sample Application: Video
Livestreaming
[Figure: video livestreaming job spread over nodes 1 to n. Task labels: Partitioner, Decoder, Merger, Overlay, Encoder, RTP Server.]
Latency w/o Optimizations
Setup:
10 nodes, 80 cores
32 KB output buffer size
320 video streams

Results:
Latency oscillates around 4 s
Large buffers cause

Latency w/ Adaptive Buffer
Sizing

Final Latency:
improvement)

Latency w/ Adaptive Buffer Sizing + Task Chaining (ABS+TC)

Final Latency:
improvement)

Conclusion and Future Work
Massively-parallel frameworks can be adapted to do
latency-constrained stream processing
Prototype implementation on Nephele showed up to
94% latency improvement on a video livestreaming job
Future Work
Distribute latency monitoring (better scalability)
Adapt PACT layer of Stratosphere to provide streaming
capabilities and latency awareness

